Marcin Borkowski: 2023-09-02 Irregular recurring TODOs in Org mode, part I

Warning: this is the first part of a series which is not even finished yet. And even though it’s not the whole story, it is still a long, a bit meandering post, with quite a lot of code.

Some time ago I mentioned a very peculiar type of TODOs I’d like to implement. These are things I’d like to do from time to time, but not necessarily on a regular basis. A canonical example is an inspirational blog post I’d like to reread once in a while. I admit that this idea is inspired by spaced repetition, where things I want to remember are presented to me repeatedly, but with increasing intervals. Here, however, the situation is a bit different. First of all, I don’t really need to remember these things actively – I just want to be reminded of them from time to time in the future. The second difference is that I’m not sure if increasing intervals would be the best choice here. In classical spaced repetition algorithms the intervals grow exponentially, so after, say, 10 repetitions of something remembered well, the intervals can become so long you are basically guaranteed to see that item at most once or twice again in your life. In this case, I still want to read that blog post a few times – maybe once per two/three years even, but not once per a decade or even less often! The third issue is that with classical SR I can have days without any repetitions as well as days with many of them. Here, I’d prefer to be shown the same anount of “recurring TODOs” per day (preferably one, but I assume that if I like the system, I may have more of them).

So, after some consideration, here is the heuristic I came up with. (Spoiler alert: it’s not the one that will get implemented evetually.) First of all, the “items” are going to be Org mode headlines (well, that is pretty obvious) – they may contain links, but they may also be just pieces of text stored on my disk. I’d prefer to be flexible and not require all of them to be stored on the same level of Org hierarchy – for example, I might want a tree structure, with broad categories (like “blog posts”, “quotations” or “ideas for things to do”) as level-one headlines, narrower categories (like blog posts about “faith”, “computing” or “languages”) as level-two headlines etc.

When I’m introducing a new item to my system, it should be set to be shown again after, say, a random number of days between a week and a month (more or less). When I’m shown an item again, I should be able to decide what to do next – either schedule it for later with the interval doubled compared to the previous one, or schedule it for later for the interval shortened to 7-30 days (like in the beginning), or use an interval similar in length to the previous one.

This looks clever, and in fact much more similar to classic SR than I initially expected, but I am a bit worried if it’s sustainable. Let’s do some estimation. Assume that I’m going to add one item every five days to the system. This means that after a year I’ll have about 73 items to review regularly. Assuming the first interval to be about 20 days, the subsequent one roughly 40 days, and all the next ones 80 days, it gives me an average load of one item per day after a year of using the system. Let’s also assume that I’m going to use that system for the next 5 years (which I think is quite conservative – a more generous but still realistic estimate could well be 20 years). This means that I’ll have 5 items to review every day. A bit too much.

A tinkerer in me wants now to devise an extremely complex system where there is a cap for the number of items to be reviewed per day, the latest date (kept separately for every item) when I want it to be reviewed, a way for more important items to shift the review dates of the less important ones to later… This is fun to think about, probably a bit less fun to implement and almost surely not fun at all to use. So, let’s keep things as simple as possible. Here is another idea. (Spoiler alert: it’s also not the final one.) Since I do not want to get 0 items on some days and 8+ items on other days (which can happen with classical SR), I could turn the whole thing on its head and just ask the system for “the most important/urgent thing to review now”. On a busy day, I might do it once; on a less busy one, I could do it 2-3 times. This means that for every item I should store some data like “when it was reviewed for the last time” and “how many times it was reviewed”, and devise some function which would then calculate the “urgency” of that item. (Note: I know about the famous Eisenhower matrix, but in this case, I consider all the items “equally important”, so I can just sort them by urgency and that’s fine.)

Here is one idea that came to my mind. Let’s have an item last reviewed D days ago, and reviewed N times altogether (the act of entering the item into the system is considered the first review, so N>0). The higher D is, and the lower N is, the more urgent the item is. Let’s compute U(N,D) := N²/D (excluding the items reviewed today) and consider the items with lower U be more urgent. The square is there to make items reviewed more times less and less urgent, so that the average intervals between reviews will grow.

This formula implies a few things. First of all, items reviewed fewer times will have a preference over those reviewed many times. As (N+1)²/N² tends to 1 as N→∞, this preference will (in a sense) become less pronounced over time. For example, consider two items, one newly entered (N=1) and one entered and then reviewed once more (N=2). To achieve the same urgency, the interval between the last review and today will have to be four times larger for the latter item. On the other hand, if we have two items, one reviewed four times and one reviewed five times, the ratio of the intervals to have the same urgency will need to be only equal to (5/4)²≈1.6.

The next thing is, the longer time elapsed from the last review, the more “urgent” the item is. This is pretty obvious, though the urgency will increase over time rather slowly. Another formula I considered was U(N,D) := N²/D(1+ε)for ε somewhere between 0.001 and 0.01, where longer intervals “contribute more” to the urgency. The obvious downside of that formula is its complexity, which makes it more difficult to analyze.

Probably the most important thing about my approach is that I’d like to have a guarantee that no item will be postponed indefinitely – in other words, I’d like to be sure that every item will have the lowest urgency of them all after a finite amount of time. Surprisingly, this seems a non-trivial property to prove! The intuition goes like this: every item’s urgency tends towards zero, and every review will make that item’s urgency jump above 1, so if we have some item I with such that n items have urgency lower than I, its urgency will be the lowest one after (n-1) days. This “proof” has one flaw – it silently assumes that the change in urgencies happening over time does not change the order of items. This, however, is simply not true! For example, consider an item A reviewed for the second time 8 days ago and an item B reviewed for the third time 18 days ago. Assuming neither A nor B is reviewed from yesterday to tomorrow, it means that while A was less urgent than B yesterday, and it will be more urgent tomorrow!

Unless there exists some simple trick which escapes me now, I cannot prove that every item will be reviewed after a finite number of days, even with the simplifying assumption that no items are added to the system.

Well. If I can’t prove it, so be it. Let’s don my programmer’s hat and make a simulation. I whipped up some code to simulate doing reviews (a given number of them every day) while introducing a new item with a given probability (so 0.2 would mean one new item every 5 days on average). For reference, here is the code – it is definitely not the most beautiful thing in the world, but it is just a prototype to perform some experiments and then be got rid of.

;; Recurring TODOs - simulation

(require 'cl-lib)

(defvar recurring-todos ()
  "A list of \"TODO items\" as plists -- the properties are :id (an
integer) and :reviews (dates of review, integers, decreasing).")

(defvar recurring-next 0
  "The next value of :id.")

(defvar recurring-date 0
  "The \"date\" (number of days elapsed).")

(defvar recurring-buffer-name "*Recurring TODOs simulation data*"
  "Data about recurring TODOs simulation as csv.  Every row
corresponds to one review (including the first one, i.e.,
addition of the item to the system).")

(get-buffer-create recurring-buffer-name)
(with-current-buffer recurring-buffer-name
  (insert "date,id,review,interval\n"))

(defun recurring-add-review-datapoint (id date review interval)
  "Add a datapoint about a review to buffer `recurring-buffer-name'."
  (with-current-buffer recurring-buffer-name
    (goto-char (point-max))
    (insert (format "%s,%s,%s,%s\n"
                    date id review interval))))

(defun recurring-add-todo ()
  "Add a new recurring todo to `recurring-todos'."
  (let ((new-item (list :id recurring-next
                        :reviews (list recurring-date))))
    (push new-item recurring-todos)
    (recurring-add-review-datapoint recurring-next
                                    recurring-date
                                    1
                                    "")
    (cl-incf recurring-next)))

(defun recurring-next-day ()
  "Increment `recurring-date'."
  (cl-incf recurring-date))

(defun recurring-last-review (todo)
  "The date of the last review of TODO."
  (car (plist-get todo :reviews)))

(defun recurring-number-of-reviews (todo)
  "The number of reviews of TODO so far."
  (length (plist-get todo :reviews)))

(defun recurring-urgency (date todo)
  "Compute the urgency of TODO."
  (let ((n (recurring-number-of-reviews todo))
        (d (- date
              (recurring-last-review todo))))
    (/ (* n n) d 1.0)))

(defun recurring-review (todo)
  "Review TODO.  Destructive."
  (when todo
    (recurring-add-review-datapoint (plist-get todo :id)
                                    recurring-date
                                    (1+ (length (plist-get todo :reviews)))
                                    (- recurring-date (car (plist-get todo :reviews))))
    (push recurring-date (plist-get todo :reviews))))

(defun recurring-find-most-urgent (date todo-list)
  "Return the most urgent todo."
  (let* ((result nil)
         (urgency most-positive-fixnum))
    (mapc (lambda (todo)
            (let ((new-urgency (recurring-urgency date todo)))
              (when (< new-urgency urgency)
                (setq urgency new-urgency
                      result todo))))
          (cl-remove-if (lambda (todo)
                          (= (recurring-last-review todo)
                             date))
                        todo-list))
    result))

(defun recurring-reset ()
  "Reset the recurring reviews simulation."
  (setq recurring-todos ()
        recurring-next 0
        recurring-date 0))

(defun recurring-simulate (iterations new-frequency review-frequency)
  "Simulate ITERATIONS days of reviewing TODOs.
NEW-FREQUENCY is the probability of adding a new TODO every day.
REVIEW-FREQUENCY is the number of reviews done every day.  Do not
reset the variables, so that a simulation can be resumed."
  (dotimes-with-progress-reporter
      (_ iterations)
      "Simulating reviews..."
    (when (< (cl-random 1.0) new-frequency)
      (recurring-add-todo))
    (recurring-review (recurring-find-most-urgent recurring-date recurring-todos))
    (recurring-next-day)))

So, my first experiment was to run one year of simulation, with one review per day and the probability of introducing a new item equal to 0.2. After running my code I imported the resulting CSV to SQLite and ran a few queries to analyze it.

The number of items reached 84. It turned out that 5 items were reviewed 9 times and the average interval between interviews for that 5 items was about 142 days. The maximum interval between reviews turned out to be 262 days. The one item which achieved such a long interval between reviews was reviewed on days 20, 21, 22, 25, 30, 40, 55, 89, and 351.

The next experiment assumed the same probability of adding a new item, but now I ran the simulation for 10 years. Again, the item with the longest interval between reviews had the intervals between them rise (more or less) exponentially. Each of reviews 2-9 happened after less than two weeks after the previous one, and then every interval was at least 10 times longer than the previous one!

So, back to square one. It turns out that my initial approach was pretty elegant mathematically, but practically useless. How surprising.

(to be continued…)

CategoryEnglish, CategoryBlog, CategoryEmacs, CategoryOrgMode