Content AND Presentation

2024-09-16 Irregular recurring TODOs in Org mode, part II

So, it’s been over a year since the previous part of my attempt to introduce “irregular, but recurring TODOs” into my workflow. This is something I really need, but it’s complicated enough that my inner procrastinator kept putting it off, unfortunately.

Let’s start differently now. First of all, let’s set the “boundary conditions”, or the design goals, more explicitly than before. I don’t want the number of reviews per day to fluctuate too much – that is a given. I also do not want the intervals between reviews to rise dramatically. My goal is not to learn the material, but to be periodically reminded about it, so I prefer reviews in fairly uniform (as opposed to exponentially increasing) intervals. (The first few intervals being larger and larger seems fine, but after that they should stabilize.) Let’s determine the minimum and maximum interval this time. I think that the first interval should be about one week long, the second one about a month, and every subsequent one should be about 2 months. (These numbers are pretty arbitrary, but “seem reasonable”.) Also, every one of these may be made longer, but not too much. So, here is another idea.

Let’s begin with setting the maximum daily number of reviews, and set it to one (initially). After every review of an item, we will schedule the next one for that item, according to the following algorithm:

  • given today’s date d, maximum number of reviews per day m and the number of reviews n this item already had, the function i mapping the number of the reviews to the minimum interval to the next one, and a fixed quotient q,
  • find some random day in the set {d+i(n),...,d+qi(n)} when the number of reviews scheduled is less than m, and schedule the review then, if such a day exists,
  • or increment m and schedule the review on a random day in {d+i(n),...,d+qi(n)} otherwise.

(In the first draft, I wrote that the next review should happen on the first day of the respective set, but I changed it to be more random. The reason was that I was afraid that the order of reviewing the items would stay the same – when some item B is shown after some item A, it might always be shown after A, and I didn’t want that.)

The drawback of this idea is that I’ll need a way to easily answer the question: given a particular day, how many reviews are scheduled for it? Assuming that the date of the next review is associated with the item (for example, stored in its properties), this means being able to scan all the items rather quickly to find out how many are scheduled for d+i(n), for d+i(n)+1 and so on. Org mode is not best suited for that, even though I suspect that the number of items will not be big enough to create a performance problem for the user. Still, maybe it would be better to have a real database for that.
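One way to keep that question cheap without a database is to count the scheduled dates once and then look days up in a hash table. The sketch below is only an illustration of the algorithm above, under a few assumptions: the names my-count-reviews-per-day and my-pick-review-day are made up, dates are plain integers, and no day ever holds more than m reviews (so that incrementing m makes every day in the window available).

```elisp
(require 'cl-lib)
(require 'seq)

(defun my-count-reviews-per-day (next-dates)
  "Return a hash table mapping each date in NEXT-DATES to how often it occurs.
NEXT-DATES is a list of integers (hypothetically, one per item)."
  (let ((counts (make-hash-table :test #'eql)))
    (dolist (date next-dates counts)
      (cl-incf (gethash date counts 0)))))

(defun my-pick-review-day (today min-interval factor max-per-day counts)
  "Pick a review day between TODAY+MIN-INTERVAL and TODAY+FACTOR*MIN-INTERVAL.
COUNTS is a table as returned by `my-count-reviews-per-day'.
Return a cons (DAY . NEW-MAX-PER-DAY)."
  (let* ((window (number-sequence
                  (+ today min-interval)
                  (+ today (ceiling (* factor min-interval)))))
         (free (seq-filter (lambda (day)
                             (< (gethash day counts 0) max-per-day))
                           window)))
    (if free
        ;; Some day still has room: choose one of them at random.
        (cons (seq-random-elt free) max-per-day)
      ;; Every day is full: raise the limit, then any day in the window fits.
      (cons (seq-random-elt window) (1+ max-per-day)))))
```

Returning the (possibly incremented) limit together with the day keeps the function free of global state; the caller decides where to store m.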

Well, good thing Emacs comes with one, then! Since Emacs 29, Emacs has built-in support for SQLite (provided it was compiled with the sqlite3 library). Why not utilize that? One reason is that it would be premature optimization. Let’s keep the idea of using SQLite in mind, but for now Org properties should be more than enough.
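Just to have it written down for later, here is roughly what the “how many reviews on a given day” query could look like with the built-in SQLite functions. This is a sketch, not a design: the table name items, its columns and the function name my-sqlite-reviews-on-day are all invented for the example, and the database lives only in memory.

```elisp
(defun my-sqlite-reviews-on-day (next-dates day)
  "Return how many of NEXT-DATES (a list of integers) equal DAY, using SQLite.
Return nil when this Emacs was built without SQLite support."
  (when (and (fboundp 'sqlite-available-p) (sqlite-available-p))
    (let ((db (sqlite-open)))       ; no file name: an in-memory database
      (unwind-protect
          (progn
            ;; Hypothetical schema: one row per item, with its next review date.
            (sqlite-execute
             db "CREATE TABLE items (id INTEGER PRIMARY KEY, next INTEGER)")
            (sqlite-execute db "CREATE INDEX items_next ON items (next)")
            (dolist (date next-dates)
              (sqlite-execute db "INSERT INTO items (next) VALUES (?)"
                              (list date)))
            ;; `sqlite-select' returns a list of rows, each row being a list.
            (caar (sqlite-select
                   db "SELECT COUNT(*) FROM items WHERE next = ?"
                   (list day))))
        (sqlite-close db)))))
```

For example, (my-sqlite-reviews-on-day '(10 10 12) 10) counts the two items scheduled for day 10.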

But first let’s make another simulation. This time, though, I think I can at least try to predict what is going to happen. Previously, I had no idea what intervals would be selected – my formula determined them only implicitly. Now, the interval lengths are going to be more or less predetermined, so the non-obvious variable is the number of daily reviews. But this time, it can be estimated. Let us assume the following “interval function”:

(defun recurring-next-interval (review-number)
  "Return the minimum interval for the next review."
  (cl-case review-number
    (1 7)
    (2 30)
    (t 60)))

and let us set q=2 (so that, for example, the first review after the initial one will happen between 7 and 14 days later). This means that every item will be reviewed at least once every 120 days and at most once every 60 days (after a few initial reviews which are going to be a bit more frequent). While the first few repetitions will happen more often, let’s assume that the average interval between repetitions is going to be 90 days. Assume also that I will add one item per 8 days to the system (and I think this is a safe upper bound). After a year, I’m going to have about 45 items then. With an average interval of 90 days, one review per day is enough for about 90 items, so it will stop being enough after about two years. In other words, every two years of using the system will add roughly one review per day to my load. This seems to be acceptable for me – if it turns out it’s not, I can always increase the intervals after some time.
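The estimate above fits in a few lines of Elisp. The function name my-estimated-daily-reviews and its defaults (8 days per new item, 90 days average interval) are just my assumptions from the previous paragraph turned into code.

```elisp
(defun my-estimated-daily-reviews (years &optional days-per-new-item average-interval)
  "Estimate the average number of reviews per day after YEARS of use.
DAYS-PER-NEW-ITEM defaults to 8 and AVERAGE-INTERVAL to 90 (both in days)."
  (let* ((days-per-new-item (or days-per-new-item 8))
         (average-interval (or average-interval 90))
         ;; Number of items collected so far.
         (items (/ (* years 365.0) days-per-new-item)))
    ;; Each item needs one review per AVERAGE-INTERVAL days.
    (/ items average-interval)))
```

Evaluating (my-estimated-daily-reviews 2) gives a value just above 1, and (my-estimated-daily-reviews 4) a value just above 2, which matches the “one more review per day every two years” rule of thumb.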

Ok, so let’s confirm these back-of-the-envelope calculations. Beware, a long piece of not-the-best-quality Elisp code follows! (Since this is throwaway code, I didn’t bother with good practices etc.)

;; Recurring TODOs - simulation, second attempt

(require 'cl-lib)
(require 'seq)                          ; for `seq-random-elt'

(defvar recurring-todos ()
  "A list of \"TODO items\" as plists -- the properties are :id (an
integer), :reviews (dates of review, integers, starting with the
most recent one) and :next (date of the next review).")

(defvar recurring-counter 0
  "The value of :id for the next item created.")

(defvar recurring-date 0
  "The \"date\" (number of days elapsed from the beginning of the
experiment).")

(defvar recurring-buffer-name "*Recurring TODOs simulation data*"
  "Data about recurring TODOs simulation as csv.  Every row
corresponds to one review (including the first one, i.e.,
addition of the item to the system).")

(get-buffer-create recurring-buffer-name)
(with-current-buffer recurring-buffer-name
  (insert "date,id,review,interval\n"))

(defun recurring-add-review-datapoint (date id review interval)
  "Add a datapoint about a review to buffer `recurring-buffer-name'."
  (with-current-buffer recurring-buffer-name
    (goto-char (point-max))
    (insert (format "%s,%s,%s,%s\n"
                    date id review interval))))

(defun recurring-add-empty-row ()
  "Add an empty row to buffer `recurring-buffer-name', signifying that
  the maximum number of repetitions per day was increased."
  (with-current-buffer recurring-buffer-name
    (goto-char (point-max))
    (insert "\n")))

(defun recurring-add-todo ()
  "Add a new recurring todo to `recurring-todos'."
  (let ((new-item (list :id recurring-counter
                        :reviews ()
                        :next nil)))
    (recurring-review-item new-item)
    (push new-item recurring-todos)
    (cl-incf recurring-counter)))

(defun recurring-next-day ()
  "Increment `recurring-date'."
  (cl-incf recurring-date))

(defun recurring-last-review (todo)
  "The date of the last review of TODO."
  (car (plist-get todo :reviews)))

(defun recurring-number-of-reviews (todo)
  "The number of reviews of TODO so far."
  (length (plist-get todo :reviews)))

(defun recurring-next-review (todo)
  "The date of the next review."
  (plist-get todo :next))

(defun recurring-next-interval (review-number)
  "Return the minimum interval for the next review."
  (cl-case review-number
    (1 7)
    (2 30)
    (t 60)))

(defvar recurring-factor 2
  "The maximum factor an interval may be multiplied by.")

(defvar recurring-max-per-day 1
  "The maximum number of reviews per day.
Initially 1.")

(defun recurring-number-of-reviews-on-day (date)
  "The number of reviews scheduled for DATE."
  (cl-reduce (lambda (count todo)
               (if (= date (recurring-next-review todo))
                   (1+ count)
                 count))
             recurring-todos
             :initial-value 0))

(defun recurring-compute-next-review (todo)
  "Return the date of the next review of TODO."
  (let* ((interval (recurring-next-interval (recurring-number-of-reviews todo)))
         (min-date (+ recurring-date interval))
         (max-date (+ recurring-date (ceiling (* recurring-factor interval))))
         (possible-dates (cl-remove-if-not (lambda (date)
                                             (< (recurring-number-of-reviews-on-day date)
                                                recurring-max-per-day))
                                           (number-sequence min-date max-date))))
    (if possible-dates
        (seq-random-elt possible-dates)
      (cl-incf recurring-max-per-day)
      (recurring-add-empty-row)
      (recurring-compute-next-review todo))))

(defun recurring-review-item (todo)
  "Review the TODO item."
  (recurring-add-review-datapoint recurring-date
                                  (plist-get todo :id)
                                  (1+ (length (plist-get todo :reviews)))
                                  (- recurring-date
                                     (or (car (plist-get todo :reviews))
                                         recurring-date)))
  (push recurring-date (plist-get todo :reviews))
  (setf (plist-get todo :next)
        (recurring-compute-next-review todo)))

(defun recurring-review-for-today ()
  "Review items for the current day."
  (mapc #'recurring-review-item
        (cl-remove-if-not (lambda (todo)
                            (= (recurring-next-review todo)
                               recurring-date))
                          recurring-todos)))

(defun recurring-reset ()
  "Reset the recurring reviews simulation."
  (setq recurring-todos ()
        recurring-counter 0
        recurring-date 0
        recurring-max-per-day 1))

(defun recurring-simulate (iterations new-frequency)
  "Simulate ITERATIONS days of reviewing TODOs.
NEW-FREQUENCY is the probability of adding a new TODO every day.
Do not reset the variables, so that a simulation can be resumed."
  (dotimes-with-progress-reporter
      (_ iterations)
      "Simulating reviews..."
    (when (< (cl-random 1.0) new-frequency)
      (recurring-add-todo))
    (recurring-review-for-today)
    (recurring-next-day)))

This time I first ran the simulation for 4 years, assuming that I add one item every 8 days on average, just to see what happens. (In fact, I’ve actually been gathering items for repeating in this system for about one and a half years now, and I have 51 of them so far.) It turned out that I reached 3 repetitions per day (which is roughly consistent with my expectations), and the average interval between repetitions was about 70 days (almost 80 in the fourth year alone). This looks very promising. The second experiment involved 10 years with one item added to the system every 5 days, and the average interval turned out to be 82 days (87 in the last year); the maximum number of repetitions per day reached 9, which is a tiny bit worrying, but probably still acceptable. (Assuming many of my TODOs are of the form “read this article again to be reminded of it”, 9 potentially long articles per day doesn’t look very good – but it just occurred to me that as part of an earlier repetition I might want to summarize the article, which is also very good for keeping it in long-term memory.) Also, if I decide that the daily load is too high, I can just increase the intervals, or even drop some of my items if I decide I no longer need to be reminded of them. Either way, 10 years is long enough that I most probably don’t need to worry about it.
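For the record, an average interval can be pulled out of the csv data along these lines. This is a sketch of one possible way to do it, not necessarily how the numbers above were obtained: the function name my-recurring-average-interval is made up, and it assumes that first reviews (which always have interval 0) should not count.

```elisp
(require 'cl-lib)

(defun my-recurring-average-interval (csv)
  "Return the mean of the interval column of CSV (a string), or nil.
CSV has rows of the form date,id,review,interval.  The header row,
empty rows and first reviews (whose interval is 0) are skipped."
  (let ((sum 0)
        (count 0))
    (dolist (line (split-string csv "\n" t))
      (let ((fields (split-string line ",")))
        ;; Keep only numeric data rows describing a second or later review.
        (when (and (= (length fields) 4)
                   (string-match-p "\\`[0-9]+\\'" (nth 2 fields))
                   (> (string-to-number (nth 2 fields)) 1))
          (cl-incf sum (string-to-number (nth 3 fields)))
          (cl-incf count))))
    (unless (zerop count)
      (/ (float sum) count))))
```

After something like (recurring-reset) and (recurring-simulate (* 4 365) (/ 1.0 8)), calling it on the (buffer-string) of the buffer named by recurring-buffer-name gives the overall average.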

So, the next time I write about this, I really hope to have a functional if minimal setup – in fact, I am (slowly) working on it.

That’s it for today, see you in a few days with the next article!

CategoryEnglish, CategoryBlog, CategoryEmacs, CategoryOrgMode

2024-09-09 Moving Virtualbox machines to another directory

(Warning: rant-ish post incoming.)

A few days ago I had a slightly atypical need. For some reason, I needed to move a VirtualBox machine to another directory. (Note to younger readers: I do not, and will not, use the word “folder”. These are directories and I’m not going to use another word for them. Note 2: this is not me yelling at cloud, this post has nothing to do with other people’s computers;-).)

Anyway, the VirtualBox documentation is comprehensive but not very easy to dig through. After a few failed attempts I found the correct command, which is VBoxManage movevm (by the way, who decided that using capital letters in a CLI tool’s name is a good idea?).

First we need to learn the name (alternatively, the uuid) of the machine we want to move. This is easy: VBoxManage list vms. Let’s assume that the name of our machine is handles. (Usually, these names are generated to be much longer.) This means that by default it’s located in the ~/VirtualBox VMs/handles directory (yes, with a literal space, great idea, really…).

The first time I tried to move my machine I tried to do roughly this:

VBoxManage movevm handles --folder ~/directory

and it worked. Then, just to make sure that my success was repeatable, I issued the reverse command:

VBoxManage movevm handles --folder "~/VirtualBox VMs"

and was met with an unpleasant surprise:

VBoxManage: error: Unable to determine free space at move destination ('/home/mbork/~/VirtualBox VMs/'): VERR_FILE_NOT_FOUND
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component SessionMachine, interface IMachine, callee nsISupports
VBoxManage: error: Context: "MoveTo(Bstr(szTargetFolder).raw(), Bstr(pszType).raw(), progress.asOutParam())" at line 512 of file VBoxManageMisc.cpp

What happened here‽ Well, the first line contains a hint: VirtualBox tries to move the machine to /home/mbork/~/VirtualBox VMs/. It’s pretty easy to see what happened. I used quotes so that the space in the directory name would work (as I hinted above and always say, using spaces in directory names is asking for trouble…), but Bash does not expand a tilde (~) inside quotes, so VirtualBox received a literal ~ and treated it as a path relative to my current directory; it isn’t smart enough to expand the tilde itself.

This is (sadly) an excellent example of poor practices – first, using a space in a directory (or file) name, then, not expecting a very typical use case and not doing The Right Thing (or at least not providing an informative error message). (To be fair, it’s debatable if this is more Virtualbox’s fault or Bash’s fault, or, frankly, user error. But I’m pretty sure putting a space into a directory name is a very bad idea.)

So, what is the “correct” (well, at least working) way to move a Virtualbox VM back into its ~/VirtualBox VMs directory? You can either escape the space with a backslash:

VBoxManage movevm handles --folder ~/VirtualBox\ VMs

or put the tilde outside the quotation marks:

VBoxManage movevm handles --folder ~/"VirtualBox VMs"

That’s it for today. I hope I saved someone a headache and half an hour of searching through the internet…

CategoryEnglish, CategoryBlog

2024-09-02 Rounding all timestamps in an srt file

This is another time I’m going to revisit subtitling in Emacs. This time I’m going to fix yet another minor annoyance. The srt files contain timestamps with millisecond resolution, which doesn’t make sense at all – when I edit subtitles for a video with 50fps, I don’t really need such precision. Instead, it makes sense for every timestamp to be rounded to the nearest multiple of 20 milliseconds (the duration of a single frame at 50fps).

I decided to write a simple piece of Elisp to do the rounding for me. A quick scan through Subed source code revealed a few functions which could be helpful with that: subed-subtitle-msecs-start, subed-set-subtitle-time-start (and similar functions for -stop), and subed-forward-subtitle-id. While browsing the source code I also found the subed-for-each-subtitle macro, which is ideally suited to what I want to do.

So, here is my first attempt to do this.

(defun subed-round-current-timestamps (resolution)
  "Round the timestamps of the current subtitle to RESOLUTION msecs."
  (subed-set-subtitle-time-start
   (* resolution
      (round (subed-subtitle-msecs-start)
             resolution)))
  (subed-set-subtitle-time-stop
   (* resolution
      (round (subed-subtitle-msecs-stop)
             resolution))))

;; Note: first attempt, doesn’t work 100% correctly!
(defun subed-round-timestamps (&optional resolution)
  "Round every timestamp to RESOLUTION milliseconds."
  (interactive "*P")
  (setq resolution (or resolution 20))
  (subed-for-each-subtitle
    (point-min)
    (point-max)
    nil
    (subed-round-current-timestamps resolution)))

(Well, I cheated a bit – my real first attempt was botched because I didn’t read the docstring of round, just looked at its signature, and did not multiply the result by 20.)
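To make that concrete: with two arguments, round divides the first by the second and rounds the quotient, so the result is a number of steps, not a number of milliseconds. The helper below (my-round-msecs is a name made up for this illustration) isolates the arithmetic used in subed-round-current-timestamps.

```elisp
(defun my-round-msecs (msecs resolution)
  "Round MSECS to the nearest multiple of RESOLUTION."
  ;; (round X D) returns X/D rounded to an integer, i.e. a count of
  ;; RESOLUTION-sized steps, so multiply to get milliseconds back.
  (* resolution (round msecs resolution)))
```

For instance, (round 1234 20) is 62, and only (my-round-msecs 1234 20) gives the expected 1240.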

As usual, it turns out that the first attempt does not work completely well. In this case, the bug was a bit subtle. Subed mode maintains a minimal temporal distance between the stop time of one subtitle and the start time of the next one, which is 100 milliseconds by default, and that setting affects functions which change the timestamps. Of course, the remedy is simple – it’s enough to turn that customization off for the time of performing the change. Because user options are dynamically scoped in Emacs, this is as easy as putting a let around my code.

(defun subed-round-timestamps (&optional resolution)
  "Round every timestamp to RESOLUTION milliseconds."
  (interactive "*P")
  (setq resolution (or resolution 20))
  (let ((subed-enforce-time-boundaries nil))
    (subed-for-each-subtitle
      (point-min)
      (point-max)
      nil
      (subed-round-current-timestamps resolution))))
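The same trick in isolation, for anyone who hasn’t used it before. Everything here is hypothetical (my-enforce-gap and my-clamp-start only mimic the kind of check Subed performs, they are not its real code); the point is that a let around the call changes what a function called inside it sees, and only for the duration of the let.

```elisp
(defvar my-enforce-gap t
  "Hypothetical option: when non-nil, `my-clamp-start' keeps a minimal gap.")

(defun my-clamp-start (start previous-stop)
  "Return START, moved to at least 100 ms after PREVIOUS-STOP if enforced."
  (if my-enforce-gap
      (max start (+ previous-stop 100))
    start))

;; With the option on, a start time too close to the previous stop is moved:
;;   (my-clamp-start 1040 1000)
;; Inside the `let', the very same function leaves it alone:
;;   (let ((my-enforce-gap nil)) (my-clamp-start 1040 1000))
```

Since my-enforce-gap is declared with defvar, the binding made by let is dynamic even in a file using lexical binding.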

And that’s pretty much it – it just works now! At first, I thought that I’d have to remember to turn the Subed waveform minor mode off so that the waveforms don’t get updated during the iteration, but it turned out to be unnecessary. This is because the Subed waveform mode uses subed-subtitle-motion-hook, which itself is run within post-command-hook – and that hook is only run by the command loop after an interactively invoked command finishes, not for each subtitle visited from Lisp. (Conversely, when you use C-M-f, that is, subed-shift-subtitle-forward, to move all subtitles by some amount, you need to disable waveform mode manually. Now that I think about it, I might submit a fix for that.)

Also, while checking (again) how to use the numeric prefix as the resolution but default to 20, I discovered the N interactive code. It reads a number from the minibuffer, but uses the numeric prefix argument instead if specified. This is something I needed in my command downloading a Jira task, only I coded that logic manually. I thought it was a new development, but it turned out that it has been in Emacs for about three decades, possibly more! I definitely don’t want it here – the default of 20 is much more reasonable than asking the user every time – but it’s nice to be aware that every time you look into Emacs docs, you may find something you didn’t know about.
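One caveat about the P code used above: it passes the raw prefix argument, so a plain C-u arrives as the list (4), which cannot be multiplied. If that ever becomes a problem, the raw value can be normalized with prefix-numeric-value. The helper below is only a possible hardening (the name my-resolution-from-prefix is made up), not something the command above currently does.

```elisp
(defun my-resolution-from-prefix (arg &optional default)
  "Turn the raw prefix ARG into a number, or DEFAULT (20) when ARG is nil."
  (if arg
      ;; Handles integers, lists like (4) from a plain C-u, and the symbol `-'.
      (prefix-numeric-value arg)
    (or default 20)))
```

With it, the setq in subed-round-timestamps could read (setq resolution (my-resolution-from-prefix resolution)), and calls from Lisp with an explicit integer would keep working.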

That’s it for today!

CategoryEnglish, CategoryBlog, CategoryEmacs
