2023-10-14 Avoiding repetitions

I write a lot of prose. That does not mean I’m good at it or very creative – I haven’t written any novels or even short stories. By “prose” I mean writing in a natural language (mostly English nowadays). In fact, most of the writing I’ve done in my life is technical in nature (usually either about technology or about mathematics). And (as you obviously know) I’m also a technology geek and an Emacs user. An immediate thought is: how can I make Emacs help me write better?

One of the things you should avoid when writing is repetition, or rather, inadvertent repetition. Using the same word more than once in close proximity is usually not good style (unless you are going for some “special effect”, like building several parallel sentences).

Let’s write a simple function to help detect repetitions. When I’ve written a word I suspect I might use too often, I’d like to know how far from this point the nearest other occurrence of that word is. Also, it would be good to have those occurrences highlighted for a while. (The highlighting can also be done by iedit-mode, but that one is case-sensitive by default, and I don’t want that. Interestingly, you can toggle the case sensitivity of iedit-mode with M-C while it is active – but it seems you have to turn it on every time you start using that mode.)
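The case-insensitive, whole-word matching wanted here can be tried out in isolation first. This is only a minimal sketch: the helper name my-count-word-occurrences is hypothetical, and it works on a string in a temporary buffer rather than on the text you are editing.

```elisp
(defun my-count-word-occurrences (word text)
  "Count whole-word, case-insensitive occurrences of WORD in TEXT.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search t)          ; ignore case while searching
          ;; \\b matches at a word boundary, so WORD inside a longer
          ;; word (e.g. "the" in "other") is not counted.
          (re (format "\\b%s\\b" (regexp-quote word)))
          (count 0))
      (while (re-search-forward re nil t)
        (setq count (1+ count)))
      count)))

(my-count-word-occurrences "the" "The cat saw the other cat there.") ; => 2
```

Only “The” and “the” are counted; “other” and “there” merely contain the letters and are skipped thanks to the word boundaries.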

Well, I say “simple”. In fact, it’s far from that. I mean, it’s not exactly difficult to understand, but – as with many other real-world text-editing tasks – there are a few edge cases one must take care of. What if the point is on the first, or last, or only occurrence of some word in the buffer? Will it work if the point is just before the first character of the word, or just after the last one? I guess it would be possible to refine my code so that it is more elegant (or, frankly, less ugly) and probably faster, but I don’t care that much – it works and is not difficult to analyze.
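The “just before” and “just after” cases can be checked with a small experiment. The sketch below is only an illustration: my-word-at-position is a hypothetical helper that asks thing-at-point which word, if any, it sees at a given position of a string.

```elisp
(require 'thingatpt)

(defun my-word-at-position (text pos)
  "Return the word `thing-at-point' finds at POS in TEXT, or nil.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char pos)
    (thing-at-point 'word t)))

;; The test string has two spaces between the words.
(my-word-at-position "foo  bar" 1)      ; => "foo" (just before the word)
(my-word-at-position "foo  bar" 4)      ; => "foo" (just after the word)
(my-word-at-position "foo  bar" 5)      ; => nil   (between the two spaces)
(my-word-at-position "foo  bar" 6)      ; => "bar"
```

So both borders of a word still count as being “on” it, but there is also a case where no word is found at all, and code relying on the word at point has to be prepared for nil.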

;; find repetitions    -*- lexical-binding: t; -*-

(require 'thingatpt)
(require 'cl-lib)                       ; for `cl-incf'

(defun calculate-distance-to (to fun pred)
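  "Calculate distance from point to TO in units of FUN.
FUN is a function moving by one unit of measurement (e.g., a word
or sentence).  PRED is called with point and TO; FUN is called
repeatedly for as long as PRED returns non-nil.  Return the
number of calls, or nil if TO is nil."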
  (when to
    (save-excursion
      (let ((count 0))
        (while (funcall pred (point) to)
          (funcall fun)
          (cl-incf count))
        count))))

(defun find-nearest-word-repetitions ()
  "Find and report the nearest repetitions of word at point."
  (interactive)
  (let* ((word (or (word-at-point)
                   ;; Without a word there is nothing to search for.
                   (user-error "No word at point")))
         (re (format "\\b%s\\b" (regexp-quote word)))
         (case-fold-search t)
         (prev (save-excursion
                 (beginning-of-thing 'word)
                 (when (re-search-backward re nil t)
                   (point))))
         (prev-overlay (when prev
                         (make-overlay prev
                                       (save-excursion
                                         (goto-char prev)
                                         (forward-word)
                                         (point)))))
         (next (save-excursion
                 (end-of-thing 'word)
                 (when (re-search-forward re nil t)
                   (point))))
         (next-overlay (when next
                         (make-overlay next
                                       (save-excursion
                                         (goto-char next)
                                         (backward-word)
                                         (point)))))
         (prev-words (save-excursion
                       (beginning-of-thing 'word)
                       (calculate-distance-to prev #'backward-word #'>)))
         (next-words (save-excursion
                       (end-of-thing 'word)
                       (calculate-distance-to next #'forward-word #'<)))
         (prev-sentences (save-excursion
                           (beginning-of-thing 'sentence)
                           (calculate-distance-to prev #'backward-sentence #'>)))
         (next-sentences (save-excursion
                           (end-of-thing 'sentence)
                           (calculate-distance-to next #'forward-sentence #'<))))
    (message "%s\n%s\n%s"
             (format "Word on point is `%s'." word)
             (if prev
                 (format "The previous occurrence was %s word(s)/%s sentence(s) ago."
                         prev-words prev-sentences)
               "This is the first occurrence.")
             (if next
                 (format "The next occurrence will be in %s word(s)/%s sentence(s)."
                         next-words next-sentences)
               "This is the last occurrence."))
    (when prev
      (overlay-put prev-overlay 'face 'show-paren-match)
      (run-at-time "4 sec" nil (lambda ()
                                 (delete-overlay prev-overlay))))
    (when next
      (overlay-put next-overlay 'face 'show-paren-match)
      (run-at-time "4 sec" nil (lambda ()
                                 (delete-overlay next-overlay))))))

(global-set-key (kbd "C-c r") #'find-nearest-word-repetitions)

As you can see, there is some, well… repetition in the above code. It would probably be possible to abstract some of it away, but I don’t think it’s worth it. One point worth mentioning is the lexical-binding variable – the code would not work under dynamic scoping, because the lambdas passed to run-at-time are actually closures – they need access to prev-overlay and next-overlay long after this function returns.
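The difference between the two scoping rules can be seen in a tiny experiment, independent of overlays and timers. This is just a sketch: my-make-reader and my-demo-value are hypothetical names, and eval with an explicit second argument is used only so that both scoping modes can be compared side by side.

```elisp
(defun my-make-reader (lexical)
  "Return a function reading a `let'-bound variable after the `let' exits.
The defining form is evaluated with lexical scoping when LEXICAL
is non-nil, and with dynamic scoping otherwise.  Hypothetical
helper, for illustration only."
  (eval '(let ((my-demo-value 42))
           (lambda () my-demo-value))
        lexical))

;; Lexical scoping: the lambda is a closure and keeps the binding alive.
(funcall (my-make-reader t))            ; => 42

;; Dynamic scoping: the binding is gone once the `let' has exited,
;; so calling the function signals `void-variable'.
(condition-case err
    (funcall (my-make-reader nil))
  (void-variable (format "Failed: %S" err)))
```

The lambdas given to run-at-time are in the same situation as the second call: by the time the timer fires, the let* form is long gone, so only a closure can still see the overlay.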

And that’s it for today, save for a reminder. If you want to learn some Elisp to be able to write little (or even a bit larger!) commands like this one to help you with your everyday tasks, there are two sources I always recommend. One is the excellent An Introduction to Programming in Emacs Lisp by the late Robert J. Chassell, and if you want to dive a little bit deeper, you can also check my book about Emacs Lisp.

CategoryEnglish, CategoryBlog, CategoryEmacs