I have to admit that this is a bit embarrassing. A long time ago I announced a future post (and promised to release my code) for performing multiple regex replacements in a buffer, possibly in an interactive way. A few months later I started my first programming job (yay!) and promptly forgot about it…
The good thing is that I was reminded about this recently, so I have a chance to revisit the topic. Without further ado, let’s go!
As I said in that previous post, I also wrote a minor mode for selectively (and interactively) performing replacements of some regexen. It proved immensely useful when I worked for Wiadomości Matematyczne as a copyeditor and proofreader. In order to use it, you need to configure it first. There is just one option for that: mrr-interactive-substitutions. It is a list of lists. Every sublist starts with a regex to replace, followed by an optional list of key-value pairs, followed by a list of possible replacements.
As of now, two keys are supported: :hint (a string, displayed in the echo area when a string matching the regex is found) and :test (a symbol, which should name a predicate function accepting no arguments; the replacement is skipped if that function returns nil).
The default value for mrr-interactive-substitutions, which serves as an example, looks like this:
'(("\\(\\b\\| +\\)-\\{1,3\\}\\(\\b\\| +\\)" :hint "TeX dashes" "-" "--" "~-- ")
  ("d\\([xts]\\)" :hint "Differential operator" :test texmathp "\\\\mathrm{d}\\1"))
Let’s analyze this. The first entry matches a string of one to three hyphens, preceded (and followed) either by at least one space or by a word boundary. The suggested replacements are a hyphen, a double hyphen, and a double hyphen surrounded by spaces (an unbreakable one on the left and a normal one on the right). The rationale is that many authors do not care about the difference between hyphens, en-dashes and em-dashes, so we checked every occurrence of such things and fixed them where needed. (It is debatable whether you really should use an em-dash or just an en-dash surrounded by spaces; we followed Polish typographical customs here, and apparently Robert Bringhurst agrees.)
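To see what the first regex actually picks up, it can be tried on a few strings with the built-in string-match. This is only a quick check to evaluate in a scratch buffer, not a part of the package; the sample strings are made up for illustration.

```emacs-lisp
;; The dash regex copied from the first default entry.
(let ((dash-regex "\\(\\b\\| +\\)-\\{1,3\\}\\(\\b\\| +\\)"))
  ;; Collect the matched text for each (made-up) sample string.
  (mapcar (lambda (sample)
            (when (string-match dash-regex sample)
              (match-string 0 sample)))
          '("a - b" "pages 1--3" "well-known")))
;; With the usual syntax tables this should evaluate to (" - " "--" "-")
```

Note that the surrounding spaces are a part of the match, which is why the third replacement can bring its own spaces.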
The second entry is the letter d followed by one of the letters x, t or s, which are the most common differentials (in integrals, for example). Those are potentially replaced by the same thing, but with an upright d. Note the use of a capturing group in the regex and \1 in the replacement. Also, I used the :test keyword here, since these strings can also appear inside ordinary words (take “redshift”, for example). Old-time readers of my blog might remember that the function texmathp returns true if the point is in TeX math mode.
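The capturing group and the need for the :test predicate are easy to see outside the mode, too. The sketch below uses the built-in replace-regexp-in-string on two made-up strings; it only illustrates the regex and the replacement, and I do not claim that the mode performs its replacements this way internally.

```emacs-lisp
;; Regex and replacement copied from the second default entry.
(let ((regex "d\\([xts]\\)")
      (replacement "\\\\mathrm{d}\\1"))
  (list
   ;; A differential: this is the case the entry is meant for.
   (replace-regexp-in-string regex replacement "f(x) dx")
   ;; An ordinary word: without a test like `texmathp' it would be mangled.
   (replace-regexp-in-string regex replacement "redshift")))
;; should evaluate to ("f(x) \\mathrm{d}x" "re\\mathrm{d}shift")
```

The doubled backslashes in the result are just how Emacs prints strings; the actual text contains a single backslash before mathrm.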
Now, if I say M-x mrr-replace-mode, Emacs searches for the first match for any of the given regexen, highlights it with an overlay (using the same face as isearch does) and shows the “hint” in the echo area. You have three options then. You can press C-g to quit mrr-replace-mode, RET to move to the next match (or quit if there are no more matches), and TAB to cycle between possible replacements (including the original text; after all, you might decide you want to leave it as it is).
A cool (and rather Emacs’y) thing is that it is a minor mode. This means that you can normally edit the surrounding text while doing the replacements, or do something in a different buffer and come back later. You might even notice that some useful pattern is missing from mrr-interactive-substitutions, move to your init.el, add it, evaluate the setq, get back to the original buffer, move point to the right place and press RET to look for the new list of patterns, all without even leaving the minor mode! You can also manually edit the text under replacement, but be warned that pressing TAB then will remove your edits.
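Adding a pattern on the fly could look more or less like this. The entry itself (three dots offered as \ldots{}) is a made-up example, not one of the defaults, and the snippet assumes the package is already loaded so that mrr-interactive-substitutions is defined. I used add-to-list instead of editing the setq only so that evaluating the form twice does not add the entry twice.

```emacs-lisp
;; Hypothetical extra entry: a regex followed by a single replacement,
;; with no :hint or :test keys.  The final `t' appends the entry at the
;; end of the list instead of putting it at the front.
(add-to-list 'mrr-interactive-substitutions
             '("\\.\\.\\." "\\\\ldots{}")
             t)
```

After evaluating it (with C-x C-e, for example), go back to the buffer being edited and press RET to continue with the extended list.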
I wrote this code almost a decade ago, and it shows. There is at least one stupid bug left (the “hint” is suggested as one of the replacements; how could I have not noticed that?), and the format of the mrr-interactive-substitutions option is very terse, which makes it almost impossible to describe properly in defcustom. I intend to review the code and fix both issues, but it might take a while. Still, the code is usable as-is (assuming you don’t use the :hint option), and I can assure you that it helps tremendously with copyediting. (When I worked for “Wiadomości Matematyczne”, I had almost 50 entries in mrr-interactive-substitutions!) It is not in any Emacs package repository yet (I am finally going to learn how to add a package to MELPA, and this is going to be one of the packages I put there), but you can find it on GitLab.