2016-04-26 fill-single-char-nobreak-p

Since Emacs 24 (or so), Polish-speaking users have had a very nice gift: a function called fill-single-char-nobreak-p, which can be added to fill-nobreak-predicate. It prevents Emacs from breaking a line after a one-letter word when filling; leaving such a word at the end of a line is considered an error in Polish typography. Of course, when writing in LaTeX, it doesn’t matter at all, since there you use ties (tildes) to prevent breaks anyway (and there is an Emacs library called sierotki.el, whose name means “little orphans” in Polish, to insert those ties automatically). When writing emails (with hard line breaks), however, this is a very useful thing.
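For reference, enabling the predicate might look something like the sketch below. The helper name my/no-break-after-one-letter-words is made up for this example, and hooking it into message-mode-hook is only an assumption about where one composes emails; adjust to taste.

```elisp
;; Pick ONE of the two variants below.

;; Variant 1, global: never break a line right after a one-letter word.
(add-hook 'fill-nobreak-predicate #'fill-single-char-nobreak-p)

;; Variant 2, buffer-local: only where mail is composed.
;; (Assumes message-mode; the helper name is hypothetical.)
(defun my/no-break-after-one-letter-words ()
  "Enable `fill-single-char-nobreak-p' in the current buffer only."
  (add-hook 'fill-nobreak-predicate #'fill-single-char-nobreak-p nil t))
(add-hook 'message-mode-hook #'my/no-break-after-one-letter-words)
```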

Unfortunately, one day I found that this function does not work correctly. When a one-letter word is preceded by an opening parenthesis, Emacs ceases to recognize it as such. And here’s why:

(defun fill-single-char-nobreak-p ()
  "Return non-nil if a one-letter word is before point.
This function is suitable for adding to the hook `fill-nobreak-predicate',
to prevent the breaking of a line just after a one-letter word,
which is an error according to some typographical conventions."
  (save-excursion
    (skip-chars-backward " \t")
    (backward-char 2)
    (looking-at "[[:space:]](*[[:alpha:]]")))

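As you can see, the function is not complicated: after skipping back over any whitespace, it moves two characters back and checks that a whitespace character directly precedes the letter, so a parenthesis in that position makes the match fail. The fix, however, is not trivial. Of course, looking-back would solve the problem, but it might be a tad too slow (as its docstring dutifully warns).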
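For comparison, a looking-back variant could be sketched as follows. This is not code from Emacs or from any package; the names my/single-char-nobreak-lb-p and my/lb-check are invented here, and the regex (which uses the syntax class \s( to allow any number of opening delimiters before the letter) is just one guess at what such a fix might look like. Passing the beginning of the line as the LIMIT argument at least keeps the backward search from scanning any further than that.

```elisp
(defun my/single-char-nobreak-lb-p ()
  "Return non-nil if a one-letter word is before point.
Hypothetical `looking-back' variant; not part of Emacs."
  (looking-back "[[:space:]]\\s(*[[:alpha:]][ \t]*"
                (line-beginning-position)))

(defun my/lb-check (text)
  "Insert TEXT into a temp buffer and run the predicate at its end."
  (with-temp-buffer
    (insert text)
    (my/single-char-nobreak-lb-p)))

(my/lb-check "ab w ")   ; => t   (plain one-letter word)
(my/lb-check "ab (w ")  ; => t   (one-letter word after a paren)
(my/lb-check "ab xw ")  ; => nil (two-letter word)
```

Whether this is fast enough to run at every candidate break point during filling is exactly the open question; the sketch only shows that the logic is easy to express this way.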

Quite interestingly, there is another function which serves the same purpose, though not in stock Emacs. It is called fill-single-letter-word-nobreak-p and is part of the aforementioned sierotki.el package. Here’s its source code, together with some comments:

;;; ----------------------------------------------------------------------
;;; The TeX Magic Space mode equivalent for filling (word wrap)

;;; see: http://www.emacswiki.org/cgi-bin/wiki/FillParagraph
;; It is simplified `fill-french-nobreak-p' from textmodes/fill.el.
;; The function `fill-french-nobreak-p' first appeared in textmodex/fill.el
;; rev. 1.132, and the single-letter detection code first appeared in
;; rev. 1.132, correct in 1.181.  Not present in GNU Emacs 21.3
(defun fill-single-letter-word-nobreak-p ()
  "Don't break a line after single letter word.
This is used in `fill-nobreak-predicate' to prevent breaking lines just
after a single letter word."
  (save-excursion
    (skip-chars-backward " \t")
    (unless (bolp)
      (backward-char 1)
      ;; Don't cut right after a single-letter word.
      (and (memq (preceding-char) '(?\t ?\ ))
	   (eq (char-syntax (following-char)) ?w)))))

(BTW, look up the source of fill-french-nobreak-p, including the docstring and the comment beneath it.) As you can see, it suffers from the same problem: it, too, insists on a space or tab directly before the letter. (Perhaps more interestingly, the sierotki.el version won’t fix a one-letter word which is already at the end of a line when filling with e.g. M-q, while the stock Emacs version will.)
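The parenthesis problem is easy to reproduce in a scratch buffer. The sketch below repeats the body quoted above under a made-up name (my/sierotki-copy-p), so that it runs without sierotki.el installed, and calls it at the end of two sample strings; my/copy-check is likewise a hypothetical helper, and ?\s is just another way of writing the space character.

```elisp
(defun my/sierotki-copy-p ()
  "Copy of the body of `fill-single-letter-word-nobreak-p' quoted above."
  (save-excursion
    (skip-chars-backward " \t")
    (unless (bolp)
      (backward-char 1)
      ;; Only a tab or a space is accepted before the letter.
      (and (memq (preceding-char) '(?\t ?\s))
           (eq (char-syntax (following-char)) ?w)))))

(defun my/copy-check (text)
  "Insert TEXT into a temp buffer and run the predicate at its end."
  (with-temp-buffer
    (insert text)
    (and (my/sierotki-copy-p) t)))

(my/copy-check "ab w ")   ; => t
(my/copy-check "ab (w ")  ; => nil, the paren hides the one-letter word
```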

My proposal is as follows: it can be safely assumed that a string like a(i should not occur anyway, so I’d add the opening paren (and maybe a bracket, just in case) to the regex, alongside the whitespace characters allowed before a single letter.
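A minimal sketch of that proposal, under a made-up name (my/fill-single-char-nobreak-p) so that it does not clobber the stock function; the exact regex is my assumption about how to spell the idea, not a patch taken from Emacs. Like the original, it moves back two characters unconditionally, so the sample strings are long enough for that to be safe.

```elisp
(defun my/fill-single-char-nobreak-p ()
  "Return non-nil if a one-letter word is before point.
Hypothetical variant of `fill-single-char-nobreak-p' which also accepts
an opening paren or bracket directly before the letter."
  (save-excursion
    (skip-chars-backward " \t")
    (backward-char 2)
    ;; The character before the letter may be whitespace, ( or [.
    (looking-at "[[:space:]([][[:alpha:]]")))

(defun my/fixed-check (text)
  "Insert TEXT into a temp buffer and run the predicate at its end."
  (with-temp-buffer
    (insert text)
    (my/fill-single-char-nobreak-p)))

(my/fixed-check "ab w ")   ; => t
(my/fixed-check "ab (w ")  ; => t
(my/fixed-check "ab [w ")  ; => t
(my/fixed-check "ab xw ")  ; => nil
(my/fixed-check "a(i ")    ; => t, the false positive accepted above
```

To try it for real, this function would be added to fill-nobreak-predicate in place of the stock one.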

Once again, the self-documenting nature of Emacs (and – of course – the fact that it’s open source) shows its strengths. Note that fixing this problem does not require recompiling Emacs – you just find the relevant function and correct it. (Also, typing C-h f fill-single TAB helps to find the second function I described; of course, I did have sierotki.el installed.)
