2015-12-12 Counting sentences

I was a bit astonished that Emacs does not have a count-sentences function. I mean, how is counting sentences different from counting words or lines? You move one sentence/word/line at a time, incrementing a counter, and when you finish, you have the count.

Admittedly, this is one thing Vim does better than Emacs. In Vim, you would have the verb count, the noun sentence, and you would combine them to get the sentence count. (It is possible in Emacs, too, of course; it’s just not the usual way it is done, at least not with interactive commands.)

Of course, writing a sentence counting function can’t be that hard, right? Sure, here it is:

(defun primitive-count-sentences (begin end)
  "Return the number of sentences from BEGIN to END."
  (save-excursion
    (save-restriction
      (narrow-to-region begin end)
      (goto-char (point-min))
      (let ((sentences 0))
	(while (not (looking-at-p "[ \t\n]*\\'"))
	  (forward-sentence 1)
	  (setq sentences (1+ sentences)))
	sentences))))

Of course, this is very crude. For starters, it is not interactive. What’s worse, it has this ugly looking-at-p – if we replaced that with a simple eobp, any whitespace after the last sentence would count as an additional sentence! OTOH, checking for that edge case after every sentence seems inefficient.
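To make the trailing-whitespace problem concrete, here is a minimal sketch that runs the same counting loop with both end tests. The names my-demo-count-sentences and my-demo-compare-end-tests are hypothetical helpers made up for this illustration, and the sample text assumes Emacs's default sentence settings.

```elisp
;; Hypothetical helper, not from the post: count sentences in the
;; current buffer, stopping as soon as DONE-P (a function of no
;; arguments) returns non-nil.
(defun my-demo-count-sentences (done-p)
  "Count sentences in the current buffer until DONE-P returns non-nil."
  (save-excursion
    (goto-char (point-min))
    (let ((sentences 0))
      (while (not (funcall done-p))
        (forward-sentence 1)
        (setq sentences (1+ sentences)))
      sentences)))

;; Hypothetical helper: run both end tests on the same TEXT.
(defun my-demo-compare-end-tests (text)
  "Return (WHITESPACE-AWARE . EOBP) sentence counts for TEXT."
  (with-temp-buffer
    (insert text)
    (cons (my-demo-count-sentences
           (lambda () (looking-at-p "[ \t\n]*\\'")))
          (my-demo-count-sentences #'eobp))))

;; Two sentences followed by trailing whitespace.  The car of the
;; result should be 2; the cdr comes out larger, because the eobp loop
;; keeps going until point crawls through the trailing whitespace.
(my-demo-compare-end-tests "One sentence.  Another one.  \n")
```

Evaluating the last form (for instance with M-: or in the scratch buffer) shows the two counts side by side.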

So, here is a better version. First of all, it uses the usual idiom for an interactive command that works on the region if it is active, and on the whole buffer otherwise. Also, I decided to apply a simple trick to be able to use eobp instead of looking-at-p.

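(defun count-sentences (begin end &optional print-message)
  "Count the number of sentences from BEGIN to END.
If BEGIN or END is nil, use the beginning or end of the buffer.
If PRINT-MESSAGE is non-nil (as it is when called interactively),
show the count in the echo area; otherwise return it."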
  (interactive (if (use-region-p)
		   (list (region-beginning)
			 (region-end)
			 t)
		 (list nil nil t)))
  (save-excursion
    (save-restriction
      (narrow-to-region (or begin (point-min))
			(progn
			  (goto-char (or end (point-max)))
			  (skip-chars-backward " \t\n")
			  (point)))
      (goto-char (point-min))
      (let ((sentences 0))
	(while (not (eobp))
	  (forward-sentence 1)
	  (setq sentences (1+ sentences)))
	(if print-message
	    (message
	     "%s sentences in %s."
	     sentences
	     (if (use-region-p)
		 "region"
	       "buffer"))
	  sentences)))))

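See? Here we just cut off the trailing whitespace while narrowing, so the last sentence ends exactly at the end of the narrowed buffer and a plain eobp test is enough.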

This is not a “final” version, either: one finishing touch would be using the singular form “sentence” when the count is equal to one. Let us leave that as an easy exercise for the reader.
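For readers who would rather peek at one possible answer, here is a minimal sketch of the singular/plural touch. The function my-sentence-count-string is a hypothetical helper invented for this illustration; it only builds the string, so it could be passed to message in place of the format string used above.

```elisp
;; Hypothetical helper, not part of the code above.
(defun my-sentence-count-string (count where)
  "Return a description of COUNT sentences in WHERE.
WHERE is a string such as \"region\" or \"buffer\"."
  (format "%d sentence%s in %s."
          count
          (if (= count 1) "" "s")   ; plural suffix unless COUNT is 1
          where))

(my-sentence-count-string 1 "buffer")   ; => "1 sentence in buffer."
(my-sentence-count-string 3 "region")   ; => "3 sentences in region."
```

Keeping the string building in a separate function makes it easy to check without touching any buffer.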

What’s more, this version is still just an exercise. What I really need is something a bit more complicated. But that will wait for another post – that’s it for now, folks!
