2019-10-12 Challenge accepted - a Node.js grep

An old friend of mine posted a challenge on Twitter to implement a grep-like utility in one’s language of choice. Instead of complaining that he’s got an unfair advantage – he is a Pythonista, and Python is almost as well-suited to this kind of task as Perl – I decided to accept the challenge. Of course, I had to start with Emacs Lisp. For this, I decided to cheat a bit and use a buffer instead of a “file” or “stream” – this is, after all, the most natural data structure in Emacs for this kind of task.

After about ten minutes of coding, I came up with this:

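(defun my-emacs-grep (regex)
  "Delete lines not matching REGEX in the current buffer."
  (interactive (list (read-regexp "Regex: ")))
  (goto-char (point-min))
  (while (not (eobp))
    (let ((eol-pos (line-end-position)))
      (if (re-search-forward regex eol-pos t)
          (forward-line)
        ;; Delete the line together with its newline; clamp to `point-max'
        ;; so a last line without a trailing newline does not signal an error.
        (delete-region (point) (min (1+ eol-pos) (point-max)))))))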

This is way shorter than delete-non-matching-lines, which is a built-in equivalent with some bells and whistles attached, but it seems to work fine.

Now, being a JavaScript programmer, I also had to code a JS-grep. This is actually quite an interesting task. Here is a first, naive attempt.

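#!/usr/bin/env node
const fs = require('fs');
const re = new RegExp(process.argv[2]);
// Slurp all of stdin into memory and split it into lines.
const lines = fs.readFileSync(process.stdin.fd, 'utf8').split('\n');
// A trailing newline leaves an empty last element, which is not a line.
if (lines[lines.length - 1] === '') lines.pop();
// Print matching lines one by one, so that no match means no output.
lines.filter(line => re.test(line)).forEach(line => console.log(line));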

This works, but not very well: it slurps everything from stdin into memory, which is not how the real grep works. It does have the advantage of being very simple, and it again took less than ten minutes to code.

Anyway, let’s make a better one. Node.js has the readline library, and – quite helpfully – the manual has an example of reading a file line-by-line using it. After modifying it slightly I ended up with this:

#!/usr/bin/env node
const readline = require('readline');
const re = new RegExp(process.argv[2]);

// Read stdin line by line instead of loading it all into memory.
// process.stdin also works on systems that have no /dev/stdin.
const rl = readline.createInterface({
	input: process.stdin,
	crlfDelay: Infinity,
});

// Print each matching line as soon as it arrives.
rl.on('line', line => {
	if (re.test(line)) console.log(line);
});

The most interesting part is that it shows the web lineage of Node.js: even though Node.js has synchronous operations like fs.readFileSync, the readline library has an event-driven interface. This approach is not especially helpful when writing CLI scripts, but it shines in backends of web applications.

Anyway, here we have three implementations of a very simplistic grep. What should be done now is some benchmarking – but I guess this should wait until we have a Python version to compare with. :-)
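As a starting point for that benchmarking, here is a rough sketch of how the matching step alone could be timed in Node.js. Everything in it is an assumption made for illustration: the function name timeFilter, the synthetic input and its size. It measures only regex matching on lines already in memory, not reading from stdin, so it says nothing yet about the slurping versus streaming question.

```javascript
// Hypothetical micro-benchmark; names and sizes are illustrative only.
function timeFilter(re, lines) {
	const start = process.hrtime.bigint();
	const matched = lines.filter(line => re.test(line)).length;
	// hrtime.bigint() returns nanoseconds; convert the difference to ms.
	const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
	return { matched, elapsedMs };
}

// Synthetic input: 100000 lines, every tenth one containing "needle".
const lines = Array.from({ length: 100000 }, (_, i) =>
	i % 10 === 0 ? `line ${i} needle` : `line ${i} hay`);

const { matched, elapsedMs } = timeFilter(/needle/, lines);
console.log(`${matched} matching lines in ${elapsedMs.toFixed(2)} ms`);
```

A fair comparison between implementations would more likely time the whole scripts on the same large file, for example with the shell's time.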
