An old friend of mine posted a challenge on Twitter to implement a grep-like utility in one’s language of choice. Instead of complaining that he’s got an unfair advantage – he is a Pythonista, and Python is almost as well-suited to this kind of task as Perl – I decided to accept the challenge. Of course, I had to start with Emacs Lisp. For this, I decided to cheat a bit and use a buffer instead of a “file” or “stream” – this is, after all, the most natural data structure in Emacs for this kind of task.
After about ten minutes of coding, I came up with this:
(defun my-emacs-grep (regex)
  "Delete lines not matching REGEX in the current buffer."
  (interactive (list (read-regexp "Regex: ")))
  (goto-char (point-min))
  (while (not (eobp))
    (let ((eol-pos (line-end-position)))
      (if (re-search-forward regex eol-pos t)
          (forward-line)
        ;; Delete the line together with its newline, but do not run past
        ;; the end of the buffer when the last line has no trailing newline.
        (delete-region (point) (min (1+ eol-pos) (point-max)))))))
This is way shorter than delete-non-matching-lines, which is a built-in equivalent with some bells and whistles attached, but it seems to work fine.
Now, being a JavaScript programmer, I also had to code a JS-grep. This is actually quite an interesting task. Here is a first, naive attempt.
#!/usr/bin/env node
const fs = require('fs');

const re = new RegExp(process.argv[2]);

console.log(
  fs.readFileSync(process.stdin.fd, 'utf8')
    .split('\n')
    .filter(line => re.test(line))
    .join('\n')
);
This works, but not very well – it slurps everything from stdin into memory, which is not how the real grep works – but it has the advantage of being very simple and, again, taking less than ten minutes to code.
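One detail of the slurping approach is easy to miss: when the input ends with a newline, split('\n') leaves an empty string as the last element, so a pattern that can match the empty string produces a spurious blank line at the end. Here is a minimal sketch of the same filtering as a function that drops that trailing element. The name grepText is made up for illustration and is not part of the script above.

```javascript
// Hypothetical helper (the name grepText is an assumption, not from the
// script above): filter the lines of an in-memory string.
function grepText(text, re) {
  const lines = text.split('\n');
  // A trailing newline leaves one empty string at the end of the array;
  // drop it so it is not treated as an extra (empty) line.
  if (lines.length > 0 && lines[lines.length - 1] === '') {
    lines.pop();
  }
  return lines.filter(line => re.test(line));
}

// Small demo on a literal string instead of stdin.
console.log(grepText('foo\nbar\nfood\n', /foo/).join('\n'));
```

This still keeps the whole input in memory, so it only tidies up the output and does not address the main drawback.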
Anyway, let’s make a better one. Node.js has the readline library, and – quite helpfully – the manual has an example of reading a file line-by-line using it. After modifying it slightly, I ended up with this:
#!/usr/bin/env node
const fs = require('fs');
const readline = require('readline');

const re = new RegExp(process.argv[2]);

const rl = readline.createInterface({
  input: fs.createReadStream('/dev/stdin'),
  crlfDelay: Infinity,
});

rl.on('line', line => (re.test(line) && console.log(line)));
The most interesting part is that it shows the web lineage of Node.js – even though the newer versions have synchronous operations like fs.readFileSync, the readline library has an event-driven interface. This approach is not extremely helpful when writing CLI scripts, but shines for backends of web applications.
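For CLI-style code, the event-driven interface can be made to read more sequentially: a readline interface is also an async iterable, so a for await loop can consume the lines one at a time. The following is only a sketch under some assumptions. The function name grepStream is made up for illustration, and the demo feeds it an in-memory stream built with Readable.from, where a real script would pass process.stdin.

```javascript
const readline = require('readline');
const { Readable } = require('stream');

// Hypothetical helper (the name grepStream is an assumption): collect the
// lines of a readable stream that match `re`, reading line by line.
async function grepStream(input, re) {
  const rl = readline.createInterface({ input, crlfDelay: Infinity });
  const matches = [];
  // The readline interface is async iterable, so no 'line' handler is needed.
  for await (const line of rl) {
    if (re.test(line)) {
      matches.push(line);
    }
  }
  return matches;
}

// Demo on an in-memory stream whose chunks do not align with line breaks.
grepStream(Readable.from(['foo\nba', 'r\nfood\n']), /foo/)
  .then(matches => console.log(matches.join('\n')));
```

Collecting the matches into an array keeps the sketch easy to check. A real grep would print each matching line inside the loop instead, so that memory use stays constant.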
Anyway, here we have three implementations of a very simplistic grep. What should be done now is some benchmarking – but I guess this should wait until we have a Python version to compare with.