2020-07-13 Miller - a csv Swiss Army knife

I often need to work with csv files. Until recently, my go-to tools for most things were Emacs with csv-mode (when I need to interactively edit simpler csvs), LibreOffice (when I need interactivity and more complex things, like multi-line cells, which are not supported by csv-mode), and xsv (when I need to automate some transformations, like selecting a subset of columns or changing their order).

Some time ago, I learned about another csv-related tool: Miller. It is pretty complicated (and has a pretty strange UI), but it has some really unique and useful features.

What it does is transform files with tabular data. One of the formats it supports is (of course) csv, but it can also handle JSON (not general JSON, mind you, but JSON containing arrays of similarly structured objects) and a few other formats.
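To make "arrays of similarly structured objects" a bit more concrete, here is a small Python sketch of that JSON shape. The sample data and the `is_tabular` helper are made up for illustration; the check (every object has exactly the same keys in the same order) is my own strict reading of "similarly structured", not Miller's actual rule.

```python
import json

# Hypothetical sample: an array of objects sharing the same keys,
# so each object can be read as one row of a table.
TABULAR = '[{"id": 1, "name": "bread", "price": 5}, {"id": 2, "name": "tea", "price": 10}]'

# General JSON with nesting and no row-like structure.
NOT_TABULAR = '{"shop": {"items": [1, 2, {"note": "hello"}]}}'


def is_tabular(data):
    """Return True if data is a non-empty list of dicts with identical keys.

    This is an illustrative, deliberately strict check (an assumption),
    not a description of what Miller itself accepts.
    """
    if not isinstance(data, list) or not data:
        return False
    if not all(isinstance(record, dict) for record in data):
        return False
    keys = list(data[0])
    return all(list(record) == keys for record in data)


print(is_tabular(json.loads(TABULAR)))
print(is_tabular(json.loads(NOT_TABULAR)))
```

The first document maps naturally onto rows and columns, while the second one does not.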

Next time you need to do some non-trivial transformations of csvs, tsvs or similar files, and you think something along the lines of “I’d like to have something like awk or sed, but for tabular data”, give Miller a shot; it may be able to handle your use case.

As an example, here is what I used it for recently. Imagine a csv with dozens of columns, some of them empty, some pretty long. Imagine that you want some human-readable representation of its data on your stdout. Of course, you can use xsv select, but Miller has something even better. Given the following input:

id,name,price
1,bread,5
2,tea,10
3,eggs,7

and the following invocation:

mlr --icsv --oxtab cat

(where --icsv means that the input is in csv, --oxtab means that we want the output in Miller’s “vertical tabular” format and cat means that we do not want any transformation besides format conversion), we get this:

id    1
name  bread
price 5

id    2
name  tea
price 10

id    3
name  eggs
price 7

How cool is that?
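For the curious, the layout above is easy to reproduce by hand. Below is a rough Python sketch of the same csv-to-vertical conversion, using only the standard library. The function name `csv_to_xtab` is hypothetical, and this only imitates the output shown above for simple inputs; it is not how Miller is implemented and it ignores edge cases such as rows with more fields than the header.

```python
import csv
import io


def csv_to_xtab(text):
    """Render CSV text as blank-line-separated blocks of aligned key/value lines."""
    blocks = []
    for row in csv.DictReader(io.StringIO(text)):
        # Pad every key to the width of the longest one so the values line up.
        width = max(len(key) for key in row)
        lines = [f"{key.ljust(width)} {value}" for key, value in row.items()]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# The same sample input as in the example above.
sample = "id,name,price\n1,bread,5\n2,tea,10\n3,eggs,7\n"
print(csv_to_xtab(sample))
```

Each record becomes its own block, with one line per column, which is why this format stays readable even when a file has dozens of columns.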

Head to the documentation to see more useful things!
