Content AND Presentation


2016-05-23 Literal values and destructive functions

In Lisp literature you can often find a warning about not using destructive functions (like sort or nconc) on values coming from literals. This is good advice, and you should follow it. Here’s a short stab at a (partial) explanation (note that all examples are in Emacs Lisp).

Look at this innocent-looking code:

(defun destructive-havoc ()
  "Example of destructive havoc."
  (setq foo '(1 3 2))
  (message "before sort, foo is: %s" foo)
  (sort foo #'<)
  (message "after sort, foo is: %s" foo))

Evaluate this defun form and try M-: (destructive-havoc). (You will need to go to the *Messages* buffer to see the actual output of the message calls.) Here it is:

before sort, foo is: (1 3 2)
after sort, foo is: (1 2 3)

Everything is proceeding as we have foreseen. But let’s pull the rabbit out of the hat now: evaluate (destructive-havoc) again. Here’s what you get:

before sort, foo is: (1 2 3)
after sort, foo is: (1 2 3)

What’s going on?

Well, the literal in the function definition was actually changed. (If you evaluate the defun form now, it will be redefined once again to the “correct” value.) If you don’t believe it, try this: M-: (symbol-function #'destructive-havoc), or even better, M-x pp-eval-expression RET (symbol-function #'destructive-havoc) RET and see for yourself.

Of course, if you really need such a list in your code, the remedy is clear: just don’t say '(1 3 2) but (list 1 3 2) in your code. This way, the list will be created again from scratch every time you invoke the function.
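To make the remedy concrete, here is a minimal sketch of a well-behaved variant. The name non-destructive-havoc is made up for this illustration; the function returns the list before and after sorting instead of printing it, so the result is easy to inspect with M-:.

```elisp
;; Hypothetical variant of destructive-havoc, for illustration only.
(defun non-destructive-havoc ()
  "Return (BEFORE . AFTER): a list before and after sorting it."
  (let* ((foo (list 1 3 2))            ; a fresh list on every call
         (before (copy-sequence foo))) ; snapshot, because `sort' is destructive
    ;; Use the value returned by `sort' rather than relying on FOO
    ;; still pointing at the head of the sorted list.
    (setq foo (sort foo #'<))
    (cons before foo)))
```

However many times you call it, the result is ((1 3 2) 1 2 3), because sort only ever touches a list that was created during that very call. If the data really has to start out as a literal, (copy-sequence '(1 3 2)) should work just as well for a flat list like this one.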

I wondered why this happens. My first guess was that literal lists like this get assigned a place in memory when the defun is evaluated, and then sort happily changes that exact place in memory. (Byte-compiling the whole thing does not change the behavior, in case you wondered.)

Actually, defun has nothing to do with it; it seems to me that the thing gets its place in memory at read time. Here’s something that seems to confirm my suspicion: I can do away with the defun altogether and use fset and lambda instead (this is more or less what defun does internally, only it uses defalias instead of fset; defalias is a slightly higher-level wrapper around fset).

(fset 'destructive-havoc
      (lambda ()
        "Example of destructive havoc."
        (setq foo '(1 3 2))
        (message "before sort, foo is: %s" foo)
        (sort foo #'<)
        (message "after sort, foo is: %s" foo)))
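The reading hypothesis can also be poked at without defining any function at all. The sketch below (all my-demo-* names are invented for this illustration) reads a form from a string exactly once, evaluates it, and then looks at what became of the quoted list inside the form itself.

```elisp
;; Read the form once: the reader builds exactly one list object
;; for the literal (1 3 2) and stores it inside the form.
(defvar my-demo-form
  (read "(let ((x '(1 3 2))) (sort x #'<) x)"))

;; Evaluate the form; `sort' works on the very list the reader created.
(defvar my-demo-result (eval my-demo-form t))

;; Dig the quoted list back out of the form:
;; (let ((x (quote LIST))) ...) -> LIST
(defvar my-demo-literal
  (cadr (cadr (car (cadr my-demo-form)))))
```

After this, my-demo-literal is (1 2 3), and (eq my-demo-result my-demo-literal) returns t: the value handed out by the evaluator is the same object that sits inside the form, so modifying one modifies the other.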

I have neither the time nor (frankly) the courage right now to look at the C source to confirm my suspicions, but that is not so important. What is important is the takeaway of this post: never, ever use literal values in your code if you are going to modify them later. You’re welcome.

CategoryEnglish, CategoryBlog, CategoryEmacs


2016-05-17 Emacs Lisp closures demystified

It is often claimed that one of the advantages of closures (as in “lexical scoping”) is information hiding (a.k.a. encapsulation). This is true, but as this post shows, you can’t really hide anything in Emacs Lisp;-).

Consider the classical example.

; -*- lexical-binding: t; -*-

(let ((counter 0))
  (defun increase-counter ()
    (setq counter (1+ counter)))
  (defun get-counter ()
    counter))

It is well known how this works: (increase-counter) increases an “internal” counter, and (get-counter) returns its value. (Here, “internal” means “internal to these two functions”.)
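A small sketch may help to see that the two functions share one variable and that each evaluation of the let creates a new one. The factory my-make-counter and the two variables are made-up names; the example assumes it lives in a file with the lexical-binding cookie on its first line, just like the snippet above.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical factory: every call creates a fresh COUNTER variable
;; and two closures that share it.
(defun my-make-counter ()
  "Return a cons (INCREASE . GET) of two closures sharing one counter."
  (let ((counter 0))
    (cons (lambda () (setq counter (1+ counter)))
          (lambda () counter))))

(defvar my-counter-a (my-make-counter))
(defvar my-counter-b (my-make-counter))

(funcall (car my-counter-a))   ; increase A
(funcall (car my-counter-a))   ; increase A again
(funcall (car my-counter-b))   ; increase B once

(funcall (cdr my-counter-a))   ; => 2
(funcall (cdr my-counter-b))   ; => 1
```

The two closures of one pair see the same counter, while the two pairs do not interfere with each other.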

As the manual says, however,

“Currently, an Emacs Lisp closure object is represented by a list with the symbol ‘closure’ as the first element, a list representing the lexical environment as the second element, and the argument list and body forms as the remaining elements.”

(It also says that this is “considered an internal implementation detail”, and hence it recommends “against directly examining or altering the structure of closure objects.”)

Let’s see what happens when we evaluate (symbol-function 'increase-counter):

(closure ((counter . 4) t) nil (setq counter (1+ counter)))

And here’s the result of (symbol-function 'get-counter):

(closure ((counter . 4) t) nil counter)

(Notice that the mysterious nils are just empty argument lists.)

And here’s where things start to get interesting. Before you evaluate this expression, try to guess the result.

(eq (cadr (symbol-function 'increase-counter))
    (cadr (symbol-function 'get-counter)))

See? Not only does the second part of the closure list look the same, it also is the same object! That’s why the “sharing” of the variable counter between these two closures actually works. Here’s the proof:

(fset 'increase-counter
      '(closure ((counter . 4) t) nil (setq counter (1+ counter))))
(fset 'get-counter
      '(closure ((counter . 4) t) nil counter))

Now symbol-function gives apparently the same results, but the counter variable (in fact, the lexical environment) is no longer shared between the two closures – no wonder, since they are now different (though identical-looking) objects.

And finally, let us perform one more trick, just for fun – as the manual explicitly says, we should not rely on the exact structure of the closure objects. We are now going to manually recreate our two closures!

(let ((env '((counter . 0) t)))
  (fset 'increase-counter
        (list 'closure env nil '(setq counter (1+ counter))))
  (fset 'get-counter
        (list 'closure env nil 'counter)))

Try it out, it’s almost magic! We first set up the lexical environment in a temporary variable, then put the carefully crafted lists into the function cells of the symbols increase-counter and get-counter, and that’s it!

(In case you are wondering whether it would be possible to do the same without a temporary variable, the answer is yes. Fasten your seatbelt and look at this:

(fset 'increase-counter
      (list 'closure
            '((counter . 0) t)
            nil
            '(setq counter (1+ counter))))
(fset 'get-counter
      (list 'closure
            (cadr (symbol-function 'increase-counter))
            nil
            'counter))

Don’t try this at home, kids.)

CategoryEnglish, CategoryBlog, CategoryEmacs


2016-05-15 debug-on-whatever

Debugging Elisp is sometimes tricky. Some time ago I had an issue with a variant of show-paren-mode. The issue looked like a complete hang of Emacs, when even C-g didn’t help. How to debug that?

Of course, I could instrument a few functions for Edebug (which I eventually did anyway). However, an Emacs that repeatedly hangs is annoying at the very least. Of course, one can start a separate Emacs process (and that’s also what I did) and kill it when needed, but it is good to know that this wasn’t the only possibility.

It is not a very well-known thing, but sending Emacs a SIGUSR2 signal is a stronger way than C-g to tell Emacs to stop whatever it is doing. As the manual says (see (info "(emacs) Checklist")):

If you cannot get Emacs to respond to ‘C-g’ (e.g., because ‘inhibit-quit’ is set), then you can try sending the signal specified by ‘debug-on-event’ (default SIGUSR2) from outside Emacs to cause it to enter the debugger.
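From a shell, the usual way to do this is kill -USR2 followed by the process ID of the stuck Emacs. Since everything else here is Emacs Lisp, here is a sketch of doing the same from a second, healthy Emacs. The command name my-send-debug-signal is invented for this example, and it assumes a system where SIGUSR2 exists (so not MS-Windows); you can learn the process ID in advance by evaluating (emacs-pid) in the Emacs you are about to break.

```elisp
;; Hypothetical helper, to be run from ANOTHER Emacs than the stuck one.
(defun my-send-debug-signal (pid)
  "Send SIGUSR2 to the Emacs process whose process ID is PID.
With the default value of `debug-on-event', this should make that
Emacs enter the debugger."
  (interactive "nPID of the stuck Emacs: ")
  ;; `signal-process' accepts a numeric process ID and a signal name.
  (signal-process pid 'SIGUSR2))
```

signal-process returns 0 when the signal was delivered, so a non-zero result usually means a wrong process ID or missing permissions.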

It turned out that my function looking for the matching delimiter was waaay too slow, and in case of a mismatch took more than 10 seconds to complete. So it wasn’t a hang after all – but C-g apparently didn’t help, since the function was being called again right after C-g.

It also turned out that after feeding Emacs the SIGUSR2 signal, C-g started to “work”, i.e., it did stop after the apparent hang. Why was that so?

Apparently, sending SIGUSR2 to Emacs somehow triggered the change of debug-on-quit to t. With that turned on, C-g moved the point away from the place where my code repeatedly tried to find the (nonexistent) matching delimiter to the *Backtrace* buffer, and Emacs started responding normally.

Investigating this further, one can find out more debugging knobs. The function toggle-debug-on-quit toggles the behavior I described above. Its cousin toggle-debug-on-error is very useful when something triggers an error and you don’t know what it is (which happens quite often). It shows a *Backtrace* buffer whenever there is an error, and you can see exactly what called what and with what arguments. Way more useful than Edebug in some cases.

But wait, there’s more! If you do M-x apropos-variable RET debug-on RET, you’ll see quite a few potentially useful things. (And M-x apropos-command RET debug-on RET is left as an exercise to the reader!)

One of them I didn’t know about is debug-on-message. You can set it to a regex, and any call to message with the message matching it will trigger the display of *Backtrace*. This might be especially handy when something displays a message and you have no idea where it comes from. (I had such a problem, with a mysterious “Mark set” being displayed in the echo area. If only I knew about this variable then… Happily, I managed to solve that problem in a slightly different way – but that’s another story!)

The last thing worth mentioning in this department is the --debug-init command-line switch. If you have some error on Emacs startup and you don’t know why, this is the simplest way to find its cause. (Another one, more involved, but also potentially useful and extremely cool, is to use Artur Malabarba’s Emacs Bug Hunter. Check it out!)
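Coming back to the debug-on-* variables for a moment: here is a sketch of how they could be switched on and off together while hunting a bug. Both my-mystery-message-regexp and my-set-debug-knobs are names invented for this example; the variables being set (debug-on-error, debug-on-quit and debug-on-message) are the standard ones.

```elisp
(defvar my-mystery-message-regexp "\\`Mark set"
  "Hypothetical regexp matching the message whose origin we are hunting.")

(defun my-set-debug-knobs (enable)
  "Turn a few debugging knobs on if ENABLE is non-nil, and off otherwise."
  (setq debug-on-error (and enable t)   ; backtrace on errors
        debug-on-quit (and enable t)    ; backtrace on C-g
        ;; backtrace when a matching message is displayed
        debug-on-message (and enable my-mystery-message-regexp)))
```

Evaluate (my-set-debug-knobs t) before reproducing the problem and (my-set-debug-knobs nil) once you are done, since a *Backtrace* buffer popping up on every C-g gets old quickly.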

CategoryEnglish, CategoryBlog, CategoryEmacs


CategoryEnglish, CategoryBlog