mawk considered harmful

Consider the following little awk script:

/^--- [-[:alnum:].\/]+$/ { if ($2 !~ /[-[:alnum:].]+\/debian\//) print $2 }

According to O'Reilly's sed & awk book it is, as far as I can tell, perfectly valid awk. It is meant to be used like this:

~$ zcat debian_package_version.diff.gz | awk -f script.awk

The purpose is to list every file in the diff that does not live under the package/debian/ path.

I quickly wrote and tested it, then put it into production in my chroot. Oddly enough, I was pretty sure I had just tested it on the very same file I was now running it on, only now it didn't output anything! I double-checked the code and it still looked all right. Hrm. Then I tried this:

(chroot)~$ ls -l /usr/bin/awk
lrwxrwxrwx 1 root root 21 Jan 11  2006 /usr/bin/awk -> /etc/alternatives/awk
(chroot)~$ ls -l /etc/alternatives/awk
lrwxrwxrwx 1 root root 13 Sep 12 00:49 /etc/alternatives/awk -> /usr/bin/gawk
(chroot)~$ exit
~$ ls -l /etc/alternatives/awk
lrwxrwxrwx 1 root root 13 Jun  1 16:47 /etc/alternatives/awk -> /usr/bin/gawk

Yes indeed, after installing gawk inside the chroot it worked as expected. I'm really concerned about mawk not doing what I expected because, by default, only mawk is installed on a Debian system (it is a lot smaller and claims to be faster too). I can barely believe the script above is not portable, but I would love it if someone could point out what I did wrong!


Upon closer examination it seems mawk doesn't like the [:alnum:] bit, even though it claims to adhere to POSIX. It would also be nice if it failed loudly instead of silently matching nothing. As I understand it, if mawk doesn't understand POSIX character classes, the final closing bracket of [[:alnum:]] should produce an error.
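
A possible workaround, as a rough sketch: assuming the only problem is missing support for POSIX character classes, spelling [:alnum:] out as explicit ranges should behave the same in mawk and gawk. The function names and the sample input below are made up for illustration, and LC_ALL=C is my own addition to keep the ranges from depending on the locale. As far as I can tell, a ] outside a bracket expression is just an ordinary character in POSIX extended regular expressions, which may be why nothing complains.

```shell
# Probe (hypothetical helper): returns 0 if the installed awk matches
# "a1" against [[:alnum:]]+, and non-zero otherwise.
awk_has_posix_classes() {
    printf 'a1\n' | awk '/^[[:alnum:]]+$/ { found = 1 } END { exit !found }'
}

# Same filter as script.awk, with [:alnum:] written as explicit ranges.
# Reads a diff on stdin and prints every "---" path outside */debian/.
list_non_debian() {
    LC_ALL=C awk '/^--- [-A-Za-z0-9.\/]+$/ {
        if ($2 !~ /[-A-Za-z0-9.]+\/debian\//) print $2
    }'
}

# Made-up sample input; only the src/main.c path should be printed.
printf '%s\n' \
    '--- foo-1.0/debian/changelog' \
    '--- foo-1.0/src/main.c' \
    '+++ foo-1.0/src/main.c' | list_non_debian
```

On a real package this would be used as zcat debian_package_version.diff.gz | list_non_debian, and running awk_has_posix_classes first tells you whether the original script can be trusted with whatever awk happens to be installed.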

No awk implementations were hurt in the creation of this post.