Formatting hexdump’s output

hexdump is a tool used to dump a file (or whatever is piped to it) as a hex file. I personnally like using it with the “canonical” (hex + ASCII) options like this:

While reading the man I saw I could specify a format (-e) or a format file (-f) to change the output. Great!
I can actually control what I want to see.

Understanding the output

A format file is just like your regular file. We will now try to recreate the “canonical” output. First of all we need to analyse what output provides and how:

hexdump -C Made_in_paint_with_love

We can see that there are 3 sections:

  • Offset;
  • Hex data;
  • ASCII data.

The Offset is 8 characters wide, the hex data is 48 characters wide and the ASCII data is 16 characters wide or 18 characters wide with the separators. Each section is separated by a set of two spaces, same goes for the hex data that is separated in half.

The number of leading zeros in the offset is 8, and in the hex data is 2.
This is particularly important so we can get the padding right.

Creating the format file

Use your favorite text editor and/or operating system. I won’t judge you.
The best way to reproduce the same output is to use the same input, so start writing your favorites quotes from your favorite shows.
I’ll be using some data I got from sniffing Republic Commando‘s sockets.

It’s good to be familiar with the printf formatting values.

The first element we need to show is the offset which can be achieved by using the magic value _ax. Values that are printed must be surrounded by quotes. Since we are printing hex numbers in the hex section we will use 02h so that we have a hex number that will always be 2 characters long, even if it’s null.
For the ASCII section we will use _p.

Close enough. The offset needs leading zeros. Let’s put 8 leading zeroes like for the hex variable. Let’s also adjust the padding on the ASCII section.

That’s very close. But we don’t have the last line from the “canonical” output, the _Ax value. And we’re missing the padding cutting the hex data in half. To cut data in half we can print each half separately.

That’s it!

Going further

There is not much to say. This is what the “canonical” output prints. The hardest part was probably understanding how to use the buffer.

This is the final format file. Instead of using hard-coded padding it is possible to specify the padding before the leading zeros as such: "%3.02x" will be equivalent to " %02x".

We are also printing each byte by specifying 8/1 wich translates to print the next 8 values as groups of 1 byte. By writing 8/2 you are grouping the 16 next values, grouping them by 2 and printing each group.
This is manipulating the input as blocks of arbitrary length.

Published by


Junk food tastes good.