Wednesday, October 03, 2012

98% Compression For Diagrams - Unbelievable?


I didn't believe it was possible for compression technology to get any more efficient than it is today.

Take a PNG diagram like the one below:

How small do you think a file containing this diagram could be? The challenge is to get it under a kilobyte. Your choice of file formats and compression algorithms!

The uploaded version of the diagram on this blog may be smaller, but the one I have on disk is a PNG file that is 40,468 bytes in size. Compressing it using zip results in a file 38,394 bytes in size, a reduction of about 5%.

That's not too surprising, because PNG is already a pretty efficient file format whose image data is compressed internally. Even so, that's still way off our target size of a kilobyte (1,024 bytes). We're asking for a seemingly impossible 97.5% reduction in size.
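For anyone who wants to check the arithmetic, here is a small Python sketch. The function name `reduction_percent` is my own invention for illustration, and the target is assumed to be 1,024 bytes; the other sizes are the ones quoted above.

```python
def reduction_percent(original_bytes: int, compressed_bytes: int) -> float:
    """Percentage by which a file shrank relative to its original size."""
    return 100.0 * (1 - compressed_bytes / original_bytes)


png_size = 40_468         # PNG on disk, as quoted in the post
zipped_png_size = 38_394  # the same PNG after zipping
target_size = 1_024       # assumption: "a kilobyte" means 1,024 bytes

zip_gain = reduction_percent(png_size, zipped_png_size)
needed = reduction_percent(png_size, target_size)

print(f"zip on the PNG saves {zip_gain:.1f}%")
print(f"hitting the target needs {needed:.1f}%")
```

The gap between those two figures is the whole challenge: generic compression of the PNG gets nowhere near the target.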

Now, what if I told you there was a way to get there with no loss of information?

Unbelievable?

The above diagram is the standard example on the ditaa.org website. The website hosts the eponymous ditaa tool, which is Free Software under the GNU GPL. Ditaa converts ASCII art into image files. Aha!

The above diagram is generated from an ASCII art text file that weighs in at 1,569 bytes. That is still about 50% above our target, but since ASCII art is highly repetitive and therefore very space-inefficient, it can be zipped further into a file that is just 713 bytes in size. Compared to the 40,468-byte PNG image that it captures in a lossless representation, that's a size reduction of 98%!
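To get a feel for why ASCII art squeezes down so well, here is a rough Python sketch using the standard `zlib` module, which implements the same DEFLATE algorithm that zip normally uses. The diagram in it is a made-up stand-in, not the one from this post, and the `cBLU` tag is my assumption of how a ditaa colour code sits inside a box. A real .zip file also carries some container overhead, so sizes will not match the zip figures above exactly.

```python
import zlib

# Illustrative ditaa-style diagram (NOT the one from this post).
# "cBLU" is assumed here to be a ditaa colour code placed inside a box.
diagram = (
    "+----------+     +----------+     +----------+\n"
    "| cBLU     |     |          |     |          |\n"
    "|  Client  +---->|  Server  +---->| Database |\n"
    "|          |     |          |     |          |\n"
    "+----------+     +----------+     +----------+\n"
)

raw = diagram.encode("ascii")
compressed = zlib.compress(raw, 9)  # DEFLATE at the highest compression level

# Lossless: decompressing gives back exactly the same bytes.
restored = zlib.decompress(compressed)

print(f"raw text: {len(raw)} bytes, compressed: {len(compressed)} bytes")
print("round trip identical:", restored == raw)

# The figures quoted in the post, for comparison.
png_size, text_size, zipped_text_size, target = 40_468, 1_569, 713, 1_024
over_target = 100.0 * (text_size / target - 1)
overall = 100.0 * (1 - zipped_text_size / png_size)
print(f"plain text is {over_target:.0f}% over the target")
print(f"zipped text is {overall:.1f}% smaller than the PNG")
```

The long runs of dashes and spaces are exactly what DEFLATE is good at, which is why the text shrinks so much while the already-compressed PNG barely moves.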

Here's what the ASCII art version looks like (this is a PNG screenshot, by the way ;-).


[How does one draw ASCII art in the first place? Isn't that a pain? Not really. There's another wonderful tool over at www.asciiflow.com that provides a simple UI to create ASCII art diagrams (doesn't work with IE). Use the tool and follow the conventions specified by ditaa, and you can create really compact diagrams that can be expanded whenever required.]

To me, it's not the compression ratio that's so cool. It's the whole approach to representing technical diagrams. The efficiency is in being able to capture the basics of representation in such a minimalistic way, yet sacrifice nothing in terms of the output's visual appeal.

Full marks to asciiflow and ditaa!

3 comments:

Gladston Arulanandam said...

Interesting, but I doubt there will be any real use, since storage and bandwidth are inexpensive compared with the constraints this ASCII art imposes. If we limit drawing to predefined shapes, we can go for even higher compression.

Gladston

Jim Webber said...

Actually it does lose data. The original image has colours which might be semantically important. The ASCII art version does not.

Ganesh Prasad said...

Jim,

The ASCII art version *is* the original from which the PNG image is generated (by ditaa). And it has the colours encoded inside the figures.

Ganesh