Compressions comparison

From VbGORE Visual Basic Online RPG Engine

Contents

[edit] What is compression?

A compression is an algorithm that attempts to decrease the amount of bytes in a series of bytes. This can be done in two styles - lossy or lossless. Lossy compressions are common in media, such as in GIF and JPG images or movie formats. Some important information is lost (ie the visual pixels), but the file size is greatly reduced. If you had a movie running with knowing every pixel for every frame in 500x500 resolution at 30 frames per second in 32-bit (4 byte) pixel depth, for instance, you would be using 4x500x500x30 bytes (28.6 MBytes) per second just for the pixels. This would make an hour-long movie over 100 Gigabytes. Lossless compression, on the other hand, just rearranges the data to be smaller when compressed, but when decompressed it create the exact same original decompressed file. In this article, we will be addressing only lossless compression.

[edit] Compression in vbGORE

Compression algorithms, along with times to use the compression, varies greatly from game to game. Some games use compression on textures to reduce the RAM usage at a cost of CPU usage, while other games use compression to reduce the size of easy-to-compress (ie redundant data) files to reduce disk space usage. In vbGORE, the only default usage of compression is in the update server, so that is where we will be concentrating our information.

vbGORE uses compression in the update server for only one thing - less bandwidth usage. The cost of doing this is that running the update server with new files requires them being compressed (old files are preserved in compressed format and their integrity is checked with a MD5 hash to see if they have changed) at runtime. Along with this, the client has to decompress the files. This means that even if you don't mind running a compression on some files for 5 days straight on your powerful computer to get the best compression, it isn't always a logical method. For example, PAQ8l, a compression method that comes with vbGORE, creates the best compression by far out of all of the other compressions that come with vbGORE, but the time it takes for compressing and decompressing files is pure insanity.

[edit] Compression algorithms

The only compression algorithms we are going to be using in this article is the ones that come with vbGORE. They are the following:

PAQ8l
PAQ8l is the March 8, 2007 edition of the PAQ compression series. It is one of the best compression algorithms around, but also is easily the slowest and most memory intensive compressions in this list. Only recommended for rare, special occasions.
LZMA
LZMA, or Lempel-Ziv-Markov chain-Algorithm, is an algorithm developed in 1998 and used in 7-Zip in the 7z format. It offers great compression at a decent speed. Its decompression is often a few times faster than compression, though.
RLE
RLE, or Run-Length Encoding, is a very simple compression that takes runs of data (ie AAAAAXAAA) and turns it into a shorter list (ie 5AX3A). This algorithm is very fast, though slightly crippled from being implemented directly into the engine with VB.
RLE Looped
RLE Looped is the exact same as RLE, but the algorithm is run over and over until no more compression can be achieved. The result is a slightly slower, yet very slightly better compression than RLE.
Deflate64
Deflate64 is an enhanced version of Deflate, an algorithm that uses a combination of LZ77 and Huffman encoding techniques to achieve compression. Although it has a very slow compression, its decompression is very fast (can even reach over 100x faster than the compression). In result, it makes a brilliant compression that is client-friendly.

[edit] Benchmarks

The following benchmarks are performed on an Intel T1300 1.66Ghz processor with 502MB of RAM laptop. All encryption and decryption was done with the File Processor tool of vbGORE in compiled mode.

The files compressed were all the files required by the client in the v0.5.2 release. This consisted of 231 files totaling at 15,473,310 bytes in size. Files were only compressed individually - no archiving.

  • Size is represented in percentage of original size, so 20% means that the compressed size is OriginalSize * 0.2 bytes
  • All times are presented in seconds
PAQ8l LZMA RLE RLE Looped Defate64
Size 56.4% 65.1% 97.8% 95.3% 71.0%
Compress time 2350.94 14.54 3.35 19.28 229.93
Decompress time 2253.39 5.02 0.98 2.88 3.82

[edit] Which algorithm to use?

Now that we know the benchmarks of the 5 algorithms, how do we decide which to use? First, we want to take into consideration that compression time is not an important factor for the update server. It is nice to have it compress fast, but I'm sure most people here wouldn't mind waiting an extra hour to gain a bit more compression. The two important things to look at, then, is the size and decompression time. Size has a heavier importance than decompression time, but PAQ8l is completely out until it can manage compressing down to 5% of the original size. A full game distribution using PAQ8l is going to run your decompression time into hours, if not days. If a client doesn't have over about 250MByes of RAM to spare to the PAQ8l decompression process, you can expect the time to boost significantly higher quickly.

RLE and RLE Looped are very fast, but their compression rate is way too weak, especially with RLE. We'll hardly be saving any bandwidth with these two. That brings us down to the last two - LZMA and Deflate64.

LZMA offers a better compression but at a slower decompression time. Though, 5.02 seconds to decompress a series of so many files is hardly going to be a problem. For a full-sized game, this would still take less than a few minutes for a complete update. The battle would be closer if Deflate64 had a same compress time as LZMA, but the compression time is just insanely high compared to LZMA.

So in result, in my personal preference, the best algorithm to use for the update server would be LZMA. If you really wanted to, PAQ8l can be used if you don't care that your users will have to suffer a great downtime to extract the files. If you use PAQ8l, you better hope your updates are very small and not frequent.

[edit] Related pages

Personal tools