Difference between revisions of "Data compression algorithms and tools"

From Archiveteam

Revision as of 03:45, 7 May 2013

This list contains the most popular data compression algorithms and tools. Most of them are open source, an important detail if you want to preserve data for the long term and still be able to decompress it in the future.

7z

Bzip2

Gzip
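
A minimal round-trip with gzip (the file names here are just placeholders; the same pattern works with bzip2 by swapping in bzip2/bunzip2):

```shell
# Compress with the best ratio (-9), writing to stdout (-c) so the
# original file is left untouched, then decompress and verify.
printf 'hello compression\n' > sample.txt
gzip -9 -c sample.txt > sample.txt.gz
gunzip -c sample.txt.gz > restored.txt
cmp sample.txt restored.txt && echo "round-trip OK"
```

Using -c on both sides keeps the input file intact, which is handy when archiving: you can verify the round-trip before deleting anything.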

Zip

  • Available by default in every modern version of Windows; if you need cross-platform support, use 7-Zip.

KGB

The best compression ratio (better than 7z), but a bit slow.

You can install it in Ubuntu with: sudo apt-get install kgb

How to:

  • kgb -m file.kgb originalfile
  • m is a number from 0 to 9 (0 gives the lowest compression ratio, 9 the highest; the higher levels use up to 1616 MB of RAM and a lot of CPU time)
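
A sketch of the invocation above, using level 4 as an arbitrary middle ground (the file names are placeholders, and the script skips gracefully if kgb is not installed):

```shell
# Create a small test input, then compress it with kgb at level 4,
# following the "kgb -m archive file" syntax described above.
echo "sample data" > originalfile
if command -v kgb >/dev/null 2>&1; then
    kgb -4 file.kgb originalfile
else
    echo "kgb not installed; skipping"
fi
```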

lrzip

"This is a compression program optimised for large files" -lrzip readme

lrzip is fantastic for archiving - the compression ratio improves as the size of the input file grows - albeit a terribly slow compressor. lrzip really shines when compressing large sets of redundant information that is distant and otherwise unconnected; general-purpose compression algorithms would never see this redundancy, given their tiny compression windows.
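
A quick way to see the window limitation, using gzip as the stand-in (its DEFLATE window is only 32 KB; lrzip itself may not be installed, and the file names and sizes here are arbitrary):

```shell
# Build a file whose redundancy is too "distant" for gzip: two identical
# 1 MB random blocks separated by 4 MB of unrelated random data.
head -c 1048576 /dev/urandom > blockA
head -c 4194304 /dev/urandom > filler
cat blockA filler blockA > distant.bin
gzip -9 -c distant.bin > distant.bin.gz
ls -l distant.bin distant.bin.gz
# gzip barely shrinks the file: the repeated 1 MB block sits ~5 MB away,
# far outside its 32 KB window, so the second copy is stored again in
# full. A long-range tool such as lrzip can match it and save ~1 MB.
```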

lrzip benchmarks

External links