This is a collection of commands you can use to compress files with the highest compression levels possible.
This cheatsheet primarily uses XZ/LZMA, as it has the best compression ratio most of the time (but not always). The algorithm is generally very slow to compress, but decompression is much faster. Similarly, compression usually requires a lot of RAM, while decompression needs much less. The larger the file you're trying to compress, the more memory you will need.
Tools labeled (raw) below are programs that work on data streams. They do not understand filesystems and therefore cannot be used to compress multiple files or even directories. For such tasks you need an archiver (e.g. tar, 7-Zip, etc.). They will be labeled as (archiver).
Archivers like tar usually work by first creating an uncompressed archive file that is essentially a container for multiple files and directories, and then passing it to a compressor like gzip or xz.
flowchart LR
subgraph src[Files to compress]
direction TB
s_f1[File 1]
s_f2[File 2]
s_f3[File 3]
s_d1[Directory 1]
s_d2[Directory 2]
s_d3[Directory 3]
s_others[...]
end
subgraph raw_tar[my_files.tar]
direction TB
t_f1[File 1]
t_f2[File 2]
t_f3[File 3]
t_d1[Directory 1]
t_d2[Directory 2]
t_d3[Directory 3]
t_others[...]
end
tar([tar])
gzip([gzip])
result([my_files.tar.gz])
src-->tar
tar-->raw_tar
raw_tar-->gzip
gzip-->result
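The flow in the diagram can be reproduced by hand. The sketch below is a minimal illustration, assuming tar and gzip are installed; the directory and file names (`demo`, `file1.txt`, `dir1`, `file2.txt`) are made up for the example.

```shell
# Work in a throwaway directory so nothing existing is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical sample input: one file and one directory.
mkdir -p demo/dir1
echo "hello" > demo/file1.txt
echo "world" > demo/dir1/file2.txt

# Step 1: tar bundles everything into one uncompressed container.
tar -cf my_files.tar demo

# Step 2: gzip (a raw compressor) compresses that single file,
# replacing my_files.tar with my_files.tar.gz.
gzip -9 my_files.tar

# List the archive contents to confirm both steps worked.
tar -tzf my_files.tar.gz
```

In practice the two steps are usually combined, e.g. `tar -czf my_files.tar.gz demo`, which sends the archive through gzip without writing the intermediate `.tar` file to disk.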
7z a -t7z -m0=lzma2 -mx=9 -mfb=64 -md=32m -ms=on -mmt=2 archive.7z dir1 dir2 file1 file2...
Some notes
This command limits compression to 2 threads (-mmt=2). For some reason, this specific implementation achieves better compression ratios when using only 2 threads. You can remove the flag if you're willing to trade some compression for speed, but don't expect much of a speed boost, especially when compressing a single large file.
xz -z -k -9 -e -T32 -vvv some_large_file
Adjust -T32 to however many threads you want to use. If you want to use your entire CPU, set it to the number of its threads.
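If you would rather not hard-code the thread count, it can be looked up at run time. This is a minimal sketch, assuming `nproc` (GNU coreutils) is available; `sample.txt` is a made-up stand-in for `some_large_file`, and the `-6` level is used here only to keep the demo light on memory.

```shell
# Create a small throwaway file to compress (stand-in for some_large_file).
workdir="$(mktemp -d)"
sample="$workdir/sample.txt"
seq 1 100000 > "$sample"

# Ask the OS how many processing units are available.
threads="$(nproc)"

# Pass the detected thread count to xz; -k keeps the original file.
# For real use, swap -6 for -9 -e as in the command above.
xz -z -k -6 -T"$threads" "$sample"

ls -l "$sample" "$sample.xz"
```

Alternatively, `-T0` tells xz to pick a thread count based on the number of processor cores it detects.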
Some notes
The -vvv flag just enables verbose logging, which is very useful if you want to know how much RAM will be required during compression and decompression (only estimates, but still useful).
For example, trying to compress a 256GB RAM disk image requires 40GB of RAM, but decompression will only need 65MB.
xz: Filter chain: --lzma2=dict=64MiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: Using up to 32 threads.
xz: 39 972 MiB of memory is required. The limiter is disabled.
xz: Decompression will need 65 MiB of memory.
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: Using up to 32 threads.
xz: 5 284 MiB of memory is required. The limiter is disabled.
xz: Decompression will need 9 MiB of memory.
server_disk.img (1/1)
100 % 77,1 GiB / 238,5 GiB = 0,323 115 MiB/s 35:22
xz: Filter chain: --lzma2=dict=64MiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: Using up to 31 threads.
xz: 38 723 MiB of memory is required. The limiter is disabled.
xz: Decompression will need 65 MiB of memory.
server_disk.img (1/1)
100 % 74,5 GiB / 238,5 GiB = 0,312 70 MiB/s 58:23
If you don't have enough RAM, you can try one or more of the following:
- Lower the compression level (adjust -9 to values between -1 and -9)
- Remove the -e flag (extreme mode - uses more CPU time to compress better)
- Lower the number of threads used.
Do not rely on swap! If you don't have enough memory, the OS will try to use swap, which is much slower than RAM and will make compression slower. Use the tips above to lower the memory requirements.
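As a rough illustration of the tips above, the sketch below applies all three at once: a lower level, no `-e`, and fewer threads. The file name and the particular values (`-6`, `-T2`) are arbitrary choices for the example, not recommendations based on measurements.

```shell
# Small throwaway input (stand-in for a real large file).
workdir="$(mktemp -d)"
sample="$workdir/sample.txt"
seq 1 100000 > "$sample"

# Lower level (-6 instead of -9), no -e, and only 2 threads.
# -vvv prints the filter chain and the estimated memory requirements.
xz -z -k -6 -T2 -vvv "$sample"

# Check the integrity of the result without writing any output.
xz -t "$sample.xz"
```

Comparing the "memory is required" line in the verbose output between runs is a quick way to see how much each change saves before committing to a long compression job.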
lrzip -z very_large_file
Some notes
server_disk_sparsed.img - Compression Ratio: 4.232. Average Compression Speed: 44.741MB/s.
Total time: 01:30:58.33
End result: 60.5GB
takeout.tar - Compression Ratio: 8.489. Average Compression Speed: 11.336MB/s.
Total time: 00:01:46.95
End result: 143MB
This uses the GZIP compression algorithm instead of XZ. It is generally faster but has worse compression ratios. It can be very effective when compressing text.
gzip -9 -c some_large_file.ext > result.ext.gz
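To get a feel for how well gzip handles text, you can compare sizes before and after. This is a small sketch using a generated, highly regular text file (`sample.txt` and `result.txt.gz` are made-up names; real text will usually compress less dramatically).

```shell
workdir="$(mktemp -d)"
sample="$workdir/sample.txt"

# Generate a text file of numbered lines as a stand-in for a log or CSV file.
seq 1 100000 > "$sample"

# -c writes the compressed stream to stdout, so the original is left
# untouched and the output name can be chosen freely.
gzip -9 -c "$sample" > "$workdir/result.txt.gz"

# Compare byte counts of the original and the compressed file.
original_size="$(wc -c < "$sample")"
compressed_size="$(wc -c < "$workdir/result.txt.gz")"
echo "original: $original_size bytes, compressed: $compressed_size bytes"
```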