Message ID: 515     Entry time: Fri Dec 2 06:41:25 2016
Author: DK 
Subject: Benchmarks for compression 
R45_17 was used as the test file. It was closed at 9.02 AM on Friday, December 2nd, at the end of the official
parasitic machine time on Fallon et al.

===Summarized results===
Initial file is 2.0 GB of AIDA data

LZMA: 862M; Time: 23 min
BZ2: 1.3G; Time: 6.5 min
GZ: 1.3G; Time: 3 min

The LZMA results are about as I expected: fantastic compression, but it takes a very long time to pack the data.
The BZ2 result is a bit surprising; in my experience it is usually about 10 or 15% smaller than GZ for hex data.
I conclude that we should use GZ (which we are already doing, but now we have confirmed that it gives the best
trade-off between time and disk space usage).
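
As a rough check of the sizes quoted above, the fraction of the original that each compressed file occupies can be worked out with integer shell arithmetic. This is only a sketch: the helper name percent_of is made up for illustration, sizes are in MiB, and the 1.3G entries are taken as roughly 1331 MiB, which is an assumption because ls -h rounds the displayed size.

```shell
# percent_of PART WHOLE: print PART as an integer percentage of WHOLE (truncated).
percent_of() {
    echo $(( $1 * 100 / $2 ))
}

orig=2000   # 2097152000 bytes = 2000 MiB
echo "LZMA: $(percent_of 862 "$orig")% of original"    # 862M as shown by ls -h
echo "BZ2:  $(percent_of 1331 "$orig")% of original"   # 1.3G ~ 1331 MiB (assumed)
echo "GZ:   $(percent_of 1331 "$orig")% of original"   # 1.3G ~ 1331 MiB (assumed)
```

On these numbers LZMA leaves roughly 43% of the original size, while BZ2 and GZ both leave roughly two thirds.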

15:49 We are continuing to compress the data.  Presently we are somewhere around R39_20, working sequentially.  I
estimate that there are 578 runs remaining, and naively assume each is 2 GB (some at the end of RXX_ may be less).

At 3 minutes each, compressing 578 runs takes about 29 hours (578 x 3 min = 1734 min), so it should be finished
by tomorrow evening, say around 20:00 or a little later, depending on the fluctuation.
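
The estimate above can be reproduced with shell integer arithmetic. This is a sketch, not the procedure actually used: the function name estimate_total is hypothetical, 180 s is the rounded 3 minutes per run, and 194 s is the measured gzip wall-clock time of 3:13.98 (from the full details of the test) rounded to whole seconds.

```shell
# estimate_total RUNS SECONDS_PER_RUN: print the total time as "H h M min".
estimate_total() {
    total_s=$(( $1 * $2 ))
    echo "$(( total_s / 3600 )) h $(( (total_s % 3600) / 60 )) min"
}

estimate_total 578 180   # rounded figure: 3 min per run
estimate_total 578 194   # measured figure: 3:13.98 per run, rounded to 194 s
```

With the rounded figure this gives 28 h 54 min, matching the roughly 29 hours quoted. With the measured time it comes to a little over 31 hours, so the finish could slip by a couple of hours if every run really is 2 GB.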

===Full details of the test===


npg@aidas1 ~/benchmarks % ls -altr
total 2048024
-rw-r--r--.  1 npg npgstaff 2097152000 Dec  2 13:58 R45_17
drwxrwxr-x. 55 npg users          4096 Dec  2 14:01 ..
drwxr-xr-x.  2 npg npgstaff       4096 Dec  2 14:01 .

First test with lzma

npg@aidas1 ~/benchmarks % time tar cvf R45_17.lzma --lzma R45_17
R45_17
tar cvf R45_17.lzma --lzma R45_17  1388.10s user 15.63s system 100% cpu 23:13.94 total
npg@aidas1 ~/benchmarks % ls -altrh
total 2.8G
-rw-r--r--.  1 npg npgstaff 2.0G Dec  2 13:58 R45_17
drwxrwxr-x. 55 npg users    4.0K Dec  2 14:08 ..
-rw-r--r--.  1 npg npgstaff 862M Dec  2 14:31 R45_17.lzma
npg@aidas1 ~/benchmarks % 

As expected, the compression is very good (the file shrinks by more than 50%, from 2.0G to 862M), but this is much too slow to be practical.

Next we try bz2.

npg@aidas1 ~/benchmarks % time tar cvjf R45_17.tar.bz2 R45_17
R45_17
tar cvjf R45_17.tar.bz2 R45_17  375.22s user 6.68s system 97% cpu 6:31.74 total
npg@aidas1 ~/benchmarks % ls -altrh
total 4.1G
-rw-r--r--.  1 npg npgstaff 2.0G Dec  2 13:58 R45_17
-rw-r--r--.  1 npg npgstaff 862M Dec  2 14:31 R45_17.lzma
-rw-r--r--.  1 npg npgstaff  737 Dec  2 14:31 results.txt
drwxrwxr-x. 55 npg users    4.0K Dec  2 14:32 ..
-rw-------.  1 npg npgstaff  12K Dec  2 14:33 .results.txt.swp
drwxr-xr-x.  2 npg npgstaff 4.0K Dec  2 14:34 .
-rw-r--r--.  1 npg npgstaff 1.3G Dec  2 14:40 R45_17.tar.bz2


npg@aidas1 ~/benchmarks % time tar cvzf R45_17.tar.gz R45_17
R45_17
tar cvzf R45_17.tar.gz R45_17  188.01s user 6.53s system 100% cpu 3:13.98 total

npg@aidas1 ~/benchmarks % ls -altrh
total 5.4G
-rw-r--r--.  1 npg npgstaff 2.0G Dec  2 13:58 R45_17
-rw-r--r--.  1 npg npgstaff 862M Dec  2 14:31 R45_17.lzma
-rw-r--r--.  1 npg npgstaff  737 Dec  2 14:31 results.txt
drwxrwxr-x. 55 npg users    4.0K Dec  2 14:32 ..
-rw-r--r--.  1 npg npgstaff 1.3G Dec  2 14:40 R45_17.tar.bz2
drwxr-xr-x.  2 npg npgstaff 4.0K Dec  2 14:44 .
-rw-r--r--.  1 npg npgstaff 1.3G Dec  2 14:47 R45_17.tar.gz
-rw-------.  1 npg npgstaff  12K Dec  2 14:50 .results.txt.swp


Conclusion: gzip is the best choice in terms of time efficiency.
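
For repeating this comparison on another run file, a minimal sketch follows. It differs from the test above: it calls the compressors directly instead of going through tar, sends the output to a pipe instead of writing an archive, and uses xz (the .xz container) in place of tar's --lzma. The function name bench is hypothetical, and timing via date +%s only has one-second resolution.

```shell
# bench FILE: compress FILE with each available tool and report size and time.
bench() {
    file=$1
    for tool in gzip bzip2 xz; do
        # Skip any compressor that is not installed on this machine.
        command -v "$tool" >/dev/null 2>&1 || continue
        start=$(date +%s)
        bytes=$("$tool" -c "$file" | wc -c)   # count compressed bytes, keep nothing on disk
        end=$(date +%s)
        echo "$tool $((bytes)) bytes $((end - start)) s"
    done
}
```

Running bench R45_17 in the benchmarks directory would print one line per compressor, which can then be compared with the figures in the summary.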