About GDCC 2021
This competition evaluates algorithms and their implementations for universal lossless data compression, rather than compression tailored to particular data types. We test compressors under the following scenarios:
Test 1:
Qualitative-data compression
Filtered Chinese Wikipedia DB data
Test 2:
Quantitative-data compression
16-bit multispectral images, 16-bit integer and 32-bit floating point telemetry data
Test 3:
Mixed-data compression
Preprocessed ARM64 executable files mixed with scientific data containing 32-bit floating-point numbers and 32-bit integers
Test 4:
Small-block-data compression
Mixture of Test 1 and Test 3 data to be compressed independently in 64 KiB blocks but allowing random-access decompression in 8 KiB blocks
Test 5:
Student test
Participants optimize a provided compressor by generating a parameter file that minimizes the compressed-data size for Test 1 data.
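To illustrate the small-block scenario of Test 4, here is a minimal sketch of compressing data in independent blocks so that any single block can be decompressed on its own. It is not the contest's reference tooling: zlib stands in for a real submission, the function names `compress_blocks` and `read_block` are hypothetical, and a single block size is used because the text above does not spell out how the 64 KiB and 8 KiB sizes interact.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KiB, the compression block size named for Test 4


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Compress each block separately, so no block depends on another."""
    return [
        zlib.compress(data[i:i + block_size], 9)
        for i in range(0, len(data), block_size)
    ]


def read_block(blocks: list, index: int) -> bytes:
    """Decompress one block without touching its neighbours (random access)."""
    return zlib.decompress(blocks[index])


# Usage: 256 KiB of sample input splits into four independent blocks.
data = bytes(range(256)) * 1024
blocks = compress_blocks(data)
third = read_block(blocks, 2)
```

Because each block starts from an empty compressor state, small blocks usually compress worse than one large stream, which is what makes this scenario a distinct test.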
Categories
We impose speed limits to separate each of Tests 1 through 4 into three subcategories: rapid compression, balanced compression and high compression ratio (HCR). All told, the result is 12 categories and leaderboards, each with its own prizes.
2021 prize winners
Qualitative data
Quantitative data
Image data
Student
Board of experts of GDCC 2021
2021 leaderboards
- Rapid
- Balanced
- High compression ratio
General notes
- The leaderboard tables below contain results for contest submissions and selected publicly available compressors. The names of submitted compressors appear in boldface.
- See “Ranking” for rules governing how we order the results.
- When possible, we set compressor options to use just one thread for publicly available compressors. Some programs, however, may (and did) use multiple threads. Because we declined to fine-tune presets to fit the speed limits as tightly as possible, the compressors are not aligned by speed. Therefore, these results SHOULD NOT be used to draw conclusions about publicly available compressors such as “compressor X is better than compressor Y.”
- HCR stands for “High Compression Ratio”.
Ranking
For the “balanced” and “high compression ratio” categories we rank compressors according to the following metric:
c_full_size = compressed-data size + compressed-decompressor size
First place goes to the compressor with the smallest c_full_size.
We compress decompressors using bzip2 v.1.0.8 with the “-9” setting.
For the rapid categories we rank according to the function:
f = c_time + 2·d_time + c_full_size / 10⁶,
where c_time and d_time are, respectively, the compression and decompression times in seconds, and c_full_size is in bytes.
First place goes to the compressor with the smallest value for f.
The compressors that fell just short of a given speed category appear at the bottom of the corresponding table. Submissions that failed to fully comply with the rules (in particular, the rule that every compressor must correctly decode the compressed files for all four tests) are also at the bottom.
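The two ranking metrics above can be sketched as follows. This is an illustration under stated assumptions, not the organizers' scoring script: the function names are hypothetical, and Python's `bz2` module at level 9 stands in for the `bzip2 -9` command line, so byte counts may not match the official tool exactly.

```python
import bz2


def c_full_size(compressed_data_size: int, decompressor: bytes) -> int:
    """Compressed-data size plus the size of the bzip2-compressed decompressor."""
    packed_decompressor = bz2.compress(decompressor, compresslevel=9)
    return compressed_data_size + len(packed_decompressor)


def rapid_score(c_time: float, d_time: float, full_size: int) -> float:
    """f = c_time + 2*d_time + c_full_size / 10^6 (times in seconds, size in bytes)."""
    return c_time + 2 * d_time + full_size / 1e6


# Usage with made-up numbers: 10 s to compress, 5 s to decompress,
# and a c_full_size of 150,000,000 bytes.
score = rapid_score(10.0, 5.0, 150_000_000)  # 10 + 10 + 150 = 170.0
```

In the rapid metric, decompression time counts twice as much as compression time, and every million bytes of c_full_size costs the same as one second of compression.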
Charts for leaderboards
General notes
- The line joining the markers for different compressors on the scatter plot shows the Pareto frontier. That is, for each compressor on the line, no other analyzed program in that category achieves better results for both the selected time parameter and the selected compression parameter.
- The names of submitted compressors appear in boldface.
- The names of submitted compressors that failed to fully comply with the competition rules appear in strikethrough.
Chart selection: Test 1 to Test 4, each in the Rapid, Balanced and HCR categories.
Time axis: full time, compression time or decompression time.
Size axis: c_full_size (megabytes), compression ratio (bits per byte) or compression degree.
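The Pareto frontier described in the general notes above can be computed with a short sketch like the one below. It follows the stated definition literally: a compressor stays on the frontier unless some other compressor is better on both time and size. The function name, the tuple layout and the sample figures are assumptions made for illustration, not contest data.

```python
def pareto_frontier(results):
    """Return names of compressors not beaten on both time and size.

    results: list of (name, time_seconds, size_bytes); lower is better for both.
    """
    frontier = []
    for name, time_s, size in results:
        dominated = any(
            other_time < time_s and other_size < size
            for _, other_time, other_size in results
        )
        if not dominated:
            frontier.append(name)
    return frontier


# Usage with made-up results: C is slower and larger than B, so it drops out.
sample = [("A", 10, 500), ("B", 20, 400), ("C", 25, 450), ("D", 40, 300)]
frontier = pareto_frontier(sample)  # ["A", "B", "D"]
```

The pairwise check is quadratic in the number of compressors, which is fine for leaderboard-sized inputs.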