Luca Gaetano Amarù, Andreas Peter Burg, Giovanni De Micheli, Pierre-Emmanuel Julien Marc Gaillardon
Nowadays, most software and hardware applications are committed to reduce the footprint and resource usage of data. In this general context, lossless data compression is a beneficial technique that encodes information using fewer (or at most equal number of) bits as compared to the original representation. A traditional compression flow consists of two phases: data decorrelation and entropy encoding. Data decorrelation, also called entropy reduction, aims at reducing the autocorrelation of the input data stream to be compressed in order to enhance the efficiency of entropy encoding. Entropy encoding reduces the size of the previously decorrelated data by using techniques such as Huffman coding, arithmetic coding, and others. When the data decorrelation is optimal, entropy encoding produces the strongest lossless compression possible. While efficient solutions for entropy encoding exist, data decorrelation is still a challenging problem limiting ultimate lossless compression opportunities. In this paper, we use logic synthesis to remove redundancy in binary data aiming to unlock the full potential of lossless compression. Embedded in a complete lossless compression flow, our logic synthesis based methodology is capable to identify the underlying function correlating a data set. Experimental results on data sets deriving from different causal processes show that the proposed approach achieves the highest compression ratio compared to state-of-art compression tools such as ZIP, bzip2 and 7zip.