python-isal
Faster zlib and gzip compatible compression and decompression by providing Python bindings for the ISA-L library.
This package provides Python bindings for the ISA-L library. The Intel Infrastructure Storage Acceleration Library (ISA-L) implements several key algorithms in assembly language. This includes a variety of functions to provide zlib/gzip-compatible compression.
python-isal provides these bindings through the isal_zlib and igzip modules, which are usable as drop-in replacements for the zlib and gzip modules from the stdlib (with some minor exceptions, see below).
Usage
python-isal has faster versions of the stdlib’s zlib and gzip modules. These are called isal_zlib and igzip respectively.
They can be imported as follows:
from isal import isal_zlib
from isal import igzip
isal_zlib and igzip are meant to be used as drop-in replacements, so their API and functions are the same as those of the stdlib’s modules, except where ISA-L does not support the same calls as zlib (see the differences below).
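As a minimal sketch of the drop-in usage described above, the snippet below round-trips some bytes through both modules. The fallback to the stdlib modules on ImportError is an assumption added for illustration, not something python-isal prescribes, and the payload and file name are arbitrary.

```python
import os
import tempfile

try:
    # Preferred: the ISA-L backed modules.
    from isal import igzip, isal_zlib
except ImportError:
    # Assumption for this sketch: fall back to the stdlib when isal is absent.
    import gzip as igzip
    import zlib as isal_zlib

payload = b"python-isal example payload " * 1000

# In-memory round trip using the zlib-style API.
compressed = isal_zlib.compress(payload)
restored = isal_zlib.decompress(compressed)

# File round trip using the gzip-style API.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.txt.gz")
    with igzip.open(path, "wb") as handle:
        handle.write(payload)
    with igzip.open(path, "rb") as handle:
        restored_from_file = handle.read()
```

Both calls rely on the default compression level, which sidesteps the difference in level ranges between the stdlib and ISA-L.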
Full API documentation can be found on our readthedocs page.
python -m isal.igzip implements a simple gzip-like command line application (just like python -m gzip).
Installation
Installation with pip
Linux and macOS: pip install isal. Wheels are provided, so installation should be almost instantaneous.
Windows: Installation is not supported yet.
The installation will include a statically linked version of ISA-L. If a wheel is not provided for your system, the installation will first build ISA-L in a temporary directory. Please check the ISA-L homepage for the build requirements.
The latest development version of python-isal can be installed with:
pip install git+https://github.com/rhpvorderman/python-isal.git
This requires the ISA-L build requirements to be installed. If you wish to link dynamically against a version of libisal installed on your system, use:
PYTHON_ISAL_LINK_DYNAMIC=true pip install isal
ISA-L is available in numerous Linux distros as well as on conda via the conda-forge channel. Check out the ports documentation on the ISA-L project wiki to find out how to install it. It is important that the development headers are also installed.
On Debian and Ubuntu the ISA-L libraries (including the development headers) can be installed with:
sudo apt install libisal-dev
Installation via conda
python-isal can be installed via conda, for example using the miniconda installer with a properly set up conda-forge channel. If you use it alongside bioinformatics tools, the bioconda setup instructions provide a clear guide to configuring conda.
python-isal is available on conda-forge and can be installed with:
conda install python-isal
This will automatically install the ISA-L library dependency as well, since it is available on conda-forge.
Differences with zlib and gzip modules
Compression level 0 in zlib and gzip means no compression, while in isal_zlib and igzip this is the lowest compression level. This is a design choice that was inherited from the ISA-L library.
Compression levels range from 0 to 3, not 1 to 9.
zlib.Z_DEFAULT_STRATEGY, zlib.Z_RLE etc. are exposed as isal_zlib.Z_DEFAULT_STRATEGY, isal_zlib.Z_RLE etc. for compatibility reasons. However, isal_zlib only supports a default strategy and will give warnings when other strategies are used.
zlib supports memory levels from 1 to 9 (default 8). isal_zlib supports the memory levels smallest, small, medium, large and largest. These have been mapped to levels 1, 2-3, 4-6, 7-8 and 9 respectively, so isal_zlib can be used with zlib-compatible memory levels.
isal_zlib has a compressobj and decompressobj implementation. However, the unused_data and unconsumed_tail attributes of the Decompress object only work properly when using gzip-compatible compression (25 <= wbits <= 31).
The flush implementation for the Compress object behaves differently from the zlib equivalent. It is sufficient for the igzip module to work in full compliance with the gzip tests from CPython. It does not, however, pass all the zlib compliance tests (see above). This is an area that still needs work.
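The level differences listed above can be sketched in plain Python. The memory-level function mirrors the mapping stated above. The function names, and the way zlib's levels 1 to 9 are scaled into the range 0 to 3, are hypothetical choices for illustration and not part of the isal API.

```python
def mem_level_name(mem_level: int) -> str:
    """Return the isal_zlib memory level name for a zlib-style memLevel (1-9)."""
    if not 1 <= mem_level <= 9:
        raise ValueError("memLevel must be between 1 and 9")
    if mem_level == 1:
        return "smallest"
    if mem_level <= 3:
        return "small"
    if mem_level <= 6:
        return "medium"
    if mem_level <= 8:
        return "large"
    return "largest"


def to_isal_level(zlib_level: int) -> int:
    """Hypothetical translation of a zlib level (1-9) to an isal level (0-3)."""
    if not 1 <= zlib_level <= 9:
        raise ValueError("expected a zlib level between 1 and 9")
    # Linear scaling: 1 -> 0, 9 -> 3, integer division for levels in between.
    return (zlib_level - 1) * 3 // 8
```

zlib level 0 is rejected on purpose in this sketch: it means "no compression" in zlib, which has no counterpart among the isal levels.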
Contributing
Please open a PR or issue if you feel anything can be improved. Bug reports are also very welcome. Please report them on the GitHub issue tracker.