Compression

JLD2 supports compression of isbits arrays. This includes the typical Array{Float64} as well as arrays of immutable custom structs whose fields are all basic number types.

To enable the default compression, you can write:

using JLD2
save("example.jld2", "large_array", zeros(10000); compress = true)

Alternatively, use

jldsave("example.jld2", true; large_array=zeros(10000))

or

jldopen("example.jld2", "w"; compress = true) do f
    f["large_array"] = zeros(10000)
end

When reading a file, JLD2 detects compression and decompresses the data automatically, so no extra parameters are needed. If the required filter package is not yet available, JLD2 will prompt you to install and load it.
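A minimal round-trip sketch of the behaviour described above: the file is written with compress = true and read back without any compression-related argument. The file name and the temporary directory are illustrative choices, not part of the original example.

```julia
using JLD2

data = zeros(10_000)

# Illustrative location; the temporary directory is removed when Julia exits.
path = joinpath(mktempdir(), "roundtrip.jld2")

# Write with the default compression filter enabled.
jldopen(path, "w"; compress = true) do f
    f["large_array"] = data
end

# Read back: no compression keyword is passed here.
restored = load(path, "large_array")

println(restored == data)
```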

Compression Filter API

JLD2 can use a number of different compression algorithms, also called filters. Filters can be used individually or chained, which can be useful for some types of data. The filter used by compress = true is the Deflate() compression filter.

Note

The default Deflate() compression is always available, but some other filters must be installed separately. JLD2 will throw an error if the required filter package is not loaded, prompting you to install and load the appropriate package, e.g. using JLD2, JLD2Lz4.

Installing Filter Packages

To use a compression filter that is not built in, you need to install and load the corresponding package:

using Pkg
# For other compression algorithms
Pkg.add("JLD2Lz4")
using JLD2, JLD2Lz4  # Load the package you need

Available Compression Filters

This compression system is analogous to that of HDF5 and uses the same underlying compression libraries. JLD2 files with compressed datasets can in many cases be opened using HDF5 and similarly, JLD2 will be able to read most HDF5 files even with compression. The compression filters available for JLD2 are:

| Filter Package | Filter Type | Notes |
|----------------|-------------|-------|
| built in | Shuffle | Rearrangement of bytes, useful as a preprocessing filter |
| built in | Deflate | Default compression, very widely used, good compatibility |
| built in | ZstdFilter | Fast, wide range of compression size vs speed trade-offs |
| JLD2Bzip2 | Bzip2Filter | Good compression ratio, can be slower |
| JLD2Lz4 | Lz4Filter | Very fast compression/decompression |

Using Specific Filters

To use a specific compression filter, pass an instance of the filter instead of true:

using JLD2, JLD2Lz4

# Using Lz4 compression
jldopen("example.jld2", "w"; compress = Lz4Filter()) do f
    f["large_array"] = zeros(10000)
end

# Zstd with non-standard compression level
jldopen("example.jld2", "w"; compress = ZstdFilter(9)) do f
    f["large_array"] = zeros(10000)
end

Using Multiple Filters

JLD2 supports combining multiple filters for advanced compression strategies. This is particularly useful when combining preprocessing filters (like shuffling) with compression filters. Simply provide a vector of filters:

using JLD2

# Combine Shuffle preprocessing with Deflate compression
filters = [Shuffle(), Deflate()]

jldopen("example.jld2", "w"; compress = filters) do f
    # Benefits from byte shuffling
    # Only the lowest byte of each element is non-zero
    # Shuffle() reorders the bytes of all elements from e.g.
    # [123123123] to [111222333]
    # where each digit refers to the nth byte of an array element.
    f["numeric_data"] = UInt.(rand(UInt8, 10000))
end

Note

Filters in a pipeline are applied in order during compression and in reverse order during decompression. Preprocessing filters (like Shuffle) should typically come before compression filters.
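To make the byte rearrangement concrete, here is a small standalone sketch in plain Julia. The helpers shuffle_bytes and unshuffle_bytes are hypothetical names written for this illustration; they are not part of JLD2 and do not reflect how JLD2 implements Shuffle internally. The sketch assumes a little-endian machine, where the first byte of each element is the least significant one.

```julia
# Hypothetical helper (not JLD2 API): group the k-th byte of every element together.
function shuffle_bytes(a::Vector{T}) where {T}
    raw = collect(reinterpret(UInt8, a))       # element-major byte layout
    mat = reshape(raw, sizeof(T), length(a))   # one column per element
    return vec(permutedims(mat))               # one column per byte position
end

# Hypothetical inverse: restore the original element-major layout.
function unshuffle_bytes(::Type{T}, bytes::Vector{UInt8}) where {T}
    n = length(bytes) ÷ sizeof(T)
    mat = reshape(bytes, n, sizeof(T))         # one column per byte position
    return collect(reinterpret(T, vec(permutedims(mat))))
end

a = UInt32[1, 2, 3]
shuffled = shuffle_bytes(a)

# On a little-endian machine the three low bytes come first,
# followed by nine zero bytes.
println(shuffled)

restored = unshuffle_bytes(UInt32, shuffled)
println(restored == a)
```

The long run of zero bytes at the end of the shuffled buffer is what a subsequent compression filter can exploit.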

Filter Configuration Examples

Different filters support various configuration options:

using JLD2, JLD2Lz4, JLD2Bzip2

# Zstd with different compression levels
zstd_fast = ZstdFilter(1)    # Fast compression
zstd_best = ZstdFilter(22)   # Best compression

# Bzip2 with custom block size
bzip2_filter = Bzip2Filter(4)

# Example usage
jldopen("example.jld2", "w") do f
    write(f, "fast_data", zeros(UInt8, 10000); compress=zstd_fast)
    write(f, "small_data", randn(10000); compress=zstd_best)
    write(f, "archive_data", randn(1000); compress=bzip2_filter)
end

Depending on the characteristics of your datasets, some configurations may be more efficient than others.
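One way to judge this for your own data is to write the same array with several configurations and compare the resulting file sizes. The following is a rough sketch using only the built-in filters named above; the dataset, labels, and file names are illustrative assumptions, and the sizes you observe will depend on your data and package versions.

```julia
using JLD2

# Small integers stored in a wide type: most bytes are zero.
data = UInt64.(rand(UInt8, 100_000))

# Candidate configurations; `false` means no compression.
configs = [
    "none"            => false,
    "deflate"         => Deflate(),
    "zstd"            => ZstdFilter(),
    "shuffle_deflate" => [Shuffle(), Deflate()],
]

dir = mktempdir()  # illustrative scratch directory
sizes = Dict{String,Int}()

for (name, filt) in configs
    path = joinpath(dir, name * ".jld2")
    jldopen(path, "w"; compress = filt) do f
        f["data"] = data
    end
    sizes[name] = filesize(path)
end

# Print the sizes in the order the configurations were listed.
for (name, _) in configs
    println(rpad(name, 18), sizes[name], " bytes")
end
```

File size is only one side of the trade-off; timing the writes and reads on representative data gives the other.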

Manually selecting compression for datasets

Sometimes you may know that some of your arrays are easily compressible and that for others it is not worth the effort. For precise control, the write function takes an optional keyword argument to override the file compression settings.

using JLD2

jldopen("example.jld2", "w"; compress=ZstdFilter()) do f
    # This gets compressed with the ZstdFilter
    write(f, "default_array", zeros(10000))

    # Don't compress this
    write(f, "random_array", rand(10000); compress=false)

    # Override the above compression filter and use a different one
    write(f, "zlib_array", zeros(10000); compress=Deflate())

    # Alternatively, use the same filter but with different configuration
    write(f, "fast_compressed", rand(10000); compress=ZstdFilter(1))
end

Compatibility and Migration from v0.5 to v0.6

  • File Compatibility: Files created with the old API can be read with the new system. Files created with the v0.6 filter API may not be readable by older versions of JLD2; see the compatibility table below for more information.
  • Performance: Compression performance and file sizes remain the same as the underlying compression libraries are unchanged.
  • HDF5 Compatibility: The new API is analogous to HDF5.jl, making it easier to work with HDF5 files and improving interoperability.

Filter Compatibility Table

The following table shows which JLD2 versions can decode data compressed using different filter features:

| Filter feature | JLD2 versions able to decode |
|----------------|------------------------------|
| Deflate | Since 0.2.0 |
| Bzip2Filter | Since 0.4.4 |
| ZstdFilter | Since 0.4.49 |
| Shuffle | Since 0.6.0 |
| Lz4Filter | Since 0.6.0 |
| multiple filters | Since 0.6.0 |

Notes:

  • Data compressed with LZ4FrameCompressor in previous versions can be read if JLD2Lz4 is loaded. Data compressed with Lz4Filter cannot be read by JLD2 versions before 0.6.0.

For code migration, the main change is in how you specify compression filters:

# Old API
# using JLD2, CodecZlib
# jldopen("file.jld2", "w"; compress = ZlibCompressor()) do f

# New API
using JLD2
jldopen("file.jld2", "w"; compress = Deflate()) do f
    # ...
end

The simplest usage option of compress=true still works as before.

API Docstrings

JLD2.Filters - Module

JLD2.Filters

This module contains the interface for using filters in JLD2.jl.

JLD2.Filters.Deflate - Type
Deflate <: Filter

The Deflate filter can be used to compress datasets. It uses the well-known and widely used zlib (deflate) compression algorithm.

Arguments:

  • level: Compression level, between 0 and 9. Default is 5. Larger numbers lead to better compression, but also to longer runtime.

JLD2.Filters.Shuffle - Type
Shuffle <: Filter

The Shuffle filter can be used as part of a filter pipeline to compress datasets. It rearranges the bytes of elements in an array to improve compression efficiency. It is not a compression filter by itself, but can be used in conjunction with other compression filters like Deflate or ZstdFilter.

It can be useful when, for example, the array contains unsigned integers of type UInt64 and all values are small. In that case, all the upper bytes of each eight-byte integer are zero. This filter rearranges the bytes so that all the least significant bytes are at the beginning of the array, followed by the second least significant bytes, and so on, which simplifies the compression of the data.

JLD2.Filters.ZstdFilter - Type
ZstdFilter <: Filter

The ZstdFilter can be used to compress datasets using the Zstandard compression algorithm.

Arguments:

  • level: Compression level, between 1 and 22. Larger numbers lead to better compression, but also to longer runtime.
