dupontcyborg/zpickle

zpickle

Transparent, drop-in compression for Python's pickle — smaller files, same API.

zpickle compresses your serialized Python objects with a choice of compression algorithms, without changing how you work with pickle.

# Replace this:
import pickle

# With this:
import zpickle as pickle

# Everything else stays the same!

Features

  • Drop-in replacement for the standard pickle module
  • Transparent compression — everything happens automatically
  • Multiple algorithms — choose zstd, brotli, zlib, lzma, bzip2, or lz4 (powered by compress_utils)
  • Configure once, use everywhere — set global defaults for your entire app
  • Smaller data — 2-10× smaller serialized data (depending on content and algorithm)
  • Backward compatible — automatically reads both compressed and regular pickle data
  • Complete API compatibility — all pickle functions work as expected

Installation

pip install zpickle

Quick Start

Basic Usage

import zpickle as pickle

# Serializing works exactly like pickle
data = {"complex": ["nested", {"data": "structure"}], "with": "lots of repetition"}
serialized = pickle.dumps(data)  # Automatically compressed!

# Deserializing works the same way
restored = pickle.loads(serialized)  # Automatically decompressed!

# File operations work too
with open("data.zpkl", "wb") as f:
    pickle.dump(data, f)

with open("data.zpkl", "rb") as f:
    restored = pickle.load(f)

Custom Configuration

import zpickle

# Configure global settings
zpickle.configure(algorithm='brotli', level=9)  # Higher compression

# Or configure for a single operation
data = [1, 2, 3] * 1000
compressed = zpickle.dumps(data, algorithm='zstd', level=6)

Performance

Compression ratios versus standard pickle (higher is better):

[Figure: bar graph of compression ratios versus standard pickle]

Serialization speed (MB/s, higher is better):

[Figure: bar graph of serialization speeds versus standard pickle]

Note: Performance varies by data characteristics. Run benchmarks on your specific data for accurate results.

To run your own benchmarks, you can use:

python -m benchmarks.benchmark
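
For a quick first look at how your own data behaves, a sketch like the one below compares compression ratios using only standard-library codecs. It does not call zpickle and is not the project's benchmark script; the name `compare_codecs` and the sample data are illustrative assumptions, and the timings cover the compression step only.

```python
import bz2
import lzma
import pickle
import time
import zlib

# Standard-library codecs only; zstd, brotli, and lz4 need third-party packages.
CODECS = {"zlib": zlib.compress, "bzip2": bz2.compress, "lzma": lzma.compress}


def compare_codecs(obj):
    """Return {codec name: (compression ratio, seconds)} for one pickled object."""
    raw = pickle.dumps(obj)
    results = {}
    for name, compress in CODECS.items():
        start = time.perf_counter()
        packed = compress(raw)
        elapsed = time.perf_counter() - start
        # Ratio > 1 means the compressed form is smaller than the plain pickle.
        results[name] = (len(raw) / len(packed), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical sample: many small records with repeated keys and values.
    sample = [{"id": i, "status": "active"} for i in range(10_000)]
    for name, (ratio, seconds) in compare_codecs(sample).items():
        print(f"{name:6s} ratio={ratio:5.1f}x  time={seconds * 1000:.1f} ms")
```

Swap `sample` for a representative object from your application; highly repetitive data tends to compress far better than random or already-compressed data.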

How It Works

zpickle applies compression with minimal overhead:

  1. Objects are first serialized using standard pickle
  2. The pickle data is compressed using the selected algorithm
  3. A small header (8 bytes) is added to identify the format and algorithm
  4. When deserializing, zpickle auto-detects the format and decompresses if needed
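
The four steps above can be sketched with the standard library alone. This is an illustration of the general technique, not zpickle's actual code: the header layout (`MAGIC`, version byte, algorithm id, reserved field) and the names `sketch_dumps` and `sketch_loads` are assumptions made for the example, and zlib stands in for whichever algorithm is selected.

```python
import pickle
import struct
import zlib

# Hypothetical 8-byte header: 4-byte marker, version, algorithm id, reserved.
MAGIC = b"ZPKL"
HEADER = struct.Struct(">4sBBH")
ALGO_ZLIB = 1


def sketch_dumps(obj, level=6):
    raw = pickle.dumps(obj)                      # step 1: standard pickle
    packed = zlib.compress(raw, level)           # step 2: compress
    header = HEADER.pack(MAGIC, 1, ALGO_ZLIB, 0) # step 3: prepend header
    return header + packed


def sketch_loads(data):
    # Step 4: detect the header; fall back to plain pickle if it is absent.
    if len(data) >= HEADER.size and data[:4] == MAGIC:
        _magic, _version, algo, _reserved = HEADER.unpack_from(data)
        if algo != ALGO_ZLIB:
            raise ValueError(f"unsupported algorithm id: {algo}")
        return pickle.loads(zlib.decompress(data[HEADER.size:]))
    return pickle.loads(data)


if __name__ == "__main__":
    data = [1, 2, 3] * 1000
    blob = sketch_dumps(data)
    print(len(pickle.dumps(data)), "bytes plain,", len(blob), "bytes compressed")
    assert sketch_loads(blob) == data                 # compressed input
    assert sketch_loads(pickle.dumps(data)) == data   # regular pickle input
```

Because the loader checks for the marker before decompressing, the same function can read both compressed and regular pickle data, which is the behavior the backward-compatibility feature describes.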

API Reference

zpickle maintains complete API compatibility with the standard pickle module:

Core Functions

  • dumps(obj, protocol=None, ..., algorithm=None, level=None) - Serialize and compress object
  • loads(data, ...) - Deserialize and decompress object
  • dump(obj, file, protocol=None, ..., algorithm=None, level=None) - Serialize to file
  • load(file, ...) - Deserialize from file

Configuration

  • configure(algorithm=None, level=None, min_size=None) - Set global defaults
  • get_config() - Get current configuration

Classes

  • Pickler(file, ...) - Subclass of pickle.Pickler with compression
  • Unpickler(file, ...) - Subclass of pickle.Unpickler with decompression
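
To make the idea of compressing subclasses concrete, here is one way such classes could be built on the standard library. It is a sketch under stated assumptions, not zpickle's implementation: the class names `CompressingPickler` and `DecompressingUnpickler` are hypothetical, zlib is used as the codec, no format header is written, and each file is assumed to hold a single object.

```python
import io
import pickle
import zlib


class CompressingPickler(pickle.Pickler):
    """Pickle into a memory buffer, then write the compressed bytes to `file`."""

    def __init__(self, file, protocol=None, level=6):
        self._target = file
        self._buffer = io.BytesIO()
        self._level = level
        super().__init__(self._buffer, protocol)

    def dump(self, obj):
        # Reset state so each call produces an independent payload.
        self.clear_memo()
        self._buffer.seek(0)
        self._buffer.truncate()
        super().dump(obj)
        self._target.write(zlib.compress(self._buffer.getvalue(), self._level))


class DecompressingUnpickler(pickle.Unpickler):
    """Read the whole file, decompress it, and unpickle from memory."""

    def __init__(self, file, **kwargs):
        super().__init__(io.BytesIO(zlib.decompress(file.read())), **kwargs)


if __name__ == "__main__":
    stream = io.BytesIO()
    CompressingPickler(stream).dump({"values": [1, 2, 3] * 100})
    stream.seek(0)
    print(DecompressingUnpickler(stream).load()["values"][:3])
```

Subclassing keeps pickle's extension hooks such as `persistent_id` and `find_class` available to callers.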

Alternatives

  • Standard pickle: No compression, but native to Python
  • compressed_pickle: Similar concept, but less configurable
  • joblib: More focused on large NumPy arrays and parallel processing
  • msgpack, protobuf: Different serialization formats (not pickle-compatible)

License

This project is distributed under the MIT License.
