diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index f65f5f5..d3ed19d 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -14,7 +14,7 @@ jobs:
strategy:
matrix:
operating-system: [ubuntu-latest]
- python-version: ["3.10"]
+ python-version: ["3.12"]
fail-fast: false
steps:
- name: Checkout
@@ -30,6 +30,5 @@ jobs:
- name: Install dev requirements
run: |
pip install -r requirements-dev.txt
- pip install -r requirements.txt
- name: Run checks
run: pre-commit run --files $(find imread_benchmark -type f)
diff --git a/.gitignore b/.gitignore
index 0f8b6df..d2eb9b3 100644
--- a/.gitignore
+++ b/.gitignore
@@ -108,3 +108,5 @@ venv.bak/
.idea/
.ruff_cache/
+
+venvs/
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index e8bfb78..0435884 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -59,10 +59,6 @@ repos:
hooks:
- id: codespell
additional_dependencies: ["tomli"]
- - repo: https://github.com/igorshubovych/markdownlint-cli
- rev: v0.43.0
- hooks:
- - id: markdownlint
- repo: https://github.com/tox-dev/pyproject-fmt
rev: "v2.5.0"
hooks:
diff --git a/README.md b/README.md
index 0310f66..7fc1605 100644
--- a/README.md
+++ b/README.md
@@ -1,101 +1,148 @@
-[](https://github.com/ambv/black)
-[](https://github.com/astral-sh/ruff)
-
-# Image Loading Benchmark: From JPG to RGB Numpy Arrays
-
-
-
-This benchmark evaluates the efficiency of different libraries in loading JPG images and converting them into RGB numpy arrays, essential for neural network training data preparation. Inspired by the [Albumentations library](https://github.com/albumentations-team/albumentations/).
+# Image Loading Benchmark
+## Overview
+
+This benchmark evaluates the efficiency of different libraries in loading JPG images
+and converting them into RGB numpy arrays, essential for neural network training
+data preparation. The study compares traditional image processing libraries (Pillow, OpenCV),
+machine learning frameworks (TensorFlow, PyTorch), and specialized decoders (jpeg4py, kornia-rs)
+across different computing architectures.
+
+
+[Figure: Performance on Apple Silicon (M4 Max)]
+[Figure: Performance on Linux (AMD Threadripper)]
+
## Important Note on Image Conversion
-In the benchmark, it's crucial to standardize image formats for a fair comparison, despite different default formats used by OpenCV (BGR), torchvision, and TensorFlow (tensors). A conversion step to RGB numpy arrays is included for consistency. Note that in typical use cases, torchvision and TensorFlow do not require this conversion. Preliminary analysis shows that this extra step does not significantly impact the comparative performance of the libraries, ensuring that the benchmark accurately reflects realistic end-to-end image loading and preprocessing times.
+In the benchmark, it's crucial to standardize image formats for a fair comparison.
+Different libraries use different default formats: OpenCV (BGR), torchvision and
+TensorFlow (tensors). A conversion step to RGB numpy arrays is included for
+consistency. Note that in typical use cases, torchvision and TensorFlow do not
+require this conversion.
## Installation and Setup
-Before running the benchmark, ensure your system is equipped with the necessary dependencies. Start by installing `libturbojpeg`:
+Before running the benchmark, ensure your system is equipped with the necessary
+dependencies:
+
+### System Requirements
```bash
+# On Ubuntu/Debian
sudo apt-get install libturbojpeg
-```
-Next, install all required Python libraries listed in `requirements.txt`:
+### Python Setup
+
+The benchmark uses separate virtual environments for each library to avoid
+dependency conflicts. You'll need:
```bash
-sudo apt install requirements.txt
+# Install uv for faster package installation
+pip install uv
```
-Note: If you want to update package versions in `requirements.txt`
+## Running the Benchmark
+
+The benchmark script creates separate virtual environments for each library and
+runs tests independently:
```bash
-pip install pip-tools
-```
+# Make the script executable
+chmod +x run_benchmarks.sh
-```bash
-pip-compile requirements.in
-```
-this will create new `requirements.txt` file
+# Show help and options
+./run_benchmarks.sh --help
-```bash
-pip install -r requirements.txt
+# Run benchmark with default settings (2000 images, 5 runs)
+./run_benchmarks.sh /path/to/images
+
+# Run with custom settings
+./run_benchmarks.sh /path/to/images 1000 3
```
-to install latest versions
-## Running the Benchmark
+The script will:
-To understand the benchmark's configuration options and run it according to your setup, use the following commands:
+1. Create separate virtual environments for each library
+2. Install required dependencies using `uv`
+3. Run benchmarks independently
+4. Save results to OS-specific directories
-```bash
-python imread_benchmark/benchmark.py -h
-
-usage: benchmark.py [-h] [-d DIR] [-n N] [-r N] [--show-std] [-m] [-p] [-s] [-o OUTPUT_PATH]
-
-Image reading libraries performance benchmark
-
-options:
- -h, --help show this help message and exit
- -d DIR, --data-dir DIR
- path to a directory with images
- -n N, --num_images N number of images for benchmarking (default: 2000)
- -r N, --num_runs N number of runs for each benchmark (default: 5)
- --show-std show standard deviation for benchmark runs
- -m, --markdown print benchmarking results as a markdown table
- -p, --print-package-versions
- print versions of packages
- -s, --shuffle Shuffle the list of images.
- -o OUTPUT_PATH, --output_path OUTPUT_PATH
- Path to save resulting dataframe.
-```
+### Results Structure
+Results are saved in JSON format under:
-```bash
-python imread_benchmark/benchmark.py \
- --data-dir \
- --num_images \
- --num_runs \
- --show-std \
- --print-package-versions \
- --print-package-versions
+```text
+output/
+├── linux/           # When run on Linux
+│   ├── opencv_results.json
+│   ├── pillow_results.json
+│   └── ...
+└── darwin/          # When run on macOS
+    ├── opencv_results.json
+    ├── pillow_results.json
+    └── ...
```
-Extra options:
-`--print-package-versions` - to print benchmarked libraries versions
-`--print-package-versions` - to shuffle images on every run
-`--show-std` - to show standard deviation for measurements
+## Libraries Being Benchmarked
+
+Each library uses different underlying JPEG decoders and implementation approaches:
+
+### Direct libjpeg-turbo Users (Fastest)
+- jpeg4py (Linux only) - Direct libjpeg-turbo binding
+- kornia-rs - Modern Rust-based implementation
+- OpenCV (opencv-python-headless)
+- torchvision
+
+### Standard libjpeg Users
+- PIL (Pillow)
+- Pillow-SIMD (Linux only)
+- scikit-image
+- imageio
+
+### Machine Learning Framework Components
+- tensorflow
+- torchvision
+- kornia-rs
+
+
+## Performance Considerations
+
+Several factors influence real-world performance beyond raw decoding speed:
+
+### Memory Usage
+- Memory utilization varies significantly across libraries
+- Some implementations (like kornia-rs) have specific memory allocation optimizations
+- Consider available system resources when scaling to batch processing
+
+### System Integration
+- All benchmarks performed on NVMe SSDs to minimize I/O variance
+- Single-threaded performance reported
+- Multi-threading capabilities vary between libraries
+
+### Image Characteristics
+- Results based on typical ImageNet JPEG images (~500x400 pixels)
+- Performance scaling with image size varies between implementations
+- Compression ratio and JPEG encoding parameters can influence decoding speed
-## Hardware and Software Specifications
+## Recommendations
-**CPU**: AMD Ryzen Threadripper 3970X 32-Core Processor
+### High-Performance Applications
+- Use kornia-rs or OpenCV for consistent cross-platform performance
+- On Linux, consider jpeg4py for maximum performance
+- Consider memory usage if processing many images simultaneously
-## Results
+### Cross-Platform Development
+- kornia-rs provides the most consistent performance
+- OpenCV and torchvision offer a good balance of features and speed
+- Test with representative image sizes and batching patterns
-| | Library | Version | Performance (images/sec) |
-|---:|:-----------------------|:----------|:---------------------------|
-| 0 | scikit-image | 0.23.2 | 538.48 ± 6.86 |
-| 1 | imageio | 2.34.1 | 538.58 ± 6.84 |
-| 2 | opencv-python-headless | 4.10.0.82 | 631.46 ± 0.43 |
-| 3 | pillow | 10.3.0 | 589.56 ± 8.79 |
-| 4 | jpeg4py | 0.1.4 | 700.60 ± 0.88 |
-| 5 | torchvision | 0.18.1 | 658.68 ± 0.78 |
-| 6 | tensorflow | 2.16.1 | 704.43 ± 1.10 |
-| 7 | kornia-rs | 0.1.1 | 682.95 ± 1.21 |
+### Feature-Rich Applications
+- When you need extensive image processing features, OpenCV remains a strong choice
+- Consider dependency size and installation complexity
+- Evaluate the full image processing pipeline, not just JPEG decoding
diff --git a/images/2024-02-26.png b/images/2024-02-26.png
deleted file mode 100644
index 166ff37..0000000
Binary files a/images/2024-02-26.png and /dev/null differ
diff --git a/images/2024-03-11.png b/images/2024-03-11.png
deleted file mode 100644
index 8de5244..0000000
Binary files a/images/2024-03-11.png and /dev/null differ
diff --git a/images/2024-06-05.png b/images/2024-06-05.png
deleted file mode 100644
index 3515f1f..0000000
Binary files a/images/2024-06-05.png and /dev/null differ
diff --git a/images/performance_darwin.png b/images/performance_darwin.png
new file mode 100644
index 0000000..c206c03
Binary files /dev/null and b/images/performance_darwin.png differ
diff --git a/images/performance_linux.png b/images/performance_linux.png
new file mode 100644
index 0000000..08683ee
Binary files /dev/null and b/images/performance_linux.png differ
diff --git a/imread_benchmark/benchmark.py b/imread_benchmark/benchmark.py
deleted file mode 100644
index d35d046..0000000
--- a/imread_benchmark/benchmark.py
+++ /dev/null
@@ -1,335 +0,0 @@
-import argparse
-import logging
-import os
-import random
-import sys
-import time
-from abc import ABC
-from collections import defaultdict
-from pathlib import Path
-
-import cv2
-import imageio.v2 as imageio
-import jpeg4py
-import kornia_rs as K
-import numpy as np
-import pandas as pd
-import pkg_resources
-import skimage
-import tensorflow as tf
-import torchvision
-from PIL import Image
-from pytablewriter import MarkdownTableWriter
-from pytablewriter.style import Style
-from tqdm import tqdm
-
-cv2.setNumThreads(0)
-cv2.ocl.setUseOpenCL(False)
-
-os.environ["OMP_NUM_THREADS"] = "1"
-os.environ["OPENBLAS_NUM_THREADS"] = "1"
-os.environ["MKL_NUM_THREADS"] = "1"
-os.environ["VECLIB_MAXIMUM_THREADS"] = "1"
-os.environ["NUMEXPR_NUM_THREADS"] = "1"
-os.environ["CUDA_VISIBLE_DEVICES"] = ""
-
-# Set up logging
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-try:
- # Attempt to disable all GPUs
- tf.config.set_visible_devices([], "GPU")
- visible_devices = tf.config.get_visible_devices()
- for device in visible_devices:
- if device.device_type == "GPU":
- logger.warning("GPU device is still visible, disabling failed.")
-except tf.errors.NotFoundError: # Example of catching a more specific TensorFlow exception
- logger.exception("Specific TensorFlow error encountered when trying to modify GPU visibility.")
-except Exception: # Use this as a fallback if you're unsure which specific exceptions might be raised
- logger.exception("Failed to modify GPU visibility due to an unexpected error.")
-
-
-package_mapping = {
- "opencv": "opencv-python-headless", # or "opencv-python" depending on which you use
- "pil": "pillow",
- "jpeg4py": "jpeg4py",
- "skimage": "scikit-image",
- "imageio": "imageio",
- "torchvision": "torchvision",
- "tensorflow": "tensorflow",
- "kornia": "kornia-rs",
-}
-
-
-def get_package_versions():
- # Mapping of import names to package names as they might differ
-
- versions = {"Python": sys.version.split()[0]} # Just get the major.minor.patch
- for package, dist_name in package_mapping.items():
- try:
- versions[package] = pkg_resources.get_distribution(dist_name).version
- except pkg_resources.DistributionNotFound:
- versions[package] = "Not Installed"
- return versions
-
-
-class BenchmarkTest(ABC):
- def __str__(self):
- return self.__class__.__name__
-
- def run(self, library, image_paths: list) -> None:
- operation = getattr(self, library)
- for image in image_paths:
- operation(image)
-
-
-class GetArray(BenchmarkTest):
- def pil(self, image_path: str) -> np.array:
- img = Image.open(image_path)
- img = img.convert("RGB")
- return np.asarray(img)
-
- def opencv(self, image_path: str) -> np.array:
- img = cv2.imread(image_path)
- return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
-
- def jpeg4py(self, image_path: str) -> np.array:
- return jpeg4py.JPEG(image_path).decode()
-
- def skimage(self, image_path: str) -> np.array:
- return skimage.io.imread(image_path)
-
- def imageio(self, image_path: str) -> np.array:
- return imageio.imread(image_path)
-
- def torchvision(self, image_path: str) -> np.array:
- image = torchvision.io.read_image(image_path)
- return image.permute(1, 2, 0).numpy()
-
- def tensorflow(self, image_path: str) -> np.array:
- # Read the image file
- image_string = tf.io.read_file(image_path)
- # Decode the image to tensor
- image = tf.io.decode_image(image_string, channels=3)
- # Convert the tensor to numpy array
- return image.numpy()
-
- def kornia(self, image_path: str) -> np.array:
- return K.read_image_jpeg(image_path)
-
-
-class MarkdownGenerator:
- def __init__(self, df, package_versions):
- self._df = df
- self._package_versions = package_versions
-
- def _highlight_best_result(self, results) -> list[str]:
- # Convert all results to floats for comparison, filtering out any non-numeric values beforehand
- numeric_results = [float(r) for r in results if r.replace(".", "", 1).isdigit()]
-
- if not numeric_results:
- return results # Return the original results if no numeric values were found
-
- best_result = max(numeric_results)
-
- # Highlight the best result by comparing the float representation of each result
- return [f"**{r}**" if float(r) == best_result else r for r in results]
-
- def _make_headers(self) -> list[str]:
- libraries = self._df.columns.to_list()
- columns = []
- for library in libraries:
- version = self._package_versions[
- (
- library.replace("opencv", "opencv-python-headless")
- .replace("pil", "pillow")
- .replace("skimage", "scikit-image")
- .replace("kornia", "kornia-rs")
- )
- ]
-
- columns.append(f"{library}<br>{version}")
- return ["", *columns]
-
- def _make_value_matrix(self) -> list[list]:
- index = self._df.index.tolist()
- values = self._df.to_numpy().tolist()
- value_matrix = []
- for transform, results in zip(index, values, strict=False):
- row = [transform, *self._highlight_best_result(results)]
- value_matrix.append(row)
- return value_matrix
-
- def _make_versions_text(self) -> str:
- libraries = [
- "Python",
- "numpy",
- "pillow",
- "opencv-python-headless",
- "scikit-image",
- "scipy",
- "tensorflow",
- "kornia-rs",
- ]
- libraries_with_versions = [
- "{library} {version}".format(library=library, version=self._package_versions[library].replace("\n", ""))
- for library in libraries
- ]
- return f"Python and library versions: {', '.join(libraries_with_versions)}."
-
- def print(self) -> None:
- writer = MarkdownTableWriter()
- writer.headers = self._make_headers()
- writer.value_matrix = self._make_value_matrix()
- writer.styles = [Style(align="left")] + [Style(align="center") for _ in range(len(writer.headers) - 1)]
- writer.write_table()
-
-
-def run_single_benchmark(benchmark, library, image_paths):
- """
- Runs a single benchmark for a given library and set of image paths.
- Returns the images per second performance.
- """
- start_time = time.perf_counter()
- benchmark.run(library, image_paths)
- end_time = time.perf_counter()
-
- run_time = end_time - start_time
- return len(image_paths) / run_time
-
-
-def warm_up(libraries, benchmarks, image_paths, warmup_runs, shuffle_paths):
- """Performs warm-up runs for each library to ensure fair timing."""
- for library in tqdm(libraries, desc="Warming up libraries"):
- for _ in tqdm(range(warmup_runs), desc=f"Warm-up runs for {library}"):
- for benchmark in benchmarks:
- if shuffle_paths:
- random.shuffle(image_paths)
- benchmark.run(library, image_paths)
-
-
-def perform_benchmark(libraries, benchmarks, image_paths, num_runs, shuffle_paths):
- """Main benchmarking logic, performing the benchmark for each library and benchmark combination."""
- images_per_second = defaultdict(lambda: defaultdict(list))
-
- # for _ in range(num_runs):
- for _ in tqdm(range(num_runs), desc="Benchmarking Runs"):
- shuffled_libraries = libraries.copy()
- random.shuffle(shuffled_libraries) # Shuffle library order for each run
-
- for library in tqdm(shuffled_libraries, desc="Libraries"):
- # for library in shuffled_libraries:
- for benchmark in benchmarks:
- if shuffle_paths:
- random.shuffle(image_paths)
-
- ips = run_single_benchmark(benchmark, library, image_paths)
- images_per_second[library][str(benchmark)].append(ips)
-
- return images_per_second
-
-
-def calculate_results(images_per_second):
- """Calculates the average and standard deviation of images per second for each library and benchmark."""
- final_results = defaultdict(dict)
- for library, benchmarks in images_per_second.items():
- for benchmark, times in benchmarks.items():
- avg_ips = np.mean(times)
- std_ips = np.std(times) if len(times) > 1 else 0
- final_results[library][benchmark] = f"{avg_ips:.2f} ± {std_ips:.2f}"
-
- return final_results
-
-
-def benchmark(
- libraries: list,
- benchmarks: list,
- image_paths: list,
- num_runs: int,
- shuffle_paths: bool,
- warmup_runs: int = 1,
-) -> defaultdict:
- """Orchestrates the benchmarking process, including warm-up, main benchmark, and result calculation."""
- # Warm-up phase
- warm_up(libraries, benchmarks, image_paths, warmup_runs, shuffle_paths)
-
- # Main benchmarking
- images_per_second = perform_benchmark(libraries, benchmarks, image_paths, num_runs, shuffle_paths)
-
- # Calculate and return final results
- return calculate_results(images_per_second)
-
-
-def parse_args():
- parser = argparse.ArgumentParser(description="Image reading libraries performance benchmark")
- parser.add_argument("-d", "--data-dir", metavar="DIR", help="path to a directory with images")
- parser.add_argument(
- "-n",
- "--num_images",
- default=2000,
- type=int,
- metavar="N",
- help="number of images for benchmarking (default: 2000)",
- )
- parser.add_argument(
- "-r",
- "--num_runs",
- default=5,
- type=int,
- metavar="N",
- help="number of runs for each benchmark (default: 5)",
- )
- parser.add_argument(
- "--show-std",
- dest="show_std",
- action="store_true",
- help="show standard deviation for benchmark runs",
- )
- parser.add_argument("-m", "--markdown", action="store_true", help="print benchmarking results as a markdown table")
- parser.add_argument("-p", "--print-package-versions", action="store_true", help="print versions of packages")
- parser.add_argument("-s", "--shuffle", action="store_true", help="Shuffle the list of images.")
- parser.add_argument("-o", "--output_path", type=Path, help="Path to save resulting dataframe.", default="output")
- return parser.parse_args()
-
-
-def get_image_paths(data_dir: str | Path, num_images: int) -> list:
- image_paths = sorted(Path(data_dir).glob("*.*"))
- return [str(x) for x in image_paths[:num_images]]
-
-
-def main() -> None:
- args = parse_args()
-
- Path(args.output_path).mkdir(parents=True, exist_ok=True)
-
- package_versions = get_package_versions()
-
- benchmarks = [GetArray()] # Add more benchmark classes as needed
- libraries = ["skimage", "imageio", "opencv", "pil", "jpeg4py", "torchvision", "tensorflow", "kornia"]
-
- image_paths = get_image_paths(args.data_dir, args.num_images)
- images_per_second = benchmark(libraries, benchmarks, image_paths, args.num_runs, args.shuffle)
-
- # Convert the results to a DataFrame
- results = defaultdict(list)
- for library in libraries:
- for perf in images_per_second[library].values():
- results["Library"].append(package_mapping[library])
- results["Version"].append(package_versions.get(library, "Unknown"))
- results["Performance (images/sec)"].append(perf)
-
- df = pd.DataFrame(results)
-
- if args.output_path:
- df.to_csv(args.output_path, index=False)
-
- if args.markdown:
- # Convert dataframe to markdown table
- print(df.to_markdown())
-
- return df # Return the dataframe if needed
-
-
-if __name__ == "__main__":
- df = main()
diff --git a/imread_benchmark/benchmark_single.py b/imread_benchmark/benchmark_single.py
new file mode 100644
index 0000000..e905a97
--- /dev/null
+++ b/imread_benchmark/benchmark_single.py
@@ -0,0 +1,215 @@
+import argparse
+import json
+import logging
+import os
+import platform
+import sys
+import time
+from importlib.metadata import version
+from pathlib import Path
+
+import cpuinfo
+import numpy as np
+from tqdm import tqdm
+
+# Set up logging
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+SUPPORTED_LIBRARIES = {
+    "opencv": "opencv-python-headless",
+    "pil": "pillow",
+    "jpeg4py": "jpeg4py",
+    "skimage": "scikit-image",
+    "imageio": "imageio",
+    "torchvision": "torchvision",
+    "tensorflow": "tensorflow",
+    "kornia": "kornia-rs",
+}
+
+
+def get_package_versions():
+    import multiprocessing
+
+    import cpuinfo
+
+    # Get CPU info
+    try:
+        cpu_info = cpuinfo.get_cpu_info()
+        cpu_details = {
+            "brand_raw": cpu_info.get("brand_raw", "Unknown"),
+            "arch": cpu_info.get("arch", "Unknown"),
+            "hz_advertised_raw": cpu_info.get("hz_advertised_raw", "Unknown"),
+            "count": multiprocessing.cpu_count(),
+        }
+    except Exception as e:
+        logger.warning(f"Failed to get CPU info: {e}")
+        cpu_details = {"error": str(e)}
+
+    versions = {
+        "Python": sys.version.split()[0],
+        "OS": platform.system(),
+        "OS Version": platform.version(),
+        "Machine": platform.machine(),
+        "CPU": cpu_details,
+    }
+
+    lib_name = os.environ.get("BENCHMARK_LIBRARY")
+    if lib_name:
+        pkg_name = SUPPORTED_LIBRARIES.get(lib_name, lib_name)  # fall back for "pillow"/"pillow-simd"
+        if pkg_name:
+            try:
+                versions[lib_name] = version(pkg_name)
+            except Exception as e:
+                versions[lib_name] = f"Error getting version: {e!s}"
+
+    return versions
+
+
+def get_system_identifier() -> str:
+    """
+    Get a detailed system identifier including OS and CPU.
+
+    Returns:
+        str: A string combining OS and CPU model, formatted as 'os_cpu-model'
+
+    """
+    try:
+        cpu_info = cpuinfo.get_cpu_info()
+        cpu_brand = cpu_info.get("brand_raw", "Unknown")
+
+        # Simple OS identification
+        os_id = "darwin" if platform.system().lower() == "darwin" else "linux"
+
+        # Replace spaces with hyphens but keep full names
+        cpu_id = cpu_brand.replace(" ", "-")
+    except Exception as e:
+        logger.warning(f"Failed to get system info: {e}")
+        return "unknown-system"
+    else:
+        return f"{os_id}_{cpu_id}"
+
+
+def setup_library():
+    """Set up the image reading function based on the specified library."""
+    library = os.environ.get("BENCHMARK_LIBRARY")
+    if not library:
+        raise ValueError("BENCHMARK_LIBRARY environment variable must be set")
+
+    if library == "opencv":
+        import cv2
+
+        def read_image(path):
+            return cv2.imread(path, cv2.IMREAD_COLOR_RGB)
+
+    elif library in {"pillow", "pillow-simd"}:
+        from PIL import Image
+
+        def read_image(path):
+            img = Image.open(path)
+            img = img.convert("RGB")
+            return np.asarray(img)
+
+    elif library == "jpeg4py":
+        import jpeg4py
+
+        def read_image(path):
+            return jpeg4py.JPEG(path).decode()
+
+    elif library == "skimage":
+        import skimage.io
+
+        def read_image(path):
+            return skimage.io.imread(path)
+
+    elif library == "imageio":
+        import imageio.v2 as imageio
+
+        def read_image(path):
+            return imageio.imread(path)
+
+    elif library == "torchvision":
+        import torchvision
+
+        def read_image(path):
+            image = torchvision.io.read_image(path)
+            return image.permute(1, 2, 0).numpy()
+
+    elif library == "tensorflow":
+        import tensorflow as tf
+
+        def read_image(path):
+            image_string = tf.io.read_file(path)
+            image = tf.io.decode_image(image_string, channels=3)
+            return image.numpy()
+
+    elif library == "kornia":
+        import kornia_rs as K
+
+        def read_image(path):
+            return K.read_image_jpeg(path)
+
+    else:
+        raise ValueError(f"Unsupported library: {library}")
+
+    return library, read_image
+
+
+def run_benchmark(read_image, image_paths, num_runs):
+    times = []
+    for _ in tqdm(range(num_runs), desc="Benchmarking"):
+        start_time = time.perf_counter()
+        for path in image_paths:
+            read_image(path)
+        end_time = time.perf_counter()
+
+        run_time = end_time - start_time
+        images_per_second = len(image_paths) / run_time
+        times.append(images_per_second)
+
+    avg_ips = np.mean(times)
+    std_ips = np.std(times)
+
+    return {
+        "images_per_second": f"{avg_ips:.2f} ± {std_ips:.2f}",
+        "raw_times": times,
+    }
+
+
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("-d", "--data-dir", required=True, help="Path to image directory")
+    parser.add_argument("-n", "--num-images", type=int, default=2000)
+    parser.add_argument("-r", "--num-runs", type=int, default=5)
+    parser.add_argument("-o", "--output-dir", type=Path, required=True)
+    args = parser.parse_args()
+
+    # Set up library and get read function once at startup
+    library, read_image = setup_library()
+
+    # Create output directory with detailed system info
+    system_id = get_system_identifier()
+    output_dir = args.output_dir / system_id
+    output_dir.mkdir(parents=True, exist_ok=True)
+
+    # Get image paths
+    image_paths = sorted(Path(args.data_dir).glob("*.*"))[: args.num_images]
+    image_paths = [str(x) for x in image_paths]
+
+    # Run benchmark
+    results = {
+        "library": library,
+        "system_info": get_package_versions(),
+        "benchmark_results": run_benchmark(read_image, image_paths, args.num_runs),
+        "num_images": args.num_images,
+        "num_runs": args.num_runs,
+    }
+
+    # Save results
+    output_file = output_dir / f"{library}_results.json"
+    with output_file.open("w") as f:
+        json.dump(results, f, indent=2)
+
+if __name__ == "__main__":
+    main()
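Note on the result format of `benchmark_single.py`: each run writes a `{library}_results.json` file. Its `raw_times` list holds one images-per-second value per run, despite the name, and `images_per_second` is a pre-formatted string. The following is a minimal sketch of how a consumer might recompute the statistics from the raw values, assuming the JSON layout produced by `main()` above. The helper name `summarize_results` and the sample values are hypothetical and not part of this change.

```python
import statistics


def summarize_results(data: dict) -> dict:
    """Recompute throughput statistics from one parsed results file."""
    # Despite its name, "raw_times" stores images/sec per run, not durations.
    rates = data["benchmark_results"]["raw_times"]
    return {
        "library": data["library"],
        "mean_ips": statistics.fmean(rates),
        # Population standard deviation, matching np.std's default (ddof=0).
        "std_ips": statistics.pstdev(rates),
        "num_runs": len(rates),
    }


# Hypothetical sample shaped like the JSON written by main().
sample = {
    "library": "demo",
    "benchmark_results": {"raw_times": [100.0, 110.0, 90.0]},
}
summary = summarize_results(sample)
print(f"{summary['library']}: {summary['mean_ips']:.2f} ± {summary['std_ips']:.2f}")
```

For a real file, the dictionary would come from `json.loads(Path(...).read_text())`. Working from `raw_times` rather than the formatted string avoids re-parsing rounded values.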
diff --git a/imread_benchmark/create_plot.py b/imread_benchmark/create_plot.py
deleted file mode 100644
index 96580c2..0000000
--- a/imread_benchmark/create_plot.py
+++ /dev/null
@@ -1,65 +0,0 @@
-import argparse
-
-import matplotlib.pyplot as plt
-import pandas as pd
-import seaborn as sns
-
-
-def create_plot(df_path, output_path):
- sns.set_theme(style="whitegrid", context="talk")
- df = pd.read_csv(df_path)
-
- # Processing the DataFrame
- performance_split = df["Performance (images/sec)"].str.split(" ± ", expand=True)
- df["Mean Performance"] = performance_split[0].astype(float)
- df["Std Dev"] = performance_split[1].astype(float)
- df["Library with Version"] = df["Library"] + ", " + df["Version"]
- df_sorted = df.sort_values("Mean Performance", ascending=True)
-
- # Create the bar plot
- plt.figure(figsize=(14, 8))
- barplot = sns.barplot(x="Mean Performance", y="Library with Version", data=df_sorted, palette="viridis")
-
- # Manually add error bars
- # The positions of bars (center) are usually at half-integers (0.5, 1.5, ...) in seaborn's horizontal barplot
- # But we'll calculate directly from the generated plot to be more precise
- y_positions = [p.get_y() + p.get_height() / 2 for p in barplot.patches]
- error_values = df_sorted["Std Dev"].to_numpy()
-
- for y_pos, x_val, error_val in zip(y_positions, df_sorted["Mean Performance"], error_values, strict=False):
- plt.errorbar(
- x=x_val,
- y=y_pos,
- xerr=error_val, # Horizontal error for horizontal bar plot
- fmt="none", # No connecting lines
- capsize=5, # Cap size
- color="black", # Color of the error bars
- )
-
- # Plot customization
- plt.xlabel("Mean Performance (images/sec)")
- plt.ylabel("")
- plt.title("Library Performance Comparison")
- plt.tight_layout()
-
- # Save the plot
- plt.savefig(output_path)
- plt.close()
-
-
-def parse_args():
- parser = argparse.ArgumentParser(description="Create a plot from benchmark results DataFrame")
- parser.add_argument(
- "-f", "--file_path", required=True, help="Path to the CSV file containing the benchmark results"
- )
- parser.add_argument("-o", "--output_path", required=True, help="Path where the plot image will be saved")
- return parser.parse_args()
-
-
-def main():
- args = parse_args()
- create_plot(args.file_path, args.output_path)
-
-
-if __name__ == "__main__":
- main()
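Note on the removal of `create_plot.py`: the deleted script split the `mean ± std` strings from a CSV column, and the new JSON files keep the same string format in `images_per_second`. The following is a minimal sketch of how a replacement plotting step might parse and rank those strings, assuming that format is unchanged. The function names `parse_mean_std` and `rank_by_mean` are hypothetical and do not exist in the repository.

```python
def parse_mean_std(value: str) -> tuple[float, float]:
    """Split a 'mean ± std' string into two floats."""
    mean_text, std_text = value.split("±")
    # float() tolerates the surrounding spaces left by the split.
    return float(mean_text), float(std_text)


def rank_by_mean(results: dict[str, str]) -> list[tuple[str, float]]:
    """Return (library, mean images/sec) pairs, fastest first."""
    means = {name: parse_mean_std(text)[0] for name, text in results.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Values copied from the darwin result files added in this diff.
darwin = {
    "imageio": "777.49 ± 11.70",
    "kornia": "1034.47 ± 18.42",
    "opencv": "1016.02 ± 7.01",
}
ranking = rank_by_mean(darwin)
for name, mean_ips in ranking:
    print(f"{name}: {mean_ips:.2f} images/sec")
```

The ranked pairs could then feed any bar-chart call. The plotting itself is left out because the tool that produced the new `performance_*.png` images is not shown in this diff.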
diff --git a/output/darwin/imageio_results.json b/output/darwin/imageio_results.json
new file mode 100644
index 0000000..9400878
--- /dev/null
+++ b/output/darwin/imageio_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "imageio",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "imageio": "2.37.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "777.49 \u00b1 11.70",
+ "raw_times": [
+ 773.3362551778018,
+ 788.0979997280438,
+ 787.738650909651,
+ 787.4840179428744,
+ 780.3523028334013,
+ 781.6396623215167,
+ 774.687086672151,
+ 730.0676139616584,
+ 782.6431786574269,
+ 783.7668689780774,
+ 782.1544299803793,
+ 776.4593567244634,
+ 780.9970655883873,
+ 777.2857290199994,
+ 779.7222317321122,
+ 773.8445318199068,
+ 776.4496607142421,
+ 777.8145009024621,
+ 778.6917997557493,
+ 776.5908468776514
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/kornia_results.json b/output/darwin/kornia_results.json
new file mode 100644
index 0000000..efc50c3
--- /dev/null
+++ b/output/darwin/kornia_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "kornia",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "kornia": "0.1.8"
+ },
+ "benchmark_results": {
+ "images_per_second": "1034.47 \u00b1 18.42",
+ "raw_times": [
+ 1030.0550536145663,
+ 1041.5514363711877,
+ 1040.1540186745633,
+ 1040.148000717588,
+ 1041.137196232814,
+ 1039.533367243049,
+ 1027.7022933070384,
+ 1031.7264922909167,
+ 1042.073160979155,
+ 956.1449571708179,
+ 1038.644207186712,
+ 1037.017753305197,
+ 1036.057342652493,
+ 1040.0293425434209,
+ 1041.4460601700132,
+ 1041.174481257482,
+ 1041.2124691574397,
+ 1040.290403644477,
+ 1041.7558305884224,
+ 1041.4534714855283
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/opencv_results.json b/output/darwin/opencv_results.json
new file mode 100644
index 0000000..a3d6192
--- /dev/null
+++ b/output/darwin/opencv_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "opencv",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "opencv": "4.11.0.86"
+ },
+ "benchmark_results": {
+ "images_per_second": "1016.02 \u00b1 7.01",
+ "raw_times": [
+ 1024.4635275982616,
+ 1026.0244252552316,
+ 1025.6640371358717,
+ 1026.4049077442344,
+ 1015.9824195523589,
+ 1025.3303809736785,
+ 1022.1921101041164,
+ 1019.3528599748722,
+ 1019.6086893660788,
+ 1018.4911311348419,
+ 1013.4911724175921,
+ 1005.5940569754418,
+ 1011.0163715516046,
+ 1013.2027282063441,
+ 1009.9494539517264,
+ 1009.2187296894285,
+ 1009.9681331064289,
+ 1009.4144297721575,
+ 1007.93437525726,
+ 1007.0939272684269
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/pillow_results.json b/output/darwin/pillow_results.json
new file mode 100644
index 0000000..ca8cb6d
--- /dev/null
+++ b/output/darwin/pillow_results.json
@@ -0,0 +1,42 @@
+{
+ "library": "pillow",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ }
+ },
+ "benchmark_results": {
+ "images_per_second": "774.55 \u00b1 14.32",
+ "raw_times": [
+ 755.4795761175405,
+ 796.0627798519542,
+ 789.9106813530816,
+ 785.3795446727694,
+ 783.9906368396933,
+ 779.205460746831,
+ 776.6398386515864,
+ 776.0803622195672,
+ 776.7283385085334,
+ 776.0572247091527,
+ 778.5114113115783,
+ 775.595449774128,
+ 777.5430408253858,
+ 772.7972036206445,
+ 773.4511599301802,
+ 722.002991586953,
+ 773.0171409330501,
+ 774.6960514977644,
+ 774.798228444907,
+ 772.987002960018
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/skimage_results.json b/output/darwin/skimage_results.json
new file mode 100644
index 0000000..c28cfef
--- /dev/null
+++ b/output/darwin/skimage_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "skimage",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "skimage": "0.25.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "766.00 \u00b1 9.06",
+ "raw_times": [
+ 766.8692176050578,
+ 783.7257132190405,
+ 785.1604027762115,
+ 782.9970622353658,
+ 777.9844534423798,
+ 769.247288272154,
+ 769.0387435805927,
+ 764.5407528267608,
+ 758.6448169794073,
+ 761.1050578579769,
+ 764.0291754555469,
+ 763.0891003677041,
+ 762.492994946151,
+ 760.8338820060599,
+ 760.6101431374296,
+ 760.9394920566515,
+ 755.0257821724493,
+ 756.1723513083739,
+ 758.7697182310162,
+ 758.8173748618219
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/tensorflow_results.json b/output/darwin/tensorflow_results.json
new file mode 100644
index 0000000..1810853
--- /dev/null
+++ b/output/darwin/tensorflow_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "tensorflow",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "tensorflow": "2.18.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "664.28 \u00b1 8.80",
+ "raw_times": [
+ 653.2195612875155,
+ 659.194694745475,
+ 666.8145792211423,
+ 665.4012488359364,
+ 662.5056985564643,
+ 656.8600920071962,
+ 632.020921101284,
+ 668.058482984238,
+ 667.8843498303524,
+ 669.9456130250902,
+ 670.2474076612411,
+ 671.7269068251509,
+ 671.9704266827009,
+ 670.0983425982946,
+ 668.274925958546,
+ 668.2849465726157,
+ 667.8620934556499,
+ 664.2110231642316,
+ 664.6250124212706,
+ 666.3288934200496
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/darwin/torchvision_results.json b/output/darwin/torchvision_results.json
new file mode 100644
index 0000000..363f86e
--- /dev/null
+++ b/output/darwin/torchvision_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "torchvision",
+ "system_info": {
+ "Python": "3.12.7",
+ "OS": "Darwin",
+ "OS Version": "Darwin Kernel Version 24.1.0: Thu Oct 10 21:06:57 PDT 2024; root:xnu-11215.41.3~3/RELEASE_ARM64_T6041",
+ "Machine": "arm64",
+ "CPU": {
+ "brand_raw": "Apple M4 Max",
+ "arch": "ARM_8",
+ "hz_advertised_raw": "Unknown",
+ "count": 16
+ },
+ "torchvision": "0.20.1"
+ },
+ "benchmark_results": {
+ "images_per_second": "992.20 \u00b1 17.15",
+ "raw_times": [
+ 987.0900145449132,
+ 1007.415522739194,
+ 987.876796119717,
+ 994.1233847772254,
+ 982.7930140201138,
+ 985.7552611822863,
+ 993.1553799866899,
+ 983.3769754969791,
+ 1000.9418027519556,
+ 1006.9690865957836,
+ 1006.2856376033868,
+ 983.7422050660286,
+ 928.4764959985383,
+ 983.2389710143362,
+ 997.5434245862426,
+ 1004.565394113687,
+ 995.4654269216271,
+ 1004.3494609436391,
+ 1004.6521044775942,
+ 1006.1666273940092
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/imageio_results.json b/output/linux/imageio_results.json
new file mode 100644
index 0000000..ebcf41f
--- /dev/null
+++ b/output/linux/imageio_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "imageio",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "imageio": "2.37.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "536.11 \u00b1 5.72",
+ "raw_times": [
+ 516.8561561826256,
+ 536.1494348059053,
+ 536.7697513662142,
+ 541.2213418297825,
+ 535.802765009853,
+ 531.524755680685,
+ 535.5521861708272,
+ 537.932087738096,
+ 543.6772889887665,
+ 537.5952031308753,
+ 537.5911167712384,
+ 538.0330889614744,
+ 539.5160064991541,
+ 536.5376339359522,
+ 531.9434538921021,
+ 533.4716159923753,
+ 537.1596300638529,
+ 546.2547273940405,
+ 537.4879022237232,
+ 531.2139404446549
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/jpeg4py_results.json b/output/linux/jpeg4py_results.json
new file mode 100644
index 0000000..8713623
--- /dev/null
+++ b/output/linux/jpeg4py_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "jpeg4py",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "jpeg4py": "0.1.4"
+ },
+ "benchmark_results": {
+ "images_per_second": "698.09 \u00b1 6.87",
+ "raw_times": [
+ 672.7528763032833,
+ 697.2087194162982,
+ 698.4113191879326,
+ 691.5627849080735,
+ 696.3421741912671,
+ 701.4092559804876,
+ 702.1707433935258,
+ 701.8786160342602,
+ 701.7915936925575,
+ 702.3646617205877,
+ 702.3502612104418,
+ 701.8373019644533,
+ 702.1618268659923,
+ 701.387293227451,
+ 700.6724765495026,
+ 692.8573613145545,
+ 690.756390865194,
+ 699.9028558075184,
+ 701.7206243281846,
+ 702.2271980260354
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/kornia_results.json b/output/linux/kornia_results.json
new file mode 100644
index 0000000..e51478e
--- /dev/null
+++ b/output/linux/kornia_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "kornia",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "kornia": "0.1.8"
+ },
+ "benchmark_results": {
+ "images_per_second": "711.38 \u00b1 11.16",
+ "raw_times": [
+ 697.0033174377847,
+ 703.8561808167548,
+ 703.775749696591,
+ 703.3396211446284,
+ 703.803146786953,
+ 700.5217307184085,
+ 701.2315105887913,
+ 702.3881949582293,
+ 703.3551477397373,
+ 701.9920341989025,
+ 702.5587206284661,
+ 705.1785187872333,
+ 722.4881383660472,
+ 727.2041226412655,
+ 726.6137042797883,
+ 727.1905751221473,
+ 724.508512337426,
+ 722.763620788062,
+ 722.1669399734006,
+ 725.6880062604613
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/opencv_results.json b/output/linux/opencv_results.json
new file mode 100644
index 0000000..fe2f068
--- /dev/null
+++ b/output/linux/opencv_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "opencv",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "opencv": "4.11.0.86"
+ },
+ "benchmark_results": {
+ "images_per_second": "664.83 \u00b1 2.90",
+ "raw_times": [
+ 660.0743508061927,
+ 663.1252559402171,
+ 667.0506757291804,
+ 666.0791739517408,
+ 658.1092975827875,
+ 663.7787715078598,
+ 666.446764559172,
+ 667.008377686375,
+ 667.0506367757584,
+ 667.0257497843158,
+ 666.8665428676356,
+ 665.3095160032029,
+ 665.1992851368672,
+ 664.3705875265218,
+ 666.3775632439565,
+ 658.244988065647,
+ 662.8110244777704,
+ 666.9609514384617,
+ 667.2306544943056,
+ 667.5615122027385
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/pillow-simd_results.json b/output/linux/pillow-simd_results.json
new file mode 100644
index 0000000..a36ba25
--- /dev/null
+++ b/output/linux/pillow-simd_results.json
@@ -0,0 +1,42 @@
+{
+ "library": "pillow-simd",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ }
+ },
+ "benchmark_results": {
+ "images_per_second": "564.72 \u00b1 4.00",
+ "raw_times": [
+ 551.0684780476361,
+ 568.5303518546904,
+ 564.4721928861869,
+ 564.986394573479,
+ 565.9241648500185,
+ 564.5543556296951,
+ 559.516444224208,
+ 563.101160018855,
+ 565.400877886764,
+ 562.9424356802866,
+ 570.2997362895217,
+ 566.8645911256513,
+ 568.6557086659257,
+ 566.6480598696354,
+ 565.3965306363618,
+ 566.5711038221959,
+ 567.1277003271592,
+ 565.1766800396712,
+ 560.7312824229764,
+ 566.473942217468
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/pillow_results.json b/output/linux/pillow_results.json
new file mode 100644
index 0000000..4b70dff
--- /dev/null
+++ b/output/linux/pillow_results.json
@@ -0,0 +1,42 @@
+{
+ "library": "pillow",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ }
+ },
+ "benchmark_results": {
+ "images_per_second": "601.24 \u00b1 4.87",
+ "raw_times": [
+ 582.1465579066207,
+ 601.9178695362688,
+ 604.5868164943024,
+ 605.6676420392431,
+ 596.9322703773919,
+ 599.7594494970223,
+ 602.8565927986988,
+ 602.3866678397237,
+ 601.1859863885969,
+ 600.0216123143805,
+ 605.0214633287203,
+ 604.5962910945472,
+ 603.8935045759823,
+ 601.5145407238044,
+ 599.6699283794105,
+ 599.8466931572044,
+ 603.0947156486524,
+ 602.845242127832,
+ 603.1874047374191,
+ 603.6237030933554
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/skimage_results.json b/output/linux/skimage_results.json
new file mode 100644
index 0000000..46c7fb5
--- /dev/null
+++ b/output/linux/skimage_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "skimage",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "skimage": "0.25.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "492.77 \u00b1 2.53",
+ "raw_times": [
+ 484.71803942009865,
+ 493.7498817807783,
+ 493.32007350551254,
+ 493.06207269824824,
+ 494.01273605770893,
+ 493.5038059685943,
+ 493.942403802711,
+ 493.64664828688086,
+ 494.4857308934923,
+ 487.3462095159777,
+ 491.80117074990324,
+ 494.05113229582713,
+ 494.3202169979844,
+ 494.0411029959842,
+ 494.3706951761936,
+ 494.67428087909633,
+ 493.4428233368967,
+ 492.9830264633185,
+ 489.6761353502785,
+ 494.21386564222996
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/tensorflow_results.json b/output/linux/tensorflow_results.json
new file mode 100644
index 0000000..97b8624
--- /dev/null
+++ b/output/linux/tensorflow_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "tensorflow",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "tensorflow": "2.18.0"
+ },
+ "benchmark_results": {
+ "images_per_second": "706.54 \u00b1 18.35",
+ "raw_times": [
+ 634.5536614738515,
+ 710.9943200269073,
+ 712.147455478579,
+ 702.6625257705427,
+ 686.6894124052747,
+ 716.3535191199385,
+ 714.6014541142557,
+ 695.6015275114487,
+ 705.0791279828891,
+ 716.3590213321612,
+ 714.4437253468774,
+ 717.3555760137459,
+ 715.4004139881656,
+ 717.3639313697548,
+ 715.7141120357034,
+ 715.4218407084416,
+ 711.0033552977274,
+ 703.5542631628908,
+ 706.8857247607566,
+ 718.5983796746689
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/output/linux/torchvision_results.json b/output/linux/torchvision_results.json
new file mode 100644
index 0000000..fe2300a
--- /dev/null
+++ b/output/linux/torchvision_results.json
@@ -0,0 +1,43 @@
+{
+ "library": "torchvision",
+ "system_info": {
+ "Python": "3.12.8",
+ "OS": "Linux",
+ "OS Version": "#135~20.04.1-Ubuntu SMP Mon Oct 7 13:56:22 UTC 2024",
+ "Machine": "x86_64",
+ "CPU": {
+ "brand_raw": "AMD Ryzen Threadripper 3970X 32-Core Processor",
+ "arch": "X86_64",
+ "hz_advertised_raw": "Unknown",
+ "count": 64
+ },
+ "torchvision": "0.20.1"
+ },
+ "benchmark_results": {
+ "images_per_second": "655.61 \u00b1 4.88",
+ "raw_times": [
+ 659.7269788613717,
+ 673.1900036482474,
+ 653.996734147805,
+ 655.5051409368823,
+ 655.8949824598442,
+ 656.0741388424619,
+ 655.5828172303529,
+ 655.681145793864,
+ 654.5482524006712,
+ 654.369946638077,
+ 645.0725848678959,
+ 650.3202969347959,
+ 653.7897284765922,
+ 654.7675405805811,
+ 655.54363003821,
+ 655.9752744990523,
+ 655.8893644469305,
+ 655.7972257021704,
+ 655.7748849826793,
+ 654.7172838350294
+ ]
+ },
+ "num_images": 2000,
+ "num_runs": 20
+}
diff --git a/pyproject.toml b/pyproject.toml
index 43a6d88..15b9c23 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -6,7 +6,7 @@
[tool.black]
line-length = 120
target-version = [
- "py310",
+ "py312",
]
include = '\.pyi?$'
exclude = '''
@@ -27,8 +27,7 @@ exclude = '''
'''
[tool.ruff]
-# Assume Python 3.10
-target-version = "py310"
+target-version = "py312"
# Same as Black.
line-length = 120
@@ -75,91 +74,31 @@ format.line-ending = "auto"
# Like Black, respect magic trailing commas.
format.skip-magic-trailing-comma = false
lint.select = [
- "A",
- "ANN",
- "ARG",
- "ASYNC",
- "B",
- "BLE",
- "C4",
- "C90",
- "COM",
- "CPY",
- "D",
- "DJ",
- "DTZ",
- "E",
- "EM",
- "ERA",
- "EXE",
- "F",
- "FBT",
- "FIX",
- "FLY",
- "FURB",
- "G",
- "I",
- "ICN",
- "INP",
- "INT",
- "ISC",
- "LOG",
- "N",
- "NPY",
- "PD",
- "PERF",
- "PGH",
- "PIE",
- "PL",
- "PT",
- "PTH",
- "PYI",
- "Q",
- "RET",
- "RSE",
- "RUF",
- "S",
- "SIM",
- "SLF",
- "SLOT",
- "T10",
- "T20",
- "TCH",
- "TD",
- "TID",
- "TRIO",
- "TRY",
- "UP",
- "W",
- "YTT",
+ "ALL",
]
lint.ignore = [
"ANN001",
- "ANN101",
"ANN201",
- "ANN204",
- "B024",
- "COM812",
+ "ANN202",
+ "BLE001",
+ "C901",
"D100",
"D101",
- "D102",
"D103",
"D104",
- "D105",
- "D107",
"D203",
- "D205",
- "D211",
"D212",
- "D401",
- "FBT001",
- "ISC001",
+ "EM101",
+ "EM102",
+ "FBT003",
+ "G004",
"N812",
"PD901",
- "PERF203",
- "PLR0913",
+ "PLR2004",
"T201",
+ "TRY003",
]
+
lint.explicit-preview-rules = true
# Allow fix for all enabled rules (when `--fix`) is provided.
lint.fixable = [
@@ -167,11 +106,12 @@ lint.fixable = [
]
lint.unfixable = [
]
+
# Allow unused variables when underscore-prefixed.
lint.dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"
[tool.mypy]
-python_version = "3.10"
+python_version = "3.12"
ignore_missing_imports = true
follow_imports = "silent"
warn_redundant_casts = true
diff --git a/requirements.in b/requirements.in
deleted file mode 100644
index 3332802..0000000
--- a/requirements.in
+++ /dev/null
@@ -1,17 +0,0 @@
--f https://download.pytorch.org/whl/torch_stable.html
-
-imageio
-jpeg4py
-kornia-rs
-matplotlib
-numpy
-opencv-python-headless
-pandas
-pillow
-pytablewriter
-scikit-image
-tensorflow
-torchvision
-tqdm
-tabulate
-seaborn
diff --git a/requirements.txt b/requirements.txt
deleted file mode 100644
index 6003239..0000000
--- a/requirements.txt
+++ /dev/null
@@ -1,231 +0,0 @@
-# This file was autogenerated by uv via the following command:
-# uv pip compile requirements.in
-absl-py==2.1.0
- # via
- # keras
- # tensorboard
- # tensorflow
-astunparse==1.6.3
- # via tensorflow
-certifi==2024.12.14
- # via requests
-cffi==1.17.1
- # via jpeg4py
-chardet==5.2.0
- # via mbstrdecoder
-charset-normalizer==3.4.1
- # via requests
-contourpy==1.3.1
- # via matplotlib
-cycler==0.12.1
- # via matplotlib
-dataproperty==1.1.0
- # via
- # pytablewriter
- # tabledata
-filelock==3.16.1
- # via torch
-flatbuffers==24.12.23
- # via tensorflow
-fonttools==4.55.3
- # via matplotlib
-fsspec==2024.12.0
- # via torch
-gast==0.6.0
- # via tensorflow
-google-pasta==0.2.0
- # via tensorflow
-grpcio==1.69.0
- # via
- # tensorboard
- # tensorflow
-h5py==3.12.1
- # via
- # keras
- # tensorflow
-idna==3.10
- # via requests
-imageio==2.37.0
- # via
- # -r requirements.in
- # scikit-image
-jinja2==3.1.5
- # via torch
-jpeg4py==0.1.4
- # via -r requirements.in
-keras==3.8.0
- # via tensorflow
-kiwisolver==1.4.8
- # via matplotlib
-kornia-rs==0.1.8
- # via -r requirements.in
-lazy-loader==0.4
- # via scikit-image
-libclang==18.1.1
- # via tensorflow
-markdown==3.7
- # via tensorboard
-markdown-it-py==3.0.0
- # via rich
-markupsafe==3.0.2
- # via
- # jinja2
- # werkzeug
-matplotlib==3.10.0
- # via
- # -r requirements.in
- # seaborn
-mbstrdecoder==1.1.4
- # via
- # dataproperty
- # pytablewriter
- # typepy
-mdurl==0.1.2
- # via markdown-it-py
-ml-dtypes==0.4.1
- # via
- # keras
- # tensorflow
-mpmath==1.3.0
- # via sympy
-namex==0.0.8
- # via keras
-networkx==3.4.2
- # via
- # scikit-image
- # torch
-numpy==2.0.2
- # via
- # -r requirements.in
- # contourpy
- # h5py
- # imageio
- # jpeg4py
- # keras
- # matplotlib
- # ml-dtypes
- # opencv-python-headless
- # pandas
- # scikit-image
- # scipy
- # seaborn
- # tensorboard
- # tensorflow
- # tifffile
- # torchvision
-opencv-python-headless==4.11.0.86
- # via -r requirements.in
-opt-einsum==3.4.0
- # via tensorflow
-optree==0.14.0
- # via keras
-packaging==24.2
- # via
- # keras
- # lazy-loader
- # matplotlib
- # scikit-image
- # tensorboard
- # tensorflow
- # typepy
-pandas==2.2.3
- # via
- # -r requirements.in
- # seaborn
-pathvalidate==3.2.3
- # via pytablewriter
-pillow==11.1.0
- # via
- # -r requirements.in
- # imageio
- # matplotlib
- # scikit-image
- # torchvision
-protobuf==5.29.3
- # via
- # tensorboard
- # tensorflow
-pycparser==2.22
- # via cffi
-pygments==2.19.1
- # via rich
-pyparsing==3.2.1
- # via matplotlib
-pytablewriter==1.2.1
- # via -r requirements.in
-python-dateutil==2.9.0.post0
- # via
- # matplotlib
- # pandas
- # typepy
-pytz==2024.2
- # via
- # pandas
- # typepy
-requests==2.32.3
- # via tensorflow
-rich==13.9.4
- # via keras
-scikit-image==0.25.0
- # via -r requirements.in
-scipy==1.15.1
- # via scikit-image
-seaborn==0.13.2
- # via -r requirements.in
-setuptools==75.8.0
- # via
- # pytablewriter
- # tensorboard
- # tensorflow
- # torch
-six==1.17.0
- # via
- # astunparse
- # google-pasta
- # python-dateutil
- # tensorboard
- # tensorflow
-sympy==1.13.1
- # via torch
-tabledata==1.3.4
- # via pytablewriter
-tabulate==0.9.0
- # via -r requirements.in
-tcolorpy==0.1.7
- # via pytablewriter
-tensorboard==2.18.0
- # via tensorflow
-tensorboard-data-server==0.7.2
- # via tensorboard
-tensorflow==2.18.0
- # via -r requirements.in
-termcolor==2.5.0
- # via tensorflow
-tifffile==2025.1.10
- # via scikit-image
-torch==2.5.1
- # via torchvision
-torchvision==0.20.1
- # via -r requirements.in
-tqdm==4.67.1
- # via -r requirements.in
-typepy==1.3.4
- # via
- # dataproperty
- # pytablewriter
- # tabledata
-typing-extensions==4.12.2
- # via
- # optree
- # tensorflow
- # torch
-tzdata==2024.2
- # via pandas
-urllib3==2.3.0
- # via requests
-werkzeug==3.1.3
- # via tensorboard
-wheel==0.45.1
- # via astunparse
-wrapt==1.17.2
- # via tensorflow
diff --git a/requirements/base.txt b/requirements/base.txt
new file mode 100644
index 0000000..b2da4ca
--- /dev/null
+++ b/requirements/base.txt
@@ -0,0 +1,5 @@
+numpy
+pandas
+py-cpuinfo
+pytablewriter
+tqdm
diff --git a/requirements/imageio.txt b/requirements/imageio.txt
new file mode 100644
index 0000000..a464e4c
--- /dev/null
+++ b/requirements/imageio.txt
@@ -0,0 +1 @@
+imageio
diff --git a/requirements/jpeg4py.txt b/requirements/jpeg4py.txt
new file mode 100644
index 0000000..37eeca0
--- /dev/null
+++ b/requirements/jpeg4py.txt
@@ -0,0 +1 @@
+jpeg4py
diff --git a/requirements/kornia.txt b/requirements/kornia.txt
new file mode 100644
index 0000000..f41272c
--- /dev/null
+++ b/requirements/kornia.txt
@@ -0,0 +1 @@
+kornia-rs
diff --git a/requirements/opencv.txt b/requirements/opencv.txt
new file mode 100644
index 0000000..6ab6d0d
--- /dev/null
+++ b/requirements/opencv.txt
@@ -0,0 +1 @@
+opencv-python-headless
diff --git a/requirements/pillow-simd.txt b/requirements/pillow-simd.txt
new file mode 100644
index 0000000..1a3a520
--- /dev/null
+++ b/requirements/pillow-simd.txt
@@ -0,0 +1 @@
+pillow-simd
diff --git a/requirements/pillow.txt b/requirements/pillow.txt
new file mode 100644
index 0000000..3868fb1
--- /dev/null
+++ b/requirements/pillow.txt
@@ -0,0 +1 @@
+pillow
diff --git a/requirements/skimage.txt b/requirements/skimage.txt
new file mode 100644
index 0000000..391ca2f
--- /dev/null
+++ b/requirements/skimage.txt
@@ -0,0 +1 @@
+scikit-image
diff --git a/requirements/tensorflow.txt b/requirements/tensorflow.txt
new file mode 100644
index 0000000..0f57144
--- /dev/null
+++ b/requirements/tensorflow.txt
@@ -0,0 +1 @@
+tensorflow
diff --git a/requirements/torchvision.txt b/requirements/torchvision.txt
new file mode 100644
index 0000000..284c569
--- /dev/null
+++ b/requirements/torchvision.txt
@@ -0,0 +1,4 @@
+-f https://download.pytorch.org/whl/torch_stable.html
+
+torch
+torchvision
diff --git a/run_benchmarks.sh b/run_benchmarks.sh
new file mode 100755
index 0000000..f648da8
--- /dev/null
+++ b/run_benchmarks.sh
@@ -0,0 +1,143 @@
+#!/bin/bash
+
+# Function to show help message
+show_help() {
+    cat << EOF
+Usage: ./run_benchmarks.sh <path_to_image_directory> [num_images] [num_runs]
+
+This script runs image reading benchmarks for multiple Python libraries.
+It creates separate virtual environments for each library and saves results
+to output/<os>/<library>_results.json
+
+Arguments:
+  path_to_image_directory  (Required) Directory containing images to benchmark
+  num_images               (Optional) Number of images to process (default: 2000)
+  num_runs                 (Optional) Number of benchmark runs (default: 5)
+
+Example usage:
+  # Basic usage with defaults (2000 images, 5 runs):
+  ./run_benchmarks.sh ~/dataset/images
+
+  # Custom number of images and runs:
+  ./run_benchmarks.sh ~/dataset/images 1000 3
+
+Libraries being benchmarked:
+  - opencv (opencv-python-headless)
+  - pillow (Pillow)
+  - pillow-simd
+  - jpeg4py
+  - skimage (scikit-image)
+  - imageio
+  - torchvision
+  - tensorflow
+  - kornia (kornia-rs)
+
+Results will be saved in:
+  output/
+  ├── linux/              # When run on Linux
+  │   ├── opencv_results.json
+  │   ├── pillow_results.json
+  │   └── ...
+  └── darwin/             # When run on macOS
+      ├── opencv_results.json
+      ├── pillow_results.json
+      └── ...
+EOF
+}
+
+# Show help if -h or --help is passed
+if [[ "$1" == "-h" || "$1" == "--help" ]]; then
+    show_help
+    exit 0
+fi
+
+# Exit on error
+set -e
+
+# Base directory for virtual environments
+VENV_DIR="venvs"
+mkdir -p "$VENV_DIR"
+
+# Create output directory
+mkdir -p output
+
+# List of libraries to benchmark
+LIBRARIES=("opencv" "pillow" "jpeg4py" "skimage" "imageio" "torchvision" "tensorflow" "kornia" "pillow-simd")
+
+# Function to create and activate virtual environment
+setup_venv() {
+    local lib=$1
+    echo "Setting up environment for $lib..."
+
+    # Get the full path to the current Python interpreter
+    PYTHON_PATH=$(which python)
+    echo "Using Python: $PYTHON_PATH"
+    echo "Python version: $($PYTHON_PATH --version)"
+
+    # Create venv with the same Python version
+    $PYTHON_PATH -m venv "$VENV_DIR/$lib" --clear
+
+    # Activate virtual environment (works on both Unix and Windows)
+    if [[ "$OSTYPE" == "msys" || "$OSTYPE" == "cygwin" ]]; then
+        source "$VENV_DIR/$lib/Scripts/activate"
+    else
+        source "$VENV_DIR/$lib/bin/activate"
+    fi
+
+    # Upgrade pip first using the correct Python
+    $PYTHON_PATH -m pip install --upgrade pip
+
+    # Install uv using the correct Python
+    $PYTHON_PATH -m pip install uv
+
+    # Set UV to use copy mode instead of hardlinks
+    export UV_LINK_MODE=copy
+
+    # Install requirements using uv
+    uv pip install -r requirements/base.txt
+    uv pip install -r "requirements/$lib.txt"
+}
+
+# Function to run benchmark for a single library
+run_benchmark() {
+    local lib=$1
+    echo "Running benchmark for $lib..."
+    export BENCHMARK_LIBRARY=$lib
+    python imread_benchmark/benchmark_single.py \
+        --data-dir "$DATA_DIR" \
+        --num-images "$NUM_IMAGES" \
+        --num-runs "$NUM_RUNS" \
+        --output-dir output
+}
+
+# Check if required arguments are provided
+if [ -z "$1" ]; then
+    echo "Error: Image directory path is required"
+    echo
+    show_help
+    exit 1
+fi
+
+DATA_DIR=$1
+NUM_IMAGES=${2:-2000}
+NUM_RUNS=${3:-5}
+
+echo "Starting benchmarks with:"
+echo " Image directory: $DATA_DIR"
+echo " Number of images: $NUM_IMAGES"
+echo " Number of runs: $NUM_RUNS"
+echo
+
+# Run benchmarks for each library
+for lib in "${LIBRARIES[@]}"; do
+    echo "Processing $lib..."
+    setup_venv "$lib"
+    run_benchmark "$lib"
+    deactivate
+    echo "Completed $lib"
+    echo
+done
+
+echo "All benchmarks completed!"
+echo "Results are saved in the output directory organized by operating system."
+echo "Check output/$(uname -s | tr '[:upper:]' '[:lower:]')/ for results."
diff --git a/tools/__init__.py b/tools/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/tools/analyze_images.py b/tools/analyze_images.py
new file mode 100644
index 0000000..640823f
--- /dev/null
+++ b/tools/analyze_images.py
@@ -0,0 +1,59 @@
+import statistics
+from pathlib import Path
+
+from PIL import Image
+
+
+def analyze_images(folder_path: str | Path, limit: int = 2000) -> None:
+    """Analyze first N JPEG images in the given folder."""
+    folder = Path(folder_path)
+    file_sizes = []
+    resolutions = []
+
+    # Collect data from first 'limit' images
+    for img_path in sorted(folder.glob("*.*"))[:limit]:
+        if img_path.suffix.lower() in {".jpg", ".jpeg"}:
+            # Get file size in KB
+            file_sizes.append(img_path.stat().st_size / 1024)
+
+            # Get image resolution
+            with Image.open(img_path) as img:
+                resolutions.append(img.size)
+
+    if not file_sizes:
+        print("No JPEG images found in the folder")
+        return
+
+    # Analyze file sizes
+    avg_size = statistics.mean(file_sizes)
+    min_size = min(file_sizes)
+    max_size = max(file_sizes)
+    median_size = statistics.median(file_sizes)
+
+    # Analyze resolutions
+    widths, heights = zip(*resolutions, strict=False)
+    min_res = (min(widths), min(heights))
+    max_res = (max(widths), max(heights))
+    avg_res = (statistics.mean(widths), statistics.mean(heights))
+    median_res = (statistics.median(widths), statistics.median(heights))
+
+    # Print results
+    print(f"Analysis of first {len(file_sizes)} images:")
+    print("\nFile Sizes (KB):")
+    print(f"- Average: {avg_size:.1f}")
+    print(f"- Median: {median_size:.1f}")
+    print(f"- Range: {min_size:.1f} - {max_size:.1f}")
+    print("\nResolutions (pixels):")
+    print(f"- Average: {avg_res[0]:.0f} x {avg_res[1]:.0f}")
+    print(f"- Median: {median_res[0]:.0f} x {median_res[1]:.0f}")
+    print(f"- Range: {min_res[0]}x{min_res[1]} - {max_res[0]}x{max_res[1]}")
+
+
+if __name__ == "__main__":
+    import sys
+
+    if len(sys.argv) != 2:
+        print("Usage: python script.py <folder_path>")
+        sys.exit(1)
+
+    analyze_images(sys.argv[1])
diff --git a/tools/create_plots.py b/tools/create_plots.py
new file mode 100644
index 0000000..791cbb5
--- /dev/null
+++ b/tools/create_plots.py
@@ -0,0 +1,157 @@
+from __future__ import annotations
+
+import json
+from pathlib import Path
+from typing import NotRequired, TypedDict
+
+import matplotlib.pyplot as plt
+import pandas as pd
+import seaborn as sns
+
+sns.set_theme(style="whitegrid", font="Arial")
+sns.set_context("paper", font_scale=1.5)
+
+
+class BenchmarkResults(TypedDict):
+    images_per_second: str
+    raw_times: list[float]
+
+
+class SystemInfo(TypedDict):
+    Python: str
+    OS: str
+    OS_Version: NotRequired[str]
+    Machine: str
+    CPU: dict
+    imageio: NotRequired[str]
+    kornia: NotRequired[str]
+    opencv: NotRequired[str]
+    skimage: NotRequired[str]
+    tensorflow: NotRequired[str]
+    torchvision: NotRequired[str]
+
+
+class ResultData(TypedDict):
+    library: str
+    system_info: SystemInfo
+    benchmark_results: BenchmarkResults
+    num_images: int
+    num_runs: int
+
+
+def load_results(path: str | Path) -> pd.DataFrame:
+    """Load all JSON results and convert to DataFrame."""
+    results: list[dict] = []
+    path = Path(path)
+
+    for platform_dir in path.iterdir():
+        if not platform_dir.is_dir():
+            continue
+
+        platform = platform_dir.name
+
+        for result_file in platform_dir.glob("*_results.json"):
+            with result_file.open() as f:
+                data: ResultData = json.load(f)
+
+            library = data["library"]
+            if library == "kornia":
+                library = "kornia-rs"
+
+            mean_str, std_str = data["benchmark_results"]["images_per_second"].split("±")
+            mean = float(mean_str.strip())
+            std = float(std_str.strip())
+
+            results.append(
+                {
+                    "platform": platform,
+                    "library": library,
+                    "images_per_second": mean,
+                    "std_dev": std,
+                },
+            )
+
+    return pd.DataFrame(results)
+
+
+def plot_platform_performance(df: pd.DataFrame, platform: str, output_path: str | Path) -> None:
+    """Create a publication-quality horizontal bar plot optimized for two-column paper format."""
+    plt.style.use("default")
+    sns.set_theme(style="whitegrid", font="Arial")
+
+    platform_data = df[df["platform"] == platform].copy()
+    platform_data = platform_data.sort_values("images_per_second", ascending=True)
+
+    # Figure size for two-column paper
+    plt.figure(figsize=(7, 5))
+
+    # Generate colors
+    n_bars = len(platform_data)
+    colors = sns.color_palette("Blues", n_colors=n_bars)
+
+    # Create horizontal bars
+    bars = plt.barh(range(len(platform_data)), platform_data["images_per_second"], height=0.7, color=colors)
+
+    # Add error bars
+    plt.errorbar(
+        platform_data["images_per_second"],
+        range(len(platform_data)),
+        xerr=platform_data["std_dev"],
+        fmt="none",
+        color="black",
+        capsize=4,
+        alpha=0.5,
+        linewidth=1.5,
+    )
+
+    # Add value labels inside bars
+    for i, bar in enumerate(bars):
+        width = bar.get_width()
+        text_color = "white" if i > n_bars / 2 else "black"
+        plt.text(
+            width / 2,
+            bar.get_y() + bar.get_height() / 2,
+            f"{width:.0f}",
+            ha="center",
+            va="center",
+            color=text_color,
+            fontsize=14,
+            fontweight="bold",
+        )
+
+    # Concise, single-line titles
+    platform_titles = {
+        "darwin": "JPEG Decoding Speed (Apple M4 Max)",
+        "linux": "JPEG Decoding Speed (AMD Threadripper 3970X)",
+    }
+    plt.title(platform_titles[platform], pad=20, fontsize=16, fontweight="bold")
+    plt.xlabel("Images per Second", fontsize=14, fontweight="bold")
+    plt.yticks(range(len(platform_data)), platform_data["library"], fontsize=14)
+
+    # Thicker axis lines
+    plt.gca().spines["left"].set_linewidth(1.5)
+    plt.gca().spines["bottom"].set_linewidth(1.5)
+
+    # Adjust grid
+    plt.grid(True, axis="x", linestyle="--", alpha=0.3, linewidth=1.5)
+
+    # Adjust layout
+    plt.tight_layout(pad=1.2)
+
+    # Save plot
+    plt.savefig(output_path, dpi=600, bbox_inches="tight")
+    plt.close()
+
+
+def main() -> None:
+    # Load and process data
+    df = load_results("output")
+
+    # Create visualizations
+    # Create separate plots for each platform
+    plot_platform_performance(df, "darwin", "performance_darwin.png")
+    plot_platform_performance(df, "linux", "performance_linux.png")
+
+
+if __name__ == "__main__":
+    main()