ebook-convert

An ebook format converter that converts MOBI, AZW3, and text-based PDF files to EPUB.

Supported Formats

Source Format	Extension	Method	Layout Preservation
MOBI (KF8)	`.mobi`	Extract embedded EPUB	Near lossless
AZW3	`.azw3` `.azw`	Extract embedded EPUB	Near lossless
PDF (text-based)	`.pdf`	Page-by-page extraction & rebuild	Good

Installation

Requires Python 3.12+. uv is recommended:

git clone <repo-url>
cd ebook_convert

uv sync

Or with pip:

pip install -e .

Usage

Single File

# Output to the same directory with .epub extension
ebook-convert book.mobi
ebook-convert book.azw3
ebook-convert document.pdf

# Specify output path
ebook-convert book.mobi -o ~/Books/output.epub

Batch Convert

# Convert all supported files in a directory
ebook-convert ./my-books/
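
As a rough illustration of what batch mode has to do, the sketch below collects the supported files in a directory and derives the default output path for each. The function names `find_convertible_files` and `default_output_path` are hypothetical and are not taken from the project. The sketch also assumes the scan is non-recursive, which this README does not specify.

```python
from pathlib import Path

# Extensions taken from the Supported Formats table above.
SUPPORTED_EXTENSIONS = {".mobi", ".azw3", ".azw", ".pdf"}


def find_convertible_files(directory):
    """Return supported files directly inside `directory`, sorted by path.

    Assumption: only the top level is scanned (no recursion).
    """
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def default_output_path(input_path):
    """Same directory and file stem as the input, with an .epub extension."""
    return Path(input_path).with_suffix(".epub")
```

Matching on the lowercased suffix means files such as `BOOK.MOBI` are picked up as well.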

Run with uv (no install needed)

uv run ebook-convert book.mobi

Conversion Details

MOBI / AZW3

Uses KindleUnpack (through the mobi package) to unpack Kindle files. KF8 files (most .azw3 and newer .mobi files) contain a full EPUB structure internally, so CSS styles, images, and fonts carry over nearly unchanged.

For legacy MOBI files that only contain HTML, the converter rebuilds the EPUB from extracted HTML/CSS/images, preserving original styles and chapter structure.

PDF

PDF is a fixed-layout format. Converting to reflowable EPUB involves:

Text content: Fully extracted with bold, italic, and other styles preserved
Heading detection: Automatically inferred from font size statistics (text larger than the body font size is treated as a heading)
Image positioning: Sorted by page coordinates and inserted between corresponding text blocks, preserving relative text-image relationships
Paragraph layout: 2em text indent, justified alignment, 1.8 line height
Fonts: Prefers CJK serif fonts (Noto Serif CJK, Source Han Serif, etc.)
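
The heading detection step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the body size is the font size covering the most characters, and that text spans arrive as `(text, font_size)` pairs. The function names and the `tolerance` parameter are hypothetical.

```python
from collections import Counter


def detect_body_size(spans):
    """Return the font size that covers the most characters.

    `spans` is an iterable of (text, font_size) pairs.
    """
    weights = Counter()
    for text, size in spans:
        # Round so that tiny float differences do not split one size in two.
        weights[round(size, 1)] += len(text.strip())
    if not weights:
        raise ValueError("no text spans to analyse")
    return weights.most_common(1)[0][0]


def classify_spans(spans, tolerance=0.5):
    """Label each span "heading" if it is clearly larger than the body size."""
    spans = list(spans)
    body = detect_body_size(spans)
    return [
        ("heading" if size > body + tolerance else "body", text)
        for text, size in spans
    ]
```

Weighting by character count rather than span count keeps a page of many short captions from being mistaken for body text.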

Note: Text cannot be extracted from scanned (image-only) PDFs. OCR is not currently supported.
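
The image positioning step listed above amounts to merging text and image blocks into one reading order. The sketch below assumes each block carries its page index and the y coordinate of its top edge, with y growing downward from the top of the page. The `Block` class and `reading_order` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    kind: str       # "text" or "image"
    page: int       # zero-based page index
    top: float      # y coordinate of the top edge (assumed to grow downward)
    content: str    # paragraph text, or an image file name


def reading_order(blocks):
    """Sort blocks by page, then top to bottom within each page."""
    return sorted(blocks, key=lambda b: (b.page, b.top))
```

Sorting on a single vertical coordinate works for single-column pages. Multi-column layouts would need an additional column key, which is outside this sketch.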

Project Structure

src/ebook_convert/
├── cli.py              # CLI entry point (click)
├── converter.py        # Conversion dispatcher
└── converters/
    ├── base.py         # Base converter class
    ├── mobi.py         # MOBI → EPUB
    ├── azw3.py         # AZW3 → EPUB
    └── pdf.py          # PDF → EPUB

Adding New Formats

Subclass BaseConverter and implement the convert method:

from ebook_convert.converters.base import BaseConverter

class TxtConverter(BaseConverter):
    supported_extensions = [".txt"]

    def convert(self, input_path, output_path):
        # conversion logic
        ...

Then register it in the _CONVERTERS list in converter.py.
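
One way the registration and lookup could fit together is sketched below. The `BaseConverter` here is a stand-in so the example runs on its own, and `get_converter` is a hypothetical helper. Whether the real `_CONVERTERS` list holds classes or instances is an assumption, so check converter.py before copying this.

```python
from pathlib import Path


class BaseConverter:
    """Stand-in for ebook_convert.converters.base.BaseConverter."""
    supported_extensions = []

    def convert(self, input_path, output_path):
        raise NotImplementedError


class TxtConverter(BaseConverter):
    supported_extensions = [".txt"]

    def convert(self, input_path, output_path):
        # conversion logic
        ...


# Assumption: the registry is a list of converter classes.
_CONVERTERS = [TxtConverter]


def get_converter(input_path):
    """Return an instance of the first converter that handles the file's extension."""
    suffix = Path(input_path).suffix.lower()
    for converter_cls in _CONVERTERS:
        if suffix in converter_cls.supported_extensions:
            return converter_cls()
    raise ValueError(f"Unsupported format: {suffix or input_path}")
```

With this shape, adding a format only touches the new converter module and the one-line registry entry.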

Dependencies

click — CLI framework
mobi — Kindle format unpacking (based on KindleUnpack)
ebooklib — EPUB read/write
PyMuPDF — PDF parsing

License

MIT
