Skip to content

tstone-1/xlsxturbo

Repository files navigation

xlsxturbo

High-performance Excel writer with automatic type detection. Written in Rust, usable from Python.

Features

  • Direct DataFrame support for pandas and polars
  • Excel tables - filterable tables with 61 built-in styles (banded rows, autofilter)
  • Conditional formatting - color scales, data bars, icon sets for visual data analysis
  • Formula columns - add calculated columns with Excel formulas
  • Merged cells - merge cell ranges for headers and titles
  • Hyperlinks - add clickable links to cells
  • Comments/Notes - add cell annotations with optional author
  • Data validation - dropdowns, number ranges, text length constraints
  • Rich text - multiple formats within a single cell
  • Images - embed PNG, JPEG, GIF, BMP in cells
  • Checkboxes - interactive cell checkboxes (Excel for Microsoft 365, Sept 2024+)
  • Textboxes - floating text shapes with configurable font, fill, and line colors
  • Defined names - workbook-level named ranges for formulas and references
  • Arbitrary cell writes - write values to specific cells with optional formatting
  • Border styles - per-side borders (left, right, top, bottom) with 13 style options
  • Text alignment - horizontal and vertical alignment with text wrapping
  • Auto-fit columns - automatically adjust column widths to fit content
  • Custom column widths - set specific widths per column or cap all with _all
  • Header styling - bold, colors, font size for header row
  • Named tables - set custom table names
  • Custom row heights - set specific heights per row
  • Freeze panes - freeze header row for easier scrolling
  • Multi-sheet workbooks - write multiple DataFrames to one file
  • Per-sheet options - override settings per sheet in multi-sheet workbooks
  • Constant memory mode - minimize RAM usage for very large files
  • Parallel CSV processing - optional multi-core parsing for large files
  • Automatic type detection from CSV strings and Python objects:
    • Integers and floats → Excel numbers
    • true/false → Excel booleans
    • Dates (2024-01-15, 15/01/2024, etc.) → Excel dates with formatting
    • Datetimes (ISO 8601) → Excel datetimes
    • NaN/Inf → Empty cells (graceful handling)
    • Everything else → Text
  • ~7x faster than pandas + openpyxl (see benchmarks)
  • Memory efficient - streams data with 1MB buffer
  • Available as both Python library and CLI tool

Installation

pip install xlsxturbo

Or build from source:

pip install maturin
maturin develop --release

Python Usage

DataFrame Export (pandas/polars)

import xlsxturbo
import pandas as pd

# Create a DataFrame
df = pd.DataFrame({
    'name': ['Alice', 'Bob'],
    'age': [30, 25],
    'salary': [50000.50, 60000.75],
    'active': [True, False]
})

# Export to XLSX (preserves types: int, float, bool, date, datetime)
rows, cols = xlsxturbo.df_to_xlsx(df, "output.xlsx")
print(f"Wrote {rows} rows and {cols} columns")

# Works with polars too!
import polars as pl
df_polars = pl.DataFrame({'x': [1, 2, 3], 'y': [4.0, 5.0, 6.0]})
xlsxturbo.df_to_xlsx(df_polars, "polars_output.xlsx", sheet_name="Data")

Excel Tables with Styling

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'Product': ['Widget A', 'Widget B', 'Widget C'],
    'Price': [19.99, 29.99, 39.99],
    'Quantity': [100, 75, 50],
})

# Create a styled Excel table with autofilter, banded rows, and auto-fit columns
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    table_style="Medium9",   # Excel's default table style
    autofit=True,            # Fit column widths to content
    freeze_panes=True        # Freeze header row for scrolling
)

# Available styles: Light1-Light21, Medium1-Medium28, Dark1-Dark11
xlsxturbo.df_to_xlsx(df, "dark_table.xlsx", table_style="Dark1", autofit=True)

Custom Column Widths and Row Heights

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Department': ['Engineering', 'Marketing', 'Sales'],
    'Salary': [75000, 65000, 55000]
})

# Set specific column widths (column index -> width in characters)
xlsxturbo.df_to_xlsx(df, "report.xlsx", 
    column_widths={0: 20, 1: 25, 2: 15}
)

# Set specific row heights (row index -> height in points)
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    row_heights={0: 25}  # Make header row taller
)

# Combine with other options
xlsxturbo.df_to_xlsx(df, "styled.xlsx",
    table_style="Medium9",
    freeze_panes=True,
    column_widths={0: 20, 1: 30, 2: 15},
    row_heights={0: 22}
)

Global Column Width Cap

Use column_widths={'_all': value} to cap all columns at a maximum width:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'VeryLongDescription': ['A' * 100, 'B' * 100],
    'Score': [95, 87]
})

# Cap all columns at 30 characters
xlsxturbo.df_to_xlsx(df, "capped.xlsx", column_widths={'_all': 30})

# Mix specific widths with global cap (specific overrides '_all')
xlsxturbo.df_to_xlsx(df, "mixed.xlsx", column_widths={0: 15, '_all': 30})

# Autofit with cap: fit content, but never exceed 25 characters
xlsxturbo.df_to_xlsx(df, "fitted.xlsx", autofit=True, column_widths={'_all': 25})

Named Excel Tables

Set custom names for Excel tables:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'B'], 'Price': [10, 20]})

# Name the Excel table
xlsxturbo.df_to_xlsx(df, "report.xlsx", 
    table_style="Medium2", 
    table_name="ProductPrices"
)

# Invalid characters are auto-sanitized, digits get underscore prefix
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    table_style="Medium2",
    table_name="2024 Sales Data!"  # Becomes "_2024_Sales_Data_"
)

Header Styling

Apply custom formatting to header cells:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Score': [95, 87]})

# Bold headers
xlsxturbo.df_to_xlsx(df, "bold.xlsx", header_format={'bold': True})

# Full styling with colors
xlsxturbo.df_to_xlsx(df, "styled.xlsx", header_format={
    'bold': True,
    'bg_color': '#4F81BD',   # Blue background
    'font_color': 'white'    # White text
})

# Available options:
# - bold (bool): Bold text
# - italic (bool): Italic text
# - font_color (str): '#RRGGBB' or named color (white, black, red, blue, etc.)
# - bg_color (str): Background color
# - font_size (float): Font size in points
# - underline (bool): Underlined text
# - border (bool|str): True = thin all sides, or style name
# - border_left/right/top/bottom (str): Per-side border style
# - border_color (str): Color for all borders
# - align_horizontal (str): 'left', 'center', 'right', 'fill', 'justify'
# - align_vertical (str): 'top', 'center', 'bottom'
# - wrap_text (bool): Enable text wrapping within cell

Note: Unknown keys (e.g. 'color' instead of 'font_color') and wrong value types raise an error listing the valid options. Applies to header_format, column_formats, conditional_formats[...]['format'], images, and validations.

Column Formatting

Apply formatting to data columns using pattern matching. Unknown keys raise errors (see Header Styling).

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'product_id': [1, 2, 3],
    'product_name': ['Widget A', 'Widget B', 'Widget C'],
    'price_usd': [19.99, 29.99, 39.99],
    'price_eur': [17.99, 26.99, 35.99],
    'quantity': [100, 75, 50]
})

# Format columns by pattern
xlsxturbo.df_to_xlsx(df, "report.xlsx", column_formats={
    'price_*': {'num_format': '$#,##0.00', 'bg_color': '#E8F5E9'},  # All price columns
    'quantity': {'bold': True}  # Exact match
})

# Wildcard patterns:
# - 'prefix*' matches columns starting with 'prefix'
# - '*suffix' matches columns ending with 'suffix'
# - '*contains*' matches columns containing 'contains'
# - 'exact' matches column name exactly

# Available format options:
# - bg_color (str): Background color ('#RRGGBB' or named)
# - font_color (str): Text color
# - num_format (str): Excel number format ('0.00', '#,##0', '0.00%', etc.)
# - bold (bool): Bold text
# - italic (bool): Italic text
# - underline (bool): Underlined text
# - border (bool|str): True = thin all sides, or style name all sides
# - border_left (str): Border style for left side only
# - border_right (str): Border style for right side only
# - border_top (str): Border style for top side only
# - border_bottom (str): Border style for bottom side only
# - border_color (str): Color for all borders ('#RRGGBB' or named)
#
# Border styles: thin, medium, thick, dashed, dotted, double, hair,
#   medium_dashed, dash_dot, medium_dash_dot, dash_dot_dot,
#   medium_dash_dot_dot, slant_dash_dot
# - align_horizontal (str): 'left', 'center', 'right', 'fill', 'justify'
# - align_vertical (str): 'top', 'center', 'bottom'
# - wrap_text (bool): Enable text wrapping within cell

# First matching pattern wins (order preserved)
xlsxturbo.df_to_xlsx(df, "report.xlsx", column_formats={
    'price_usd': {'bg_color': '#FFEB3B'},  # Specific: yellow for USD
    'price_*': {'bg_color': '#E3F2FD'}     # General: blue for other prices
})

# Per-side borders with style control
xlsxturbo.df_to_xlsx(df, "report.xlsx", column_formats={
    'price_usd': {'border_right': 'thick'},              # Thick right border only
    'quantity': {'border': 'thin'},                       # Thin border all sides
    'product_name': {'border_left': 'medium', 'border_right': 'medium'},  # Left+right
})

Multi-Sheet Workbooks

import xlsxturbo
import pandas as pd

# Write multiple DataFrames to separate sheets
df1 = pd.DataFrame({'product': ['A', 'B'], 'sales': [100, 200]})
df2 = pd.DataFrame({'region': ['East', 'West'], 'total': [500, 600]})

xlsxturbo.dfs_to_xlsx([
    (df1, "Products"),
    (df2, "Regions")
], "report.xlsx")

# With styling applied to all sheets
xlsxturbo.dfs_to_xlsx([
    (df1, "Products"),
    (df2, "Regions")
], "styled_report.xlsx", table_style="Medium2", autofit=True, freeze_panes=True)

# With column widths applied to all sheets
xlsxturbo.dfs_to_xlsx([
    (df1, "Products"),
    (df2, "Regions")
], "report.xlsx", column_widths={0: 20, 1: 15})

Per-Sheet Options

Override global settings for individual sheets using a 3-tuple with options dict:

import xlsxturbo
import pandas as pd

df_data = pd.DataFrame({'Product': ['A', 'B'], 'Price': [10, 20]})
df_instructions = pd.DataFrame({'Step': [1, 2], 'Action': ['Open file', 'Review data']})

# Different settings per sheet:
# - "Data" sheet: has header, table style, autofit
# - "Instructions" sheet: no header (raw data), no table style
xlsxturbo.dfs_to_xlsx([
    (df_data, "Data", {"header": True, "table_style": "Medium2"}),
    (df_instructions, "Instructions", {"header": False, "table_style": None})
], "report.xlsx", autofit=True)

# Old 2-tuple API still works - uses global defaults
xlsxturbo.dfs_to_xlsx([
    (df_data, "Sheet1"),  # Uses global header=True, table_style=None
    (df_instructions, "Sheet2", {"header": False})  # Override just header
], "mixed.xlsx", header=True, autofit=True)

Available per-sheet options:

  • header (bool): Include column names as header row
  • autofit (bool): Automatically adjust column widths
  • table_style (str|None): Excel table style or None to disable
  • freeze_panes (bool): Freeze header row
  • column_widths (dict): Custom column widths
  • row_heights (dict): Custom row heights
  • table_name (str): Custom Excel table name
  • header_format (dict): Header cell styling
  • column_formats (dict): Column formatting with pattern matching
  • conditional_formats (dict): Conditional formatting (color scales, data bars, icons)
  • formula_columns (dict): Calculated columns with Excel formulas (column name -> formula template)
  • merged_ranges (list): List of (range, text) or (range, text, format) tuples to merge cells
  • hyperlinks (list): List of (cell, url) or (cell, url, display_text) tuples to add clickable links
  • comments (dict): Cell comments/notes (cell_ref -> text or {text, author})
  • validations (dict): Data validation rules (column name/pattern -> validation config)
  • rich_text (dict): Rich text with multiple formats (cell_ref -> list of segments)
  • images (dict): Embedded images (cell_ref -> path or {path, scale_width, scale_height, alt_text})
  • cells (dict): Arbitrary cell writes (cell_ref -> value or {value, num_format})

Conditional Formatting

Apply visual formatting based on cell values. Unknown keys in the nested format dict raise errors (see Header Styling).

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie', 'Diana'],
    'score': [95, 72, 88, 45],
    'progress': [0.9, 0.5, 0.75, 0.3],
    'status': [3, 2, 3, 1]
})

xlsxturbo.df_to_xlsx(df, "report.xlsx",
    autofit=True,
    conditional_formats={
        # 2-color gradient: red (low) to green (high)
        'score': {
            'type': '2_color_scale',
            'min_color': '#FF6B6B',
            'max_color': '#51CF66'
        },
        # Data bars: in-cell bar chart
        'progress': {
            'type': 'data_bar',
            'bar_color': '#339AF0',
            'solid': True  # Solid fill instead of gradient
        },
        # Icon set: traffic lights
        'status': {
            'type': 'icon_set',
            'icon_type': '3_traffic_lights'
        }
    }
)

Supported conditional format types:

Type Options
2_color_scale min_color, max_color
3_color_scale min_color, mid_color, max_color
data_bar bar_color, border_color, solid, direction
icon_set icon_type, reverse, icons_only
cell criteria, value, min_value, max_value, format

Available icon types:

  • 3 icons: 3_arrows, 3_arrows_gray, 3_flags, 3_traffic_lights, 3_traffic_lights_rimmed, 3_signs, 3_symbols, 3_symbols_uncircled
  • 4 icons: 4_arrows, 4_arrows_gray, 4_traffic_lights, 4_rating
  • 5 icons: 5_arrows, 5_arrows_gray, 5_quarters, 5_rating

Cell rules — highlight cells based on value conditions:

# Single rule
conditional_formats={
    'status': {
        'type': 'cell',
        'criteria': 'equal_to',
        'value': 'ERROR',
        'format': {'bg_color': '#FF0000', 'font_color': 'white', 'bold': True}
    }
}

# Multiple rules on one column (pass a list)
conditional_formats={
    'severity': [
        {'type': 'cell', 'criteria': 'equal_to', 'value': 'HIGH', 'format': {'bg_color': '#FF0000'}},
        {'type': 'cell', 'criteria': 'equal_to', 'value': 'MEDIUM', 'format': {'bg_color': '#FFA500'}},
        {'type': 'cell', 'criteria': 'equal_to', 'value': 'LOW', 'format': {'bg_color': '#FFFF00'}},
    ]
}

# Numeric comparison
conditional_formats={'score': {'type': 'cell', 'criteria': 'between', 'min_value': 0, 'max_value': 50, 'format': {'bg_color': '#FF0000'}}}

Available criteria for cell type:

Criteria Value keys Description
equal_to, not_equal_to value Exact match (string or number)
greater_than, less_than value Numeric comparison
greater_than_or_equal_to, less_than_or_equal_to value Numeric comparison
between, not_between min_value, max_value Range check
containing, not_containing value Text contains substring
begins_with, ends_with value Text prefix/suffix match
blanks, no_blanks (none) Empty/non-empty cells

Column patterns work with conditional formats:

# Apply data bars to all columns starting with "price_"
conditional_formats={'price_*': {'type': 'data_bar', 'bar_color': '#9B59B6'}}

Formula Columns

Add calculated columns to your Excel output. Formulas are written after data columns and use {row} as a placeholder for the row number:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'price': [100, 200, 150],
    'quantity': [5, 3, 8],
    'tax_rate': [0.1, 0.1, 0.2]
})

xlsxturbo.df_to_xlsx(df, "sales.xlsx",
    autofit=True,
    formula_columns={
        'Subtotal': '=A{row}*B{row}',      # price * quantity
        'Tax': '=D{row}*C{row}',            # subtotal * tax_rate
        'Total': '=D{row}+E{row}'           # subtotal + tax
    }
)

Formula columns appear after data columns (A=price, B=quantity, C=tax_rate, D=Subtotal, E=Tax, F=Total).

Notes:

  • {row} is replaced with the Excel row number (1-based, starting at 2 for data rows when header=True)
  • Formula columns inherit header formatting if specified
  • Column order is preserved (first formula = first new column)
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)

Merged Cells

Merge cell ranges to create headers, titles, or grouped labels:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget A', 'Widget B'],
    'sales': [1500, 2300],
    'revenue': [7500, 11500]
})

# Merge cells for a title above the data
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    header=True,
    merged_ranges=[
        # Simple merge with text (auto-centered)
        ('A1:C1', 'Q4 Sales Report'),
        # Merge with custom formatting
        ('A2:C2', 'Regional Data', {
            'bold': True,
            'bg_color': '#4F81BD',
            'font_color': 'white'
        })
    ]
)

Merged range format:

  • Tuple of (range, text) or (range, text, format_dict)
  • Range uses Excel notation: 'A1:D1', 'B3:B10', etc.
  • Format options same as header_format: bold, italic, font_color, bg_color, font_size, underline

Notes:

  • Merged cells are applied after data is written, so plan row positions accordingly
  • When using with header=True, data starts at row 2 (Excel row 2)
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)

Hyperlinks

Add clickable links to cells:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'company': ['Anthropic', 'Google', 'Microsoft'],
    'product': ['Claude', 'Gemini', 'Copilot'],
})

# Add hyperlinks to a new column (D) after the data columns (A, B, C with header)
xlsxturbo.df_to_xlsx(df, "companies.xlsx",
    autofit=True,
    hyperlinks=[
        # Header for the links column
        ('C1', 'https://example.com', 'Website'),
        # Links with company names as display text
        ('C2', 'https://anthropic.com', 'anthropic.com'),
        ('C3', 'https://google.com', 'google.com'),
        ('C4', 'https://microsoft.com', 'microsoft.com'),
    ]
)

Hyperlink format:

  • Tuple of (cell, url) or (cell, url, display_text)
  • Cell uses Excel notation: 'A1', 'B5', etc.
  • Display text is optional; if omitted, the URL is shown

Notes:

  • Hyperlinks write to the specified cell position (overwrites existing content)
  • To add a "links column", target cells beyond your DataFrame columns (as shown above)
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Comments/Notes

Add cell annotations (hover to view):

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget A', 'Widget B'],
    'price': [19.99, 29.99]
})

xlsxturbo.df_to_xlsx(df, "report.xlsx",
    comments={
        # Simple text comment
        'A1': 'This column contains product names',
        # Comment with author
        'B1': {'text': 'Prices in USD', 'author': 'Finance Team'}
    }
)

Comment format:

  • Simple: {'A1': 'Note text'}
  • With author: {'A1': {'text': 'Note text', 'author': 'Name'}}

Notes:

  • Comments appear as small red triangles in the cell corner
  • Hover over the cell to see the comment
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Data Validation

Add dropdowns and input constraints. Unknown keys raise errors (see Header Styling).

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'status': ['Open', 'Closed'],
    'score': [85, 92],
    'price': [19.99, 29.99],
    'code': ['ABC', 'XYZ']
})

xlsxturbo.df_to_xlsx(df, "validated.xlsx",
    validations={
        # Dropdown list
        'status': {
            'type': 'list',
            'values': ['Open', 'Closed', 'Pending', 'Review']
        },
        # Whole number range (0-100)
        'score': {
            'type': 'whole_number',
            'min': 0,
            'max': 100,
            'error_title': 'Invalid Score',
            'error_message': 'Score must be between 0 and 100'
        },
        # Decimal range
        'price': {
            'type': 'decimal',
            'min': 0.0,
            'max': 999.99
        },
        # Text length constraint
        'code': {
            'type': 'text_length',
            'min': 3,
            'max': 10
        }
    }
)

Validation types:

Type Aliases Description Options
list - Dropdown menu values (list of strings, max 255 chars total)
whole_number whole, integer Integer range min, max
decimal number Decimal range min, max
text_length textlength, length Character count min, max

Optional message options:

  • input_title, input_message: Prompt shown when cell is selected
  • error_title, error_message: Message shown when invalid data is entered

Notes:

  • Validations apply to the data rows of the specified column
  • Column patterns work: 'score_*': {...} matches all columns starting with score_
  • If only min or only max is specified, the other defaults to the type's extreme value
  • List validation values are limited to 255 total characters (Excel limitation)
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Rich Text

Multiple formats within a single cell:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

xlsxturbo.df_to_xlsx(df, "rich.xlsx",
    rich_text={
        'D1': [
            ('Important: ', {'bold': True, 'font_color': 'red'}),
            'Please review ',
            ('all', {'italic': True}),
            ' values'
        ],
        'D2': [
            ('Status: ', {'bold': True}),
            ('OK', {'font_color': 'green', 'bold': True})
        ]
    }
)

Segment format:

  • Formatted: ('text', {'bold': True, 'font_color': 'blue'})
  • Plain: 'plain text' (no formatting)

Available format options:

  • bold (bool)
  • italic (bool)
  • font_color (str): '#RRGGBB' or named color
  • bg_color (str): Background color
  • font_size (float)
  • underline (bool)

Notes:

  • Rich text writes to the specified cell position (overwrites existing content)
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Images

Embed images in cells. Unknown keys raise errors (see Header Styling).

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'Product': ['Widget A', 'Widget B'], 'Price': [19.99, 29.99]})

xlsxturbo.df_to_xlsx(df, "catalog.xlsx",
    autofit=True,
    images={
        # Simple path
        'C2': 'images/widget_a.png',
        # With options
        'C3': {
            'path': 'images/widget_b.png',
            'scale_width': 0.5,
            'scale_height': 0.5,
            'alt_text': 'Widget B photo'
        }
    }
)

Image format:

  • Simple: {'C2': 'path/to/image.png'}
  • With options: {'C2': {'path': '...', 'scale_width': 0.5, ...}}

Available options:

  • path (str, required): Path to image file
  • scale_width (float): Width scale factor (1.0 = original)
  • scale_height (float): Height scale factor (1.0 = original)
  • alt_text (str): Alternative text for accessibility

Supported formats: PNG, JPEG, GIF, BMP

Notes:

  • Images are positioned at the specified cell (overlays any existing content)
  • Image file must exist; non-existent files will raise an error
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Checkboxes

Add interactive checkboxes to cells (Excel for Microsoft 365, Sept 2024+). Renders as TRUE or FALSE that can be toggled in Excel:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'Task': ['Write docs', 'Run tests', 'Ship release']})

xlsxturbo.df_to_xlsx(df, "checklist.xlsx",
    checkboxes={
        # Bare bool - simplest form
        'B2': True,
        'B3': False,
        'B4': False,
        # Dict form with cell format
        'C2': {'checked': True, 'format': {'bg_color': '#C6EFCE', 'bold': True}},
    }
)

Checkbox format:

  • Simple: {'A1': True} or {'A1': False}
  • With format: {'A1': {'checked': True, 'format': {...}}}

Available options (dict form):

  • checked (bool, required): Initial state
  • format (dict): Optional cell format. Accepts the same keys as Column Formatting (bg_color, font_color, border, bold, etc.)

Notes:

  • Checkboxes are written AFTER DataFrame data — use cell refs that don't collide with data rows
  • Requires Excel for Microsoft 365 (Sept 2024 or later); older versions will display the underlying boolean value instead
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Textboxes

Add floating text shapes (callouts, annotations) that sit on top of cells. Unknown keys raise errors (both at the top level and inside font).

import xlsxturbo
import pandas as pd

df = pd.DataFrame({'Region': ['North', 'South'], 'Sales': [120, 95]})

xlsxturbo.df_to_xlsx(df, "report.xlsx",
    textboxes={
        # Bare string - simplest form, default size/style
        'D2': 'Simple note',
        # Dict form with full options
        'E5': {
            'text': 'Q4 target met for all regions',
            'width': 220,
            'height': 80,
            'x_offset': 10,
            'y_offset': 5,
            'font': {
                'name': 'Arial',
                'size': 12,
                'bold': True,
                'italic': False,
                'underline': False,
                'color': '#2C3E50',
            },
            'fill_color': '#ECF0F1',
            'line_color': '#34495E',
            'alt_text': 'Q4 summary callout',
        },
    }
)

Textbox format:

  • Simple: {'D2': 'Some text'}
  • With options: {'D2': {'text': '...', 'width': 200, 'font': {'bold': True}, ...}}

Available options (dict form):

  • text (str, required): Textbox contents
  • width, height (int pixels): Shape size. Defaults are 192 × 120 pixels
  • x_offset, y_offset (int pixels): Shift within the anchor cell
  • font (dict): Font options — name, size (points), bold, italic, underline, color (hex or named)
  • fill_color (str): Background fill — hex #RRGGBB or named color
  • line_color (str): Border line — hex #RRGGBB or named color
  • alt_text (str): Alternative text for accessibility

Notes:

  • Textboxes are floating shapes anchored to a cell, not cell-content — they overlay cells without overwriting them
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Defined Names

Create workbook-level named ranges that can be referenced in formulas:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget A', 'Widget B', 'Widget C'],
    'price': [19.99, 29.99, 39.99],
    'quantity': [100, 75, 50]
})

# Define named ranges for use in formulas or external references
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    defined_names={
        "PriceRange": "=Sheet1!$B$2:$B$4",
        "AllData": "=Sheet1!$A$1:$C$4"
    }
)

# Works with multi-sheet workbooks too
df1 = pd.DataFrame({'x': [1, 2, 3]})
df2 = pd.DataFrame({'y': [4, 5, 6]})
xlsxturbo.dfs_to_xlsx([
    (df1, "Data"),
    (df2, "Summary")
], "multi.xlsx",
    defined_names={
        "DataRange": "=Data!$A$1:$A$4",
        "SummaryRange": "=Summary!$A$1:$A$4"
    }
)

Notes:

  • Defined names are workbook-level (not per-sheet)
  • References must use Excel notation with sheet name: =Sheet1!$A$1:$D$100
  • Works with both df_to_xlsx and dfs_to_xlsx

Arbitrary Cell Writes

Write values to specific cells, optionally overwriting DataFrame data:

import xlsxturbo
import pandas as pd

df = pd.DataFrame({
    'product': ['Widget A', 'Widget B'],
    'price': [19.99, 29.99]
})

# Write simple values to specific cells
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    cells={
        'D1': 'Notes',          # String
        'D2': 'Reviewed',       # String
        'D3': 42,               # Number
        'E1': True              # Boolean
    }
)

# Write with number formatting (e.g., force text format for long numbers)
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    cells={
        'C5': 'Total',
        'C6': {'value': '934728173849', 'num_format': '@'},  # Text format
        'C7': {'value': 0.15, 'num_format': '0.00%'}         # Percentage
    }
)

# Overwrite DataFrame cells (cells are written after data)
xlsxturbo.df_to_xlsx(df, "report.xlsx",
    cells={
        'A2': 'OVERRIDE',  # Replaces 'Widget A' in the output
    }
)

Cell value format:

  • Simple: {'A1': 'text'}, {'B2': 42}, {'C3': True}
  • With formatting: {'A1': {'value': '...', 'num_format': '@'}}
  • Additional format options: align_horizontal, align_vertical, wrap_text

Notes:

  • Cells are written after all DataFrame data, so they can overwrite existing values
  • Works with both df_to_xlsx and dfs_to_xlsx (global or per-sheet)
  • Not available in constant memory mode

Constant Memory Mode (Large Files)

For very large files (millions of rows), use constant_memory=True to minimize RAM usage:

import xlsxturbo
import polars as pl

# Generate a large DataFrame
large_df = pl.DataFrame({
    'id': range(1_000_000),
    'value': [i * 1.5 for i in range(1_000_000)]
})

# Use constant_memory mode for large files
xlsxturbo.df_to_xlsx(large_df, "big_file.xlsx", constant_memory=True)

# Also works with dfs_to_xlsx
xlsxturbo.dfs_to_xlsx([
    (large_df, "Data")
], "multi_sheet.xlsx", constant_memory=True)

Note: Constant memory mode disables some features that require random access:

  • table_style (Excel tables)
  • freeze_panes
  • row_heights
  • autofit
  • conditional_formats
  • formula_columns
  • merged_ranges
  • hyperlinks
  • comments
  • validations
  • rich_text
  • images
  • cells

Column widths still work in constant memory mode.

CSV Conversion

import xlsxturbo

# Convert CSV to XLSX with automatic type detection
rows, cols = xlsxturbo.csv_to_xlsx("input.csv", "output.xlsx")
print(f"Converted {rows} rows and {cols} columns")

# Custom sheet name
xlsxturbo.csv_to_xlsx("data.csv", "report.xlsx", sheet_name="Sales Data")

# For large files (100K+ rows), use parallel processing
xlsxturbo.csv_to_xlsx("big_data.csv", "output.xlsx", parallel=True)

# Handle ambiguous dates (01-02-2024: is it Jan 2 or Feb 1?)
xlsxturbo.csv_to_xlsx("us_data.csv", "output.xlsx", date_order="us")   # January 2
xlsxturbo.csv_to_xlsx("eu_data.csv", "output.xlsx", date_order="eu")   # February 1

# date_order options:
# - "auto" (default): ISO first, then European (DMY), then US (MDY)
# - "mdy" or "us": US format (MM-DD-YYYY)
# - "dmy" or "eu": European format (DD-MM-YYYY)

CLI Usage

xlsxturbo input.csv output.xlsx [OPTIONS]

Options

  • -s, --sheet-name <NAME>: Name of the Excel sheet (default: "Sheet1")
  • -d, --date-order <ORDER>: Date parsing order for ambiguous dates (default: "auto")
    • auto: ISO first, then European, then US
    • mdy or us: US format (01-02-2024 = January 2)
    • dmy or eu: European format (01-02-2024 = February 1)
  • -v, --verbose: Show progress information

Examples

# Basic conversion
xlsxturbo sales.csv report.xlsx

# With US date format
xlsxturbo sales.csv report.xlsx --date-order us

# With European date format and verbose output
xlsxturbo sales.csv report.xlsx -d eu -v --sheet-name "Q4 Sales"

Performance

Reference benchmark on 100,000 rows x 50 columns with mixed data types. Your results will vary by system - run the benchmark yourself (see Benchmarking).

All libraries use default settings; outputs differ in styling (e.g. polars auto-sizes columns and bolds headers by default, while xlsxturbo writes bare cells unless asked).

Library Time (s) Rows/sec vs xlsxturbo
xlsxturbo 4.76 21,010 1.0x
polars 18.33 5,455 3.9x
pandas + xlsxwriter 27.66 3,615 5.8x
pandas + openpyxl 35.36 2,828 7.4x

Test system: Windows 11, Python 3.14, AMD Ryzen 9 (32 threads). Median of 3 runs after warmup. Re-run with --markdown to regenerate this table with current variance figures.

Type Detection Examples

CSV Value Excel Type Notes
123 Number Integer
3.14159 Number Float
true / FALSE Boolean Case insensitive
2024-01-15 Date Formatted as date
2024-01-15T10:30:00 DateTime ISO 8601 format
NaN Empty Graceful handling
hello world Text Default

Supported date formats: YYYY-MM-DD, YYYY/MM/DD, DD-MM-YYYY, DD/MM/YYYY, MM-DD-YYYY, MM/DD/YYYY

Known Limitations

  • Datetime precision: Sub-second precision (microseconds) is not preserved. Datetimes are written with second-level granularity, matching Excel's practical display precision.
  • Large integers: Integers exceeding 2^53 (9,007,199,254,740,992) are written as strings to prevent silent precision loss in Excel's floating-point representation.
  • Validation lists: Limited to 255 total characters (Excel limitation).

Building from Source

Requires Rust toolchain and maturin:

# Install maturin
pip install maturin

# Development build
maturin develop

# Release build (optimized)
maturin develop --release

# Build wheel for distribution
maturin build --release

Benchmarking

Run the included benchmark scripts:

# Compare xlsxturbo vs other libraries (100K rows default)
python benchmarks/benchmark.py

# Full benchmark: small, medium, large datasets
python benchmarks/benchmark.py --full

# Custom size
python benchmarks/benchmark.py --rows 500000 --cols 100

# Output formats for CI/documentation
python benchmarks/benchmark.py --markdown
python benchmarks/benchmark.py --json

# Test parallel vs single-threaded CSV conversion
python benchmarks/benchmark_parallel.py

License

MIT

About

High-performance Rust-powered Excel writer for Python. Pandas/polars DataFrames and CSV input, with tables, conditional formatting, formulas, and images.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors