Parquet export of timestamp cols w/o timezone adds tz info anyways #34

@xuxoramos

Description

When exporting a DuckDB table to Parquet, timezone info is forcibly added to timestamp columns, even though the original DuckDB TIMESTAMP type does not carry any timezone information.

Reproducible example:

-- Create the table with three columns
CREATE TABLE github_issue (
    id INTEGER,
    description TEXT,
    date DATE
);

-- Insert mock data into the table
INSERT INTO github_issue (id, description, date) VALUES
(1, 'Issue with login functionality', '2025-03-01'),
(2, 'Error in data processing pipeline', '2025-03-15'),
(3, 'UI bug on the dashboard', '2025-03-25');

-- Create a new table github_issue_step_2 with the same structure but date as TIMESTAMP
CREATE TABLE github_issue_step_2 (
    id INTEGER,
    description TEXT,
    date TIMESTAMP
);

-- Insert data from github_issue into github_issue_step_2
-- Convert the date column to a timestamp
INSERT INTO github_issue_step_2 (id, description, date)
SELECT id, description, CAST(date AS TIMESTAMP)
FROM github_issue;

COPY github_issue_step_2 TO './github_issue_step_2.parquet' (FORMAT 'parquet');

After executing the script, inspect the resulting Parquet file: the timestamp values end with a Z character, which indicates the UTC timezone.
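
To narrow down whether the timezone marker is stored in the file itself or added by the tool used to view it, the file's metadata can be queried from DuckDB directly. This is only a sketch, assuming the file path from the script above; `parquet_schema` and `read_parquet` are DuckDB's built-in Parquet functions, and the exact values reported may differ between DuckDB versions.

```sql
-- Sketch: inspect how the `date` column was stored in the exported file.
-- parquet_schema() reports the physical and logical types from the Parquet metadata.
SELECT name, type, converted_type, logical_type
FROM parquet_schema('./github_issue_step_2.parquet')
WHERE name = 'date';

-- Check which DuckDB type the column maps back to when the file is read again.
DESCRIBE SELECT * FROM read_parquet('./github_issue_step_2.parquet');
```

If the logical type is reported as not UTC-adjusted and the column reads back as plain TIMESTAMP, the Z suffix is probably being added by the viewer rather than by the export.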
