Feature
Add Parquet file export support to NeuG's COPY TO command, enabling users to export query results as Parquet files, a columnar format widely used for analytics and data lake integration.
User Scenarios
- Export for external analytics: Data engineers can export Cypher query results to Parquet files for analysis with Spark, DuckDB, or Presto
- Compression optimization: Configure compression settings (none, snappy, zlib, zstd) to balance file size and performance
- Large dataset export: Export millions of rows without memory issues using streaming writes
Requirements
P1: Core Parquet Export
- Basic export with default SNAPPY compression
- Streaming batch writes (bounded memory regardless of result size; no OOM for large datasets)
- Type mapping (NeuG types to Arrow/Parquet types)
- Arrow schema metadata preservation
P2: Compression and Performance Options
- Configurable compression: none, snappy, zlib, zstd
- Configurable row group size
- Dictionary encoding control
P3: Complex Data Type Support
- Vertex/edge object serialization
- List/array property handling
- Date and timestamp serialization
Syntax
Basic export (default SNAPPY):
COPY (MATCH (n:person) RETURN n.*) TO 'person.parquet';
With options:
COPY (MATCH (n:person) RETURN n.*) TO 'person.parquet' (compression='zstd', row_group_size=65536);