Problem
Some real-world files have metadata or description rows before the actual headers. For example:
(empty row)
*required, *required, *required, Ex: 45.398792, ...
public name, private name, address, GPS latitude, ...
ADRET, ADRET, COURCHEVEL 1550, ...
Currently, Sources::Csv and Sources::Xlsx always read the first row as headers (csv_content.lines.first / sheet.simple_rows.first). There is no way to skip leading rows.
Proposal
Add a header_row option (1-based index, default: 1) configurable per target or via import config:
Target DSL:
class HousingTarget < DataPorter::Target
  sources :csv, :xlsx
  header_row 3 # skip 2 description rows
end
Or via import config (runtime):
import.config = { "header_row" => 3 }
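As a rough sketch of how the two configuration paths could fit together, the snippet below uses stand-in classes rather than the real DataPorter::Target. SketchTarget, SketchHousingTarget and resolve_header_row are hypothetical names. The precedence rule, where the runtime config overrides the target-level value, is an assumption and not something this proposal has settled.

```ruby
# Hypothetical stand-in for DataPorter::Target; the real class is not shown here.
class SketchTarget
  DEFAULT_HEADER_ROW = 1

  class << self
    # With an argument this acts as the DSL setter, without one as the reader.
    def header_row(value = nil)
      return @header_row || DEFAULT_HEADER_ROW if value.nil?

      row = Integer(value)
      raise ArgumentError, "header_row must be >= 1 (got #{row})" if row < 1

      @header_row = row
    end
  end
end

class SketchHousingTarget < SketchTarget
  header_row 3 # skip 2 description rows
end

# Assumed precedence: a runtime "header_row" in the import config wins,
# otherwise the value declared on the target (or the default of 1) is used.
def resolve_header_row(target_class, config)
  from_config = config && config["header_row"]
  return target_class.header_row if from_config.nil?

  row = Integer(from_config)
  raise ArgumentError, "header_row must be >= 1 (got #{row})" if row < 1

  row
end
```

Validating the value at both entry points keeps a header_row of 0 or a negative number from silently turning into a negative array index later on.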
Both Sources::Csv#headers / #fetch and Sources::Xlsx#headers / #fetch would skip header_row - 1 rows before reading headers, and parse data rows from header_row + 1 onward.
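To make the index arithmetic concrete, here is a minimal sketch for the CSV side. It uses Ruby's standard csv library instead of the csv_content.lines.first approach mentioned above, and split_rows is a hypothetical helper, not an existing DataPorter method. The sample mirrors the first three columns of the example file, including the leading empty row.

```ruby
require "csv"

# Hypothetical helper: returns [headers, data_rows] for a 1-based header_row.
def split_rows(csv_content, header_row: 1)
  raise ArgumentError, "header_row must be >= 1" if header_row < 1

  # Array of arrays; a blank line is kept as an empty array, so it still counts as a row.
  rows = CSV.parse(csv_content)

  headers = rows[header_row - 1] # skip header_row - 1 leading rows
  raise ArgumentError, "file has no row #{header_row}" if headers.nil?

  headers = headers.map { |h| h.to_s.strip }
  # Data starts at row header_row + 1, which is index header_row.
  data = rows.drop(header_row).map { |row| headers.zip(row).to_h }
  [headers, data]
end

sample = <<~ROWS

  *required,*required,*required
  public name,private name,address
  ADRET,ADRET,COURCHEVEL 1550
ROWS

headers, data = split_rows(sample, header_row: 3)
puts headers.inspect
puts data.inspect
```

The same offsets (header at index header_row - 1, data from index header_row) would presumably carry over to the rows yielded by sheet.simple_rows on the Xlsx side, although that path is not sketched here.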
Use case
Property management platforms where operators export templates with instruction rows above the actual column headers (e.g. *required, Ex: 45.398792). Asking users to manually clean files before import adds friction.