Skip to content

feat(closes OPEN-5988): openlayer-ts should accept csv datasets#178

Open
shah-siddd wants to merge 1 commit intomainfrom
siddhant/open-5988-openlayer-ts-should-accept-csv-datasets
Open

feat(closes OPEN-5988): openlayer-ts should accept csv datasets#178
shah-siddd wants to merge 1 commit intomainfrom
siddhant/open-5988-openlayer-ts-should-accept-csv-datasets

Conversation

@shah-siddd
Copy link

Pull Request

Summary

Adds CSV format support to the CLI dataset loading and writing functionality. The CLI now automatically detects and handles both JSON and CSV datasets based on file extension, maintaining format consistency between input and output files.

Changes

  • Added CSV parsing with proper quote and comma handling (parseCsv, parseCsvLine)
  • Added CSV serialization with automatic quoting for special characters (serializeCsv, escapeCsvValue)
  • Added format detection based on file extension (detectDatasetFormat)
  • Refactored loadDataset to support both JSON and CSV formats
  • Refactored writeDataset to preserve input format in output files
  • Updated CLIHandler to use the new format-aware dataset functions
  • Added comprehensive unit tests for CSV parsing, serialization, and round-trip operations

Context

Previously, the CLI only supported JSON datasets. This change enables users to work with CSV datasets, which are commonly used in data science workflows. The implementation handles edge cases like quoted values, commas within fields, and escaped quotes according to RFC 4180 standards.

Testing

  • Unit tests
  • Manual testing
  • Postman CI/CD
  • Other (please specify)

Test Coverage:

  • JSON dataset loading and writing
  • CSV dataset loading and writing
  • CSV parsing with quoted values and commas
  • CSV serialization with proper escaping
  • Round-trip CSV serialization/parsing

Monitoring

  • No expected impact
  • Added/updated relevant monitoring (Sentry alerts, logs, dashboards)

Notes

  • CSV values are automatically quoted when they contain commas, quotes, or newlines
  • CSV parsing handles escaped quotes ("") correctly
  • Output format matches input format (CSV input → CSV output, JSON input → JSON output)
  • All CSV parsing/serialization is implemented without external dependencies

@whoseoyster whoseoyster requested a review from vikasnair January 5, 2026 11:53
@shah-siddd shah-siddd force-pushed the siddhant/open-5988-openlayer-ts-should-accept-csv-datasets branch 2 times, most recently from fbd3d1a to 40c836a Compare January 28, 2026 06:31
@shah-siddd shah-siddd force-pushed the siddhant/open-5988-openlayer-ts-should-accept-csv-datasets branch from 40c836a to 9b6c6c2 Compare January 28, 2026 06:36
Copy link
Contributor

@gustavocidornelas gustavocidornelas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a few comments to version.ts.

Also, can you discard all changes outside of src/lib? Everything outside of it is auto-generated by Stainless, so this will cause issues. Version updates and changelog are fully handled by them based on the commit messages.

I recently wrote more details about SDK maintenance. I'm sure you know most of the things but sending here just in case.

Also, in the PR description, you mention unit tests but I don't see any. Make sure the description is accurate

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you discard these changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you discard these changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you discard these changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you discard these changes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you discard these changes?

export interface RunReturn {
otherFields: { [key: string]: any };
output: any;
[key: string]: unknown;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you need this?

return { data: parsed, format };
}

export function writeDataset(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function seems to not be used anywhere

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i understand that this would be used to potentially save csv and json. however, i'm not sure we need it. even if the user specifies a csv dataset, we could keep saving a dataset.json with the model outputs, right?

const datasetFullPath = path.resolve(datasetPath);
const rawData = fs.readFileSync(datasetFullPath, 'utf8');
const dataset = JSON.parse(rawData);
const { data: dataset, format } = loadDataset(datasetPath);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format seems to not be used anywhere

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants