Skip to content

train/sft: add type annotations to all Python scripts#85

Open
VishvakR wants to merge 1 commit into
StarTrail-org:mainfrom
VishvakR:add-type-hints-train-sft
Open

train/sft: add type annotations to all Python scripts#85
VishvakR wants to merge 1 commit into
StarTrail-org:mainfrom
VishvakR:add-type-hints-train-sft

Conversation

@VishvakR

Copy link
Copy Markdown

Add comprehensive Python type annotations across all 18 scripts in train/sft/, improving IDE support and debuggability of raw data dicts.

Changes:

  • Define TypedDict classes for recurring data shapes: ShareGPTMessage, ShareGPTExample, ImageInfo, DatasetInfoEntry, RetrievalRow, RetrievalHit, SplitStats, EvalResult, EMCharMetrics, JudgeMetrics
  • Annotate all function/method signatures (params + return types)
  • Type key local variables (accumulators, containers) where not inferrable from context

Bug fixes discovered during analysis:

  • download_tiles.py: collect_paths() had return annotation set[str] but actually returns dict[str, str | None] — fixed
  • prepare_sft_data_variable.py L100: removed dead dict comprehension expression whose result was discarded

No behavioral changes beyond the dead-code removal. All files pass ruff check, ruff format, and py_compile.

Closes #84

Add comprehensive Python type annotations across all 18 scripts in
train/sft/, improving IDE support and debuggability of raw data dicts.

Changes:
- Define TypedDict classes for recurring data shapes: ShareGPTMessage,
  ShareGPTExample, ImageInfo, DatasetInfoEntry, RetrievalRow,
  RetrievalHit, SplitStats, EvalResult, EMCharMetrics, JudgeMetrics
- Annotate all function/method signatures (params + return types)
- Type key local variables (accumulators, containers) where not
  inferrable from context

Bug fixes discovered during analysis:
- download_tiles.py: collect_paths() had return annotation set[str]
  but actually returns dict[str, str | None] — fixed
- prepare_sft_data_variable.py L100: removed dead dict comprehension
  expression whose result was discarded

No behavioral changes beyond the dead-code removal. All files pass
ruff check, ruff format, and py_compile.
@vercel

vercel Bot commented Jun 23, 2026

Copy link
Copy Markdown

@VishvakR is attempting to deploy a commit to the andylizf's projects Team on Vercel.

A member of the Team first needs to authorize it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Enhancement] Add type annotations to train/sft modules

1 participant