fix(read): truncate long lines at 500 chars as documented by T0mSIlver · Pull Request #38 · microsoft/fastcontext

T0mSIlver · 2026-06-27T01:38:35Z

Problem

read.md states:

Any lines longer than 500 characters will be truncated to 500 characters with '...' appended to the end.

But the code sets MAX_LINE_LENGTH = 2000, so lines of up to 2000 chars were returned in full: a 4x mismatch between the documented and actual line-truncation length.

Fix

Set MAX_LINE_LENGTH = 500 to honor the prompt. MAX_LINE (the 2000-line file cap, which is correctly documented) is left unchanged. Adds tests/test_read_truncation.py asserting a 600-char line is truncated to 500 + "..." and that the constant matches the doc.
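
For illustration, the documented behavior and the kind of check the new test performs can be sketched as below. This is a minimal sketch, not the repository's code: `truncate_line` and both test function names are hypothetical; only `MAX_LINE_LENGTH`, `MAX_LINE`, and the 600-char test case come from this PR description.

```python
# Hypothetical sketch of the documented contract; the real Read implementation may differ.
MAX_LINE_LENGTH = 500  # per-line character cap, as stated in read.md
MAX_LINE = 2000        # per-file line cap, unchanged by this PR (shown for context only)


def truncate_line(line: str, limit: int = MAX_LINE_LENGTH) -> str:
    """Return the line unchanged if it fits, else its first `limit` chars plus '...'."""
    if len(line) <= limit:
        return line
    return line[:limit] + "..."


def test_long_line_is_truncated() -> None:
    # A 600-char line keeps its first 500 chars and gains the '...' marker.
    out = truncate_line("x" * 600)
    assert out == "x" * 500 + "..."
    assert len(out) == MAX_LINE_LENGTH + 3


def test_line_at_limit_is_untouched() -> None:
    # A line of exactly 500 chars is not "longer than 500", so it passes through.
    assert truncate_line("x" * 500) == "x" * 500
```

Note that a truncated line is 503 characters long in this sketch, since read.md describes the '...' as appended after the 500 kept characters.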

Why 500, not 2000

The paper's Read schema specifies no line-length or line-count truncation values (FastContext paper, arXiv:2606.14066, Appendix E, p. 19), so the exact number isn't spec-mandated. But 500 is the principled choice: it keeps a single Read's worst-case output within the model's context budget.

MAX_LINE * MAX_LINE_LENGTH = 2000 * 500 = 1,000,000 chars (1M)
At the common ~4 chars/token heuristic → 1M / 4 ≈ 250k tokens
~250k tokens fits comfortably within the model's max sequence length.

Keeping MAX_LINE_LENGTH = 2000 instead would make the worst case 2000 * 2000 = 4M chars ≈ 1M tokens, which is 4x larger and would exceed typical context limits. So beyond matching read.md, 500 is the value that balances the two caps to a sane ~250k-token ceiling per Read call.
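
The arithmetic above can be reproduced with a short script. This is a rough sketch under the same assumptions as the text: about 4 chars per token, and every one of the 2000 lines at the cap. The helper name `worst_case` is hypothetical, and the estimate ignores the 3-char '...' suffix and any per-line prefix the tool may add.

```python
MAX_LINE = 2000        # per-file line cap
CHARS_PER_TOKEN = 4    # common rough heuristic, not an exact tokenizer ratio


def worst_case(max_line_length: int, max_lines: int = MAX_LINE) -> tuple[int, int]:
    """Return (chars, approx_tokens) for a Read where every line hits the cap."""
    chars = max_lines * max_line_length
    return chars, chars // CHARS_PER_TOKEN


for limit in (500, 2000):
    chars, tokens = worst_case(limit)
    print(f"MAX_LINE_LENGTH={limit}: {chars:,} chars, ~{tokens:,} tokens")
# MAX_LINE_LENGTH=500: 1,000,000 chars, ~250,000 tokens
# MAX_LINE_LENGTH=2000: 4,000,000 chars, ~1,000,000 tokens
```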

read.md states 'Any lines longer than 500 characters will be truncated to 500 characters with ...' but MAX_LINE_LENGTH was 2000, so lines up to 2000 chars passed through untruncated. Set MAX_LINE_LENGTH to 500 to match the documented contract (MAX_LINE, the 2000-line file cap, is correct and unchanged), and add a test covering the truncation length.

T0mSIlver closed this Jun 27, 2026

T0mSIlver deleted the fix/read-line-length branch June 27, 2026 11:32

T0mSIlver restored the fix/read-line-length branch June 27, 2026 11:56

T0mSIlver reopened this Jun 27, 2026

T0mSIlver wants to merge 1 commit into microsoft:main from T0mSIlver:fix/read-line-length
