Skip to content

Add Tessera#671

Open
favyen2 wants to merge 4 commits into
masterfrom
favyen/20260616-tessera
Open

Add Tessera#671
favyen2 wants to merge 4 commits into
masterfrom
favyen/20260616-tessera

Conversation

@favyen2

@favyen2 favyen2 commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

No description provided.

@favyen2

favyen2 commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator Author

@claude can you review this PR?

@claude

claude Bot commented Jun 16, 2026

Copy link
Copy Markdown

Claude finished @favyen2's task in 2m 46s —— View job


PR Review: Add Tessera

  • Load tools and gather context
  • Read changed files
  • Review model implementation
  • Review tests
  • Review documentation
  • Post final review

Overall this is a clean, well-structured implementation. The architecture faithfully mirrors Tessera v1.1 upstream, the tests are solid, and the docs are thorough. I found a few issues worth addressing before merging.


Issues

1. S1 observations are not sorted by day-of-year (potential correctness bug)

tessera.py:624-630 — Sentinel-1 ascending and descending time series are concatenated in a fixed order (ascending first) without sorting by day-of-year. If Tessera was trained with temporally sorted observations across both orbits, the fixed ordering here would produce results inconsistent with training:

s1_seq = torch.cat(
    [
        self._build_sequence(s1a_image, s1a_doys, num_pixels),   # asc DOYs
        self._build_sequence(s1d_image, s1d_doys, num_pixels),   # desc DOYs
    ],
    dim=1,
)

The S2 sequence has no such issue since it's a single modality. Worth verifying against the upstream training code whether merged S1 observations were sorted. Fix this →


2. Dead condition in _init_weights

tessera.py:283-285:

for name, param in self.named_parameters():
    if "weight" in name or name.startswith("W_"):

named_parameters() on a module returns names like W_ir.weight, W_iz.weight, etc. — so name.startswith("W_") is always False (the prefix includes the submodule separator, e.g. "W_ir.weight" not "W_"...). The "weight" in name condition already covers all Linear weight tensors. The startswith branch is dead code. Fix this →


3. div_term recreated on every forward call

tessera.py:372-376:

def forward(self, doy: torch.Tensor) -> torch.Tensor:
    ...
    div_term = torch.exp(
        torch.arange(0, self.d_model, 2, dtype=torch.float)
        * -(math.log(10000.0) / self.d_model)
    ).to(doy.device)

This allocates and computes the frequency terms on every call. Pre-computing and registering as a buffer in __init__ would be cleaner and slightly faster, though for inference workloads this is unlikely to be a bottleneck. Minor.


4. data_source deprecation is silent

tessera.py:526-531data_source is documented as a "Deprecated compatibility argument" but no warnings.warn() is issued. Users who pass it expecting it to control normalization stats will get no feedback that it does nothing. Consider adding:

import warnings
warnings.warn(
    "The data_source argument to Tessera is deprecated and has no effect; "
    "set data_source in TesseraNormalize instead.",
    DeprecationWarning,
    stacklevel=2,
)

Minor Notes

Docs: S2 band list in JSON example includes extra bandsTessera.md:113-126 shows a data-source JSON with 12 S2 bands (including B01 and B09), but the YAML model config example correctly selects the 10 Tessera-order bands. These two examples are consistent (the JSON is for data storage, YAML selects from it), but worth a clarifying comment in the docs since B01/B09 are never used by the model.

valid mask for S1 relies on clipping behaviourtessera.py:225-229: Any S1 pixel at exactly -25 dB (both VV and VH) would be clipped to 0 and silently treated as no-data. This is intentional (matching Tessera's int16 storage), but documenting the -25 dB threshold explicitly in the docstring of _normalize_s1 would help.

ingest: false inside data_sourceTessera.md:136: ingest appears inside the data_source block in the JSON example; double-check this matches the expected rslearn dataset config schema.


Positives

  • weights_only=True on checkpoint loading — good security practice.
  • Two focused test files with clear separation between model shape tests and normalization tests.
  • Pixel-batching in _forward_one_sample avoids OOM on large images.
  • The skip_missing=True pattern in TesseraNormalize integrates cleanly with the existing Transform API.
  • Docstrings throughout are clear and informative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant