Fix IR field types: statement labels/targets are ints, not statements #41

Merged: twizmwazin merged 1 commit into master from fix-ir-int-types on Jun 1, 2026

@twizmwazin
Member

Summary

Several IR fields in sootir/ are annotated as SootStmt or str but actually hold int values (sequential statement indices used as labels). This PR fixes the annotations to match reality.

| File | Field | Was | Now |
|---|---|---|---|
| `soot_block.py` | `SootBlock.label` | `str` | `int` |
| `soot_statement.py` | `GotoStmt.target` | `SootStmt` | `int` |
| `soot_statement.py` | `IfStmt.target` | `SootStmt` | `int` |
| `soot_statement.py` | `LookupSwitchStmt.lookup_values_and_targets` | `frozendict[int, SootStmt]` | `frozendict[int, int]` |
| `soot_statement.py` | `LookupSwitchStmt.default_target` | `SootStmt` | `int` |
| `soot_statement.py` | `TableSwitchStmt.targets` | `tuple[SootStmt, ...]` | `tuple[int, ...]` |
| `soot_statement.py` | `TableSwitchStmt.lookup_values_and_targets` | `frozendict[int, SootStmt]` | `frozendict[int, int]` |
| `soot_statement.py` | `TableSwitchStmt.default_target` | `SootStmt` | `int` |
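
As a rough illustration of what the corrected switch annotations mean for a consumer, here is a minimal sketch. `LookupSwitchSketch` and `resolve` are hypothetical names, not part of pysoot, and `MappingProxyType` stands in for `frozendict` so the snippet needs only the standard library.

```python
from __future__ import annotations

from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping


@dataclass(frozen=True)
class LookupSwitchSketch:
    """Hypothetical stand-in for LookupSwitchStmt; only the two retyped fields."""

    # Maps each case value to the index (label) of the target statement.
    lookup_values_and_targets: Mapping[int, int]
    # Index (label) of the statement to jump to when no case matches.
    default_target: int

    def resolve(self, key: int) -> int:
        """Return the target statement index for a switch key."""
        return self.lookup_values_and_targets.get(key, self.default_target)


# MappingProxyType is a read-only view, used here in place of frozendict.
switch = LookupSwitchSketch(MappingProxyType({10: 4, 20: 7}), default_target=9)
target_for_10 = switch.resolve(10)
target_for_unknown = switch.resolve(99)
```

With `int` targets, the result of `resolve` can be used directly as a label, without first unwrapping a statement object.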

Why

The from_ir factory methods construct these fields from stmt_map[ir_unit], where stmt_map is {u: i for i, u in enumerate(units)} — i.e. ints. Type checkers in strict mode flag the mismatch.
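
The mismatch can be reproduced in isolation. The sketch below assumes a simplified `GotoStmtSketch` class and string placeholders for Soot IR units; both are hypothetical, and only the `stmt_map` comprehension is taken from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GotoStmtSketch:
    """Hypothetical, trimmed-down stand-in for GotoStmt."""

    target: int  # previously annotated as SootStmt


def build_stmt_map(units: list[str]) -> dict[str, int]:
    """Map each IR unit to its sequential index, as in the from_ir factories."""
    return {u: i for i, u in enumerate(units)}


# Placeholder units; real code would hold Soot IR unit objects here.
units = ["nop", "goto first", "return"]
stmt_map = build_stmt_map(units)

# The factory looks the jump target up in stmt_map, so the field receives an int.
goto = GotoStmtSketch(target=stmt_map["nop"])
```

Because the values of `stmt_map` come from `enumerate`, annotating `target` as anything other than `int` is what a strict type checker reports.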

angr's usage confirms ints are the intended semantics:

  • angr/analyses/cfg/cfg_fast_soot.py: block.label + len(block.statements) (arithmetic)
  • angr/engines/soot/statements/base.py: current_method.block_by_label[instr] (dict lookup keyed by block.label)
  • angr/engines/soot/statements/switch.py: targets passed to _get_bb_addr_from_instr
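
The two usage patterns in the list above, label arithmetic and dict lookup keyed by label, can be sketched as follows. `BlockSketch`, `next_label`, and the sample blocks are hypothetical and only mirror the shape of the angr expressions quoted here, not angr's actual code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BlockSketch:
    """Hypothetical stand-in for SootBlock with an int label."""

    label: int  # index of the block's first statement
    statements: tuple[str, ...]


def next_label(block: BlockSketch) -> int:
    """Label of the fall-through block: label + len(statements)."""
    return block.label + len(block.statements)


blocks = [
    BlockSketch(label=0, statements=("a", "b", "c")),
    BlockSketch(label=3, statements=("d", "e")),
]

# Dict lookup keyed by the int label, as in block_by_label[...].
block_by_label = {b.label: b for b in blocks}
fallthrough = block_by_label[next_label(blocks[0])]
```

With a `str` label, the addition in `next_label` would raise a `TypeError`, which is why both patterns only make sense for `int` labels.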

Test plan

  • No runtime behavior change — only annotations are touched
  • Ruff passes
  • Existing test suite (pytest tests/) still passes

🤖 Generated with Claude Code

The following IR fields were annotated as SootStmt or str but the actual
runtime values are int (sequential statement indices used as labels):

- SootBlock.label: str -> int
- GotoStmt.target: SootStmt -> int
- IfStmt.target: SootStmt -> int
- LookupSwitchStmt.lookup_values_and_targets values: SootStmt -> int
- LookupSwitchStmt.default_target: SootStmt -> int
- TableSwitchStmt.targets: tuple[SootStmt, ...] -> tuple[int, ...]
- TableSwitchStmt.lookup_values_and_targets values: SootStmt -> int
- TableSwitchStmt.default_target: SootStmt -> int

Cross-referenced angr's usage to confirm these are always treated as ints:
e.g. cfg_fast_soot.py does `method.block_by_label[stmt.target]` (dict
lookup keyed by label) and `block.label + len(block.statements)`
(arithmetic).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@angr-bot
Member

angr-bot commented Jun 1, 2026

Corpus decompilation diffs can be found at angr/dec-snapshots@master...angr/pysoot_41

@twizmwazin twizmwazin merged commit f251e09 into master Jun 1, 2026
28 checks passed