Centralize from_ir conversion; switch IR to frozen dataclasses#42

Merged
twizmwazin merged 1 commit into master from refactor-conversion-frozen-ir
Jun 1, 2026

@twizmwazin
Member

Summary

  • Move all JPype-touching from_ir factory logic out of the sootir/ package and into soot_manager.py as private _convert_* functions. The sootir/ package is now a pure data layer with no Java interop.
  • Switch every IR dataclass from unsafe_hash=True to frozen=True, catching post-init mutation at runtime.
  • Tighten types in soot_manager.py (return type annotations on the conversion helpers, narrower union types for _build_instance_invoke and the binop dispatch table, _JavaObj alias for JPype-returned objects).
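
As a rough illustration of the second bullet, the sketch below shows how a frozen dataclass rejects attribute assignment after construction. `PhiExprSketch` and its fields are hypothetical stand-ins invented for this example, not the actual sootir classes.

```python
from dataclasses import FrozenInstanceError, dataclass


# Hypothetical stand-in for an IR node; the real sootir classes are not shown here.
@dataclass(frozen=True)
class PhiExprSketch:
    type: str
    values: tuple[tuple[str, int], ...]


phi = PhiExprSketch(type="int", values=(("i0", 0), ("i1", 2)))

try:
    phi.values = ()  # assignment after __init__ is rejected on a frozen dataclass
except FrozenInstanceError as exc:
    mutation_error = type(exc).__name__
else:
    mutation_error = None

print(mutation_error)
```

With `unsafe_hash=True` alone, the same assignment would succeed silently and could change the hash of an object already stored in a dict or set.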

Stacked on top of #41 — please merge that first.

Why frozen is now possible

SootPhiExpr previously had to be constructed without block indices and then mutated post-init in a second pass:

phi_expr = SootValue.IREXPR_TO_EXPR[ir_expr]
phi_expr.values = values  # post-init mutation

The new code threads a per-method _Ctx (containing stmt_to_block_idx) through _convert_value, so SootPhiExpr is built in a single shot with its final (value, block_idx) tuples. No mutation, no global side-channel.
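
A minimal sketch of the single-shot idea, under stated assumptions: `CtxSketch`, `PhiSketch`, and `convert_phi` are hypothetical names, plain strings stand in for the Java statement objects JPype would return, and the way the real code obtains each (value, defining statement) pair from Soot is not shown in this PR.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CtxSketch:
    # Maps a statement object to the index of the block that contains it.
    stmt_to_block_idx: dict[Any, int]


@dataclass(frozen=True)
class PhiSketch:
    values: tuple[tuple[str, int], ...]


def convert_phi(ir_args: list[tuple[str, Any]], ctx: CtxSketch) -> PhiSketch:
    """Build the phi node in one step from (value name, defining statement) pairs."""
    values = tuple((name, ctx.stmt_to_block_idx[stmt]) for name, stmt in ir_args)
    return PhiSketch(values=values)


# Plain strings stand in for the Java statement objects (an assumption of this sketch).
ctx = CtxSketch(stmt_to_block_idx={"stmt_a": 0, "stmt_b": 3})
phi = convert_phi([("x_1", "stmt_a"), ("x_2", "stmt_b")], ctx)
print(phi)
```

Because the block indices are available when the node is constructed, nothing has to be patched in afterwards, which is what allows the class to be frozen.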

Notes

  • Conversion functions are now module-private (_convert_*) since they are only used by run_soot. No public API change in sootir/.
  • The IR dataclasses remain hashable: with the default eq=True, frozen=True makes the dataclass generate a value-based __hash__ automatically, so the existing pattern of using SootBlock as a dict key in basic_cfg / exceptional_preds still works.
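
The hashing behaviour can be checked with a small sketch. `BlockSketch` is a hypothetical stand-in for SootBlock, and a plain dict stands in for the frozendict used by basic_cfg / exceptional_preds. Note that the generated `__hash__` only works if every field is itself hashable, hence the tuple rather than a list.

```python
from dataclasses import dataclass


# Hypothetical stand-in for SootBlock; field names are assumptions for illustration.
@dataclass(frozen=True)
class BlockSketch:
    label: str
    idx: int
    statements: tuple[str, ...]  # tuple, not list, so the field itself is hashable


a = BlockSketch("entry", 0, ("x = 1",))
b = BlockSketch("entry", 0, ("x = 1",))  # equal by value, but a distinct object
exit_block = BlockSketch("exit", 1, ("return",))

# A plain dict stands in for the frozendict-based CFG mapping.
cfg = {a: [exit_block]}

# Lookup succeeds with an equal-by-value key, not only with the identical object.
assert cfg[b] == [exit_block]
```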

Test plan

  • Ruff passes
  • pytest tests/ still passes (needs JPype runtime, can't run in dev sandbox)
  • Spot-check that SootMethod.block_by_label still resolves, that frozendict lookups for basic_cfg / exceptional_preds still work
  • Verify Phi expressions are constructed correctly on a shimple-formatted JAR

🤖 Generated with Claude Code

@angr-bot
Member

angr-bot commented Jun 1, 2026

Corpus decompilation diffs can be found at angr/dec-snapshots@master...angr/pysoot_42

@twizmwazin force-pushed the refactor-conversion-frozen-ir branch from 0170b0d to de936b5 on June 1, 2026 20:21
@twizmwazin changed the base branch from fix-ir-int-types to master on June 1, 2026 20:21
@twizmwazin force-pushed the refactor-conversion-frozen-ir branch from de936b5 to 2c48248 on June 1, 2026 20:27
The sootir/ package was a mix of pure data classes and JPype-dependent
from_ir factory methods. Move all conversion logic into soot_manager.py
so the sootir package becomes a pure-data layer.

Then switch every dataclass from unsafe_hash=True to frozen=True. This
was previously impossible because SootPhiExpr was constructed without
block indices and then mutated post-init:

    phi_expr = SootValue.IREXPR_TO_EXPR[ir_expr]
    phi_expr.values = values

The new conversion code threads the per-method stmt_to_block_idx context
through _convert_value, so SootPhiExpr is built in a single shot with its
final (value, block_idx) tuples — no mutation required.

Type improvements in soot_manager.py:
- _JavaObj alias (= Any) for JPype-returned objects, used uniformly on
  ir_* parameters
- Return type annotations on _convert_class, _convert_method,
  _convert_block, _convert_stmt, _convert_value, _convert_expr,
  _build_instance_invoke
- _Ctx.stmt_map / stmt_to_block_idx now typed dict[_JavaObj, int]
- _build_instance_invoke cls parameter tightened from Any to a union of
  the three concrete InstanceInvokeExpr subclasses
- _BINOP_EXPR_NAMES tightened from dict[str, Any] to
  dict[str, type[SootBinopExpr] | type[SootConditionExpr]]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@twizmwazin force-pushed the refactor-conversion-frozen-ir branch from 2c48248 to 982df29 on June 1, 2026 20:38
@twizmwazin merged commit fce3666 into master on Jun 1, 2026
27 of 28 checks passed


2 participants