Update Hamiltonian simulation compilation tutorial to new template #4969

Open
henryzou50 wants to merge 15 commits into Qiskit:main from henryzou50:update-cmhsc

Conversation

@henryzou50
Collaborator

Summary

Revised compilation-methods-for-hamiltonian-simulation-circuits.ipynb to follow the Tutorial_Template structure, focusing exclusively on benchmarking SABRE, AI transpiler, and Rustiq compilation methods on Hamiltonian simulation circuits from the Hamlib collection.

Key changes from the old notebook:

  • Removed EfficientSU2 section: The old notebook started with a separate Part 1 using efficient_su2 circuits. The revised version focuses entirely on Hamlib Hamiltonian simulation circuits built with PauliEvolutionGate
  • Split benchmarks by scale: Circuits are now grouped into small-scale (<20 qubits) and large-scale (>=20 qubits) with separate analysis for each
  • Added circuit filtering: Filters out circuits exceeding backend qubit count or 5000 decomposed gates to keep benchmarks practical
  • Added execution and fidelity testing: The old notebook had no execution ("focused on the transpilation process"). The revised version uses mirror circuits to measure survival probability on both an Aer simulator (small-scale) and real hardware (large-scale)
  • Improved visualizations: Line charts by circuit index, % improvement over SABRE bar charts, and grouped bar charts for best-performing method with tie tracking
  • Added quantitative analysis: Styled summary tables with mean/stdev and per-circuit comparison tables with best-value highlighting
  • Updated dependencies: Bumped to Qiskit SDK v2.0+, added Qiskit Aer dependency, switched backend from ibm_torino to least_busy()
  • Template compliance: Added learning outcomes, prerequisites, and structured background sections following the standard tutorial template
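As a rough illustration of the quantitative analysis listed above (mean/stdev summaries and % improvement over SABRE), the following pure-Python sketch works on a list of result dicts. The field names (`method`, `circuit`, `depth_2q`, `size`), the helper names, and all numbers are hypothetical and are not taken from the notebook.

```python
from statistics import mean, stdev

# Hypothetical per-circuit results; the values are invented for illustration.
results = [
    {"method": "sabre", "circuit": 0, "depth_2q": 40, "size": 200},
    {"method": "sabre", "circuit": 1, "depth_2q": 60, "size": 300},
    {"method": "ai", "circuit": 0, "depth_2q": 36, "size": 210},
    {"method": "ai", "circuit": 1, "depth_2q": 54, "size": 330},
]


def summarize(results, metric):
    """Return {method: (mean, stdev)} for one metric."""
    by_method = {}
    for row in results:
        by_method.setdefault(row["method"], []).append(row[metric])
    return {
        method: (mean(values), stdev(values) if len(values) > 1 else 0.0)
        for method, values in by_method.items()
    }


def pct_improvement(results, metric, baseline="sabre"):
    """Return {method: {circuit: % improvement over the baseline}}.

    Positive values mean the method produced a lower (better) value
    than the baseline on that circuit.
    """
    base = {r["circuit"]: r[metric] for r in results if r["method"] == baseline}
    improvement = {}
    for row in results:
        if row["method"] == baseline:
            continue
        b = base[row["circuit"]]
        pct = 100.0 * (b - row[metric]) / b
        improvement.setdefault(row["method"], {})[row["circuit"]] = pct
    return improvement


print(summarize(results, "depth_2q"))
print(pct_improvement(results, "size"))
```

Computing the improvement per circuit before averaging keeps a few very large circuits from dominating the comparison, which may be one reason to prefer it over comparing raw means.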

Tutorial structure:

  • Small-scale example (<20 qubits): Full walkthrough with Aer simulator noise evaluation using mirror circuits and survival probability
  • Large-scale example (>=20 qubits): Compressed workflow with real hardware submission and fidelity comparison across all three compilation methods
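The mirror-circuit metric used in both examples can be sketched as follows. This assumes the usual mirror construction, in which a circuit is followed by its inverse so that an ideal device returns the all-zeros bitstring; the function name and the counts below are hypothetical and the notebook's own implementation may differ.

```python
def mirror_fidelity(counts, num_qubits):
    """Fraction of shots that returned the all-zeros bitstring.

    `counts` maps measured bitstrings to shot counts, in the style of
    the dictionaries returned by sampler results.
    """
    shots = sum(counts.values())
    if shots == 0:
        raise ValueError("counts contains no shots")
    return counts.get("0" * num_qubits, 0) / shots


# Invented counts for a 3-qubit mirror circuit.
example_counts = {"000": 900, "001": 60, "100": 40}
print(mirror_fidelity(example_counts, 3))
```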

…utorial

Replace the old notebook with a revised version that follows the tutorial
template and focuses exclusively on Hamiltonian simulation circuits from
the Hamlib benchmark collection (removing the EfficientSU2 section).

Key changes:
- Compare SABRE, AI transpiler, and Rustiq on Hamlib circuits split into
  small-scale (<20 qubits) and large-scale (>=20 qubits) groups
- Filter out circuits exceeding backend qubit count or 5000 decomposed gates
- Add styled summary tables with mean/stdev and % improvement over SABRE
- Add per-circuit comparison tables with best-value highlighting
- Improve plots: line charts by circuit index, % improvement over SABRE,
  grouped bar charts for best-performing method with tie tracking
- Add mirror circuit execution for noise evaluation (Aer sim for small-scale,
  real hardware for large-scale) with survival probability metric
- Revise all commentary to reflect benchmark observations
@henryzou50 requested a review from a team on April 10, 2026 09:25
@qiskit-bot
Contributor

One or more of the following people are relevant to this code:

  • @henryzou50
  • @nathanearnestnoble

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.



Collaborator

@kaelynj left a comment

Left a handful of comments to address! I also think it would be worth including hardware results here: since you're already generating the mirror circuits, it would be fairly easy to check.

Comment thread docs/tutorials/compilation-methods-for-hamiltonian-simulation-circuits.ipynb Outdated
"In addition to these standard metrics, we also record the 2-qubit gate depth, which is a particularly important metric for evaluating execution on quantum hardware. Unlike total depth, which includes all gates, the 2-qubit depth more accurately reflects the circuit's actual execution duration on hardware. This is because 2-qubit gates typically dominate the time and error budget in most quantum devices. As such, minimizing 2-qubit depth is critical for improving fidelity and reducing decoherence effects during execution.\n",
"\n",
"We will use this function to analyze the performance of the different compilation methods across multiple circuits."
"The following function transpiles a list of circuits using a given pass manager and records the key metrics (two-qubit depth, circuit size, and runtime) in a DataFrame."
Collaborator

We should avoid using a DataFrame to present the results. They don't render correctly on the platform.

Collaborator Author

Done!

Comment thread docs/tutorials/compilation-methods-for-hamiltonian-simulation-circuits.ipynb Outdated
"\n",
"### Two-qubit depth and gate count\n",
"\n",
"At large scale, SABRE and the AI transpiler produce similar results overall, with each having a slight edge in different areas: SABRE tends to achieve a slightly lower gate count on average, which aligns with how its heuristic is designed to minimize the number of inserted SWAP gates. The AI transpiler edges ahead slightly on two-qubit depth, consistent with the fact that part of its reinforcement learning training objective optimizes for circuit depth. Both methods are consistent and reliable across the full range of circuits.\n",
Collaborator

SABRE tends to achieve a slightly lower gate count on average, which aligns with how its heuristic is designed to minimize the number of inserted SWAP gates.

This isn't true from the plot. The AI transpiler has a slightly lower gate count on average. And Rustiq performs just about as well as the AI transpiler in terms of 2Q depth.
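For readers following this thread, the two-qubit (2Q) depth being compared can be illustrated with a small pure-Python sketch. It assumes a circuit made only of one- and two-qubit gates, represented as a hypothetical list of qubit-index tuples in circuit order. In Qiskit, the same quantity is typically obtained by passing a filter function to `QuantumCircuit.depth`; the helper below is only an illustration of the idea.

```python
def two_qubit_depth(gates, num_qubits):
    """Number of sequential layers of two-qubit gates.

    `gates` is a list of tuples of qubit indices in circuit order.
    Single-qubit gates are skipped because they do not add a
    two-qubit layer.
    """
    # level[q] = number of two-qubit layers that qubit q has been through
    level = [0] * num_qubits
    for qubits in gates:
        if len(qubits) != 2:
            continue
        a, b = qubits
        # The gate starts only after both qubits are free.
        level[a] = level[b] = max(level[a], level[b]) + 1
    return max(level, default=0)


# Invented 4-qubit example: two parallel gates, then two dependent ones.
example_gates = [(0, 1), (2, 3), (1, 2), (0,), (0, 1)]
print(two_qubit_depth(example_gates, 4))
```

Gate count, by contrast, is simply the number of entries in the list, so a method can win on one metric and lose on the other.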

Collaborator Author

Thanks for catching this! I dug into it and I think the discrepancy comes from the hardware mirror-circuit plot, which only shows a single circuit (one 26-qubit tfim case), and that one happens to be an outlier where SABRE's two-qubit depth is unusually high.

Looking at the aggregate charts instead (% improvement over SABRE and best-performing method by metric), the trend across all the large-scale circuits is:

  • SABRE wins gate count on most circuits (~73%)
  • AI wins two-qubit depth on most circuits (~64%)
  • Rustiq is best on only a small share and isn't comparable to AI on 2Q depth at scale (its averages are dominated by a few large outliers)

So the original statement was actually correct for the large-scale example circuits. I've kept it, but reworded the section to make the per-metric split explicit and added a note clarifying that the mirror plot reflects just one circuit, not the overall results. I also explained this in my latest summary comment, but let me know if anything here should be changed.
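The per-metric win shares quoted in this reply come from counting, for each circuit, which method gives the lowest value. A minimal sketch of that tally with tie tracking is shown here; the helper name, field names, and data are hypothetical and do not reproduce the notebook's numbers.

```python
from collections import Counter


def win_counts(results, metric):
    """Count how often each method has the lowest value of `metric`.

    Circuits where two or more methods share the lowest value are
    counted under the key "tie".
    """
    by_circuit = {}
    for row in results:
        by_circuit.setdefault(row["circuit"], {})[row["method"]] = row[metric]

    wins = Counter()
    for values in by_circuit.values():
        best = min(values.values())
        winners = [method for method, v in values.items() if v == best]
        wins["tie" if len(winners) > 1 else winners[0]] += 1
    return wins


# Invented gate counts for three circuits.
sample = [
    {"method": "sabre", "circuit": 0, "size": 100},
    {"method": "ai", "circuit": 0, "size": 110},
    {"method": "rustiq", "circuit": 0, "size": 120},
    {"method": "sabre", "circuit": 1, "size": 90},
    {"method": "ai", "circuit": 1, "size": 90},
    {"method": "rustiq", "circuit": 1, "size": 95},
    {"method": "sabre", "circuit": 2, "size": 80},
    {"method": "ai", "circuit": 2, "size": 70},
    {"method": "rustiq", "circuit": 2, "size": 200},
]
wins = win_counts(sample, "size")
total = sum(wins.values())
for method, count in wins.items():
    print(f"{method}: {100 * count / total:.0f}% of circuits")
```

A tally like this is insensitive to outliers, whereas averages are not, which is consistent with the note above that a method's averages can be dominated by a few large cases.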

henryzou50 and others added 8 commits June 4, 2026 16:32
…circuits.ipynb

Co-authored-by: Kaelyn Ferris <43348706+kaelynj@users.noreply.github.com>
…utorial

From kaelynj's review:
- Remove the pandas DataFrames used to present results, which don't render
  on the docs platform. Results are now stored as a list of dicts and shown
  via plain-text printed tables (print_summary_table,
  print_per_circuit_comparison), matching the AI transpiler tutorial. Removes
  pandas entirely and rewrites the three plot helpers in pure Python.
- Use "fidelity" instead of the unfamiliar term "survival probability"
  throughout (prose, plot labels, and variables).
- Clarify the large-scale analysis behind kaelynj's gate-count comment. Her
  note came from the single hardware mirror-circuit plot (one tfim circuit,
  an outlier where SABRE's depth is high). The aggregate
  best-performing-method and %-improvement charts show the actual trend,
  which was already
  the case: SABRE wins gate count on most circuits and the AI transpiler
  wins two-qubit depth on most. The analysis now states this explicitly and
  flags that the mirror plot reflects a single circuit, not the aggregate.
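The list-of-dicts plus plain-text table approach described in the first item can be sketched roughly as follows. The helper name `format_table`, the column names, and the values are hypothetical; this is not the notebook's print_summary_table, only an illustration of rendering results without pandas.

```python
def format_table(rows, columns):
    """Render a list of dicts as a fixed-width plain-text table."""
    # Each column is as wide as its header or its longest cell.
    widths = {
        col: max([len(col)] + [len(str(row[col])) for row in rows])
        for col in columns
    }
    header = "  ".join(col.ljust(widths[col]) for col in columns)
    rule = "  ".join("-" * widths[col] for col in columns)
    body = [
        "  ".join(str(row[col]).ljust(widths[col]) for col in columns)
        for row in rows
    ]
    return "\n".join([header, rule, *body])


# Invented summary rows.
summary_rows = [
    {"method": "sabre", "mean_depth_2q": 50.0, "mean_size": 250.0},
    {"method": "ai", "mean_depth_2q": 45.0, "mean_size": 270.0},
]
table = format_table(summary_rows, ["method", "mean_depth_2q", "mean_size"])
print(table)
```

Because the output is an ordinary string, it renders the same way in any notebook front end that shows printed cell output.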

Additional changes:
- Reconcile the rest of the commentary with the re-run results, including
  the small-scale best-method analysis (all three methods close except AI
  on runtime; Rustiq a slight overall edge) and Rustiq's outlier behavior
  at large scale.
- Convert absolute quantum.cloud.ibm.com doc links to relative paths for
  consistency with other tutorials.
- Remove the link to the deprecated Qiskit Transpiler Service from Next
  steps.
@henryzou50
Collaborator Author

Thanks for the review, @kaelynj! I've pushed changes addressing all of your comments:

  • Removed the pandas DataFrames for presenting results, since they don't render on the docs platform. Results are now stored as a list of dicts and shown with plain-text printed tables (same approach as the AI transpiler tutorial). I removed the pandas dependency entirely and rewrote the plot helpers in pure Python.
  • Replaced "survival probability" with "fidelity" throughout (prose, plot labels, and variables) since it's the more recognizable term.
  • The gate-count comment: you were reading the hardware mirror-circuit plot, which is just a single circuit — one 26-qubit tfim case that happens to be an outlier where SABRE's two-qubit depth is unusually high. Looking at the aggregate charts (% improvement over SABRE and best-performing method by metric), the real trend is that SABRE wins gate count on most circuits and the AI transpiler wins two-qubit depth on most — and Rustiq isn't comparable to AI on depth at scale. I've made the analysis state this explicitly and added a note clarifying that each mirror-circuit plot reflects one circuit, not the aggregate.
  • Hardware results: these are already included. The large-scale section builds mirror circuits, submits them to a real backend, and then plots the fidelity (cells in the "Large-scale hardware example" section). Let me know if we should include more hardware results; I kept them minimal because this tutorial focuses on transpilation.

While I was in there, I also:

  • Reconciled the rest of the commentary with the re-run numbers (small-scale results are close across methods except AI on runtime, with Rustiq a slight overall edge; clarified Rustiq's outlier behavior at large scale).
  • Converted the absolute doc links to relative paths for consistency with the other tutorials, and removed the link to the deprecated Qiskit Transpiler Service.

Let me know if you'd like any further changes!

3 participants