Title
Source chunk highlight is inconsistent when clicking citation chips
Summary
Clicking a source citation chip in the chat (for example 1, 2, 3) does not always highlight the expected chunk in the book viewer. Some citations highlight correctly, while others do nothing.
Why this is a problem
This app's key promise is source-grounded answers. If source jumps are unreliable, users cannot trust or verify cited content quickly.
Affected areas
climate_streamlit/app.py (citation click -> viewer jump)
climate_streamlit/html_sectioning.py (chunk generation and anchor stamping)
Steps to reproduce
- Run the app.
- Ask a question that returns multiple citations.
- Click each citation chip one by one.
- Observe: some highlights work, some fail.
Expected behavior
Every citation click should:
- target one valid paragraph anchor,
- scroll to the anchor,
- apply paragraph highlight.
Actual behavior
- Some citation clicks intermittently produce no highlight.
- Sometimes the fallback section jump works, but the exact paragraph highlight is not applied.
Probable root cause
Chunking and DOM annotation appear to use different paragraph segmentation behavior in some paths.
This can produce citation anchor_id values that are not present in rendered HTML.
Potential mismatches include:
- Different block handling between format modes (p vs lists/tables/div blocks).
- Paragraph merge behavior for short fragments changing index alignment.
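The index drift described above can be illustrated with a small sketch. It is not the code in html_sectioning.py: the two segmenters, the `min_chars` threshold, and the `p-{index}` anchor format are all assumptions made up for this example. It only shows how merging short fragments in one path, but not in the other, yields anchor IDs that the other path never stamps.

```python
def segment_plain(blocks):
    """Hypothetical path A: one paragraph per non-empty block."""
    return [b.strip() for b in blocks if b.strip()]


def segment_merged(blocks, min_chars=20):
    """Hypothetical path B: fold the next block into a too-short previous one."""
    out = []
    for block in blocks:
        block = block.strip()
        if not block:
            continue
        if out and len(out[-1]) < min_chars:
            # Previous paragraph is a short fragment, so merge into it.
            out[-1] = out[-1] + " " + block
        else:
            out.append(block)
    return out


blocks = [
    "Intro.",
    "A longer paragraph about emissions trends.",
    "Another full paragraph on adaptation.",
]

# Assumed anchor format: "p-<paragraph index>".
chunk_ids = [f"p-{i}" for i, _ in enumerate(segment_plain(blocks))]
stamped_ids = [f"p-{i}" for i, _ in enumerate(segment_merged(blocks))]

# IDs that a citation could carry but the rendered HTML would not contain.
missing = set(chunk_ids) - set(stamped_ids)
print(sorted(missing))
```

With these inputs the chunking path produces three paragraphs and the stamping path produces two, so the last chunk's anchor has no counterpart in the viewer.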
Proposed fix
- Centralize paragraph segmentation so indexing and annotation share the same source-of-truth logic.
- Align anchor stamping across format modes.
- Add validation test: every generated anchor_id must exist in annotated HTML.
- Add fallback logging when anchor_id lookup fails.
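A minimal sketch of the validation check and the fallback logging, using only the Python standard library. It assumes that anchors are stamped as HTML `id` attributes and that each chunk is a dict with an `anchor_id` key; `collect_ids`, `find_missing_anchors`, and the logger name are hypothetical and do not exist in the codebase.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger("citation_anchors")  # hypothetical logger name


class _IdCollector(HTMLParser):
    """Collects the value of every id attribute in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def collect_ids(html):
    parser = _IdCollector()
    parser.feed(html)
    parser.close()
    return parser.ids


def find_missing_anchors(chunks, annotated_html):
    """Return the anchor_ids of chunks that have no matching id in the HTML."""
    present = collect_ids(annotated_html)
    missing = [c["anchor_id"] for c in chunks if c["anchor_id"] not in present]
    for anchor_id in missing:
        # Fallback logging: record the failed lookup instead of failing silently.
        logger.warning("anchor_id %r not found in annotated HTML", anchor_id)
    return missing
```

In a test, the check would assert that the returned list is empty for each format mode, for example `assert find_missing_anchors(chunks, annotated_html) == []`, where `chunks` and `annotated_html` come from the same source document.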
Acceptance criteria
- Every citation click highlights its target paragraph, reproducibly across repeated runs.
- No missing anchors for retrieved chunks.
- Tests cover both format types and mixed block content.
Suggested labels
bug, rag, frontend, high priority, good first issue