Improved Question Generation with Automated Retries & Sanity Checks #72
MukeshAofficial wants to merge 11 commits into satvik314:main from
Conversation
Walkthrough

This pull request introduces new Pydantic models (SanityCheckResult, SanityCheckSummary) and adds automated retries, batch-wise generation, and a sanity check flow to the question generation engine.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Caller
    participant Engine as QnAEngine
    participant Result as SanityCheckResult
    participant Summary as SanityCheckSummary
    Client->>Engine: Call sanity_check(question_list, options)
    Engine->>Result: Evaluate each question, create SanityCheckResult(s)
    Engine->>Engine: Auto-correct failed questions if enabled
    Engine->>Summary: Aggregate results into SanityCheckSummary
    Engine->>Client: Return SanityCheckSummary
    Client->>Summary: Optionally call show() to display results
```
Actionable comments posted: 0
🧹 Nitpick comments (6)
educhain/models/__init__.py (2)
2-6: Consider pruning unused imports or exporting them.

Several imports (MultipleChoiceQuestion, ShortAnswerQuestion, TrueFalseQuestion, FillInBlankQuestion, MCQList, ShortAnswerQuestionList, TrueFalseQuestionList, FillInBlankQuestionList, Option, MCQMath, MCQListMath, SanityCheckResult, SanityCheckSummary) appear unused in this module. Consider removing them or adding them to an __all__ list to clarify their intended usage and reduce lint warnings.

Example removal for unused imports:

```diff
-from .qna_models import (
-    MultipleChoiceQuestion, ShortAnswerQuestion, TrueFalseQuestion, FillInBlankQuestion,
-    MCQList, ShortAnswerQuestionList, TrueFalseQuestionList, FillInBlankQuestionList,
-    Option, MCQMath, MCQListMath, SanityCheckResult, SanityCheckSummary
-)
+from .qna_models import (
+    SanityCheckResult, SanityCheckSummary
+)
```

🧰 Tools
🪛 Ruff (0.8.2)
3-3, 4-4, 5-5: F401: each of the .qna_models names listed above is imported but unused; consider removing, adding to __all__, or using a redundant alias
7-7: Consider pruning unused imports or exporting them.

The imports from .content_models in this line do not appear to be referenced in this file. You might remove them or export them in __all__ if you intend to make them publicly accessible.

🧰 Tools
🪛 Ruff (0.8.2)
7-7: F401: .content_models imports (ContentElement, SubTopic, MainTopic, LessonPlan, Flashcard, FlashcardSet) imported but unused; consider removing, adding to __all__, or using a redundant alias
educhain/models/qna_models.py (1)

153-191: Remove unnecessary f-string prefix.

Line 173 uses an f-string but does not include any substitution placeholders, which triggers a static analysis warning. You can safely remove the f prefix:

```diff
- print(f"Sanity Check Summary")
+ print("Sanity Check Summary")
```

🧰 Tools
🪛 Ruff (0.8.2)
173-173: f-string without any placeholders; remove extraneous f prefix (F541)
educhain/engines/qna_engine.py (3)

26-29: Remove the unused VisualMCQ import.

Static analysis indicates that VisualMCQ is imported but never used in this file. Removing unused imports keeps the codebase clean:

```diff
-from educhain.models.qna_models import (
-    MCQList, ShortAnswerQuestionList, TrueFalseQuestionList,
-    FillInBlankQuestionList, MCQListMath, Option, SolvedDoubt, SpeechInstructions,
-    VisualMCQList, VisualMCQ, BulkMCQ, BulkMCQList,
-    SanityCheckSummary, SanityCheckResult,
-    MultipleChoiceQuestion, ShortAnswerQuestion, TrueFalseQuestion, FillInBlankQuestion
-)
+from educhain.models.qna_models import (
+    MCQList, ShortAnswerQuestionList, TrueFalseQuestionList,
+    FillInBlankQuestionList, MCQListMath, Option, SolvedDoubt, SpeechInstructions,
+    VisualMCQList, BulkMCQ, BulkMCQList,
+    SanityCheckSummary, SanityCheckResult,
+    MultipleChoiceQuestion, ShortAnswerQuestion, TrueFalseQuestion, FillInBlankQuestion
+)
```

🧰 Tools
🪛 Ruff (0.8.2)
26-26: educhain.models.qna_models.VisualMCQ imported but unused; remove unused import (F401)
1362-1362: Remove extraneous f-string prefix.

Line 1362 uses an f-string without placeholders:

```diff
- additional_fields = f"Options:\n" + "\n".join(failed_result.options)
+ additional_fields = "Options:\n" + "\n".join(failed_result.options)
```

🧰 Tools
🪛 Ruff (0.8.2)
1362-1362: f-string without any placeholders; remove extraneous f prefix (F541)
1429-1429: Unused variable assignment.

response_model at line 1429 is assigned but never used. To eliminate confusion, remove the assignment:

```diff
- response_model = self._get_response_model(question_type)
```

🧰 Tools
🪛 Ruff (0.8.2)
1429-1429: Local variable response_model is assigned to but never used; remove the assignment (F841)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- educhain/models/__init__.py (1 hunks)
- educhain/models/qna_models.py (1 hunks)
- educhain/engines/qna_engine.py (6 hunks)
- educhain/engines/qna_engine.py (1 hunks)
🔇 Additional comments (12)
educhain/engines/qna_engine.py (12)
5-5: No significant change detected.
41-42: Good import addition.

The new imports time and csv are used later for retry backoff and CSV operations, respectively. This is appropriate.
123-131: Automatic model selection looks good.

The _get_response_model dictionary is straightforward and covers all question types. Defaulting to MCQList for unrecognized input is acceptable.
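The dispatch described here can be sketched with plain stand-in classes (a hedged illustration: the real educhain classes are Pydantic models, and the exact question-type strings may differ):

```python
# Stand-in classes; educhain's real list models are Pydantic models.
class MCQList: ...
class ShortAnswerQuestionList: ...
class TrueFalseQuestionList: ...
class FillInBlankQuestionList: ...

# Dictionary dispatch mirroring the _get_response_model pattern: known
# question-type strings map to list models, anything else falls back to MCQList.
RESPONSE_MODELS = {
    "Multiple Choice": MCQList,
    "Short Answer": ShortAnswerQuestionList,
    "True/False": TrueFalseQuestionList,
    "Fill in the Blank": FillInBlankQuestionList,
}

def get_response_model(question_type: str):
    return RESPONSE_MODELS.get(question_type, MCQList)

print(get_response_model("True/False").__name__)  # TrueFalseQuestionList
print(get_response_model("Essay").__name__)       # MCQList
```

The `.get(..., default)` lookup is what makes the fallback behavior a one-liner, which is why the dict approach reads better than an if/elif chain here.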
132-140: Single-response model fallback is sufficient.

Similarly, _get_single_response_model covers known question types, defaulting to MultipleChoiceQuestion. This is consistent with the multi-response approach.
374-375: Feature acceptance: batch parameters.

Introducing batch_size and max_retries_per_batch is a good way to improve reliability and performance in question generation.
380-381: Clear docstring improvement.

The updated docstring accurately reflects automated retries and batch generation.
384-420: Incremental question generation logic.

The while loop logic and batch retry flow are well-structured, ensuring partial results are accumulated.
422-426: Final question slicing.

Capping the accumulated questions to the requested number prevents over-generation. Looks good.
1207-1264: New _generate_batch method.

The helper method is neatly modularized into prompt construction, parsing, and result return. This structure is consistent and maintainable.
1266-1321: CSV loader is adequately structured.

load_questions_from_csv properly constructs the different question models. CSV columns align well with each question type.
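As a rough illustration of the loading step, a CSV-to-question-dict parser might look like the following. The column names and the `|` option separator are assumptions for the sketch, not educhain's actual schema:

```python
import csv
import io

def load_questions(csv_text: str) -> list[dict]:
    """Parse CSV rows into question dicts, splitting the options column."""
    questions = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        q = {"question": row["question"].strip(),
             "answer": row["answer"].strip()}
        if row.get("options"):  # MCQ rows carry pipe-separated options
            q["options"] = [o.strip() for o in row["options"].split("|")]
        questions.append(q)
    return questions

sample = "question,answer,options\nWhat is 2+2?,4,3|4|5|6\n"
print(load_questions(sample)[0]["options"])  # ['3', '4', '5', '6']
```

csv.DictReader keys rows by the header line, which is what lets one loader serve several question types: rows simply omit the columns their type does not use.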
1323-1406: Automatic question correction is well-implemented.

The _correct_question method is a helpful approach for iterative question refinement. Retries guard against parse failures.

🧰 Tools
🪛 Ruff (0.8.2)
1362-1362: f-string without any placeholders; remove extraneous f prefix (F541)
1408-1567: Complete sanity check process.

The expanded sanity_check method with retries, auto-correction, and summary building is comprehensive. Implementation details appear correct and consistent with the new data models.

🧰 Tools
🪛 Ruff (0.8.2)
1429-1429: Local variable response_model is assigned to but never used; remove the assignment (F841)
Actionable comments posted: 4
🔭 Outside diff range comments (1)
educhain/engines/qna_engine.py (1)
37-37: ⚠️ Potential issue: remove duplicate csv import.

The csv module is imported twice, once on line 37 and again on line 43. Remove the duplicate import.

```diff
 import csv
 import matplotlib.pyplot as plt
 import pandas as pd
 import dataframe_image as dfi
 from IPython.display import display, HTML
 import time
-import csv
```

Also applies to: 43-43
🧹 Nitpick comments (1)
educhain/models/qna_models.py (1)
159-160: Remove unnecessary f-string prefix.

The f-string on line 160 doesn't contain any placeholders, making the f prefix unnecessary.

```diff
-print("=" * 80)
-print(f"Sanity Check Summary")
+print("=" * 80)
+print("Sanity Check Summary")
```

🧰 Tools
🪛 Ruff (0.11.9)
160-160: f-string without any placeholders; remove extraneous f prefix (F541)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- README.md (1 hunks)
- cookbook/starters/Starter_guide_question_generation.ipynb (0 hunks)
- educhain/engines/qna_engine.py (13 hunks)
- educhain/models/qna_models.py (1 hunks)
💤 Files with no reviewable changes (1)
- cookbook/starters/Starter_guide_question_generation.ipynb
✅ Files skipped from review due to trivial changes (1)
- README.md
🧰 Additional context used
🧬 Code Graph Analysis (2)

educhain/engines/qna_engine.py (2)
- educhain/models/qna_models.py (21): SanityCheckSummary (152-177), SanityCheckResult (142-150), MCQList (69-70), ShortAnswerQuestionList (72-73), TrueFalseQuestionList (75-76), FillInBlankQuestionList (78-79), MultipleChoiceQuestion (6-16), ShortAnswerQuestion (44-51), TrueFalseQuestion (53-59), FillInBlankQuestion (61-67), show (9-16), show (34-38), show (47-51), show (56-59), show (64-67), show (91-97), show (102-105), show (121-133), show (158-177), show (233-240), show (268-275)
- archive/models.py (9): MCQList (75-76), ShortAnswerQuestionList (78-79), TrueFalseQuestionList (81-82), FillInBlankQuestionList (84-85), MultipleChoiceQuestion (30-40), ShortAnswerQuestion (42-49), TrueFalseQuestion (51-57), FillInBlankQuestion (59-65), show (23-28)

educhain/models/qna_models.py (2)
- educhain/engines/qna_engine.py (3): SanityCheckResult (1262-1270), SanityCheckSummary (1272-1298), show (1278-1298)
- archive/experimental.py (1): show (220-226)
🪛 Ruff (0.11.9)
educhain/engines/qna_engine.py
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
2153-2153: SyntaxError: Expected a statement
educhain/models/qna_models.py
160-160: f-string without any placeholders
Remove extraneous f prefix
(F541)
🔇 Additional comments (4)
educhain/models/qna_models.py (1)
142-178: Well-structured Pydantic models for sanity check functionality.

The new SanityCheckResult and SanityCheckSummary models are well-designed, with clear field descriptions and a useful show() method for formatted output display.

🧰 Tools
🪛 Ruff (0.11.9)
160-160: f-string without any placeholders; remove extraneous f prefix (F541)
educhain/engines/qna_engine.py (3)
225-225: Good improvement to text splitter overlap.

Increasing the chunk overlap from 0 to 200 characters will provide better context continuity when splitting text for vector store creation.
401-462: Excellent implementation of batch generation with retries.

The enhanced generate_questions method now includes:
- Batch processing to avoid overwhelming the LLM
- Automatic retry logic with backoff
- Dynamic batch size reduction on failures
- Proper accumulation and trimming of results

This makes the question generation process more robust and reliable.
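The retry-and-shrink loop those points describe can be sketched generically; `generate` below stands in for the LLM call and is an assumption of the sketch, not educhain's API:

```python
import time

def generate_with_retries(total, generate, batch_size=5,
                          max_retries_per_batch=3, backoff=0.0):
    """Accumulate questions batch by batch, retrying and shrinking on failure."""
    questions = []
    while len(questions) < total:
        size = min(batch_size, total - len(questions))
        for attempt in range(max_retries_per_batch):
            try:
                questions.extend(generate(size))
                break
            except Exception:
                time.sleep(backoff * (attempt + 1))  # linear backoff between retries
                size = max(1, size // 2)             # shrink the failing batch
        else:
            break  # batch kept failing; stop rather than loop forever
    return questions[:total]  # trim any over-generation

# Demo with a generator that fails once, then succeeds.
calls = {"n": 0}
def flaky_generate(n):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient LLM failure")
    return [f"Q{i}" for i in range(n)]

print(len(generate_with_retries(7, flaky_generate, batch_size=5)))  # 7
```

The final slice is the trimming step the review praises: a recovered smaller batch can overshoot the requested count, and `[:total]` caps it.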
1097-1257: Comprehensive sanity check implementation.

The sanity_check method provides excellent functionality:
- Batch processing for efficiency
- Detailed validation of answer correctness, clarity, and distractor plausibility
- Auto-correction of failed questions
- Retry logic for resilience
- In-place updates when auto_correct is enabled

This significantly improves the quality assurance of generated questions.
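The aggregation step can be illustrated with dataclasses standing in for the PR's Pydantic models; the field names follow SanityCheckResult, while the rest is a sketch rather than educhain's implementation:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Result:  # stand-in for the Pydantic SanityCheckResult
    question: str
    answer: str
    answer_correctness: str
    question_clarity: str
    distractor_plausibility: str
    passed: bool
    options: Optional[List[str]] = None

def summarize(results: List[Result]) -> dict:
    """Aggregate per-question checks the way SanityCheckSummary does."""
    passed = sum(r.passed for r in results)
    return {"total_questions": len(results),
            "passed_questions": passed,
            "failed_questions": len(results) - passed}

checks = [
    Result("What is 2+2?", "4", "correct", "clear", "plausible", True),
    Result("Capital of France?", "Lyon", "incorrect", "clear", "plausible", False),
]
print(summarize(checks))
```

Keeping the per-question verdicts alongside the counts is what lets the auto-correct path re-target only the failed entries.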
```python
            print(f"  Question Clarity: {result.question_clarity}")
            print(f"  Distractor Plausibility: {result.distractor_plausibility}")
            print(f"  Passed: {result.passed}\n")
=======
```
Resolve merge conflict
There's an unresolved merge conflict marker (=======) on line 1299. This needs to be resolved by choosing the appropriate code version and removing the conflict markers.
🧰 Tools
🪛 Ruff (0.11.9)
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
1299-1299: SyntaxError: Expected a statement
1299-1300: SyntaxError: Expected a statement
🤖 Prompt for AI Agents
In educhain/engines/qna_engine.py at line 1299, there is an unresolved merge
conflict marker (`=======`). Remove this marker and resolve the conflict by
selecting the correct code version to keep, ensuring the file is clean and
syntactically correct.
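Before committing the resolution, a quick scan for leftover markers can help catch any that were missed; a generic sketch, unrelated to educhain's code:

```python
import re

# Git conflict markers start a line; "<<<<<<<" and ">>>>>>>" carry labels,
# "=======" stands alone.
CONFLICT_RE = re.compile(r"^(<{7} |={7}$|>{7} )", re.MULTILINE)

def find_conflict_markers(text: str) -> list[int]:
    """Return 1-based line numbers that still contain merge conflict markers."""
    return [text[: m.start()].count("\n") + 1
            for m in CONFLICT_RE.finditer(text)]

sample = "a = 1\n<<<<<<< HEAD\nb = 2\n=======\nb = 3\n>>>>>>> feature\n"
print(find_conflict_markers(sample))  # [2, 4, 6]
```

Running such a check (or `git diff --check`, which reports leftover markers) before pushing would have caught the stray `=======` here.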
```python
                }

                validated_question = self._validate_individual_question(q_dict, question_model=question_model)
                if validated_question:
                    batch_validated_questions.append(validated_question)

                # Add to existing questions to prevent duplicates in future batches
                if csv_output_file:
                    for field in ['question', 'question_text', 'stem', 'prompt']:
                        if field in q_dict and q_dict[field]:
                            existing_questions.add(q_dict[field].strip())
                            break
```
Undefined variable 'combo' in metadata assignment

The code references combo["topic"], combo["subtopic"], and combo["learning_objective"] on lines 1370-1373, but combo is not defined in this scope. This appears to be within the _write_questions_to_csv method, which doesn't have access to a combo variable.

This metadata assignment logic should either:
- Be removed from this generic CSV writing method, or
- Have the metadata passed as a parameter to the method
🤖 Prompt for AI Agents
In educhain/engines/qna_engine.py around lines 1373 to 1384, the variable
'combo' is used for metadata assignment but is not defined in this scope,
causing an error. To fix this, either remove the metadata assignment referencing
'combo' from this generic CSV writing method or modify the method signature to
accept the necessary metadata as parameters and use those instead of 'combo'.
Ensure the method only uses variables passed to it or defined within its scope.
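The second option (passing the metadata as a parameter) might look like the following sketch; the function name, signature, and columns are hypothetical, not educhain's actual _write_questions_to_csv:

```python
import csv
import os
import tempfile
from typing import Iterable, Mapping, Optional

def write_questions_to_csv(
    rows: Iterable[Mapping[str, str]],
    path: str,
    metadata: Optional[Mapping[str, str]] = None,
) -> int:
    """Write question dicts to CSV, merging optional metadata into each row."""
    meta = dict(metadata or {})
    written = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = None
        for row in rows:
            merged = {**row, **meta}  # metadata (e.g. topic) rides along per row
            if writer is None:  # header derived from the first merged row
                writer = csv.DictWriter(f, fieldnames=list(merged))
                writer.writeheader()
            writer.writerow(merged)
            written += 1
    return written

# Demo: one question row plus caller-supplied metadata.
path = os.path.join(tempfile.mkdtemp(), "questions.csv")
n = write_questions_to_csv([{"question": "2+2?", "answer": "4"}], path,
                           metadata={"topic": "arithmetic"})
print(n)  # 1
```

With the metadata explicit in the signature, the writer stays generic and the caller that actually owns `combo` decides what to pass.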
```python
from typing import List
from pydantic import BaseModel


class SanityCheckResult(BaseModel):
    question: str
    answer: str
    options: Optional[List[str]] = None
    keywords: Optional[List[str]] = None
    answer_correctness: str
    question_clarity: str
    distractor_plausibility: str
    passed: bool


class SanityCheckSummary(BaseModel):
    results: List[SanityCheckResult]
    total_questions: int
    passed_questions: int
    failed_questions: int

    def show(self):
        """Displays the sanitized questions with their options and sanity check results."""
        print("\nSanity Check Summary:")
        print(f"Total Questions: {self.total_questions}")
        print(f"Passed Questions: {self.passed_questions}")
        print(f"Failed Questions: {self.failed_questions}\n")

        for i, result in enumerate(self.results):
            print(f"Question {i + 1}:")
            print(f"  Question Text: {result.question}")
            print(f"  Answer: {result.answer}")

            if result.options:
                print("  Options:")
                for option in result.options:
                    print(f"  - {option}")

            print(f"  Answer Correctness: {result.answer_correctness}")
            print(f"  Question Clarity: {result.question_clarity}")
            print(f"  Distractor Plausibility: {result.distractor_plausibility}")
            print(f"  Passed: {result.passed}\n")
```
Remove duplicate model definitions

The SanityCheckResult and SanityCheckSummary models are already defined in educhain/models/qna_models.py and imported on line 28. These duplicate definitions should be removed to avoid confusion and potential conflicts.

```diff
-# Extending SanityCheckSummary to include a show() method
-from typing import List
-from pydantic import BaseModel
-
-class SanityCheckResult(BaseModel):
-    question: str
-    answer: str
-    options: Optional[List[str]] = None
-    keywords: Optional[List[str]] = None
-    answer_correctness: str
-    question_clarity: str
-    distractor_plausibility: str
-    passed: bool
-
-class SanityCheckSummary(BaseModel):
-    results: List[SanityCheckResult]
-    total_questions: int
-    passed_questions: int
-    failed_questions: int
-
-    def show(self):
-        """Displays the sanitized questions with their options and sanity check results."""
-        print("\nSanity Check Summary:")
-        print(f"Total Questions: {self.total_questions}")
-        print(f"Passed Questions: {self.passed_questions}")
-        print(f"Failed Questions: {self.failed_questions}\n")
-
-        for i, result in enumerate(self.results):
-            print(f"Question {i + 1}:")
-            print(f"  Question Text: {result.question}")
-            print(f"  Answer: {result.answer}")
-
-            if result.options:
-                print("  Options:")
-                for option in result.options:
-                    print(f"  - {option}")
-
-            print(f"  Answer Correctness: {result.answer_correctness}")
-            print(f"  Question Clarity: {result.question_clarity}")
-            print(f"  Distractor Plausibility: {result.distractor_plausibility}")
-            print(f"  Passed: {result.passed}\n")
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In educhain/engines/qna_engine.py around lines 1259 to 1298, the
SanityCheckResult and SanityCheckSummary models are redefined but already
imported from educhain/models/qna_models.py on line 28. Remove these duplicate
class definitions entirely from this file to avoid conflicts and rely on the
imported models instead.
```python
validated_question = self._validate_individual_question(q_dict, question_model=question_model)
if validated_question:
    batch_validated_questions.append(validated_question)
```
Undefined variable 'batch_validated_questions'

The code references batch_validated_questions on line 1377, but this variable is not defined in the current scope. This appears to be a copy-paste error or incomplete refactoring.

The code should use written_questions instead:

```diff
-validated_question = self._validate_individual_question(q_dict, question_model=question_model)
-if validated_question:
-    batch_validated_questions.append(validated_question)
+# Validation is already handled by the Pydantic model
+written_questions.append(question)
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In educhain/engines/qna_engine.py around lines 1375 to 1378, the variable
batch_validated_questions is used but not defined in the current scope. Replace
all occurrences of batch_validated_questions with written_questions to use the
correctly defined list and fix the undefined variable error.
Description:
Modified the question generation function for better efficiency.
Added automated retries to handle failures gracefully.
Implemented batch-wise question generation for optimized processing.
Introduced a sanity check function to ensure output quality.
Changes:
🔄 Refactored question generation logic.
🔁 Added automated retry mechanism.
📦 Enabled batch-wise processing.
✅ Implemented a sanity check function.