feat: Add DeepLearning AI course scraping functionality and update RE…#126
Open
Contributor
Reviewer's Guide
Implemented a new scraper for DeepLearning.ai courses by querying their Algolia API endpoint. The script fetches course data, parses it using Pydantic models, and saves the results into two distinct CSV files.
Sequence diagram for DeepLearning.ai course scraping
sequenceDiagram
participant Script as scrape_all_courses.py
participant Algolia
participant CSV Files
Script->>Algolia: POST /queries (fetch page 0)
activate Algolia
Algolia-->>Script: Course data (hits) + nbPages
deactivate Algolia
loop Fetch all pages
Script->>Algolia: POST /queries (fetch page N)
activate Algolia
Algolia-->>Script: Course data (hits)
deactivate Algolia
end
Script->>Script: Parse course data (parse_algolia_data)
Script->>CSV Files: Write Courses_and_Learning_Materials.csv
Script->>CSV Files: Write Learning_Pathway_Index.csv
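The paging loop in the diagram above could be sketched roughly as follows. This is only an illustration, not the code in scrape_all_courses.py: `fetch_all_hits` and the `fetch_page` callable are hypothetical names, and the sketch assumes each call returns a single result object carrying `hits` and `nbPages`, as the diagram suggests. In the real script, `fetch_page` would wrap the POST to the Algolia `/queries` endpoint.

```python
from typing import Any, Callable, Dict, List


def fetch_all_hits(fetch_page: Callable[[int], Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collect hits from every page, using nbPages from page 0 as the page count.

    `fetch_page` is a hypothetical callable that takes a zero-based page
    number and returns one result object with "hits" and "nbPages".
    """
    first = fetch_page(0)
    hits = list(first.get("hits", []))
    # Assumption: a response without nbPages means there is only one page.
    nb_pages = int(first.get("nbPages", 1))

    # Page 0 is already fetched, so continue from page 1.
    for page in range(1, nb_pages):
        hits.extend(fetch_page(page).get("hits", []))
    return hits
```

Passing the fetcher in as a callable keeps the loop testable without network access; a fake fetcher backed by a dict is enough to check that every page is requested exactly once.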
Class diagram for new DeepLearning.ai data models
classDiagram
class Course {
+str Module_Code
+str Source
+Optional[str] Course_Level
+Optional[str] Duration
+Optional[str] Prerequisites
+Optional[str] Prework
+str Course_Learning_Material
+str Course_Learning_Material_Link
+str Type_Free_Paid
}
class CourseIndex {
+str Module_Code
+str Course_Learning_Material
+str Source
+str Course_Level
+str Type_Free_Paid
+str Module
+Optional[float] Duration
+Optional[str] Difficulty_Level
+Optional[str] Keywords_Tags_Skills_Interests_Categories
+str Links
}
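To make the mapping from the `Course` model to a CSV row concrete, here is a minimal sketch. It deliberately swaps the PR's Pydantic model for a standard-library dataclass so it runs without extra dependencies; the field names come from the class diagram above, while `courses_to_csv` and the sample values in the usage note are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, Optional


@dataclass
class Course:
    # Dataclass stand-in for the Pydantic model; fields follow the class diagram.
    Module_Code: str
    Source: str
    Course_Level: Optional[str]
    Duration: Optional[str]
    Prerequisites: Optional[str]
    Prework: Optional[str]
    Course_Learning_Material: str
    Course_Learning_Material_Link: str
    Type_Free_Paid: str


def courses_to_csv(courses: Iterable[Course]) -> str:
    """Serialize courses to CSV text with one column per model field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Course)])
    writer.writeheader()
    for course in courses:
        # The csv module writes None values as empty cells.
        writer.writerow(asdict(course))
    return buffer.getvalue()
```

For example, `courses_to_csv([Course("X-001", "DeepLearning.ai", None, None, None, None, "Example Course", "https://example.com/course", "Free")])` yields a header row followed by one data row (the values here are made up for illustration). Deriving the header from the model fields keeps the columns and the model in sync.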
File-Level Changes
Contributor
Hey @TobeTek - I've reviewed your changes and found some issues that need to be addressed.
Blocking issues:
- Hardcoded Algolia API key and Application ID (link)
- Move the hardcoded Algolia URL, API key, and application ID to a configuration file or environment variables.
- Consider consolidating the Course and CourseIndex data models and their corresponding CSV outputs.
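One possible way to address the hardcoded credentials is to read them from environment variables, sketched below. The variable names `ALGOLIA_APP_ID` and `ALGOLIA_API_KEY` and the helper `load_algolia_headers` are assumptions for illustration, not names from this PR; the two header names are the ones Algolia's REST API uses for authentication.

```python
import os
from typing import Dict, Mapping, Optional


def load_algolia_headers(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build Algolia auth headers from environment variables.

    The variable names are hypothetical; adjust them to the project's conventions.
    """
    env = os.environ if env is None else env
    required = ("ALGOLIA_APP_ID", "ALGOLIA_API_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "X-Algolia-Application-Id": env["ALGOLIA_APP_ID"],
        "X-Algolia-API-Key": env["ALGOLIA_API_KEY"],
    }
```

Accepting an optional mapping makes the helper easy to test without touching the real environment. The endpoint URL could be read the same way.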
Here's what I looked at during the review
- 🟡 General issues: 2 issues found
- 🔴 Security: 1 blocking issue
- 🟢 Testing: all looks good
- 🟢 Documentation: all looks good
f042314 to d5b82e9
…ADME
Summary by Sourcery
Add scraping functionality for DeepLearning.ai courses to the course scraper module
New Features:
Enhancements: