feat: Implement Qooper scraper#125

Open
TobeTek wants to merge 3 commits into main from qooper-scraper

Conversation

TobeTek (Collaborator) commented Jan 8, 2025

Summary by Sourcery

Implement a scraper for Qooper data.

New Features:

  • Scrape groups, members, discussions, and events from Qooper.

Tests:

  • No tests were added.

TobeTek requested a review from neomatrix369 as a code owner on January 8, 2025, 22:16

sourcery-ai Bot (Contributor) commented Jan 8, 2025

Reviewer's Guide by Sourcery

This pull request implements a scraper for Qooper data. It uses Selenium to log in to the Qooper website and retrieve an authentication token. Then, it uses the token to make requests to the Qooper API and retrieves data about groups, members, discussions, and events. Finally, it saves the scraped data to JSON files.
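As a rough illustration of the token-then-API flow described above, here is a minimal sketch. It is not the code from this PR: the base URL, the endpoint names, the Bearer header scheme, and the helper names `build_headers` and `scrape_all` are all assumptions made for the example. The HTTP call is passed in as a function so the sketch stays independent of any particular client library.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical values: the real Qooper API base URL and paths are not shown in this PR.
API_BASE = "https://api.example.invalid"
RESOURCES = ["groups", "members", "discussions", "events"]

# A fetcher takes (url, headers) and returns the decoded JSON list of records.
Fetcher = Callable[[str, dict], list]


def build_headers(token: str) -> dict:
    """Attach the saved token to a request; a Bearer scheme is assumed here."""
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def scrape_all(token: str, fetch: Fetcher, out_dir: Path) -> dict:
    """Fetch each resource and write it to <out_dir>/<resource>.json.

    Returns the number of records saved per resource.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    headers = build_headers(token)
    counts = {}
    for name in RESOURCES:
        records = fetch(f"{API_BASE}/{name}", headers)
        target = out_dir / f"{name}.json"
        target.write_text(json.dumps(records, indent=2), encoding="utf-8")
        counts[name] = len(records)
    return counts
```

In practice `fetch` could wrap an HTTP client call, for example `lambda url, headers: requests.get(url, headers=headers).json()`, while tests can pass a stub that returns canned records.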

Sequence diagram for Qooper data scraping process

sequenceDiagram
    participant User
    participant Selenium
    participant QooperWeb
    participant QooperAPI
    participant FileSystem

    User->>Selenium: Start scraping process
    Selenium->>QooperWeb: Navigate to login page
    Selenium->>QooperWeb: Enter email
    Selenium->>QooperWeb: Enter password
    Selenium->>QooperWeb: Select program
    QooperWeb-->>Selenium: Return auth token
    Selenium->>FileSystem: Save auth token

    Note over QooperAPI: Using saved token
    QooperAPI->>QooperAPI: Get groups
    QooperAPI->>QooperAPI: Get group members
    QooperAPI->>QooperAPI: Get discussions
    QooperAPI->>QooperAPI: Get events

    QooperAPI->>FileSystem: Save groups JSON
    QooperAPI->>FileSystem: Save members JSON
    QooperAPI->>FileSystem: Save discussions JSON
    QooperAPI->>FileSystem: Save events JSON

Class diagram for Qooper data models

classDiagram
    class Group {
        +int id
        +string name
        +string description
        +string image_url
        +string kind
        +list[string] tags
        +bool is_joined
        +int members_count
        +int resources_count
        +int discussions_count
        +int past_events_count
        +int upcoming_events_count
        +string created_at
    }

    class GroupMember {
        +int id
        +string first_name
        +string last_name
        +string image_url
        +string current_position
        +string current_organization
        +list[string] roles
    }

    class Discussion {
        +int id
        +int comments_count
        +int upvotes_count
        +int attachments_count
        +string title
        +string description
        +string created_at
        +int publisher
        +list[string] tags
        +list[int] upvotes_user_ids
        +list[int] comments
        +bool is_comment
        +int group_id
        +string group_name
    }

    class GroupEvent {
        +int id
        +string title
        +string timezone
        +string start_time
        +string end_time
        +string image_url
        +string updated_at
        +string created_at
        +string address
        +int publisher
        +int group_id
        +string group_name
    }

    Group "1" -- "*" GroupMember
    Group "1" -- "*" Discussion
    Group "1" -- "*" GroupEvent
    GroupMember "1" -- "*" Discussion
    GroupMember "1" -- "*" GroupEvent
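
To make the associations in the class diagram concrete, the sketch below models two of the classes and resolves the Group-to-Discussion link through `group_id`. The PR states that schemas.py uses Pydantic; this example uses standard-library dataclasses as a stand-in so it runs without extra dependencies. It includes only a subset of the fields, and the helpers `from_record` and `discussions_by_group` are hypothetical names, not part of the PR.

```python
from collections import defaultdict
from dataclasses import dataclass, field, fields


@dataclass
class Group:
    # Subset of the fields listed in the class diagram.
    id: int
    name: str
    description: str = ""
    tags: list[str] = field(default_factory=list)
    members_count: int = 0


@dataclass
class Discussion:
    id: int
    title: str
    publisher: int  # assumed to be the id of the GroupMember who posted
    group_id: int
    group_name: str = ""
    comments_count: int = 0


def from_record(cls, record: dict):
    """Build a model from an API record, ignoring keys the model does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in record.items() if k in known})


def discussions_by_group(discussions: list[Discussion]) -> dict[int, list[Discussion]]:
    """Index discussions by group_id, i.e. the one-to-many Group/Discussion link."""
    grouped = defaultdict(list)
    for discussion in discussions:
        grouped[discussion.group_id].append(discussion)
    return dict(grouped)
```

Unlike Pydantic models, dataclasses do not validate or coerce field types, so this stand-in only shows the shape of the data, not the validation the real schemas would provide.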

File-Level Changes

Change: Implement Qooper scraper

Details:
  • Added scrape.py to scrape data from Qooper API.
  • Added get_token.py to retrieve Qooper authentication token using Selenium.
  • Added schemas.py to define Pydantic models for Qooper data.
  • Added requirements.txt for project dependencies.
  • Added README.md with instructions for setting up and running the scraper.
  • Added .env.example for environment variables.
  • Added .gitignore to exclude unnecessary files from version control.

Files:
  • app/qooper-scraper/scrape.py
  • app/qooper-scraper/get_token.py
  • app/qooper-scraper/schemas.py
  • app/qooper-scraper/requirements.txt
  • app/qooper-scraper/README.md
  • app/qooper-scraper/.env.example
  • app/qooper-scraper/.gitignore

sourcery-ai Bot (Contributor) left a comment

Hey @TobeTek - I've reviewed your changes - here's some feedback:

Overall Comments:

  • Consider adding proper error handling for API responses and validation of returned data to make the scraper more robust
  • Implement rate limiting or request batching to avoid potential API throttling issues when dealing with large datasets
  • Consider using a more secure method for storing authentication tokens rather than plaintext files
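
One way the first two suggestions could be approached is a small retry wrapper with exponential backoff and a basic check on the response shape. This is only a sketch under assumptions: `call_with_retry` and `ApiError` are hypothetical names, the expected payload type (a JSON object) is a guess, and the wrapper retries on any exception for brevity, whereas real code would likely retry only on transient errors such as timeouts or HTTP 429 responses.

```python
import time
from typing import Callable, Optional


class ApiError(Exception):
    """Raised when a request keeps failing after all retries."""


def call_with_retry(
    request: Callable[[], dict],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `request`, retrying with exponential backoff on failure.

    The delay doubles after each failed attempt: base_delay, 2 * base_delay, ...
    """
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        try:
            payload = request()
            # Minimal validation: reject anything that is not a JSON object.
            if not isinstance(payload, dict):
                raise ValueError(f"expected a JSON object, got {type(payload).__name__}")
            return payload
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise ApiError(f"gave up after {max_attempts} attempts") from last_error
```

Spacing out retries this way also slows the scraper down when the API starts rejecting requests, which is a crude form of rate limiting. The `sleep` parameter is injectable so the backoff schedule can be tested without waiting.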
Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟡 Complexity: 1 issue found
  • 🟢 Documentation: all looks good

Comment thread app/qooper-scraper/scrape.py
Comment thread app/qooper-scraper/README.md Outdated
Comment thread app/qooper-scraper/README.md Outdated
Comment thread app/qooper-scraper/README.md Outdated
Comment thread app/qooper-scraper/scrape.py
Comment thread app/qooper-scraper/get_token.py
Comment thread app/qooper-scraper/scrape.py