This repository was archived by the owner on Nov 16, 2023. It is now read-only.

Add support for crawling a GitHub User’s various contributions, independent of a specific org or repo #146

Description

@danisyellis

Our goal is to track contributions by our employees to any open-source project on GitHub. So we'll need to look at each employee’s commits, pull requests, issues, etc. We can do this through the User’s Events.

I have some questions about how to do this:

  1. Is there anything in the current constraints of ghcrawler that will make this an exceptionally difficult task?

  2. How do I say “traverse the Events for a given User”? Is there an example of existing code doing something similar?

    • Based on the discussion in "Add support for traversing Releases" (#94), I thought it would be in the GitHub processor. Inside that file, my understanding is that the call this._addCollection(request, 'repos', 'repo') in user() should tell the crawler to look at a user’s repos and add them to the MongoDB repo collection. But as far as I can tell, it currently processes the user and never reaches the repo function. Because I care most about events right now, I also tried this._addCollection(request, 'events', 'null'); and this._addCollection(request, 'events', 'events'); but neither seemed to do anything.
  3. Will this require an advanced traversal policy? I think I can use the default traversal policy for now and refine it later with an advanced one that fetches fewer things from the user, for example by using a GraphQL query. Is that right?
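
For reference, here is a minimal sketch of what "a User’s Events" looks like at the GitHub REST API level, independent of ghcrawler. It is not ghcrawler code: the helper names (userEventsUrl, countContributions, fetchUserEvents) and the choice of which event types count as contributions are assumptions made for illustration. It only relies on the public events endpoint for a user and on the type and repo.name fields of each event. Note that the Events API only exposes a limited window of recent activity, so this would not cover a user's full history.

```javascript
// Sketch only: these helpers are hypothetical and not part of ghcrawler.

// Event types that roughly map to commits, pull requests, and issues (an assumption).
const CONTRIBUTION_TYPES = ['PushEvent', 'PullRequestEvent', 'IssuesEvent', 'IssueCommentEvent'];

// Build the REST URL for one page of a user's public events.
function userEventsUrl(login, page = 1, perPage = 30) {
  const base = `https://api.github.com/users/${encodeURIComponent(login)}/events/public`;
  return `${base}?per_page=${perPage}&page=${page}`;
}

// Count contribution-like events per repository, ignoring other event types.
function countContributions(events) {
  const counts = {};
  for (const event of events) {
    if (!CONTRIBUTION_TYPES.includes(event.type)) continue;
    const repo = event.repo.name; // "owner/name"
    counts[repo] = counts[repo] || {};
    counts[repo][event.type] = (counts[repo][event.type] || 0) + 1;
  }
  return counts;
}

// Fetch one page of events (assumes Node 18+ for the global fetch).
async function fetchUserEvents(login, page = 1) {
  const response = await fetch(userEventsUrl(login, page), {
    headers: { Accept: 'application/vnd.github+json' },
  });
  if (!response.ok) throw new Error(`GitHub API returned ${response.status}`);
  return response.json();
}
```

Calling fetchUserEvents('some-login').then(countContributions) would give a per-repository tally for that page of events, regardless of which org or repo the activity happened in.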
