FolioHive

AI-Powered GitHub Profile Analysis for Technical Recruiting

FolioHive is a cloud-native SaaS platform that automatically analyzes candidate GitHub profiles to help recruiters assess technical skills, coding style, and project experience. The system aggregates repository metadata, caches relevant code artifacts, and uses AI to generate contextual summaries and answer recruiter queries.


🚀 Quick Start

Prerequisites

  • Python 3.12+
  • Node.js 18+ and npm
  • Azure Functions Core Tools v4
  • Azurite (Azure Storage Emulator)
  • GitHub Personal Access Token (rate limit: 5000 requests/hour)
  • OpenAI API key

Local Development

# Start all development services (Azurite + API + UI)
./run-dev-session.sh --run-e2e -- --python-version 3.12+ --run-tests

Local settings: update `local.settings.json` with your GitHub token and OpenAI API key, along with any Azure Storage connection strings and CORS configuration your environment needs.

Access the application:


🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                         Angular UI (SWA)                        │
│  Landing | Profile | Projects | AI Assistant | Admin Dashboard  │
└────────────────────────┬────────────────────────────────────────┘
                         │ HTTP/REST
┌────────────────────────┴────────────────────────────────────────┐
│                  Azure Functions (Flex Consumption)             │
│  ┌──────────────┬─────────────┬─────────────┬──────────────┐    │
│  │ API Gateway  │ Sync Worker │Cache Worker │Reconciliation│    │
│  │ (HTTP Routes)│ (Queue)     │ (Queue)     │ (Timer)      │    │
│  └──────────────┴─────────────┴─────────────┴──────────────┘    │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────┴────────────────────────────────────────┐
│                    Azure Storage Account                        │
│  ┌─────────────┬──────────────┬────────────────────────────┐    │
│  │Table Storage│ Blob Storage │    Queue Storage           │    │
│  │(7 Tables)   │(Cached Files)│(sync-jobs, cache-jobs)     │    │
│  └─────────────┴──────────────┴────────────────────────────┘    │
└────────────────────────┬────────────────────────────────────────┘
                         │
┌────────────────────────┴────────────────────────────────────────┐
│               External Services                                 │
│  GitHub REST/GraphQL API  |  OpenAI GPT API                     │
└─────────────────────────────────────────────────────────────────┘

Key Components

Function App with Blueprint Pattern (Modular Monolith)

  • API Gateway: HTTP endpoints for sync, job polling, AI summaries, queries
  • Sync Worker: Queue-triggered; fetches GitHub metadata, generates fingerprints
  • Cache Worker: Queue-triggered; fetches file contents (README, configs)
  • Reconciliation Worker: Timer-triggered; cleanup and retry logic (3-min interval)

Shared Modules (foliohive_shared/)

  • ai/: OpenAI integration, context orchestration, token management
  • cache/: Fingerprint-based caching, blob storage management
  • github/: REST + GraphQL unified interface
  • table/: 8-table normalized schema with TableManager
  • queue/: Message serialization and queue clients

Data Storage

  • Table Storage: 8 normalized tables (JobMetadata, RepoGitHubMetadata, RepoLanguages, etc.)
  • Blob Storage: Cached README and config files (content-addressable by fingerprint)
  • Queue Storage: Async job processing (sync-jobs, cache-jobs)
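The queue-based hand-off between the API Gateway and the workers can be illustrated with a small serialization sketch. The `SyncJobMessage` class and its field names are hypothetical, not taken from `foliohive_shared`; Base64 is used because Azure Functions queue triggers commonly expect it, so adjust this if your host configuration uses a different message encoding.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SyncJobMessage:
    """Hypothetical payload for the sync-jobs queue (field names are assumptions)."""

    job_id: str
    username: str
    attempt: int = 0


def encode_message(message: SyncJobMessage) -> str:
    """Serialize the message to JSON, then wrap it in Base64 text."""
    raw = json.dumps(asdict(message), sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_message(body: str) -> SyncJobMessage:
    """Reverse encode_message: Base64 text back to a typed message."""
    payload = json.loads(base64.b64decode(body).decode("utf-8"))
    return SyncJobMessage(**payload)
```

A worker would call `decode_message` on the queue item body and fail fast on unexpected fields, since `SyncJobMessage(**payload)` raises `TypeError` for unknown keys.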

📊 Key Features

1. Automated GitHub Data Collection

  • Fetch candidate profiles via GitHub username
  • Collect repository metadata (languages, stars, topics, dates)
  • Track sync state with job progress monitoring
  • Deduplicate requests using fingerprint-based caching

2. Intelligent File Caching

  • Automatically fetch README files for project context
  • Cache language-specific config files (package.json, pyproject.toml, etc.)
  • Content-addressable storage (SHA-256 fingerprints)
  • Skip unchanged files to minimize API calls
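As a rough illustration of content-addressable caching, the sketch below keys stored content by its SHA-256 digest and skips the write when a file's fingerprint has not changed. `FingerprintCache` is a hypothetical in-memory stand-in for Blob Storage, not the project's `cache/` module. Note that it only avoids redundant writes; skipping the GitHub fetch itself would need a cheaper change signal checked before downloading, and how FolioHive does that is not shown here.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return the SHA-256 hex digest used as the content address."""
    return hashlib.sha256(content).hexdigest()


class FingerprintCache:
    """In-memory stand-in for blob storage keyed by fingerprint (an assumption)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}   # fingerprint -> content
        self._latest: dict[str, str] = {}    # file path -> last stored fingerprint

    def store_if_changed(self, path: str, content: bytes) -> bool:
        """Store content and return True, or return False if it is unchanged."""
        digest = fingerprint(content)
        if self._latest.get(path) == digest:
            return False
        # Identical content from different paths shares one stored blob.
        self._blobs.setdefault(digest, content)
        self._latest[path] = digest
        return True
```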

3. AI-Powered Summaries

  • Profile Summary: Holistic candidate overview with skills, experience, patterns
  • Repository Summary: Individual project analysis with tech stack and architecture
  • Interactive Assistant: Answer recruiter queries with candidate-specific context

4. Asynchronous Processing

  • Queue-driven architecture for scalability
  • Job state tracking: queued → syncing → metadata_ready → completed
  • Repo state tracking: pending → synced → cached
  • Automatic retry with reconciliation worker
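The job lifecycle listed above can be modeled as a small state machine. This is a minimal sketch that covers only the four documented states; any failure or retry states the real system uses are not modeled, and the `advance` and `can_transition` helpers are hypothetical names.

```python
from enum import Enum


class JobState(str, Enum):
    QUEUED = "queued"
    SYNCING = "syncing"
    METADATA_READY = "metadata_ready"
    COMPLETED = "completed"


# Each state may only advance to the next one in the documented sequence.
_NEXT = {
    JobState.QUEUED: JobState.SYNCING,
    JobState.SYNCING: JobState.METADATA_READY,
    JobState.METADATA_READY: JobState.COMPLETED,
}


def advance(current: JobState) -> JobState:
    """Return the next state, or raise if the job is already finished."""
    try:
        return _NEXT[current]
    except KeyError:
        raise ValueError(f"{current.value} is a terminal state") from None


def can_transition(current: JobState, target: JobState) -> bool:
    """True only for the single forward step allowed from the current state."""
    return _NEXT.get(current) is target
```

Rejecting out-of-order transitions in one place makes it easier for a reconciliation pass to spot jobs that are stuck in an intermediate state.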

5. Cost-Optimized AI Usage

  • Tiered model selection (gpt-5-nano, gpt-4o-mini)
  • Token budget management per summary type
  • Context chunking to fit within limits
  • Response validation and truncation detection
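Token budgeting and context chunking can be sketched as greedy packing under a per-request limit. The four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, and the function names are hypothetical; the project's `ai/` module may use a different strategy.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token. Not a real tokenizer."""
    return max(1, len(text) // 4)


def chunk_by_budget(sections: list[str], budget: int) -> list[list[str]]:
    """Greedily pack sections into chunks whose estimated tokens fit the budget."""
    chunks: list[list[str]] = []
    current: list[str] = []
    used = 0
    for section in sections:
        cost = estimate_tokens(section)
        if cost > budget:
            raise ValueError("section exceeds the budget on its own")
        # Start a new chunk when adding this section would overflow the budget.
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(section)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

Each summary type would call `chunk_by_budget` with its own budget, so a profile summary and a repository summary can be tuned independently.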

🛠️ Technology Stack

| Layer | Technology | Purpose |
|---|---|---|
| Frontend | Angular 18+ (Standalone Components) | Reactive UI with RxJS |
| Backend | Azure Functions (Flex Consumption) | Serverless API + Workers |
| Language | Python 3.12+ | Core backend logic |
| AI | OpenAI GPT (gpt-5-nano, gpt-4o-mini) | Summaries and queries |
| Storage | Azure Table, Blob, Queue Storage | Data persistence |
| Git Data | GitHub REST + GraphQL API | Repository metadata |
| IaC | Bicep | Infrastructure as Code |
| CI/CD | Azure DevOps Pipelines | Automated deployments |
| Monitoring | Application Insights | Telemetry and diagnostics |

📁 Project Structure

foliohive/
├── api/v0.4.0/
│   ├── function-app/          # Azure Functions entry point
│   │   ├── function_app.py    # Main app registration
│   │   └── blueprints/        # Worker implementations
│   ├── shared/                # foliohive_shared package
│   │   └── src/foliohive_shared/
│   │       ├── ai/            # AI integration
│   │       ├── cache/         # Caching logic
│   │       ├── github/        # GitHub API client
│   │       ├── queue/         # Queue messaging
│   │       └── table/         # Table Storage schema
│   └── tests/                 # Pytest test suite
│
├── ui/                        # Angular Static Web App
│   └── src/app/
│       ├── landing/           # Candidate search
│       ├── profile/           # Candidate summary
│       ├── projects/          # Repository list
│       ├── ai/                # AI assistant
│       └── services/          # API clients
│
├── infra/bicep/               # Azure infrastructure
│   ├── main.bicep             # Entry point
│   ├── main.bicepparam        # Parameters
│   └── modules/               # Resource modules
│
└── README.md                  # This file

🔗 Documentation


🧪 Testing

# Run all tests
cd api/v0.4.0/tests
./run_tests.sh

# Run specific test suite
pytest test_reconciliation_worker.py -v

# Run integration tests
pytest integration/ -v

# Run E2E curl tests
./e2e_curl_tests.sh

🚢 Deployment

Infrastructure Deployment

cd infra/bicep

# Deploy with default parameters
az deployment sub create \
  --name foliohive-prod \
  --location eastus \
  --template-file main.bicep \
  --parameters main.bicepparam

# Or use specific parameter file
az deployment sub create \
  --location eastus \
  --template-file main.bicep \
  --parameters @main.bicepparam

Application Deployment

Automated via Azure DevOps pipelines:

  • Functions: azure-functions-cd.yml
  • Static Web App: static-web-app-cd.yml
  • Training Worker: training-worker-cd.yml

🔐 Security

  • Managed Identity: No stored credentials for Azure service communication
  • Private Networking: VNet integration for Function Apps
  • Key Vault: Secrets management (GitHub tokens, OpenAI keys)
  • CORS: Restricted to UI origin
  • API Keys: Optional authentication layer

💰 Cost Optimization

  • Flex Consumption Plan: Pay only for execution time
  • Intelligent Caching: Minimize GitHub API calls with fingerprints
  • AI Token Management: Budget enforcement per summary type
  • Queue-Based Processing: Efficient async execution
  • Storage Lifecycle: TTL-based blob cleanup (planned)

📈 Monitoring

  • Application Insights: End-to-end telemetry
  • Custom Metrics: Job success rates, cache hit ratios, AI token usage
  • Structured Logging: Correlation IDs across workers
  • Queue Metrics: Message depth, processing time, DLQ counts
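Structured logging with correlation IDs can be sketched with the standard library alone. The `JsonFormatter` class and `new_correlation_id` helper are hypothetical names; the assumption is that the ID is generated when a job is accepted and then carried in each queue message so that every worker can set it before logging.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Holds the ID for the current execution context; "-" means none was set.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Emit each log record as one JSON object that includes the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


def new_correlation_id() -> str:
    """Generate a fresh ID and bind it to the current context."""
    value = uuid.uuid4().hex
    correlation_id.set(value)
    return value
```

Attaching `JsonFormatter` to a handler keeps log lines machine-parseable, which makes it possible to filter telemetry by a single job across the sync, cache, and reconciliation workers.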

🤝 Contributing

  1. Create feature branch: git checkout -b feature/your-feature
  2. Follow Python coding standards (PEP 8)
  3. Add tests for new functionality
  4. Update relevant documentation
  5. Submit pull request with clear description

📝 License

Proprietary - All rights reserved


🆘 Support

  • Issues: Submit via GitHub Issues
  • Documentation: Check component-specific READMEs
  • Architecture Questions: Review Architecture Decision Records (coming soon)

Built with ❤️ for technical recruiters
