Thank you for your interest in contributing! This guide will help you get started.
Fork the repository on GitHub, then:
git clone https://github.com/YOUR_USERNAME/LinkedinDataScraper.git
cd LinkedinDataScraper
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
playwright install chromium
git checkout -b feature/your-feature-name

linkedin_scraper/
├── auth/ # Authentication (cookies, login)
├── scraper/ # Browser, search, profile extraction, API interception
├── export/ # CSV + Excel export with formatting
└── utils/ # Rate limiting, helpers
- API Interception First: Always prefer Voyager API data over DOM parsing
- DOM Fallback: Use CSS selectors only when the API doesn't capture the data
- Rate Limiting: All LinkedIn interactions must go through `AdaptiveRateLimiter`
- Anti-Detection: Never bypass stealth settings or rate limits
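The principles above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: `merge_profile_fields`, `SketchRateLimiter`, and the field names are hypothetical, and the real `AdaptiveRateLimiter` may behave differently. It shows API-sourced values taking precedence over DOM-parsed ones, and a delay that grows after a throttled response.

```python
import random
import time
from typing import Callable, Dict, Optional


def merge_profile_fields(
    api_data: Dict[str, Optional[str]],
    dom_data: Dict[str, Optional[str]],
) -> Dict[str, Optional[str]]:
    """Prefer API values; use DOM values only where the API has none."""
    merged: Dict[str, Optional[str]] = {}
    for key in api_data.keys() | dom_data.keys():
        # A non-empty API value wins; otherwise fall back to the DOM value.
        merged[key] = api_data.get(key) or dom_data.get(key)
    return merged


class SketchRateLimiter:
    """Hypothetical stand-in for the project's AdaptiveRateLimiter."""

    def __init__(
        self,
        base_delay: float = 2.0,
        max_delay: float = 60.0,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.current_delay = base_delay
        self._sleep = sleep  # injectable so the class can be tested without waiting

    def wait(self) -> float:
        """Sleep for the current delay plus up to 25% random jitter."""
        delay = self.current_delay * (1 + random.uniform(0, 0.25))
        self._sleep(delay)
        return delay

    def record(self, throttled: bool) -> None:
        """Double the delay after a throttled response, else ease back toward base."""
        if throttled:
            self.current_delay = min(self.current_delay * 2, self.max_delay)
        else:
            self.current_delay = max(self.current_delay * 0.9, self.base_delay)
```

In this sketch every request would call `wait()` first and `record()` afterwards, so no code path can reach LinkedIn without passing through the limiter.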
# CLI mode
python -m linkedin_scraper "keywords" --location "City" -n 5
# Web UI
streamlit run app.py
# Quick test with mock data
python -c "
from linkedin_scraper.models import LinkedInProfile
from linkedin_scraper.export.exporter import export_profiles
profiles = [LinkedInProfile(full_name='Test User', profile_url='https://linkedin.com/in/test')]
export_profiles(profiles, output_dir='output', fmt='both', keywords='test')
print('Export OK')
"

- Follow existing patterns in the codebase
- Use type hints for function signatures
- Use the `logging` module (not `print()`) for debug output
- Use `rich` for user-facing terminal output
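As a rough illustration of the style points above, here is a small helper with a typed signature that reports debug detail through `logging` instead of `print()`. The function `normalize_keywords` is hypothetical and not part of the codebase; `rich` is left out so the sketch runs with the standard library alone.

```python
import logging
from typing import List

# Module-level logger, named after the module as is conventional.
logger = logging.getLogger(__name__)


def normalize_keywords(raw: str) -> List[str]:
    """Split a comma-separated keyword string into clean, lowercase terms."""
    terms = [part.strip().lower() for part in raw.split(",") if part.strip()]
    # Debug detail goes through logging, so users can silence or redirect it.
    logger.debug("Parsed %d keyword(s) from %r", len(terms), raw)
    return terms
```

Using `%`-style arguments in the `logger.debug` call defers string formatting until the message is actually emitted.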
Contributions we welcome:
- Bug fixes with a clear description of the issue
- New data extraction fields
- Improved anti-detection measures
- Better export formatting
- Documentation improvements
- Cross-platform compatibility fixes
- Performance optimizations
Contributions we are unlikely to accept:
- Changes that increase detection risk (faster scraping, removed delays)
- Dependencies on paid services or APIs
- Features that violate LinkedIn's ToS beyond educational use
- Large refactors without prior discussion
- Test your changes: make sure the CLI and export work correctly
- Update documentation if you've changed CLI options or behavior
- Keep PRs focused: one feature or fix per PR
- Write a clear description: explain what changed and why
Example commit messages:
feat: add company size extraction
fix: handle rate limit 999 response
docs: update CLI usage examples
refactor: simplify profile extraction logic
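The commit messages above share a `<type>: <summary>` shape. A minimal sketch of a check for that shape is shown below. It assumes only the four types that appear in the examples (the project may accept others), and `is_valid_commit_message` is a hypothetical helper, not something the repository is known to ship.

```python
import re

# Types taken from the examples above; the project may accept others.
COMMIT_TYPES = ("feat", "fix", "docs", "refactor")
COMMIT_PATTERN = re.compile(r"^(%s): \S.*$" % "|".join(COMMIT_TYPES))


def is_valid_commit_message(message: str) -> bool:
    """Return True if the first line looks like '<type>: <summary>'."""
    lines = message.splitlines()
    return bool(lines) and COMMIT_PATTERN.match(lines[0]) is not None
```

Only the first line is checked, so a longer body after the summary does not affect the result.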
Use GitHub Issues with the appropriate template:
- Bug Report: Something broken? Include steps to reproduce
- Feature Request: Want something new? Describe the use case
Open a Discussion or create an issue tagged with question.
Thank you for helping improve LinkedIn Data Scraper!