Thank you for your interest in contributing to IntelliTag! This document provides guidelines and instructions for contributing.
- Code of Conduct
- Getting Started
- Development Setup
- Making Changes
- Code Standards
- Testing
- Submitting Changes
- Issue Guidelines
This project follows the Contributor Covenant Code of Conduct. By participating, you are expected to uphold this code.
- Fork the repository on GitHub
- Clone your fork locally:
git clone https://github.com/YOUR_USERNAME/Classifier_Questions_StackOverflow.git
cd Classifier_Questions_StackOverflow
- Add the upstream remote:
git remote add upstream https://github.com/ThomasMeb/Classifier_Questions_StackOverflow.git
- Python 3.9 or higher
- pip
- Git
- Create a virtual environment:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
make install-dev
# Or manually:
pip install -e ".[dev]"
pip install -r requirements-dev.txt
- Verify installation:
make test
Use descriptive branch names:
- feature/add-new-extractor - New features
- fix/api-response-format - Bug fixes
- docs/update-readme - Documentation
- refactor/simplify-classifier - Code refactoring
- Create a new branch from main:
git checkout main
git pull upstream main
git checkout -b feature/your-feature-name
- Make your changes following our code standards
- Test your changes:
make test
make lint
- Commit your changes using conventional commits
- Push to your fork:
git push origin feature/your-feature-name
- Open a Pull Request on GitHub
We follow PEP 8 with these tools:
- Black for code formatting (line length: 88)
- isort for import sorting
- Flake8 for linting
- mypy for type checking
Run all checks:
make lint
Auto-format code:
make format
Use Google-style docstrings:
def predict_tags(text: str, top_k: int = 5) -> list[tuple[str, float]]:
    """Predict tags for the given text.

    Args:
        text: The input text (title + body).
        top_k: Maximum number of tags to return.

    Returns:
        List of (tag, confidence) tuples sorted by confidence.

    Raises:
        ValueError: If text is empty or top_k < 1.

    Example:
        >>> predict_tags("How to parse JSON in Python?", top_k=3)
        [('python', 0.95), ('json', 0.87), ('parsing', 0.45)]
    """
All functions must have type hints:
from typing import Optional

def process_text(
    text: str,
    lowercase: bool = True,
    remove_code: Optional[bool] = None
) -> str:
    ...
src/intellitag/
├── config/ # Configuration and settings
├── data/ # Data loading and preprocessing
├── features/ # Feature extraction
├── models/ # ML models
├── api/ # REST API
└── utils/ # Utilities
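As a sketch of how the docstring and type-hint conventions above fit together in a complete function, the example below ranks tags by confidence. The function name rank_tags and its scores argument are hypothetical and are not part of the IntelliTag codebase; they only illustrate the style.

```python
def rank_tags(scores: dict[str, float], top_k: int = 5) -> list[tuple[str, float]]:
    """Return the highest-scoring tags.

    Hypothetical helper used only to illustrate the code standards.

    Args:
        scores: Mapping of tag name to confidence.
        top_k: Maximum number of tags to return.

    Returns:
        List of (tag, confidence) tuples sorted by descending confidence.

    Raises:
        ValueError: If scores is empty or top_k < 1.
    """
    if not scores:
        raise ValueError("scores cannot be empty")
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    # Sort by confidence, highest first, then keep the first top_k entries.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

The Raises section mirrors the checks at the top of the function body, which keeps the documented contract and the implementation easy to compare during review.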
# All tests
make test
# With coverage
make test-cov
# Specific test file
pytest tests/unit/test_preprocessor.py
# Specific test
pytest tests/unit/test_preprocessor.py::test_clean_html -v
- Place unit tests in tests/unit/
- Place integration tests in tests/integration/
- Use descriptive test names: test_<what>_<condition>_<expected>
Example:
import pytest

# TagClassifier must also be imported from the module where the project defines it.
def test_predict_tags_with_empty_text_raises_value_error():
    """Test that empty text raises ValueError."""
    classifier = TagClassifier()
    with pytest.raises(ValueError, match="Text cannot be empty"):
        classifier.predict("")
Aim for more than 80% code coverage. Check coverage:
make test-cov
# Open htmlcov/index.html in browser
Follow Conventional Commits:
<type>(<scope>): <description>
[optional body]
[optional footer]
Types:
- feat: New feature
- fix: Bug fix
- docs: Documentation only
- style: Code style (formatting, no code change)
- refactor: Code refactoring
- test: Adding or updating tests
- chore: Maintenance tasks
Examples:
feat(api): add batch prediction endpoint
fix(preprocessor): handle Unicode characters in code blocks
docs(readme): update installation instructions
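The header format above can be checked mechanically before committing. The following is a minimal sketch, not a tool shipped with this project: the name is_conventional_header is hypothetical, the scope is treated as optional (as the Conventional Commits specification allows), and only the types listed in this guide are accepted.

```python
import re

# Commit types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")

# <type>(<scope>): <description>
# Assumption: the scope is optional and the description must be non-empty.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(\((?P<scope>[^()\s]+)\))?"
    r": (?P<description>\S.*)$"
)


def is_conventional_header(header: str) -> bool:
    """Return True if the first line of a commit message matches the format."""
    return HEADER_PATTERN.match(header) is not None
```

For instance, is_conventional_header("feat(api): add batch prediction endpoint") passes, while a header such as "update readme" is rejected because it has no type prefix. The check is deliberately narrow: it ignores the optional body and footer.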
- Title: Use conventional commit format
- Description: Include:
  - What changes were made
  - Why the changes were necessary
  - How to test the changes
- Link issues: Reference related issues with Fixes #123
- Keep PRs focused: One feature/fix per PR
- Update documentation: If behavior changes
- All PRs require at least one approval
- CI checks must pass (lint, tests, build)
- Resolve all review comments
- Squash commits if requested
Include:
- Python version
- OS and version
- Steps to reproduce
- Expected vs actual behavior
- Error messages/stack traces
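The first two items in the list above (Python version, OS and version) can be collected with the standard library. This is an optional sketch; the function name environment_summary is hypothetical and not part of the project.

```python
import platform


def environment_summary() -> str:
    """Return the Python and OS details to paste into a bug report."""
    lines = [
        f"Python version: {platform.python_version()}",
        f"OS: {platform.system()} {platform.release()}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_summary())
```

Run the script in the same virtual environment where the bug occurs so that the reported Python version matches the one in use.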
Include:
- Use case description
- Proposed solution (if any)
- Alternatives considered
- bug - Something isn't working
- enhancement - New feature request
- documentation - Documentation improvements
- good first issue - Good for newcomers
- help wanted - Extra attention needed
- Open an issue for questions
- Check existing issues and discussions first
Thank you for contributing!