Thank you for considering contributing to Transcript Create! 🎉
We're excited to have you here. Whether you're fixing a bug, adding a feature, improving documentation, or helping others, every contribution makes a difference.
This document provides guidelines and information about our development process to help you contribute effectively.
This project and everyone participating in it is governed by our Code of Conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to security@subculture.community.
- Code of Conduct
- Getting Started
- Development Setup
- Database Migrations
- Code Quality
- CI/CD Pipeline
- Pull Request Process
- Branch Protection Rules
- First-Time Contributors
- Getting Help
- Fork the repository on GitHub
- Clone your fork locally
- Create a new branch for your feature or bug fix
- Make your changes following our code quality guidelines
- Run tests and linting locally
- Push to your fork and submit a pull request
New to open source? Check out our First-Time Contributors Guide for a detailed walkthrough.
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install development tools
pip install ruff black isort mypy pytest pre-commit
# Set up pre-commit hooks
pre-commit install
# Install test dependencies
pip install -r requirements-dev.txt
# Set up test database (PostgreSQL required)
export DATABASE_URL="postgresql+psycopg://postgres:postgres@localhost:5432/postgres"
# Run migrations to set up schema
python scripts/run_migrations.py upgrade
# Run all tests
pytest tests/
# Run tests with coverage
pytest tests/ --cov=app --cov-report=html --cov-report=term
# Run specific test file
pytest tests/test_routes_jobs.py -v
# View HTML coverage report
open htmlcov/index.html # macOS
xdg-open htmlcov/index.html # Linux
Test Structure:
- tests/conftest.py - Shared fixtures (database, test client)
- tests/test_crud.py - CRUD operation tests
- tests/test_routes_*.py - API endpoint tests
- tests/test_schemas.py - Pydantic model validation
See tests/README.md for detailed testing documentation.
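Before running migrations or tests, it can help to confirm which database a given DATABASE_URL points at. The sketch below splits the URL from the test setup above into its parts using only the standard library. The helper name `describe_database_url` is hypothetical and not part of this project; the password is deliberately left out of the result.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Break a SQLAlchemy-style database URL into its components.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    parts = urlsplit(url)
    # The scheme has the form "dialect+driver", e.g. "postgresql+psycopg".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    url = "postgresql+psycopg://postgres:postgres@localhost:5432/postgres"
    for key, value in describe_database_url(url).items():
        print(f"{key}: {value}")
```

Note that the test setup uses the `postgres` database, while the stamping example later in this guide uses `transcripts`; checking the `database` field before a destructive command is a cheap safeguard.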
cd frontend
npm install
See the main README.md for detailed instructions on running the full stack with Docker Compose or individual services.
We use Alembic to manage database schema changes. Migrations provide version control for the database schema and enable safe, reproducible schema evolution across environments.
- Migrations are stored in alembic/versions/
- Each migration has an upgrade() function to apply changes and a downgrade() function to revert them
- Migrations are applied sequentially, following the revision chain (normally the order in which they were created)
- The alembic_version table records the revision the database is currently at
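To make the ordering concrete: each Alembic migration file declares its own `revision` ID and the `down_revision` it follows, and the upgrade order comes from walking that chain. The sketch below models this with plain dictionaries. The revision IDs are made up, the function names are hypothetical, and it assumes a linear history with no branches.

```python
from typing import Optional

# Hypothetical revision IDs mapped to their down_revision.
# Real IDs are generated by Alembic when a migration file is created.
MIGRATIONS: dict[str, Optional[str]] = {
    "c3d4": "b2c3",  # third migration, follows b2c3
    "a1b2": None,    # baseline: nothing precedes it
    "b2c3": "a1b2",  # second migration, follows a1b2
}


def upgrade_order(migrations: dict[str, Optional[str]]) -> list[str]:
    """Return revision IDs in the order an upgrade would apply them.

    Assumes a single linear chain (one baseline, no branches).
    """
    # Invert the mapping: parent revision -> child revision.
    child_of = {parent: rev for rev, parent in migrations.items()}
    order = []
    current = child_of.get(None)  # the baseline has down_revision None
    while current is not None:
        order.append(current)
        current = child_of.get(current)
    return order


def pending(migrations: dict[str, Optional[str]], current_revision: Optional[str]) -> list[str]:
    """Revisions still to apply, given the ID stored in alembic_version."""
    order = upgrade_order(migrations)
    if current_revision is None:  # empty database: everything is pending
        return order
    return order[order.index(current_revision) + 1:]
```

For example, `pending(MIGRATIONS, "a1b2")` yields the two revisions after the baseline, which is conceptually what an upgrade to head would apply.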
# Apply all pending migrations
python scripts/run_migrations.py upgrade
# Check current migration version
python scripts/run_migrations.py current
# View migration history
python scripts/run_migrations.py history
# Downgrade one migration (careful in production!)
python scripts/run_migrations.py downgrade
# Stamp database at a specific revision (for existing databases)
python scripts/run_migrations.py stamp head
# Apply all pending migrations
alembic upgrade head
# Upgrade to a specific revision
alembic upgrade abc123
# Downgrade to a specific revision
alembic downgrade def456
# Downgrade one revision
alembic downgrade -1
# Show current revision
alembic current
# Show migration history
alembic history --verbose
When making schema changes, you must create a migration:
# Create a new migration file
alembic revision -m "descriptive_name"
# This creates a file like: alembic/versions/20251024_1234_abc123_descriptive_name.py
The generated file contains empty upgrade() and downgrade() functions that you must implement:
import sqlalchemy as sa
from alembic import op


def upgrade() -> None:
    """Apply schema changes."""
    # Add a new nullable column so existing rows remain valid
    op.add_column('videos', sa.Column('thumbnail_url', sa.String(), nullable=True))
    # Create an index on the new column
    op.create_index('idx_videos_thumbnail', 'videos', ['thumbnail_url'])


def downgrade() -> None:
    """Revert schema changes in the reverse order of upgrade()."""
    # Drop the index first, since it depends on the column
    op.drop_index('idx_videos_thumbnail', table_name='videos')
    # Drop the column
    op.drop_column('videos', 'thumbnail_url')
- Always test both upgrade and downgrade
  # Test upgrade
  python scripts/run_migrations.py upgrade
  # Test downgrade
  python scripts/run_migrations.py downgrade
  # Re-apply
  python scripts/run_migrations.py upgrade
- Write idempotent migrations when possible
  - Use IF NOT EXISTS/IF EXISTS clauses
  - Check for existence before creating/dropping objects
  - Handle cases where migration is partially applied
- Keep migrations focused and atomic
  - One logical change per migration
  - Don't mix DDL and data migrations
  - Easier to review, test, and potentially revert
- Document complex migrations
  - Add comments explaining the purpose
  - Document any manual steps required
  - Note any data transformations
- Test with production-like data
  - Test on a copy of production data when possible
  - Consider performance impact of migrations
  - Plan for zero-downtime deployment if needed
- Never edit existing migrations
  - Once a migration is committed and deployed, never modify it
  - Create a new migration to fix issues
  - Exception: migrations not yet in main branch
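The idempotency advice above can be tried out without a PostgreSQL server. The sketch below is an illustration only: it uses the standard-library `sqlite3` module in place of PostgreSQL and Alembic, and the table and index names are hypothetical. The point is that every statement carries an IF NOT EXISTS or IF EXISTS guard, so running upgrade or downgrade twice in a row is harmless.

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    """Create the objects only if they are missing, so re-running is safe."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS video_thumbnails ("
        "video_id INTEGER PRIMARY KEY, url TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_video_thumbnails_url "
        "ON video_thumbnails(url)"
    )


def downgrade(conn: sqlite3.Connection) -> None:
    """Drop the objects only if they exist, in reverse order of creation."""
    conn.execute("DROP INDEX IF EXISTS idx_video_thumbnails_url")
    conn.execute("DROP TABLE IF EXISTS video_thumbnails")


def schema_objects(conn: sqlite3.Connection) -> list[str]:
    """List which of the two objects currently exist, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE name IN ('video_thumbnails', 'idx_video_thumbnails_url') "
        "ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling `upgrade` twice on the same connection leaves exactly one table and one index; calling `downgrade` twice leaves none. The same up, down, up cycle is worth running against a real PostgreSQL database before opening a PR.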
If you have an existing database created from sql/schema.sql, you need to "stamp" it to indicate it's at the baseline:
# Stamp the database as being at the initial migration
export DATABASE_URL="postgresql+psycopg://postgres:postgres@localhost:5432/transcripts"
python scripts/run_migrations.py stamp head
This tells Alembic that your database already has the baseline schema, so it won't try to re-apply it.
When running with Docker Compose, migrations are automatically applied on startup via the migrations service:
services:
  migrations:
    image: transcript-create:latest
    command: ["python3", "scripts/run_migrations.py", "upgrade"]
    depends_on:
      db:
        condition: service_healthy
The API and worker services wait for migrations to complete before starting.
All migrations are automatically validated in CI:
- Fresh Database Test: Applies migrations to an empty database
- Existing Schema Test: Stamps an existing schema and verifies no conflicts
- Up/Down Test: Tests upgrade and downgrade functionality
See .github/workflows/migrations-ci.yml for details.
For detailed migration examples and templates, see:
- Migration Template Guide - Comprehensive examples for all migration types
- Production Migration Runbook - Production deployment procedures
Adding a column:
def upgrade() -> None:
    op.execute("ALTER TABLE videos ADD COLUMN IF NOT EXISTS thumbnail_url TEXT")


def downgrade() -> None:
    op.execute("ALTER TABLE videos DROP COLUMN IF EXISTS thumbnail_url")
Creating an index:
def upgrade() -> None:
    op.execute("CREATE INDEX IF NOT EXISTS idx_videos_youtube_id ON videos(youtube_id)")


def downgrade() -> None:
    op.execute("DROP INDEX IF EXISTS idx_videos_youtube_id")
For more examples including:
- Adding tables
- Data migrations
- Enum modifications
- Triggers and functions
- Concurrent indexes
- Constraint additions
See alembic/MIGRATION_TEMPLATE.md.
We maintain high code quality standards through automated linting, formatting, and type checking.
Linting:
# Check code quality
ruff check app/ worker/ scripts/
# Auto-fix issues
ruff check --fix app/ worker/ scripts/
Formatting:
# Check formatting
black --check app/ worker/ scripts/
# Auto-format
black app/ worker/ scripts/
Import Sorting:
# Check imports
isort --check-only app/ worker/
# Auto-sort imports
isort app/ worker/
Type Checking:
# Run type checks
mypy app/ worker/
Configuration:
- Line length: 120 characters
- Target Python version: 3.11+
- Configuration in pyproject.toml
Linting:
cd frontend
npm run lint
Formatting:
cd frontend
# Check formatting
npm run format:check
# Auto-format
npm run format
Type Checking:
cd frontend
npx tsc --noEmit
Building:
cd frontend
npm run build
We use pre-commit hooks to catch issues before they're committed:
# Install hooks (one-time setup)
pre-commit install
# Run manually on all files
pre-commit run --all-files
The hooks automatically run:
- ruff: Fast Python linter
- black: Code formatter
- isort: Import sorting
- mypy: Type checking
- gitleaks: Secret detection
- commitlint: Commit message validation
- Additional checks for trailing whitespace, YAML/JSON/TOML syntax
We follow Conventional Commits for automated changelog generation and semantic versioning.
Format:
<type>: <subject>
[optional body]
[optional footer(s)]
Types:
- feat: New feature (triggers MINOR version bump)
- fix: Bug fix (triggers PATCH version bump)
- docs: Documentation changes
- style: Code style changes (formatting, semicolons, etc.)
- refactor: Code refactoring without changing functionality
- perf: Performance improvements
- test: Adding or updating tests
- build: Build system or dependency changes
- ci: CI/CD configuration changes
- chore: Other changes that don't modify src or test files
- revert: Reverting a previous commit
Breaking Changes:
Add BREAKING CHANGE: in the footer or append ! after the type (triggers MAJOR version bump):
feat!: remove support for Python 3.10
BREAKING CHANGE: Minimum Python version is now 3.11
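The mapping from commit message to version bump described above can be sketched in a few lines. This is an illustration of the rules only, not the project's tooling (commitlint does the real validation). The function name `version_bump` is hypothetical, and the sketch handles only the `<type>: <subject>` form shown here, without optional scopes.

```python
import re
from typing import Optional

# Commit types listed in this guide.
TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert",
)
# Only feat and fix trigger a release on their own.
BUMP_BY_TYPE = {"feat": "minor", "fix": "patch"}

# Header form: <type>[!]: <subject>
HEADER_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")(?P<breaking>!)?: (?P<subject>\S.*)$"
)


def version_bump(message: str) -> Optional[str]:
    """Return 'major', 'minor', 'patch', or None for a commit message.

    Raises ValueError if the first line does not match the expected format.
    """
    header, _, rest = message.partition("\n")
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not a conventional commit header: {header!r}")
    # A '!' after the type or a BREAKING CHANGE footer forces a major bump.
    has_footer = re.search(r"^BREAKING CHANGE: ", rest, flags=re.MULTILINE)
    if match["breaking"] or has_footer:
        return "major"
    return BUMP_BY_TYPE.get(match["type"])
```

Types such as `docs` or `ci` return `None`, meaning the commit appears in history but does not trigger a version bump by itself.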
Examples:
feat: add speaker diarization support
fix: resolve memory leak in audio processing
docs: update API authentication guide
perf: optimize database query for search
test: add integration tests for job creation
ci: add release workflow for automated versioning
Using Commitizen (Interactive):
If you prefer an interactive prompt:
# Install commitizen (one-time)
npm install
# Use interactive commit
npm run commit
The pre-commit hook will validate your commit message format automatically.
All pull requests and pushes to main automatically trigger our CI/CD pipeline.
Runs on changes to:
- app/**
- worker/**
- *.py files
- requirements.txt
- pyproject.toml
Jobs:
- Lint & Format Check (Python 3.11, 3.12)
  - ruff check
  - black check
  - isort check
  - mypy type check (informational)
- Security Scan
  - pip-audit (dependency vulnerabilities)
  - bandit (code security issues)
- Test with PostgreSQL
  - Apply database schema
  - Run pytest suite with coverage
  - Generate coverage reports (XML, HTML, terminal)
  - Check 70%+ coverage threshold
  - Upload coverage artifacts
  - Add GitHub Actions summary with coverage stats
- Docker Build
  - Build Docker image (CPU-compatible check)
  - Verify image builds successfully
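To check the 70% coverage threshold locally before pushing, one option is to read the overall line rate from the XML report. The sketch below assumes the report was produced with `--cov-report=xml` in the Cobertura-style format that coverage.py writes, where the root `<coverage>` element carries a `line-rate` attribute between 0 and 1. How CI enforces the threshold is not stated in this guide; pytest-cov's `--cov-fail-under=70` option is another way to get the same effect.

```python
import xml.etree.ElementTree as ET

THRESHOLD_PERCENT = 70.0  # threshold named in the CI job list above


def line_coverage_percent(xml_text: str) -> float:
    """Extract overall line coverage (0-100) from a coverage XML report."""
    root = ET.fromstring(xml_text)
    # The root <coverage> element stores line-rate as a fraction, e.g. "0.8421".
    return float(root.attrib["line-rate"]) * 100


def meets_threshold(xml_text: str, threshold: float = THRESHOLD_PERCENT) -> bool:
    """True if overall line coverage is at or above the threshold."""
    return line_coverage_percent(xml_text) >= threshold


if __name__ == "__main__":
    # Assumes coverage.xml exists in the current directory.
    with open("coverage.xml", encoding="utf-8") as handle:
        report = handle.read()
    percent = line_coverage_percent(report)
    status = "OK" if meets_threshold(report) else "BELOW THRESHOLD"
    print(f"line coverage: {percent:.1f}% ({status})")
```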
Runs on changes to:
frontend/**
Jobs:
- Lint & Type Check (Node 20, 22)
  - ESLint (informational - some errors pre-existing)
  - Prettier formatting (enforced)
  - TypeScript type check
- Build Verification
  - Vite build
  - Bundle size check (warns if > 500KB)
  - Upload build artifacts
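A rough local equivalent of the bundle size check is sketched below. Several details are assumptions, since the guide does not specify them: that the limit applies per JavaScript file rather than to the whole build, that 500KB means 500 × 1024 bytes, and that the build output lives in `frontend/dist` (Vite's default output directory is `dist`). The function names are hypothetical.

```python
from pathlib import Path

LIMIT_BYTES = 500 * 1024  # assumption: "500KB" interpreted as 500 KiB


def scan_bundles(dist_dir: str) -> dict[str, int]:
    """Map each .js file under dist_dir (relative path) to its size in bytes."""
    root = Path(dist_dir)
    return {
        path.relative_to(root).as_posix(): path.stat().st_size
        for path in root.rglob("*.js")
    }


def oversized(sizes: dict[str, int], limit: int = LIMIT_BYTES) -> list[str]:
    """Return the names of files larger than the limit, sorted by name."""
    return sorted(name for name, size in sizes.items() if size > limit)


if __name__ == "__main__":
    # Assumes `npm run build` has already been run in frontend/.
    for name in oversized(scan_bundles("frontend/dist")):
        print(f"warning: {name} exceeds {LIMIT_BYTES} bytes")
```

Like the CI check, this only warns; it does not fail the build.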
Runs on:
- Push to main
- Tags matching v*
- Manual workflow dispatch
Features:
- Builds Docker image with ROCm support
- Publishes to GitHub Container Registry (ghcr.io)
- Multiple tagging strategies (latest, semver, sha)
- Layer caching for fast rebuilds
- SBOM and provenance attestations
- Build time verification (target < 5 min with cache)
Runs on:
- Push/PR to main/develop (when dependency files change)
- Weekly schedule (Mondays at 9 AM UTC)
- Manual workflow dispatch
Checks:
- Dependency vulnerabilities (pip-audit, safety)
- Secret scanning (gitleaks)
- Create a branch from main with a descriptive name
  - Feature: feature/description
  - Bug fix: fix/description
  - Enhancement: enhance/description
- Make your changes
  - Write clear, concise commit messages
  - Follow code quality guidelines
  - Add tests if applicable
- Test locally
  # Backend
  pytest tests/
  ruff check app/ worker/
  black --check app/ worker/
  # Frontend
  cd frontend
  npm run lint
  npm run format:check
  npm run build
- Run pre-commit hooks
  pre-commit run --all-files
- Push and create PR
  - All CI checks must pass (see status badges on PR)
  - Provide clear description of changes
  - Link related issues
- Code Review
  - Address reviewer feedback
  - Ensure all CI checks remain green
- Merge
  - Once approved and all checks pass, maintainers will merge
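The branch naming convention from the first step can be checked with a small script, for example before pushing. This is a sketch, not something the project enforces: the function name `branch_kind` is hypothetical, and the allowed characters in the description part (lowercase letters, digits, dots, underscores, hyphens) are an assumption, since the guide only gives the `feature/`, `fix/`, and `enhance/` prefixes.

```python
import re
from typing import Optional

# Prefixes come from this guide; the character set for the description
# part is an assumption made for this sketch.
BRANCH_RE = re.compile(r"^(?P<kind>feature|fix|enhance)/[a-z0-9][a-z0-9._-]*$")


def branch_kind(name: str) -> Optional[str]:
    """Return 'feature', 'fix', or 'enhance' for a conforming branch name.

    Returns None when the name does not follow the convention.
    """
    match = BRANCH_RE.match(name)
    return match["kind"] if match else None
```

For instance, `branch_kind("fix/audio-memory-leak")` returns `"fix"`, while `branch_kind("main")` returns `None`.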
The main branch is protected with the following requirements:
Before merging to main, the following checks must pass:
Backend CI:
- ✅ Lint & Format Check (Python 3.11)
- ✅ Lint & Format Check (Python 3.12)
- ✅ Security Scan
- ✅ Test with PostgreSQL
- ✅ Docker Build
Frontend CI:
- ✅ Lint & Type Check (Node 20)
- ✅ Lint & Type Check (Node 22)
- ✅ Build Verification
Note: Some checks use continue-on-error: true for informational warnings (mypy, some ESLint rules, security scans). These don't block merges but should be addressed when possible.
- Require branches to be up to date: PRs must be rebased on latest main
- Require pull request reviews: At least one approving review from maintainers
- Dismiss stale reviews: New commits dismiss previous approvals
- No force pushes: Protect commit history
- Linear history: Prefer squash or rebase merges
Our CI/CD is designed for fast feedback:
- Target: Most checks complete in < 5 minutes
- Docker builds: < 5 min with layer caching, < 15 min cold
- Full test suite: < 3 minutes
If checks take significantly longer, please report as an issue.
👋 New to the project? Welcome! We're here to help.
Start here:
- Read our First-Time Contributors Guide for a step-by-step walkthrough
- Look for issues labeled good first issue - these are beginner-friendly
- Check out our Development Setup guide
- Don't hesitate to ask questions!
Tips for success:
- Start small - documentation fixes and small bug fixes are great first contributions
- Ask questions early and often - we're happy to help
- Read existing code and pull requests to understand our style
- Join discussions in issues to learn more about the project
If you have questions or need help:
- Documentation: Check the README.md and docs/ folder
- Existing Issues: Search existing issues for similar questions
- Ask a Question: Open a new issue with the question template
- Development Questions: Check docs/development/ for architecture and code guidelines
We strive to respond to all questions within 48 hours. Don't be shy - there are no stupid questions!
Please review docs/security.md for:
- Reporting security vulnerabilities
- Secrets management guidelines
- Production security checklist
- Dependency update procedures
All contributors are recognized in our docs/contributors.md file. Your contributions, big or small, are valuable to us!
By contributing to Transcript Create, you agree that your contributions will be licensed under the Apache License 2.0. See LICENSE for details.
Thank you for contributing! 🚀 Your help makes Transcript Create better for everyone.