Contributing to PipeFrame

First off, thank you for considering contributing to PipeFrame! It's people like you that make PipeFrame such a great tool for the data science community.

🌟 Ways to Contribute

🐛 Report Bugs

Found a bug? Please open an issue with:

Clear, descriptive title
Steps to reproduce
Expected vs actual behavior
Code samples
Environment details (Python version, OS, pipeframe version)

💡 Suggest Features

Have an idea? We'd love to hear it! Include:

Use case description
Proposed API (how it would work)
Example code showing the feature
Why it would be useful

📝 Improve Documentation

Help others learn PipeFrame:

Fix typos or clarify explanations
Add examples
Create tutorials
Improve docstrings

🔧 Submit Code

Ready to code? Awesome! See development setup below.

🚀 Development Setup

1. Fork and Clone

# Fork the repository on GitHub, then:
git clone https://github.com/YOUR_USERNAME/pipeframe.git
cd pipeframe

2. Create Virtual Environment

# Using venv
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Using conda
conda create -n pipeframe python=3.10
conda activate pipeframe

3. Install Development Dependencies

# Install package in editable mode with dev dependencies
pip install -e ".[dev,test]"

# Or install from requirements
pip install -r requirements-dev.txt

4. Set Up Pre-commit Hooks

pre-commit install

🔨 Development Workflow

1. Create a Branch

git checkout -b feature/your-feature-name
# or
git checkout -b fix/issue-number-description

2. Make Changes

Follow these guidelines:

Write clear, readable code
Add docstrings (Google style)
Include type hints
Add tests for new features
Update documentation

3. Run Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=pipeframe --cov-report=html

# Run specific test
pytest tests/test_dataframe.py::test_filter

4. Format Code

# Format with black
black pipeframe/

# Sort imports
isort pipeframe/

# Lint
flake8 pipeframe/

# Type check
mypy pipeframe/

5. Commit Changes

git add .
git commit -m "feat: add amazing new feature"

Commit Message Format:

feat: New feature
fix: Bug fix
docs: Documentation changes
test: Test additions/changes
refactor: Code refactoring
perf: Performance improvements
chore: Maintenance tasks

6. Push and Create PR

git push origin feature/your-feature-name

Then create a Pull Request on GitHub with:

Clear description of changes
Link to related issues
Screenshots/examples if applicable

📋 Code Style Guide

Python Style

Follow PEP 8
Use Black for formatting (line length: 100)
Use isort for import sorting
Type hints required for public APIs

Docstrings

Use Google style:

def awesome_function(param1: str, param2: int = 0) -> DataFrame:
    """
    Brief description of what this does.
    
    More detailed explanation if needed. Can span multiple
    lines and include usage notes.
    
    Args:
        param1: Description of param1
        param2: Description of param2. Defaults to 0.
    
    Returns:
        Description of return value
    
    Raises:
        ValueError: When param1 is empty
    
    Examples:
        >>> result = awesome_function("hello", 42)
        >>> print(result)
    """
    pass

Type Hints

from typing import Any, List, Optional, Union
from pipeframe.core.dataframe import DataFrame

def process_data(
    df: Union[DataFrame, pd.DataFrame],
    columns: Optional[List[str]] = None,
    **kwargs: Any
) -> DataFrame:
    ...

🧪 Testing Guidelines

Writing Tests

import pytest
from pipeframe import DataFrame, filter, define

class TestDataFrame:
    def test_filter_basic(self):
        """Test basic filtering functionality."""
        df = DataFrame({'x': [1, 2, 3, 4]})
        result = df >> filter('x > 2')
        assert len(result) == 2
    
    def test_filter_empty_result(self):
        """Test filtering that returns no rows."""
        df = DataFrame({'x': [1, 2, 3]})
        result = df >> filter('x > 10')
        assert len(result) == 0
    
    def test_filter_invalid_column(self):
        """Test error handling for invalid column."""
        df = DataFrame({'x': [1, 2, 3]})
        with pytest.raises(PipeFrameColumnError):
            df >> filter('y > 2')

Test Organization

One test file per module
Test classes for related tests
Clear test names describing what's tested
Test both success and error cases
Test edge cases

📚 Documentation Guidelines

README Updates

Keep examples simple and focused
Ensure all code examples actually work
Update table of contents if adding sections

API Documentation

Every public function/class needs docstring
Include parameters, returns, raises
Add usage examples
Note any security considerations

Tutorial Notebooks

Start simple, build complexity
Explain the "why" not just "how"
Include real-world examples
Test all code cells

🔍 Code Review Process

What We Look For

✅ Tests pass
✅ Code is formatted (black, isort)
✅ Type hints present
✅ Docstrings complete
✅ No breaking changes (or clearly documented)
✅ Performance impact considered
✅ Security implications reviewed

Review Timeline

Initial response: Within 2 days
Full review: Within 1 week
Revisions: As needed

🎯 Priority Areas

We especially welcome contributions in:

Performance Optimization
- Profiling and benchmarking
- Vectorization improvements
- Memory efficiency
Additional Verbs
- Join operations
- Window functions
- Time series helpers
Backend Support
- Polars integration
- DuckDB support
- Arrow format
Documentation
- More examples
- Video tutorials
- Translation
Testing
- Edge case coverage
- Performance tests
- Integration tests

💬 Communication

Getting Help

GitHub Discussions: Ask questions, share ideas
Issues: Bug reports, feature requests
Email: yasser.mustafan@gmail.com

Proposing Major Changes

For substantial changes:

Open an issue first
Discuss the approach
Get feedback before coding
Then submit PR

📜 License

By contributing, you agree that your contributions will be licensed under the MIT License.

🙏 Recognition

Contributors are recognized in:

README.md contributors section
Release notes
Annual contributor highlight

❓ Questions?

Don't hesitate to ask! We're here to help:

Open a discussion on GitHub
Email: yasser.mustafan@gmail.com
Tag @Yasser03 in issues

Thank you for making PipeFrame better! 🎉

FilesExpand file tree

CONTRIBUTING.md

Latest commit

History