Merged
47 changes: 47 additions & 0 deletions .github/workflows/build.yml
@@ -0,0 +1,47 @@
name: Build

on:
  push:
    branches: [ main, master, develop ]
  pull_request:
    branches: [ main, master, develop ]
  workflow_dispatch:

jobs:
  build:
    name: Build on Python ${{ matrix.python-version }}
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.8', '3.9', '3.10', '3.11', '3.12']

    steps:
    - uses: actions/checkout@v4

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}

    - name: Bootstrap development environment
      run: ./script/bootstrap

    - name: Build project
      run: ./script/build

    - name: Upload build artifacts
      if: always()
      uses: actions/upload-artifact@v4
      with:
        name: dist-${{ matrix.python-version }}
        path: dist/
        if-no-files-found: warn

    - name: Upload test results
      if: always()
      uses: actions/upload-artifact@v4
      with:
        name: test-results-build-${{ matrix.python-version }}
        path: test-results/
        if-no-files-found: ignore
166 changes: 166 additions & 0 deletions MIGRATION.md
@@ -0,0 +1,166 @@
# Migration to pyproject.toml and API Improvements

This document describes the changes made to migrate the project to modern Python packaging standards and improve the API.

## Changes Made

### 1. Migration to pyproject.toml

The project has been migrated from `setup.py` to `pyproject.toml`, following PEP 517/518 (build system declaration) and PEP 621 (project metadata in the `[project]` table).

- **New file**: `pyproject.toml` - Contains all project metadata, dependencies, and build configuration
- **Status of setup.py**: The old `setup.py` file is still present for compatibility but is no longer the primary packaging configuration

### 2. Code Restructuring

The implementation code has been moved from `setlr/__init__.py` to `setlr/core.py` following best practices:

- **setlr/core.py**: Contains all implementation code (916+ lines)
- **setlr/__init__.py**: Now serves as a clean public API interface (~90 lines)

This separation provides:
- Better code organization
- Clearer public API surface
- Easier maintenance
- Improved IDE support and code navigation
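
The split between an implementation module and a thin public `__init__.py` can be sketched with a throwaway package. This is only an illustration of the pattern, not the actual contents of `setlr`: the package name `demo_pkg` and the functions `run` and `_helper` are hypothetical.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path


def make_demo_package(root: Path, name: str = "demo_pkg") -> None:
    """Write a tiny package whose __init__.py only re-exports from core.py."""
    pkg = root / name
    pkg.mkdir()
    # Implementation lives in core.py (hypothetical functions).
    (pkg / "core.py").write_text(textwrap.dedent('''
        def run(value):
            return value * 2

        def _helper():
            return "internal"
    '''))
    # __init__.py is the public surface: import and list what is exported.
    (pkg / "__init__.py").write_text(textwrap.dedent('''
        from .core import run

        __all__ = ["run"]
    '''))


with tempfile.TemporaryDirectory() as tmp:
    make_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo = importlib.import_module("demo_pkg")
        result = demo.run(21)          # public name, reachable from the package
        exported = list(demo.__all__)  # only the names __init__.py chose to export
    finally:
        sys.path.remove(tmp)
```

Callers import from the package itself, so the implementation module can be reorganized later without changing their import statements.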

### 3. New Public API: `run_setl()`

A new, well-documented public function `run_setl()` has been introduced:

```python
from rdflib import ConjunctiveGraph
from setlr import run_setl

# Load a SETL script
setl_graph = ConjunctiveGraph()
setl_graph.parse("my_script.setl.ttl", format="turtle")

# Execute the script
resources = run_setl(setl_graph)

# Access generated resources
output_graph = resources['http://example.com/output']
```

**Features:**
- Comprehensive docstring with examples
- Parameter and return types documented in the docstring
- Clear description of parameters and return values
- Usage examples

### 4. Backward Compatibility

The old `_setl()` function is still available for backward compatibility:

```python
from setlr import _setl # Still works, but deprecated

# Old code continues to work
resources = _setl(setl_graph)
```

**Deprecation Warning:**
- Using `_setl()` will emit a `DeprecationWarning`
- The warning suggests using `run_setl()` instead
- No breaking changes - existing code continues to work

### 5. Exported API

The following are now officially exported from the `setlr` package:

**Main Functions:**
- `run_setl()` - Primary API function (recommended)
- `_setl()` - Deprecated, use `run_setl()` instead
- `main()` - CLI entry point

**Utility Functions:**
- `read_csv()`, `read_excel()`, `read_json()`, `read_xml()`, `read_graph()`
- `extract()`, `json_transform()`, `transform()`, `load()`
- `isempty()`, `hash()`, `camelcase()`, `get_content()`

**Namespaces:**
- `csvw`, `ov`, `setl`, `prov`, `pv`, `sp`, `sd`, `dc`, `void`, `shacl`, `api_vocab`

## Migration Guide for Users

### If you were using `_setl()`:

**Before:**
```python
from setlr import _setl

resources = _setl(setl_graph)
```

**After (recommended):**
```python
from setlr import run_setl

resources = run_setl(setl_graph)
```

**Note:** Your old code will continue to work, but you'll see a deprecation warning. Update at your convenience.

### If you were importing internal functions:

**Before:**
```python
from setlr import read_csv, extract
```

**After:**
```python
from setlr import read_csv, extract # Still works!
```

No changes needed - all utility functions are properly exported.

## For Package Maintainers

### Building the Package

With pyproject.toml, you can now build the package using modern tools:

```bash
# Install build tool
pip install build

# Build the package
python -m build
```

This creates both wheel and source distributions in the `dist/` directory.
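
A quick check that a build produced both artifact types could look like the sketch below. The helper names `find_distributions` and `is_complete` are hypothetical, and the check only looks at file extensions in the given directory.

```python
from pathlib import Path


def find_distributions(dist_dir="dist"):
    """List wheel and source-distribution file names found in dist_dir."""
    dist = Path(dist_dir)
    return {
        "wheels": sorted(p.name for p in dist.glob("*.whl")),
        "sdists": sorted(p.name for p in dist.glob("*.tar.gz")),
    }


def is_complete(found):
    """True when at least one wheel and one sdist are present."""
    return bool(found["wheels"]) and bool(found["sdists"])
```

Calling `is_complete(find_distributions())` from the project root after `python -m build` gives a simple pass/fail signal before uploading.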

### Installing from Source

```bash
# Development installation
pip install -e .

# Regular installation
pip install .
```

### Running Tests

```bash
# Install test dependencies
pip install nose2 coverage

# Run tests
nose2 --verbose
```

## Benefits of This Migration

1. **Modern Standards**: Uses PEP 517/518 for the build system and PEP 621 for project metadata
2. **Better Documentation**: Clear, comprehensive API documentation
3. **Improved Structure**: Cleaner separation between public API and implementation
4. **Backward Compatible**: No breaking changes for existing users
5. **Future-Proof**: Follows current Python best practices
6. **Better IDE Support**: Clearer module structure aids code completion and navigation

## Questions or Issues?

If you encounter any issues with the migration or have questions about the new API, please open an issue on GitHub.
53 changes: 53 additions & 0 deletions pyproject.toml
@@ -0,0 +1,53 @@
[build-system]
requires = ["setuptools>=68.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "setlr"
version = "1.0.1"
description = "setlr is a tool for Semantic Extraction, Transformation, and Loading."
readme = "README.md"
license = {text = "Apache License 2.0"}
authors = [
    {name = "Jamie McCusker", email = "mccusj@cs.rpi.edu"}
]
keywords = ["rdf", "semantic", "etl"]
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Topic :: Utilities",
    "License :: OSI Approved :: Apache Software License",
]
requires-python = ">=3.8"
dependencies = [
    "future",
    "cython",
    "numpy",
    "rdflib>=6.0.0",
    "pandas>=0.23.0",
    "requests",
    "toposort",
    "beautifulsoup4",
    "jinja2",
    "lxml",
    "six",
    "xlrd",
    "ijson",
    "click",
    "tqdm",
    "requests-testadapter",
    "python-slugify",
    "pyshacl[js]",
]

[project.urls]
Homepage = "http://packages.python.org/setlr"

[project.scripts]
setlr = "setlr:main"

[tool.setuptools]
packages = ["setlr"]
include-package-data = true

[tool.setuptools.package-data]
setlr = ["**/*"]
101 changes: 101 additions & 0 deletions script/README.md
@@ -0,0 +1,101 @@
# Development Scripts

This directory contains scripts for setting up, building, and releasing the setlr project.

## Scripts

### `bootstrap`

Set up a virtual environment suitable for developing and using the project, including all package requirements for build and release.

**Usage:**
```bash
./script/bootstrap
```

This script will:
- Create a Python virtual environment in `venv/`
- Install the project in editable mode with all dependencies
- Install development dependencies (nose2, coverage, flake8, pylint, etc.)
- Install build and release tools (build, wheel, twine)

**After running bootstrap:**
```bash
source venv/bin/activate # Activate the virtual environment
```

### `build`

Build the project packages and run all tests and checks.

**Usage:**
```bash
./script/build
```

This script will:
- Activate the virtual environment (if it exists)
- Clean previous build artifacts
- Run linting checks with flake8
- Run all tests with nose2
- Build distribution packages (wheel and source tarball)

**Output:**
- `dist/setlr-*.whl` - Wheel distribution
- `dist/setlr-*.tar.gz` - Source distribution

### `release`

Upload the current version of the project to PyPI using twine.

**Usage:**
```bash
./script/release
```

This script will:
- Activate the virtual environment (if it exists)
- Check that distribution files exist
- Validate distribution files with twine
- Prompt for confirmation before uploading
- Upload to PyPI (requires PyPI credentials or API token)

**Prerequisites:**
- Run `./script/build` first to create distribution files
- Have PyPI credentials or API token ready

**Authentication:**
You can provide credentials via:
- Interactive prompt (default)
- Environment variables: `TWINE_USERNAME` and `TWINE_PASSWORD`
- PyPI API token: Set `TWINE_USERNAME` to `__token__` and `TWINE_PASSWORD` to your `pypi-...` token
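
For automated releases, the token-based variables can be assembled in Python before invoking twine. This is a minimal sketch and not part of the `release` script: `twine_token_env` is a hypothetical helper, and it relies on the PyPI convention that token authentication uses the literal username `__token__`.

```python
import os


def twine_token_env(token, base_env=None):
    """Return a copy of the environment with twine's token variables set."""
    if not token.startswith("pypi-"):
        raise ValueError("expected a PyPI API token starting with 'pypi-'")
    env = dict(os.environ if base_env is None else base_env)
    env["TWINE_USERNAME"] = "__token__"  # fixed username for API tokens
    env["TWINE_PASSWORD"] = token
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when launching the upload, which keeps the token out of the command line.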

## Typical Workflow

```bash
# 1. Set up development environment (first time only)
./script/bootstrap
source venv/bin/activate

# 2. Make your changes to the code
# ... edit files ...

# 3. Build and test
./script/build

# 4. If all tests pass and you're ready to release
./script/release
```

## Requirements

- Python 3.8 or higher
- Bash shell (Linux/macOS/WSL on Windows)
- Internet connection (for downloading dependencies)

## Notes

- The virtual environment (`venv/`) is automatically excluded from git via `.gitignore`
- All scripts use color output for better readability
- The `build` script will fail if tests don't pass
- The `release` script requires confirmation before uploading to PyPI