Add CLI support for schemas (--schema, --list-schemas) #15

@MALathon

Summary

Add two CLI arguments: --schema to select a schema by name (or auto-detect one), and --list-schemas to print the available schemas and exit.

New CLI Arguments

--schema SCHEMA    Schema to use: 'auto' for auto-detection, or a schema name (e.g., 'springer_book')
--list-schemas     List all available schemas and exit

Implementation

Argument Parser

# cli.py
parser.add_argument(
    '--schema',
    type=str,
    metavar='SCHEMA',
    help="Schema to use: 'auto' for auto-detection, or schema name"
)

parser.add_argument(
    '--list-schemas',
    action='store_true',
    help='List all available schemas and exit'
)

Main Function

def main(argv=None):
    args = parser.parse_args(argv)
    
    # Handle --list-schemas
    if args.list_schemas:
        from fetcharoo.schemas import get_all_schemas
        schemas = get_all_schemas()
        if schemas:
            print("Available schemas:")
            for name, schema in sorted(schemas.items()):
                desc = schema.description or "No description"
                print(f"  {name:20} {desc}")
        else:
            print("No schemas registered.")
        return 0
    
    # Pass schema to download function (args.schema is None when --schema is omitted)
    result = download_pdfs_from_webpage(
        url=args.url,
        # ... other args ...
        schema=args.schema,
    )
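The --list-schemas formatting could be pulled into a small helper so it can be checked without running the CLI. This is a minimal sketch, not the project's code: the `Schema` dataclass is a hypothetical stand-in for whatever `get_all_schemas()` returns (only a `description` attribute is assumed), and `format_schema_list` is a name invented for illustration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Schema:
    """Hypothetical stand-in for a fetcharoo schema; only `description` is used here."""
    description: Optional[str] = None


def format_schema_list(schemas: Mapping[str, Schema]) -> str:
    """Build the text that --list-schemas would print for a given registry."""
    if not schemas:
        return "No schemas registered."
    lines = ["Available schemas:"]
    # Sort by name so the output is stable across runs.
    for name, schema in sorted(schemas.items()):
        desc = schema.description or "No description"
        # Pad the name to 20 columns so descriptions line up.
        lines.append(f"  {name:20} {desc}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = {
        "springer_book": Schema("Springer book with chapters"),
        "arxiv": Schema("arXiv preprint paper"),
        "generic": Schema(),  # no description set
    }
    print(format_schema_list(demo))
```

Returning a string instead of printing directly keeps the formatting easy to assert on in tests; main() would only need to print the result.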

Usage Examples

# List available schemas
$ fetcharoo --list-schemas
Available schemas:
  arxiv                arXiv preprint paper
  generic              Generic fallback for unknown sites
  springer_book        Springer book with chapters

# Auto-detect schema
$ fetcharoo https://link.springer.com/book/10.1007/978-3-031-41026-0 --schema auto
Detected schema: springer_book
Downloading PDFs...

# Use specific schema
$ fetcharoo https://example.com/book --schema springer_book

# Schema with other options (explicit options override schema)
$ fetcharoo https://link.springer.com/book/... --schema auto --sort-by alpha
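The last example says explicit options override the schema. One way that rule could be implemented is to give overridable flags a default of None, meaning "not given", and let schema defaults fill only those gaps. The sketch below is an assumption, not fetcharoo's actual behavior: `build_demo_parser`, `resolve_options`, the optional positional `url`, and the schema default `"numeric"` are all hypothetical.

```python
import argparse
from typing import Any, Dict


def build_demo_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with only the flags needed for this illustration."""
    parser = argparse.ArgumentParser(prog="fetcharoo")
    parser.add_argument("url", nargs="?")
    parser.add_argument("--schema", type=str, metavar="SCHEMA")
    # default=None lets us tell "flag omitted" apart from an explicit value.
    parser.add_argument("--sort-by", default=None)
    return parser


def resolve_options(args: argparse.Namespace,
                    schema_defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the schema's defaults, then apply every explicitly given CLI value."""
    resolved = dict(schema_defaults)
    for key, value in vars(args).items():
        if value is not None:
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    parser = build_demo_parser()
    schema_defaults = {"sort_by": "numeric"}  # assumed schema default

    explicit = parser.parse_args(
        ["https://example.com/book", "--schema", "springer_book", "--sort-by", "alpha"]
    )
    print(resolve_options(explicit, schema_defaults)["sort_by"])

    implicit = parser.parse_args(["https://example.com/book", "--schema", "springer_book"])
    print(resolve_options(implicit, schema_defaults)["sort_by"])
```

With this approach a flag whose parser default is a real value (rather than None) would always override the schema, so any option a schema may set needs a None default.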

Tasks

  • Add --schema argument to parser
  • Add --list-schemas argument
  • Implement --list-schemas output formatting
  • Pass schema to download_pdfs_from_webpage()
  • Update help text and examples in epilog
  • Add CLI tests for new arguments
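The CLI tests in the last task might look roughly like the following. This is a sketch under assumptions: the parser is rebuilt locally as `build_parser_under_test` (a hypothetical name) rather than imported from cli.py, and the positional `url` is assumed optional so that `fetcharoo --list-schemas` parses without one.

```python
import argparse


def build_parser_under_test() -> argparse.ArgumentParser:
    """Local copy of the two new arguments; a real test would import the parser from cli.py."""
    parser = argparse.ArgumentParser(prog="fetcharoo")
    parser.add_argument("url", nargs="?")  # assumed optional
    parser.add_argument(
        "--schema",
        type=str,
        metavar="SCHEMA",
        help="Schema to use: 'auto' for auto-detection, or schema name",
    )
    parser.add_argument(
        "--list-schemas",
        action="store_true",
        help="List all available schemas and exit",
    )
    return parser


def test_schema_defaults_to_none():
    args = build_parser_under_test().parse_args(["https://example.com/book"])
    assert args.schema is None
    assert args.list_schemas is False


def test_schema_accepts_auto_and_named_values():
    parser = build_parser_under_test()
    for value in ("auto", "springer_book"):
        args = parser.parse_args(["https://example.com/book", "--schema", value])
        assert args.schema == value


def test_list_schemas_parses_without_url():
    args = build_parser_under_test().parse_args(["--list-schemas"])
    assert args.list_schemas is True
    assert args.url is None
```

These functions follow pytest's `test_` naming convention but use only plain assertions, so they also run when called directly.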

Acceptance Criteria

  • fetcharoo --list-schemas shows all schemas with descriptions
  • fetcharoo URL --schema auto detects the matching schema and uses it
  • fetcharoo URL --schema springer_book uses named schema
  • Schema works with other CLI options (merge, sort, etc.)

Dependencies

Part of

Parent issue: #10
