Enhance Typesense search functionality with advanced features #309

Draft: Pierre-VF wants to merge 1 commit into main from vibe/typesense-search-improvements-38214d

@Pierre-VF

Owner

Typesense Search Functionality Enhancements

This PR extends the web app's Typesense-backed search with fuzzy and hybrid matching, new API endpoints, an expanded schema, batch indexing, and a fuzzy search toggle in the UI.

🚀 Major Features Added

🔍 Advanced Search Capabilities

  • Fuzzy Search: Added typo tolerance with configurable edit distance
  • Hybrid Search: Enhanced keyword + vector search with adjustable weighting (alpha parameter)
  • Prefix/Infix Search: Better partial matching for autocomplete and suggestions
  • Improved Scoring: Field-specific weights (name: 3.0, organisation: 2.5, description: 1.5, etc.)
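
As a rough illustration of how the fuzzy search and field weighting options could map onto Typesense search parameters, the sketch below builds a parameter dictionary. `q`, `query_by`, `query_by_weights`, `num_typos` and `prefix` are Typesense search parameters; the helper name `build_search_params`, its arguments, and the scaling of the fractional weights to integers are assumptions for illustration, not code from this PR. Hybrid search is left out of the sketch.

```python
# Field weights as listed in the PR description (name: 3.0, organisation: 2.5, ...).
FIELD_WEIGHTS = {"name": 3.0, "organisation": 2.5, "description": 1.5}


def build_search_params(query, fuzzy=True, max_typos=2):
    """Build a Typesense-style search parameter dict (hypothetical helper)."""
    fields = list(FIELD_WEIGHTS)
    return {
        "q": query,
        "query_by": ",".join(fields),
        # Assumption: weights are sent as integers, so the floats are doubled
        # to keep their relative order (3.0 -> 6, 2.5 -> 5, 1.5 -> 3).
        "query_by_weights": ",".join(str(int(FIELD_WEIGHTS[f] * 2)) for f in fields),
        # Setting num_typos to 0 switches typo tolerance off.
        "num_typos": max_typos if fuzzy else 0,
        # Match the last query token as a prefix (useful for autocomplete).
        "prefix": True,
    }
```

Keeping the parameter construction in a pure function like this makes the fuzzy toggle easy to unit test without a running Typesense server.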

📊 Schema Enhancements

  • Added comprehensive searchable fields: website, license_url, forked_from, master_branch, readme_type, open_pull_requests, latest_update
  • Added all_languages array field for multi-language projects
  • Added search_text field combining all searchable content for better full-text search
  • Enhanced tokenization with custom separators and symbol handling
  • Added infix search capabilities with minimum length configuration
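
A minimal sketch of what a collection schema with these additions might look like. The field names come from the list above, and `infix`, `facet`, `optional`, `token_separators` and `symbols_to_index` are Typesense schema options. The collection name, every field type, and the specific separators and symbols are assumptions, not the values used in this PR.

```python
def build_schema(collection_name="projects"):
    """Return a Typesense-style collection schema (illustrative values only)."""
    optional_strings = [
        "website", "license_url", "forked_from", "master_branch", "readme_type",
    ]
    fields = [
        # Infix indexing enables matches in the middle of a token.
        {"name": "name", "type": "string", "infix": True},
        {"name": "description", "type": "string", "optional": True},
    ]
    fields += [{"name": f, "type": "string", "optional": True} for f in optional_strings]
    fields += [
        {"name": "open_pull_requests", "type": "int32", "optional": True},
        # Assumption: dates are stored as Unix timestamps.
        {"name": "latest_update", "type": "int64", "optional": True},
        {"name": "all_languages", "type": "string[]", "facet": True, "optional": True},
        # Single combined field for full-text queries.
        {"name": "search_text", "type": "string"},
    ]
    return {
        "name": collection_name,
        "fields": fields,
        # Assumed separators and symbols; the PR does not list the real ones.
        "token_separators": ["-", "_", "/"],
        "symbols_to_index": ["+", "#"],
    }
```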

🎯 New API Endpoints

  • GET /search/advanced - Full-featured search with all advanced options
  • GET /search/autocomplete - Autocomplete suggestions for search queries
  • GET /search/suggestions - Rich suggestions with project context

🔧 Enhanced Existing Endpoints

  • GET /search now supports fuzzy_search, hybrid_search, sort_by, sort_order, exclude_forks, exclude_inactive, min_last_commit_days
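
For illustration, a client could assemble a request to the enhanced endpoint as sketched below. The option names are the ones listed above; the `query` parameter name, the accepted values (for example `sort_order="desc"`), and the base URL are hypothetical, since the PR description does not specify them.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, **options):
    """Build a GET /search URL, skipping options that are left unset (None)."""
    params = {"query": query}  # assumption: the query parameter is named "query"
    for key, value in options.items():
        if value is None:
            continue
        # Booleans are sent as lowercase strings.
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[key] = value
    return f"{base_url}/search?{urlencode(params)}"


url = build_search_url(
    "http://localhost:8000",
    "solar",
    fuzzy_search=True,
    exclude_forks=True,
    sort_order="desc",
    min_last_commit_days=None,  # unset, so it is omitted from the URL
)
```

Because every new option is omitted when unset, a request that passes only the query looks the same as it did before these parameters existed.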

🖥️ UI Improvements

  • Added fuzzy search toggle in the web interface
  • Enhanced search forms to preserve all search parameters across navigation
  • Improved filtering and sorting options in the results page

⚡ Performance Optimizations

  • Batch indexing for better performance (100 documents at a time)
  • Combined search text field reduces need for multi-field queries
  • Optimized field indexing and faceting
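
The batch indexing idea can be sketched as follows. This is not the PR's `index_data_in_typesense()` implementation; `batched`, `index_in_batches` and the `import_batch` callback are hypothetical names used to show the chunking logic on its own.

```python
from itertools import islice


def batched(items, size=100):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def index_in_batches(documents, import_batch, batch_size=100):
    """Send documents to `import_batch` in chunks; return the number of calls."""
    n_calls = 0
    for chunk in batched(documents, batch_size):
        import_batch(chunk)  # one API call per chunk instead of one per document
        n_calls += 1
    return n_calls
```

With the `typesense` Python client, `import_batch` could plausibly wrap a bulk import call such as `client.collections["projects"].documents.import_(docs, {"action": "upsert"})`, where the collection name is an assumption. Indexing 250 documents this way takes 3 calls rather than 250.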

🔄 Backwards Compatibility

✅ Fully backwards compatible - All existing functionality is preserved

📁 Files Modified

  1. src/oss4climate_app/src/search/typesense_io.py - Core search functionality enhancements
  2. src/oss4climate_app/src/routers/api.py - New and enhanced API endpoints
  3. src/oss4climate_app/src/routers/ui.py - UI integration of new features
  4. src/oss4climate_app/templates/v2/search.html - Search form with fuzzy search option
  5. src/oss4climate_app/templates/v2/results.html - Results page with preserved search parameters

🧪 Testing

✅ Syntax validation passed
✅ Import compatibility verified
✅ Function signature compatibility maintained
✅ New functionality accessible

🎉 Next Steps

To deploy these improvements:

  1. Update the Typesense schema by running the seeding script
  2. Reindex all projects with the new fields
  3. Test thoroughly and monitor performance

The changes are ready for review and can be merged once approved.

Major improvements to the search functionality:

## Schema Enhancements
- Added comprehensive searchable fields: website, license_url, forked_from, master_branch, readme_type, open_pull_requests, latest_update
- Added all_languages as an array field for multi-language projects
- Added search_text field for optimized full-text search
- Enhanced tokenization with custom separators and symbol handling
- Added infix search capabilities for partial matching

## Advanced Search Features
- Implemented fuzzy search with configurable edit distance
- Added prefix and infix matching for better partial query handling
- Enhanced hybrid search (keyword + vector) with configurable alpha weighting
- Improved scoring with field-specific weights (name: 3.0, organisation: 2.5, etc.)
- Added reranking for hybrid search results
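
To make "configurable edit distance" concrete, here is a small standalone sketch of typo-tolerant matching using plain Levenshtein distance. It only illustrates the concept: Typesense performs its own typo matching internally, and the function names and the default threshold of 2 are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute (or keep when equal)
            ))
        prev = curr
    return prev[-1]


def fuzzy_match(query, candidate, max_distance=2):
    """True when the candidate is within `max_distance` edits of the query."""
    return edit_distance(query.lower(), candidate.lower()) <= max_distance
```

Under this definition, the misspelled query "enrgy" still matches "energy" (one missing letter), while lowering `max_distance` to 0 requires an exact match.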

## API Enhancements
- Added /search/advanced endpoint with full parameter support
- Added /search/autocomplete endpoint for search suggestions
- Added /search/suggestions endpoint for rich suggestions with context
- Enhanced /search endpoint with new parameters: fuzzy_search, hybrid_search, sort_by, sort_order, exclude_forks, exclude_inactive, min_last_commit_days

## UI Improvements
- Added fuzzy search toggle in the web interface
- Enhanced search forms to preserve all search parameters
- Improved result filtering and sorting options

## Backend Improvements
- Batch indexing for better performance (100 documents at a time)
- Enhanced filtering with support for multiple languages and licenses
- Better handling of optional fields and null values
- Improved date handling with timestamp conversions
- Added comprehensive error handling
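
A minimal sketch of the null handling and timestamp conversion described above, under stated assumptions: dates are stored as integer Unix timestamps, naive datetimes are treated as UTC, and fields with no value are omitted from the document. `to_unix_timestamp` and `prepare_document` are hypothetical helpers, not functions from this PR.

```python
from datetime import datetime, timezone


def to_unix_timestamp(value):
    """Convert a datetime to integer seconds since the epoch."""
    if value.tzinfo is None:
        # Assumption: naive datetimes are in UTC.
        value = value.replace(tzinfo=timezone.utc)
    return int(value.timestamp())


def prepare_document(record):
    """Drop None values and convert datetimes before indexing."""
    document = {}
    for key, value in record.items():
        if value is None:
            continue  # optional field left out rather than sent as null
        if isinstance(value, datetime):
            value = to_unix_timestamp(value)
        document[key] = value
    return document
```

Omitting empty fields only works if the corresponding schema fields are marked optional.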

## New Functions
- autocomplete(): Get search suggestions for partial queries
- get_search_suggestions(): Get rich suggestions with project context
- Enhanced search_with_query() with advanced options
- Improved index_data_in_typesense() with batch processing

## Performance Optimizations
- Combined search text field reduces need for multi-field queries
- Batch document indexing reduces API calls
- Optimized field indexing and faceting

## Backwards Compatibility
- All existing functionality preserved
- New parameters are optional with sensible defaults
- Existing API endpoints unchanged in behavior