Playmaker is a crawler-based search engine that demonstrates the main features of a search engine (web crawling, indexing, and ranking) and lets users interact with it through a friendly user interface.

Roshdy23/Playmaker

Playmaker

This project is a search engine, including web crawling, indexing, ranking, and query processing.

Built With

Java, Spring Boot, React

Demo Video

Playmaker-SearchEngine.mp4

Search Engine Modules

Web Crawler

  • The web crawler is responsible for collecting documents from the web.
  • It starts with a list of URL addresses (seed set) and downloads the documents identified by these URLs.
  • Extracts hyperlinks from downloaded documents and adds them to the list of URLs to be downloaded.
  • Key features:
    • Avoids revisiting the same page.
    • Crawls documents of specific types (HTML).
    • Maintains state for resuming interrupted crawls.
    • Honors robots.txt exclusions.
    • Provides multithreaded implementation.
    • Crawls a specified number of pages.
    • Uses appropriate data structures to determine the order in which pages are visited.
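
The crawl loop described above (seed set, link extraction, no revisits, page limit) can be sketched as follows. This is a minimal single-threaded illustration, not the project's actual crawler: the class name `CrawlFrontierSketch`, the `fetchLinks` function (a stand-in for download plus HTML parsing), and the simplistic `normalize` rule are all assumptions made for the example. robots.txt checks, state persistence, and multithreading are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch: names are illustrative, not taken from the Playmaker source.
class CrawlFrontierSketch {

    // Breadth-first crawl. fetchLinks stands in for "download the page and extract its hyperlinks".
    static List<String> crawl(List<String> seeds, int maxPages,
                              Function<String, List<String>> fetchLinks) {
        Deque<String> frontier = new ArrayDeque<>(); // FIFO queue gives breadth-first visit order
        Set<String> seen = new HashSet<>();          // every URL ever queued, to avoid revisits
        List<String> visited = new ArrayList<>();

        for (String seed : seeds) {
            String url = normalize(seed);
            if (seen.add(url)) {
                frontier.addLast(url);
            }
        }

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.pollFirst();
            visited.add(url);
            for (String link : fetchLinks.apply(url)) {
                String next = normalize(link);
                if (seen.add(next)) {                // add() returns false for known URLs
                    frontier.addLast(next);
                }
            }
        }
        return visited;
    }

    // Assumed normalization: drop the fragment and a trailing slash so equivalent URLs compare equal.
    static String normalize(String url) {
        int hash = url.indexOf('#');
        String u = hash >= 0 ? url.substring(0, hash) : url;
        return u.endsWith("/") ? u.substring(0, u.length() - 1) : u;
    }

    public static void main(String[] args) {
        // A tiny in-memory "web" replaces real HTTP requests.
        Map<String, List<String>> fakeWeb = Map.of(
                "http://a.test", List.of("http://b.test/", "http://c.test#top"),
                "http://b.test", List.of("http://a.test", "http://d.test"));
        List<String> order = crawl(List.of("http://a.test/"), 3,
                url -> fakeWeb.getOrDefault(url, List.of()));
        System.out.println(order);
    }
}
```

A multithreaded variant would likely swap the queue and set for concurrent collections and persist both structures periodically so that an interrupted crawl can resume.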

Indexer

  • Indexes the contents of downloaded HTML documents.
  • Features:
    • Persistence in secondary storage.
    • Fast retrieval for word-based queries.
    • Incremental update with newly crawled documents.
    • Stores the information needed for ranking and searching results.
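
One common way to get fast word-based retrieval is an inverted index that maps each term to the documents (and positions) where it occurs. The sketch below is an in-memory illustration under that assumption; the class `InvertedIndexSketch` and its methods are hypothetical, and persistence to secondary storage is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: an in-memory positional inverted index.
class InvertedIndexSketch {

    // term -> (document id -> positions of the term inside that document)
    private final Map<String, Map<Integer, List<Integer>>> postings = new HashMap<>();

    // Indexing is incremental: calling this for a newly crawled document only appends postings.
    void addDocument(int docId, String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
        int position = 0;
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty string
            }
            postings.computeIfAbsent(token, t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(position++);
        }
    }

    // Returns docId -> positions for the term, or an empty map if the term is unknown.
    Map<Integer, List<Integer>> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Map.of());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.addDocument(1, "Search engines crawl the web");
        index.addDocument(2, "The web is large; the web grows");
        System.out.println(index.lookup("web"));
    }
}
```

Keeping positions alongside document ids costs extra space, but it is what makes phrase matching and position-aware ranking possible later.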

Query Processor

  • Processes search queries.
  • Performs necessary preprocessing and searches the index for relevant documents.
  • Retrieves documents containing words with shared stems from the search query.
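
Matching on shared stems usually means applying the same stemming step to both the indexed text and the query. The sketch below illustrates the idea with a deliberately crude suffix stripper; it is an assumption made for the example, not the stemmer Playmaker uses, and a real system would more likely use an established algorithm such as Porter stemming. All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: stem-based lookup with a toy stemmer.
class StemQuerySketch {

    // Crude stand-in for a real stemmer: strips one common suffix if at least 3 letters remain.
    static String stem(String word) {
        String w = word.toLowerCase(Locale.ROOT);
        for (String suffix : new String[] {"ing", "ed", "s"}) {
            if (w.endsWith(suffix) && w.length() - suffix.length() >= 3) {
                return w.substring(0, w.length() - suffix.length());
            }
        }
        return w;
    }

    // Lowercases, splits on non-alphanumerics, and stems every token.
    static List<String> stems(String text) {
        List<String> result = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                result.add(stem(token));
            }
        }
        return result;
    }

    // stem -> ids of documents containing any word with that stem
    static Map<String, Set<Integer>> buildIndex(Map<Integer, String> docs) {
        Map<String, Set<Integer>> index = new HashMap<>();
        docs.forEach((id, text) -> {
            for (String s : stems(text)) {
                index.computeIfAbsent(s, k -> new TreeSet<>()).add(id);
            }
        });
        return index;
    }

    // Returns documents that share a stem with at least one query word.
    static Set<Integer> search(Map<String, Set<Integer>> index, String query) {
        Set<Integer> hits = new TreeSet<>();
        for (String s : stems(query)) {
            hits.addAll(index.getOrDefault(s, Set.of()));
        }
        return hits;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> index = buildIndex(Map.of(
                1, "Crawling the web",
                2, "The crawler crawled pages",
                3, "Ranking results"));
        System.out.println(search(index, "crawls"));
    }
}
```

With this toy stemmer, "crawls", "crawling", and "crawled" all reduce to "crawl", so a query for any one of them retrieves documents containing the others.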

Phrase Searching

  • Supports phrase searching with quotation marks.
  • Results must match the order of words in the phrase.
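
A quoted query can be handled by checking that the phrase's words appear in the document in the same order. The sketch below additionally assumes the words must be adjacent, which is the usual meaning of phrase search but is not stated in this README. The class and method names are hypothetical, and the check scans document tokens directly instead of using an index.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: phrase matching over tokenized text.
class PhraseSearchSketch {

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    // A query wrapped in quotation marks is a phrase query; anything else matches on any word.
    static boolean matches(String document, String query) {
        String q = query.trim();
        List<String> docTokens = tokenize(document);
        if (q.length() >= 2 && q.startsWith("\"") && q.endsWith("\"")) {
            List<String> phrase = tokenize(q.substring(1, q.length() - 1));
            // indexOfSubList finds the phrase as a contiguous, ordered run of tokens.
            return !phrase.isEmpty() && Collections.indexOfSubList(docTokens, phrase) >= 0;
        }
        for (String word : tokenize(q)) {
            if (docTokens.contains(word)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc = "The football match ended late";
        System.out.println(matches(doc, "\"football match\""));
        System.out.println(matches(doc, "\"match football\""));
    }
}
```

With a positional index, the same test can be done without rescanning documents: a phrase matches when the positions of consecutive phrase words differ by exactly one.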

Ranker

  • Ranks documents based on relevance and popularity.
  • Calculates relevance from the occurrences of the query words in each document, aggregated into a single score.
  • Measures popularity using algorithms like PageRank.
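
As an illustration of the popularity side, here is a minimal PageRank sketch using power iteration. The graph encoding (an array of outgoing link targets per page), the damping factor, and the fixed iteration count are assumptions chosen for the example and may differ from what the project does. Pages with no outgoing links spread their rank evenly over all pages.

```java
import java.util.Arrays;

// Hypothetical sketch: PageRank by power iteration on a small link graph.
class PageRankSketch {

    // outLinks[p] lists the pages that page p links to (pages are numbered 0..n-1).
    static double[] pageRank(int[][] outLinks, double damping, int iterations) {
        int n = outLinks.length;
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by pages without outgoing links
            for (int p = 0; p < n; p++) {
                if (outLinks[p].length == 0) {
                    dangling += rank[p];
                    continue;
                }
                double share = rank[p] / outLinks[p].length;
                for (int q : outLinks[p]) {
                    next[q] += share;
                }
            }
            for (int q = 0; q < n; q++) {
                // Random jump term plus damped rank received over links and from dangling pages.
                next[q] = (1.0 - damping) / n + damping * (next[q] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
        int[][] graph = {{1, 2}, {2}, {0}};
        System.out.println(Arrays.toString(pageRank(graph, 0.85, 50)));
    }
}
```

In this small graph, page 2 receives links from both other pages, so it ends up ranked above page 1. A final document score would typically combine such a popularity value with the relevance score.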

Web Interface

  • Implements a user-friendly web interface.
  • Receives user queries and displays search results with snippets.
  • Displays website title, URL, and relevant paragraph with query words in bold.
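
Bolding query words in a snippet can be done on the server before the paragraph is sent to the interface. The sketch below is one possible approach, assuming whole-word, case-insensitive matching and HTML `<b>` tags; the class `SnippetSketch` and its methods are hypothetical. The text is HTML-escaped segment by segment so that page content cannot inject markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: wrap query words in <b> tags inside a result snippet.
class SnippetSketch {

    static String highlight(String paragraph, List<String> queryWords) {
        List<String> quoted = new ArrayList<>();
        for (String word : queryWords) {
            if (!word.isBlank()) {
                quoted.add(Pattern.quote(word)); // treat the word literally, not as a regex
            }
        }
        if (quoted.isEmpty()) {
            return escape(paragraph);
        }
        Pattern pattern = Pattern.compile(
                "\\b(?:" + String.join("|", quoted) + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);

        Matcher matcher = pattern.matcher(paragraph);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            out.append(escape(paragraph.substring(last, matcher.start())));
            out.append("<b>").append(escape(matcher.group())).append("</b>");
            last = matcher.end();
        }
        out.append(escape(paragraph.substring(last)));
        return out.toString();
    }

    // Minimal HTML escaping for text placed between tags.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(highlight("Java search engine <demo>", List.of("search", "JAVA")));
    }
}
```

Because matching is on whole words, "searching" is not bolded for the query word "search"; highlighting words that only share a stem would require comparing stems instead.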

How to Run

  1. Clone the repository.
  2. Install required dependencies.
  3. Run the main application file.
  4. Access the React web interface.
