WikiRacer is a Java program that finds a hyperlink path between two pages in the English Wikipedia, blazingly fast!
Originally built as a school project in February 2024. Uploaded to GitHub in October 2024 with code cleanup and additional performance improvements.
Use it for exploration, or to troll your friends by cheating at an online wiki-race.
https://en.wikipedia.org/wiki/Wikipedia:Wiki_Game:
[Wikirace] is a race between any number of participants, using wikilinks to travel from one Wikipedia page to another. The first person to reach the destination page, or the person that reaches the destination using the fewest links, wins the race.
At runtime, WikiRacer relies solely on calls to the Wikimedia API (the links and linkshere endpoints), with no cached or pre-downloaded data.
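For illustration only (this is not WikiRacer's actual code; the class name and User-Agent string are made up), here is a minimal Java sketch of the kind of runtime call involved. It asks the MediaWiki Action API for one page's outgoing links via prop=links; the mirror query, prop=linkshere, returns the pages that link to it:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Illustration only: fetch the outgoing links of a single article.
public class LinksQueryExample {
    public static void main(String[] args) throws Exception {
        String title = URLEncoder.encode("Wikipedia", StandardCharsets.UTF_8);
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&format=json&prop=links&pllimit=max&titles=" + title;

        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "wikiracer-example/0.1 (illustrative)")
                .GET()
                .build();

        // The JSON body contains a "links" array for the page; a real racer would parse it
        // and follow the "plcontinue" continuation token to page through all results.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```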
java -jar wikiracer.jar
Optional parameters:
--start [page title] // specify starting page
--finish [page title] // specify ending page
-q // quiet mode, reduces log output
-b // benchmark mode, produces an extra pipe-delimited string to save to a CSV
-i // interactive mode, uses chrome-cli to guide the user through the found path
Important
This mode only works on macOS with the Chrome browser and chrome-cli installed.
This mode guides you along the found path by scrolling to and highlighting links in your browser. Verified to work with en.wikipedia.org and wiki-race.com.
- Start wikiracer.jar with the -i parameter while the Wikipedia page is open in the background.
- Input the end page in the console.
- Wait for the path to be found and click the highlighted links.
- Enjoy!
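Putting the flags above together, a quiet benchmark run between two explicitly chosen pages might look like this (the page titles are arbitrary examples):

```
java -jar wikiracer.jar --start "Albert Einstein" --finish "Pizza" -q -b
```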
WikiRacer was benchmarked on 350 pairs of random Wikipedia articles. See benchmark.csv for the full log.
It achieved a total run time of 1.933 seconds (median) or 2.033 seconds (average).
The average path length was only 5.90 articles (including the starting one), so a typical race takes only about 5 hyperlink clicks to finish.
For comparison, Six Degrees of Wikipedia achieves a run time of 2.532 seconds (median) or 3.992 seconds (average) with an average path length of 5.157 articles. See below for more details on that project.
The Six Degrees of Wikipedia project uses a different approach, searching through the graph of 500M+ links released by Wikimedia as SQL dumps. As I'm writing this, the latest pagelinks dump is 6.78 GB in size, the page dump is 2.25 GB, and the redirects dump is 170 MB.
It represents the "proper" approach, finding all paths of minimal length simultaneously. Therefore, it was chosen as the benchmark target, since I was wondering how close I could come to its path lengths without using stored data.
However, that approach uses a big chunk of RAM, and the ~10 GB of data has to be continuously re-downloaded and rebuilt to keep up with Wikipedia changes, which is a significant limitation.


