- Anastasiia Havryliv
- Yaroslav Romanus
The aim of this project is to implement a system that ingests Wikipedia stream data and processes it so that the results can be accessed through REST API endpoints. The project uses Kafka, Cassandra, Spark (for batch processing), Spark Streaming, and FastAPI.
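As a rough illustration of the ingestion step, the sketch below parses Server-Sent Events lines into JSON payloads and keeps a few fields, which is roughly what a producer might do before publishing to Kafka. It assumes the stream is delivered as Server-Sent Events with JSON payloads. The field names (`meta.domain`, `page_title`, `performer.user_is_bot`) and the helpers `parse_sse_events` and `to_record` are illustrative assumptions, not taken from this project's code.

```python
import json
from typing import Iterable, Iterator


def parse_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per Server-Sent Events message."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # A message may spread its payload over several data: lines.
            part = line[len("data:"):]
            data_parts.append(part[1:] if part.startswith(" ") else part)
        elif line == "" and data_parts:
            # A blank line terminates the current message.
            yield json.loads("\n".join(data_parts))
            data_parts = []
        # Other fields (event:, id:, comments) are ignored in this sketch.


def to_record(event: dict) -> dict:
    """Keep only a few fields (hypothetical schema, assumed for illustration)."""
    return {
        "domain": event.get("meta", {}).get("domain"),
        "page_title": event.get("page_title"),
        "user_is_bot": event.get("performer", {}).get("user_is_bot", False),
    }


# Hand-written sample input; a real producer would read lines from the stream.
sample = [
    "event: message\n",
    'data: {"meta": {"domain": "en.wikipedia.org"}, '
    '"page_title": "Example", "performer": {"user_is_bot": false}}\n',
    "\n",
]
records = [to_record(e) for e in parse_sse_events(sample)]
```

Each resulting record is a plain dictionary, so it could be serialized with `json.dumps` and sent to a Kafka topic by whichever client library the producer uses.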
A system design diagram and a detailed description of all components can be found here
The results of the endpoint requests, along with a demonstration picture of the system running in Docker (working containers), can be found here
More detailed URLs:
To start the system, run in a terminal:
docker-compose up
To shut the system down, run in a terminal:
docker-compose down