Requirements:
- Python 3.x
- Kafka 2.1.0
- Elasticsearch 6.7.0
- Flask
- Client APIs and packages
pip install -r requirements.txt
Setup:
-
Start Zookeeper
zookeeper-server-start /usr/local/etc/kafka/zookeeper.properties -
Start Kafka
kafka-server-start /usr/local/etc/kafka/server.properties -
Create topics
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic twitter2kafka kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafka2sketch -
Start Elasticsearch
elasticsearch -
Create Elasticsearch index
curl -XPUT "http://localhost:9200/tweets" -
Navigate to the 'code' directory and start the services:
python streaming/twitter_to_kafka.py python streaming/kafka_to_elastic.py python app/application.py -
Initialize the stateful count-min sketch:
curl -v http://127.0.0.1:5000/initialize -
Call API endpoints listed as routes in
app/application.pyto get the data