Real-time streaming pipelines using Apache Flink (PyFlink) with Kafka, RocksDB, and Elasticsearch

usefusefi/flink-streaming-optimization

Apache Flink Streaming Optimization Guide

Real-Time Data Processing with Flink & Kafka

This repository contains Flink streaming pipelines tuned for performance, covering:

  • Event-time processing with watermarks
  • Stateful processing using RocksDB
  • Checkpoint optimization and fault tolerance
  • A Kafka + Flink + Elasticsearch pipeline
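
To make the first topic concrete, here is a minimal sketch of how a bounded-out-of-orderness watermark decides when an event-time window may close. It is plain Python rather than PyFlink, and it is not taken from the scripts in this repository: the names `Event`, `WatermarkTracker`, and `count_per_window` are hypothetical, and the rule is simplified to "watermark = highest timestamp seen minus the allowed delay", which omits the off-by-one details of Flink's own implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str
    timestamp_ms: int  # event time (when it happened), not arrival time


class WatermarkTracker:
    """Simplified watermark: highest timestamp seen minus the allowed delay."""

    def __init__(self, max_out_of_orderness_ms):
        self.max_out_of_orderness_ms = max_out_of_orderness_ms
        self.max_timestamp_ms = None

    def on_event(self, event):
        if self.max_timestamp_ms is None or event.timestamp_ms > self.max_timestamp_ms:
            self.max_timestamp_ms = event.timestamp_ms
        return self.max_timestamp_ms - self.max_out_of_orderness_ms


def count_per_window(events, window_ms, max_out_of_orderness_ms):
    """Count events per (key, tumbling window) in event time.

    A window is emitted once the watermark reaches its end; events that
    arrive for an already-closed window are collected as dropped.
    """
    tracker = WatermarkTracker(max_out_of_orderness_ms)
    open_windows = defaultdict(int)  # (key, window_start) -> count
    emitted, dropped = [], []
    watermark = None

    for event in events:
        window_start = event.timestamp_ms - event.timestamp_ms % window_ms
        if watermark is not None and window_start + window_ms <= watermark:
            dropped.append(event)  # too late: its window has already closed
        else:
            open_windows[(event.key, window_start)] += 1

        watermark = tracker.on_event(event)

        # Close every window whose end the watermark has now reached.
        closable = sorted(k for k in open_windows if k[1] + window_ms <= watermark)
        for key, start in closable:
            emitted.append((key, start, open_windows.pop((key, start))))

    return emitted, dropped, dict(open_windows)


# Arrival order differs from event-time order: 8_000 and 5_000 arrive late.
arrivals = [Event("a", t) for t in (1_000, 9_000, 11_000, 8_000, 13_000, 5_000)]
emitted, dropped, still_open = count_per_window(
    arrivals, window_ms=10_000, max_out_of_orderness_ms=2_000
)
print(emitted, dropped, still_open)
```

In this run the event at 8,000 ms arrives after the one at 11,000 ms but is still counted, because the watermark (9,000 ms) has not yet reached the end of the first window. The event at 5,000 ms arrives after the watermark has moved to 11,000 ms, so it is dropped. A larger allowed delay tolerates more disorder at the cost of later results.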

Read the Full Article on Medium: Link to Medium


Repository Structure

  • /code_examples/ → PyFlink scripts for real-time processing.
  • /notebooks/ → Jupyter Notebook for interactive learning.
  • /configs/ → Flink tuning configurations & scripts.
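
As an illustration of the kind of tuning a configuration file can hold, the sketch below renders a handful of checkpoint and RocksDB state-backend settings as `flink-conf.yaml` lines. It is an assumption about what such a file might contain, not a copy of anything in /configs/. The key names follow the Flink 1.x configuration reference (newer releases rename some of them), the values are placeholders to tune against your own job, and `render_flink_conf` is a hypothetical helper.

```python
def render_flink_conf(settings):
    """Render settings as flat 'key: value' lines, as used in flink-conf.yaml."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML-style booleans
        lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
CHECKPOINT_SETTINGS = {
    "state.backend": "rocksdb",                  # keep keyed state in RocksDB
    "state.backend.incremental": True,           # upload only changed files
    "state.checkpoints.dir": "file:///tmp/flink-checkpoints",
    "execution.checkpointing.interval": "60s",   # how often to checkpoint
    "execution.checkpointing.min-pause": "30s",  # breathing room between checkpoints
    "execution.checkpointing.mode": "EXACTLY_ONCE",
}

print(render_flink_conf(CHECKPOINT_SETTINGS), end="")
```

Incremental checkpoints and a minimum pause are common first knobs to turn when checkpoints of large RocksDB state start to compete with normal processing.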

How to Use

1️⃣ Clone this Repository

git clone https://github.com/usefusefi/flink-streaming-optimization.git
cd flink-streaming-optimization

2️⃣ Run Example Flink Jobs

python code_examples/event_time_watermarks.py

3️⃣ Run Jupyter Notebook

jupyter notebook notebooks/flink_streaming.ipynb
