JLogShip is a lightweight, high-performance log shipping engine written in Java, inspired by Filebeat. It efficiently tails log files, processes events, and ships them to various outputs like Elasticsearch and Kafka with at-least-once delivery guarantees.
- File Tailing: Efficiently monitors and tails log files in real-time
- Multiple Outputs: Support for Elasticsearch and Kafka destinations
- State Persistence: Registry-based offset tracking for reliable resumption
- Configurable: YAML-based configuration for inputs and outputs
- Multithreaded: Concurrent processing with thread pools for optimal performance
- Docker Ready: Pre-configured Docker Compose setups for ELK stack and Kafka
- At-Least-Once Delivery: Ensures no log events are lost during shipping
JLogShip follows a modular architecture with the following key components:
- Config Loader: Loads runtime configuration from YAML files
- Prospector: Discovers and monitors log files using glob patterns
- Harvester: Reads individual files line-by-line, handling log rotation
- Registry: Persists file offsets and metadata for state management
- Event Queue: Batches log events for efficient processing
- Output Workers: Ships batched events to configured destinations
```
Config Loader → Prospector → Harvester → Event Queue → Output Worker → Elasticsearch/Kafka
      ↓             ↓            ↓             ↓              ↓
    YAML        File Scan    Line Read     Batching      Shipping
```
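The sketch below shows one way these stages could be wired together in plain Java. Class names and the queue payload type are illustrative, not the actual `org.jlogship` implementations:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical wiring of the pipeline stages shown above.
public class PipelineSketch {
    public static void main(String[] args) {
        // Event Queue: a bounded queue decouples harvesters from output workers.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10_000);
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Harvester side: in the real engine, one task per discovered file
        // reads lines and enqueues them; put() blocks when the queue is full.
        pool.submit(() -> {
            try {
                queue.put("example log line");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Output worker side: drain the queue and ship events in batches.
        pool.submit(() -> {
            try {
                while (true) {
                    String event = queue.take(); // blocks until work arrives
                    System.out.println("shipping: " + event); // stand-in for ES/Kafka
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
```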
- Java 21 or higher
- Maven 3.8+
- Docker and Docker Compose (for infrastructure setup)
To get started:

- Clone the repository:

  ```bash
  git clone https://github.com/yourusername/JLogShip.git
  cd JLogShip/logshipengine
  ```

- Build the project:

  ```bash
  mvn clean compile
  ```
JLogShip uses YAML configuration files. The main configuration is in `src/main/resources/application.yml`.
```yaml
inputs:
  - type: log
    paths:
      - "/var/log/*.log"
      - "/app/logs/**/*.log"

output:
  type: elasticsearch   # or kafka
  hosts: ["http://localhost:9200"]
  index: "logship-events"

# For Kafka output:
# output:
#   type: kafka
#   hosts: ["localhost:9092"]
#   topic: "logship-events"

registry:
  path: "./registry/registry.json"
```

- `inputs`: Array of input configurations
  - `type`: Input type (currently only `log` is supported)
  - `paths`: Glob patterns for log file paths
- `output`: Output destination configuration
  - `type`: Output type (`elasticsearch` or `kafka`)
  - `hosts`: Array of host URLs
  - `index`/`topic`: Destination index or topic name
- `registry`: State persistence configuration
  - `path`: Path to the registry JSON file
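As an illustration of what the Config Loader stage does, here is a minimal sketch that reads this file with SnakeYAML (an assumed library choice; the actual loader in `org.jlogship.config` may be implemented differently):

```java
import java.io.InputStream;
import java.util.List;
import java.util.Map;
import org.yaml.snakeyaml.Yaml;

// Minimal sketch of loading application.yml with SnakeYAML (assumed library).
public class ConfigSketch {
    @SuppressWarnings("unchecked")
    public static void main(String[] args) throws Exception {
        try (InputStream in = ConfigSketch.class.getResourceAsStream("/application.yml")) {
            Map<String, Object> root = new Yaml().load(in);

            List<Map<String, Object>> inputs = (List<Map<String, Object>>) root.get("inputs");
            Map<String, Object> output = (Map<String, Object>) root.get("output");

            System.out.println("output type: " + output.get("type"));
            System.out.println("input paths: " + inputs.get(0).get("paths"));
        }
    }
}
```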
To run JLogShip:

- Configure your settings in `application.yml`.
- Run the application:

  ```bash
  mvn exec:java -Dexec.mainClass="org.jlogship.App"
  ```
JLogShip includes Docker Compose files for setting up infrastructure:
For the ELK stack:

```bash
cd infra/util
docker-compose -f docker-setup-elk.yaml up -d
```

This starts:

- Elasticsearch on http://localhost:9200
- Kibana on http://localhost:5601
For Kafka:

```bash
cd infra/util
docker-compose -f docker-setup-kafka.yaml up -d
```

This starts:

- Zookeeper
- Kafka broker on `localhost:9092`
- Kafka UI on http://localhost:8080
Run the test suite with:

```bash
mvn test
```

The repository is laid out as follows:

```
JLogShip/
├── concepts.md                        # Core concepts documentation
├── design-doc.md                      # Detailed design documentation
├── logshipengine/                     # Main Maven project
│   ├── pom.xml                        # Maven configuration
│   ├── infra/
│   │   └── util/                      # Docker Compose files
│   ├── src/
│   │   ├── main/
│   │   │   ├── java/org/jlogship/
│   │   │   │   ├── App.java           # Main application entry point
│   │   │   │   ├── config/            # Configuration loading
│   │   │   │   ├── harvester/         # File reading components
│   │   │   │   ├── model/             # Data models
│   │   │   │   ├── output/            # Output plugins
│   │   │   │   ├── pipeline/          # Event processing pipeline
│   │   │   │   ├── prospector/        # File discovery
│   │   │   │   ├── registry/          # State persistence
│   │   │   │   └── util/              # Utilities
│   │   │   └── resources/             # Configuration files
│   │   └── test/                      # Unit tests
│   └── logs/                          # Sample log files
└── README.md                          # This file
```
To build and install locally:

```bash
mvn clean install
```

Key components:

- Harvester: Handles individual file tailing and line reading (see the sketch after this list)
- Prospector: Manages file discovery and harvester lifecycle
- Output Plugins: Pluggable architecture for different destinations
- Registry: JSON-based state store for offset tracking
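To make the Harvester/Registry interaction concrete, here is a hypothetical sketch of offset-based tailing; the method and file handling are illustrative, not the project's actual code:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch of offset-based tailing: resume from the offset the
// Registry persisted, read complete lines, and return the new offset so it
// can be saved again. (readLine() here is byte-oriented; a real harvester
// would also handle charsets and partial last lines.)
public class HarvesterSketch {
    public static long tailFrom(String path, long savedOffset) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            // If the file shrank, it was probably rotated: start from 0 again.
            long start = savedOffset <= file.length() ? savedOffset : 0L;
            file.seek(start);

            String line;
            while ((line = file.readLine()) != null) {
                System.out.println("event: " + line); // would go onto the event queue
            }
            return file.getFilePointer(); // new offset for the Registry to persist
        }
    }
}
```

Persisting the returned offset only after a batch is acknowledged by the output is what gives the at-least-once guarantee: a crash may replay some lines, but never skips them.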
Implement the `Output` interface and register it via `ServiceLoader`:

```java
public interface Output {
    void send(LogEvent event) throws Exception;
    void close() throws Exception;
}
```
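For example, a minimal custom output could look like the following. `StdoutOutput` is illustrative; `LogEvent` is the project's event model, and the `META-INF/services` file name must match the `Output` interface's fully qualified name:

```java
// Illustrative custom output that writes events to stdout.
// To register it with ServiceLoader, list this class's fully qualified name
// in META-INF/services/<fully.qualified.Output> on the classpath.
public class StdoutOutput implements Output {
    @Override
    public void send(LogEvent event) throws Exception {
        System.out.println(event); // replace with real shipping logic
    }

    @Override
    public void close() throws Exception {
        System.out.flush(); // release any resources held by the output
    }
}
```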
To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
Known limitations:

- File Discovery: Files are only discovered at startup; files created or deleted afterwards require a restart.
  - Solution: Implement periodic re-scanning and change detection.
- Backpressure: No backpressure handling for output publishing.
  - Solution: Add a backpressure mechanism in the event queue (see the sketch below).
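As a sketch of the proposed fix, a bounded `BlockingQueue` gives producer-side backpressure for free; the class below is hypothetical, not current JLogShip code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of queue-based backpressure: a bounded queue makes producers
// (harvesters) block when output workers fall behind, instead of letting
// the event backlog grow without limit.
public class BackpressureSketch {
    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(1_000);

    // Harvester side: put() blocks while the queue is full,
    // which naturally throttles file reading.
    public void publish(String event) throws InterruptedException {
        queue.put(event);
    }

    // Output worker side: take() blocks while the queue is empty.
    public String nextEvent() throws InterruptedException {
        return queue.take();
    }
}
```

Because `put()` blocks rather than dropping events, slow outputs throttle harvesting instead of growing an unbounded backlog, which also preserves the at-least-once guarantee.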
This project is licensed under the MIT License - see the LICENSE file for details.
- Inspired by Filebeat architecture
- Built with modern Java features and best practices
- Uses industry-standard libraries for reliability
Happy Logging!
