Merged

docs #38

70 changes: 70 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,70 @@
## Commit Message Guidelines
### Commit Message Format
Each commit message consists of a **header**, a **body** and a **footer**. The header has a special
format that includes a **type**, a **scope** and a **subject**:

```
<type>(<scope>): <subject>
<BLANK LINE>
<body>
```

The **header** is mandatory and the **scope** of the header is optional (read below).

Examples of correct commit messages:

```
docs(changelog): update changelog to beta.5
```
```
fix(release): need to depend on latest rxjs and zone.js

The version in our package.json gets copied to the one we publish, and users need the latest of these.
```



### Type
Must be one of the following:

* **deploy**: Changes that affect deployment of the system or additional services
* **ci**: Changes to our CI configuration files and scripts
* **docs**: Documentation only changes
* **feat**: A new feature
* **fix**: A bug fix
* **perf**: A code change that improves performance
* **refactor**: A code change that neither fixes a bug nor adds a feature
* **style**: Changes that do not affect the meaning of the code (whitespace, formatting, missing semicolons, etc.)
* **test**: Adding missing tests or correcting existing tests

### Scope

The following is the current list of supported scopes (it may be updated later; if none fits, use your own judgment):


* **api**
* **analyzer**
* **servers**
* **dashboard**
* **db**
* **traefik**
* **k8s**
* **docker**


There is currently only one exception to the "use package name" rule:
* none/empty string: useful for `style`, `test` and `refactor` changes that are done across all packages (e.g. `style: add missing semicolons`)

### Subject
The subject contains a succinct description of the change:

* use the imperative, present tense: "change" not "changed" nor "changes"
* don't capitalize the first letter
* no dot (.) at the end
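
The header rules above can be checked mechanically. The following is a minimal Go sketch, not part of this repository: `ValidateHeader` and `headerPattern` are hypothetical names, the allowed types are copied from the **Type** list, and the scope is only checked for shape (lowercase letters, digits, hyphens) because the scope list may change. `revert:` headers are not handled here.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// allowedTypes mirrors the "Type" list in this document.
var allowedTypes = map[string]bool{
	"deploy": true, "ci": true, "docs": true, "feat": true, "fix": true,
	"perf": true, "refactor": true, "style": true, "test": true,
}

// headerPattern matches "<type>(<scope>): <subject>" with an optional scope.
// The allowed scope characters are an assumption made for this sketch.
var headerPattern = regexp.MustCompile(`^([a-z]+)(?:\(([a-z0-9-]+)\))?: (.+)$`)

// ValidateHeader (hypothetical helper) returns nil when the header follows the rules.
func ValidateHeader(header string) error {
	m := headerPattern.FindStringSubmatch(header)
	if m == nil {
		return fmt.Errorf("header must look like <type>(<scope>): <subject>")
	}
	typ, subject := m[1], m[3]
	if !allowedTypes[typ] {
		return fmt.Errorf("unknown type %q", typ)
	}
	// The pattern guarantees a non-empty subject, so indexing is safe.
	if first := []rune(subject)[0]; unicode.IsUpper(first) {
		return fmt.Errorf("subject must not start with a capital letter")
	}
	if strings.HasSuffix(subject, ".") {
		return fmt.Errorf("subject must not end with a dot")
	}
	return nil
}

func main() {
	for _, h := range []string{
		"docs(changelog): update changelog to beta.5",
		"style: add missing semicolons",
		"feat(api): Add endpoint.",
	} {
		fmt.Printf("%q -> %v\n", h, ValidateHeader(h))
	}
}
```

A check like this could run from a `commit-msg` Git hook, but wiring it up is left to the contributor.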

### Body
Just as in the **subject**, use the imperative, present tense: "change" not "changed" nor "changes".
The body should include the motivation for the change and contrast this with previous behavior.

### Revert
If the commit reverts a previous commit, it should begin with `revert: `, followed by the header of the reverted commit. In the body it should say: `This reverts commit <hash>.`, where the hash is the SHA of the commit being reverted.
135 changes: 80 additions & 55 deletions README.md
@@ -1,75 +1,100 @@
# Distributed-Log-Analysis-Framework
MapReduce framework built for performing log analysis over large, distributed system logs (e.g., from web servers, application traces, IoT devices), extracting useful insights (e.g., error rates, frequent access paths, IP usage patterns).
System supports log splitting, distributed mapping, and result reduction.
## Project structure
### [Figma board link](https://www.figma.com/board/4VOwMDVzaCjXlxx79GB2qE/Untitled?t=h3ijaX7ESqqBWqBm-1)
Modern distributed systems generate large amounts of log data from various sources, making
effective analysis essential for monitoring, security, and optimization. This project introduces a
custom distributed log analysis framework based on the MapReduce paradigm, allowing scalable
log processing and aggregation across multiple nodes. Our system extracts key data such as the
most active IP addresses and endpoints using a microservices architecture, consisting of a Java
Spring Boot API gateway, Go-based analyzer, MongoDB, and a reliable messaging pipeline using
RabbitMQ. The framework has been tested on generated logs that resemble real-world traffic.
An interactive dashboard shows the most active IP addresses and endpoints found in the server
logs, giving organizations insight into their distributed systems' data.

## Getting Started

### Prerequisites
- Docker and Docker Compose
- Git

### Step 1: Clone the Repository
```bash
git clone https://github.com/yourusername/Distributed-Log-Analysis-Framework.git
cd Distributed-Log-Analysis-Framework
```

### Step 2: Start the Services
Navigate to the docker directory and start all services:
```bash
cd docker
docker compose up -d --build
```

### Step 3: Verify Services
Check that all containers are running:
```bash
docker ps
```

### Step 4: Access Web Interfaces
### Frontend Web UI
- URL: http://localhost:4173

### Gateway API
- URL: http://localhost:8080

### Mongo-Express (MongoDB Web UI)
- URL: http://localhost:8081
- Login: admin
- Password: admin

### MongoDB
- Port: 27018
- Login: admin
- Password: admin

### RabbitMQ Management UI
- URL: http://localhost:15672
- Login: admin
- Password: admin

### Test Servers
- Server 1: http://localhost:8001
- Server 2: http://localhost:8002
- Server 3: http://localhost:8003
- Metrics endpoint: /metrics

### Consistency Validator
- URL: http://localhost:8090
- Metrics endpoint: /metrics

### Performance Analyzer
- URL: http://localhost:8091
- Metrics endpoint: /metrics

### Prometheus (Monitoring)
- URL: http://localhost:9090

### Grafana (Dashboards)
- URL: http://localhost:3000
- Login: admin
- Password: admin
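
To confirm that the web interfaces listed above respond, a small probe can help. The Go sketch below is an illustration only, not part of the repository: `Unreachable` and `httpProbe` are hypothetical names, the map covers just a few of the URLs above, and any HTTP response (even an error status) is treated as "reachable".

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"time"
)

// services lists a few of the URLs given above; extend it as needed.
var services = map[string]string{
	"frontend":   "http://localhost:4173",
	"gateway":    "http://localhost:8080",
	"rabbitmq":   "http://localhost:15672",
	"prometheus": "http://localhost:9090",
	"grafana":    "http://localhost:3000",
}

// Unreachable returns the sorted names whose probe reports false.
// The probe is injected so the logic can be exercised without a network.
func Unreachable(targets map[string]string, probe func(url string) bool) []string {
	var down []string
	for name, url := range targets {
		if !probe(url) {
			down = append(down, name)
		}
	}
	sort.Strings(down)
	return down
}

// httpProbe reports whether a GET request received any HTTP response within 3 seconds.
func httpProbe(url string) bool {
	client := http.Client{Timeout: 3 * time.Second}
	resp, err := client.Get(url)
	if err != nil {
		return false
	}
	resp.Body.Close()
	return true
}

func main() {
	down := Unreachable(services, httpProbe)
	if len(down) == 0 {
		fmt.Println("all listed services responded")
		return
	}
	fmt.Println("not reachable:", down)
}
```

Some containers need time to boot, so a service reported as unreachable right after startup may simply not be ready yet.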

### Step 5: Stopping the Services
When you're done, you can stop all services with:
```bash
docker compose down -v
```

For more detailed information about specific components, refer to the documentation section below.

## Documentation
- [Docker setup and configuration](docker/README.md)
- [Gateway API documentation](gateway/README.md)
- [Test Servers documentation](test-servers/README.md)
- [Analyzer documentation](analyzer/README.md)
- [Frontend documentation](frontend/README.md)
### [Figma board link](https://www.figma.com/board/4VOwMDVzaCjXlxx79GB2qE/Untitled?t=h3ijaX7ESqqBWqBm-1)
## Access to services

21 changes: 21 additions & 0 deletions analyzer/README.md
@@ -0,0 +1,21 @@
# Analyzer Component - Distributed Log Analysis Framework

MapReduce is an efficient framework for processing large volumes of log data. In the map phase, log files are split and processed in parallel to extract key metrics. During the reduce phase, these results are aggregated. Then this information can be used to generate summaries, detect anomalies, and analyze user behavior.

The Analyzer component is implemented in Go and executes a full MapReduce pipeline for logs. Raw log records are consumed from RabbitMQ and parsed with regular expressions into structured Log objects (the Parsing step). During the Map phase, the client IP address and the requested endpoint are extracted from each Log. The mapped data are then grouped by IP address or endpoint and reduced, and the results are stored in MongoDB.
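
As an illustration of the Parsing step, the sketch below turns one raw line into a structured record with a regular expression. It is not the analyzer's actual code: the `Log` fields, the `ParseLine` name, and the Common Log Format-style input are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Log is a hypothetical structured record; the real analyzer's fields may differ.
type Log struct {
	IP       string
	Method   string
	Endpoint string
	Status   int
}

// linePattern assumes Common Log Format-style lines such as:
// 10.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /api/users HTTP/1.1" 200 512
var linePattern = regexp.MustCompile(`^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+) [^"]*" (\d{3})`)

// ParseLine converts one raw line into a Log, or reports that it did not match.
func ParseLine(line string) (Log, error) {
	m := linePattern.FindStringSubmatch(line)
	if m == nil {
		return Log{}, fmt.Errorf("unrecognized log line: %q", line)
	}
	status, err := strconv.Atoi(m[4])
	if err != nil {
		return Log{}, err
	}
	return Log{IP: m[1], Method: m[2], Endpoint: m[3], Status: status}, nil
}

func main() {
	entry, err := ParseLine(`10.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /api/users HTTP/1.1" 200 512`)
	fmt.Println(entry, err)
}
```

Returning an error for lines that do not match lets the caller decide whether to skip or report malformed records.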

## Features

- Real-time log processing via RabbitMQ message queue
- Flexible aggregation by IP addresses and endpoints
- Scalable architecture with batched processing
- MongoDB integration for persistent storage

## Architecture

The Analyzer implements a MapReduce pattern with the following components:

1. **Parser**: Converts raw log strings into structured log objects
2. **Mapper**: Transforms log entries into key-value pairs for aggregation
3. **Grouper**: Groups mapped outputs by keys
4. **Reducer**: Aggregates grouped outputs into final results
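
The Mapper, Grouper and Reducer stages listed above can be sketched in a few lines of Go. This is a simplified, single-process illustration under stated assumptions, not the component's real implementation: `Entry`, `KV`, `Map`, `Group` and `Reduce` are hypothetical names, and the reduction is a plain count per key.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical parsed log record (the Parser's output).
type Entry struct {
	IP       string
	Endpoint string
}

// KV is one key with an associated count.
type KV struct {
	Key   string
	Value int
}

// Map emits one (key, 1) pair per entry; keyOf selects the IP or the endpoint.
func Map(entries []Entry, keyOf func(Entry) string) []KV {
	out := make([]KV, 0, len(entries))
	for _, e := range entries {
		out = append(out, KV{Key: keyOf(e), Value: 1})
	}
	return out
}

// Group collects all values that share a key.
func Group(pairs []KV) map[string][]int {
	groups := make(map[string][]int)
	for _, p := range pairs {
		groups[p.Key] = append(groups[p.Key], p.Value)
	}
	return groups
}

// Reduce sums each group and returns totals sorted by count (descending), then key.
func Reduce(groups map[string][]int) []KV {
	totals := make([]KV, 0, len(groups))
	for k, vs := range groups {
		sum := 0
		for _, v := range vs {
			sum += v
		}
		totals = append(totals, KV{Key: k, Value: sum})
	}
	sort.Slice(totals, func(i, j int) bool {
		if totals[i].Value != totals[j].Value {
			return totals[i].Value > totals[j].Value
		}
		return totals[i].Key < totals[j].Key
	})
	return totals
}

func main() {
	entries := []Entry{
		{"10.0.0.1", "/api/users"},
		{"10.0.0.2", "/api/users"},
		{"10.0.0.1", "/health"},
	}
	byIP := Reduce(Group(Map(entries, func(e Entry) string { return e.IP })))
	fmt.Println(byIP)
}
```

Passing the key extractor as a function keeps one pipeline usable for both aggregations: swap in `func(e Entry) string { return e.Endpoint }` to count requests per endpoint instead of per IP.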
37 changes: 8 additions & 29 deletions docker/README.md
@@ -1,48 +1,27 @@
# Running all services in Docker

## Running all containers with one command

To launch all services with a single command, run the following command:

```bash
cd docker
docker compose up -d --build
```

This starts all containers in the background.

To stop all containers, use:

```bash
docker compose down -v
```


To verify that all containers are running:

```bash
docker ps
```

## Access to web interfaces

### Mongo-Express (MongoDB Web UI, might take some time to boot)
- URL: http://localhost:8081
- Login: admin
- Password: admin

#### MongoDB container:
- port: 27018
- Login: admin
- Password: admin

### RabbitMQ Management UI
- URL: http://localhost:15672
- Login: admin
- Password: admin

### Frontend Web UI
- URL: http://localhost:4173

## Storing login/password in .env

See `.env.example`.
56 changes: 56 additions & 0 deletions frontend/README.md
@@ -0,0 +1,56 @@
# Frontend - Distributed Log Analysis Framework

## Overview
This frontend application provides a user interface for the Distributed Log Analysis Framework. It allows users to view dashboards, analyze logs, and configure system settings.

## Technology Stack
- **React 19** - UI library
- **TypeScript** - Type-safe JavaScript
- **Vite** - Build tool and development server
- **Redux Toolkit** - State management
- **React Router** - Navigation
- **Ant Design** - UI component library
- **Ant Design Charts** - Data visualization

## Getting Started

### Prerequisites
- Node.js (latest LTS version recommended)
- npm or yarn

### Installation
```bash
# Install dependencies
npm install
```

### Development
```bash
# Start development server
npm run dev
```

### Building for Production
```bash
# Build for production
npm run build

# Preview production build
npm run preview
```

## Project Structure
- **src/components/** - Reusable UI components
- **src/pages/** - Main application views (Dashboard, Logs, Analysis, Settings)
- **src/services/** - API integration and services
- **src/store/** - Redux store configuration
- **src/contexts/** - React context providers
- **src/hooks/** - Custom React hooks
- **src/models/** - TypeScript interfaces and types
- **src/constants/** - Application constants

## Features
- Interactive dashboards for log visualization
- Real-time log viewing and filtering
- Advanced log analysis tools
- User-configurable settings and preferences