This microservices system contains intentional race conditions to demonstrate common concurrency issues in distributed systems. The architecture uses a centralized data-service that both order-service and inventory-service call via HTTP APIs.
```
Order Service (Node.js) ────┐
                            ├── HTTP API ──► Data Service (Node.js) ──► JSON Files
Inventory Service (Python) ─┘
```
## Race Condition Locations
### Stale Write in `updateOrderStatus`
**File:** `data-service/src/server.ts` **Lines:** 192-220
```typescript
async updateOrderStatus(orderId: string, status: string, message?: string): Promise<Order | null> {
  const db = await this.loadOrders(); // Load #1
  const orderIndex = db.orders.findIndex(o => o.id === orderId);
  const freshDb = await this.loadOrders(); // Load #2
  const freshOrder = freshDb.orders.find(o => o.id === orderId);
  const originalOrder = db.orders[orderIndex]; // Use stale data from Load #1
  originalOrder.status = status; // Modify stale object
  await this.saveOrders(db); // Save stale database
  return originalOrder;
}
```
**Race Condition:** Uses original stale database object after loading fresh data, causing lost updates.
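The lost update can be replayed deterministically in memory. The sketch below is illustrative only: `ordersFile`, `loadSnapshot`, `saveSnapshot`, and `replayLostUpdate` are hypothetical stand-ins for the JSON file and the data-service helpers, not code from the repository.

```typescript
interface DemoOrder { id: string; status: string }
interface DemoOrderDatabase { orders: DemoOrder[] }

// Hypothetical in-memory stand-in for data/orders.json.
let ordersFile = JSON.stringify({
  orders: [
    { id: "a", status: "pending" },
    { id: "b", status: "pending" },
  ],
});

const loadSnapshot = (): DemoOrderDatabase => JSON.parse(ordersFile);
const saveSnapshot = (db: DemoOrderDatabase): void => {
  ordersFile = JSON.stringify(db);
};

function setStatus(db: DemoOrderDatabase, id: string, newStatus: string): void {
  const order = db.orders.find(o => o.id === id);
  if (order) order.status = newStatus;
}

// Two writers both load before either one saves.
function replayLostUpdate(): DemoOrderDatabase {
  const writer1 = loadSnapshot();
  const writer2 = loadSnapshot();

  setStatus(writer1, "a", "fulfilled");
  saveSnapshot(writer1); // Order "a" is now fulfilled on disk

  setStatus(writer2, "b", "cancelled");
  saveSnapshot(writer2); // Whole-file save from a stale snapshot reverts order "a"

  return loadSnapshot();
}

const finalOrders = replayLostUpdate();
console.log(finalOrders.orders);
```

Because each save rewrites the whole file, the second writer silently reverts order `a` to `pending` even though it only meant to change order `b`.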
### TOCTOU in `reserveStock`
**File:** `data-service/src/server.ts` **Lines:** 241-260
```typescript
async reserveStock(productName: string, quantity: number): Promise<boolean> {
  const db = await this.loadInventory();
  const item = db.inventory.find(i => i.productName.toLowerCase() === productName.toLowerCase());
  const available = item.quantity - item.reservedQuantity; // Calculate availability
  if (available < quantity) {
    return false;
  }
  // Gap here - another process can reserve stock
  item.reservedQuantity += quantity; // Modify without re-checking availability
  await this.saveInventory(db);
  return true;
}
```
**Race Condition:** Time-of-check-time-of-use (TOCTOU) bug between availability calculation and reservation.
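A minimal in-memory sketch of the same check-then-act shape shows how two concurrent reservations can both succeed. Everything here (`inventoryFile`, `loadInventoryDemo`, `saveInventoryDemo`, `reserveStockUnsafe`, the 3-unit starting stock) is a hypothetical stand-in chosen for illustration, not the repository's code.

```typescript
interface DemoItem { productName: string; quantity: number; reservedQuantity: number }
interface DemoInventory { inventory: DemoItem[] }

// Hypothetical stand-in for data/inventory.json: 3 units, none reserved.
let inventoryFile = JSON.stringify({
  inventory: [{ productName: "Laptop", quantity: 3, reservedQuantity: 0 }],
});

async function loadInventoryDemo(): Promise<DemoInventory> {
  return JSON.parse(inventoryFile);
}

async function saveInventoryDemo(db: DemoInventory): Promise<void> {
  inventoryFile = JSON.stringify(db);
}

// Same shape as the excerpt: check availability, then modify, with awaits in between.
async function reserveStockUnsafe(productName: string, quantity: number): Promise<boolean> {
  const db = await loadInventoryDemo();
  const item = db.inventory.find(i => i.productName === productName);
  if (!item || item.quantity - item.reservedQuantity < quantity) return false;
  item.reservedQuantity += quantity;
  await saveInventoryDemo(db);
  return true;
}

// Two requests for 2 units each race against a stock of 3.
async function demoToctou(): Promise<{ results: boolean[]; reserved: number }> {
  const results = await Promise.all([
    reserveStockUnsafe("Laptop", 2),
    reserveStockUnsafe("Laptop", 2),
  ]);
  const finalDb = await loadInventoryDemo();
  return { results, reserved: finalDb.inventory[0].reservedQuantity };
}
```

In this sketch both calls take their snapshot before either one saves, so both return `true` (4 units promised from a stock of 3) while the stored `reservedQuantity` ends at 2.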
### Non-Atomic File Writes
**File:** `data-service/src/server.ts` **Lines:** 68-71, 84-87
```typescript
private async saveOrders(orders: OrderDatabase): Promise<void> {
  const tempData = JSON.stringify(orders, null, 2);
  await fs.writeFile(this.ordersPath, tempData); // No atomic write or locking
  this.cachedOrders = orders;
}
```
**Race Condition:** Multiple processes can corrupt JSON files during concurrent writes.
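One common way to avoid truncated or interleaved JSON is to write to a temporary file and rename it over the target. The sketch below assumes a POSIX filesystem where `rename` within one directory replaces the target in a single step; `writeJsonAtomic` and `demoAtomicWrite` are hypothetical helpers, not part of the repository.

```typescript
import { promises as fs } from "fs";
import * as os from "os";
import * as path from "path";

let tmpCounter = 0; // Keeps temp names unique within this process

// Readers see either the old file or the new one, never a partial write.
async function writeJsonAtomic(filePath: string, data: unknown): Promise<void> {
  const tmpPath = path.join(
    path.dirname(filePath), // Same directory, so rename stays on one filesystem
    `.${path.basename(filePath)}.${process.pid}.${tmpCounter++}.tmp`
  );
  await fs.writeFile(tmpPath, JSON.stringify(data, null, 2));
  await fs.rename(tmpPath, filePath);
}

// Usage: two concurrent writers, then parse whatever ended up on disk.
async function demoAtomicWrite(): Promise<{ orders: { id: string }[] }> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "orders-"));
  const target = path.join(dir, "orders.json");
  await Promise.all([
    writeJsonAtomic(target, { orders: [{ id: "a" }] }),
    writeJsonAtomic(target, { orders: [{ id: "b" }] }),
  ]);
  return JSON.parse(await fs.readFile(target, "utf8"));
}
```

This addresses file corruption only. The last writer still wins, so lost updates need locking or versioning in addition.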
### Conflict Detected but Ignored
**File:** `order-service/src/api.ts` **Lines:** 238-249
```typescript
async updateOrderStatus(orderId: string, status: string, message?: string): Promise<Order | null> {
  const db = await this.loadDatabase();
  const order = db.orders.find(o => o.id === orderId);
  const freshDb = await this.loadDatabase(); // Load fresh data
  const freshOrder = freshDb.orders.find(o => o.id === orderId);
  if (freshOrder.status !== oldStatus) {
    console.log(`Status changed during processing: ${oldStatus} -> ${freshOrder.status}`);
    // Continue with original order object anyway (race condition)
  }
}
```
**Race Condition:** Detects concurrent changes but continues using stale data.
### Check-Then-Reserve Across API Calls
**File:** `inventory-service/src/main.py` **Lines:** 262-290
```python
def _process_order(self, order_id: str, product_name: str, quantity: int):
    available_check = self.data_client.check_availability(product_name, quantity)  # API Call #1
    if not available_check:
        return False, "Insufficient stock"
    # Race condition window - other orders can pass availability check here
    if not self.data_client.reserve_stock(product_name, quantity):  # API Call #2
        return False, "Failed to reserve stock"
```
**Race Condition:** Multiple orders can pass availability check before any reservation occurs.
## Symptoms
- Negative inventory values - Stock goes below zero
- Lost order status updates - Status reverts to previous state
- JSON file corruption - `Unexpected end of JSON input` errors
- Duplicate order processing - Same order fulfilled multiple times
- Inventory over-allocation - More items reserved than available
### Sample Log Output
```
🔀 Order abc123 status changed during processing: pending -> fulfilled
⚠️ Order def456: Stock changed during processing - Available: -2
{"error":"Failed to create order","message":"Unexpected end of JSON input"}
```
## Reproducing the Race Conditions
### Starting the Services
```bash
# Automated startup with dependency checking
./start-all-terminals.sh
```
```bash
# Must start in this order due to dependencies
cd data-service && npm install && npm run dev &
sleep 15 # Wait for data-service to be healthy
cd order-service && npm install && npm run dev &
sleep 10 # Wait for order-service to connect
cd inventory-service && pip install -r requirements.txt
python src/main.py --start-service &
```
```bash
docker-compose up -d
```
### Triggering the Race
```bash
for i in {1..10}; do
  curl -X POST http://localhost:3001/api/orders \
    -H "Content-Type: application/json" \
    -d '{"productName":"Laptop","quantity":2,"customerId":"test'$i'"}' &
done
```
### Verification
```bash
# Check for negative inventory
curl http://localhost:3002/inventory | jq '.inventory[] | select(.quantity - .reservedQuantity < 0)'
# Check for corrupted files
cat data/orders.json | tail -n 5
cat data/inventory.json | jq .
# Verify service dependencies
curl http://localhost:3002/health # Data service
curl http://localhost:3001/health # Order service
```
## Expected vs Actual Results
### Test Command Analysis
```bash
for i in {1..10}; do curl -X POST http://localhost:3001/api/orders \
-H "Content-Type: application/json" \
-d '{"productName":"Laptop","quantity":2,"customerId":"test'$i'"}' & done
```
**What Should Happen (Expected):**
- Initial Laptop inventory: 25 units
- 10 concurrent orders × 2 laptops each = 20 laptops requested
- Expected final inventory: 25 - 20 = 5 laptops available
- All 10 orders should succeed
- JSON files should remain valid and uncorrupted
**What Actually Happens (Actual):**
- ❌ **File Corruption**: `{"error":"Failed to create order","message":"Unexpected end of JSON input"}`
- ❌ **Over-allocation**: Some orders may succeed even when inventory goes negative
- ❌ **Inconsistent State**: Final inventory might show impossible values like -3 available
- ❌ **Lost Orders**: Some orders get lost due to database corruption
- ❌ **System Errors**: Services crash or become unresponsive
**Race Condition Evidence:**
```bash
# Typical corrupted output
curl http://localhost:3002/inventory/Laptop
# Returns: {"error":"Failed to fetch inventory item","message":"Unexpected end of JSON input"}
# Or shows negative inventory
# {"item":{"quantity":25,"reservedQuantity":28}} // -3 available!
```
**Why This Happens:**
1. Multiple processes write to the same JSON file simultaneously
2. Availability checks pass before any reservations are made
3. File gets corrupted during concurrent write operations
4. Lost updates occur when stale data overwrites fresh data
## Fixes
### 1. Add File Locking
**File:** `data-service/src/server.ts`
```typescript
import fs from 'fs/promises';
import lockfile from 'proper-lockfile';
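// Method of the data-service class (excerpt)
private async saveOrders(orders: OrderDatabase): Promise<void> {
  // Without `retries`, lock() rejects immediately with ELOCKED when another
  // writer holds the lock. The retry values here are illustrative.
  const release = await lockfile.lock(this.ordersPath, {
    retries: { retries: 10, minTimeout: 50 },
  });
  try {
    const tempData = JSON.stringify(orders, null, 2);
    await fs.writeFile(this.ordersPath, tempData);
    this.cachedOrders = orders;
  } finally {
    await release();
  }
}
// Note: this serializes writes, which prevents file corruption. Lost updates
// also need the read to happen under the same lock (see fix 3).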
```
### 2. Implement Optimistic Locking
**File:** `data-service/src/server.ts`
```typescript
interface Order {
id: string;
version: number; // Add version field
// ... other fields
}
async updateOrderStatus(orderId: string, status: string, version: number): Promise<Order | null> {
  // The load, version check, and save must run under one lock (see fix 3);
  // otherwise two writers can both pass the version check.
  const db = await this.loadOrders();
  const order = db.orders.find(o => o.id === orderId);
  if (!order) {
    return null; // Unknown order ID
  }
  if (order.version !== version) {
    throw new Error('Order was modified by another process');
  }
  order.status = status;
  order.version++; // Increment version
  await this.saveOrders(db);
  return order;
}
```
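On the caller's side, a version conflict usually means re-reading the order and trying again. The following is a minimal in-memory sketch of that retry loop; `VersionedOrderStore`, `UpdateResult`, and `updateStatusWithRetry` are hypothetical names, and it returns a result object instead of throwing so the conflict case is explicit.

```typescript
interface VersionedOrder { id: string; status: string; version: number }

type UpdateResult =
  | { ok: true; order: VersionedOrder }
  | { ok: false; reason: "not_found" | "conflict" };

// Hypothetical in-memory store standing in for data-service.
class VersionedOrderStore {
  private orders = new Map<string, VersionedOrder>();

  put(order: VersionedOrder): void {
    this.orders.set(order.id, { ...order });
  }

  get(id: string): VersionedOrder | undefined {
    const order = this.orders.get(id);
    return order ? { ...order } : undefined; // Copy so callers hold a snapshot
  }

  // Applies the update only if the caller saw the latest version.
  updateStatus(id: string, status: string, expectedVersion: number): UpdateResult {
    const current = this.orders.get(id);
    if (!current) return { ok: false, reason: "not_found" };
    if (current.version !== expectedVersion) return { ok: false, reason: "conflict" };
    const next = { ...current, status, version: current.version + 1 };
    this.orders.set(id, next);
    return { ok: true, order: { ...next } };
  }
}

// Re-read and retry when another writer got in first.
function updateStatusWithRetry(
  store: VersionedOrderStore,
  id: string,
  status: string,
  maxAttempts = 3
): UpdateResult {
  let result: UpdateResult = { ok: false, reason: "not_found" };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = store.get(id);
    if (!snapshot) return { ok: false, reason: "not_found" };
    result = store.updateStatus(id, status, snapshot.version);
    if (result.ok || result.reason !== "conflict") return result;
  }
  return result; // Still conflicting after maxAttempts
}
```

A real caller would normally re-validate the status transition against the fresh snapshot before retrying, rather than re-applying the same status unconditionally.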
### 3. Atomic Inventory Operations
**File:** `data-service/src/server.ts`
```typescript
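async reserveStock(productName: string, quantity: number): Promise<boolean> {
  // Retry while another writer holds the lock; without `retries`, lock()
  // rejects immediately with ELOCKED. The retry values are illustrative.
  const release = await lockfile.lock(this.inventoryPath, {
    retries: { retries: 10, minTimeout: 50 },
  });
  try {
    const db = await this.loadInventory(); // Fresh read under lock
    const item = db.inventory.find(i => i.productName.toLowerCase() === productName.toLowerCase());
    if (!item) {
      return false; // Unknown product
    }
    const available = item.quantity - item.reservedQuantity;
    if (available < quantity) {
      return false;
    }
    item.reservedQuantity += quantity;
    // saveInventory must not try to take this same lock again (it is not reentrant)
    await this.saveInventory(db); // Save under same lock
    return true;
  } finally {
    await release();
  }
}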
```
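If data-service runs as a single Node.js process, the same read-check-write sequence can also be serialized in memory without a lock file. The sketch below is one possible shape under that assumption; `AsyncMutex`, `stockFile`, and `reserveStockSerialized` are hypothetical names, and the in-memory string stands in for `inventory.json`.

```typescript
// Promise-chain mutex: each task starts only after the previous one settles.
class AsyncMutex {
  private tail: Promise<void> = Promise.resolve();

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = this.tail.then(task);
    // Swallow the outcome so one failed task does not block later ones.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

interface StockItem { productName: string; quantity: number; reservedQuantity: number }

// Hypothetical in-memory stand-in for data/inventory.json: 3 units, none reserved.
let stockFile = JSON.stringify([{ productName: "Laptop", quantity: 3, reservedQuantity: 0 }]);
const inventoryMutex = new AsyncMutex();

async function reserveStockSerialized(productName: string, quantity: number): Promise<boolean> {
  return inventoryMutex.runExclusive(async () => {
    const items: StockItem[] = JSON.parse(stockFile); // Read under the mutex
    const item = items.find(i => i.productName === productName);
    if (!item || item.quantity - item.reservedQuantity < quantity) return false;
    item.reservedQuantity += quantity;
    await Promise.resolve(); // Simulated I/O pause; other guarded tasks stay queued
    stockFile = JSON.stringify(items); // Write under the same mutex
    return true;
  });
}

async function demoSerializedReserve(): Promise<{ results: boolean[]; reserved: number }> {
  const results = await Promise.all([
    reserveStockSerialized("Laptop", 2),
    reserveStockSerialized("Laptop", 2),
  ]);
  const items: StockItem[] = JSON.parse(stockFile);
  return { results, reserved: items[0].reservedQuantity };
}
```

With 3 units in stock, the first request succeeds and the second is rejected. This only protects callers inside one process; several data-service processes sharing the file would still need the file lock shown above.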
### 4. Add Database Transactions (Alternative)
Replace JSON files with SQLite for ACID properties:
```typescript
import Database from 'better-sqlite3';
const db = new Database('data.db');
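async function reserveStock(productName: string, quantity: number): Promise<boolean> {
  // better-sqlite3 runs synchronously, so the check and the update
  // inside the transaction cannot interleave with another request.
  return db.transaction(() => {
    const row = db
      .prepare('SELECT quantity, reserved FROM inventory WHERE name = ?')
      .get(productName) as { quantity: number; reserved: number } | undefined;
    if (!row || row.quantity - row.reserved < quantity) {
      return false; // Unknown product or insufficient stock
    }
    db.prepare('UPDATE inventory SET reserved = reserved + ? WHERE name = ?').run(quantity, productName);
    return true;
  })();
}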
```
### 5. Add Message Deduplication
**File:** `order-service/src/messaging.ts`
```typescript
import Redis from 'ioredis';
private redis = new Redis();
async publishOrderCreated(event: OrderCreatedEvent): Promise<void> {
const messageKey = `order:${event.orderId}:created`;
const lockKey = `lock:${messageKey}`;
const acquired = await this.redis.set(lockKey, '1', 'EX', 10, 'NX');
if (!acquired) {
console.log('Message already being processed');
return;
}
try {
const alreadyProcessed = await this.redis.get(messageKey);
if (alreadyProcessed) {
return;
}
// Publish message
await this.publishMessage(event);
// Mark as processed
await this.redis.setex(messageKey, 300, '1');
} finally {
await this.redis.del(lockKey);
}
}
```
## Testing Fixes
### Load Testing
```bash
# Install artillery for load testing
npm install -g artillery
# Create artillery config
# Quote 'EOF' so the shell does not expand $randomNumber inside the heredoc
cat > load-test.yml << 'EOF'
config:
  target: 'http://localhost:3001'
  phases:
    - duration: 60
      arrivalRate: 10
scenarios:
  - name: "Create Orders"
    flow:
      - post:
          url: "/api/orders"
          json:
            productName: "Laptop"
            quantity: 1
            customerId: "load-test-{{ $randomNumber(1, 100000) }}"
EOF
# Run load test
artillery run load-test.yml
```
### Verification Scripts
```bash
#!/bin/bash
# verify-consistency.sh
echo "Checking inventory consistency..."
TOTAL_ORDERS=$(curl -s http://localhost:3002/orders | jq '.orders | length')
LAPTOP_ORDERS=$(curl -s http://localhost:3002/orders | jq '.orders[] | select(.items[].product.name=="Laptop") | .items[].quantity' | paste -sd+ - | bc)
LAPTOP_STOCK=$(curl -s http://localhost:3002/inventory/Laptop | jq '.item.quantity')
LAPTOP_RESERVED=$(curl -s http://localhost:3002/inventory/Laptop | jq '.item.reservedQuantity')
echo "Total Orders: $TOTAL_ORDERS"
echo "Laptop Orders: $LAPTOP_ORDERS"
echo "Laptop Stock: $LAPTOP_STOCK"
echo "Laptop Reserved: $LAPTOP_RESERVED"
echo "Expected Available: $((25 - LAPTOP_ORDERS))"
echo "Actual Available: $((LAPTOP_STOCK - LAPTOP_RESERVED))"
if [ $((LAPTOP_STOCK - LAPTOP_RESERVED)) -lt 0 ]; then
echo "❌ RACE CONDITION DETECTED: Negative inventory!"
else
echo "✅ Inventory consistency maintained"
fi
```
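The same consistency check can be expressed as a small function for use in automated tests. This is a sketch that assumes inventory records carry `productName`, `quantity`, and `reservedQuantity` as in the responses above; `InventoryRecord` and `findInventoryViolations` are hypothetical names.

```typescript
interface InventoryRecord {
  productName: string;
  quantity: number;
  reservedQuantity: number;
}

// Returns one message per violated invariant; an empty array means the data is consistent.
function findInventoryViolations(items: InventoryRecord[]): string[] {
  const violations: string[] = [];
  for (const item of items) {
    if (item.quantity < 0) {
      violations.push(`${item.productName}: negative quantity ${item.quantity}`);
    }
    if (item.reservedQuantity < 0) {
      violations.push(`${item.productName}: negative reservedQuantity ${item.reservedQuantity}`);
    }
    const available = item.quantity - item.reservedQuantity;
    if (available < 0) {
      violations.push(`${item.productName}: over-allocated by ${-available}`);
    }
  }
  return violations;
}

// The over-allocated record from the evidence section yields exactly one violation.
const overAllocated = findInventoryViolations([
  { productName: "Laptop", quantity: 25, reservedQuantity: 28 },
]);
console.log(overAllocated);
```

Returning messages rather than a boolean makes it easier to see which product broke which invariant after a load test.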
## Prevention Strategies
### 1. Design Patterns
- **Pessimistic Locking** - Lock resources before access
- **Optimistic Locking** - Version-based conflict detection
- **Event Sourcing** - Immutable event log as source of truth
- **CQRS** - Separate read/write models with eventual consistency
### 2. Infrastructure Solutions
- **Database with ACID properties** - PostgreSQL, MySQL
- **Distributed locks** - Redis, Zookeeper, etcd
- **Message queues with ordering** - Apache Kafka partitions
- **Distributed consensus** - Raft, Paxos algorithms
### 3. Testing Approaches
- **Stress testing** - High concurrency load
- **Chaos engineering** - Introduce random failures
- **Property-based testing** - Verify invariants hold
- **Formal verification** - Model checking tools
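
As an illustration of the property-based approach, the hand-rolled sketch below feeds random reservation sequences to a reserve function and checks the invariant `0 <= reservedQuantity <= quantity` after every step. All names (`Stock`, `reserveChecked`, `reserveUnchecked`, `makeRng`, `invariantHolds`) are hypothetical; a dedicated library such as fast-check would add input shrinking and richer generators.

```typescript
interface Stock { quantity: number; reservedQuantity: number }
type Reserve = (stock: Stock, requested: number) => boolean;

// Correct check-and-reserve performed as one synchronous step.
const reserveChecked: Reserve = (stock, requested) => {
  if (stock.quantity - stock.reservedQuantity < requested) return false;
  stock.reservedQuantity += requested;
  return true;
};

// Deliberately broken variant that skips the availability check.
const reserveUnchecked: Reserve = (stock, requested) => {
  stock.reservedQuantity += requested;
  return true;
};

// Small deterministic pseudo-random generator so failures are reproducible.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // Value in [0, 1)
  };
}

// Property: after any sequence of requests, 0 <= reservedQuantity <= quantity.
function invariantHolds(reserve: Reserve, seed: number, steps = 50): boolean {
  const rng = makeRng(seed);
  const stock: Stock = { quantity: 10, reservedQuantity: 0 };
  for (let i = 0; i < steps; i++) {
    const requested = 1 + Math.floor(rng() * 5); // 1..5 units
    reserve(stock, requested);
    if (stock.reservedQuantity < 0 || stock.reservedQuantity > stock.quantity) {
      return false;
    }
  }
  return true;
}
```

Running `invariantHolds(reserveChecked, seed)` over many seeds should always pass, while the unchecked variant fails as soon as the requests exceed the 10 units in stock.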
## Service Dependencies
### Startup Order
Services must start in a specific order due to dependencies:
1. **Data Service** (Port 3002) - Core data layer, no dependencies
2. **Order Service** (Port 3001) - Depends on data-service
3. **Inventory Service** - Depends on data-service and RabbitMQ
### Health Check Implementation
**File:** `start-all-terminals.sh` **Lines:** 73-97
```bash
check_service_health() {
local service_name="$1"
local health_url="$2"
local max_attempts=10
local attempt=1
while [ $attempt -le $max_attempts ]; do
if curl -s -f "$health_url" >/dev/null 2>&1; then
echo "✅ $service_name is healthy"
return 0
fi
echo "⏳ Attempt $attempt/$max_attempts: waiting 10 seconds..."
sleep 10
attempt=$((attempt + 1))
done
echo "❌ $service_name failed after $max_attempts attempts"
return 1
}
```
### Connection Polling
**Order Service:** `order-service/src/api.ts` **Lines:** 286-316
**Inventory Service:** `inventory-service/src/main.py` **Lines:** 110-129
Both services poll their dependencies before starting, making up to 10 attempts at 10-second intervals.
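The polling behavior described above might look roughly like the following in TypeScript. This is a sketch, not the code from `api.ts`: `HealthProbe`, `sleep`, and `waitForHealthy` are hypothetical names, and the probe is injected so the loop can be tested without a running service.

```typescript
type HealthProbe = () => Promise<boolean>;

const sleep = (ms: number): Promise<void> =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Resolves to true once the probe reports healthy, false after maxAttempts failures.
async function waitForHealthy(
  probe: HealthProbe,
  maxAttempts = 10,
  intervalMs = 10000
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      if (await probe()) return true;
    } catch {
      // Treat connection errors the same as an unhealthy response
    }
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  return false;
}

// Possible usage, assuming a runtime with a global fetch (Node.js 18+):
// const ready = await waitForHealthy(
//   async () => (await fetch("http://localhost:3002/health")).ok
// );
```

Injecting the probe keeps the retry policy separate from the transport, so the same loop could wrap an HTTP health check or a RabbitMQ connection attempt.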
## Code Quality Fixes Applied
### TypeScript Configuration Issues
**File:** `data-service/tsconfig.json`
**Fix:** Added DOM library and Node types for proper TypeScript compilation
```json
{
  "compilerOptions": {
    "lib": ["ES2020", "DOM"],
    "types": ["node"],
    "allowSyntheticDefaultImports": true
  }
}
```
### Import and Dependency Issues
**Files:** Multiple service files
**Fixes Applied:**
- Added `@types/axios` to order-service dependencies
- Added `axios` import with proper typing
- Added `requests==2.31.0` to inventory-service requirements
- Fixed middleware typing in data-service
### Python Type Issues
**File:** `inventory-service/src/main.py`
**Fixes:**
- Removed unused imports (`os`, `typing.Any`)
- Added `# type: ignore` for pika imports (external library)
- Fixed requests session timeout configuration
- Added null checks for RabbitMQ channel operations
### Error Handling Improvements
**File:** `order-service/src/api.ts`
**Fixes:**
- Improved error type handling with proper type guards
- Removed unused database file operations
- Added proper AxiosError handling
### RabbitMQ Channel Safety
**File:** `inventory-service/src/main.py` **Lines:** 200-231
**Fix:** Added null checks before channel operations
```python
if self.channel:
self.channel.queue_declare(...)
```
### Database Cleanup
**File:** `order-service/src/api.ts` **Lines:** 70-73
**Fix:** Removed unused database initialization logic since data-service handles persistence
```typescript
async initializeDatabase(): Promise<void> {
// Database initialization now handled by data-service
console.log("📁 Database operations delegated to data-service");
}
```
## Diagnostic Status
All major warnings and errors have been resolved:
- ✅ Python import warnings fixed
- ✅ TypeScript compilation errors resolved
- ✅ Missing type declarations added
- ✅ Unused imports removed
- ✅ Null reference checks added
- ✅ Error handling improved
## Issue Description (Non-Technical)
**Problem Summary**: The ordering system sometimes breaks when many customers try to buy the same item at the exact same time.
**What Happens**:
- Multiple customers click "buy" for the same product simultaneously
- The system checks if there's enough inventory for each customer individually
- But between checking and actually reserving the items, other customers can also pass the same check
- Result: More items get sold than are actually available, or the system crashes with corrupted data
**Real-World Analogy**:
Imagine a movie theater with 1 remaining seat. Three people approach different ticket counters at the same time. Each clerk checks and sees "1 seat available" and sells a ticket. Now you have 3 tickets sold for 1 seat.
**Symptoms Users Experience**:
- "Out of stock" errors appearing inconsistently
- Orders getting stuck in "processing" forever
- System displaying error messages about corrupted data
- Inventory showing impossible values (like -5 items in stock)
**Business Impact**:
- Lost sales due to system errors
- Customer frustration from failed orders
- Manual work required to fix corrupted data
- Overselling leading to unfulfillable orders
## Further Reading
- [Distributed Systems Concepts](https://en.wikipedia.org/wiki/Distributed_computing)
- [ACID Properties](https://en.wikipedia.org/wiki/ACID)
- [CAP Theorem](https://en.wikipedia.org/wiki/CAP_theorem)
- [Two-Phase Commit](https://en.wikipedia.org/wiki/Two-phase_commit_protocol)
- [Eventual Consistency](https://en.wikipedia.org/wiki/Eventual_consistency)