In a production environment, downtime equals lost revenue. This project demonstrates a Self-Healing Infrastructure pattern using a Bash-based Watchdog.
The system monitors Docker containers and automatically performs a "Cold Boot" recovery when a service failure is detected, with no manual intervention required.
The auto_heal.sh script acts as a lightweight orchestration layer:
- Heartbeat Monitoring: Polls the Docker daemon at a regular interval for container status.
- Failure Detection: Flags containers in the "Exited" or "Dead" state on the next poll.
- Automated Recovery: Force-removes stale containers and redeploys a fresh instance from the specified image.
- Audit Logging: Maintains a health_check.log for SRE (Site Reliability Engineering) review.
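
The four responsibilities above can be sketched as a small set of shell functions. This is a minimal illustration, not the actual contents of auto_heal.sh: the container name, image, poll interval, log line format, the `--watch` flag, and the choice to treat a missing container as a failure are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch; every default below is an assumption.
CONTAINER="${CONTAINER:-web}"             # assumed container name
IMAGE="${IMAGE:-nginx:latest}"            # assumed image to redeploy
LOG_FILE="${LOG_FILE:-health_check.log}"  # audit log named in this README
INTERVAL="${INTERVAL:-10}"                # assumed seconds between polls

# Audit logging: append a timestamped line to the log file.
log_event() {
  printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1" >> "$LOG_FILE"
}

# Heartbeat: print the container state, or "missing" if it cannot be inspected.
container_state() {
  local state
  if ! state="$(docker inspect --format '{{.State.Status}}' "$CONTAINER" 2>/dev/null)"; then
    state=""
  fi
  echo "${state:-missing}"
}

# Failure detection: succeed (exit 0) when the state calls for recovery.
# Treating "missing" as a failure is an assumption of this sketch.
needs_heal() {
  case "$1" in
    exited|dead|missing) return 0 ;;
    *) return 1 ;;
  esac
}

# Automated recovery: force-remove the stale container, start a fresh one.
heal() {
  log_event "RECOVERY: $CONTAINER state=$1, redeploying from $IMAGE"
  docker rm -f "$CONTAINER" >/dev/null 2>&1
  docker run -d --name "$CONTAINER" "$IMAGE" >/dev/null
}

# One poll cycle: inspect, then either heal or record a healthy heartbeat.
check_once() {
  local state
  state="$(container_state)"
  if needs_heal "$state"; then
    heal "$state"
  else
    log_event "OK: $CONTAINER state=$state"
  fi
}

# Loop only when asked to, so the functions can be sourced and tested alone.
if [ "${1:-}" = "--watch" ]; then
  while true; do
    check_once
    sleep "$INTERVAL"
  done
fi
```

Saved under a hypothetical name such as watchdog_sketch.sh, it could be started with `CONTAINER=web IMAGE=nginx:latest bash watchdog_sketch.sh --watch`. Because detection depends on polling, the worst-case delay before recovery starts is roughly one `INTERVAL`.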
- auto_heal.sh - The core automation logic.
- health_check.log - Persistent log of system uptime and recovery events.
- README.md - Documentation.
- Clone the repository:
git clone https://github.com/Deola-max/Deola-max.git && cd Deola-max