
fix: self-healing patch for PodCrash in crash-test-55b8b665f4-hjqjs #7

Closed
aftabkh4n wants to merge 1 commit into main from fix/self-healing-PodCrash-20260420061948

@aftabkh4n
Owner

Self-Healing Patch

This PR was opened automatically by the self-healing-k8s system.

Failure detected

  • Pod: crash-test-55b8b665f4-hjqjs
  • Namespace: default
  • Type: PodCrash
  • Detected at: 2026-04-20 06:19:42 UTC

Root cause

The application failed to connect to the database at postgres://db:5432 during startup, and the pod exited with code 1 because the application cannot proceed without database connectivity. Likely causes: (1) the PostgreSQL service at 'db:5432' is unreachable or not running, (2) the 'db' hostname does not resolve correctly, (3) a network policy is blocking the connection, or (4) the PostgreSQL credentials are invalid. The log shows 'ERROR: Database connection failed' followed by 'FATAL: Cannot connect to postgres://db:5432', indicating that the application terminated immediately after the failed connection attempt.

Severity

Critical

Suggested fix

  1. Verify the PostgreSQL pod is running: kubectl get pods -n default | grep postgres or kubectl get pods -n default | grep db.
  2. Check that the PostgreSQL service exists and is accessible: kubectl get svc -n default.
  3. Test connectivity from the crash-test pod to the database: kubectl exec -it crash-test-55b8b665f4-hjqjs -- sh, then run 'nc -zv db 5432' or 'telnet db 5432'.
  4. Verify that the database connection string, credentials, and port are correct in your application configuration.
  5. Check for NetworkPolicies that might be blocking traffic: kubectl get networkpolicies -n default.
  6. Review the PostgreSQL pod logs for authentication or startup errors: kubectl logs <postgres-pod-name>.
  7. Ensure the 'db' hostname resolves correctly through Kubernetes DNS: kubectl exec -it crash-test-55b8b665f4-hjqjs -- nslookup db or getent hosts db.
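
The checks above can be collected into a single pass. The following is a minimal sketch, not part of the generated patch: the function name diagnose_db and the KUBECTL override variable are hypothetical, and it assumes the container image ships nslookup and nc. Note that kubectl exec only succeeds while the container is running, so against a crash-looping pod the last two checks may fail for that reason alone.

```shell
#!/usr/bin/env bash
# Sketch: run the connectivity checks from the suggested fix in one pass.
# KUBECTL is a hypothetical override (e.g. a different binary or a stub).
KUBECTL="${KUBECTL:-kubectl}"

# Usage: diagnose_db <pod> [namespace] [db-host] [db-port]
diagnose_db() {
  local pod="$1" ns="${2:-default}" host="${3:-db}" port="${4:-5432}"

  echo "== Pods in ${ns} matching postgres or db =="
  $KUBECTL get pods -n "$ns" | grep -E 'postgres|db' || echo "no matching pods"

  echo "== Services in ${ns} =="
  $KUBECTL get svc -n "$ns"

  echo "== NetworkPolicies in ${ns} =="
  $KUBECTL get networkpolicies -n "$ns"

  # The next two checks run inside the application pod.
  echo "== DNS lookup of ${host} from ${pod} =="
  $KUBECTL exec -n "$ns" "$pod" -- nslookup "$host" || echo "DNS lookup failed"

  echo "== TCP check ${host}:${port} from ${pod} =="
  $KUBECTL exec -n "$ns" "$pod" -- nc -zv "$host" "$port" || echo "TCP check failed"
}
```

For this incident the call would be diagnose_db crash-test-55b8b665f4-hjqjs default, which falls back to the 'db' host and port 5432 named in the log.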

Proposed code change

Add connection retry logic to your .NET application startup:

using System;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

public class Program
{
    // Assumes CreateHostBuilder and YourDbContext are already defined in the application.
    public static async Task Main(string[] args)
    {
        var host = CreateHostBuilder(args).Build();

        // Retry the database connection, waiting 2s, 4s, 6s, then 8s between attempts.
        const int maxRetries = 5;
        var retryCount = 0;

        while (retryCount < maxRetries)
        {
            try
            {
                using (var scope = host.Services.CreateScope())
                {
                    var dbContext = scope.ServiceProvider.GetRequiredService<YourDbContext>();
                    await dbContext.Database.OpenConnectionAsync();
                    await dbContext.Database.CloseConnectionAsync();
                    break; // Connection successful
                }
            }
            catch (Exception ex)
            {
                retryCount++;
                Console.WriteLine($"ERROR: Database connection failed (attempt {retryCount}/{maxRetries}): {ex.Message}");
                if (retryCount >= maxRetries)
                {
                    // Out of attempts: log the target and exit with code 1.
                    Console.WriteLine($"FATAL: Cannot connect to postgres://{Environment.GetEnvironmentVariable("DB_HOST")}:{Environment.GetEnvironmentVariable("DB_PORT")}");
                    Environment.Exit(1);
                }
                await Task.Delay(TimeSpan.FromSeconds(2 * retryCount));
            }
        }

        await host.RunAsync();
    }
}

Also add to your Kubernetes deployment:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
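
Before relying on these probes, it can help to confirm that both endpoints actually answer. The sketch below is illustrative only: probe_status, check_probes, and the CURL override variable are hypothetical names, and it assumes the application serves /health and /ready on port 8080 as the probe configuration implies. Like the kubelet's httpGet probe, it treats any status from 200 to 399 as success.

```shell
#!/usr/bin/env bash
# Sketch: confirm the probe endpoints respond before relying on them.
# CURL is a hypothetical override (e.g. a different binary or a stub).
CURL="${CURL:-curl}"

# Print the HTTP status code for a URL ("000" when curl gets no response).
probe_status() {
  $CURL -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Check both probe paths against a base URL; return non-zero if either fails.
check_probes() {
  local base="$1" rc=0 p code
  for p in /health /ready; do
    code="$(probe_status "${base}${p}")"
    echo "${p} -> ${code}"
    case "$code" in
      2??|3??) ;;      # 200-399 counts as success for httpGet probes
      *) rc=1 ;;
    esac
  done
  return "$rc"
}
```

One way to use it, assuming the Deployment is named crash-test (inferred from the pod name, not confirmed by the PR): run kubectl port-forward deployment/crash-test 8080:8080 -n default in one terminal, then check_probes http://localhost:8080 in another.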


Generated by self-healing-k8s

aftabkh4n closed this on Apr 21, 2026