🏥 CI Failure: Squid container intermittent startup failure in debugging.sh test #505

## Summary

The `debugging.sh` example test in the Examples Test workflow experienced an intermittent Squid container startup failure. The container exited with code 1 during healthcheck, causing the test to fail.

## Failure Details

- **Workflow Run:** [21680840422](https://github.com/github/gh-aw-firewall/actions/runs/21680840422)
- **Commit:** 769a6f5 (PR #432 merge - "feat: filter benign operational logs from Squid access.log")
- **Test:** debugging.sh (3rd example test)
- **Error:** `dependency failed to start: container awf-squid exited (1)`

## Timeline

```
2026-02-04T17:07:10.6561324Z  Container awf-squid  Creating
2026-02-04T17:07:10.6744945Z  Container awf-squid  Created
2026-02-04T17:07:10.6857834Z  Container awf-squid  Starting
2026-02-04T17:07:10.8238120Z  Container awf-squid  Started
2026-02-04T17:07:10.8240265Z  Container awf-squid  Waiting
2026-02-04T17:07:11.3254835Z  Container awf-squid  Error
```

The container exited roughly 500 ms after Docker reported it as started, while Compose was still waiting for its healthcheck to pass.
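The ~500 ms figure can be checked against the log timestamps. The sketch below is only an illustration: the helper names `to_seconds` and `elapsed_ms` are hypothetical, and it assumes both timestamps fall on the same day.

```bash
#!/usr/bin/env bash
# Sketch: measure the gap between two timestamps from the timeline above.
# Assumes both timestamps are on the same day; helper names are hypothetical.

# Convert the time-of-day part of e.g. 2026-02-04T17:07:10.8238120Z
# into seconds since midnight.
to_seconds() {
  local tod="${1#*T}"   # drop the date part -> 17:07:10.8238120Z
  tod="${tod%Z}"        # drop the trailing Z -> 17:07:10.8238120
  awk -v t="$tod" 'BEGIN { split(t, p, ":"); printf "%.7f\n", p[1]*3600 + p[2]*60 + p[3] }'
}

# Print the elapsed time between two timestamps, rounded to whole milliseconds.
elapsed_ms() {
  local start end
  start="$(to_seconds "$1")"
  end="$(to_seconds "$2")"
  awk -v s="$start" -v e="$end" 'BEGIN { printf "%.0f\n", (e - s) * 1000 }'
}

# "Started" to "Error" from the timeline:
elapsed_ms 2026-02-04T17:07:10.8238120Z 2026-02-04T17:07:11.3254835Z   # prints 502
```

Measured from `Creating` instead of `Started`, the same calculation gives about 669 ms, so the whole lifecycle of the container fit in well under a second.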

## Evidence of Intermittent Nature

1. **Previous tests succeeded:** basic-curl.sh and using-domains-file.sh passed with identical Squid config
2. **Recent history clean:** Last 4 workflow runs on main branch succeeded
3. **Fast failure:** Container exited in 500ms, suggesting transient issue rather than config error

## Changes in PR #432

Added ACL to filter localhost healthcheck logs:

```squid
# Don't log healthcheck probes from localhost
acl healthcheck_localhost src 127.0.0.1 ::1
log_access deny healthcheck_localhost
```

This change was present in the failing run, but the earlier tests in the same run passed with the same configuration, so a syntax error in the new ACL is unlikely.
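The syntax-error hypothesis can also be tested directly instead of by inference. Squid's `-k parse` option parses a configuration file and reports problems without starting the proxy. The following is a minimal sketch; the function name `check_squid_config` and the default path `/etc/squid/squid.conf` are assumptions, since the issue does not say where the generated config lives.

```bash
#!/usr/bin/env bash
# Sketch: ask Squid itself whether a config file parses.
# check_squid_config and the default path are assumptions, not taken
# from the repository.

check_squid_config() {
  local conf="${1:-/etc/squid/squid.conf}"
  # -k parse: parse the config and exit; -f: which config file to read.
  if squid -k parse -f "$conf"; then
    echo "squid.conf parses cleanly: ${conf}"
  else
    echo "squid.conf failed to parse: ${conf}" >&2
    return 1
  fi
}
```

Squid's own output is left visible on purpose, because warnings printed during a successful parse can still be relevant to a startup failure.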

## Potential Root Causes

### 1. Docker Resource Exhaustion (Most Likely)
- Third test in sequence
- Possible race condition in container cleanup
- Pre-test cleanup may not always complete before next test starts

### 2. Squid Healthcheck Timing
- Container starts but fails healthcheck probe
- Possible network initialization delay
- Docker Compose healthcheck may be too aggressive

### 3. Image Pull/Caching Issue
- Using GHCR images (`ghcr.io/github/gh-aw-firewall/squid:latest`)
- Possible intermittent registry connectivity issue
- Image layer corruption (unlikely but possible)

## Recommendations

### Short-term
- Rerun failed workflows (likely to succeed)
- Monitor for recurrence pattern

### Medium-term
- Add retry logic to container startup
- Increase healthcheck interval/timeout
- Add explicit wait between example tests
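As a rough illustration of the retry idea, here is a minimal sketch of a generic retry wrapper with a fixed delay. The `retry` function is hypothetical and not part of the repository, and the command shown in the usage comment is an assumption about how the example scripts start their containers.

```bash
#!/usr/bin/env bash
# Sketch: retry a command a fixed number of times with a fixed delay.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# The retry helper is hypothetical; it is not taken from the repository.

retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Run the command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if (( attempt >= max_attempts )); then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}

# Hypothetical usage (assumes the containers are started via Compose):
#   retry 3 5 docker compose up -d
```

A retry hides the symptom rather than explaining it, so it is best paired with logging of each failed attempt, as the wrapper does on stderr.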

### Long-term
- Improve container cleanup between tests
- Add Squid startup logging to capture failure details
- Consider test isolation improvements
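The logging and cleanup items could be combined into a small failure handler. This is a sketch under stated assumptions: the container name `awf-squid` comes from the failure log, while the function names and the way they would be wired into the example scripts are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: capture what Docker still knows about a failed container,
# then remove it so the next test starts clean.
# Function names are hypothetical; awf-squid is the name from the failure log.

dump_container_diagnostics() {
  local name="$1"
  echo "=== state: ${name} ==="
  # State.Status, State.ExitCode and State.Error are standard inspect fields.
  docker inspect \
    --format 'status={{.State.Status}} exit={{.State.ExitCode}} error={{.State.Error}}' \
    "$name" 2>&1 || echo "(inspect failed; container may already be removed)"
  echo "=== logs: ${name} (last 100 lines) ==="
  docker logs --tail 100 "$name" 2>&1 || echo "(no logs available)"
}

remove_container() {
  local name="$1"
  # -f stops the container if it is running; a missing container is fine here.
  docker rm -f "$name" >/dev/null 2>&1 || true
}

# Hypothetical usage, where start_containers stands in for whatever the
# test script uses to bring the stack up:
#   start_containers || {
#     dump_container_diagnostics awf-squid
#     remove_container awf-squid
#     exit 1
#   }
```

Dumping diagnostics before removal matters: once the container is removed, its exit state and logs are gone, which is why this failure could only be classified by inference.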

## Monitoring

Watch for:
- Recurrence in debugging.sh specifically
- Pattern of "3rd test fails" in other workflows
- Similar failures after PR merges
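To make recurrence easy to spot, saved job logs can be scanned for the error signature from this run. The sketch below assumes the logs have already been downloaded to local files; the function name `count_startup_failures` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: count how often the Squid startup failure appears in saved logs.
# Assumes logs were downloaded to local files beforehand;
# count_startup_failures is a hypothetical helper.

SIGNATURE='dependency failed to start: container awf-squid exited'

# Usage: count_startup_failures FILE...   (reads stdin if no files are given)
count_startup_failures() {
  # grep -c exits non-zero when the count is 0, so "|| true" keeps the
  # function usable under "set -e" while still printing 0.
  cat -- "$@" | grep -cF -- "$SIGNATURE" || true
}
```

Running it per test log, rather than over all logs at once, would show whether the failures cluster in `debugging.sh` or follow the "3rd test" position.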

## Related Issues

- #375 - Container architecture improvements
- Past Docker network cleanup issues (pool overlaps)

---

**Note:** This appears to be a flaky test rather than a deterministic failure. The same configuration worked in tests immediately before and after in the same run.




> AI generated by [CI Doctor](https://github.com/github/gh-aw-firewall/actions/runs/21680855347)
