Between Feb 5, 2025 00:34 UTC and 11:16 UTC, up to 7% of organizations using GitHub-hosted larger runners with public IP addresses had jobs on those runners fail to start. The issue was caused by a backend migration in the public IP management system, which left certain public IP address runners in a non-functioning state.
We have improved the rollback steps for this migration to reduce the time to mitigation for any future recurrence, are working to improve automated detection of this error state, and are improving the resiliency of runners so they can handle this error state without customer impact.
Posted Feb 05, 2025 - 11:44 UTC
Update
We have identified a configuration change that we believe may be related. We are working to mitigate.
Posted Feb 05, 2025 - 11:17 UTC
Update
We are continuing our investigation.
Posted Feb 05, 2025 - 10:33 UTC
Update
We continue to investigate and have determined this is limited to a subset of larger runner pools.
Posted Feb 05, 2025 - 09:56 UTC
Update
We are investigating an incident where Actions larger runners are stuck in provisioning for some customers.