GitHub still had nine availability incidents in May after moving more workloads to Azure

A 40 percent monolith shift to Azure is progress, but reliability metrics split sharply and AI load keeps climbing.

By Lama Al-Rashid, Technology Correspondent, The Executives Brief

Executive summary

Jakub Oleksy, SVP of software engineering at GitHub, says the company is making structural changes to “permanently remove failure modes” as traffic surges from AI-assisted coding. For decision-makers, May’s incidents show why migration and capacity expansion alone are not a reliability strategy.

GitHub logged nine incidents that degraded performance in May, one fewer than in April, according to its May 2026 GitHub Availability Report, even as the company accelerates its move to Azure to handle AI-driven demand. That sounds like improvement. But when you zoom out, the pattern is harder to ignore: reliability work is ongoing, and the load that’s stressing the system is not slowing down.

Oleksy, SVP of software engineering at GitHub, frames the fix as structural, not cosmetic. In the report, he says GitHub is “making structural changes that permanently remove failure modes,” acknowledging “we have work to do” while insisting it is committed to getting GitHub reliable “when and where you need it.” Translation: GitHub is not treating uptime as a dashboard problem. It is treating it as an engineering failure-mode problem.

Why the world is watching: GitHub is being asked to run the software industry’s newest operating model. The Register reports that GitHub has been struggling with service availability in recent months as traffic surged, driven in large part by AI-assisted coding and agentic development workflows. That’s not just more users. It is different behavior at higher intensity: more commits, more pull requests, more automation, and likely more bursts around popular workflows.

The scaling numbers make the urgency concrete. GitHub reportedly handled 1 billion commits in all of last year. Now it receives 1.4 billion commits every month. Back in October 2025, GitHub planned to increase its capacity 10x. By February 2026, it became evident that a 30x expansion would be needed to accommodate the surge of pull requests, commits, and new repos. In other words, the original plan got outpaced by reality. And in systems like this, “outpaced” often means incidents first, then stability.
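To put the commit figures next to the capacity targets, here is a rough back-of-the-envelope sketch. It assumes the reported numbers are exact and that commit volume is a fair proxy for load, which the report does not claim; the 10x and 30x targets also cover pull requests and new repos, and their baseline is not stated. The function and constant names are illustrative only.

```python
# Figures as reported in the article; treating them as exact is an assumption.
COMMITS_LAST_YEAR = 1_000_000_000        # roughly 1 billion for the whole year
COMMITS_PER_MONTH_NOW = 1_400_000_000    # roughly 1.4 billion per month

PLANNED_EXPANSION = 10   # the October 2025 plan
REVISED_EXPANSION = 30   # the February 2026 revision


def annualized_growth(prev_year_total: int, current_monthly: int) -> float:
    """Ratio of the current monthly rate, sustained for 12 months, to last year's total."""
    return current_monthly * 12 / prev_year_total


growth = annualized_growth(COMMITS_LAST_YEAR, COMMITS_PER_MONTH_NOW)

print(f"Annualized commit growth: {growth:.1f}x")
print(f"Exceeds the 10x plan: {growth > PLANNED_EXPANSION}")
print(f"Headroom left under 30x: {REVISED_EXPANSION / growth:.2f}x")
```

On these numbers, commits alone imply roughly 16.8x last year's volume if the current monthly rate holds. That is already past the original 10x plan and leaves less than 2x of headroom under the revised 30x target.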

To address the stress, GitHub is moving more workloads to Azure infrastructure and is also changing how it isolates parts of the system. Oleksy says GitHub is “now serving 40 percent of monolith traffic from Azure” (up from 8 percent in February), with Git traffic at 30 percent and repository replication at 99 percent. He adds that GitHub has “more than doubled our effective capacity in four months.” That is real progress, and it signals serious engineering effort and investment. It is also the kind of move that boards and CFOs like to hear about because it is measurable.

But the availability story does not land cleanly. Oleksy notes that efforts to isolate GitHub’s primary database cluster, by moving users, authentication, and authorization into separate domains, should prevent failures from cascading across the system. That work hasn’t yet resolved the ongoing availability challenges. The Register points out that Azure has also confronted capacity problems recently. So even if GitHub does everything right inside its own app layer, its reliability can still be constrained by infrastructure capacity. May’s nine incidents compare with 10 in April, and June is on pace for a similar number.

Here is where decision-makers should pay attention to metrics and incentives. An unofficial “Missing GitHub Status Page” project counts 12 incidents in May and reports uptime over the past 90 days at 87.26 percent. By month, it puts availability at 78.33 percent in April, 93.86 percent in May, and 88.39 percent for June so far. GitHub’s official status page presents a far more flattering view, with uptime figures mostly around 99.9 percent for the listed services. The Register notes that these figures depend on what gets counted and how long each disruption lasts. GitHub’s own incident history page cites 26 incidents in April, 23 in May, and 12 to date in June. That leaves three different counts for May alone: nine in the availability report, 12 from the unofficial tracker, and 23 on the incident history page. So the core question for executives is not only “what happened,” but “how do we define and communicate it?” In reliability, definitions become strategy, and strategy becomes trust.
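The gap between these uptime figures is easier to judge in hours than in percentages. The sketch below converts the reported 90-day percentages into implied downtime, then uses made-up outage data to show one way counting rules can move the headline number. The service names and outage hours are hypothetical, and nothing here reflects how either status page actually computes its figures.

```python
def downtime_hours(uptime_percent: float, window_days: int) -> float:
    """Hours of downtime implied by an uptime percentage over a window."""
    return (100.0 - uptime_percent) / 100.0 * window_days * 24


# Reported figures from the article, both read against a 90-day window.
unofficial_hours = downtime_hours(87.26, 90)   # about 275 hours
official_hours = downtime_hours(99.9, 90)      # about 2.2 hours

# Hypothetical data: hours (by index) in a 30-day month when each service
# was degraded. These services and values are invented for illustration.
WINDOW_HOURS = 30 * 24
outages = {
    "git": set(range(0, 20)),
    "actions": set(range(100, 140)),
    "pull_requests": set(range(130, 150)),
}

# Rule 1: score each service separately.
per_service = {
    name: 100.0 * (1 - len(hours) / WINDOW_HOURS)
    for name, hours in outages.items()
}

# Rule 2: count any hour in which at least one service was degraded.
any_down = set().union(*outages.values())
platform_uptime = 100.0 * (1 - len(any_down) / WINDOW_HOURS)

print(f"Unofficial 90-day figure implies {unofficial_hours:.0f} hours down")
print(f"A 99.9 percent figure implies {official_hours:.1f} hours down")
for name, pct in per_service.items():
    print(f"{name}: {pct:.2f} percent")
print(f"any service degraded: {platform_uptime:.2f} percent")
```

With the same underlying outages, the "any service degraded" rule always reports a figure at or below the worst per-service number. That is one plausible reason two trackers can watch the same platform and publish very different uptime.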

There is also a Microsoft-side business lever that intersects with these pressures. The Register reports that GitHub, Microsoft’s code hosting site, briefly halted new Copilot subscriptions to reduce the cost impact of its AI services and to adjust its Copilot pricing to account for shifting model provider policies. That is not the same issue as GitHub uptime, but it matters because it shows a tension between AI demand and the unit economics that power AI features. Capacity, cost, reliability, and AI rollout all pull on each other. When load increases due to AI-assisted development, the platform that enables it has to survive both traffic spikes and the downstream variability of model providers and infrastructure.

For boards, CTOs, and anyone responsible for developer tooling reliability, this is the strategic stake: a migration to Azure and capacity expansion can raise effective capacity, but it cannot automatically remove cascading failure modes. The company is explicitly working toward structural changes to stop recurring failure patterns, yet the month-to-month incident counts and competing uptime views show the work is still in motion. If GitHub is the nervous system for modern software creation, its availability is not just an IT metric. It is an input to everything else that depends on shipping software faster, including AI-native workflows that are currently accelerating demand beyond earlier plans.
