IT'S HAPPENING
GITHUB, THE FIRST ENTERPRISE CLOUD SOLUTION TO REACH ZERO NINES RELIABILITY
Discussion
@0xabad1dea soon we try to keep track of percentage down time instead of uptime 🤣
@0xabad1dea WDYM, 89,9 has TWO nines!
@0xabad1dea 4 eights, here we goooooooo!
@0xabad1dea they're working toward 5 nines - 9.9999% uptime
@0xabad1dea The other day I got a 500 error just by creating a new issue with 5 text lines, that's how bad things have gotten. (Never happened before)
@0xabad1dea
GitHub is shooting for "nine fives of reliability"
@0xabad1dea THERE ARE ZERO NINES
starting to think Microsoft has a humiliation kink
@0xabad1dea numbers are not my forte. I'll admit I'm an idiot. what in the name of crimes against statistics is going on here?
- major web service providers generally sign agreements with their largest customers to provide "five nines" of uptime: 99.999%.
- Github, one of the most important pieces of infrastructure for computer programming in general, has been getting increasingly unstable over the last several months.
- Github's own official downtime tracker is extremely, uh, conservative, because the more they officially acknowledge as downtime, the more trouble they're in with major customers. But even by their rosiest estimates, they've lost several nines.
- This third party tracker concludes that Github has dropped below 90%, that is to say, to zero nines, for the last 90 days. Note that this is counting any big observable problem as downtime, even if the entire site isn't completely dead.
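For anyone who wants the arithmetic behind "nines", here is a rough sketch. It assumes the usual convention that the number of nines is floor(-log10(downtime fraction)), so only the leading nines count. The function names `nines` and `downtime_days` are made up for illustration, and the 89.91% figure is the one quoted elsewhere in this thread.

```python
from decimal import Decimal, ROUND_FLOOR


def nines(uptime_percent: str) -> int:
    """Count the 'nines' of availability as floor(-log10(downtime fraction)).

    Takes the percentage as a string so Decimal keeps it exact;
    floats would round 99.999% slightly off and lose a nine.
    """
    uptime = Decimal(uptime_percent)
    if not (Decimal(0) <= uptime < Decimal(100)):
        raise ValueError("uptime must be in the range [0, 100)")
    downtime_fraction = (Decimal(100) - uptime) / Decimal(100)
    negative_log = -downtime_fraction.log10()
    return int(negative_log.to_integral_value(rounding=ROUND_FLOOR))


def downtime_days(uptime_percent: str, window_days: int) -> Decimal:
    """Days of downtime implied by an uptime percentage over a window."""
    downtime_fraction = (Decimal(100) - Decimal(uptime_percent)) / Decimal(100)
    return downtime_fraction * window_days


if __name__ == "__main__":
    for pct in ("99.999", "99.9", "89.91", "9.9999"):
        print(pct, "->", nines(pct), "nines")
    print("downtime over 90 days at 89.91%:", downtime_days("89.91", 90), "days")
```

By this convention 99.999% is five nines and 89.91% is zero, no matter how many nines appear later in the number. At 89.91%, a 90-day window implies roughly nine days of degraded service.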
@0xabad1dea @Netraven and as someone who has been one of those 'very large' customers, I can tell you for a fact that every cloud provider will use every single excuse in the book and every method they possibly can to straight up lie about their uptime.
The number one cheat? "Oh, that wasn't an official incident. So not covered." Number two, "you didn't file a ticket when it happened."
@0xabad1dea oh thank you very much, that makes sense.
What do you mean? I see two nines right there in the middle.
@0xabad1dea "We did it, boys!"
@0xabad1dea @zazzoo those LLMs sure are helping!
@0xabad1dea Five nines (0.99999%)
@0xabad1dea Paraphrasing Queeg, "it's got a nine in it".
I've worked for enough companies that I've seen teams who couldn't even manage nine fives.
@0xabad1dea
@mogi ok maybe codeberg is doing alright lol
@0xabad1dea I wonder at what point they will declare Uptime Bankruptcy and start fresh?
@0xabad1dea
"What are you talking about I see two nines right there." John Microsoft probably
@0xabad1dea microsoft can't devops.
@0xabad1dea I mean technically, they still have 2 nines there... 🤔 😅
@0xabad1dea it's probably running on #windows now. go https://codeberg.org
@0xabad1dea Microsoft would argue that 89.91% actually has 2 9s
@0xabad1dea Looking at it positively, they do have 4 eights of reliability.
@0xabad1dea sloptastic news!
@0xabad1dea i mean it does have nines in it, so it’s the slop version of two nines
@0xabad1dea i really don't understand how any real companies still use microsoft products, and i try really hard. i build software infrastructure and do tool selection for developer teams for a living. my previous company used microsoft dynamics 365 as their CRM. they paid about 4 million euros per year for it (a significant expense given the company size) and they threw away their own solution in favor of it. yet the thing was down and unusable (as in whole departments of the company could not do their work at all and we were bleeding hundreds of thousands in real money every few hours) at least once every two months! the longest outages lasted 2 or 3 days. and that's not even counting all the consultancy expenses to manage and customize that thing, because there was no internal know-how to do that. the company before that switched from one cloud to another for no apparent practical reason, with disastrous consequences for development of the product... these kinds of management decisions are entirely irrational, and the only way i can explain them is that companies like microsoft and google invest more in kickbacks to decision makers than in development of their products.
@0xabad1dea I love that I am reading this while waiting for GitHub to fix their API throwing HTTP 503 errors :)
@0xabad1dea Also, I note, an issue that never made it to their status page!
@mini yeah, the official status page is *comically* conservative because of misaligned incentives.
@0xabad1dea I'm sure Copilot would agree that 9.9999% is much easier to accomplish than 99.999%, and it would still be 5 9s
@0xabad1dea
To make matters (much) worse, copilot seems to be very reliable ...
@0xabad1dea how is it zero nines - there are two nines in 89.91 🤣
@0xabad1dea sure, if you're going to be a stick in the mud and only count nines at the start of the number.
If you get rid of that restriction, they've still got two nines.
@0xabad1dea That's like when GE merged with Bell Atlantic, and we got the worst quality of service paired with the lowest reliability provider. Just getting dial tone was a crap shoot...
@0xabad1dea Thank goodness GitHub is just a popular code repository, and not something anyone would treat as an essential piece of infrastructure in their software provisioning, integration, distribution or whatever process.
@0xabad1dea there are 2 nines in there, I don't see the issue :P
@0xabad1dea waiting for 0.99999% reliability, see there are five of them and this is where it counts
@0xabad1dea copilot is flying it straight into a cliff face...
@0xabad1dea There's still two 9s in there. Depending on what or who that is counting, that is probably just fine.
@0xabad1dea Maybe github management counts that as "two nines"?
@0xabad1dea my self-hosted gitea has a higher uptime lmao
@0xabad1dea they can't gain three-9s through 99.9% so they are now course-correcting to three-9s through 89.99%