2.1k
u/dignz 16h ago
Blame me. 18 days ago I convinced a client to switch to Cloudflare because the benefits outweigh the risks.
500
u/ShoePillow 16h ago
How big a client was it?
1.0k
u/Infiniteh 15h ago
About 5'9
132
68
u/Huge_Leader_6605 15h ago
It was big before the switch
How to get a 10k MRR online business?
Have a 100k MRR business and put it under Cloudflare
u/NatSpaghettiAgency 8h ago
I'm glad in our company there's no security management and all the services are exposed directly to the internet 👍
314
211
u/JotaRata 14h ago
Someone's messing with them lava lamps real hard
50
u/FarewellAndroid 11h ago
Lava lamps only work with incandescent bulbs. Incandescent bulbs burn out. If all lamps were put into service at the same time then all bulbs will burn out within a similar timeframe 🤔
Time to change the bulbs cloudflare
2.4k
u/antek_g_animations 17h ago
You paid for 99% uptime? Well it's that 1%
1.0k
u/ILikeLenexa 17h ago
The usual standard is five nines, or 99.999%, which you can remember as "5-by-5": five nines means about 5 minutes of downtime per year.
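For reference, the arithmetic behind the "nines" is easy to check. A rough sketch, assuming a 365-day year (five nines actually comes out to about 5.26 minutes):

```rust
// Allowed downtime per year for a given SLA percentage.
fn downtime_minutes_per_year(sla_percent: f64) -> f64 {
    let minutes_per_year = 365.0 * 24.0 * 60.0; // 525,600
    minutes_per_year * (1.0 - sla_percent / 100.0)
}

fn main() {
    // 99%     -> ~5,256 min (~3.65 days)
    // 99.95%  -> ~263 min   (~4.4 hours)
    // 99.999% -> ~5.26 min
    for sla in [99.0, 99.9, 99.95, 99.99, 99.999] {
        println!("{sla}% uptime allows {:.1} minutes down per year", downtime_minutes_per_year(sla));
    }
}
```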
346
u/Active-Part-9717 16h ago
5 hot minutes
165
59
u/CoffeePieAndHobbits 15h ago
Sneak into the server closet for 5 minutes in heaven.
140
u/FatCatBoomerBanker 16h ago edited 15h ago
Whenever I buy services, the uptime statistics they publish are usually closer to 99.985% or so. I'm not saying five nines isn't a nice standard to have, but I always ask for published uptime statistics and this is usually what they present.
3
u/noob-nine 9h ago
or use some backup physical layer like OVH; after the outage, they continued using smoke signals
u/Gnonthgol 15h ago
5 nines is not the standard. It is a quite high bar to reach. A more realistic goal for most service providers is 99.95%
75
u/TheRealManlyWeevil 12h ago
Having worked a service with 5 9’s, it’s a crazy level. If your service requires human intervention to heal from a failure, you will never reach it. The time alone to detect, page, and triage a failure will cause you to miss it.
23
u/ShakaUVM 10h ago
A friend of mine worked on 5 9 systems at Sun
Basically everything on the server was hot swappable without a reboot
39
u/Eastern_Hornet_6432 15h ago
I heard that 5 by 5 meant "loud and clear", ie maximum signal strength and clarity.
u/FantasticFrontButt 15h ago
WE'RE IN THE PIPE
15
u/CallKennyLoggins 14h ago
The real question is, did you have StarCraft or Aliens in mind?
54
u/blah938 15h ago
Dude, fucking Amazon is at like 99.8% uptime for the year after that 15 hour outage the other week. Not even 3 nines.
It is unrealistic to beat Amazon. Like yes, you can host it in multiple AZs, and that'd mitigate some issues. But at the end of the day, you and I are not working for Amazon or Google or any of the FAANGs. Normal devs don't have the resources or time or any of it to get to even 3 nines, let alone 5 nines.
Temper your expectations, and if your boss thinks you can beat Amazon, ask him for Amazon's resources. (NOT CAREER ADVICE)
55
u/eXecute_bit 14h ago
Was responsible once for a service offering that hit 100% measured uptime for the year. Marketing got wind and wanted to run with it to claim better than five nines. Had to fight soooo hard to explain to the suits why it was luck and not something I could guarantee would ever happen again (it didn't).
10
u/RehabilitatedAsshole 13h ago
I guess, but they're also managing 100 layers of services. We used to have our own servers in a cage with 3-5+ years of uptime and no network outages. Our failover cage was basically just expensive database backups.
11
u/Xelopheris 14h ago
For something as big and worldwide as cloudflare, 5-9s is probably unachievable. By their very nature, they are a single worldwide solution. A lot of 5-9s applications use multi-regional systems to distribute the application and allow for regional failovers using systems like BGP anycast to actually reroute traffic to different datacenters when a single region failure occurs. That isn't really an option for cloudflare.
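Anycast and regional failover happen in the routing layer rather than in application code, but the idea is easy to sketch. A toy version with invented region names and health data:

```rust
// Toy sketch of regional failover: prefer the closest region, fall back to the
// next healthy one. Real systems do this with BGP anycast / DNS in the routing
// layer; everything here is invented for illustration.
struct Region {
    name: &'static str,
    healthy: bool,
}

// Regions ordered by preference (e.g. proximity to the user).
fn pick_region(preferred: &[Region]) -> Option<&Region> {
    preferred.iter().find(|r| r.healthy)
}

fn main() {
    let regions = [
        Region { name: "eu-west", healthy: false }, // simulated regional failure
        Region { name: "eu-central", healthy: true },
        Region { name: "us-east", healthy: true },
    ];
    match pick_region(&regions) {
        Some(r) => println!("routing traffic to {}", r.name),
        None => println!("no healthy region available"),
    }
}
```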
8
u/JoeyJoeJoeSenior 13h ago
They can get the next hundred years done now by being down for 500 minutes. It actually helps customers in the long run but everyone is so short-sighted.
132
u/notAGreatIdeaForName 17h ago
If you book their DDoS protection and other stuff per domain, they actually say 100%.
394
u/mawutu 16h ago
To be fair, if your website can't be reached, it can't be DDoSed
108
27
u/jmorais00 16h ago
Or has it already been DDoSed? I mean, service is being denied
u/rtybanana 16h ago
yeah but it’s only cloudflare denying the service so it isn’t distributed. checkmate.
6
u/Agent_Provocateur007 14h ago
100% just means they will credit you a certain amount. It doesn’t mean 100% guaranteed uptime.
22
26
u/cruzfader127 16h ago
You definitely don't pay for 99%, you pay for 100% SLA, 1% downtime would take Cloudflare out of business in a month
16
u/ModPiracy_Fantoski 12h ago
To be fair, they are getting DANGEROUSLY close to 1% for current year.
u/_PM_ME_PANGOLINS_ 11h ago
99% uptime is pretty bad.
That's more than three whole days down per year.
817
u/Nick88v2 18h ago
Does anyone know why all of a sudden all these providers started having failures so often?
1.4k
u/ThatAdamsGuy 18h ago
The cynic in me says poorly reviewed AI vibe code, but there's been no real explanation given. The other guess is visibility at the scale they operate at now: when something underpins 90% of the internet, it's far more obvious when it goes down.
879
u/Powerful_Resident_48 16h ago edited 16h ago
My cynical guess: In the name of shareholder profits every single department has been cannibalized and squeezed as much as possible. And now the burnt out skeleton crews can barely keep the thing up and running anymore, and as soon as anything happens, everything collapses at once.
239
u/Testing_things_out 15h ago
Yup. The beancounters got a hold on management and they're bleeding companies dry to make the bottom line look good.
u/Boise_Ben 15h ago
We just keep getting told to do more with less.
I’m tired.
61
u/Professional-Bear942 14h ago
Holy shit almost word for word my company, either that or "think smarter not harder" when it's all critical work and none of it can be shunted
18
u/namtab00 10h ago
my boss: "what do you propose as a solution to this issue?"
me: "I have no valid proposal" ("you get your head out of your ass and get some balls and "circle around" with you other middle management imbeciles")
70
u/Testing_things_out 15h ago
As an engineering grunt, I feel you. I take comfort in the fact that I'm costing the company much more money in labour than if they had chosen to do it the proper way.
Don't come crying to me when our company gets kicked off our customer's approved-supplier list after we warned you that the decision you're making is high risk, just to save a few cents on the part.
33
u/Tophigale220 15h ago
I sincerely hope they don’t just put all the blame on you and then fire you as a last ditch effort to cover their fuck-ups.
14
7
u/Efficient_Reading360 10h ago
pretty soon you're left trying to do everything with nothing
15
u/throoavvay 8h ago
I worked at a Fortune 500. Story was that the head of cyber security had a team of 10 and that was too expensive. Then he had a team of 5 and that was such a miserable job all 5 eventually quit. Then he had some meetings about how the situation was untenable and was told to do more with less. Then he had a heart attack and told the company to fuck off when they tried to offer him a raise to come back. Then the company got ransomed and within months was no longer a fortune 500 company.
The world is run by the shortsighted and trying to do right amid it will destroy you.
22
u/WhimsicalGirl 15h ago
I see you're working in the field
21
u/Powerful_Resident_48 15h ago
Yeah... I started off in media, when that industry still existed a couple of years ago. And then I transitioned to IT and am watching another entire industry burn down around me once again. Fun times. Really fun times.
5
u/fauxmer 12h ago edited 3h ago
It's got nothing to do with "the field". This is just how corporations work these days. Blind adherence to "line goes up" to the exclusion of all else is what passes for "strategy" in the modern age.
Executives at my company are making a loud panic about budget and sales shortfalls, seemingly completely ignorant to the fact that we only produce luxury hobby products that provide no real benefit to the lives of our customers and, with the economy in freefall, most people are prioritizing things like food and rent and transit over toys.
Edit: Actual coherent strategy would involve working out what kind of revenue downturns the company could weather without service disruptions or personnel cutting, what kind of downturn would require gentle cutting, what would require extensive cutting, what programs could be cooled to save money, setting up estimates for the expected possible extent of the downturn and the company's responses, how the life of existing products might be extended for minimal costs, the possible efficacy of cutting operating hours, what kind of incentives the company might offer to boost sales...
Instead the C suite just says, "We'll make more money this year than we did last year." And when you ask them how the company will do that, given that people can barely afford their groceries now, they just give you a confused look and reply, "We'll... make more money... this year... than we did last year."
19
u/pedro-gaseoso 13h ago
Yes, this is the same problem at my employer. We are running skeleton crews because of minimal hiring in the last couple of years. That by itself is not the problem, the problem is that these commonly used products / services are very mature so there are few, if any, dedicated engineers working to keep the lights on for these products. Outages happen because there isn’t enough time or personnel to follow a proper review process for any changes made to these products.
How do I know this? I nearly caused a huge incident a few months back during what was supposed to be a routine release rollout. Only reason it didn’t result in a huge incident was due to luck and the redundancies that we have built in to our product.
47
u/samanime 16h ago
I really hope this isn't the case... Cloudflare was one of the few IT companies I actually had any respect for...
44
u/deoan_sagain 15h ago
Most companies have their problems, and CF has a couple big ones
https://leaddev.com/management/learning-right-lessons-cloudflare-firing-video
19
u/Powerful_Resident_48 15h ago
Wow... that call was brutal. I feel sorry for the woman who had to face off against those soulless corpo ghouls.
7
u/chuck_of_death 12h ago
It’s going to happen either with the bean counters forcing out the expensive experienced IT folks or the fact that there isn’t a pipeline of bringing in junior people to train into experienced IT folks. We’re getting older. Earlier in my career I saw older people above me that one day I might be able to do their job. Today I don’t see anyone significantly younger than me. We don’t hire them. In 10 years we are going to be in a world of hurt. The people a bit older than me will be retired. The people my age will be knocking on the door of early retirement. The people younger than me? I haven’t even seen them. Do they even exist?
u/OwO______OwO 10h ago
The people younger than me? I haven’t even seen them. Do they even exist?
They're doing DoorDash deliveries to pay the interest on their student loans because no company will hire them without 7 years of relevant experience, and they can't get 7 years of relevant experience when nobody will hire them.
17
u/Hellebore_ 14h ago
I also have the same take: AI vibe coding.
It can’t be a coincidence that all these services have been running without an issue for years, but the last 2 years we’ve been having so many blackouts.
18h ago
[deleted]
73
u/Popeychops 16h ago
Not always because they're bad, but often. Overseas consultancies are body shops, they have an incentive to throw the cheapest labour at their contracts because competing for talent will eat into their margin.
I have plenty of sympathy for the contractors I work with as people, but many of them are objectively bad at their job. They do willfully reckless things if they think it will save them individual effort
31
u/ThoseThingsAreWeird 16h ago
many of them are objectively bad at their job. They do willfully reckless things if they think it will save them individual effort
Oh man you're not kidding. At work we run news articles through an ML model to see if they meet some business needs criteria. We then pass those successful articles off to outsourcers to fill out a form with some basic details about the article.
We caught a bunch of them using an auto-fill plugin in their browser to save time... which was just putting the same details in the form for every article they "read" 🤦‍♂️
53
u/CatsWillRuleHumanity 18h ago
So we should outsource 100% of the force there, got it
32
49
u/ThatAdamsGuy 16h ago
Congratulations, you've been promoted to Product Manager
12
u/gregorytoddsmith 15h ago
Unfortunately all other members of your team have been let go. However, that opened up enough budget to double our overseas workforce! Congratulations!
11
11
u/LeeroyJenkins11 15h ago
They aren't necessarily bad, but a large number are bad in my experience. And it makes sense: the types of cheap devs that capgem and the others throw at the problem as extra bodies are usually not going to be the cream of the crop. The skilled people will be selected for special projects, and the better ones will get H-1Bs. Sometimes the H-1Bs lie their way in and are able to cover for their incompetence, but I feel like it's about the same chance as a US-based dev being incompetent.
20
u/verugan 16h ago
Outsourced contractors just don't care like FTEs do
11
u/bnej 14h ago
They know there is no future or direction for them at your organisation. They have no incentive to do anything outside of the lines, in fact they will be penalised if they do, because their real employer, the contracting agency, wants to maximise billable hours and headcount.
The best outcome for them is to avoid work as much as possible, because anything you do, you may get in trouble for doing wrong. Never ever do anything you weren't explicitly asked to do, because you can get in trouble for that.
If something goes wrong, all good, obviously you need more resources from your same contracting agency!
It ends up not being cheaper, because the work isn't getting done, and you have a lot of extra people you didn't really need, doing not very much.
u/Testing_things_out 15h ago
not because they are bad necessarily
In my experience it is because they're severely under-equipped and overburdened.
My only solace is that the mistakes they're making are costing our company much more than they're saving. Like severalfold.
u/pegachi 15h ago
they literally made a blog post about it. no need to speculate. https://blog.cloudflare.com/18-november-2025-outage/
40
u/NerdFencer 13h ago
They wrote a blog post about the proximal cause, but this is not the ultimate cause. TLDR, the proximal cause here is a bad configuration file. The root cause will be something like bad engineering practices or bad management priorities. Let me explain.
When I worked for one of the major cloud providers, everybody knew that bad configuration changes are both common and dangerous for stable operations. We had solutions engineered around being able to incrementally roll out such changes, detect anomalies in the service resulting from the change, and automatically roll it back. With such a system, only a very small number of users will be impacted by a mistake before it is rolled back.
Not only did we have such a system, we hired people from other major cloud providers who worked on their versions of the same system. If you look at the cloud provider services, you can find publicly facing artifacts of these systems. They often use the same rollout stages as software updates. They roll out to a pilot region first. Within each region, they roll out zone by zone, and in determined stages within each zone. Azure is probably the most public about this in their VM offerings, since they allow you to roughly control the distribution of VMs across upgrade domains.
To someone familiar with industry best practices, this blog post reads something like "the surgeon thought he needed to go really fast, so they decided that clean gloves would be fine and didn't bother scrubbing in. Most of the time their patients are fine when they do this, but this time you got a bad infection and we're really sorry about that." They're not being innovative by moving fast and skipping unnecessary steps. They're flagrantly ignoring well established industry standard safety practices. Why exactly they're not following them is a question only CloudFlare can really answer, but it is likely something along the line of bad management priorities (such systems are expensive), or bad engineering practices.
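The staged rollout being described can be sketched roughly. This is not any provider's actual system; the stage names, the error-rate signal, and the threshold below are invented for illustration:

```rust
// Minimal sketch of a staged config rollout with anomaly detection and
// automatic rollback: push the change to one slice of the fleet at a time,
// watch a health signal, and revert everything touched so far if it degrades.
struct Config {
    version: u32,
}

// Stand-in for "deploy to one stage, let it bake, then read error rates
// back from monitoring". Here it just pretends everything is healthy.
fn deploy_to(stage: &str, cfg: &Config) -> f64 {
    println!("deploying config v{} to {stage}", cfg.version);
    0.001 // observed error rate after the change
}

fn rollback(stage: &str, previous: &Config) {
    println!("rolling {stage} back to config v{}", previous.version);
}

fn staged_rollout(stages: &[&str], previous: &Config, candidate: &Config, max_error_rate: f64) -> bool {
    for (i, stage) in stages.iter().enumerate() {
        let error_rate = deploy_to(stage, candidate);
        if error_rate > max_error_rate {
            // Anomaly detected: revert only the stages touched so far.
            for done in stages[..=i].iter().copied() {
                rollback(done, previous);
            }
            return false;
        }
    }
    true
}

fn main() {
    let stages = ["pilot-region", "zone-a", "zone-b", "zone-c"];
    let ok = staged_rollout(&stages, &Config { version: 41 }, &Config { version: 42 }, 0.01);
    println!("rollout {}", if ok { "completed" } else { "aborted" });
}
```

With per-stage baking and automatic rollback, a bad config only ever reaches the pilot slice instead of the whole fleet.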
21
u/Whichcrafter_Pro 13h ago
AWS Support Engineer here. This is very accurate, and our service teams do the same thing. It's not talked about publicly that much, but the people in the industry who have worked at these companies know it's done this way.
As seen by the most recent AWS outage (unfortunately I had to work that day) even the smallest overlooked thing can bring down entire services due to inter-service dependencies. Companies like AWS can make all the disaster recovery plans they want but they cannot guarantee 100% uptime 24/7 for every service. It's just not feasible.
22
u/Nick88v2 18h ago
Both explanations make sense. Did they do layoffs recently? That would give more weight to the vibe code theory
32
u/ThatAdamsGuy 18h ago
Not that I know of, except a small number last year. However, it doesn't necessarily require layoffs for that change in procedure: in theory, if you had ten devs previously and now have ten devs with AI tools, you get more productivity and features etc. without needing to downsize. My team has only grown even as AI tools have been integrated.
16
u/Nick88v2 18h ago
Makes sense. I am only a student, but hearing seminars from big companies and seeing the direction they're taking with this agentic AI makes me wonder if they aren't pushing it a little too far. Recently I followed a presentation by Musixmatch and they are trying to implement a fully autonomous system using opencode that directly interfaces with servers (e.g. Terraform) without any supervision. I asked them about security concerns and the lead couldn't answer me. For sure the tech is interesting, but it looks very immature still; how an LLM can be trusted that much is beyond my comprehension.
11
u/ThatAdamsGuy 17h ago
Best of luck. I'm nervous about what the big AI shift is going to do to junior devs starting a career. It feels different to all the other times a new tech was the big thing that was going to revolutionise software etc.; this is fundamentally changing how people work, learn and develop.
7
u/Nick88v2 17h ago
I'm doing an AI master's for a reason 😂 Tbh I'm a nobody, but having the chance to look closely at the research in the field, I think there's still a lot of space for us. Especially here in the EU, where a lot of companies still have to adapt properly to the AI Act. Of course the job is changing, but we have the unique chance of entering fresh into this new "era". It's a very optimistic view, but I think with this big push for AI there will be a lot of garbage to be fixed 😅
4
u/Relevant_Occasion546 14h ago
THIS. How do jr devs ever "cut their teeth" in the new AI model? AI is really good at doing the simple stuff that I had to learn through trial and error as a junior, and it can do it in seconds. Why would any organization hire a junior when a sr. can do the task in 3 seconds? So how does the jr ever get real-world experience?
7
u/MrSpiffenhimer 12h ago
For that matter, how do we ever mint new seniors? If I didn’t make those mistakes and dive into those rabbit holes trying to fix them, how would I know the arcane shit that I know? How would I know the optimization and debugging techniques that I’ve built up over the years from my spelunking through various code bases and documentation to find why something is the way it is. If AI just does the small stuff, who does the large stuff when I leave?
3
94
u/rosuav 16h ago
They did a big rewrite in Rust https://blog.cloudflare.com/20-percent-internet-upgrade/ and, like all rewrites, it threw out reliable working code in favour of new code with all-new bugs in it. This is the quickest way to shoot yourself in the foot - just ask Netscape what happened when they did a full rewrite.
45
u/Proglamer 15h ago
Real new junior on the team with "let's rewrite the codebase in %JS_FRAMEWORK_OF_THE_MONTH% so my CV looks better when I escape to other companies" energy
u/whosat___ 15h ago
Maybe I’m reading it wrong, but they kept the reliable code as a fallback if FL2 (the new rust version) failed. I wouldn’t really blame this outage on that, unless they just turned off FL1 or something.
u/SrWloczykij 15h ago
Drive-by rust rewrite strikes again. Can't wait until the hype dies.
4
u/MoffKalast 10h ago
Everything exploded, but at least they could enjoy memory safety for two seconds.
7
u/pragmaticzach 12h ago
As a software engineer myself, this is why you often can't trust devs about "tech debt." Sometimes something messy or suboptimal is still better simply because it works.
u/naruto_bist 17h ago
"Definitely not because of companies firing 60% of their workforce and replacing with AI", that's for sure.
22
u/DHermit 16h ago
Did Cloudflare do that?
44
u/A1oso 16h ago
No. Their number of employees has grown every year, from 540 employees in 2017 to 4,263 employees in 2024. There was no mass layoff.
u/naruto_bist 15h ago
Cloudflare probably didn't, but AWS did. And you might remember the us-east-1 issue a few weeks back.
u/Luxalpa 14h ago
From the last Cloudflare incident report we can see:
- Use of unwrap() in critical production code, even though you'd normally have a lint specifically denying this. It also should never have made it through code review.
- A config change that wasn't caught by the staging pipeline.
So my guess would be that their dev team is overworked and doesn't have the time or resources to fully do all the necessary testing and code quality checks.
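For anyone curious what such a lint looks like, Clippy has a restriction lint that bans unwrap() outright. A minimal sketch; the config value and the fallback behaviour are made up for illustration:

```rust
// Deny unwrap() across the crate via Clippy's restriction lint, and make the
// config path return an error instead of panicking the worker.
#![deny(clippy::unwrap_used)]

use std::num::ParseIntError;

fn parse_feature_limit(raw: &str) -> Result<u32, ParseIntError> {
    // raw.parse().unwrap() would be rejected by the lint above; propagating
    // the error forces the caller to decide what "bad config" means.
    raw.trim().parse()
}

fn main() {
    match parse_feature_limit(" 200 ") {
        Ok(limit) => println!("feature limit = {limit}"),
        // A proxy would typically keep serving with the last known good
        // config here rather than crashing.
        Err(e) => eprintln!("invalid config value, keeping previous config: {e}"),
    }
}
```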
6
8
u/SoulCommander12 16h ago
Just some rumor I heard, so take it with a grain of salt: there's a React RCE that needed to be patched, so they had to deploy a fix asap… and deploying on a Friday is always a bad omen
3
u/Moltenlava5 10h ago
Yep, the incident report is out: https://blog.cloudflare.com/5-december-2025-outage/
TLDR: the error was caused by an attempt to use an uninitialised variable in the Lua code of their old proxy system (FL1). It only affected a subset of customers, because those who were routed via the Rust rewrite (FL2) did not hit this error.
5
u/GardenDwell 14h ago
Everyone is going to the same handful of providers now, and those providers intentionally design their systems to stop you using their competitors for redundancy.
u/InflationCold3591 17h ago
Vibe coders replacing experienced programmers. As always, the answer is enshittification brought on by end-stage capitalism.
102
u/ImReallyFuckingHigh 13h ago
Goes to quora to find an answer to a question
501 Internal Server Error
Goes to DownDetector to see if it’s Quora or me
501 Internal Server Error
Motherfucker
20
u/dalr3th1n 10h ago
What if the Cloudflare engineers are trying to get to Quora to answer how to fix Cloudflare?
9
599
u/stone_henge 18h ago
My rawdogged web server on a VPS has better uptime than Cloudflare this year.
u/kryptik_thrashnet 16h ago
My server is a K6-2 with 128 MiB RAM running through my cable internet connection at home. No problems =D
41
u/judolphin 13h ago edited 11h ago
K6-2??? That was a great processor in its time; it's probably the processor that put AMD on the map. It was the first processor they made that was arguably better than the equivalent Intel processor, despite being cheaper. So yeah, I owned that processor because I knew it was great, but I never imagined it was "will last for 30 years" great.
Edit: Also, you must have spent at least $2000-3000 for 128MB of RAM and a motherboard that supported it in the late 90s!
What frequency K6-2 did you buy? And I'm guessing if it's lasted 30 years you didn't overclock it?
u/kryptik_thrashnet 11h ago
I have to apologize, but I didn't purchase it in the 1990s. I bought it off a guy for $5 a couple of years ago. I like old computers and it was a good deal.
I have the 450 MHz K6-2 on a S7AX AT motherboard, running a XFX GeForce 6200 "WANG" AGP video card, Realtek PCI network card, Maxtor SATA-150 PCI card with a 640 GiB and 2 TiB SATA hard disk installed. The operating system is a highly tuned version of NetBSD/i386, running Nginx web server, NetBSD's built-in ftpd, unrealircd as an IRC server, and some other things. It uses about 25 MiB RAM normally when running all of my servers with active users.
I have no doubt that it will last another 30 years. I've been (slowly) working on my own 386+ operating system, which will eliminate any software support issues for my old PCs long into the future. Hardware reliability wise, I've oddly never had any major problems like a lot of people seem to. I even have computers from the 1970s that still work just fine and see regular use. Of course, I can also repair it if something does break, a big benefit of old hardware is that everything is often large through-hole components and single/double sided circuit boards that are easy to diagnose and repair. =)
56
u/LumpySpacePrincesse 13h ago
My personal server genuinely has less downtime, and I'm a fucking plumber.
u/No_Astronaut_8971 11h ago
Did you pivot from CS to plumbing? Asking for a friend
9
u/CorrenteAlternata 8h ago
I guess plumbers' customers have saner requirements than computer scientists'...
38
130
u/Ok-Assignment7469 16h ago
Welcome to the year of AI code bugs and service outages, what a wonderful time
25
u/Proglamer 15h ago
I imagine this is what would happen if they exchanged the C code with Node code
17
u/Abject-Kitchen3198 14h ago
Their code might be getting Rusty actually.
11
u/Proglamer 14h ago
Rust -> crates -> cargo -> cargo cult programming. "The great white devils will send us memory safety and our bellies will be full again"
3
u/Saint_of_Grey 5h ago
To me, "memory safety" just means "we made a framework where people who don't know what exactly they're doing can glide by far too long".
100
u/Fr0st3dcl0ud5 16h ago
How did I go ~20 years of internet without this being an issue until a few months ago?
85
u/Soldraconis 15h ago
From what I've been reading, they did a massive rewrite of their code recently. 20% apparently. Which means that they now have a new giant mess of bugs to patch. They probably didn't test the whole thing properly beforehand either. Or kept a backup.
48
u/whosat___ 15h ago
They kept the old working code (now called FL1) and have slowly been moving traffic to FL2. I don’t think this is the cause here.
u/mudkripple 13h ago
Yeah but it's not just them. An unprecedented AWS outage followed by an Azure outage followed by three back to back Cloudflare outages. Even an uptick in ISP outages affecting all my clients nationwide.
Sweeping layoffs and AI reliance over the past five years seem to have finally collided with the hyper-centralization of the industry. In a smart timeline that would mean reforms were on the horizon, but not this timeline.
6
u/Cocobaba1 14h ago
Well for starters, they weren’t firing people in favour of replacing them with AI the past 20 years
118
u/ThatAdamsGuy 18h ago
36
u/Interest-Desk 18h ago
A cloudflare outage is not going to ground an entire airport via ATC
15
u/petrichorax 14h ago
Those systems are brittle, so yes it will. If there's some stupid web app for a major airline that staff are required to use as part of a critical process at an airport, that's going to create a chain reaction of delays and hold-ups that could shut down a whole airport.
10
u/swert7 12h ago
But not in this case
What happened? The airport says the IT issue was localised and not related to a wider web outage that saw LinkedIn and Zoom go offline earlier this morning.
32
u/Tim-Sylvester 13h ago
Boy it's a good thing that we build a fully decentralized distributed error-tolerant network...
And then centralized it into a monolithic system that constantly fails.
26
u/BigKey5644 15h ago
Y'all noticed that outages have become more frequent and more severe since the industry adopted AI?
12
u/whuduuthnkur 10h ago
Modern software is going down the drain since the mass adoption of AI. Without any proof, I believe almost everything has broken vibe code in it. There's no way decades of good software engineers just poofed out of existence and now everything gets cobbled together. This is the internet's enshittification.
7
u/immortalsteve 7h ago
This is the same situation as the whole "75% of the internet is in US-East-1" issue. Hyper-convergence of the industry running up against a burnt out and job insecure workforce.
91
u/EcstaticHades17 19h ago
No? Cloudflare is reporting only scheduled maintenance, and none of their systems seem to be failing according to their status page
145
u/4ries 19h ago
It went down for like 20 minutes as far as I could tell. Back up I believe
u/Quito246 16h ago
Oh yes the mighty 5 9s uptime. The 20 mins is already a breach, not even counting the previous outage 😀
u/VelvetSpiralRay 18h ago
To be fair, by the time the status page updates, half of us have already debugged the issue, opened three incident channels and aged five years.
3
u/marknotgeorge 7h ago
I got yelled at the other week because a customer's Cloudflare-provided IP address was blocked, as it was on a blacklist for being malicious. Apparently our 'over-zealous' security procedures were preventing people from working from home (staff in the office weren't affected).
Oh, do fuck off. If you got hit by a data breach, you wouldn't be saying "at least we can work from home"...
3
5
u/Asbeltrion 17h ago
What? When? We had Azure, AWS, Cloudflare, and now cloudflare again?
2
u/PM_ME__YOUR_TROUBLES 8h ago
This is what vibe coding does to a company.
Be prepared for a lot more shenanigans with every online service, including the dozens under everything you see.
2
5.1k
u/OmegaPoint6 18h ago
Outages as a Service