AI Closes the Cloud Exit

Remember the “cloud exit” tech mega-trend that popped up a couple years ago? Of course you do; who can forget this gem and all the debates it kicked off? As we all know, another tech mega-trend popped up a bit more recently: the rise of AI, specifically ChatGPT and other LLMs. And this mega-trend effectively cancels the cloud exit for a large number of organizations.

AI means you can’t “exit” the cloud now even if you want to.

If you are a software engineering leader in 2024, then three things are true: one on the business side, one on the technical side, and one on the finance side.

Business

First, on the business side your CEO wants you to add “AI” to your software product offerings. Like, RIGHT NOW. If your product can’t do this, within a shockingly short time period your customers are going to replace it with something that can.

Technical

On the technical side, you can’t really do game-changing AI features without specialized hardware. Your data scientists will tell you that’s certainly true for model training. But something that is maybe less obvious is that you will most likely want GPUs for model inference as well. In other words, on the user-facing “request path” of your applications.

Finance

Finally, on the financial side, your company or department almost certainly can’t afford to buy that kind of hardware and set it up and run it in an on-premise data center. (Unless, of course, it can, and doing so makes business, technical, security, and financial sense. Those companies do exist, and if you happen to work at one of them, you are exempt from reading the rest of this post.) For everybody else, flip to another browser tab and do a quick bit of research into what a minimal GPU rig will cost you in dollars (assuming you can even get your hands on one these days). Or just trust me that you’re starting in six figures, and for something that vaguely looks like something your SREs wouldn’t be embarrassed to run in production, you’re rapidly blowing by seven.

The Way Out is IN The Cloud

Given all of those constraints, what are you supposed to do? There’s a variously attributed quote, not quite fit for a family blog, the gist of which is that if it flies or floats, rent, don’t buy. The same rule applies here. And who would you be renting GPU hardware from? Why, your friendly neighborhood cloud vendor, of course.

Welcome back to the cloud!

Public cloud vendors have the ability and economy of scale to procure and run impressive GPU fleets, and they make those available to you for, well, a new line item on your monthly cloud bill for sure, but almost certainly faster and more cost-effectively than you could pull that together yourself. That’s absolutely true in the short term for the vast majority of organizations. Maybe not the long term. But as has been observed in other contexts, in the long term we’re all dead.

And sure, you can draw an architecture diagram for a mostly on-premise system that calls out to the cloud for AI training and even inference. But then you’re stuck with some weird hybrid infrastructure. Spoiler alert: hybrid public/private cloud infrastructure is hard. You usually need a very compelling business case and ROI to make it worthwhile (not to mention some experienced engineers).

Cloud Exit was a fun debate and interesting thought experiment for a year or so. But as always, the tech world rapidly moves on to the next thing, and that’s AI. Which means AWS, Google Cloud, and Azure can rest easy knowing you still need them.
