I’ve generally been skeptical of tech hype cycles for most of my career (I suppose because of my science background).
I saw the mobile revolution hyped in 2008, then watched it transform everything over the following decade. I saw blockchain proclaimed as the future of every industry simultaneously, only to settle into a handful of real-world use cases and quietly abandon its claim to everything else. I saw big data become a consulting category before most teams had figured out what they actually wanted to do with the data they already had.
Each of these cycles followed a recognizable pattern: a genuine technical capability is discovered, it gets extrapolated far beyond what it can actually do, it generates enormous noise, and then, once the noise dies down, it changes things, usually more slowly and more narrowly than the initial hype suggested.
AI is in that cycle right now. And engineers who dismiss it because of the euphoria may be making a mistake.
The euphoria is real. And so is the underlying capability.
When I look at what large language models can actually do—not what demonstrations suggest, but what they reliably do in production—I see something technically different from previous cycles of hype.
Large language models can parse and generate natural language with a fluency that makes them genuinely useful as intermediaries between humans and structured information. This is a new capability, not a faster version of an existing one. And the use cases that stem from it—retrieval systems that understand query intent, interfaces that translate between technical systems and human language, tools that extract relevant information from large unstructured corpora—are legitimately useful in enterprise environments.
The hype focuses on extrapolation. The underlying capability is real.
Why engineers need to pay special attention
Consumer AI applications are interesting. But it’s in enterprise AI applications where the engineering challenges become serious.
Consumer applications can tolerate some inconsistency. A recommendation system that’s wrong 10% of the time is annoying. A medical documentation system that presents incorrect clinical details 1% of the time is a patient safety issue. A recruitment tool that consistently disadvantages certain demographics isn’t a user experience problem; it’s a legal and ethical crisis.
Enterprise environments have three characteristics that consumer environments often lack: high risk per decision, regulatory exposure, and the need for auditability. An enterprise AI system must be explainable, not just to engineers, but also to auditors, regulators, and the people whose outcomes it affects. It must be consistent enough to be tested and validated. It must fail predictably, not unpredictably.
These requirements don’t make enterprise AI less interesting; they just make it more difficult. The engineering challenges that arise when trying to build reliable, auditable, and secure AI systems for production are, in reality, large-scale unsolved problems.
Transferable and non-transferable skills
Engineers transitioning to AI from traditional software development have real advantages that are often underestimated.
Production experience is paramount. Understanding how systems fail, why distributed components interact unexpectedly, what observability truly requires, and how to design for degraded performance are transferable skills that most AI-native engineers lack. The software engineering culture that deems the “it works on my machine” response unacceptable is precisely the culture that enterprise AI development needs and often lacks.
Skepticism toward demos is another factor. Engineers who have deployed software in production know that the gap between a working prototype and a reliable system is where most of the real work lies. This instinct proves protective in an AI context where it’s easy to produce impressive demos and difficult to build reliable systems.
What is not easily transferable, however, is the assumption of determinism. Production software is built on the expectation that the same input produces the same output. AI systems are probabilistic. This is where the mental model needs to be adjusted—not abandoned, but adapted. The question shifts from “Does this work?” to “Under what conditions does it work reliably enough to ship?”
This shift can be learned, though it is challenging. It takes time to internalize, but it doesn’t represent a complete break from how meticulous engineers already conceive of failure modes and edge cases.
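One way to make that shift concrete is in how you test. Instead of asserting that a single output equals an expected value, you measure a success rate over repeated calls and gate releases on a threshold. A minimal sketch, where `call_model` is a hypothetical stub simulating a nondeterministic model:

```python
import random

random.seed(0)  # fixed seed so this sketch is reproducible

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; simulates nondeterministic output."""
    return random.choice(["4", "4", "4", "four"])

def is_acceptable(output: str) -> bool:
    """Task-specific acceptance check: the answer must be the digit 4."""
    return output.strip() == "4"

def reliability(prompt: str, trials: int = 200) -> float:
    """Estimate the success rate over repeated calls instead of asserting one output."""
    successes = sum(is_acceptable(call_model(prompt)) for _ in range(trials))
    return successes / trials

rate = reliability("What is 2 + 2? Answer with a single digit.")
# Release gate: the observed rate must clear a threshold chosen for the use case,
# rather than a deterministic equality check on one run.
assert rate >= 0.6
```

The threshold itself becomes an engineering decision: a consumer feature might ship at 90%, while a clinical documentation system demands far stricter gates plus human review.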
What I foresee for 2024
Some developments I consider more relevant than most of the noise:
Retrieval-Augmented Generation (RAG) is becoming the dominant pattern in enterprises. Instead of relying solely on model knowledge (which is static, unverifiable, and potentially obsolete), RAG systems retrieve relevant information at inference time and ground their responses in that retrieved context.
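The pattern reduces to two steps: retrieve passages relevant to the query, then build a prompt that constrains the model to that retrieved context. A minimal sketch using a toy word-overlap retriever (real systems use vector embeddings, but the shape is the same; the corpus here is invented for illustration):

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the query.
    Production systems use embedding similarity, but the interface is identical."""
    query_words = set(query.lower().split())
    ranked = sorted(corpus,
                    key=lambda doc: len(query_words & set(doc.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str]) -> str:
    """Ground the model's answer in retrieved text rather than parametric memory."""
    context = "\n".join(f"- {p}" for p in passages)
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\nQuestion: {query}")

corpus = [
    "The refund policy allows returns within 30 days.",
    "Shipping is free for orders over 50 euros.",
    "Support is available weekdays from 9 to 17.",
]
query = "What is the refund policy?"
prompt = build_prompt(query, retrieve(query, corpus))
```

The prompt that results cites the retrieved passages directly, which is what makes RAG outputs auditable: you can inspect exactly which sources the answer was conditioned on.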
The EU AI Act will come into force in August 2024. For teams developing systems in regulated fields (education, healthcare, HR, finance), the Act’s requirements regarding documentation, human oversight, and conformity assessment will influence architectural decisions in ways most teams haven’t yet considered.
Evaluation is the unsolved problem. How do you know whether an AI system is working correctly? For deterministic software, you write tests. For AI systems, the question is far harder. Defining what counts as good performance, measuring it consistently, and detecting when it degrades are open engineering problems; unlike database indexing or API design, there are no established solutions. Teams developing rigorous evaluation frameworks are doing essential work.
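The skeleton of such a framework is a versioned suite of cases, a scoring function, and a regression gate against a previous run. A minimal sketch, assuming a hypothetical `system_under_test` stub and keyword-presence scoring (a deliberately weak proxy; real graders are a hard design problem in themselves):

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    input: str
    expected_keywords: list[str]  # weak proxy for "correct"; real graders vary widely

def system_under_test(text: str) -> str:
    """Hypothetical stand-in for the AI system being evaluated."""
    return text.upper()

def score(case: EvalCase, output: str) -> float:
    """Fraction of expected keywords present in the output (0.0 to 1.0)."""
    hits = sum(1 for kw in case.expected_keywords if kw.upper() in output.upper())
    return hits / len(case.expected_keywords)

def evaluate(cases: list[EvalCase]) -> float:
    """Mean score across the suite; tracked over time to catch regressions."""
    return sum(score(c, system_under_test(c.input)) for c in cases) / len(cases)

suite = [
    EvalCase("summarize the refund policy", ["refund"]),
    EvalCase("list support hours", ["support", "hours"]),
]
baseline = 1.0  # assumed: score from a previously validated run
current = evaluate(suite)
# Regression gate: flag degradation beyond a tolerance, not exact equality.
assert baseline - current <= 0.05
```

The hard parts this sketch elides—building a representative suite, choosing a grader that correlates with real quality, deciding the tolerance—are precisely the open problems the paragraph above describes.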
What I’m not doing
I’m not abandoning everything I know to become an AI developer.
The experience I’m accumulating in production systems, software architecture, and building systems that must work reliably for thousands of users isn’t a liability in this transition. It’s the most relevant thing I bring to the table.
What I’m doing is learning the specific differences between AI systems and traditional software: probabilistic behavior, recovery architectures, evaluation challenges, the regulatory framework, and applying the engineering discipline I’ve acquired to an area that desperately needs it.
Some teams currently developing enterprise AI are composed mostly of people who understand the models, but not the production systems. The teams that will build reliable enterprise AI will be those that master both.
That’s where I’m focusing my attention.
Conclusion
I think dismissing AI because the hype is excessive is the wrong response. The underlying capability is real, the business use cases are legitimate, and the engineering problems are hard enough to be worth solving.
What the industry needs are engineers who can bring rigor to systems that are currently conceived more as research projects than as software products. That’s a problem that experience solves.