From “Instant Search” to “Patient AI” in Enterprise UX
A recently published and widely cited study from MIT claimed that some 95% of GenAI projects inside companies have produced “zero returns”. But contrary to a lot of the subsequent third-party analysis in the media, I’d suggest we can’t conclude from this study that AI is not technically ready for deployment in the enterprise. After all, the success or failure of a software project depends on both technical excellence and user expectations. And based on my experience, a lot of projects are falling short on the latter.
Let’s take the specific use case of enterprise search, a common target for GenAI-based improvements.
The Magic of Instant Search
Since the dawn of Web 2.0 (roughly 20 years now), web user experiences have trained everyone to expect answers to search queries in milliseconds. The classic example that engineers try to match, of course, is Google Search. You type, hit the big button, and you get answers back in under a second, and that minuscule delay is mostly network latency. The engineers behind the screen are measuring responses in milliseconds; Google engineers of that era hardly needed to count in seconds at all.
GenAI Breaks the Millisecond Illusion
But GenAI changes all of this. A typical single LLM chat response takes on the order of single-digit seconds to complete. Answers that kick in complex reasoning logic can take more than 10 seconds. Agent graphs that involve multiple LLM calls interleaved with tool calls can take even longer.
Historically, when web engineers do performance work, they are conditioned to look at time-to-first-byte. But when LLMs are involved, it’s much more useful to look at time-to-last-byte. Or maybe time-to-first-useful-byte, since chatbots will often kick off their streaming responses with non sequiturs to buy themselves a little more “thinking time”, like “That’s an interesting question, Scott. Let’s try to unpack that in detail.” So what you’re seeing when you interact with any of the mass-market chatbots these days are psychological tricks, like streaming the answer, that obfuscate how long things are really taking on the backend.
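To make the distinction concrete, here’s a minimal sketch of how you might instrument both metrics. Note that stream_tokens here is a hypothetical stand-in for whatever streaming client your LLM provider offers, not a real API:

    import time

    def measure_stream(stream_tokens, prompt: str) -> str:
        # stream_tokens is assumed to be a generator function that yields
        # text chunks from your provider's streaming chat API.
        start = time.monotonic()
        first_chunk_at = None
        chunks = []
        for chunk in stream_tokens(prompt):
            if first_chunk_at is None:
                first_chunk_at = time.monotonic()  # roughly time-to-first-byte
            chunks.append(chunk)
        end = time.monotonic()  # time-to-last-byte
        ttfb = (first_chunk_at or end) - start
        print(f"TTFB: {ttfb:.2f}s, TTLB: {end - start:.2f}s")
        return "".join(chunks)

    # Example with a fake two-chunk stream:
    measure_stream(lambda prompt: iter(["That's a great question. ", "42."]), "why?")

On a real chatbot you’ll typically see a small TTFB (the stream starts quickly) and a much larger TTLB, which is exactly the gap the streaming trick is papering over.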
Chat Doesn’t Solve Everything
At some level, this is quite reasonable. When I ask a colleague a question in a meeting, I don’t expect them to answer back in milliseconds. I’d prefer that they take a second or two to consider their answer. Thinking before speaking is commonly considered a sign of intelligence, after all.
Drop a GenAI call behind a Web 2.0–style search box and your latency will spike, and users will consider your UX a failure. No amount of pressuring the engineers is going to fix this. It’s a fundamental limit of how LLMs work, and it’s very unlikely that software or hardware advances will remove it in the near future. (Apologies to Nvidia.)
At this point, a lot of teams take what feels like the natural next step: “Well, let’s just put the LLM behind a chat interface. People already expect chatbots to take a few seconds to respond, and the streaming answer makes the delay feel less painful.” And indeed, most of the mass-market adoption of LLMs has come through chat UIs. They set psychological expectations for slowness that match or at least approach the limitations of the technical implementation.
Search vs. Chat: Different Jobs, Different Expectations
But even though you can certainly apply LLMs to both chat and search, as interfaces they are not interchangeable. Chat is conversational, exploratory, and tolerant of a little ambiguity. Enterprise search, on the other hand, is transactional and precision-oriented. When someone types into a corporate search box, they’re often looking for a very specific document, record, or fact, and they expect it quickly. If you try to bolt LLMs onto that paradigm without adjusting the UX, you’re setting yourself up for disappointment.
The problem is that, after 20 years of instant search, many product managers and designers have never been asked to build any other kind of search UX. And that is going to lead to a lot of projects that churn and are wrongly perceived as failures.
If you’re simply using retrieval-augmented generation (RAG) as a drop-in replacement for traditional search, you’re doing it wrong and your project is going to fail.
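To caricature the anti-pattern (with stand-in retrieve and generate functions, since the real calls depend on your stack): the retrieval step fits the millisecond budget, but the generation step blows it by orders of magnitude.

    import time

    def retrieve(query: str, top_k: int = 5) -> list[str]:
        # Stand-in for a vector-store lookup; typically tens of milliseconds.
        return [f"doc-{i}" for i in range(top_k)]

    def generate(query: str, context: list[str]) -> str:
        # Stand-in for an LLM completion; typically single-digit seconds.
        time.sleep(3)
        return f"Answer to {query!r}, grounded in {len(context)} documents."

    def handle_search(query: str) -> str:
        # Anti-pattern: the search endpoint blocks on the whole RAG pipeline,
        # so a UI that promises milliseconds delivers seconds instead.
        return generate(query, retrieve(query))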
Designing for Patience, Not Speed
So let’s talk about some user experience changes we need to start making if we’re going to have successful GenAI projects.
Asynchronous UX: The user experience needs to convey that when users work with an AI, they are asking it to run a task, and it’s reasonable for that task to take some time. Think progress bars, status checks, or even mobile or email notifications when high-quality AI work product is available.
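Here’s a minimal sketch of the submit-and-poll pattern, with an in-memory job store standing in for a real task queue and database:

    import threading
    import time
    import uuid

    JOBS: dict[str, dict] = {}  # stand-in for a real task queue plus database

    def slow_ai_task(question: str) -> str:
        time.sleep(10)  # stand-in for a long-running RAG or agent pipeline
        return f"Finished report for {question!r}"

    def submit(question: str) -> str:
        # Return a task id immediately instead of blocking the request.
        task_id = str(uuid.uuid4())
        JOBS[task_id] = {"status": "running", "result": None}

        def work():
            JOBS[task_id] = {"status": "done", "result": slow_ai_task(question)}

        threading.Thread(target=work, daemon=True).start()
        return task_id

    def poll(task_id: str) -> dict:
        # The UI checks status on its own schedule, or you push a
        # notification when the status flips to "done".
        return JOBS[task_id]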
Workflow UX: Interaction with GenAI in general, and with AI agents especially, should be presented more like project management or collaboration with a coworker. For example, after the agent makes a plan, you might show the plan to the user and provide status on individual steps. This reframes the user’s mental model from “query -> result” to “request -> process -> deliverable”.
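A sketch of what that reframing might look like as a data model, with per-step statuses the UI can render as a checklist (the goal, steps, and statuses here are purely illustrative):

    from dataclasses import dataclass, field

    @dataclass
    class Step:
        description: str
        status: str = "pending"  # pending -> running -> done / failed

    @dataclass
    class Plan:
        goal: str
        steps: list[Step] = field(default_factory=list)

        def render(self) -> str:
            # What the user sees: a project-style checklist, not a spinner.
            marks = {"pending": " ", "running": "~", "done": "x", "failed": "!"}
            lines = [f"Goal: {self.goal}"]
            for step in self.steps:
                lines.append(f"[{marks[step.status]}] {step.description}")
            return "\n".join(lines)

    plan = Plan("Summarize Q3 contract changes", [
        Step("Find relevant contracts", "done"),
        Step("Extract changed clauses", "running"),
        Step("Draft summary for legal review"),
    ])
    print(plan.render())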
Transparency and Control UX: If GenAI interactions are shifting toward asynchronous workflows, users need more than just a “please wait” spinner. They need trust that work is happening and visibility into the process. Progress indicators, intermediate results, and clear reasoning build that trust. Just as important, the UX should give users the ability to step in with adjustments, course corrections, or approvals along the way, so that they feel in control rather than at the mercy of a black box.
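One way to sketch that combination of visibility and control: the runner emits progress events the UI can surface as they happen, and pauses for user sign-off before steps flagged as sensitive. (The step format and the approve callback here are assumptions for illustration, not a real agent-framework API.)

    from typing import Callable, Iterator

    def run_with_checkpoints(steps: list[dict],
                             approve: Callable[[str], bool]) -> Iterator[str]:
        # Yield progress events the UI can display live, and ask for
        # approval before any step flagged as needing sign-off.
        for step in steps:
            if step.get("needs_approval") and not approve(step["name"]):
                yield f"skipped: {step['name']} (user declined)"
                continue
            yield f"started: {step['name']}"
            result = step["run"]()  # the actual work for this step
            yield f"finished: {step['name']} -> {result}"

    steps = [
        {"name": "search contract archive", "run": lambda: "42 documents"},
        {"name": "email summary to legal", "needs_approval": True,
         "run": lambda: "sent"},
    ]
    for event in run_with_checkpoints(steps, approve=lambda name: True):
        print(event)  # surface these as live progress updates in the UI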
Stop Your GenAI Project From “Failing”
GenAI isn’t failing because the models don’t work. It’s failing because we’re still trying to wrap 2025 technology in 2005 UX patterns. If we want these projects to succeed, we need to stop pretending that AI is just a natural-language search engine and start designing around what it actually is: a coworker that works at its own pace. That means normalizing waiting, exposing process flows, and giving users control along the way. The teams who learn this shift fastest will be the ones who stop reporting “failed AI projects” and start delivering real enterprise value.
If your team is exploring AI but struggling to make it work for your users, I can help. I work with companies to design and integrate AI systems that fit real-world workflows and deliver value. Let’s talk about how to make your next AI project one of the successful ones.