Outside-the-Box Thinking: The Team Culture That Built Production AI in 90 Days

The Meeting That Didn't Happen

January 2025, Dubai. We are in a conference room, whiteboarding the hallucination problem. Standard approaches are not working. The room has that particular atmosphere of people who are genuinely enjoying a problem, which is to say, it smells faintly of coffee and suppressed panic.

In a typical corporate environment, this is the moment where someone would say:

"Let's schedule a follow-up workshop."
"We should research what competitors are doing."
"Maybe we need to bring in consultants."
"Let's create a steering committee."

None of that happened.

Instead, someone asked: "What if we stop trying to make one AI smarter and make multiple AIs reach consensus?"

And instead of "interesting idea, let's commission a six-month feasibility study and present the findings to a board sub-committee," we said: "Let's prototype it."

Ninety days later, Genius2 was in production. The steering committee, had we formed one, would still have been arguing about the scope document.

That is the culture that built this.

The Team Nobody Taught How to Think

I need to be careful here. What follows will sound like bragging. It is, to some extent, bragging. But it is relevant bragging, in that it explains something real about how we work and why that matters.

Our team does not think conventionally. Not because we are smarter than everyone else (though some of us are insufferably clever and know it, which is at least better than being insufferably clever and not knowing it). It is because we have learned to question the assumptions that limit most engineering efforts.

Conventional wisdom said: "You cannot eliminate hallucinations without fine-tuning on massive datasets."
Our question: "Why not consensus across independent models?"

Conventional wisdom said: "Enterprise AI requires cloud infrastructure."
Our question: "Why not on-premise with local models?"

Conventional wisdom said: "LLMs need to calculate because that is what users expect."
Our question: "Why not generate code for calculations and let the computer do the computing?"

See the pattern? We do not accept "that is how it works" as a final answer. We accept it as a starting point for further enquiry, in the same way that "this is how we've always done it" is not an explanation but an invitation to ask why.
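To make the first of those questions concrete, here is a minimal sketch of consensus across independent models. It is not Genius2's implementation. It assumes each model has already returned a short text answer, and it uses plain string similarity from Python's standard library as a stand-in for whatever similarity analysis a production system would use. The function names, the 0.8 threshold, and the majority quorum are illustrative assumptions.

```python
from difflib import SequenceMatcher
from typing import Optional


def similarity(a: str, b: str) -> float:
    """Rough textual similarity in [0, 1]; a stand-in for a real similarity measure."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def consensus(answers: list[str], threshold: float = 0.8,
              quorum: float = 0.5) -> Optional[str]:
    """Return the answer that the largest group of answers agrees with,
    provided that group is a strict majority; otherwise return None."""
    best, best_votes = None, 0
    for candidate in answers:
        # Each answer counts as agreeing with itself.
        votes = sum(1 for other in answers
                    if similarity(candidate, other) >= threshold)
        if votes > best_votes:
            best, best_votes = candidate, votes
    if answers and best_votes / len(answers) > quorum:
        return best
    return None  # No consensus: better to say "not sure" than to guess.


# Hypothetical outputs from three independent models.
agreed = consensus(["Paris", "paris", "Lyon"])
disputed = consensus(["Paris", "Lyon", "Nice"])
```

The design choice worth noticing is the None branch. When the models disagree, the sketch declines to answer rather than picking a favourite, which is the whole point of asking more than one of them.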

The Culture of Constructive Stubbornness

Here is how our team operates. Five principles, none of them particularly complicated, all of them surprisingly rare.

Principle 1: Question Everything
Not destructively. Constructively. "Why do we do it this way?" is not criticism. It is curiosity wearing a hard hat. When someone says "the standard approach is X," we ask "why is that standard?" Sometimes the answer is "because it is genuinely the best way." Great. We use it. Sometimes the answer is "because everyone does it that way." That is when we dig deeper, because "everyone does it" is a description of behaviour, not a justification for it.

Principle 2: Prototype Over Debate
We do not schedule six months of research. We build a proof of concept in a matter of weeks. If it works, we iterate. If it does not work, we throw it out and try something else. Genius2's initial prototype took three weeks. Not three months. Not three quarters. Three weeks. Was it perfect? No. Was it good enough to know the concept worked? Yes. Was that good enough to keep going? Absolutely.

Principle 3: Throw Out Code Fearlessly
I have watched teams spend months on code and then refuse to discard it even when a clearly superior approach has emerged. This is the sunk cost fallacy in software form, and it is responsible for more bad production systems than most people care to admit. Our team has no emotional attachment to code. If a better architecture appears, we rebuild. We rewrote Genius2's similarity analysis three times before finding the approach that actually worked. The first two attempts were not wasted time. They were evidence gathering.

Principle 4: Real Problems Over Theoretical Purity
Academic papers optimise for elegance. Production systems optimise for results. We are not chasing publications. We are solving problems that real organisations have in the real world, which is considerably messier than the world described in academic papers. If the solution is theoretically untidy but works reliably in production, we ship it. Tidiness is a luxury. Reliability is a requirement.

Principle 5: Learn From Every Deployment
Each client teaches us something. Each industry reveals edge cases that no internal testing would have found, because internal testing does not have compliance departments or fleet management systems or a user in the Barcelona office who has discovered seventeen creative ways to break the query parser. We built feedback loops into everything: user behaviour analysis, error pattern tracking, performance monitoring, direct client feedback. The product improves continuously because we are listening continuously.

The Right Kind of Smart

Here is the tricky part. Our team is smart. Really smart. But not in the way people usually mean when they say that.

We are not "memorise algorithms and recite them on demand" smart. That skill is genuinely useful and we are not dismissing it. We are, however, describing something different.

We are "see problems differently" smart. The kind of smart that says:

"Everyone is trying to make LLMs better at maths. What if we simply stop asking them to calculate?"
"Everyone is accepting hallucinations as an inevitable feature of the technology. What if they are not?"
"Everyone is sending their data to cloud providers because that is how AI works. What if it does not have to?"

This is not IQ. It is perspective. And perspective, as it turns out, is more useful in production engineering than the ability to derive a gradient from first principles at short notice.
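The first question in that list, to stop asking the model to calculate, can be sketched in a few lines. This is an illustration under stated assumptions, not our production code: it assumes the model has been prompted to return an arithmetic expression as text rather than a final number, and the evaluate function below is a hypothetical helper that computes the result while refusing anything that is not plain arithmetic.

```python
import ast
import operator

# Only these operations are allowed; everything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def evaluate(expression: str) -> float:
    """Evaluate a model-generated arithmetic expression without using eval()."""

    def walk(node):
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


# Hypothetical model output for "three payments of 1200 plus a 450 fee".
generated = "1200 * 3 + 450"
total = evaluate(generated)
```

The model does the part it is good at, turning a question into an expression, and the computer does the part it has been good at for rather longer.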

The Diversity of Thought

Our team includes engineers from enterprise backgrounds who understand production requirements, startup veterans who understand that "done and deployed" beats "perfect and scheduled," academic researchers who understand theoretical foundations, and industry practitioners who understand real-world constraints well enough to know which theoretical foundations can be safely ignored in a pinch.

Crucially, none of us came from "traditional AI research." Most of us arrived from other disciplines entirely: systems architecture, database optimisation, security engineering, distributed systems. This cross-pollination creates insights that AI-native teams sometimes miss, because AI-native teams have the advantage of deep expertise and the corresponding disadvantage of not being able to see past it.

Fresh eyes are not always better than experienced eyes. But they occasionally notice the thing the experienced eyes have learned to filter out as background noise.

The Humility Component

Here is where I need to be honest about our limitations, because no one should take seriously a team that claims not to have any.

We are good at questioning assumptions, rapid prototyping, production deployment, and real-world problem solving. We are considerably less good at formal academic research, publishing papers, following established processes, and slow methodical planning of the kind that produces excellent Gantt charts and very little code.

Different skills for different contexts. If you need a peer-reviewed paper, hire academics. If you need production AI in 90 days, call us. These are not competing claims. They are different jobs requiring different people, and the confusion between them is responsible for a substantial amount of corporate disappointment on both sides.

The January-to-April Journey

How team culture translated to execution, week by week:

Week 1 (Dubai): Whiteboarding, wild ideas, no sacred cows. Consensus approach emerges from a room where no one is allowed to say "that is not how it is done" without immediately following it with "and here is why that matters."

Weeks 2-4: Prototype sprint. Everyone contributes ideas. The best concepts survive. The rest are discarded without ceremony or grief.

Weeks 5-8: Refinement based on initial testing. Multiple rewrites. No attachment to old code. The goal is the best product, not the preservation of our initial cleverness.

Weeks 9-10: Alpha deployment with a friendly client who had agreed to be honest with us. Real feedback. Real problems. Real moments of "ah, we did not think of that."

Weeks 11-12: Architecture revision based on that feedback. The resulting design is better than the initial one, which is exactly what real-world testing is for.

Week 13: Production ready. Not perfect. But good enough to deploy and improve, which is the correct standard for production systems. Perfect is not a shipping state. It is a direction of travel.

Week 14 onwards: Continuous improvement based on real usage. Which is where we still are, and expect to remain.

This only works with a team that moves fast without breaking things critically, learns from failure without assigning blame, iterates without ego, and ships without demanding perfection as a precondition.

The Mistakes We Made

The "perfect team" narrative is fiction, and fiction belongs in novels, not post-mortems. We made mistakes. Here are the ones worth mentioning.

Mistake 1: Initial timeout was too aggressive. It killed legitimate queries that simply required more time to resolve. Lesson: test edge cases with real data, because real data is more creative than any test suite you will ever write.

Mistake 2: First prompt management system was too comprehensive. Users did not want comprehensive. They wanted simple. Lesson: users want answers, not architecture.

Mistake 3: Early caching strategy was too aggressive. It served stale data with full confidence. Lesson: smart invalidation beats aggressive caching, every time, without exception.

Mistake 4: We assumed users would want to see technical details. Most users want the answer, not the reasoning. They trust the mechanic to fix the car. They do not want to watch. Lesson: what users say they want and what they actually want are related but not identical.

The key, in each case: we caught these quickly, fixed them quickly, and moved on without lengthy post-mortems, blame assignment, or documentation designed to establish that no one in particular was at fault. "That did not work. Try this instead." Seven words. Entire management philosophy.
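For the curious, the lesson from Mistake 3 can be sketched as follows. This is one simple form of invalidation, assuming the underlying data source can report a version number (or timestamp) that changes whenever the data does. The class and method names are hypothetical, and this is not a description of how Genius2 caches anything.

```python
class VersionedCache:
    """Cache entries are only served while the source data version is unchanged."""

    def __init__(self):
        self._entries = {}  # key -> (version, value)

    def put(self, key, version, value):
        """Store a value together with the data version it was computed from."""
        self._entries[key] = (version, value)

    def get(self, key, current_version):
        """Return the cached value, or None if it is missing or stale."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        version, value = entry
        if version != current_version:
            # The data has changed since this answer was cached: evict it
            # rather than serve it with full confidence.
            del self._entries[key]
            return None
        return value


cache = VersionedCache()
cache.put("fleet mileage this month", version=1, value="12,400 km")
fresh = cache.get("fleet mileage this month", current_version=1)
stale = cache.get("fleet mileage this month", current_version=2)
```

A miss costs a recomputation. A stale hit costs the user's trust, which is considerably more expensive to rebuild.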

The Remote Work Factor

An interesting detail: our team is distributed. Vilnius, Dubai, and several other locations whose time zones do not obviously overlap. We do not have daily standups. We do not have sprint planning meetings. We do not have formal processes of the kind that generate Confluence pages nobody reads.

What we have: asynchronous communication that respects the fact that it is currently 2am in someone's time zone, trust that people will deliver without being watched, shared access to everything, and direct lines to everyone. The best idea wins regardless of who has it or where they are. This is not a radical principle. It is, however, one that a surprising number of organisations find difficult to implement.

Modest Smart, Not Arrogant Smart

Arrogant smart says: "We know better than everyone." Modest smart says: "We see this differently. Perhaps we are onto something. Let us find out."

Arrogant smart dismisses conventional approaches without understanding them. Modest smart understands them thoroughly, then asks whether there is a better way, and accepts that sometimes the answer is no.

Arrogant smart does not listen to clients, because the clients do not understand the technology. Modest smart knows that every deployment teaches us something the clients could not have told us in advance, and that we could not have predicted without them.

We are confident in our ability to solve problems. We are humble about how much we do not know, because the amount we do not know is, on any reasonable estimate, enormous. This is not a contradiction. It is what competence actually looks like.

What This Means If You Are Building AI Systems

Hire for perspective, not just credentials. The best AI engineers do not necessarily come from AI backgrounds. The best insights occasionally come from people who see the problem without the accumulated assumptions of the field. This is not always true. But it is true often enough to be worth considering.

Create a culture of constructive questioning. "Why do we do it this way?" should be encouraged, not threatening. If the question has a good answer, give it. If it does not, that is information worth having.

Value speed of iteration. Months of planning often produces worse results than weeks of prototyping, because months of planning optimises for the world as imagined, while weeks of prototyping reveals the world as it actually is.

Remove attachment to code. The best solution might require discarding your initial approach entirely. This is fine. The initial approach was not wasted. It was the means by which you discovered what the actual approach should be.

Listen to production. Real users reveal things internal testing will not find. This is not a criticism of internal testing. It is an acknowledgement that the real world is larger and stranger than any test environment.

What's Next

Our team is already working on the next generation: expanding Genius2 to 40 or more models, industry-specific model optimisation, advanced agentic integration, and privacy-preserving federated learning. Whether any of this arrives in the order listed is a question we are wisely choosing not to answer definitively.

More importantly, we are staying curious. Staying humble. Staying willing to question everything, including our own answers.

Because the moment you think you have figured it all out is the moment you stop improving. And we have not figured it all out. We have figured out quite a lot, which is not the same thing, and we are not confusing the two.

We are just getting started. The universe, typically, has something to say about statements like that. We look forward to finding out what.

Next: The future of enterprise AI and where this is all heading.
