Private by Architecture (Or: How to Stop Worrying and Love Your Own Server)

Tags: AI · Privacy · Architecture · Work Related

When we designed AskDiana and Genius2 in January 2025, we made a fundamental architectural choice. It was the kind of choice that usually involves a lot of expensive coffee and people in suits saying "synergy," but we kept it simple:

Data sovereignty is not negotiable.

Not as a "premium add-on." Not as an optional "privacy pack" buried in a 40-page EULA. It is a core architectural principle, much like "not being on fire" is a core principle of successful space travel.

This decision shaped everything else: how we deploy, which models we support, and why our pricing doesn't require a degree in Galactic Economics to understand.

The Three Deployment Models

We support three deployment patterns, designed to satisfy everyone from the "I just want it to work" crowd to the "I keep my servers in a lead-lined bunker" enthusiasts.

Model 1: Your Cloud (The VPC Deployment)

You deploy everything in your own cloud environment: your AWS/Azure/GCP account, your security groups, and your encryption keys. We provide the software; you provide the digital real estate. Your data never touches our infrastructure or any external AI provider.

Privacy guarantee: Your data never leaves your cloud account. It stays exactly where you can keep an eye on it.

Model 2: On-Premise (The "Air-Gapped" Fortress)

For organisations that view the public internet with the same suspicion one might reserve for a Vogon poetry recital:

  • Deploy on physical hardware you can actually kick.
  • Completely air-gapped if needed.
  • No internet connection required.

One of our security sector clients runs completely air-gapped. Their data physically cannot leave their facility unless someone carries it out in a bucket.

Privacy guarantee: Data never leaves your physical infrastructure.

Model 3: Hybrid (The Tiered Security "Don't Panic" Option)

For organisations that need flexibility:

  • Sensitive queries → Local models.
  • Non-sensitive queries → Cloud models (if you're feeling adventurous).

The system is smart enough to route based on sensitivity markers, much like a Babel fish, but for corporate secrets.

Privacy guarantee: Sensitive data stays local. You control what goes where.
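The routing idea above can be sketched in a few lines. This is purely illustrative; the marker names (`SENSITIVE_MARKERS`), the `Query` shape, and the `route` function are hypothetical stand-ins, not the actual AskDiana routing logic.

```python
from dataclasses import dataclass

# Hypothetical labels that flag a query as sensitive.
SENSITIVE_MARKERS = {"confidential", "internal-only", "pii"}

@dataclass
class Query:
    text: str
    labels: set[str]

def route(query: Query) -> str:
    """Send sensitive queries to a local model; everything else may go to the cloud."""
    if query.labels & SENSITIVE_MARKERS:
        return "local"   # never leaves your infrastructure
    return "cloud"       # only if you opted in to cloud models

print(route(Query("Q3 salary bands", {"pii"})))              # local
print(route(Query("What's the office wifi name?", set())))   # cloud
```

The important property is that the default is deny: anything carrying a sensitivity label stays on local hardware, and cloud routing is the explicit exception.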

The Technical Architecture: Under the Hood

Here is how it works. When you ask AskDiana a question, a series of very fast, very sensible things happens:

  1. Query processed locally on your infrastructure.
  2. Document retrieval from your own data stores.
  3. LLM inference on your hardware.
  4. Genius2 consensus calculated locally.
  5. Response generated and returned.

At no point does your data transit the public internet to our servers. It doesn't go for a stroll, it doesn't look for snacks, and it certainly doesn't talk to strangers.
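The five steps above can be sketched as a single local pipeline. Everything here is a placeholder: `retrieve`, `infer`, `consensus`, and the `LOCAL_MODELS` list are illustrative stand-ins for the real retrieval, inference, and Genius2 consensus components, which are not shown.

```python
# Placeholder model identifiers; in reality these would be locally hosted LLMs.
LOCAL_MODELS = ["model-a", "model-b", "model-c"]

def retrieve(query: str, corpus: list[str]) -> list[str]:
    # 2. Document retrieval from your own data stores (naive keyword match here).
    words = query.lower().split()
    return [d for d in corpus if any(w in d.lower() for w in words)]

def infer(model: str, query: str, docs: list[str]) -> str:
    # 3. LLM inference on your hardware (stubbed out).
    return f"[{model}] answer based on {len(docs)} docs"

def consensus(drafts: list[str]) -> str:
    # 4. Consensus calculated locally; real logic would compare answer content.
    return max(set(drafts), key=drafts.count)

def answer(query: str, corpus: list[str]) -> str:
    # 1. Query processed locally on your infrastructure.
    docs = retrieve(query, corpus)
    drafts = [infer(m, query, docs) for m in LOCAL_MODELS]
    # 5. Response generated and returned -- without ever leaving the machine.
    return consensus(drafts)
```

Note what is absent: there is no network call anywhere in the pipeline, which is the whole point.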

The Model Problem: "But the best models are cloud-only!"

True. GPT-4, Claude, and Gemini currently live on provider infrastructure. Our solution is radical: We make it your choice.

If you want to use cloud models, plug in your API keys and go for it. If you don't? Use open-source models like Llama, Mistral, or Qwen. They run on your hardware and don't report back to a mothership.
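One way to picture that choice is as a backend registry with an explicit opt-in gate. The configuration shape below is an assumption for illustration (the real AskDiana config format may differ), but the gating rule it encodes is the one described above: local backends always work, cloud backends only when you say so.

```python
# Hypothetical backend registry; names, fields, and endpoints are illustrative.
backends = {
    "local-llama": {"type": "local", "model": "llama-3-70b",
                    "endpoint": "http://localhost:8080"},
    "cloud-gpt4":  {"type": "cloud", "model": "gpt-4",
                    "api_key_env": "OPENAI_API_KEY"},
}

def allowed(backend_name: str, allow_cloud: bool) -> bool:
    """A cloud backend is usable only if the operator explicitly opted in."""
    return backends[backend_name]["type"] == "local" or allow_cloud

print(allowed("local-llama", allow_cloud=False))  # True
print(allowed("cloud-gpt4", allow_cloud=False))   # False
```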

The fascinating part: Open-source AI is evolving at a rate that would make a hyper-intelligent shade of the colour blue jealous. In January 2025, they were the "underdogs." By November 2025, the gap is closing so fast you can hear the sonic boom. Plus, with Genius2's consensus approach, multiple "pretty good" models working together can actually beat one "excellent" model, especially when that excellent model wants to phone home.

The GDPR Bonus & The Cost Equation

If you operate in the EU, this architecture is a compliance dream. No transfer agreements, no "where is my data?" existential crises. It's on your hardware. If you want to delete it, you hit delete. Simple.

As for the money:

Traditional Cloud AI: upwards of $75,000/month for high-volume users

Local Execution: hardware upfront + ~$1,000/month electricity

You break even in about eight weeks. After that, you're saving enough money to buy a small moon, or at least a very nice lunch.
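The arithmetic behind "about eight weeks" is back-of-the-envelope. The monthly figures come from the text above; the $150,000 hardware cost is an assumed illustrative value, not a quoted price.

```python
cloud_monthly = 75_000       # cloud AI, high-volume users (from the text)
local_monthly = 1_000        # electricity for local execution (from the text)
hardware_upfront = 150_000   # ASSUMED one-off hardware cost, for illustration

monthly_savings = cloud_monthly - local_monthly           # 74,000 per month
break_even_weeks = hardware_upfront / monthly_savings * 4.33  # ~4.33 weeks/month

print(round(break_even_weeks, 1))  # roughly 8.8 weeks under these assumptions
```

Even if the hardware bill doubles, the break-even point only moves out to around four months, after which the savings compound indefinitely.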

The Trust Architecture (Or: Why You Shouldn't Believe Us)

Here's what we tell clients: "We could be lying. We could be collecting your data despite our promises. How do you know we're not?"

The answer is beautiful: You don't have to trust us.

Because it's on your infrastructure, you can monitor the network traffic. You can verify that there are no external calls. You can review the code. We build systems where you can verify the truth for yourself.
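Auditing that claim can be as simple as checking where outbound connections actually go. The helper below is a hypothetical sketch of one such check: given destination IPs observed in your network logs, it flags anything outside private and loopback ranges. It uses only the Python standard library.

```python
import ipaddress

def external_destinations(dest_ips: list[str]) -> list[str]:
    """Return destinations outside private/loopback ranges (possible leaks)."""
    return [ip for ip in dest_ips
            if not ipaddress.ip_address(ip).is_private]

# On a correctly deployed system, this list should be empty:
print(external_destinations(["10.0.0.5", "192.168.1.20", "127.0.0.1"]))  # []
print(external_destinations(["10.0.0.5", "52.94.236.248"]))  # ['52.94.236.248']
```

Feed it the destination column from your firewall or flow logs: an empty result is your own evidence, gathered on your own hardware, that nothing is phoning home.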

Privacy means you control your data. Always. Completely. Our job is to build AI that respects that control, rather than trying to circumvent it with clever marketing.

Next up: Code generation architecture and how we make AI calculate correctly (without it hallucinating that 2 + 2 = 5 for extremely large values of 2).

I am sure the marketing department have someone tailing me. They are recording my calls. My desk is bugged and I think they are even putting something in my coffee. Please click this link to keep them off my back:

Try it out for free: askdiana.ai