sajjad ahmed shaaz
AI/ML · Full-Stack · Live

ArDine

AI-powered restaurant management platform

Next.js · Django · PostgreSQL · ChromaDB · Groq API · RAG

Technical co-founder. Restaurants digitise their menus; diners get a per-restaurant AI chatbot that delivers personalised dish recommendations via RAG: an open-source LLM served through the Groq API interprets user intent, and dishes are ranked by cosine similarity over per-dish ChromaDB vector embeddings.


The problem

Restaurant menus are dead. A PDF or a laminated card tells you nothing about what you actually want to eat tonight. We kept asking ourselves: why does a food ordering experience have less intelligence than a basic Spotify recommendation?

ArDine started as a frustration — both my co-founder and I had sat in restaurants squinting at menus, unsure what to order. The idea was simple: make every menu a conversation.

How the RAG pipeline works

Each dish on a restaurant's menu is embedded into a vector using a sentence transformer. These embeddings live in a per-restaurant ChromaDB collection — so queries stay isolated between restaurants.
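A minimal sketch of that isolation idea, with a toy in-memory store and a bag-of-characters "embedding" standing in for ChromaDB and a real sentence transformer (all names and helpers here are illustrative, not the production code):

```python
import math
from collections import defaultdict

def embed(text: str) -> list[float]:
    # Toy stand-in for a sentence-transformer embedding:
    # a bag-of-characters vector, enough to show the plumbing.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# One "collection" per restaurant, mirroring per-restaurant ChromaDB collections.
collections: dict[str, list[tuple[str, list[float]]]] = defaultdict(list)

def add_dish(restaurant: str, dish: str) -> None:
    collections[restaurant].append((dish, embed(dish)))

def query(restaurant: str, text: str, k: int = 3) -> list[str]:
    # Searches only this restaurant's collection, so results
    # never leak across restaurants.
    q = embed(text)
    ranked = sorted(collections[restaurant],
                    key=lambda d: cosine(q, d[1]), reverse=True)
    return [dish for dish, _ in ranked[:k]]
```

The per-restaurant keying is the point: a query against one restaurant can only ever surface that restaurant's dishes.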

When a user sends a message ("something spicy but not too heavy, I don't eat beef"), the LLM (accessed via Groq API for low-latency inference) first extracts intent and constraints from the query. We then do a cosine similarity search over the dish embeddings, re-rank by constraint satisfaction, and pass the top candidates back to the LLM to generate a natural language recommendation.
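In outline, that flow looks something like the sketch below, with a stubbed keyword-based intent extractor in place of the LLM call and token overlap in place of embedding similarity. The dishes, tags, and function names are all made up for illustration:

```python
DISHES = [
    {"name": "beef vindaloo", "desc": "fiery slow-cooked beef curry",
     "tags": {"spicy", "beef", "heavy"}},
    {"name": "chilli paneer", "desc": "spicy cottage cheese stir-fry",
     "tags": {"spicy", "vegetarian"}},
    {"name": "butter chicken", "desc": "rich creamy chicken curry",
     "tags": {"heavy", "chicken"}},
]

def extract_intent(message: str) -> dict:
    # Stub for the LLM step: the real system prompts the Groq-served model
    # to emit structured "wants" and "avoids" from the free-text message.
    msg = message.lower()
    wants = {w for w in ("spicy", "light", "vegetarian") if w in msg}
    avoids = {a for a in ("beef", "heavy", "chicken")
              if f"no {a}" in msg or f"don't eat {a}" in msg or f"not too {a}" in msg}
    return {"wants": wants, "avoids": avoids}

def overlap(a: str, b: str) -> float:
    # Crude stand-in for cosine similarity over embeddings.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def recommend(message: str, k: int = 2) -> list[str]:
    intent = extract_intent(message)
    scored = []
    for dish in DISHES:
        if intent["avoids"] & dish["tags"]:
            continue  # re-rank step: hard-filter dishes violating constraints
        score = overlap(message, dish["desc"]) \
            + 0.1 * len(intent["wants"] & dish["tags"])
        scored.append((score, dish["name"]))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]
```

In the real pipeline the top candidates then go back to the LLM, which phrases the recommendation conversationally rather than returning a bare ranked list.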

The key architectural decision was separating intent extraction from retrieval. Early versions just embedded the raw query and searched — this worked poorly for negations ("no onions") and soft preferences. Explicit intent extraction fixed this.
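A toy illustration of why the raw-query approach fails on negation, again using token overlap as a stand-in for embedding similarity (dish data and helper names are invented): the words in "no onions" pull the query *towards* onion dishes, while an extracted exclusion constraint filters them out.

```python
def tokens(text: str) -> set[str]:
    return set(text.lower().split())

def jaccard(a: set[str], b: set[str]) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

DISHES = {
    "french onion soup": "rich broth with caramelised onions",
    "tomato basil soup": "light tomato soup with fresh basil",
}

def raw_search(query: str) -> str:
    # Naive approach: match the whole query, negation and all.
    # "onions" overlaps with onion dishes, so they rank HIGHER,
    # the opposite of what the diner asked for.
    q = tokens(query)
    return max(DISHES, key=lambda name: jaccard(q, tokens(DISHES[name])))

def extract_negations(query: str) -> set[str]:
    # Stand-in for LLM intent extraction: pull out "no X" constraints.
    words = query.lower().split()
    return {words[i + 1] for i, w in enumerate(words[:-1]) if w == "no"}

def constrained_search(query: str) -> str:
    banned = extract_negations(query)
    q = tokens(query) - banned - {"no"}
    candidates = {n: d for n, d in DISHES.items() if not (banned & tokens(d))}
    return max(candidates, key=lambda n: jaccard(q, tokens(candidates[n])))
```

Running both on "soup with no onions": the raw search returns the onion soup; the constrained search drops it and returns the tomato basil soup.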

What broke, what I learned

The hardest problem wasn't the AI — it was menu ingestion. Restaurants have menus in every format: PDFs, photos of handwritten boards, Word docs, WhatsApp messages. We spent more time on robust menu parsing than on the chatbot itself.

I also underestimated how much latency matters in a conversational UI. Our first setup used a self-hosted model that added 3–4 seconds per response. Switching to Groq brought this under 800ms and the difference in perceived quality was enormous — the same outputs felt smarter just because they arrived faster.

Launching a live product as a first-year taught me that the gap between "demo works" and "product works" is mostly about edge cases nobody in your friend group will hit.