Engineering case study

How Roami finds your hobby

Roami is not a directory you scroll through. It is an AI-powered matching system that reads what you actually mean and surfaces hobbies that fit your life — not just your keywords.

222 hobbies · 4-step pipeline · Claude Haiku NLP · Semantic embeddings

The pipeline

What happens when you search

A real trace through the system for a live query.

Example query

“I'm burned out and want something physical I can be bad at”

Understanding You

Claude Haiku parses the free-text query and extracts structured intent. It reads emotional state, infers physical and social preferences, and produces a typed object validated with Zod.

// Extracted intent from NLP step (Claude Haiku)
{
  physicalLevel:  "active",
  pressureLevel:  "zero",
  screenFree:     true,
  queryIntent:    "hybrid",
  vibePhrase:     "burned out, want something physical"
}
Claude Haiku · ~200 ms · Zod schema validation · Structured output
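The contract of this step can be sketched as a type plus a runtime check. The pipeline validates with Zod; the version below is a dependency-free stand-in written as a plain TypeScript type guard, so it runs without installing anything. Field names follow the trace above. The names `Intent`, `isIntent`, and `parseIntent` are hypothetical, as are any enum values not shown elsewhere on this page (for example "moderate" as a physical level).

```typescript
// Allowed values per field. "moderate" as a physical level is an assumption;
// "vibe" and "criteria" are taken from the query types described on this page.
const PHYSICAL_LEVELS = ["sedentary", "moderate", "active"] as const;
const PRESSURE_LEVELS = ["zero", "low", "moderate"] as const;
const QUERY_INTENTS = ["vibe", "criteria", "hybrid"] as const;

type Intent = {
  physicalLevel: (typeof PHYSICAL_LEVELS)[number];
  pressureLevel: (typeof PRESSURE_LEVELS)[number];
  screenFree: boolean;
  queryIntent: (typeof QUERY_INTENTS)[number];
  vibePhrase: string;
};

// True when `value` is one of the allowed string literals.
function isOneOf<T extends string>(allowed: readonly T[], value: unknown): value is T {
  return typeof value === "string" && (allowed as readonly string[]).indexOf(value) !== -1;
}

// Runtime check standing in for the Zod schema.
function isIntent(value: unknown): value is Intent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const phrase = v.vibePhrase;
  return (
    isOneOf(PHYSICAL_LEVELS, v.physicalLevel) &&
    isOneOf(PRESSURE_LEVELS, v.pressureLevel) &&
    isOneOf(QUERY_INTENTS, v.queryIntent) &&
    typeof v.screenFree === "boolean" &&
    typeof phrase === "string" &&
    phrase.length > 0
  );
}

// Parse the model's raw JSON text and fail loudly on anything malformed.
function parseIntent(rawJson: string): Intent {
  const parsed: unknown = JSON.parse(rawJson); // throws SyntaxError on invalid JSON
  if (!isIntent(parsed)) {
    throw new Error("Intent failed validation");
  }
  return parsed;
}
```

Passing the object from the trace above through `parseIntent` returns it unchanged; an unknown `physicalLevel` or a missing field throws instead of flowing into the scoring step.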

Semantic Search

The vibe phrase is embedded and compared against every hobby's precomputed embedding. This catches conceptual matches that keyword search misses: “burned out” finds hobbies tagged low-pressure and restorative, even if neither word appears in the description.
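A minimal sketch of the comparison, assuming cosine similarity as the distance measure (the page does not say which measure or embedding model is used). `cosineSimilarity`, `rankBySimilarity`, and `EmbeddedHobby` are hypothetical names.

```typescript
// Cosine similarity between two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

type EmbeddedHobby = { name: string; embedding: number[] };

// Score every hobby against the query vector, highest similarity first.
function rankBySimilarity(
  queryEmbedding: readonly number[],
  hobbies: readonly EmbeddedHobby[]
): { name: string; similarity: number }[] {
  return hobbies
    .map((h) => ({ name: h.name, similarity: cosineSimilarity(queryEmbedding, h.embedding) }))
    .sort((x, y) => y.similarity - x.similarity);
}
```

With only 222 hobbies, a full linear scan like this is cheap, so no approximate nearest-neighbour index is needed.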

Vibe query

Semantic weight: 85%

Embeddings carry most of the signal — what the person feels matters more than what they typed.

Criteria query

Structured weight: 65%

Hard constraints (“cheap, solo, indoor”) are better served by attribute matching than semantic distance.

Scoring Engine

Every hobby is scored across 7 dimensions, and those dimension scores are then blended with the semantic similarity score. The weights shift with queryIntent: a vibe query leans on embeddings, while a criteria query leans on attributes.
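The blend can be sketched as a weighted sum. The 85% semantic weight for vibe queries and the 65% structured weight for criteria queries come from this page; the even split for "hybrid" and the assumption that both inputs are normalised to 0..1 are guesses, and `blendScores` is a hypothetical name.

```typescript
type QueryIntent = "vibe" | "criteria" | "hybrid";

// Share of the final score given to semantic similarity, per intent.
// 0.85 (vibe) and 0.35 (criteria, i.e. 65% structured) come from the text;
// 0.5 for "hybrid" is an assumption.
const SEMANTIC_WEIGHT: Record<QueryIntent, number> = {
  vibe: 0.85,
  criteria: 0.35,
  hybrid: 0.5,
};

// Both inputs are assumed to be normalised to the range 0..1.
function blendScores(intent: QueryIntent, semantic: number, structured: number): number {
  const w = SEMANTIC_WEIGHT[intent];
  return w * semantic + (1 - w) * structured;
}
```

A hobby with a strong semantic match but weak attribute match therefore ranks well for a vibe query and poorly for a criteria query, using the same two input scores.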

7 scoring dimensions

Physical: Does the activity level match what the user wants?
Social: Solo, group, or flexible?
Cost: Does the start-up cost fit within the stated bracket?
Time: Session length vs available time
Setting: Indoor / outdoor / both
Pressure: Competitive pressure tolerance
Screen-free: Hard filter when the user needs to be offline
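One way the structured side could work, as a sketch only: screen-free acts as a hard filter, and the other six dimensions each contribute a match or a miss. Exact-match scoring, equal weighting across dimensions, and the names `Attributes`, `Preferences`, and `structuredScore` are all assumptions; the real engine may grade partial matches.

```typescript
type Attributes = {
  physical: string;   // e.g. "sedentary" or "active"
  social: string;     // "solo" | "group" | "optional"
  cost: string;       // cost bracket label
  time: string;       // session-length bracket label
  setting: string;    // "indoor" | "outdoor" | "both"
  pressure: string;   // "zero" | "low" | "moderate"
  screenFree: boolean;
};

// Every preference is optional: the user may not have stated it.
type Preferences = Partial<Attributes>;

// Hobby-side values that satisfy any stated preference on that dimension.
const FLEXIBLE: Record<string, string | undefined> = { social: "optional", setting: "both" };

const SOFT_DIMENSIONS = ["physical", "social", "cost", "time", "setting", "pressure"] as const;

// Returns null when the screen-free hard filter excludes the hobby,
// otherwise the fraction (0..1) of stated preferences the hobby satisfies.
function structuredScore(hobby: Attributes, prefs: Preferences): number | null {
  if (prefs.screenFree === true && !hobby.screenFree) return null;

  let stated = 0;
  let matched = 0;
  for (let i = 0; i < SOFT_DIMENSIONS.length; i++) {
    const dim = SOFT_DIMENSIONS[i];
    const wanted = prefs[dim];
    if (wanted === undefined) continue; // unstated preferences neither help nor hurt
    stated++;
    if (hobby[dim] === wanted || hobby[dim] === FLEXIBLE[dim]) matched++;
  }
  // With nothing stated there is nothing to violate, so the hobby scores 1.
  return stated === 0 ? 1 : matched / stated;
}
```

Returning `null` rather than a low score keeps excluded hobbies out of the ranking entirely, which is what distinguishes a hard filter from a scoring dimension.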

Your Matches

Results are bucketed into tiers — not just sorted by a raw number. A hobby scoring 40+ is a great fit; 20+ is worth trying. Each match comes with a personalised explanation tied to what the user actually said, not a generic blurb.
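The tiering rule can be sketched as a pair of thresholds. The cut-offs 40 and 20 are the ones quoted above, and the scale they sit on is not stated here. Dropping anything below 20 is an assumption, as are the names `tierFor` and `bucketByTier`.

```typescript
type Tier = "Great fit" | "Worth trying";

// Thresholds quoted in the text; the scale they sit on is not stated.
const GREAT_FIT_MIN = 40;
const WORTH_TRYING_MIN = 20;

// Returns null for scores below both thresholds (assumed to be left out of results).
function tierFor(score: number): Tier | null {
  if (score >= GREAT_FIT_MIN) return "Great fit";
  if (score >= WORTH_TRYING_MIN) return "Worth trying";
  return null;
}

type Scored = { name: string; score: number };

// Group scored hobbies by tier, keeping the best score first within each tier.
function bucketByTier(results: readonly Scored[]): Record<Tier, Scored[]> {
  const buckets: Record<Tier, Scored[]> = { "Great fit": [], "Worth trying": [] };
  const sorted = results.slice().sort((a, b) => b.score - a.score);
  for (let i = 0; i < sorted.length; i++) {
    const tier = tierFor(sorted[i].score);
    if (tier !== null) buckets[tier].push(sorted[i]);
  }
  return buckets;
}
```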

Bouldering

Great fit

Physical, pressure-free — you can literally not care how hard the route is. No score, no leaderboard. Just you and the wall.

0.88

Open-water swimming

Worth trying

Deeply physical and completely screen-free. Cold water is a reset button that is difficult to replicate indoors.

0.62

Design system

Deep Tay palette

Named after the Tay estuary in Scotland. Ink and Tay anchor the dark sections; Copper brings warmth and focus; Cream breathes as the primary background.

Ink

#111a24

Tay

#1a2832

Copper

#c4956a

Cream

#faf6f1

Pine

#5a7247

River

#2a5a5a

Stone

#e0d9cf
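For reference, the palette can be captured as typed design tokens. The hex values are copied from the list above; the object name `deepTay`, the helper `toCssVariables`, and the `--color-` prefix are hypothetical naming choices, not the site's actual token names.

```typescript
// Deep Tay palette as typed design tokens; hex values are copied from the list above.
const deepTay = {
  ink: "#111a24",
  tay: "#1a2832",
  copper: "#c4956a",
  cream: "#faf6f1",
  pine: "#5a7247",
  river: "#2a5a5a",
  stone: "#e0d9cf",
} as const;

// Union of the token names: "ink" | "tay" | "copper" | ...
type DeepTayColor = keyof typeof deepTay;

// Emit the palette as CSS custom properties, one declaration per line.
function toCssVariables(palette: Record<string, string>): string {
  return Object.keys(palette)
    .map((name) => `--color-${name}: ${palette[name]};`)
    .join("\n");
}
```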

Typography

Playfair Display

Weight 400 — headings, display text, brand voice. Serif warmth without formality.

System UI body

system-ui stack — fast to render, zero network cost, legible across all platforms.

Fira Code — data font

Used for scores, code blocks, and numeric output. Tabular numerals for alignment.

Component tokens

Outline badge · Muted badge
Card — rounded-2xl

Noise texture: 2.5% opacity SVG turbulence overlay on bg-background sections — adds tactile depth without visible grain.

Architecture

Key decisions

Every architectural choice is a tradeoff. Here is why we made each of these.

200 ms · Zod-validated

Why Haiku for NLP

Claude Haiku returns structured JSON in ~200 ms — fast enough that it feels instant to the user. We validate the output with Zod, so malformed responses fail loudly rather than silently. GPT-4 would be roughly 4× slower and 10× more expensive for this narrow extraction task.

Adaptive weighting

Why blend semantic + structured scoring

"I need to touch grass" carries meaning that keywords alone miss — that is semantic search territory. But "cheap, solo, indoor" is a constraint filter that embeddings blur. Vibe queries weight semantic at 85%; criteria queries flip to 65% structured. The blend adapts to what the user actually asked.

Zero cold starts

Why static JSON, not a database

222 hobbies fit comfortably in memory. Importing the JSON at build time means zero cold starts, no connection pools, and instant deploys. RAG would add an embedding pipeline, a vector store, and an ingestion cron job — all to replace a 15 KB file. We chose the option with fewer moving parts.
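A sketch of what the static approach looks like in code, under the assumption that the data is a flat array of records. In a real build the array would come from a JSON import (the path below is hypothetical and needs the `resolveJsonModule` compiler option); it is inlined here so the example runs on its own. The two entries and their field values are illustrative only.

```typescript
type HobbyRecord = { id: string; name: string; category: string };

// In a real build this array would come from a JSON import, for example
//   import hobbies from "./hobbies.json";   // hypothetical path
// Entries below are illustrative, not copied from the Roami dataset.
const hobbies: HobbyRecord[] = [
  { id: "bouldering", name: "Bouldering", category: "sport" },
  { id: "open-water-swimming", name: "Open-water swimming", category: "sport" },
];

// Built once at module load; every request afterwards is a plain object lookup.
const hobbiesById: Record<string, HobbyRecord> = {};
for (let i = 0; i < hobbies.length; i++) {
  hobbiesById[hobbies[i].id] = hobbies[i];
}

// Returns undefined for an unknown id instead of throwing.
function getHobby(id: string): HobbyRecord | undefined {
  return hobbiesById[id];
}
```

Because the data is bundled with the code, there is no connection to open and nothing to warm up before the first request.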

Zero friction

Why no accounts

Friction kills discovery. Every extra step between "I'm curious" and "here are your hobbies" loses people. No sign-up means the tool is available in seconds, shareable via a single link, and requires zero trust from a first-time user. Authentication can always be added later; first impressions cannot be taken back.

The data

Curated, not scraped

Every hobby was written by hand. Not aggregated, not auto-generated. Each entry has a first-session narrative so the user knows exactly what their first hour looks like.

222

Hobbies

9

Categories

15+

Attributes per hobby

7

Scoring dimensions

What each hobby includes

Name, tagline, description
Cost bracket + specific estimate (£)
Time per session bracket
Physical level (sedentary → active)
Social mode (solo / group / optional)
Setting (indoor / outdoor / both)
Pressure level (zero / low / moderate)
Screen-free flag
Tags for semantic indexing
Why it sticks — the retention hook
First session narrative
Starting points (kit / class / community)
Hidden gem flag
Category (craft / art / sport / mind / etc.)
Precomputed embedding vector
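The field list above maps naturally onto a record type. The sketch below is one possible shape: every property name is an assumption, bracket fields are left as plain strings because their allowed values are not listed here, and `checkHobby` is a hypothetical load-time sanity check rather than part of the described pipeline.

```typescript
// Hypothetical record shape mirroring the field list above.
interface Hobby {
  name: string;
  tagline: string;
  description: string;
  costBracket: string;
  costEstimate: string;                       // specific estimate in pounds
  timePerSession: string;
  physicalLevel: string;                      // ranges from "sedentary" to "active"
  socialMode: "solo" | "group" | "optional";
  setting: "indoor" | "outdoor" | "both";
  pressureLevel: "zero" | "low" | "moderate";
  screenFree: boolean;
  tags: string[];
  whyItSticks: string;
  firstSession: string;
  startingPoints: { kit?: string; class?: string; community?: string };
  hiddenGem: boolean;
  category: string;                           // e.g. "craft", "art", "sport", "mind"
  embedding: number[];
}

// Cheap load-time sanity check: key text fields are non-empty and the
// embedding has the expected length (the dimension is passed in, not assumed).
function checkHobby(h: Hobby, embeddingDim: number): string[] {
  const problems: string[] = [];
  if (h.name.trim() === "") problems.push("name is empty");
  if (h.firstSession.trim() === "") problems.push("firstSession is empty");
  if (h.tags.length === 0) problems.push("tags are empty");
  if (h.embedding.length !== embeddingDim) problems.push("embedding has wrong length");
  return problems;
}
```

Running a check like this over all entries at build time would catch a hand-written record with a missing first-session narrative or a stale embedding before it ships.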

Try it yourself

Type what you're in the mood for — or how you're feeling. The pipeline does the rest.
