Why
Personal websites are mostly static. You browse them once, read a few things, then leave. I wanted mine to feel more alive — something you could actually interact with, not just read.
The idea was simple: a small chat bubble in the corner. You ask it something about me or my work, it answers. If you want to leave feedback, it passes that on too. No forms, no email links, just a conversation.
What I didn't expect was how many iterations it would take to get right.
Starting Simple
The first version was a basic chatbox. Single API route, sends a message, gets a response, renders it. No tools, no streaming. It worked, but it had obvious problems.
The main one: the model would confidently make up details about my projects and posts. Ask it "what has ZJ been working on?" and it would hallucinate a project that didn't exist. That's worse than saying nothing.
The fix was obvious in hindsight — give the model tools to look things up instead of relying on what's baked into its weights.
Going Agentic
The API route now uses Vercel's ai SDK with streamText and a set of tools the model can call before it responds:
```ts
const result = streamText({
  model: litellm("claude-haiku-4-5"),
  system: SYSTEM_PROMPT,
  messages,
  maxOutputTokens: 200,
  stopWhen: stepCountIs(5),
  tools: {
    searchPosts,
    getPostContent,
    getProjects,
    navigateToPost,
    getTimeline,
    getListeningActivity,
    submitFeedback,
  },
  // ...
});
```

Seven tools in total. The model decides which ones to call based on the question. Ask about my projects, it calls getProjects. Ask what I'm listening to, it calls getListeningActivity, which hits the Spotify API. Ask for a blog recommendation, it calls searchPosts and then navigateToPost to get the URL — I made that a separate tool specifically because the model kept constructing URLs itself and getting them wrong.
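Each tool boils down to a description the model reads plus an execute function. Here's a sketch of what searchPosts might look like — the hardcoded post list and the plain-object shape are illustrative (the real version goes through the ai SDK's tool helper and my actual content), but the idea is the same:

```typescript
// Illustrative only: a description for the model, plus an execute function.
// The post index here is made up; the real tool reads my actual posts.
type Post = { slug: string; title: string; description: string };

const POSTS: Post[] = [
  { slug: "chat-widget", title: "Building a Chat Widget", description: "An agentic chat bubble." },
  { slug: "spotify-api", title: "Spotify on My Site", description: "Showing listening activity." },
];

const searchPosts = {
  description:
    "Search blog posts by keyword. Call this before describing or recommending any post.",
  execute: async ({ query }: { query: string }) => {
    const q = query.toLowerCase();
    return POSTS.filter(
      (p) =>
        p.title.toLowerCase().includes(q) ||
        p.description.toLowerCase().includes(q)
    );
  },
};
```

The description is the part that actually matters: it's the only signal the model has for deciding when to call the tool.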
The system prompt is tight about this:
```text
Do not make up information not provided by your tools.
When recommending a specific blog post, always call navigateToPost to get its URL,
then use that exact returned URL. Never construct the URL yourself.
```

Give the model a tool that returns the URL instead of trusting it to build one. Without it, it will confidently produce /blog/wrong-slug and the page will 404. This applies to any identifier the model could plausibly guess wrong — slugs, IDs, API paths.
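The server-side shape of such a tool is simple: resolve the slug against the posts that actually exist, and refuse anything else. A hypothetical sketch (the slug list and error message are mine):

```typescript
// The model passes a slug; the server resolves it against known posts.
// A wrong slug gets an error back instead of letting a bad URL through.
const KNOWN_SLUGS = new Set(["chat-widget", "spotify-api"]);

function navigateToPost(slug: string): { url: string } | { error: string } {
  if (!KNOWN_SLUGS.has(slug)) {
    return { error: `No post with slug "${slug}". Call searchPosts first.` };
  }
  return { url: `/blog/${slug}` };
}
```

The error branch matters as much as the happy path: it gives the model something to recover from instead of a silent 404 for the user.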
The Streaming Protocol
I didn't want to use standard SSE. I wanted the ability to send different types of payloads in the same stream — text chunks, card data, feedback confirmations, token counts. So I built a simple line-delimited JSON protocol.
Every line the server sends is a JSON object with a single-letter key:
```ts
{ t: "Hello" }                          // text delta
{ s: true }                             // seal the current bubble (new response starting)
{ c: { type: "posts", items: [...] } }  // card data
{ fb: true }                            // feedback was sent
{ u: { input: 42, output: 18 } }        // token usage at the end
```

The client reads the stream line by line, parses each object, and routes it. Text deltas get appended to the current message. A { s: true } means the model just finished a tool call and is now speaking again — start a new bubble. Cards get attached to the last message and rendered as a grid below the text.
The { s: true } seal signal was necessary because the model sometimes calls a tool, then continues with a text response. Without it, the tool call output and the follow-up text would merge into one garbled bubble.
Cards
When searchPosts or getProjects returns results, the model doesn't describe them in prose. It gives a brief intro ("here's what I found"), and the UI renders the results as clickable cards.
This is a deliberate split. The model says "take a look at these" and the UI handles the layout. Post cards show the title, date, and description. Project cards show the title, status badge, and a short description. Both are links.
The reason for this: if the model had to describe three blog posts in text, it would either write too much or awkwardly truncate. Cards let the content speak for itself. The model's job is to introduce, not to summarise things that are already well-described.
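On the client this split falls out of a small discriminated union — the field names here are my guesses from the description above, not the actual types:

```typescript
// Hypothetical card types; the real fields may differ.
type PostCard = { kind: "post"; title: string; date: string; description: string; url: string };
type ProjectCard = { kind: "project"; title: string; status: string; description: string; url: string };
type CardItem = PostCard | ProjectCard;

// One-line summary per card kind; the real UI renders a component instead.
function cardLabel(card: CardItem): string {
  switch (card.kind) {
    case "post":
      return `${card.title} (${card.date})`;
    case "project":
      return `${card.title} [${card.status}]`;
  }
}
```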
Mobile: Draggable Button
On desktop, the chat panel opens above the button in the bottom corner. Simple enough.
On mobile it's more complicated. A fixed-position button in a corner can overlap content depending on the page. I made it draggable — you can move it anywhere on screen, and it snaps to the nearest side when you let go.
The drag is implemented with touch events:
```tsx
onTouchStart={(e) => {
  dragStart.current = { x: e.touches[0].clientX, y: e.touches[0].clientY };
  startPos.current = { ...pos };
}}
onTouchMove={(e) => {
  const dx = e.touches[0].clientX - dragStart.current.x;
  const dy = e.touches[0].clientY - dragStart.current.y;
  if (!isDragging && Math.abs(dx) < 4 && Math.abs(dy) < 4) return;
  setIsDragging(true);
  setPos({ x: startPos.current.x + dx, y: startPos.current.y + dy });
}}
```

The 4px threshold distinguishes a tap from a drag. Without it, any slight finger movement while tapping would trigger a drag, which made the button feel unresponsive. On release, the button snaps to the left or right edge depending on which side of the screen it's on.
Set touchAction: "none" on any element with custom touch handlers. Without it, the browser claims the touch event for scrolling before your onTouchMove fires, and the drag won't work.
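The snap on release reduces to a pure function: compare the button's centre to the screen midpoint and clamp to a margin on whichever side wins. A sketch — the 16px margin is an assumption, not the actual value:

```typescript
const EDGE_MARGIN = 16; // assumed gap between button and screen edge, in px

// Given the button's current x and width, return the x it should snap to.
function snapToEdge(x: number, buttonWidth: number, screenWidth: number): number {
  const centre = x + buttonWidth / 2;
  return centre < screenWidth / 2
    ? EDGE_MARGIN
    : screenWidth - buttonWidth - EDGE_MARGIN;
}
```

Wrapping this in a CSS transition on the button's position is what makes the snap read as a deliberate animation rather than a jump.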
Small Things That Matter
A few details that seem minor but make a real difference:
Auto-focus. When you open the chat on desktop, the input is immediately focused. You don't have to tap into it. This sounds obvious, but it's the difference between a widget that feels native and one that feels like an afterthought.
Nudge tooltip. A small "ask me anything" tooltip appears near the button 2 seconds after the page loads, then again after 60 seconds of idle time (once per session). It disappears on its own. It's subtle enough to not be annoying, but it does increase engagement — people who might not notice the button in the corner will eventually see it.
Session limit. Users get 10 messages per session. After that, they can reset and start fresh. Two reasons: cost control, and honestly, if someone needs more than 10 turns to get what they want, the assistant probably isn't being helpful enough. The limit also discourages bots from running long conversations.
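The limit itself is just a counter checked before each send. A minimal sketch, with the reset wired in (10 is the real limit; the function names are mine):

```typescript
const SESSION_LIMIT = 10;

// Track how many messages the user has sent this session.
function createSession(limit = SESSION_LIMIT) {
  let used = 0;
  return {
    canSend: () => used < limit,
    record: () => { used += 1; },
    reset: () => { used = 0; },
    remaining: () => limit - used,
  };
}
```

The client disables the input when canSend() is false and shows the reset option instead; the server enforces the same count independently, since the client-side check alone is trivial to bypass.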
Token counter. Input and output token counts show below each response. This is mostly for my own curiosity, but some people find it interesting.
What I Learned
Agentic doesn't mean complicated. Seven tools sounds like a lot. In practice, each one is a few lines of code and a clear description. The hard part is writing good tool descriptions — the model uses those to decide when to call what. Vague descriptions lead to wrong tool calls or no tool calls at all.
Streaming protocols are worth designing. I could've used standard SSE with a single text event type. But mixing cards, feedback signals, and usage stats into the stream cleanly was only possible because I had a typed protocol. Adding a new event type is just adding a new key.
The model needs constraints, not trust. The first version gave the model too much freedom and it hallucinated. Every tool in the current version exists specifically to replace something the model was getting wrong: wrong project details, wrong blog descriptions, wrong URLs. The system prompt is less about telling the model what to do and more about fencing in what it's not allowed to do.
The chat widget is live at the bottom right of the main page if you want to try it. Ask it about my projects, my posts, or what I'm listening to right now.