Motivation
If you've read my Website Migration post, you'll remember I teased a TLDR ✨ button at the end. I've been meaning to write about it since, so here we are.
To be honest, people nowadays have very short attention spans, and I don't blame them. I'd much rather scroll through TikTok videos, laugh at brainrot content, and read short-form posts on Twitter than read long-form content on blogs. Which is quite funny, considering I write long-form content on my blog (or at least I try to).
In light of this new reality, I wanted to make my blog posts a bit more digestible, a bit more short-form, and yes, a bit more cliché. So I thought, why not copy everyone else and add a "summary" button at the start of each blog? And honestly, I just wanted an excuse to wire up an LLM to do something useful on my website.
So I built a little AI summariser: click the button, and you'll get a nice summary of the blog post.
How It Works
There are three main components to this:
- The beautiful TLDR ✨ button, with a splash of stars and sparkles.
- The API route that calls the LLM.
- A lightweight cache so we don't call the LLM on every click, because I'm poor and I don't want to waste my credits (don't abuse my API, btw).
The Button
```tsx
<button
  onClick={handleSummarize}
  disabled={loading}
>
  {loading ? (
    <>
      <span className="spinner" />
      TLDR-ing...
    </>
  ) : (
    "TLDR ✨"
  )}
</button>
```

Once clicked, the button is replaced by the summary text. So once you've read the summary, you can decide whether you want to read the full article or not.
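The `handleSummarize` handler itself isn't shown above; here's a minimal sketch of what it might do, based on the flow described in this post (the state variables and response shape are assumptions, not the actual implementation — in the real component, `loading` and `summary` would be React state):

```typescript
// Sketch only: plain variables stand in for React useState so the
// example is self-contained.
let loading = false;
let summary = "";

async function handleSummarize(title: string, content: string): Promise<void> {
  loading = true;
  try {
    // POST the post title and raw markdown content to the API route.
    const res = await fetch("/api/summarize", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ title, content }),
    });
    const data = await res.json();
    summary = data.summary;
  } finally {
    loading = false;
  }
}
```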
The API Route
On the backend, I have a Next.js API route at /api/summarize that receives the post title and raw markdown content, strips all the markdown formatting, and sends it to Claude Haiku.
````ts
function stripMarkdown(text: string): string {
  return text
    .replace(/```[\s\S]*?```/g, "") // code blocks
    .replace(/`[^`]*`/g, "") // inline code
    .replace(/!\[.*?\]\(.*?\)/g, "") // images
    .replace(/\[([^\]]*)\]\(.*?\)/g, "$1") // links → keep label
    .replace(/^#{1,6}\s+/gm, "") // headings
    .replace(/(\*\*|__)(.*?)\1/g, "$2") // bold
    .replace(/(\*|_)(.*?)\1/g, "$2") // italic
    .trim();
}
````

I strip the markdown before sending it because the LLM doesn't need the formatting syntax; it's just noise. Then I pass the clean text (capped at 1,750 characters) to the model with a tight prompt:
```text
Summarize this blog post in 1-2 sentences. Be concise and capture the key insight. Only return the summary text. Refer to the author as "he" and "ZJ" only.
```

I'm also doing this to tighten the token limit and reduce the cost, because I don't want to pay for random tokens. I am cheap like that.
The model is Claude Haiku: fast, cheap, and it does a pretty good job at summarising the content. I experimented with a few different models, and this one gave me the best balance of cost and quality.
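For reference, a sketch of what the `/api/summarize` route could look like, using a raw `fetch` against the Anthropic Messages API (the exact model id, token cap, and error handling here are my assumptions, not the real route):

```typescript
const PROMPT =
  "Summarize this blog post in 1-2 sentences. Be concise and capture the " +
  'key insight. Only return the summary text. Refer to the author as "he" and "ZJ" only.';

// Build the user message: tight prompt + title + cleaned content,
// capped at 1,750 characters to keep token costs down.
function buildPrompt(title: string, text: string): string {
  return `${PROMPT}\n\nTitle: ${title}\n\n${text.slice(0, 1750)}`;
}

export async function POST(req: Request): Promise<Response> {
  const { title, content } = await req.json();
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": process.env.ANTHROPIC_API_KEY ?? "",
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-3-5-haiku-latest", // assumed model id
      max_tokens: 150,
      messages: [{ role: "user", content: buildPrompt(title, content) }],
    }),
  });
  const data = await res.json();
  // The Messages API returns an array of content blocks; take the first text block.
  const summary = data.content?.[0]?.text ?? "";
  return Response.json({ summary });
}
```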
Caching
The only annoying thing about calling an LLM is cost. I don't want to spend tokens every time someone clicks the button on a post they've already visited. So I added a simple localStorage cache. When the user clicks the button, it first checks if there's a cached summary for that post. If there is, it returns that instead of calling the API. If not, it calls the API and then stores the result in the cache for next time.
The cache key is built from the post title and a hash of the content:
```ts
function cacheKey(title: string, content: string) {
  return `summary:${title}:${hashContent(content)}`;
}
```

Hashing the content is the important bit: if I update a post, the content hash changes, the old cached summary is automatically invalidated, and the next click fetches a fresh one. No manual cache busting needed.
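`hashContent` isn't shown here, but any cheap, deterministic string hash will do: a collision just means a stale (or refetched) summary, which is harmless. A djb2-style sketch:

```typescript
// Non-cryptographic djb2-style hash: deterministic and fast, which is
// all cache busting needs ("content changed" => "key changed").
function hashContent(content: string): string {
  let hash = 5381;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash << 5) + hash + content.charCodeAt(i)) | 0; // hash * 33 + c
  }
  return (hash >>> 0).toString(36); // short base-36 string for the key
}
```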
Cached summaries expire after 24 hours, just in case:
```ts
const CACHE_TTL_MS = 24 * 60 * 60 * 1000;
if (Date.now() - timestamp < CACHE_TTL_MS) return summary;
```

The first time you click TLDR ✨ on a post, it calls the API. Every click after that (on the same device, same browser) loads instantly from cache. Clean and cost-effective. (Cheap.)
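Putting the key and the TTL together, the whole cache layer is only a few lines. This is my paraphrase of the flow described above (function names are assumptions):

```typescript
const CACHE_TTL_MS = 24 * 60 * 60 * 1000; // 24 hours

// Return the cached summary if it exists and is still fresh; otherwise
// drop the stale entry and report a miss so the caller hits the API.
function getCachedSummary(key: string): string | null {
  const raw = localStorage.getItem(key);
  if (!raw) return null;
  const { summary, timestamp } = JSON.parse(raw);
  if (Date.now() - timestamp < CACHE_TTL_MS) return summary;
  localStorage.removeItem(key);
  return null;
}

// Store the summary alongside a timestamp for the TTL check above.
function setCachedSummary(key: string, summary: string): void {
  localStorage.setItem(key, JSON.stringify({ summary, timestamp: Date.now() }));
}
```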
What I Learned
- Keep the prompt simple. My first instinct was to write a long, detailed prompt with lots of instructions. It didn't help: Haiku is good at summarisation and doesn't need much hand-holding. The shorter the prompt, the more predictable the output.
- Client-side caching goes a long way. For a personal blog, I don't need a database or a Redis cache. localStorage is fine: the summaries are small, the TTL is short, and cache busting via content hash works perfectly.
Try It Out
If you want to see it in action, just click the TLDR ✨ button at the top of this post. It should give you a quick summary of what this post is about. Pretty neat for a cheapo setup.