Blog Summariser with AI

|4 min read|

How I built an AI summariser for my blog posts using a Large Language Model (LLM).

Motivation

If you've read my Website Migration post, you'll remember I teased a TLDR ✨ button at the end. I've been meaning to write about it since, so here we are.

To be honest, people nowadays have very short attention spans, and I don't blame them. I'd much rather scroll through TikTok videos, laugh at brainrot content, and read short-form posts on Twitter than read long-form content on blogs. Which is quite funny, considering I write long-form content on my blog (or at least I try to).

In light of this new reality, I wanted to make my blog posts a bit more cliche, a bit more digestible, and a bit more short-form. So I thought, why not copy everyone else and add a "summary" button at the start of each blog? And honestly, I just wanted an excuse to wire up an LLM to do something useful on my website.

So I built a little AI summariser: click the button, and you'll get a nice summary of the blog post.

How It Works

There are three main components to this:

  1. The beautiful TLDR ✨ button with a splash of stars and sparkles.
  2. The API route that calls the LLM.
  3. A lightweight cache so we don't call the LLM on every click, because I'm poor and I don't want to waste my credits. (don't abuse my API btw).

The Button

PostSummarizer.tsx

```tsx
<button
  onClick={handleSummarize}
  disabled={loading}
>
  {loading ? (
    <>
      <span className="spinner" />
      TLDR-ing...
    </>
  ) : (
    "TLDR ✨"
  )}
</button>
```

Once clicked, the button is replaced by the summary text. So once you've read the summary, you can decide whether you want to read the full article or not.

The API Route

On the backend, I have a Next.js API route at /api/summarize that receives the post title and raw markdown content, strips all the markdown formatting, and sends it to Claude Haiku.

pages/api/summarize.ts

````ts
function stripMarkdown(text: string): string {
  return text
    .replace(/```[\s\S]*?```/g, "")        // code blocks
    .replace(/`[^`]*`/g, "")               // inline code
    .replace(/!\[.*?\]\(.*?\)/g, "")       // images
    .replace(/\[([^\]]*)\]\(.*?\)/g, "$1") // links → keep label
    .replace(/^#{1,6}\s+/gm, "")           // headings
    .replace(/(\*\*|__)(.*?)\1/g, "$2")    // bold
    .replace(/(\*|_)(.*?)\1/g, "$2")       // italic
    .trim();
}
````

I strip the markdown before sending it because the LLM doesn't need the formatting syntax; it's just noise. Then I pass the clean text (capped at 1,750 characters) to the model with a tight prompt:

system prompt

```text
Summarize this blog post in 1-2 sentences. Be concise and capture the key insight. Only return the summary text. Refer to the author as "he" and "ZJ" only.
```

I'm also doing this to tighten the token limit and reduce the cost, because I don't want to pay for random tokens. I am cheap like that.

The model is Claude Haiku, fast and cheap, and it does a pretty good job at summarizing the content. I experimented with a few different models, and this one gave me the best balance of cost and quality.
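For reference, here's a minimal sketch of what that backend call could look like against Anthropic's Messages REST API. The `buildRequest` and `summarize` names, the exact model ID, and the `max_tokens` value are my assumptions for illustration, not the site's actual code:

```ts
// Sketch only: buildRequest/summarize are hypothetical helper names,
// and the model ID and max_tokens are assumed values.
const SYSTEM_PROMPT =
  'Summarize this blog post in 1-2 sentences. Be concise and capture the key insight. ' +
  'Only return the summary text. Refer to the author as "he" and "ZJ" only.';

export function buildRequest(title: string, cleanText: string) {
  return {
    model: "claude-3-haiku-20240307", // assumed Haiku model ID
    max_tokens: 150,                  // short output keeps the cost down
    system: SYSTEM_PROMPT,
    messages: [
      // Cap the input at 1,750 characters, as described above.
      { role: "user", content: `${title}\n\n${cleanText.slice(0, 1750)}` },
    ],
  };
}

export async function summarize(apiKey: string, title: string, cleanText: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify(buildRequest(title, cleanText)),
  });
  const data = await res.json();
  return data.content[0].text; // the Messages API returns a list of content blocks
}
```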

Caching

The only annoying thing about calling an LLM is cost. I don't want to spend tokens every time someone clicks the button on a post they've already visited. So I added a simple localStorage cache. When the user clicks the button, it first checks if there's a cached summary for that post. If there is, it returns that instead of calling the API. If not, it calls the API and then stores the result in the cache for next time.

The cache key is built from the post title and a hash of the content:

caching logic

```ts
function cacheKey(title: string, content: string) {
  return `summary:${title}:${hashContent(content)}`;
}
```

Hashing the content is the important bit: if I update a post, the content hash changes, the old cached summary is automatically invalidated, and the next click fetches a fresh one. No manual cache busting needed.
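The post doesn't show `hashContent` itself; any cheap, deterministic string hash works for this. A sketch using the classic djb2 algorithm (my choice for illustration, not necessarily what the site uses):

```ts
// Hypothetical hashContent: djb2 string hash, rendered as hex.
// Not cryptographic, but deterministic and cheap, which is all a
// cache key needs.
export function hashContent(text: string): string {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0; // hash * 33 + char
  }
  return (hash >>> 0).toString(16);
}
```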

Cached summaries expire after 24 hours, just in case:

cache expiration

```ts
const CACHE_TTL_MS = 24 * 60 * 60 * 1000;

if (Date.now() - timestamp < CACHE_TTL_MS) return summary;
```
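Putting the key, the TTL, and the API fallback together, the whole read-through flow can be sketched like this. The `storage` parameter stands in for `localStorage` so the logic stays testable, and `getSummary`/`fetchSummary` are hypothetical names, not the site's actual code:

```ts
// Minimal read-through cache sketch. `storage` mimics the localStorage
// API so the same logic works in the browser or in tests.
type StorageLike = {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
};

const CACHE_TTL_MS = 24 * 60 * 60 * 1000;

export async function getSummary(
  key: string,
  fetchSummary: () => Promise<string>, // calls /api/summarize in the real app
  storage: StorageLike,
  now: () => number = Date.now
): Promise<string> {
  const raw = storage.getItem(key);
  if (raw) {
    const { summary, timestamp } = JSON.parse(raw);
    if (now() - timestamp < CACHE_TTL_MS) return summary; // fresh cache hit
  }
  const summary = await fetchSummary(); // miss or expired: hit the API
  storage.setItem(key, JSON.stringify({ summary, timestamp: now() }));
  return summary;
}
```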

The first time you click TLDR ✨ on a post, it calls the API. Every click after that (on the same device, same browser) loads instantly from cache. Clean and cost-effective. (Cheap.)

What I Learned

  • Keep the prompt simple. My first instinct was to write a long, detailed prompt with lots of instructions. It didn't help; Haiku is good at summarisation and doesn't need much hand-holding. The shorter the prompt, the more predictable the output.

  • Client-side caching goes a long way. For a personal blog, I don't need a database or a Redis cache. localStorage is fine. The summaries are small, the TTL is short, and the cache-busting via content hash works perfectly.

Try It Out

If you want to see it in action, just click the TLDR ✨ button at the top of this post. It should give you a quick summary of what this post is about. Pretty neat for a cheapo like me.