How I Reduced API Response Time by 40% Using Redis Caching in Node.js


A few months back, I started getting Slack messages that every backend engineer dreads: "hey, is the app slow for anyone else?" Nothing was down. No errors in the logs. Just... sluggish. Pages that used to load instantly were taking a beat longer, and that beat kept getting longer as our user base grew.

The first instinct on the team was to throw more hardware at it. Bump the instance size, add another replica, call it a day. I get why that's tempting — it's a five-minute fix that doesn't require touching code. But I'd been burned by that approach before at a previous job, where we 2x'd our server costs and the latency barely moved. Scaling compute doesn't help much when your actual bottleneck is a database getting hammered with the same queries over and over. So before approving any infra spend, I asked for a week to actually profile what was happening.

That week turned into the most useful debugging exercise I've done in a while, and Redis ended up being the answer. Here's how it played out.

The Problem

Once I started actually logging query times and counting hits per endpoint, the pattern jumped out fast. A handful of our "read" endpoints — things like product listings, user profile summaries, and a few config/lookup endpoints — were getting hit constantly, and every single hit triggered a fresh SQL query.
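For anyone wanting to collect the same kind of numbers, here is a minimal sketch of per-endpoint hit counting and timing. It is not the exact tooling from this project: timingMiddleware, summarizeEndpoints, and the in-memory endpointStats map are hypothetical names, and the only assumption about the framework is the Express-style (req, res, next) signature with a 'finish' event on the response.

```javascript
// Hypothetical in-memory store: endpoint label -> { count, totalMs, maxMs }.
const endpointStats = new Map();

// Express-style middleware: records one timing sample per finished response.
function timingMiddleware(req, res, next) {
  const startedAt = process.hrtime.bigint();

  res.on('finish', () => {
    const elapsedMs = Number(process.hrtime.bigint() - startedAt) / 1e6;
    // Prefer the route pattern (e.g. /api/products/:categoryId) so requests
    // for different IDs are counted as the same endpoint.
    const route = req.route ? req.route.path : req.path;
    const label = `${req.method} ${route}`;

    const stats = endpointStats.get(label) || { count: 0, totalMs: 0, maxMs: 0 };
    stats.count += 1;
    stats.totalMs += elapsedMs;
    stats.maxMs = Math.max(stats.maxMs, elapsedMs);
    endpointStats.set(label, stats);
  });

  next();
}

// Returns one row per endpoint, busiest first.
function summarizeEndpoints() {
  return [...endpointStats.entries()]
    .map(([label, s]) => ({
      label,
      count: s.count,
      avgMs: s.totalMs / s.count,
      maxMs: s.maxMs,
    }))
    .sort((a, b) => b.count - a.count);
}
```

Registered ahead of the routers (app.use(timingMiddleware)) and dumped periodically with console.table(summarizeEndpoints()), something like this is enough to show which endpoints combine high volume with slow responses.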

None of this data changed often. Product catalog info might update a few times a day. Config values might change once a week, if that. But because there was no caching layer, every request meant a round trip to the database, a query execution, a result mapping, and a response. Multiply that by thousands of requests an hour and you've got a database doing a lot of repetitive, avoidable work.

Response times on the worst-offending endpoints were sitting around 500–700ms, with the database itself becoming a shared bottleneck — slow queries on one endpoint were dragging down response times on completely unrelated endpoints because everything was competing for the same DB connections.

That's the part that made it click for me: this wasn't really an "endpoint" problem, it was a database contention problem wearing an endpoint costume.

Why Redis?

I'd used Redis before for session storage, but never leaned on it for general API response caching, so I spent some time convincing myself it was the right tool before writing any code.

The pitch for Redis is pretty simple once you sit with it:

  • It's in-memory, so reads are absurdly fast compared to disk-backed SQL queries — we're talking sub-millisecond lookups versus tens of milliseconds for a DB round trip.

  • It sits in front of the database, not instead of it. The source of truth stays in SQL; Redis just absorbs the repetitive read traffic.

  • It's dead simple to reason about for this use case: key in, value out, optional expiry. No need to overengineer it with anything fancier on day one.

I didn't go in trying to cache everything. I picked two or three of the worst-offending endpoints first, the ones with high read volume and low change frequency, and treated everything else as a "maybe later." That scoping decision mattered more than I expected — more on that below.

Implementation

Nothing exotic here. We were already running Express, so this was mostly about wiring ioredis into our existing data-fetching logic.

Connecting Redis

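// redisClient.js
const Redis = require('ioredis');

const redisClient = new Redis({
  host: process.env.REDIS_HOST,
  // Environment variables are strings, so convert the port explicitly
  // (6379 is the Redis default).
  port: Number(process.env.REDIS_PORT) || 6379,
  password: process.env.REDIS_PASSWORD,
  // Wait a little longer before each reconnect attempt, capped at 2 seconds.
  retryStrategy(times) {
    return Math.min(times * 50, 2000);
  },
});

// Log connection problems so they show up next to the rest of the app logs.
redisClient.on('error', (err) => {
  console.error('Redis connection error:', err);
});

module.exports = redisClient;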

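The retryStrategy matters more than it looks. Our first version didn't have one, and during a brief Redis restart in staging, every request just hung waiting on a dead connection instead of failing fast. One caveat: retryStrategy only sets the delay between reconnect attempts. How long a request waits on a dead connection also depends on ioredis's maxRetriesPerRequest and enableOfflineQueue settings, so check those too. Don't skip this.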
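If failing fast is the goal, a configuration along these lines is one option. Treat it as a sketch rather than the setup from this project: buildRedisOptions is a hypothetical helper, and the values chosen for maxRetriesPerRequest, enableOfflineQueue, and connectTimeout are assumptions to tune for your own environment.

```javascript
// Hypothetical helper: builds the options object passed to `new Redis(options)`.
function buildRedisOptions(env = process.env) {
  return {
    host: env.REDIS_HOST,
    port: Number(env.REDIS_PORT) || 6379,
    password: env.REDIS_PASSWORD,

    // Reconnect with a growing delay, capped at 2 seconds.
    retryStrategy(times) {
      return Math.min(times * 50, 2000);
    },

    // Assumption: give up on a pending command after one reconnect attempt
    // instead of the ioredis default of 20.
    maxRetriesPerRequest: 1,

    // Assumption: reject commands right away while disconnected
    // instead of holding them in the offline queue.
    enableOfflineQueue: false,

    // Assumption: stop waiting on the initial connection after 2 seconds.
    connectTimeout: 2000,
  };
}
```

The tradeoff with enableOfflineQueue: false is that commands issued before the connection is ready are rejected as well, so callers need to handle Redis errors (for example by falling back to the database) rather than assume the client will eventually deliver.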

Checking the cache before hitting SQL

This is the core pattern, applied to a product listing endpoint:

const express = require('express');
const router = express.Router();
const redisClient = require('./redisClient');
const db = require('./db'); // existing SQL query layer

const CACHE_TTL_SECONDS = 300; // 5 minutes

router.get('/api/products/:categoryId', async (req, res) => {
  const { categoryId } = req.params;
  const cacheKey = `products:category:${categoryId}`;

  try {
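    // 1. Try the cache first. ioredis resolves to null on a miss.
    const cached = await redisClient.get(cacheKey);

    if (cached !== null) {
      return res.json(JSON.parse(cached));
    }

    // 2. Cache miss: fall through to the database.
    const products = await db.query(
      'SELECT id, name, price, stock FROM products WHERE category_id = ?',
      [categoryId]
    );

    // 3. Store the result with a TTL so the entry expires on its own.
    await redisClient.set(
      cacheKey,
      JSON.stringify(products),
      'EX',
      CACHE_TTL_SECONDS
    );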

    return res.json(products);
  } catch (err) {
    console.error('Error fetching products:', err);
    return res.status(500).json({ error: 'Something went wrong' });
  }
});

module.exports = router;

A few things worth calling out:

  • The cache check happens before any DB call. If it's a hit, the SQL layer never gets touched at all.

  • I used EX for a TTL on every SET. No cache entry should live forever — more on this in the next section.

  • Cache key naming matters more than you'd think once you have more than a couple of cached resources. I settled on a resource:filter:value pattern (products:category:12) early on, which made debugging and selective invalidation way easier later.
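The resource:filter:value convention is easy to enforce with a tiny helper. The sketch below uses a hypothetical buildCacheKey function; rejecting empty parts and parts that contain a colon is an assumption about what "well-formed" should mean here, not a Redis requirement.

```javascript
// Hypothetical helper: builds keys like "products:category:12".
function buildCacheKey(resource, filter, value) {
  const parts = [resource, filter, value].map((part) => String(part).trim());

  // A stray ":" or an empty part would make keys ambiguous to read and to
  // invalidate, so fail loudly instead of writing a malformed key.
  if (parts.some((part) => part === '' || part.includes(':'))) {
    throw new Error(`Invalid cache key parts: ${JSON.stringify(parts)}`);
  }

  return parts.join(':');
}
```

With this in place, the read path and the invalidation path can both call buildCacheKey('products', 'category', categoryId) instead of each building the string by hand.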

We wrapped this same get-or-fetch logic into a small reusable helper once we'd proven it out on a couple of endpoints, instead of copy-pasting the try/catch block everywhere:

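// Returns the cached value for `key`, or calls fetchFn, caches its result
// for `ttl` seconds, and returns it.
async function getOrSetCache(key, ttl, fetchFn) {
  const cached = await redisClient.get(key);
  if (cached !== null) return JSON.parse(cached);

  const fresh = await fetchFn();
  await redisClient.set(key, JSON.stringify(fresh), 'EX', ttl);
  return fresh;
}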

That made adding caching to new endpoints a one-line change instead of a copy-paste job, which honestly should've been the starting point.
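One variation worth considering is a helper that "fails open": if Redis is unreachable, the request still gets served from the database instead of erroring out. This is a sketch of that idea, not the helper we shipped. getOrSetCacheSafe is a hypothetical name, and the client is passed in as a parameter (an assumption made so the function can be exercised with any object that exposes ioredis-style get and set).

```javascript
// Hypothetical fail-open variant: cache errors are logged and treated as a miss.
async function getOrSetCacheSafe(client, key, ttlSeconds, fetchFn) {
  try {
    const cached = await client.get(key);
    if (cached !== null) return JSON.parse(cached);
  } catch (err) {
    // Redis being down or holding unparsable data should not take the endpoint down.
    console.warn(`Cache read failed for ${key}, falling back to source:`, err.message);
  }

  // Errors from the source of truth still propagate to the caller.
  const fresh = await fetchFn();

  try {
    await client.set(key, JSON.stringify(fresh), 'EX', ttlSeconds);
  } catch (err) {
    console.warn(`Cache write failed for ${key}:`, err.message);
  }

  return fresh;
}
```

In a route handler this could be called as getOrSetCacheSafe(redisClient, cacheKey, CACHE_TTL_SECONDS, () => db.query(...)). The cost of failing open is that a Redis outage sends the full read load back to the database, so it is worth pairing with alerting on the warning logs.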

Cache Expiration Strategy

I went with TTL-based expiration as the primary mechanism, not because it's the most elegant option, but because it's the one that fails safely. If something goes wrong and an invalidation event gets missed, a TTL puts a hard ceiling on how stale the data can get. Worst case, users see data that's a few minutes old, not data that's permanently wrong.

For most of the cached endpoints, a 5-minute TTL was the sweet spot — long enough to meaningfully cut DB load, short enough that nobody noticed the lag in practice. For a couple of slower-changing config-style endpoints, I pushed it out to closer to 30 minutes.
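Keeping those TTL tiers in one place makes them easier to review and adjust than constants scattered across route files. A minimal sketch, assuming keys start with the resource name: TTL_SECONDS and ttlForKey are hypothetical names, and the 60-second default for unlisted resources is an assumption rather than a value from our rollout.

```javascript
// TTL per resource prefix, in seconds.
const TTL_SECONDS = new Map([
  ['products', 5 * 60], // product listings: 5 minutes
  ['config', 30 * 60],  // slow-changing config/lookup data: 30 minutes
]);

// Assumption: anything not listed gets a deliberately short TTL.
const DEFAULT_TTL_SECONDS = 60;

// Looks up the TTL from the first segment of a "resource:filter:value" key.
function ttlForKey(cacheKey) {
  const resource = cacheKey.split(':')[0];
  return TTL_SECONDS.get(resource) ?? DEFAULT_TTL_SECONDS;
}
```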

On top of TTL, I added explicit invalidation for the cases where staleness was actually visible to users — mainly anything tied to a write operation:

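router.put('/api/products/:id', async (req, res) => {
  const { id } = req.params;
  const updates = req.body;

  try {
    await db.query('UPDATE products SET name = ?, price = ?, stock = ? WHERE id = ?', [
      updates.name,
      updates.price,
      updates.stock,
      id,
    ]);

    // Look up the category so only the affected listing is invalidated.
    const product = await db.query('SELECT category_id FROM products WHERE id = ?', [id]);
    const categoryId = product[0]?.category_id;

    if (categoryId != null) {
      await redisClient.del(`products:category:${categoryId}`);
    }

    return res.json({ message: 'Product updated' });
  } catch (err) {
    // Without this, a rejected query would leave the request hanging.
    console.error('Error updating product:', err);
    return res.status(500).json({ error: 'Something went wrong' });
  }
});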

The tradeoff here is real: TTL alone means you accept some window of staleness by design. Manual invalidation closes that window but adds more code paths that can be forgotten or get out of sync. We ended up using both — TTL as the safety net, explicit invalidation on writes where staleness was visible enough to matter.
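To make the "explicit invalidation on writes" half harder to forget, the write and the cache delete can be bundled into one call. The sketch below is an illustration under assumptions: writeThenInvalidate is a hypothetical helper, and the client only needs an ioredis-style del method that accepts one or more keys.

```javascript
// Hypothetical helper: runs a write, then deletes the cache keys it affects.
async function writeThenInvalidate(client, writeFn, keysToInvalidate) {
  // If the write throws, nothing is invalidated and the error propagates.
  const result = await writeFn();

  if (keysToInvalidate.length > 0) {
    // DEL accepts multiple keys and ignores keys that do not exist.
    await client.del(...keysToInvalidate);
  }

  return result;
}
```

Deleting after the write (rather than before) avoids a reader repopulating the cache with the old row while the update is still in flight. The TTL still covers the case where the delete itself fails.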

Results

We tracked the same set of endpoints before and after rollout. Numbers below are rounded, since traffic varies day to day, but they reflect the general shift:

| Metric | Before | After |
| --- | --- | --- |
| Average Response Time | 620ms | 370ms |
| Database Queries (per minute, peak traffic) | High | Reduced |
| Database CPU Utilization | Consistently elevated | Noticeably lower |
| User-Reported Latency Complaints | Recurring | Rare |

That ~40% drop in average response time was the headline number, but the part that actually mattered operationally was the database load. With fewer redundant queries hitting it, the DB had more headroom for the writes and less-frequent reads that genuinely needed to go through. Endpoints that had nothing to do with caching got faster too, simply because they weren't competing for DB connections anymore.
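For reference, the headline figure follows directly from the two averages in the table. The percentReduction helper below is just a hypothetical name for the arithmetic.

```javascript
// Percentage drop from `before` to `after`.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

// (620 - 370) / 620 = 250 / 620
console.log(percentReduction(620, 370).toFixed(1)); // prints "40.3"
```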

Problems I Faced

It wasn't a clean, one-pass implementation. A few things bit us along the way:

Forgetting to invalidate on update. Early on, I added caching to the product listing endpoint but forgot the cache had to be cleared on the corresponding update endpoint. QA caught it — someone updated a price and it just... didn't show up. Took embarrassingly long to trace back to a stale cache entry rather than a bug in the update logic itself. Lesson: any time you cache a read, immediately go find every write path that affects that data and handle invalidation right then, not "later."

Caching too aggressively. Once it worked on one endpoint, the temptation was to slap caching on everything. I did that for a notifications endpoint that's inherently near-real-time, and it predictably caused users to see outdated notification counts. Had to roll that one back. Not everything benefits from caching — high write-frequency, low-tolerance-for-staleness data usually shouldn't be cached at all, or needs a much shorter TTL than feels comfortable.

Debugging cache misses that should've been hits. We had a bug where cache keys included a query param that wasn't actually relevant to the result, which meant we were generating way more unique keys than necessary and barely getting any cache hits. Logging the cache key alongside hit/miss status for a few days made this obvious — our hit rate on that endpoint was sitting under 10%, which defeated the entire point. Fixed by tightening up what went into the key.
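One way to keep irrelevant query params out of the key is to build it from an explicit allowlist. This is a sketch of that approach rather than our exact fix: cacheKeyFromQuery is a hypothetical helper, and the param names in the usage line are made up for illustration.

```javascript
// Hypothetical helper: only allowlisted query params contribute to the key.
function cacheKeyFromQuery(prefix, query, relevantParams) {
  const parts = relevantParams
    .filter((name) => query[name] !== undefined)
    .sort() // stable order, so ?a=1&b=2 and ?b=2&a=1 share one key
    .map((name) => `${name}=${query[name]}`);

  return parts.length > 0 ? `${prefix}:${parts.join('&')}` : prefix;
}

const key = cacheKeyFromQuery(
  'products:category:12',
  { sort: 'price', page: '2', utm_source: 'newsletter' },
  ['sort', 'page']
);
console.log(key); // prints "products:category:12:page=2&sort=price"
```

Tracking params and anything else outside the allowlist no longer fragment the cache, and logging this key next to a hit/miss flag shows quickly whether the keys are as coarse as intended.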

Lessons Learned

A few things I'd tell a past version of myself starting this:

  • Profile before you cache anything. I almost cached the wrong endpoints first because they "felt" slow. The actual data showed different ones were the real DB load drivers.

  • Cache invalidation is the actual hard part, not the caching itself. Writing the get/set logic took an afternoon. Getting invalidation right took the rest of the week.

  • Start with a short TTL and widen it once you trust the data. It's much less painful to debug staleness when the window is 60 seconds than when it's an hour.

  • Log your cache hit rate from day one. Without it, you're guessing whether the caching layer is even doing anything.

  • Resist caching everything just because the pattern is easy to copy-paste. Some data genuinely needs to be fresh every time.
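On the hit-rate point, the tracking does not need to be elaborate to be useful. Here is a minimal in-process sketch, assuming keys follow the resource:filter:value pattern; createHitRateTracker is a hypothetical name, and a real deployment would more likely push these counts to whatever metrics system is already in place.

```javascript
// Hypothetical tracker: counts hits and misses per resource prefix.
function createHitRateTracker() {
  const counts = new Map(); // resource -> { hits, misses }

  return {
    // Call once per cache lookup with the key and whether it was a hit.
    record(cacheKey, wasHit) {
      const resource = cacheKey.split(':')[0];
      const entry = counts.get(resource) || { hits: 0, misses: 0 };
      if (wasHit) {
        entry.hits += 1;
      } else {
        entry.misses += 1;
      }
      counts.set(resource, entry);
    },

    // Returns a ratio between 0 and 1, or null if nothing was recorded.
    hitRate(resource) {
      const entry = counts.get(resource);
      if (!entry) return null;
      return entry.hits / (entry.hits + entry.misses);
    },
  };
}
```

Calling tracker.record(cacheKey, cached !== null) right after each redisClient.get and logging tracker.hitRate('products') on an interval is enough to catch a hit rate that is stuck near zero.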

Conclusion

Redis solved a real problem for us, but it's worth being honest about what it actually did: it didn't fix bad queries, it didn't fix poor indexing, and it wouldn't have helped at all if our bottleneck had been something else entirely, like an inefficient algorithm or an external API call. It worked because we'd already confirmed the bottleneck was repeated, cacheable database reads.

If you're seeing API slowness, I'd resist the urge to reach for Redis — or any caching layer — as the first move. Profile first. Figure out what's actually slow and why. If it turns out to be the same data getting fetched over and over from a database that doesn't need to be asked twice, then yeah, Redis caching in Node.js is a genuinely good fix, and one that's relatively low-risk to roll out incrementally. But it's a targeted fix for a specific kind of problem, not a default performance upgrade you bolt onto every Express app.