
Scaling Quiz Delivery: From 100 to 100,000 Concurrent Players

Scale your quiz platform to handle massive concurrent load with database optimization, caching, connection pooling, and read replicas.

Bobby Iliev · 2026-04-08 · 8 min read

When Your Quiz Goes Viral

Your quiz works fine with 100 users. Then a teacher assigns it to 2,000 students who all click "Start" at the same time. Or a marketing campaign drives 50,000 people to your quiz in an hour. Suddenly you are dealing with connection pool exhaustion, slow queries, and timeouts.

Scaling a quiz platform is not about rewriting everything. It is about identifying bottlenecks and addressing them in the right order. This guide covers the progression from 100 to 100,000 concurrent players, tackling each bottleneck as it appears.

Stage 1: Database Optimization (100 to 1,000 Players)

The first bottleneck is always the database. Before adding infrastructure, make sure your queries are efficient.

Index Your Queries

Find slow queries and add targeted indexes:

-- Quizzes are fetched by ID and published status constantly
CREATE INDEX idx_quizzes_published ON quizzes(id) WHERE published = true;

-- Questions are always fetched with their quiz
CREATE INDEX idx_questions_quiz_id ON questions(quiz_id);

-- Answers are fetched with their question
CREATE INDEX idx_answers_question_id ON answers(question_id);

-- Submissions are queried by user and quiz
CREATE INDEX idx_submissions_user_quiz ON submissions(user_id, quiz_id);

-- Leaderboard queries sort by score
CREATE INDEX idx_submissions_quiz_score ON submissions(quiz_id, score DESC, completed_at ASC);
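The leaderboard index orders by score descending, with completed_at ascending as the tiebreaker, so on equal scores the earlier finisher ranks higher. The same ordering expressed in application code looks like this (a sketch with hypothetical field names):

```typescript
// Hypothetical leaderboard entry shape, mirroring the columns
// covered by idx_submissions_quiz_score.
interface Entry {
  userId: string;
  score: number;
  completedAt: Date;
}

// Rank the same way the index orders rows: score DESC,
// then completed_at ASC (earlier finish wins ties).
function rankLeaderboard(entries: Entry[]): Entry[] {
  return [...entries].sort(
    (a, b) =>
      b.score - a.score || a.completedAt.getTime() - b.completedAt.getTime()
  );
}

const ranked = rankLeaderboard([
  { userId: "u1", score: 80, completedAt: new Date("2026-04-01T10:05:00Z") },
  { userId: "u2", score: 95, completedAt: new Date("2026-04-01T10:02:00Z") },
  { userId: "u3", score: 95, completedAt: new Date("2026-04-01T10:01:00Z") },
]);
// u3 ranks first (95 points, earlier finish), then u2, then u1
```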

Optimize the Quiz Fetch Query

A naive approach makes N+1 queries. Use a single query with joins:

async function getQuizWithQuestions(quizId: string) {
  // Bad: N+1 queries
  // const quiz = await prisma.quiz.findUnique({ where: { id: quizId } });
  // const questions = await prisma.question.findMany({ where: { quizId } });
  // for (const q of questions) {
  //   q.answers = await prisma.answer.findMany({ where: { questionId: q.id } });
  // }

  // Good: single query with includes
  return prisma.quiz.findUnique({
    where: { id: quizId, published: true },
    include: {
      questions: {
        orderBy: { sortOrder: "asc" },
        include: {
          answers: {
            select: { id: true, text: true, sortOrder: true },
            orderBy: { sortOrder: "asc" },
          },
        },
      },
    },
  });
}

Connection Pooling

A single database connection can handle one query at a time. Under load, you need a pool:

// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  // Connection pool size should be roughly:
  // (number of CPU cores * 2) + number of disks
  // For a 4-core server: pool_size = 10
}

For Prisma, configure the connection pool via the URL:

DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=10&pool_timeout=10"

If you are using external connection pooling (PgBouncer), set the pool mode to transaction:

DATABASE_URL="postgresql://user:pass@pgbouncer:6432/db?pgbouncer=true"

Stage 2: Application-Level Caching (1,000 to 10,000 Players)

Quizzes are read-heavy. The same quiz gets fetched thousands of times but changes rarely. This is a perfect caching use case.

Redis Cache Layer

import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);

const CACHE_TTL = 300; // 5 minutes

async function getCachedQuiz(quizId: string) {
  const cacheKey = `quiz:${quizId}`;

  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  const quiz = await getQuizWithQuestions(quizId);
  if (!quiz) return null;

  // Store in cache
  await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);

  return quiz;
}

async function invalidateQuizCache(quizId: string) {
  await redis.del(`quiz:${quizId}`);
}

Cache Stampede Prevention

When the cache expires, hundreds of concurrent requests all miss the cache and hit the database simultaneously. Use a mutex:

async function getCachedQuizSafe(quizId: string) {
  const cacheKey = `quiz:${quizId}`;
  const lockKey = `lock:quiz:${quizId}`;

  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Try to acquire lock
  const acquired = await redis.set(lockKey, "1", "EX", 5, "NX");

  if (acquired) {
    // We got the lock - fetch and cache
    try {
      const quiz = await getQuizWithQuestions(quizId);
      if (quiz) {
        await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);
      }
      return quiz;
    } finally {
      await redis.del(lockKey);
    }
  }

  // Another process is fetching - wait briefly and retry
  await new Promise((resolve) => setTimeout(resolve, 100));
  const retried = await redis.get(cacheKey);
  if (retried) return JSON.parse(retried);

  // Fallback to database if cache still empty
  return getQuizWithQuestions(quizId);
}
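Locking handles concurrent misses on a single key; a complementary tactic is to add random jitter to the TTL so that keys cached at the same moment do not all expire in the same second. A minimal sketch (the 10% spread is an arbitrary choice):

```typescript
const CACHE_TTL = 300; // base TTL in seconds, as above

// Spread expirations over +/- spread * base so hot keys written
// at the same time don't all miss the cache together later.
function jitteredTtl(base: number, spread = 0.1): number {
  const delta = base * spread;
  return Math.round(base - delta + Math.random() * 2 * delta);
}

// Usage: redis.set(cacheKey, payload, "EX", jitteredTtl(CACHE_TTL));
```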

Response Caching with ETags

Reduce bandwidth by letting clients cache responses:

import crypto from "crypto";

app.get("/api/v1/quizzes/:id", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);

  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Generate ETag from content
  const etag = crypto
    .createHash("md5")
    .update(JSON.stringify(quiz))
    .digest("hex");

  res.setHeader("ETag", `"${etag}"`);
  res.setHeader("Cache-Control", "private, max-age=60");

  // Check if client has current version
  if (req.headers["if-none-match"] === `"${etag}"`) {
    return res.status(304).end();
  }

  res.json(quiz);
});

Stage 3: Horizontal Scaling (10,000 to 50,000 Players)

When a single server is not enough, scale horizontally.

Stateless Application Servers

Make sure your API servers share no in-memory state. Move all state to Redis or the database:

// Bad: in-memory session store
const sessions = new Map();

// Good: Redis session store
import session from "express-session";
import RedisStore from "connect-redis";

app.use(
  session({
    store: new RedisStore({ client: redis }),
    secret: process.env.SESSION_SECRET!,
    resave: false,
    saveUninitialized: false,
    cookie: { maxAge: 3600000 },
  })
);

Load Balancer Configuration

With stateless servers, put them behind a load balancer. Here is an Nginx configuration for upstream servers:

upstream quiz_api {
    least_conn;  # Route to the server with fewest connections
    server 10.0.1.10:3000;
    server 10.0.1.11:3000;
    server 10.0.1.12:3000;

    keepalive 64;
}

server {
    listen 80;

    location / {
        proxy_pass http://quiz_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /health {
        proxy_pass http://quiz_api;
        access_log off;
    }
}

Queue Heavy Operations

Quiz submissions that trigger score calculations, leaderboard updates, and webhook notifications should be queued:

import { Queue, Worker } from "bullmq";

const submissionQueue = new Queue("quiz-submissions", {
  connection: { host: "redis-host", port: 6379 },
});

// API handler - enqueue and respond immediately
app.post("/api/v1/quizzes/:id/submit", async (req, res) => {
  const { id: quizId } = req.params;
  const { answers } = req.body;
  const userId = req.user!.id;

  // Quick score calculation for immediate response
  const quiz = await getCachedQuizSafe(quizId);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }
  const score = calculateScore(quiz.questions, answers);

  // Enqueue the heavy operations
  await submissionQueue.add("process-submission", {
    quizId,
    userId,
    answers,
    score: score.score,
    completedAt: new Date().toISOString(),
  });

  // Respond immediately with the score
  res.json(score);
});

// Worker processes submissions in the background
const worker = new Worker(
  "quiz-submissions",
  async (job) => {
    const { quizId, userId, answers, score, completedAt } = job.data;

    // Store in database
    await prisma.submission.create({
      data: { quizId, userId, score, completedAt: new Date(completedAt) },
    });

    // Update leaderboard
    await updateLeaderboard(quizId, userId, score);

    // Send webhooks
    await sendWebhook("quiz.completed", { quizId, userId, score });

    // Update user stats
    await updateUserStats(userId, score);
  },
  { connection: { host: "redis-host", port: 6379 }, concurrency: 10 }
);
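The submit handler above calls calculateScore, which is not shown. Here is a minimal sketch, assuming each question's answers carry an isCorrect flag and the client submits { questionId, answerId } pairs; both shapes are assumptions, not the article's actual schema:

```typescript
// Hypothetical shapes for scoring. Note: these must come from a
// trusted server-side copy of the quiz -- the public fetch in
// getQuizWithQuestions deliberately selects answers without any
// correctness flag.
interface Answer { id: string; isCorrect?: boolean }
interface Question { id: string; answers: Answer[] }
interface SubmittedAnswer { questionId: string; answerId: string }

function calculateScore(questions: Question[], answers: SubmittedAnswer[]) {
  // Index submitted answers by question for O(1) lookup
  const picked = new Map(answers.map((a) => [a.questionId, a.answerId]));
  let correct = 0;
  for (const q of questions) {
    const chosen = picked.get(q.id);
    if (chosen && q.answers.some((a) => a.id === chosen && a.isCorrect)) {
      correct++;
    }
  }
  return {
    score: correct,
    total: questions.length,
    percentage: questions.length
      ? Math.round((correct / questions.length) * 100)
      : 0,
  };
}
```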

Stage 4: Database Read Replicas (50,000 to 100,000 Players)

At this scale, the database becomes the bottleneck again. Separate read and write traffic:

// lib/prisma.ts
import { PrismaClient } from "@prisma/client";

// Primary for writes
export const prismaWrite = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_PRIMARY_URL } },
});

// Read replica for queries
export const prismaRead = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_REPLICA_URL } },
});

// Helper to select the right client
export function prisma(mode: "read" | "write" = "read") {
  return mode === "write" ? prismaWrite : prismaRead;
}

Use this in your code:

// Reads go to the replica
const quiz = await prisma("read").quiz.findUnique({
  where: { id: quizId },
});

// Writes go to the primary
await prisma("write").submission.create({
  data: { quizId, userId, score },
});

Be aware of replication lag. After a write, the data may not be immediately available on the read replica. For submission results, read from the primary:

// After creating a submission, read it back from primary
const submission = await prisma("write").submission.findUnique({
  where: { id: submissionId },
});
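A common generalization of this, sketched here as an assumption rather than an established helper in this codebase, is read-your-writes routing: remember when each user last wrote and pin their reads to the primary for a short window afterwards.

```typescript
// Assumed upper bound on replication lag; tune to your setup.
const REPLICATION_WINDOW_MS = 1000;

// In-memory for illustration; a multi-server deployment would
// keep this in Redis alongside the session.
const lastWriteAt = new Map<string, number>();

function recordWrite(userId: string, now = Date.now()): void {
  lastWriteAt.set(userId, now);
}

// Route to the primary if the user wrote recently enough that
// the replica may not have caught up yet.
function readMode(userId: string, now = Date.now()): "read" | "write" {
  const last = lastWriteAt.get(userId);
  return last !== undefined && now - last < REPLICATION_WINDOW_MS
    ? "write"
    : "read";
}
```

Usage would pair with the earlier helper: `prisma(readMode(userId)).submission.findUnique(...)`.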

Stage 5: CDN for Static Assets

Serve quiz images, static question data, and the frontend through a CDN:

// Set cache headers for static quiz content
app.get("/api/v1/quizzes/:id/static", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Long cache for published quiz content
  res.setHeader("Cache-Control", "public, max-age=3600, s-maxage=86400");
  res.setHeader("CDN-Cache-Control", "max-age=86400");
  res.json(quiz);
});

Performance Benchmarks

Here is what to expect at each stage:

| Stage | Concurrent Users | p95 Latency | Infrastructure |
| --- | --- | --- | --- |
| Baseline | 100 | 200ms | 1 server, 1 DB |
| DB Optimized | 1,000 | 100ms | 1 server, 1 DB |
| Cached | 10,000 | 30ms | 1 server, 1 DB, Redis |
| Horizontal | 50,000 | 50ms | 3 servers, 1 DB, Redis |
| Read Replicas | 100,000 | 40ms | 3 servers, 1 primary + 2 replicas, Redis |

Summary

Scale incrementally. Most quiz platforms will never need read replicas, and over-engineering early wastes money and adds complexity. Start with database optimization and caching - these two steps alone handle the 1,000 to 10,000 range, which covers the majority of use cases.

The scaling path:

  1. Optimize queries and add indexes
  2. Add Redis caching for quiz content
  3. Prevent cache stampedes with locking
  4. Move to stateless servers behind a load balancer
  5. Queue heavy operations like webhook delivery and stat updates
  6. Add read replicas when the primary database is at capacity
  7. Put static content behind a CDN

Measure before you optimize. Use the monitoring setup from our Prometheus and Grafana guide to identify actual bottlenecks rather than guessing.
