Scaling Quiz Delivery: From 100 to 100,000 Concurrent Players
Scale your quiz platform to handle massive concurrent load with database optimization, caching, connection pooling, and read replicas.
When Your Quiz Goes Viral
Your quiz works fine with 100 users. Then a teacher assigns it to 2,000 students who all click "Start" at the same time. Or a marketing campaign drives 50,000 people to your quiz in an hour. Suddenly you are dealing with connection pool exhaustion, slow queries, and timeouts.
Scaling a quiz platform is not about rewriting everything. It is about identifying bottlenecks and addressing them in the right order. This guide covers the progression from 100 to 100,000 concurrent players, tackling each bottleneck as it appears.
Stage 1: Database Optimization (100 to 1,000 Players)
The first bottleneck is almost always the database. Before adding infrastructure, make sure your queries are efficient.
Index Your Queries
Find slow queries and add targeted indexes:
-- Quizzes are fetched by ID and published status constantly
CREATE INDEX idx_quizzes_published ON quizzes(id) WHERE published = true;
-- Questions are always fetched with their quiz
CREATE INDEX idx_questions_quiz_id ON questions(quiz_id);
-- Answers are fetched with their question
CREATE INDEX idx_answers_question_id ON answers(question_id);
-- Submissions are queried by user and quiz
CREATE INDEX idx_submissions_user_quiz ON submissions(user_id, quiz_id);
-- Leaderboard queries sort by score
CREATE INDEX idx_submissions_quiz_score ON submissions(quiz_id, score DESC, completed_at ASC);
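The leaderboard index only pays off if the query's ORDER BY matches the index column order (score DESC, completed_at ASC). As a sketch, here is the shape of a paginated leaderboard query that lines up with it; `leaderboardQueryArgs` is a helper invented for this example, not a Prisma API:

```typescript
// Hypothetical helper: builds arguments for a paginated leaderboard fetch
// whose ORDER BY matches idx_submissions_quiz_score, so Postgres can read
// rows straight off the index instead of sorting at query time.
function leaderboardQueryArgs(quizId: string, page = 0, pageSize = 25) {
  return {
    where: { quizId },
    orderBy: [{ score: "desc" }, { completedAt: "asc" }],
    skip: page * pageSize,
    take: pageSize,
  };
}

// Usage: prisma.submission.findMany(leaderboardQueryArgs(quizId, 0));
```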
Optimize the Quiz Fetch Query
A naive approach makes N+1 queries. Use a single query with joins:
async function getQuizWithQuestions(quizId: string) {
  // Bad: N+1 queries
  // const quiz = await prisma.quiz.findUnique({ where: { id: quizId } });
  // const questions = await prisma.question.findMany({ where: { quizId } });
  // for (const q of questions) {
  //   q.answers = await prisma.answer.findMany({ where: { questionId: q.id } });
  // }

  // Good: single query with includes
  return prisma.quiz.findUnique({
    where: { id: quizId, published: true },
    include: {
      questions: {
        orderBy: { sortOrder: "asc" },
        include: {
          answers: {
            select: { id: true, text: true, sortOrder: true },
            orderBy: { sortOrder: "asc" },
          },
        },
      },
    },
  });
}
Connection Pooling
A single database connection can handle one query at a time. Under load, you need a pool:
// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  // A common starting point for pool size:
  //   (number of CPU cores * 2) + number of disks
  // For a 4-core server with one disk that gives 9; 10 is a reasonable round number.
}
For Prisma, configure the connection pool via the URL:
DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=10&pool_timeout=10"
If you are using external connection pooling (PgBouncer), set the pool mode to transaction:
DATABASE_URL="postgresql://user:pass@pgbouncer:6432/db?pgbouncer=true"
Stage 2: Application-Level Caching (1,000 to 10,000 Players)
Quizzes are read-heavy. The same quiz gets fetched thousands of times but changes rarely. This is a perfect caching use case.
Redis Cache Layer
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);
const CACHE_TTL = 300; // 5 minutes

async function getCachedQuiz(quizId: string) {
  const cacheKey = `quiz:${quizId}`;

  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  const quiz = await getQuizWithQuestions(quizId);
  if (!quiz) return null;

  // Store in cache
  await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);
  return quiz;
}

async function invalidateQuizCache(quizId: string) {
  await redis.del(`quiz:${quizId}`);
}
Cache Stampede Prevention
When the cache expires, hundreds of concurrent requests all miss the cache and hit the database simultaneously. Use a mutex:
async function getCachedQuizSafe(quizId: string) {
  const cacheKey = `quiz:${quizId}`;
  const lockKey = `lock:quiz:${quizId}`;

  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Try to acquire lock
  const acquired = await redis.set(lockKey, "1", "EX", 5, "NX");
  if (acquired) {
    // We got the lock - fetch and cache
    try {
      const quiz = await getQuizWithQuestions(quizId);
      if (quiz) {
        await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);
      }
      return quiz;
    } finally {
      await redis.del(lockKey);
    }
  }

  // Another process is fetching - wait briefly and retry
  await new Promise((resolve) => setTimeout(resolve, 100));
  const retried = await redis.get(cacheKey);
  if (retried) return JSON.parse(retried);

  // Fallback to database if cache still empty
  return getQuizWithQuestions(quizId);
}
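A complementary, cheaper guard is jittering the TTL so that cache entries written at the same moment do not all expire in the same second. This is a common technique, but the jitter range below is an assumption, not something from this guide:

```typescript
// Hypothetical: randomize the TTL by +/-10% so popular entries cached
// together do not expire together and stampede the database at once.
function jitteredTtl(baseSeconds: number, jitterFraction = 0.1): number {
  const delta = baseSeconds * jitterFraction;
  return Math.round(baseSeconds - delta + Math.random() * 2 * delta);
}

// Usage: redis.set(cacheKey, payload, "EX", jitteredTtl(CACHE_TTL));
```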
Response Caching with ETags
Reduce bandwidth by letting clients cache responses:
import crypto from "crypto";

app.get("/api/v1/quizzes/:id", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Generate ETag from content
  const etag = crypto
    .createHash("md5")
    .update(JSON.stringify(quiz))
    .digest("hex");

  res.setHeader("ETag", `"${etag}"`);
  res.setHeader("Cache-Control", "private, max-age=60");

  // Check if client has the current version
  if (req.headers["if-none-match"] === `"${etag}"`) {
    return res.status(304).end();
  }

  res.json(quiz);
});
Stage 3: Horizontal Scaling (10,000 to 50,000 Players)
When a single server is not enough, scale horizontally.
Stateless Application Servers
Make sure your API servers share no in-memory state. Move all state to Redis or the database:
// Bad: in-memory session store
const sessions = new Map();

// Good: Redis session store
import session from "express-session";
import RedisStore from "connect-redis";

app.use(
  session({
    store: new RedisStore({ client: redis }),
    secret: process.env.SESSION_SECRET!,
    resave: false,
    saveUninitialized: false,
    cookie: { maxAge: 3600000 },
  })
);
Load Balancer Configuration
With stateless servers, put them behind a load balancer. Here is an Nginx configuration for upstream servers:
upstream quiz_api {
    least_conn;  # Route to the server with the fewest active connections
    server 10.0.1.10:3000;
    server 10.0.1.11:3000;
    server 10.0.1.12:3000;
    keepalive 64;
}

server {
    listen 80;

    location / {
        proxy_pass http://quiz_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /health {
        proxy_pass http://quiz_api;
        access_log off;
    }
}
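The /health location assumes the application exposes a health endpoint. A minimal sketch of the aggregation logic, under the assumption that the route handler fills in real dependency checks (a Redis PING, a SELECT 1); `healthStatus` is a hypothetical helper, not a framework API:

```typescript
// Hypothetical helper: collapse per-dependency checks into the overall
// status the load balancer sees. 503 takes an unhealthy server out of
// rotation without killing the process.
function healthStatus(checks: Record<string, boolean>) {
  const healthy = Object.values(checks).every(Boolean);
  return {
    httpCode: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "degraded", checks },
  };
}

// In an Express handler:
// const result = healthStatus({ redis: redisOk, database: dbOk });
// res.status(result.httpCode).json(result.body);
```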
Queue Heavy Operations
Quiz submissions that trigger score calculations, leaderboard updates, and webhook notifications should be queued:
import { Queue, Worker } from "bullmq";

const submissionQueue = new Queue("quiz-submissions", {
  connection: { host: "redis-host", port: 6379 },
});

// API handler - enqueue and respond immediately
app.post("/api/v1/quizzes/:id/submit", async (req, res) => {
  const { id: quizId } = req.params;
  const { answers } = req.body;
  const userId = req.user!.id;

  // Quick score calculation for immediate response
  const quiz = await getCachedQuizSafe(quizId);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }
  const score = calculateScore(quiz.questions, answers);

  // Enqueue the heavy operations
  await submissionQueue.add("process-submission", {
    quizId,
    userId,
    answers,
    score: score.score,
    completedAt: new Date().toISOString(),
  });

  // Respond immediately with the score
  res.json(score);
});

// Worker processes submissions in the background
const worker = new Worker(
  "quiz-submissions",
  async (job) => {
    const { quizId, userId, answers, score, completedAt } = job.data;

    // Store in database
    await prisma.submission.create({
      data: { quizId, userId, score, completedAt: new Date(completedAt) },
    });

    // Update leaderboard
    await updateLeaderboard(quizId, userId, score);

    // Send webhooks
    await sendWebhook("quiz.completed", { quizId, userId, score });

    // Update user stats
    await updateUserStats(userId, score);
  },
  { connection: { host: "redis-host", port: 6379 }, concurrency: 10 }
);
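calculateScore is used above but never shown. A minimal sketch, assuming each question carries its correct answer id and client answers arrive as a question-to-answer map; the field names here are illustrative, not from this guide. Note one subtlety: the cached quiz deliberately strips correctness data from answers, so the scoring path needs a server-side lookup that includes the correct answers rather than the public cached shape:

```typescript
// Illustrative shapes - adjust to your actual schema.
interface ScoredQuestion {
  id: string;
  correctAnswerId: string;
}

function calculateScore(
  questions: ScoredQuestion[],
  answers: Record<string, string> // questionId -> chosen answerId
) {
  const correct = questions.filter(
    (q) => answers[q.id] === q.correctAnswerId
  ).length;
  return {
    score: correct,
    total: questions.length,
    percentage: questions.length
      ? Math.round((correct / questions.length) * 100)
      : 0,
  };
}
```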
Stage 4: Database Read Replicas (50,000 to 100,000 Players)
At this scale, the database becomes the bottleneck again. Separate read and write traffic:
// lib/prisma.ts
import { PrismaClient } from "@prisma/client";

// Primary for writes
export const prismaWrite = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_PRIMARY_URL } },
});

// Read replica for queries
export const prismaRead = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_REPLICA_URL } },
});

// Helper to select the right client
export function prisma(mode: "read" | "write" = "read") {
  return mode === "write" ? prismaWrite : prismaRead;
}
Use this in your code:
// Reads go to the replica
const quiz = await prisma("read").quiz.findUnique({
  where: { id: quizId },
});

// Writes go to the primary
await prisma("write").submission.create({
  data: { quizId, userId, score },
});
Be aware of replication lag. After a write, the data may not be immediately available on the read replica. For submission results, read from the primary:
// After creating a submission, read it back from primary
const submission = await prisma("write").submission.findUnique({
  where: { id: submissionId },
});
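A more general version of this pattern is read-your-writes routing: pin a user's reads to the primary for a short window after they write, then fall back to the replica. A sketch; the window length is an assumed lag budget, not a measured value, and `clientModeFor` is our own helper:

```typescript
// Hypothetical read-your-writes router: after a user writes, send their
// reads to the primary until the replica has likely caught up.
const PIN_TO_PRIMARY_MS = 500; // assumed replication-lag budget

function clientModeFor(
  lastWriteAt: number | undefined,
  now: number = Date.now()
): "read" | "write" {
  if (lastWriteAt !== undefined && now - lastWriteAt < PIN_TO_PRIMARY_MS) {
    return "write"; // primary
  }
  return "read"; // replica
}

// Usage with the prisma() helper above, tracking write times per user:
// const db = prisma(clientModeFor(lastWriteTimestamps.get(userId)));
```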
Stage 5: CDN for Static Assets
Serve quiz images, static question data, and the frontend through a CDN:
// Set cache headers for static quiz content
app.get("/api/v1/quizzes/:id/static", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Long cache for published quiz content
  res.setHeader("Cache-Control", "public, max-age=3600, s-maxage=86400");
  res.setHeader("CDN-Cache-Control", "max-age=86400");
  res.json(quiz);
});
Performance Benchmarks
Here is what to expect at each stage:
| Stage | Concurrent Users | p95 Latency | Infrastructure |
|---|---|---|---|
| Baseline | 100 | 200ms | 1 server, 1 DB |
| DB Optimized | 1,000 | 100ms | 1 server, 1 DB |
| Cached | 10,000 | 30ms | 1 server, 1 DB, Redis |
| Horizontal | 50,000 | 50ms | 3 servers, 1 DB, Redis |
| Read Replicas | 100,000 | 40ms | 3 servers, 1 primary + 2 replicas, Redis |
Summary
Scale incrementally. Most quiz platforms will never need read replicas, and over-engineering early wastes money and adds complexity. Start with database optimization and caching - these two steps alone handle the 1,000 to 10,000 range, which covers the majority of use cases.
The scaling path:
- Optimize queries and add indexes
- Add Redis caching for quiz content
- Prevent cache stampedes with locking
- Move to stateless servers behind a load balancer
- Queue heavy operations like webhook delivery and stat updates
- Add read replicas when the primary database is at capacity
- Put static content behind a CDN
Measure before you optimize. Use the monitoring setup from our Prometheus and Grafana guide to identify actual bottlenecks rather than guessing.
Related Articles
Building a Quiz Import/Export System
Design a robust import/export system for quizzes with JSON and CSV support, validation schemas, bulk operations, and clear error reporting.
Monitoring Quiz API Performance with Prometheus and Grafana
Instrument your quiz API with Prometheus metrics, build Grafana dashboards, and set up alerts that catch problems before users notice.
Rate Limiting Your Quiz API: A Practical Guide
Protect your quiz API from abuse with token bucket and sliding window rate limiters. Includes Redis-based implementation and graceful 429 handling.