Scaling Quiz Delivery: From 100 to 100,000 Concurrent Players
Scale your quiz platform to handle massive concurrent load with database optimization, caching, connection pooling, and read replicas.
When Your Quiz Goes Viral
Your quiz works fine with 100 users. Then a teacher assigns it to 2,000 students who all click "Start" at the same time. Or a marketing campaign drives 50,000 people to your quiz in an hour. Suddenly you are dealing with connection pool exhaustion, slow queries, and timeouts.
Scaling a quiz platform is not about rewriting everything. It is about identifying bottlenecks and addressing them in the right order. This guide covers the progression from 100 to 100,000 concurrent players, tackling each bottleneck as it appears.
Stage 1: Database Optimization (100 to 1,000 Players)
The first bottleneck is almost always the database. Before adding infrastructure, make sure your queries are efficient.
Index Your Queries
Find slow queries and add targeted indexes:
-- Quizzes are fetched by ID and published status constantly
CREATE INDEX idx_quizzes_published ON quizzes(id) WHERE published = true;
-- Questions are always fetched with their quiz
CREATE INDEX idx_questions_quiz_id ON questions(quiz_id);
-- Answers are fetched with their question
CREATE INDEX idx_answers_question_id ON answers(question_id);
-- Submissions are queried by user and quiz
CREATE INDEX idx_submissions_user_quiz ON submissions(user_id, quiz_id);
-- Leaderboard queries sort by score
CREATE INDEX idx_submissions_quiz_score ON submissions(quiz_id, score DESC, completed_at ASC);
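The leaderboard index only pays off if the query's ORDER BY matches the index column order (score DESC, completed_at ASC). As a sketch, here is the shape of a paginated leaderboard query that lines up with it; `leaderboardQueryArgs` is a helper invented for this example, not a Prisma API:

```typescript
// Hypothetical helper: builds arguments for a paginated leaderboard fetch
// whose ORDER BY matches idx_submissions_quiz_score, so Postgres can read
// rows straight off the index instead of sorting at query time.
function leaderboardQueryArgs(quizId: string, page = 0, pageSize = 25) {
  return {
    where: { quizId },
    orderBy: [{ score: "desc" }, { completedAt: "asc" }],
    skip: page * pageSize,
    take: pageSize,
  };
}

// Usage: prisma.submission.findMany(leaderboardQueryArgs(quizId, 0));
```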
Optimize the Quiz Fetch Query
A naive approach makes N+1 queries. Use a single query with joins:
async function getQuizWithQuestions(quizId: string) {
  // Bad: N+1 queries
  // const quiz = await prisma.quiz.findUnique({ where: { id: quizId } });
  // const questions = await prisma.question.findMany({ where: { quizId } });
  // for (const q of questions) {
  //   q.answers = await prisma.answer.findMany({ where: { questionId: q.id } });
  // }

  // Good: single query with includes
  return prisma.quiz.findUnique({
    where: { id: quizId, published: true },
    include: {
      questions: {
        orderBy: { sortOrder: "asc" },
        include: {
          answers: {
            select: { id: true, text: true, sortOrder: true },
            orderBy: { sortOrder: "asc" },
          },
        },
      },
    },
  });
}
Connection Pooling
A single database connection can handle one query at a time. Under load, you need a pool:
// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
  // A common starting point for pool size:
  //   (number of CPU cores * 2) + number of disks
  // For a 4-core server with one disk that gives 9; 10 is a reasonable round number.
}
For Prisma, configure the connection pool via the URL:
DATABASE_URL="postgresql://user:pass@host:5432/db?connection_limit=10&pool_timeout=10"
If you are using external connection pooling (PgBouncer), set the pool mode to transaction:
DATABASE_URL="postgresql://user:pass@pgbouncer:6432/db?pgbouncer=true"
Stage 2: Application-Level Caching (1,000 to 10,000 Players)
Quizzes are read-heavy. The same quiz gets fetched thousands of times but changes rarely. This is a perfect caching use case.
Redis Cache Layer
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL);
const CACHE_TTL = 300; // 5 minutes

async function getCachedQuiz(quizId: string) {
  const cacheKey = `quiz:${quizId}`;

  // Try cache first
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }

  // Cache miss - fetch from database
  const quiz = await getQuizWithQuestions(quizId);
  if (!quiz) return null;

  // Store in cache
  await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);
  return quiz;
}

async function invalidateQuizCache(quizId: string) {
  await redis.del(`quiz:${quizId}`);
}
Cache Stampede Prevention
When the cache expires, hundreds of concurrent requests all miss the cache and hit the database simultaneously. Use a mutex:
async function getCachedQuizSafe(quizId: string) {
  const cacheKey = `quiz:${quizId}`;
  const lockKey = `lock:quiz:${quizId}`;

  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // Try to acquire lock
  const acquired = await redis.set(lockKey, "1", "EX", 5, "NX");
  if (acquired) {
    // We got the lock - fetch and cache
    try {
      const quiz = await getQuizWithQuestions(quizId);
      if (quiz) {
        await redis.set(cacheKey, JSON.stringify(quiz), "EX", CACHE_TTL);
      }
      return quiz;
    } finally {
      await redis.del(lockKey);
    }
  }

  // Another process is fetching - wait briefly and retry
  await new Promise((resolve) => setTimeout(resolve, 100));
  const retried = await redis.get(cacheKey);
  if (retried) return JSON.parse(retried);

  // Fallback to database if cache still empty
  return getQuizWithQuestions(quizId);
}
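A complementary, cheaper guard is jittering the TTL so that cache entries written at the same moment do not all expire in the same second. This is a common technique, but the jitter range below is an assumption, not something from this guide:

```typescript
// Hypothetical: randomize the TTL by +/-10% so popular entries cached
// together do not expire together and stampede the database at once.
function jitteredTtl(baseSeconds: number, jitterFraction = 0.1): number {
  const delta = baseSeconds * jitterFraction;
  return Math.round(baseSeconds - delta + Math.random() * 2 * delta);
}

// Usage: redis.set(cacheKey, payload, "EX", jitteredTtl(CACHE_TTL));
```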
Response Caching with ETags
Reduce bandwidth by letting clients cache responses:
import crypto from "crypto";

app.get("/api/v1/quizzes/:id", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Generate ETag from content
  const etag = crypto
    .createHash("md5")
    .update(JSON.stringify(quiz))
    .digest("hex");

  res.setHeader("ETag", `"${etag}"`);
  res.setHeader("Cache-Control", "private, max-age=60");

  // Check if client has the current version
  if (req.headers["if-none-match"] === `"${etag}"`) {
    return res.status(304).end();
  }

  res.json(quiz);
});
Stage 3: Horizontal Scaling (10,000 to 50,000 Players)
When a single server is not enough, scale horizontally.
Stateless Application Servers
Make sure your API servers share no in-memory state. Move all state to Redis or the database:
// Bad: in-memory session store
const sessions = new Map();

// Good: Redis session store
import session from "express-session";
import RedisStore from "connect-redis";

app.use(
  session({
    store: new RedisStore({ client: redis }),
    secret: process.env.SESSION_SECRET!,
    resave: false,
    saveUninitialized: false,
    cookie: { maxAge: 3600000 },
  })
);
Load Balancer Configuration
With stateless servers, put them behind a load balancer. Here is an Nginx configuration for upstream servers:
upstream quiz_api {
    least_conn;  # Route to the server with the fewest active connections
    server 10.0.1.10:3000;
    server 10.0.1.11:3000;
    server 10.0.1.12:3000;
    keepalive 64;
}

server {
    listen 80;

    location / {
        proxy_pass http://quiz_api;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /health {
        proxy_pass http://quiz_api;
        access_log off;
    }
}
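The /health location assumes the application exposes a health endpoint. A minimal sketch of the aggregation logic, under the assumption that the route handler fills in real dependency checks (a Redis PING, a SELECT 1); `healthStatus` is a hypothetical helper, not a framework API:

```typescript
// Hypothetical helper: collapse per-dependency checks into the overall
// status the load balancer sees. 503 takes an unhealthy server out of
// rotation without killing the process.
function healthStatus(checks: Record<string, boolean>) {
  const healthy = Object.values(checks).every(Boolean);
  return {
    httpCode: healthy ? 200 : 503,
    body: { status: healthy ? "ok" : "degraded", checks },
  };
}

// In an Express handler:
// const result = healthStatus({ redis: redisOk, database: dbOk });
// res.status(result.httpCode).json(result.body);
```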
Queue Heavy Operations
Quiz submissions that trigger score calculations, leaderboard updates, and webhook notifications should be queued:
import { Queue, Worker } from "bullmq";

const submissionQueue = new Queue("quiz-submissions", {
  connection: { host: "redis-host", port: 6379 },
});

// API handler - enqueue and respond immediately
app.post("/api/v1/quizzes/:id/submit", async (req, res) => {
  const { id: quizId } = req.params;
  const { answers } = req.body;
  const userId = req.user!.id;

  // Quick score calculation for immediate response
  const quiz = await getCachedQuizSafe(quizId);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }
  const score = calculateScore(quiz.questions, answers);

  // Enqueue the heavy operations
  await submissionQueue.add("process-submission", {
    quizId,
    userId,
    answers,
    score: score.score,
    completedAt: new Date().toISOString(),
  });

  // Respond immediately with the score
  res.json(score);
});

// Worker processes submissions in the background
const worker = new Worker(
  "quiz-submissions",
  async (job) => {
    const { quizId, userId, answers, score, completedAt } = job.data;

    // Store in database
    await prisma.submission.create({
      data: { quizId, userId, score, completedAt: new Date(completedAt) },
    });

    // Update leaderboard
    await updateLeaderboard(quizId, userId, score);

    // Send webhooks
    await sendWebhook("quiz.completed", { quizId, userId, score });

    // Update user stats
    await updateUserStats(userId, score);
  },
  { connection: { host: "redis-host", port: 6379 }, concurrency: 10 }
);
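calculateScore is used above but never shown. A minimal sketch, assuming each question carries its correct answer id and client answers arrive as a question-to-answer map; the field names here are illustrative, not from this guide. Note one subtlety: the cached quiz deliberately strips correctness data from answers, so the scoring path needs a server-side lookup that includes the correct answers rather than the public cached shape:

```typescript
// Illustrative shapes - adjust to your actual schema.
interface ScoredQuestion {
  id: string;
  correctAnswerId: string;
}

function calculateScore(
  questions: ScoredQuestion[],
  answers: Record<string, string> // questionId -> chosen answerId
) {
  const correct = questions.filter(
    (q) => answers[q.id] === q.correctAnswerId
  ).length;
  return {
    score: correct,
    total: questions.length,
    percentage: questions.length
      ? Math.round((correct / questions.length) * 100)
      : 0,
  };
}
```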
Stage 4: Database Read Replicas (50,000 to 100,000 Players)
At this scale, the database becomes the bottleneck again. Separate read and write traffic:
// lib/prisma.ts
import { PrismaClient } from "@prisma/client";

// Primary for writes
export const prismaWrite = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_PRIMARY_URL } },
});

// Read replica for queries
export const prismaRead = new PrismaClient({
  datasources: { db: { url: process.env.DATABASE_REPLICA_URL } },
});

// Helper to select the right client
export function prisma(mode: "read" | "write" = "read") {
  return mode === "write" ? prismaWrite : prismaRead;
}
Use this in your code:
// Reads go to the replica
const quiz = await prisma("read").quiz.findUnique({
  where: { id: quizId },
});

// Writes go to the primary
await prisma("write").submission.create({
  data: { quizId, userId, score },
});
Be aware of replication lag. After a write, the data may not be immediately available on the read replica. For submission results, read from the primary:
// After creating a submission, read it back from primary
const submission = await prisma("write").submission.findUnique({
  where: { id: submissionId },
});
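A more general version of this pattern is read-your-writes routing: pin a user's reads to the primary for a short window after they write, then fall back to the replica. A sketch; the window length is an assumed lag budget, not a measured value, and `clientModeFor` is our own helper:

```typescript
// Hypothetical read-your-writes router: after a user writes, send their
// reads to the primary until the replica has likely caught up.
const PIN_TO_PRIMARY_MS = 500; // assumed replication-lag budget

function clientModeFor(
  lastWriteAt: number | undefined,
  now: number = Date.now()
): "read" | "write" {
  if (lastWriteAt !== undefined && now - lastWriteAt < PIN_TO_PRIMARY_MS) {
    return "write"; // primary
  }
  return "read"; // replica
}

// Usage with the prisma() helper above, tracking write times per user:
// const db = prisma(clientModeFor(lastWriteTimestamps.get(userId)));
```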
Stage 5: CDN for Static Assets
Serve quiz images, static question data, and the frontend through a CDN:
// Set cache headers for static quiz content
app.get("/api/v1/quizzes/:id/static", async (req, res) => {
  const quiz = await getCachedQuizSafe(req.params.id);
  if (!quiz) {
    return res.status(404).json({ error: "Quiz not found" });
  }

  // Long cache for published quiz content
  res.setHeader("Cache-Control", "public, max-age=3600, s-maxage=86400");
  res.setHeader("CDN-Cache-Control", "max-age=86400");
  res.json(quiz);
});
Performance Benchmarks
Here is what to expect at each stage:
| Stage | Concurrent Users | p95 Latency | Infrastructure |
|---|---|---|---|
| Baseline | 100 | 200ms | 1 server, 1 DB |
| DB Optimized | 1,000 | 100ms | 1 server, 1 DB |
| Cached | 10,000 | 30ms | 1 server, 1 DB, Redis |
| Horizontal | 50,000 | 50ms | 3 servers, 1 DB, Redis |
| Read Replicas | 100,000 | 40ms | 3 servers, 1 primary + 2 replicas, Redis |
Summary
Scale incrementally. Most quiz platforms will never need read replicas, and over-engineering early wastes money and adds complexity. Start with database optimization and caching - these two steps alone handle the 1,000 to 10,000 range, which covers the majority of use cases.
The scaling path:
- Optimize queries and add indexes
- Add Redis caching for quiz content
- Prevent cache stampedes with locking
- Move to stateless servers behind a load balancer
- Queue heavy operations like webhook delivery and stat updates
- Add read replicas when the primary database is at capacity
- Put static content behind a CDN
Measure before you optimize. Use the monitoring setup from our Prometheus and Grafana guide to identify actual bottlenecks rather than guessing.
Related Articles
Building a Quiz Import/Export System
Design a robust import/export system for quizzes with JSON and CSV support, validation schemas, bulk operations, and clear error reporting.
Monitoring Quiz API Performance with Prometheus and Grafana
Instrument your quiz API with Prometheus metrics, build Grafana dashboards, and set up alerts that catch problems before users notice.
Rate Limiting Your Quiz API: A Practical Guide
Protect your quiz API from abuse with token bucket and sliding window rate limiters. Includes Redis-based implementation and graceful 429 handling.