One Million Checkboxes: 125KB of Bitset, Binary WebSockets, and a 50ms Batch
In summer 2024, @itseieio built a page with one million checkboxes on it. Shared state. Every visitor saw every change in real time. It went viral — not because of the checkboxes themselves, but because of what people started doing with them: drawing pixel art, writing messages, carving out territory across a canvas of booleans that anyone could flip.
I read his writeup. Then I spent a week building my own version from scratch in Go and React.
The checkboxes are the least interesting part of the problem. What fascinated me was the question underneath: how do you keep a million boolean values in sync across every connected client simultaneously, without either melting your server or flooding each user with stale updates?
This is what I built, and what I learned building it. The full source is available as a free listing on DevsDistro — a decentralized protocol for code monetization powered by Solana and GitHub.
Phase 1: The Original
Credit where it is due: @itseieio built the original One Million Checkboxes. His writeup is his own story to tell. I am not going to summarize it here.
What this post is about is my own implementation at omcb.anuragpsarmah.me — built from scratch, with no reference to his codebase. Go backend, React frontend, binary WebSocket protocol, deployed on a Contabo VPS behind Nginx with Vercel on the frontend. The engineering decisions in every section below are from that codebase alone.
The goal was not to replicate his work. It was to sit with the same problem statement and see what I arrived at.
Phase 2: 125,000 Bytes
The first question is where to store the state of one million checkboxes.
The naive answer is a database — one row per checkbox. Toggle index 482,311 and write an update. Simple enough until a new user connects and you need to send the full current state. That is one million rows to read, serialize, and transfer per new connection. Even with a warm cache, that is unacceptably slow and expensive as connection volume grows.
The next naive answer is an in-memory array of booleans, JSON-encoded. A JSON array of one million true/false values weighs roughly 5MB to 6MB before compression, depending on how many of them are true versus false. With dozens of simultaneous connections coming and going, you are pushing hundreds of megabytes of initial state constantly. For booleans.
The actual answer is older than most of the frameworks involved: a bitset. A boolean is one bit, not one byte. 1,000,000 bits is exactly 125,000 bytes — 125KB.
```go
func (s *CheckboxState) Toggle(index uint32) (newVal bool, seq uint64, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	byteIdx := index >> 3         // which byte (divide by 8)
	bit := byte(1 << (index & 7)) // which bit within that byte (mod 8)
	s.bits[byteIdx] ^= bit        // XOR flips exactly this bit
	newVal = s.bits[byteIdx]&bit != 0
	seq = s.toggleSeq.Add(1)
	return newVal, seq, true
}
```

The XOR flips exactly one bit. index >> 3 is the byte index. index & 7 is the bit position within that byte. The same layout lives in the frontend as a Uint8Array:
```typescript
export class BitSet {
  private bits: Uint8Array; // same 125,000 bytes

  constructor(size: number) {
    this.bits = new Uint8Array(Math.ceil(size / 8));
  }

  get(index: number): boolean {
    return (this.bits[index >> 3] & (1 << (index & 7))) !== 0;
  }

  // Write an absolute value; used when applying server deltas
  set(index: number, value: boolean): void {
    if (value) this.bits[index >> 3] |= 1 << (index & 7);
    else this.bits[index >> 3] &= ~(1 << (index & 7));
  }

  toggle(index: number): void {
    this.bits[index >> 3] ^= 1 << (index & 7);
  }

  loadBytes(bytes: Uint8Array): void {
    this.bits.set(bytes); // direct copy, no parsing
  }
}
```

The wire format and the in-memory format are identical. When the server sends the initial state snapshot, the client calls loadBytes — a direct byte copy, no JSON parsing, no intermediate representation. Initial state delivery is always 125KB, regardless of how many checkboxes are checked.
That is the whole storage model. No database. No external service. A single []byte slice held in memory, persisted to disk every five seconds via atomic file rename, with a final save on graceful shutdown. Routine restarts keep the current state; an unexpected crash can lose at most one save interval.
Phase 3: The Sync Problem
Storing state efficiently is the easy part. The harder question is what happens every time someone flips a checkbox.
Every connected client needs to receive that change. With many clients each capable of toggling multiple checkboxes per second, the most obvious implementation — forward each toggle individually to all clients the moment it arrives — creates a message storm. Imagine a thousand users all clicking at once: each click fans out to a thousand clients, generating a million message deliveries per second. That is not a WebSocket problem. That is a design problem.
There is also a subtlety about what you actually need to communicate. Inside the batch format, a change carries only two pieces of information: the checkbox index (4 bytes) and its new value (1 byte). The delta message itself also carries fixed framing — message type, sequence number, update count — but the dominant cost is still how many changes and how many broadcasts you send. The problem is message volume, not message size.
You cannot forward every toggle immediately. You need a batch window.
Phase 4: 50ms, Then Deduplicate
The batch window in my implementation is 50 milliseconds. Every 50ms, the server drains its pending update queue and broadcasts a single delta message to all clients. One broadcast per 50ms window instead of one per toggle. That is the obvious optimization.
The less obvious one is deduplication. Consider what happens when a user rapidly clicks the same checkbox multiple times within a single 50ms window. If you replay every toggle in order, you produce a message like “checkbox 42: true, false, true, false, true” for a single checkbox in one delta. Clients would have to apply all five intermediate states. That is wasteful, and it means clients need to understand history rather than just applying current state.
The fix is to deduplicate within the batch: keep only the final state of each checkbox index, discarding earlier occurrences. The implementation does this with a reverse scan:
```go
deduped := make([]Update, 0, len(batch))
seen := make(map[uint32]bool, len(batch))
for i := len(batch) - 1; i >= 0; i-- {
	u := batch[i]
	if !seen[u.Index] {
		seen[u.Index] = true
		deduped = append(deduped, u) // keep only the last state
	}
	// earlier occurrences of the same index are silently dropped
}
```

Scan backwards. The first time you encounter a checkbox index, that is its final state — keep it. Any earlier occurrence of the same index is stale — drop it. The result is one entry per changed checkbox, representing where it ended up, not how it got there.
After deduplication, the delta message contains exactly what every client needs: which checkboxes changed and what state they are now in. The entire batch becomes idempotent. Apply it once, apply it twice — the result is the same.
Phase 5: Sequence Numbers
Batching and deduplication solve the broadcast problem. Sequence numbers solve the reconnect problem.
When a client connects or reconnects, the server takes a full state snapshot — all 125KB — along with a monotonically increasing sequence number: the last accepted toggle represented in that state. In the hub, that full-state message is enqueued to the client before the client is added to the broadcast set.
That ordering does most of the work. Registration and batch flushing both happen on the same hub goroutine, so no delta can slip into the gap between snapshot creation and the client's first full-state message. The race is largely eliminated at the server boundary, not papered over afterward.
Sequence numbers still matter on the client. Each delta message carries the highest toggle sequence it includes. The client stores the sequence number from its most recent full state. When a delta arrives:
```typescript
case MSG_DELTA: {
  const toSeq = readUint64BE(view, 1);
  // Discard any delta already covered by the full state snapshot
  if (toSeq <= lastFullStateSeqRef.current) return;
  const n = view.getUint32(9, false); // big-endian update count
  const updates = Array.from({ length: n }, (_, i) => ({
    index: view.getUint32(13 + i * 5, false),
    value: view.getUint8(13 + i * 5 + 4) === 1,
  }));
  callbacksRef.current.onDelta(updates);
}
```

Any delta with a toSeq at or below the snapshot's sequence number is already covered by the state the client received. Discard it. Only apply deltas that represent changes newer than the snapshot.
This is a small amount of code, but it makes the reconnect path explicit: once a full state says I include everything through sequence N, anything at or below N is old news. Snapshot-before-registration removes the primary race, and the sequence check acts as the final guardrail.
Phase 6: The Frontend's Problem
Rendering is where most naive implementations collapse before any of the server design even matters.
A DOM with one million nodes is not a DOM. It is a memory allocation that will crash most browsers before it finishes rendering. The solution is virtual rendering — only render what the user can actually see. I used react-window's FixedSizeGrid to maintain a 28×28px checkbox grid where only the visible cells plus a small overscan buffer exist as actual DOM nodes. On a typical desktop viewport, that is a few hundred nodes, not a million.
But there is a deeper problem: React state. If every checkbox were a useState(false), you would need one million React state hooks. More practically, toggling any single checkbox would schedule a reconciliation pass that React has to evaluate against one million components. The result would be unusable.
The solution is to move the checkbox state entirely outside of React, into a useRef, and use a single integer counter as the React state that drives checkbox-grid re-rendering:
```typescript
const bitsetRef = useRef<BitSet>(new BitSet(TOTAL_CHECKBOXES));
const [version, setVersion] = useState(0);
const bump = useCallback(() => setVersion((v) => v + 1), []);

// On incoming delta from server
onDelta: (updates) => {
  for (const u of updates) bitsetRef.current.set(u.index, u.value);
  bump(); // one re-render for the entire batch
},

// On user click
const toggle = useCallback((index: number) => {
  bitsetRef.current.toggle(index); // optimistic, immediate
  bump();
  sendToggle(index); // fire to server
}, [bump, sendToggle]);
```

React sees one thing change: the version integer. That triggers a re-render of the grid. Cells read their current state directly from the ref. The version counter is purely a render trigger — it carries no information about which checkboxes changed. That information lives in the bitset, and cells read it themselves.
In steady state, memoization keeps the re-render cost proportional to the number of visible checkboxes whose checked state changed, not the total grid size:
```typescript
const Cell = memo(
  ({ style, onToggle, isChecked, index }: CellProps) => {
    // ...render checkbox
  },
  (prevProps, nextProps) => prevProps.isChecked === nextProps.isChecked
);
```

A cell compares only its own isChecked prop. As long as the grid layout is unchanged, a delta that touches twenty visible checkboxes makes those twenty do real work and the rest bail out at the memo boundary. Resize is a separate event, because changing the column count remaps indices.
Optimistic Updates and Rollback
Clicks toggle the local bitset before the server responds. The change appears instantaneous. If the server comes back with a rate-limit message — too many toggles too fast — the toggle is rolled back: the same index is flipped again in the local bitset, version is bumped, and the checkbox snaps back to where it was. The rate limiter runs a token bucket at five toggles per second with a burst capacity of fifteen. Clicking faster than that produces a visible rollback and a toast notification.
It is a small detail, but it is the difference between a UI that feels responsive and one that feels like it is waiting for permission.
Conclusion
The checkboxes are a surface. Every architectural decision here exists to serve one constraint: shared mutable state must stay consistent across every connected client, at low latency, without melting the server or the browser.
- Bitset: 125KB of constant-size state. No database, no JSON arrays. Direct byte copy on initial connect.
- Binary protocol: 5 bytes per change inside the batch payload, plus a small fixed header. The overhead is message count, not message size — so reduce the count.
- 50ms batch + dedup: One delta per window instead of one per toggle. Final state per checkbox, not intermediate history.
- Sequence numbers: Snapshot-before-registration keeps reconnect ordering clean, and stale-delta discard is the final guardrail on the client.
- Ref-based state: For checkbox state, React tracks one integer. The bitset lives outside the component tree. Re-renders are cheap.
- Cell memoization: Within a stable layout, visible re-renders track changed checked states rather than the whole grid.
Each of these decisions is independent. Take out any one of them and a specific failure mode reappears — memory pressure on connect, reconnect flicker, browser melt on render, message storms under load. They compose.
Full credit to @itseieio for the original experiment and the writeup that sent me down this rabbit hole. My implementation is live at omcb.anuragpsarmah.me.