capnwasm render bench

End-to-end pipeline: request → wire → decode → field reads → DOM mutation → forced layout. capnweb × capnwasm crossed with WebSocket × HTTP-batch, each at three sizes, with cold & warm timings. No averages between transports: you see every cell.

Not production-ready yet. This benchmark explores the tradeoff boundary, not a hardened deployment claim. Normal readers now use managed WebAssembly.Memory, while allocator, large-data, hostile-input, and secure-memory hardening continues.

Summary

Run a bench to see the rollup.

How to read this page

[Figure: pipeline diagram showing client call → wire → decode → field reads → DOM mutation → forced layout. Each cell compares capnweb or capnwasm over WebSocket or HTTP.]

Every cell measures the full path above. The matrix crosses capnweb/capnwasm with WebSocket/HTTP; lower cold and warm timings win. Both libraries are allowed to use their intended fast path, and each workload renders the same DOM output.
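The shape of the matrix can be sketched as a plain cross product. This is only an illustration of how the cells are enumerated, not the page's own harness: the size labels and the `coldMs` / `warmMs` slot names are hypothetical, since the actual sizes differ per workload.

```javascript
// Sketch of the result matrix: library x transport x size, one cell each.
const LIBRARIES = ["capnweb", "capnwasm"];
const TRANSPORTS = ["WebSocket", "HTTP-batch"];
const SIZES = ["small", "medium", "large"]; // hypothetical labels

function buildCells() {
  const cells = [];
  for (const library of LIBRARIES) {
    for (const transport of TRANSPORTS) {
      for (const size of SIZES) {
        // Each cell keeps its own cold and warm timing; nothing is averaged
        // across transports or libraries.
        cells.push({ library, transport, size, coldMs: null, warmMs: null });
      }
    }
  }
  return cells;
}

const cells = buildCells();
console.log(cells.length + " cells per workload");
```

Keeping every cell separate means a result that only holds on one transport stays visible instead of being folded into a blended number.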

Workload 1 — List render

Server returns N user records in one RPC reply. Client decodes, builds a <ul> with one <li> per row, forces layout via offsetHeight. Realistic shape: list view, table page, search results.

Same app output: render id, name, email, and active for every row. Below is the literal client code timed in each cell — everything else (request, transport, render, layout) is identical.

capnweb
const arr = await session.getUserList(n);
for (const u of arr) renderRow(u.id, u.name, u.email, u.active);
capnwasm
const rows = await call.send({
  resultsReader: UserListReader,
  extract: (rdr) => rdr.draft(r => r.users.map(u => ({
    id: u.id, name: u.name, email: u.email, active: u.active,
  }))),
}).promise;
for (const u of rows) renderRow(u.id, u.name, u.email, u.active);

Workload 2 — Sparse field access (read 3 of 32)

Server returns a 32-field metadata struct. Client reads only 3 fields and renders them. Cap'n Proto's schema-evolved layout lets capnwasm skip the other 29 and batch the selected reads through a planned draft(); capnweb's JSON round-trip materializes all 32 up front. The size axis is the number of consecutive requests fired in the same tick (testing batching).

Same app output: render field0, field5, and field10. Literal client code per cell:

capnweb
const o = await session.getMetadata();
render([o.field0, o.field5, o.field10]);
capnwasm
const out = await call.send({
  resultsReader: WideUserDataReader,
  extract: (rdr) => {
    const p = rdr.draft(m => ({
      field0: m.field0, field5: m.field5, field10: m.field10,
    }));
    return [p.field0, p.field5, p.field10];
  },
}).promise;
render(out);
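Why a sparse read can skip the unread fields is easier to see with a toy fixed-offset record. The sketch below is not the Cap'n Proto wire format and does not use capnwasm; it assumes a hypothetical layout of 32 little-endian uint32 values packed back to back, and the helper names are made up for illustration.

```javascript
// Hypothetical layout: 32 little-endian uint32 fields, 4 bytes each.
// A stand-in for a fixed-offset binary struct, NOT the Cap'n Proto encoding.
const FIELD_COUNT = 32;
const FIELD_BYTES = 4;

function encodeRecord(values) {
  const view = new DataView(new ArrayBuffer(FIELD_COUNT * FIELD_BYTES));
  values.forEach((v, i) => view.setUint32(i * FIELD_BYTES, v, true));
  return view;
}

// Reads only the requested field indices; the other bytes are never touched.
function readSparse(view, indices) {
  return indices.map((i) => view.getUint32(i * FIELD_BYTES, true));
}

// JSON has to parse the whole document before any single field is available.
function readSparseJson(text, indices) {
  const o = JSON.parse(text);
  return indices.map((i) => o["field" + i]);
}

const values = Array.from({ length: FIELD_COUNT }, (_, i) => i * 100);
const binary = encodeRecord(values);
const json = JSON.stringify(
  Object.fromEntries(values.map((v, i) => ["field" + i, v])),
);

const fromBinary = readSparse(binary, [0, 5, 10]);
const fromJson = readSparseJson(json, [0, 5, 10]);
console.log(fromBinary, fromJson);
```

Both paths return the same three values; the difference is how much of the payload has to be visited to get them.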

Workload 3 — Dense field access (read all 32)

Same metadata struct, but the client reads every field and renders all 32 to the DOM. capnweb's eager-decode model turns every field into a plain JS property read. capnwasm uses one draft() projection over all 32 fields so the per-field reads cross the wasm boundary in a single batch.

Same app output as sparse, but every field0…field31 is read. Literal client code per cell:

capnweb
const o = await session.getMetadata();
render(Array.from({ length: 32 }, (_, i) => o["field" + i]));
capnwasm
const out = await call.send({
  resultsReader: WideUserDataReader,
  extract: (rdr) => {
    const p = rdr.draft(m => {
      const o = {}; for (const n of DENSE_FIELDS) o[n] = m[n]; return o;
    });
    return DENSE_FIELDS.map(n => p[n]);
  },
}).promise;
render(out);

Workload 4 — Re-read storm (10 fields × N reads, animation simulation)

After one fetch, read 10 fields N times each (simulating a re-render loop or per-frame animation). Both paths materialize the 10 fields once, then re-read plain JS values in the hot loop. Each timed cell includes one fetch followed by repeated reads; increasing N makes the post-fetch read cost dominate the single fetch.

Same app output: repeatedly read the same 10 logical fields after one response. Literal client code:

capnweb
const o = await session.getMetadata();
const read = (() => { let i = 0; return () => o[FIELDS[i++ % 10]]; })();
for (let k = 0; k < N; k++) acc += read();
capnwasm
const { bytes } = await call.send().promise;
const reader = openWideUserData(cpp, bytes);
const picked = reader.draft(m => { const o = {}; for (const n of FIELDS) o[n] = m[n]; return o; });
const read = (() => { let i = 0; return () => picked[FIELDS[i++ % 10]]; })();
for (let k = 0; k < N; k++) acc += read();
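The "materialize once, then re-read plain values" pattern can be checked with a small counting stand-in. This sketch assumes a hypothetical reader in which every property access counts as one boundary crossing; it uses a Proxy rather than capnwasm, so it shows only the access pattern, not real costs.

```javascript
// Hypothetical reader: each property access is counted as one crossing.
function makeCountingReader(source) {
  const stats = { crossings: 0 };
  const reader = new Proxy(source, {
    get(target, key) {
      stats.crossings++;
      return target[key];
    },
  });
  return { reader, stats };
}

const FIELDS = Array.from({ length: 10 }, (_, i) => "field" + i);
const source = Object.fromEntries(FIELDS.map((n, i) => [n, i + 1]));
const { reader, stats } = makeCountingReader(source);

// Materialize once: 10 crossings, no matter how many reads follow.
const picked = {};
for (const n of FIELDS) picked[n] = reader[n];

// Hot loop touches only the plain object, never the reader.
const N = 1000;
let acc = 0;
for (let k = 0; k < N; k++) acc += picked[FIELDS[k % 10]];

console.log(stats.crossings, acc);
```

Raising N leaves the crossing count unchanged, which is why the post-fetch read cost comes to dominate the cell as N grows.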

Workload 5 — Binary blob round-trip

Server echoes K bytes of binary data. Client decodes, renders the byte length to DOM (no canvas paint — we're measuring wire + decode, not GPU work). capnwasm sends raw bytes; capnweb base64-encodes (1.33× bandwidth + parse cost both directions).

Same app output: display the echoed byte length. This is not a field-read benchmark; it isolates binary transport and decode overhead. Literal client code per cell:

capnweb
const bytes = await session.getBlob(n);
render(`bytes: ${bytes.length}`);
capnwasm
const len = await call.send({
  resultsReader: BlobReplyReader,
  extract: (rdr) => rdr.data.length,
}).promise;
render(`bytes: ${len}`);
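The 1.33× figure follows from base64 mapping every 3 raw bytes to 4 ASCII characters. A minimal check, using only the standard `btoa` global (browsers and recent Node versions); the helper names and the 64 KB sample size are chosen for illustration:

```javascript
// Length of padded base64 output for n raw bytes: 4 * ceil(n / 3).
function base64Length(n) {
  return 4 * Math.ceil(n / 3);
}

// Encode a Uint8Array with btoa, in chunks so the argument list stays small.
function toBase64(bytes) {
  let binary = "";
  const CHUNK = 0x8000;
  for (let i = 0; i < bytes.length; i += CHUNK) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}

const raw = new Uint8Array(64 * 1024);
const encoded = toBase64(raw);
const ratio = encoded.length / raw.length;

console.log(raw.length, encoded.length, ratio.toFixed(3));
```

This covers only the size overhead of the encoding itself; the encode and decode time on each side is a separate cost that the timed cells capture.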

Wire-format bench — live in your browser

Same in-browser pipeline that used to live on /playground: REST/JSON, capnweb, and capnwasm side-by-side, fetching the same fixtures, decoding, rendering. Below it: an RPC bench (auto-runs after the fetch) and a general workload suite (auto-runs after RPC).

REST / JSON: fetch → JSON.parse → o.id / o.name / …
capnweb: fetch → capnweb.deserialize → o.id / o.name / …
capnwasm: fetch → openUser(cpp, bytes) → r.id / r.name / … → r.dispose()

Metrics reported for each: fetch, decode, render, total, wire bytes, egress @ 1k r/s.
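The "egress @ 1k r/s" figure is presumably the wire size of one response multiplied by 1,000 requests per second. A small sketch under that assumption; the function names and the 512-byte sample payload are hypothetical, not measured values from this page:

```javascript
// Assumed definition: bytes per response x requests per second.
function egressBytesPerSecond(wireBytes, requestsPerSecond = 1000) {
  return wireBytes * requestsPerSecond;
}

// Format a byte rate using decimal units (1 KB = 1000 bytes).
function formatRate(bytesPerSecond) {
  const units = ["B/s", "KB/s", "MB/s", "GB/s"];
  let value = bytesPerSecond;
  let unit = 0;
  while (value >= 1000 && unit < units.length - 1) {
    value /= 1000;
    unit++;
  }
  return value.toFixed(1) + " " + units[unit];
}

// Hypothetical 512-byte response at the default 1,000 requests per second.
const example = formatRate(egressBytesPerSecond(512));
console.log(example);
```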

        RPC bench (auto-runs after fetch)


        Burst — 200 concurrent echoU8 calls

        capnwasm
        capnweb

        Pipelining — getChild() → echoU8()

        capnwasm
        capnweb

        Big binary — echoBinary(64 KB)

        capnwasm
        capnweb

        General workload suite (auto-runs after RPC)

        Same browser, same wasm runtime, no network. Each row spells out the exact capnwasm and JSON expressions being timed so the comparison is auditable from the page itself.

        Columns: workload | capnwasm | JSON | ratio | wire bytes
        Small struct — 5 fields, full read
        capnwasm
        const r = openUser(cpp, bytes);
        const v = Number(r.id) + r.name.length + r.email.length
                + (r.active ? 1 : 0) + r.avatar.length;
        r.dispose();
        JSON
        const o = JSON.parse(text);
        return o.id + o.name.length + o.email.length
             + (o.active ? 1 : 0) + o.avatar.length;
        Sparse struct — 17 fields present, 5 read
        capnwasm
        const r = openPrimitives(cpp, bytes);
        const sum = r.u8 + r.u32 + r.f64
                  + (r.flag0 ? 1 : 0) + r.text.length;
        r.dispose();
        JSON
        const o = JSON.parse(text);
        return o.u8 + o.u32 + o.f64
             + (o.flag0 ? 1 : 0) + o.text.length;
        Numeric list — 1000 f64s, sum
        capnwasm
        const r = openNumericProbe(cpp, bytes);
        const v = r.f64s.view();    // zero-copy Float64Array
        let sum = 0;
        for (let i = 0; i < v.length; i++) sum += v[i];
        r.dispose();
        JSON
        const o = JSON.parse(text);
        let sum = 0;
        for (let i = 0; i < o.f64s.length; i++) sum += o.f64s[i];
        List scan — 200 rows, reduce over name + active
        capnwasm
        const r = openUserList(cpp, bytes);
        const sum = r.draft(p =>
          p.users.reduce(
            (a, u) => a + u.name.length + (u.active ? 1 : 0),
            0,
          ),
        );
        r.dispose();
        JSON
        const o = JSON.parse(text);
        let sum = 0;
        for (let i = 0; i < o.users.length; i++) {
          sum += o.users[i].name.length
               + (o.users[i].active ? 1 : 0);
        }
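The "zero-copy Float64Array" comment in the numeric list row refers to a typed-array view that reads the message bytes in place instead of allocating a new array. The sketch below shows the difference between a view and a copy with standard typed arrays only; the four-element buffer is a hypothetical stand-in for a received message, and how capnwasm's view() is implemented is not shown here.

```javascript
// Backing store standing in for received message bytes (hypothetical: 4 f64s).
const buffer = new ArrayBuffer(4 * Float64Array.BYTES_PER_ELEMENT);
const writer = new Float64Array(buffer);
writer.set([1.5, 2.5, 3.5, 4.5]);

// Zero-copy: a view over the same bytes. The byte offset must be a multiple of 8.
const view = new Float64Array(buffer, 0, 4);

// Copy: slice() allocates a fresh buffer and duplicates the values.
const copy = view.slice();

function sum(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
}

const total = sum(view);

// Mutating the backing store is visible through the view but not the copy.
writer[1] = 10;
const sharesMemory = view[1] === 10;
const copyIsDetached = copy[1] === 2.5;

console.log(total, sharesMemory, copyIsDetached);
```

Because a view aliases the underlying memory, it is only valid while that memory is; this is one reason the capnwasm snippets finish reading before calling r.dispose().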

        Methodology