Performance

How to profile, benchmark, and optimize Ingglish.

Profiling Scripts

Core Library (`packages/core/scripts/profile/`)

Script	Purpose
`benchmark.ts`	Full benchmark suite (1000 iterations, statistics)
`overview.ts`	Quick translation profiling
`translate.ts`	translateSync performance analysis
`convert.ts`	Phoneme conversion performance
`cpu-profile.ts`	V8 CPU profiler for flame graphs
`harness.ts`	Shared benchmark/formatting utilities

DOM Library (`packages/dom/scripts/`)

Script	Purpose
`profile-wikipedia.ts`	Real Wikipedia HTML profiling (~300KB)
`profile-tree-walker.ts`	TreeWalker alternatives comparison
`profile-process-node.ts`	Text node processing analysis
`profile-dom.ts`	General DOM translation profiling
`profile-real-html.ts`	Article-style HTML profiling
`profile-tooltips.ts`	Tooltip overhead comparison

Running Benchmarks

Core Library

cd packages/core

# Full benchmark suite
npx vite-node --script scripts/profile/benchmark.ts

# Quick profile
npx vite-node --script scripts/profile/overview.ts

Sample output:

=== Ingglish Core Benchmarks ===

Iterations: 1000, Warmup: 100

--- Forward Translation ---
translateSync(short text)                      0.005ms  (min: 0.003ms, max: 0.033ms)    212690 ops/sec
translateSync(medium text)                     0.013ms  (min: 0.009ms, max: 0.077ms)     74524 ops/sec

--- Reverse Translation ---
reverseTranslateWord(single)                   0.005ms  (min: 0.003ms, max: 0.544ms)    183851 ops/sec
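
The columns in this output (mean, min, max, ops/sec after a warmup phase) can be produced by a small timing loop. The following is a minimal sketch of that idea, not the contents of `harness.ts`: the `bench` function, its option names, and the injectable `now` clock are assumptions made for illustration.

```typescript
// Hypothetical benchmark helper; the project's harness.ts may differ.
interface BenchResult {
  name: string;
  meanMs: number;
  minMs: number;
  maxMs: number;
  opsPerSec: number;
}

interface BenchOptions {
  iterations?: number;
  warmup?: number;
  now?: () => number; // clock in milliseconds, injectable for deterministic tests
}

function bench(name: string, fn: () => void, options: BenchOptions = {}): BenchResult {
  const { iterations = 1000, warmup = 100, now = () => performance.now() } = options;

  // Untimed warmup runs give the JIT a chance to optimize fn first.
  for (let i = 0; i < warmup; i++) fn();

  let totalMs = 0;
  let minMs = Infinity;
  let maxMs = 0;
  for (let i = 0; i < iterations; i++) {
    const start = now();
    fn();
    const elapsed = now() - start;
    totalMs += elapsed;
    if (elapsed < minMs) minMs = elapsed;
    if (elapsed > maxMs) maxMs = elapsed;
  }

  const meanMs = totalMs / iterations;
  // ops/sec is derived from the mean time per call.
  return { name, meanMs, minMs, maxMs, opsPerSec: 1000 / meanMs };
}
```

A call such as `bench("translateSync(short text)", () => translateSync("hello world"))` would then yield one row of the table, assuming `translateSync` is imported from the core package.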

DOM Library

cd packages/dom

# Profile with real Wikipedia content
npx vite-node --script scripts/profile-wikipedia.ts

# Compare TreeWalker implementations
npx vite-node --script scripts/profile-tree-walker.ts

Sample output from `profile-wikipedia.ts`:

=== Wikipedia HTML Profiling ===
HTML size: 219KB, Text nodes: 808, Words: 1,769

Phase                    Time (ms)    Per-item
─────────────────────────────────────────────
Collect nodes            13.8         17.0µs/node
Extract words            0.8          0.8µs/word
Apply translations       34.2         19.3µs/word
─────────────────────────────────────────────
Total                    34.2ms

Performance Characteristics

Summary

Path	Complexity	Notes
Forward (dictionary hit)	O(p)	p = phoneme count
Forward (unknown word)	O(n)	n = word length
Reverse	O(n)	Pre-sorted at build time
Full text	O(w × n)	w = word count, n = average word length

All paths are linear in input size; none has quadratic or exponential complexity.

Forward Translation (`translateWord`)

Operation	Complexity	Notes
Dictionary lookup	O(1)	Hash table, phonemes pre-split at build time
ARPAbet→Ingglish	O(p)	p = phoneme count, single pass
CamelCase split	O(n)	n = word length, single pass
Case detection	O(n)	Single pass through word
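
To illustrate what a single-pass O(n) split looks like, here is a minimal sketch of a CamelCase splitter. It is not the library's implementation: the function name `splitCamelCase` and the splitting rule (break before an ASCII uppercase letter that follows an ASCII lowercase letter) are assumptions chosen for the example.

```typescript
// Hypothetical single-pass CamelCase splitter (ASCII letters only).
function splitCamelCase(word: string): string[] {
  const parts: string[] = [];
  let start = 0;
  for (let i = 1; i < word.length; i++) {
    const prev = word.charCodeAt(i - 1);
    const curr = word.charCodeAt(i);
    const prevIsLower = prev >= 97 && prev <= 122; // 'a'..'z'
    const currIsUpper = curr >= 65 && curr <= 90; // 'A'..'Z'
    // A lowercase-to-uppercase transition marks the start of a new part.
    if (prevIsLower && currIsUpper) {
      parts.push(word.slice(start, i));
      start = i;
    }
  }
  parts.push(word.slice(start));
  return parts;
}

// splitCamelCase("camelCaseWord") returns ["camel", "Case", "Word"]
```

Each character is inspected once, so the cost grows linearly with word length.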

Fallback chain for unknown words

Strategy	Complexity	Notes
Custom pronunciations	O(1)	Hash table lookup
Initialism check	O(1) + O(e)	Hash table lookup, then expansion; e = expansion length
Compound splitting	O(n)	n-2 split points × O(1) lookup each
Stemming	O(1)	~20 suffixes × ~4 variants = constant
G2P rules	O(n)	n chars × ~40 rules (constant)
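
The fallback chain can be pictured as an ordered list of strategies tried until one succeeds. The sketch below shows that shape with two of the strategies from the table. All names (`Strategy`, `resolveUnknown`, `fromCustom`, `fromCompound`) and the toy data are hypothetical, and the real compound splitter may restrict which split points it considers.

```typescript
// A strategy returns ARPAbet phonemes, or undefined to pass to the next one.
type Strategy = (word: string) => string[] | undefined;

// Toy data for illustration only.
const customPronunciations = new Map<string, string[]>([["gif", ["JH", "IH", "F"]]]);
const toyDictionary = new Map<string, string[]>([
  ["sun", ["S", "AH", "N"]],
  ["set", ["S", "EH", "T"]],
]);

// Custom pronunciations: one hash table lookup.
const fromCustom: Strategy = (word) => customPronunciations.get(word);

// Compound splitting: try each split point, with two lookups per point.
const fromCompound: Strategy = (word) => {
  for (let i = 1; i < word.length; i++) {
    const left = toyDictionary.get(word.slice(0, i));
    const right = toyDictionary.get(word.slice(i));
    if (left && right) return [...left, ...right];
  }
  return undefined;
};

// Walk the chain in order and stop at the first strategy that answers.
function resolveUnknown(word: string, strategies: Strategy[]): string[] | undefined {
  for (const strategy of strategies) {
    const phonemes = strategy(word);
    if (phonemes) return phonemes;
  }
  return undefined;
}

// resolveUnknown("sunset", [fromCustom, fromCompound])
// returns ["S", "AH", "N", "S", "EH", "T"]
```

Ordering the cheap O(1) strategies first means the O(n) ones only run when the earlier lookups miss.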

Reverse Translation (`reverseTranslateWord`)

Operation	Complexity	Notes
Ingglish→ARPAbet	O(n)	n = word length
Phoneme key lookup	O(1)	Hash table, words pre-sorted by frequency at build time
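
The reason the lookup stays O(1) is that candidate words are sorted once, ahead of time, so the runtime only reads the first entry. Here is a minimal sketch under assumed data shapes: the space-joined phoneme key, `buildReverseIndex`, and `reverseLookup` are illustrative names, not the package's actual API.

```typescript
// Build step: group words by phoneme key, then sort each group by frequency.
function buildReverseIndex(
  forward: Map<string, string[]>,
  frequency: Map<string, number>,
): Map<string, string[]> {
  const index = new Map<string, string[]>();
  forward.forEach((phonemes, word) => {
    const key = phonemes.join(" ");
    const bucket = index.get(key);
    if (bucket) bucket.push(word);
    else index.set(key, [word]);
  });
  // Most frequent word first, so runtime lookups never need to sort.
  index.forEach((words) => {
    words.sort((a, b) => (frequency.get(b) ?? 0) - (frequency.get(a) ?? 0));
  });
  return index;
}

// Runtime step: one hash lookup, then take the pre-sorted best candidate.
function reverseLookup(index: Map<string, string[]>, phonemes: string[]): string | undefined {
  return index.get(phonemes.join(" "))?.[0];
}
```

With homophones such as "two", "too", and "to" sharing the key `T UW`, the most frequent word in the frequency table is returned without any work at lookup time beyond the hash access.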

Infrastructure

Operation	Complexity	Notes
Forward dictionary load	O(n)	n = dictionary size; ~1MB gzipped, phonemes pre-split
Reverse dictionary load	O(n)	n = dictionary size; ~300KB gzipped, words pre-sorted
DOM traversal	O(n)	TreeWalker, n = nodes

Optimization Guidelines

Profile first - Measure before optimizing so effort goes to actual bottlenecks.
Use pre-collected nodes - Pass `textNodes` to `applyTranslationsMap()` to avoid traversing the DOM twice.
Batch translations - Use `translateWordsInBatches()` for large word sets.
Chunked rendering - Spread DOM updates across `requestAnimationFrame` callbacks so large pages stay responsive.
Cache translations - The extension caches 50K translations in the background worker.
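
Two of these guidelines, chunked rendering and caching, can be sketched generically. The code below is an illustration under assumptions, not the extension's implementation: `processInChunks` and `BoundedCache` are hypothetical names, the scheduler is passed in so the same function works outside a browser, and least-recently-used eviction is an assumed policy (the source only states the 50K size).

```typescript
// Process items a chunk at a time, yielding to the scheduler between chunks.
function processInChunks<T>(
  items: readonly T[],
  chunkSize: number,
  handle: (item: T) => void,
  schedule: (next: () => void) => void,
): void {
  const size = Math.max(1, Math.floor(chunkSize));
  let index = 0;
  const step = (): void => {
    const end = Math.min(index + size, items.length);
    items.slice(index, end).forEach((item) => handle(item));
    index = end;
    if (index < items.length) schedule(step);
  };
  step();
}

// Size-limited cache; a Map keeps insertion order, so the first key is the oldest.
class BoundedCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

In a browser the scheduler would typically be `(next) => requestAnimationFrame(next)`, and a cache matching the size quoted above would be created with `new BoundedCache<string, string>(50_000)`.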

Bundle Splitting

The @ingglish/dictionary package uses dynamic imports for code splitting:

ingglish index - Minimal public API (~2KB)
Forward dictionary - Loaded on first translate() call (~1MB gzipped)
Reverse dictionary - Loaded on first reverse translation (~300KB gzipped)
Word frequencies - Loaded on first reverse translation (~500KB)

Dictionaries are pre-processed at build time (in @ingglish/dictionary):

Phonemes pre-split into arrays (no runtime string splitting)
Reverse dictionary pre-sorted by word frequency (no runtime sorting)

This keeps initial page load fast while deferring heavy data until needed.
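
The load-on-first-use pattern described here usually comes down to memoizing the promise returned by a dynamic `import()`. The following is a minimal sketch of that pattern, assuming a generic `lazy` helper and a hypothetical module path; it does not reproduce how @ingglish/dictionary is actually wired.

```typescript
// Wrap a loader so it runs at most once; later calls reuse the same promise.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) pending = load();
    return pending;
  };
}

// In a bundled app the loader would be a dynamic import, for example:
//   const loadForward = lazy(() => import("./forward-dictionary")); // hypothetical path
// Here a stand-in loader keeps the sketch self-contained.
const loadForwardDictionary = lazy(() =>
  Promise.resolve(new Map<string, string[]>([["sun", ["S", "AH", "N"]]])),
);
```

Because the bundler sees the `import()` call as a split point, the heavy data ends up in its own chunk, and callers simply `await loadForwardDictionary()` wherever they need it. Note that this sketch also caches a rejected promise, so a production version may want to reset `pending` when the load fails.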