I’ve been benchmarking dspx, a hardware-aware signal processing library for Node.js, and found that with the right architecture you can effectively bypass the V8 garbage collector. With a zero-copy pipeline, I hit 78 million samples per second on a single vCPU on AWS Lambda (1769 MB RAM). Even more interesting is the memory profile: at input sizes between 2^12 and 2^20, the system shows zero or negative heap growth, which gives deterministic p99 latencies that stay flat even under heavy load.
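To make the zero-copy idea concrete, here is a minimal sketch in plain JavaScript. It is not dspx's API: the names applyGainInPlace and processBlocks and the block size are assumptions made up for illustration. It shows only the general pattern of mutating a typed array in place and walking it with subarray() views, which share the underlying ArrayBuffer, instead of slice(), which allocates and copies.

```javascript
// Hypothetical sketch of in-place, copy-free block processing.
// None of these names come from dspx.

const BLOCK = 4096; // assumed block size, chosen only for illustration

// Scales samples in place and returns the same array it was given,
// so no sample data is allocated per call.
function applyGainInPlace(samples, gain) {
  for (let i = 0; i < samples.length; i++) {
    samples[i] *= gain;
  }
  return samples;
}

// Walks the input block by block. subarray() creates a small view object
// over the same ArrayBuffer; the sample data itself is never copied.
function processBlocks(input, gain) {
  for (let offset = 0; offset < input.length; offset += BLOCK) {
    const end = Math.min(offset + BLOCK, input.length);
    applyGainInPlace(input.subarray(offset, end), gain);
  }
  return input;
}
```

Calling processBlocks(new Float32Array(1 << 20), 0.5) touches one buffer from start to finish. The only per-block allocation is the view object, which is small and short-lived compared with a copied block.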
I also focused on microsecond-level state serialization to make stateful functions (like Kalman filters) viable on ephemeral runtimes like Lambda. The deployment size is a lean 1.3MB, which keeps cold starts consistently between 170ms and 240ms. It includes a full toolkit from MFCCs and Mel-Spectrograms to adaptive filters and ICA/PCA transforms.
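As a rough illustration of why compact state serialization matters on Lambda, here is a sketch of a scalar Kalman filter whose whole state fits in 32 bytes. The ScalarKalman class and its byte layout are assumptions for this example only, not how dspx stores state. The point is that a filter can be frozen at the end of one invocation and resumed in the next.

```javascript
// Hypothetical 1-D Kalman filter with fixed-size binary state.
// Class name and byte layout are illustrative, not taken from dspx.
class ScalarKalman {
  constructor(q, r, x = 0, p = 1) {
    this.q = q; // process noise variance
    this.r = r; // measurement noise variance
    this.x = x; // current estimate
    this.p = p; // variance of the estimate
  }

  // One predict/correct step for measurement z.
  update(z) {
    this.p += this.q;                      // predict: uncertainty grows
    const k = this.p / (this.p + this.r);  // Kalman gain
    this.x += k * (z - this.x);            // correct the estimate
    this.p *= 1 - k;                       // shrink the uncertainty
    return this.x;
  }

  // Packs q, r, x, p as four little-endian doubles (32 bytes).
  serialize() {
    const buf = Buffer.alloc(32);
    [this.q, this.r, this.x, this.p].forEach((v, i) => buf.writeDoubleLE(v, i * 8));
    return buf;
  }

  // Rebuilds a filter from the 32-byte layout written by serialize().
  static deserialize(buf) {
    const v = [0, 1, 2, 3].map((i) => buf.readDoubleLE(i * 8));
    return new ScalarKalman(v[0], v[1], v[2], v[3]);
  }
}
```

The resulting Buffer could be written to whatever external store the function uses between invocations. A restored filter continues with exactly the same estimates because doubles round-trip bit for bit.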
It's single-threaded by default on both the C++ and JavaScript sides, so users can parallelize it themselves in JavaScript using worker threads, Atomics, and SharedArrayBuffers.
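For anyone who hasn't combined these pieces before, here is a minimal sketch of the pattern using only Node's built-in worker_threads, SharedArrayBuffer, and Atomics. No dspx calls are involved: the helpers splitRanges and scaleInParallel are hypothetical, and the per-sample gain stands in for whatever processing you would actually run in each worker.

```javascript
const { Worker } = require("node:worker_threads");

// Worker body, passed as a string with { eval: true } so the sketch fits in one file.
const workerSource = `
const { parentPort, workerData } = require("node:worker_threads");
const { samples, done, start, end, gain } = workerData;
const view = new Float32Array(samples);        // view over the shared memory
for (let i = start; i < end; i++) view[i] *= gain;
Atomics.add(new Int32Array(done), 0, 1);       // count this worker as finished
parentPort.postMessage("done");
`;

// Splits [0, length) into at most `parts` contiguous, non-overlapping ranges.
function splitRanges(length, parts) {
  const ranges = [];
  const size = Math.ceil(length / parts);
  for (let start = 0; start < length; start += size) {
    ranges.push([start, Math.min(start + size, length)]);
  }
  return ranges;
}

// Scales a shared Float32 buffer in place, one worker per range.
// Resolves with the number of workers that reported completion.
function scaleInParallel(sab, length, gain, parts) {
  const done = new SharedArrayBuffer(4);
  const jobs = splitRanges(length, parts).map(
    ([start, end]) =>
      new Promise((resolve, reject) => {
        const worker = new Worker(workerSource, {
          eval: true,
          workerData: { samples: sab, done, start, end, gain },
        });
        worker.once("message", resolve);
        worker.once("error", reject);
      })
  );
  return Promise.all(jobs).then(() => Atomics.load(new Int32Array(done), 0));
}
```

Because each worker writes to a disjoint range, the sample data needs no locking. Atomics is used only for the shared completion counter. In a real service you would keep a pool of workers alive rather than spawning them per call.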
Benchmark repository: https://github.com/A-KGeorge/dspx-benchmark
Code repository: https://github.com/A-KGeorge/dspx