Benchmarks
Cyber is fast and memory efficient. Here are some benchmarks against similar languages. Each benchmark compares either the VM or the JIT. Read more about Performance.
Performance
Cyber was designed with performance in mind from the start. The language, type system, memory management, VM, and JIT were carefully considered to enable optimizations.
Crafty register VM.
Cyber's VM is register-based, so most bytecode instructions have a destination operand. This reduces the number of CPU cycles and memory accesses compared to a stack-based VM. Having registers as operands also enables allocation strategies that reduce the number of instructions the VM has to run.
Unlike physical registers, virtual registers are reserved as stack slots. This allows fibers to be swapped in and out by just replacing the current stack pointer.
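To make the register-as-stack-slot idea concrete, here is a minimal sketch in C. The opcodes, the `Inst` layout, and the `run` function are hypothetical names chosen for illustration and are not Cyber's actual instruction set. Each instruction names its destination and source registers directly, and every register is just an offset from a frame pointer into the value stack. The loop uses a plain `switch` for brevity.

```c
#include <stdint.h>

/* Hypothetical opcodes for illustration; not Cyber's actual instruction set. */
enum { OP_CONST, OP_ADD, OP_RET };

/* Each instruction names its operands as register indexes (stack slots). */
typedef struct {
    uint8_t op;
    uint8_t dst; /* destination register */
    uint8_t a;   /* first source register, or an immediate for OP_CONST */
    uint8_t b;   /* second source register */
} Inst;

/* Runs `code` with registers addressed relative to `fp`,
   a pointer into the value stack. */
int64_t run(const Inst *code, int64_t *fp) {
    for (const Inst *pc = code;; pc++) {
        switch (pc->op) {
        case OP_CONST: fp[pc->dst] = pc->a; break;
        case OP_ADD:   fp[pc->dst] = fp[pc->a] + fp[pc->b]; break;
        case OP_RET:   return fp[pc->dst];
        }
    }
}

/* Computes 2 + 3: two loads, one add, one return. */
int64_t demo_add(void) {
    int64_t stack[16] = {0};
    const Inst code[] = {
        {OP_CONST, 0, 2, 0},
        {OP_CONST, 1, 3, 0},
        {OP_ADD,   2, 0, 1},
        {OP_RET,   2, 0, 0},
    };
    return run(code, stack);
}
```

A stack-based VM would need separate push and pop steps around the add. Because `run` only sees registers through `fp`, pointing `fp` at a different stack is enough to resume a different fiber in this sketch.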
Efficient call convention.
Call arguments are assigned unique stack slots, which reduces the number of copy instructions. The compiler also arranges the return slot so that it feeds directly into a parent function call as an argument. This makes composing functions fast, which suits declarative programming paradigms. The call stack is also repurposed for storing call frames, which increases cache locality.
In many dynamic languages, functions and fields are looked up in a hash map. In Cyber, they are indexed in an array by a symbol ID which is faster.
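As a rough illustration of symbol-indexed lookup, the sketch below resolves a function by an integer ID instead of hashing its name. The table, the `SYM_*` IDs, and `call_sym` are hypothetical and only show the general shape of the technique.

```c
#include <stddef.h>

typedef int (*Func)(int);

static int inc(int x) { return x + 1; }
static int dbl(int x) { return x * 2; }

/* Hypothetical symbol IDs that a compiler would assign once per name. */
enum { SYM_INC = 0, SYM_DBL = 1, SYM_COUNT };

/* Functions are stored in an array indexed by symbol ID. */
static const Func func_table[SYM_COUNT] = { inc, dbl };

/* A call is a single indexed load followed by an indirect call;
   no hashing or string comparison happens at runtime. */
int call_sym(size_t sym_id, int arg) {
    return func_table[sym_id](arg);
}
```

For example, `call_sym(SYM_DBL, 21)` reaches `dbl` with one array access, whereas a hash map would first have to hash and compare the name.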
Inline caching.
Cyber optimizes instructions by patching bytecode at runtime. Object operations tend to involve more lookups and checks since values can have a dynamic type. By caching the lookup results in the bytecode, the instruction will run faster. In the rare case of a cache miss, the deoptimized version uses an MRU table for object types and symbols.
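The following sketch shows one simple form of inline caching for a field access. All names here (`GetFieldInst`, `get_field`, `slow_lookup`, the `types` table) are assumptions made for illustration. The slow path is a linear search by name, which stands in for the MRU table mentioned above rather than reproducing it.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t type_id; int64_t fields[4]; } Object;

/* Hypothetical type metadata: the field names of each type, in slot order. */
typedef struct { const char *names[4]; int count; } TypeInfo;

static const TypeInfo types[] = {
    { {"x", "y"}, 2 },      /* type 0 */
    { {"y", "x", "z"}, 3 }, /* type 1 */
};

/* A field-access instruction with room for a cache patched at runtime. */
typedef struct {
    const char *field;
    uint32_t cached_type; /* UINT32_MAX means nothing is cached yet */
    uint32_t cached_slot;
    uint32_t misses;      /* counted only to make the behavior observable */
} GetFieldInst;

/* Slow path: search the type's field names for a match. */
static int slow_lookup(uint32_t type_id, const char *field) {
    const TypeInfo *t = &types[type_id];
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->names[i], field) == 0) return i;
    return -1;
}

int64_t get_field(GetFieldInst *inst, const Object *obj) {
    if (inst->cached_type == obj->type_id)
        return obj->fields[inst->cached_slot]; /* fast path: one compare */
    inst->misses++;
    int slot = slow_lookup(obj->type_id, inst->field);
    if (slot < 0) return 0; /* a real VM would raise an error here */
    inst->cached_type = obj->type_id; /* patch the instruction */
    inst->cached_slot = (uint32_t)slot;
    return obj->fields[slot];
}
```

Repeated accesses on objects of the same type take the fast path. An object of a different type misses once, and the instruction is then re-patched for that type.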
Compact values.
In Cyber, all values are 8 bytes and use NaN tagging to represent primitive types or heap objects. Having a compact value representation simplifies the data structures used in the VM. It's also easier to align them in memory to improve cache locality.
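NaN tagging can be sketched as follows. The bit layout used here (quiet-NaN bits marking a non-double, the sign bit marking a pointer, and one tag bit marking a 32-bit integer) is an assumption chosen for illustration and is not taken from Cyber's source. The sketch also assumes pointers fit in 48 bits, as they do on common 64-bit platforms.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Value; /* every value is 8 bytes */

/* Assumed layout, for illustration only. */
#define QNAN     UINT64_C(0x7ffc000000000000) /* marks "not a double" */
#define SIGN_BIT UINT64_C(0x8000000000000000) /* marks a heap pointer */
#define TAG_INT  UINT64_C(0x0001000000000000) /* marks a 32-bit integer */

static Value from_double(double d) { Value v; memcpy(&v, &d, sizeof v); return v; }
static double as_double(Value v)   { double d; memcpy(&d, &v, sizeof d); return d; }
static bool is_double(Value v)     { return (v & QNAN) != QNAN; }

static Value from_int(int32_t i) { return QNAN | TAG_INT | (uint32_t)i; }
static bool is_int(Value v) {
    return (v & (SIGN_BIT | QNAN | TAG_INT)) == (QNAN | TAG_INT);
}
static int32_t as_int(Value v) {
    uint32_t bits = (uint32_t)v; /* the low 32 bits hold the payload */
    int32_t i;
    memcpy(&i, &bits, sizeof i);
    return i;
}

/* Assumes the address fits in the low 48 bits. */
static Value from_ptr(void *p) { return SIGN_BIT | QNAN | (uint64_t)(uintptr_t)p; }
static bool is_ptr(Value v)    { return (v & (SIGN_BIT | QNAN)) == (SIGN_BIT | QNAN); }
static void *as_ptr(Value v)   { return (void *)(uintptr_t)(v & ~(SIGN_BIT | QNAN)); }
```

Ordinary doubles pass through unchanged, so arithmetic on numbers needs no unboxing step. Type checks for the other kinds are a mask and a compare.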
Small objects are allocated from object pools, which is fast because pools require little bookkeeping. Other heap memory is allocated with mimalloc, which has proven to be fast and reliable.
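A fixed-size object pool can be as small as a free list threaded through the unused slots. The sketch below is a generic illustration with hypothetical names (`Pool`, `pool_alloc`, `pool_free`) and an arbitrary 32-byte slot size. It does not describe the layout of Cyber's pools.

```c
#include <stddef.h>

#define POOL_CAPACITY 64

/* A slot holds either an object payload or, while free, a link to the next free slot. */
typedef union Slot {
    union Slot *next_free;
    unsigned char payload[32];
} Slot;

typedef struct {
    Slot slots[POOL_CAPACITY];
    Slot *free_list;
} Pool;

/* Links every slot into the free list. */
void pool_init(Pool *p) {
    for (size_t i = 0; i + 1 < POOL_CAPACITY; i++)
        p->slots[i].next_free = &p->slots[i + 1];
    p->slots[POOL_CAPACITY - 1].next_free = NULL;
    p->free_list = &p->slots[0];
}

/* Allocation pops the head of the free list. */
void *pool_alloc(Pool *p) {
    Slot *s = p->free_list;
    if (s == NULL) return NULL; /* a real allocator would fall back to the heap */
    p->free_list = s->next_free;
    return s;
}

/* Freeing pushes the slot back onto the free list. */
void pool_free(Pool *p, void *ptr) {
    Slot *s = ptr;
    s->next_free = p->free_list;
    p->free_list = s;
}
```

Both allocation and release are a couple of pointer writes, with no size headers or searching, which is the kind of low bookkeeping cost described above.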
Fast dispatch.
Computed gotos allow the next bytecode to be dispatched using a jump table directly from the end of each bytecode segment. This does the least amount of work and leverages the CPU's branch prediction.
In contrast, a switch statement performs additional bounds checks and funnels each bytecode segment to the same place, which is unfavorable for branch prediction.
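Here is a small sketch of computed-goto dispatch. It relies on the labels-as-values extension (`&&label` and `goto *ptr`) supported by GCC and Clang, so it is not standard C. The three opcodes and `run_threaded` are hypothetical and exist only to show the dispatch pattern.

```c
#include <stdint.h>

/* Hypothetical opcodes operating on a single accumulator. */
enum { OP_INC, OP_DOUBLE, OP_HALT };

int64_t run_threaded(const uint8_t *pc, int64_t acc) {
    /* Jump table indexed by opcode; &&label is a GNU C extension. */
    static void *const targets[] = { &&do_inc, &&do_double, &&do_halt };

/* Fetch the next opcode and jump straight to its handler. */
#define DISPATCH() goto *targets[*pc++]

    DISPATCH();
do_inc:
    acc += 1;
    DISPATCH();
do_double:
    acc *= 2;
    DISPATCH();
do_halt:
    return acc;
#undef DISPATCH
}
```

Each handler ends with its own indirect jump, so the branch predictor can learn a separate pattern per opcode instead of sharing one jump at the top of a loop. The sketch does not validate opcodes, so it assumes well-formed bytecode.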
Precompiled JIT.
Cyber's JIT is implemented using precompiled stencils from LLVM. With only a small assembler, each bytecode instruction can be stitched together at runtime to generate performant machine code while still remaining fast to compile. More details can be found in the v0.3 Release Notes.
Compiled using Zig/LLVM.
Cyber itself is written in Zig, a systems programming language that makes writing performant software easier. Zig leverages modern compiler features from LLVM to produce fast machine instructions for most targets.