This part is for tooling developers — those building new languages, compilers, debuggers, or targeting alternate EVMs and virtual machines. If you're only looking to use solx to compile smart contracts, Part 1 should cover everything you need.
If you're working on compilers, debuggers, or alternate VMs, we recommend starting with Part 1 for context, then continuing here.
With two man-years of engineering effort, solx has already reached a state where it produces better runtime gas efficiency than solc’s --via-ir --optimize pipeline - and it does so without tuning the LLVM optimizer or implementing any EVM-specific optimizations yet.
Thanks to LLVM, solx’s optimizer and code generator are under 8,000 lines of code and far easier to maintain than a custom pipeline. Much of the complexity typical of compiler development can be offloaded to LLVM’s existing infrastructure.
In this part, we briefly cover solx’s internal structure, how it can be reused or extended, and what it offers if you’re planning to:
develop a new language or adapt an existing one for smart contract development,
implement an experimental EVM feature,
or retarget Solidity to something like RISC-V.
solx combines the standard Solidity front-end with a new LLVM-based back-end.
It reuses the entire solc front-end (lexer, parser, AST), ensuring full Solidity support and compatibility with language updates. From there, solc lowers the AST to Yul IR, which we translate into LLVM IR.
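To make the hand-off concrete, here is a rough sketch (an illustration of ours, not actual solx output) of how a trivial Yul function might look once translated into EVM-flavored LLVM IR; the exact type mapping, naming, and calling convention solx uses differ in practice:
; Yul source (illustrative):
;   function avg(a, b) -> r { r := div(add(a, b), 2) }
; EVM words are 256 bits wide and EVM DIV is unsigned, hence i256 and udiv.
define i256 @avg(i256 %a, i256 %b) {
entry:
  %sum = add i256 %a, %b
  %r = udiv i256 %sum, 2
  ret i256 %r
}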
We also support solc’s legacy pipeline. Since it doesn't expose an IR, we lift the generated EVM assembly into LLVM IR. Because we have seen contracts that compile cleanly with the legacy pipeline fail or behave inconsistently under --via-ir, we’ve made the legacy path the default - just like solc does today. Users must explicitly pass --via-ir to use the Yul-based flow.
Once in LLVM IR, solx applies standard LLVM optimizations, then hands off to our custom EVM backend. This handles instruction selection, scheduling, stackification, and assembly or binary emission - reusing as much of LLVM’s infrastructure as possible.
This design let us deliver a working pre-alpha with two man-years of engineering effort, but it comes with trade-offs: some optimizations are currently inhibited, and binaries can be larger. We briefly touched on these issues in Part 1. To address them, we're developing a new high-level Solidity IR based on LLVM's MLIR framework.
But let’s start by focusing on what solx offers as compiler infrastructure today.
One of LLVM’s most important innovations dating back to the early 2000s was its Intermediate Representation. At the time, most compilers tightly coupled language-specific ASTs with target-specific code generators. LLVM IR introduced a complete and self-contained representation of programs, including type information and debug metadata. This let compiler components and tools operate entirely at the IR level without requiring access to language-specific internals.
Frontend developers started to rely on a simple and stable interface - LLVM IR - without needing to understand code generation or debug formats. Backend developers only needed to support that IR format, regardless of how the source language worked. The optimizer, meanwhile, transformed IR and automatically updated associated metadata and debug info. And in 2025, LLVM’s optimizer preserves debug information in optimized builds reasonably well, thanks in large part to demands from the gaming industry, which needs fully optimized code to remain debuggable.
LLVM’s modularity is what makes solx adaptable.
Vitalik Buterin recently floated the idea of replacing the EVM with a RISC-V-based machine for Ethereum’s execution layer. If that ever happens, solx is ready - we estimate that less than 10% of the LLVM IR we generate would need to change to target RISC-V instead of the EVM. In contrast, compilers like solc and Vyper would face a much steeper challenge: they would either need to start emitting LLVM IR or implement instruction selection, scheduling, register allocation, and binary emission for RISC-V from scratch.
This flexibility goes beyond backends. With LLVM, writing a new language for Ethereum requires just two entry points: IRBuilder for IR construction, and DIBuilder for debug info. You don’t need to know how the EVM works, or what ethdebug expects. You just write to the IR.
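As an illustration, here is a minimal, self-contained sketch of what "just writing to the IR" looks like - plain LLVM C++ APIs, nothing solx-specific; the i256 word type, the function shape, and the file names are assumptions made for the example:
// Minimal sketch: build a trivial function with IRBuilder and attach a
// compile unit with DIBuilder. Illustrative only - not solx code.
#include "llvm/BinaryFormat/Dwarf.h"
#include "llvm/IR/DIBuilder.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main() {
  LLVMContext Ctx;
  Module M("demo", Ctx);
  IRBuilder<> B(Ctx);

  // i256 matches the EVM word size; a front-end chooses its own type mapping.
  Type *Word = B.getIntNTy(256);
  FunctionType *FT = FunctionType::get(Word, {Word, Word}, /*isVarArg=*/false);
  Function *F = Function::Create(FT, Function::ExternalLinkage, "add", M);

  BasicBlock *Entry = BasicBlock::Create(Ctx, "entry", F);
  B.SetInsertPoint(Entry);
  Value *Sum = B.CreateAdd(F->getArg(0), F->getArg(1), "sum");
  B.CreateRet(Sum);

  // Debug info: a source file and a compile unit, so optimized output can
  // still be mapped back to source by debuggers and ethdebug-style tooling.
  DIBuilder DIB(M);
  DIFile *File = DIB.createFile("Example.sol", ".");
  DIB.createCompileUnit(dwarf::DW_LANG_C, File, "my-frontend",
                        /*isOptimized=*/false, "", /*RuntimeVersion=*/0);
  DIB.finalize();

  M.print(outs(), nullptr); // dump the textual IR
  return 0;
}
Everything below this level - optimization, stackification, bytecode emission - is handled by the shared LLVM pipeline and the EVM backend.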
This puts the oft-discussed idea of Rust for the EVM within reach. We’re not working on this right now, and Rust isn’t a language designed for smart contract development. Still, we estimate that targeting the EVM subset of LLVM IR we designed from Rust would take about one engineer-year of effort.
Modularity also means the ability to focus - without having to maintain large codebases.
At first glance, relying on LLVM might seem counterintuitive - it's a large system, and integrating it can appear to add complexity. But consider the alternative: without reuse, not just the optimizations but all the foundational infrastructure - IR design, printing, parsing, assembly emission, and more - must be built from scratch and then maintained. In contrast, for solx we only maintain the translation to LLVM IR and a custom LLVM backend for the EVM. To give you a sense of the difference, below is the cloc output for the EVM backend in solx compared to libyul and libevmasm from solc.
> cloc solidity/libyul
---------------------------------------------------------------------------
Language                 files          blank        comment           code
---------------------------------------------------------------------------
C++                         96           2652           2269          15585
C/C++ Header               108           1784           3781           5597
CMake                        1              2              0            207
Markdown                     1              2              0              3
---------------------------------------------------------------------------
SUM:                       206           4440           6050          21392
---------------------------------------------------------------------------
> cloc solidity/libevmasm
---------------------------------------------------------------------------
Language                 files          blank        comment           code
---------------------------------------------------------------------------
C++                         20            716            649           6603
C/C++ Header                24            583            995           2601
CMake                        1              1              0             47
---------------------------------------------------------------------------
SUM:                        45           1300           1644           9251
---------------------------------------------------------------------------
> cloc --force-lang="C/C++ Header",def llvm/lib/Target/EVM
---------------------------------------------------------------------------
Language                 files          blank        comment           code
---------------------------------------------------------------------------
C++                         39           1180           1400           5661
C/C++ Header                24            366            434           1198
TableGen                     4            241            191            915
CMake                        4             16              0            106
LLVM IR                      1             15              0             69
---------------------------------------------------------------------------
SUM:                        72           1818           2025           7949
---------------------------------------------------------------------------
There is a catch, though: LLVM does not natively support stack machines, so we implemented stackification ourselves. In doing so, we drew inspiration from both the designs used in solc and the WASM backend in LLVM. This significantly increased the amount of code we need to maintain today.
Still, one can imagine generalizing stackification across stack-based targets - similar to how LLVM handles register machines through declarative target descriptions and shared algorithms. In LLVM, most of the code generation logic - like instruction selection, register allocation, and scheduling - is written once and reused across architectures. Targets such as x86, ARM, or RISC-V don’t each implement these from scratch; instead, they define their register sets, constraints, and instruction mappings declaratively. The LLVM backend then uses a shared pipeline to apply this information in a uniform way.
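For intuition, here is a toy example of the kind of decision stackification has to make (our illustration, not solx output). Computing (a + b) * a needs the value a twice; a register machine simply keeps it live in a register, while a stack machine has to place a DUP at the right point in the schedule:
; Toy input: (a + b) * a
; One possible stackified schedule, assuming the stack already holds the
; arguments as [a, b] with b on top:
;   DUP2   ; stack: a b a
;   ADD    ; stack: a (a+b)
;   MUL    ; stack: (a+b)*a
define i256 @f(i256 %a, i256 %b) {
entry:
  %sum = add i256 %a, %b
  %prod = mul i256 %sum, %a
  ret i256 %prod
}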
If integrated upstream, generalized stackification would be a meaningful contribution to LLVM itself - advancing support for all stack-based architectures and saving significant engineering hours across the compiler ecosystem. It would also reduce solx's maintenance burden in the long run. Here is the cloc output for the stackification logic we maintain today.
---------------------------------------------------------------------------
Language                 files          blank        comment           code
---------------------------------------------------------------------------
C++                          9            530            701           2759
C/C++ Header                 4            103            120            467
---------------------------------------------------------------------------
SUM:                        13            633            821           3226
---------------------------------------------------------------------------
This reflects a broader paradigm common in Web2 tooling: upstream whatever can be upstreamed. Maintaining compilers is expensive, and reducing custom logic is a proven way to keep development velocity high in the long run.
Small teams rarely have time to build tools that support compiler development. But since solx is built on LLVM, it gets them for free. Here are three tools that, in our experience, each saved us at least a day of work during the week we spent on this blog post.
Note: To try these tools yourself, you’ll need to build our LLVM fork - these tools aren’t bundled with solx directly.
opt and llc: Step-by-Step Transformation Tracking
opt and llc are LLVM’s command-line tools for running optimization passes and lowering IR to target code, respectively - and they’re invaluable for understanding what the compiler is doing.
In Part 1, we analyzed a contract’s performance and identified which transformations contributed to the observed gas savings. With LLVM, this kind of investigation is straightforward: opt -print-changed shows which passes modified the IR, and --debug reveals how those transformations were applied - or why some didn’t happen. The same applies to llc, which provides similar insights for the code generation phase.
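For example, a run like the following (the pass choice and file name are just an illustration) reports only the passes that actually changed the IR, which is usually enough to pinpoint where a saving - or a missed one - comes from:
; opt -passes=instcombine -print-changed -S example.ll
; The add-of-zero below gets folded away, so InstCombine is reported as a
; pass that changed the IR; passes that leave it untouched are not printed.
define i256 @f(i256 %x) {
entry:
  %y = add i256 %x, 0
  ret i256 %y
}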
bugpoint: Automatic Test Case Reduction
bugpoint is LLVM’s built-in tool for automatically minimizing test cases that trigger miscompilations or crashes.
While working on a stack-too-deep issue we found in a contract, we used bugpoint to reduce the size of the problematic LLVM IR inputs - shrinking one from 67.6 KB to 8.6 KB of textual LLVM IR, and another from 75.3 KB to just 5.9 KB. These reductions made debugging faster and more focused.
llvm-lit and update_llc_test_checks.py: Making Regression Testing Simple
The contract reductions from the previous example were saved along with the bug fixes they exposed. Stackification is difficult to debug, and minimal test cases that trigger specific corner cases are worth preserving. LLVM’s testing tools make that easy: with llvm-lit, you can add IR-based tests that specify how to run the compiler and what to check. update_llc_test_checks.py then auto-generates assembly checks based on the current output.
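A test then looks roughly like this (the target triple and the CHECK lines below are placeholders; update_llc_test_checks.py regenerates the checks from the real llc output):
; RUN: llc -mtriple=evm < %s | FileCheck %s
define i256 @sum(i256 %a, i256 %b) {
; CHECK-LABEL: sum
; CHECK: ADD
entry:
  %r = add i256 %a, %b
  ret i256 %r
}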
These tools haven’t saved us a day yet on stackification - but based on prior experience, we know they will.
solx isn't only a command-line compiler - it’s a rethink of how Ethereum tooling can be built.
By leveraging LLVM, solx provides a modular, maintainable, and extensible compiler infrastructure. Teams can specialize, collaborate, and reuse tooling instead of rebuilding everything from scratch.
It also challenges a long-standing belief: that the EVM is too different for general-purpose tooling. Yes, the EVM has unique constraints. But it shares more with other platforms than is often acknowledged. Like embedded systems, it imposes strict size limits, making optimization a correctness issue. As in game engines, debugging optimized builds is essential - replaying transactions of optimized bytecode requires it. And the EVM isn’t even the only stack-based VM; LLVM already supports WebAssembly.
LLVM for Ethereum isn’t theoretical anymore - it’s here.
And with LLVM, tools like llvm-cov for coverage, lldb and VS Code for step-debugging, and other mainstream dev tools are within reach.
If you're working on dev tools, solx is ready to be integrated and extended to fit your needs. Here’s how you can start collaborating with us.
🧪 Integrate solx into dev tools: From a CLI perspective, solx works as a drop-in solc replacement. If you encounter edge cases, we’re happy to help, and if you need access to internal LLVM options not exposed in the CLI, or to the LLVM tools themselves, we can assist with that as well.
🤝 Contribute to solx: solx is an open-source project, and we welcome contributions - with no gatekeeping. Small additions like tests, bug fixes, or features are always welcome. If you’re planning something larger, we recommend reaching out early - it helps align efforts and avoid duplicate work. We do expect major contributions to be maintained.
💬 Stay in touch: You can reach us via Telegram or by email at solx@matterlabs.dev. We’re always interested in hearing what you’re building, what you need, and where we can make things easier.