Design Principles of Ethereum 2.0
Friday, 7th of June 2019 · by Raul Jordan
Quite a few articles have been going around talking about Ethereum 2.0’s roadmap, its research proposals, and current status. However, not much has been publicly written on the design principles and invariants that are behind many of its inner workings. Having a clear-cut set of invariants is crucial for such a coordinated, multi-year effort to be successful, and it also allows implementers to be on the same page with respect to the philosophy of Ethereum. This article will explain some of these design decisions, their background, and why they matter for the future of the protocol.
The switch from Proof of Work (PoW) to a Proof of Stake protocol has been a goal of Ethereum, and under heavy development, ever since the birth of the network itself. At the time, Vitalik Buterin was exploring viable solutions to the pitfalls of naive proof of stake protocols, aiming to provide greater security guarantees than PoW. In particular, he and the Ethereum research team devised a mechanism known as slasher as a way to penalize malicious actors in proof of stake and cut their entire deposit (Buterin 2014).
Mathematician Vlad Zamfir then joined the project and most of the work in 2014 focused on solving what is known as a long-range attack on Proof of Stake. Long-range attacks occur when attackers can trivially create an entire chain from scratch that is longer than the current canonical blockchain to convince the rest of the network of a new canonical state of the world. This is almost impossible to perform in Proof of Work as it would require an enormous amount of compounded computational power. Proof of Stake, however, does not rely on computational capacity and therefore collapses under such an attack (Zamfir 2014).
Vitalik and Vlad agreed there was no viable solution to a long-range attack other than strictly preventing clients from syncing a chain older than a certain checkpoint (Buterin 2015). This means that instead of syncing a chain from the genesis block, a new node in the network would only need to sync from a recent “checkpoint” that other nodes in the network agreed on as finalized. That is, new nodes inherently place trust in old nodes when they join the network. This phenomenon came to be known as Proof of Stake’s weak subjectivity: there is subjective trust in which blocks are “finalized” and “irreversible” across participants in the network when new nodes join (Buterin 2018).
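The checkpoint rule can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Checkpoint` type, the `accepts_chain` helper, and the idea of representing a chain as a plain list of block hashes indexed by height are hypothetical and are not taken from any Eth2 client or specification.

```python
from typing import NamedTuple, Sequence


class Checkpoint(NamedTuple):
    """A block the node trusts as finalized (hypothetical type)."""
    height: int
    block_hash: str


def accepts_chain(chain_hashes: Sequence[str], checkpoint: Checkpoint) -> bool:
    """Accept a candidate chain only if it contains the trusted checkpoint.

    chain_hashes[i] is assumed to be the hash of the block at height i.
    Chain length plays no role in the decision, which is the point: a
    longer chain that conflicts with the checkpoint is still rejected.
    """
    if checkpoint.height >= len(chain_hashes):
        return False
    return chain_hashes[checkpoint.height] == checkpoint.block_hash


# Toy data: the attacker's chain is longer but rewrites history
# before the checkpoint at height 2.
canonical = ["genesis", "a", "b", "c"]
long_range_attack = ["genesis", "x", "y", "z", "w"]
trusted = Checkpoint(height=2, block_hash="b")
```

With this rule, `accepts_chain(canonical, trusted)` holds while `accepts_chain(long_range_attack, trusted)` does not, even though the attack chain is longer. The trust placed in whoever supplied `trusted` is the "subjective" part.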
Throughout this time, Vitalik and Virgil Griffith from the Ethereum Foundation worked on publishing the initial version of the Casper Proof of Stake white paper on arXiv (Buterin and Griffith 2015). The years 2014-2017 marked a long period during which Ethereum was attempting to overlay a Proof of Stake based finality system on top of the PoW chain Ethereum currently runs. In parallel, efforts were spinning up to implement state sharding as a partitioning scheme to scale the Ethereum blockchain. In 2018, however, there was a significant push towards bringing both initiatives together, and after an iconic research gathering in Taipei in March, the Ethereum research team proposed merging Casper Proof of Stake and Sharding into a single initiative known as Ethereum Serenity, also known as Ethereum 2.0.
Why ETH 2.0 At All?
This post aims to explain the design rationale behind the central question: “Why have Ethereum 2.0 at all?”. Surely, an overhaul of an existing system’s consensus protocol and data integrity is not something that can be easily done through a hard fork - isn’t it easier to simply create a new system from scratch and abandon Ethereum 1.0 entirely? A difficult problem we face when building Eth2 is engaging the community on this challenge and making the massive benefits of, and the need for, the transition to Eth2 clearly understood.
While understanding the huge responsibility that comes with such a paradigm shift, there’s no better time to build Eth2 than today. Like it or not, crypto is still in its infancy, and the decisions we make today will have a compounding effect on accelerating growth and adoption many years down the line. The migration to proof of stake has already waited long enough and so has the scalability of Ethereum’s applications. The teams building Eth2 are well positioned to do so.
Naive layer 1 scaling can come at a massive security expense, as sharding a blockchain rules out the kind of global transaction verification that the Bitcoin and Ethereum chains perform today. The key question is: how can we obtain scalability while not sacrificing decentralization or security? Many competing chains go the centralization route as a means of solving this problem. Ethereum opts for a different approach which partitions the state of the network into 1024 shards which behave as a homogeneous set of blockchains, each coordinated by a single root chain called the beacon chain. The beacon chain runs on full Casper Proof of Stake, with neither delegation nor centralized voting power. In this approach, every node is only responsible for a portion of the transactions happening throughout the network, and many blocks can be produced in parallel, increasing the overall network throughput linearly. A non-naive specification for this solution seeks to answer the following questions:
- How does the security profile of a network change if transactions are not globally verified?
- How should validating participants be selected while preventing cartels from forming?
- How should incentives be designed to maximize data availability and active participation?
Ethereum settled on proof of stake as its consensus algorithm of choice after years of research, exploration, and weighing of the trade-offs involved. For reasons discussed in this article, rewards are deterministic and validating entities receive equal treatment within the protocol, with equal probability of participating in committees and of earning rewards or penalties. Global verification of transactions is replaced by indirect verification: each shard transaction is first verified by shard validators, who commit checkpoints to the beacon chain, and the beacon chain serves as a “coordinator” of shard information on Eth2.
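As a rough illustration of what "equal probability of participating in committees" could look like, here is a minimal Python sketch that shuffles validators with a shared seed and deals them out across shards. The function name `assign_committees` and the use of Python's `random.Random` are assumptions made for this example. The actual protocol defines its own shuffling procedure and its own source of randomness, neither of which is reproduced here.

```python
import random
from typing import List, Sequence

SHARD_COUNT = 1024  # number of shards mentioned in this article


def assign_committees(
    validator_ids: Sequence[int], shard_count: int, seed: int
) -> List[List[int]]:
    """Split validators into one committee per shard (illustrative only).

    Every node that uses the same seed derives the same assignment, and
    each validator is equally likely to land in any given committee.
    """
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    rng = random.Random(seed)      # stand-in for protocol randomness
    shuffled = list(validator_ids)
    rng.shuffle(shuffled)
    committees: List[List[int]] = [[] for _ in range(shard_count)]
    for position, validator in enumerate(shuffled):
        # Round-robin dealing keeps committee sizes within one of each other.
        committees[position % shard_count].append(validator)
    return committees
```

For instance, `assign_committees(range(10_000), SHARD_COUNT, seed=7)` yields 1024 committees of 9 or 10 validators each. Because the assignment depends only on the seed, a validator cannot choose which shard it ends up on, which is one way to make cartel formation harder.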
A key pillar of protocol design is to understand the invariants under which the protocol will operate. For Ethereum and its developer community, having a list of non-negotiable design decisions the future of the project will pursue is critical.
We can break down the core of Eth2 into the following bullet points:
- Participation in the network should be permissionless
- Layer 1 should be concise, abstract, and compact in its scope
- The protocol should be maximally expressive while not assuming anything about its future uses - i.e. the old Ethereum adage of “we have no features”
- The network should favor liveness to recover from any catastrophic scenarios effectively
- Separate protocol complexity from application development complexity
A notable difference between Eth2 and other “next-gen” blockchains is how participation in consensus is determined. The only requirement for becoming a validator in Eth2 is to have 32 ETH. There is no delegation, no voting to select validating nodes, and no centralized constitution deciding who gets to participate. More importantly, validators in Eth2 are all treated equally, with a hard cap of 32 ETH per validator instance. Any individual, however, can own multiple validator instances. This decision is made purely to keep the consensus protocol compact and its security simple to reason about. Treating all atomic participants equally with an equal stake when voting on blocks is important from an incentive design angle as well as for formal modeling. 1 validator = 32 ETH at stake, no more than that. Other chains aim to solve scalability by taking on a more centralized approach to validation. For Ethereum, however, that is not an option.
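To make the "1 validator = 32 ETH" rule concrete, here is a small Python sketch. The helper names `validator_count` and `staked_eth` are hypothetical and exist only for this example.

```python
VALIDATOR_STAKE_ETH = 32  # fixed stake per validator, as stated above


def validator_count(total_eth: int) -> int:
    """Number of validator instances a holder of `total_eth` could run."""
    if total_eth < 0:
        raise ValueError("total_eth must be non-negative")
    return total_eth // VALIDATOR_STAKE_ETH


def staked_eth(total_eth: int) -> int:
    """ETH actually at stake: always a whole multiple of 32."""
    return validator_count(total_eth) * VALIDATOR_STAKE_ETH
```

A holder of 100 ETH can run 3 validators with 96 ETH at stake, and the remaining 4 ETH carries no weight in consensus. A large holder gains influence only by running more identical validators, never a "bigger" one, which keeps every vote in the protocol the same size.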
Be Concise, Yet Maximally Expressive
Eth2 aims to be concise and compact in its core definitions and what it aims to achieve. At the fundamental level, it is a scalable, permissionless platform for creating decentralized applications.
There is no need to bake in application logic, and for good reason. One could make the analogy to a trimmed down Linux kernel - it is not up to the operating system to include features or assume use cases for itself, but up to the developers who will build applications for said kernel. Assuming intent is restrictive. An old Ethereum adage says “we have no features” and the same philosophy will be applied to Eth2.
Eth2’s proof of stake model, known as Casper the Friendly Finality Gadget, operates under a series of incentives designed to maintain a high degree of liveness and network participation. Eth2 expands on Casper to leverage its properties to secure a network of sharded blockchains. That is, it uses the concepts of chain finality thresholds to ensure the 1024 shards in Eth2 all share the same security pool as the beacon chain.
The core premise of proof of stake is that validators all get rewarded for doing their assigned role as expected, lose money over time for being idle, and get penalized heavily (slashed) if they act maliciously against the protocol. Although the premise is succinct, the devil is in the details. The economics of Casper quickly become more complex once we realize we have to take into account the action profile of not only each individual validator, but committees of validators as a whole.
An open question for Proof of Stake chains in general is when to penalize behaviors and how penalties should differ based on the gravity of the action. That is, we need to find a measure of penalization that is holistic enough to cover all edge cases while being concise. Given the protocol relies on validator activity and is dependent on a strong measure of time to function, there will be scenarios where validators fail to perform while being honest. Honest validators could go offline due to power outages, network instability, or other factors, yet we need to clearly differentiate between idleness penalties and those due to malicious activity.
Part of the design rationale of Ethereum 2.0 is for attackers to incur a massive cost for any attempt to game the protocol. That is, 51% attacks as typically observed in other chains should be extremely costly and even counterproductive for anyone to perform. Reverting finality in a protocol that has explicit finality would make the attackers extremely obvious to the honest validators, allowing a community-coordinated soft fork to remove the bad actors and invalidate their attack. Granted, if an attack is successful and this coordination is unsuccessful, integrity in the system can still be diminished if the attackers’ sole purpose is to damage the system and they are willing to incur a large loss.
An additional limitation in proof of stake based systems is the validator’s dilemma, an aptly named phenomenon in which validators in the system are lazy and simply trust that others in the protocol are doing their job correctly, therefore not validating the messages they are responsible for. These validators save on bandwidth or general computational requirements by not performing their responsibility unless there are significant penalties. This limitation can be mitigated by adding extremely strong penalties and challenge mechanisms for missing data or incorrectly signed information in the network.
Ethereum 2.0’s validator incentives are as follows:
Validator Inactivity: Quadratic Leak
Eth2 relies on the Byzantine Fault Tolerance threshold of ⅔ honest actors in the network. Penalties for inactivity are known as inactivity leaks. If a chain fails to finalize for more than 4 epochs, the protocol becomes as strict as it can on validator rewards. That is, if a long time has passed since chain finality, the maximum expected reward becomes 0, so that validators need to behave perfectly or risk more penalties. The size of the penalty in each epoch is proportional to the time since finality, in order to discourage validators from being offline when they are preventing block finality. The cumulative loss therefore grows roughly quadratically, not linearly, the longer validators stay idle. This penalty, known as “quadratic leak”, is designed so it does not penalize short term idleness but comes with big downsides for longer periods of time, accounting for expected real world behavior. Balances lost to this penalty are burnt and not redistributed to honest validators.
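The "quadratic" in quadratic leak follows from summing a penalty that grows linearly with time. Below is a minimal Python sketch of that arithmetic. The 4-epoch threshold comes from the paragraph above. The constant `LEAK_QUOTIENT` and both function names are assumptions picked for illustration and should not be read as the values or formulas in the specification.

```python
FINALITY_GRACE_EPOCHS = 4   # leak only applies after more than 4 epochs without finality
LEAK_QUOTIENT = 2 ** 25     # hypothetical scaling constant, for illustration only


def leak_penalty(balance: float, epochs_since_finality: int) -> float:
    """Per-epoch penalty for an idle validator, proportional to time since finality."""
    if epochs_since_finality <= FINALITY_GRACE_EPOCHS:
        return 0.0
    return balance * epochs_since_finality / LEAK_QUOTIENT


def balance_after_idle(balance: float, idle_epochs: int) -> float:
    """Balance left after staying offline for `idle_epochs` epochs without finality."""
    for epoch in range(1, idle_epochs + 1):
        balance -= leak_penalty(balance, epoch)  # lost balance is burnt
    return balance
```

Since each epoch's penalty scales with the epoch number, the total lost after n epochs scales with roughly n²/2. Under these assumed constants, doubling the outage from 100 to 200 epochs roughly quadruples the loss, while a few epochs offline cost almost nothing.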
Intentionally Malicious Activity: Slashing
Earlier Proof of Stake proposals for Ethereum stated that malicious validators would suffer large penalties known as slashings. Typically, these mechanisms only discuss penalties for individual malicious validators and not the importance of validator collusion. The network only suffers if a large portion of validators act maliciously against it in coordination. In line with the guarantee of Byzantine fault tolerance, penalties for malicious actors are multiplied by 3 times the number of validators that also acted maliciously within a certain time interval. This helps penalize large coordinated attacks and also disincentivizes malicious validator pooling. That is, it is against malicious validators’ interest to perform an aggregate attack on the network. Slashing occurs via a whistleblowing mechanism, where validators are incentivized to discover slashable offenses by other validators and earn the slashed funds as compensation.
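One way to read the "3 times" rule is sketched below in Python. This is an interpretation and not the specification: it assumes the multiplier is applied to the fraction of the validator set slashed in the same time window, and that the penalty is capped at the validator's whole balance. The function name `slashing_penalty` is hypothetical.

```python
def slashing_penalty(
    balance: float, slashed_in_window: int, total_validators: int
) -> float:
    """Penalty for one slashed validator, scaled by how many others were slashed too.

    Assumption: the penalty is 3x the fraction of all validators slashed in
    the same window, capped at 100% of the balance.
    """
    if total_validators <= 0:
        raise ValueError("total_validators must be positive")
    fraction = min(1.0, 3 * slashed_in_window / total_validators)
    return balance * fraction
```

Under this reading, a lone offender among 1,000 validators loses 0.3% of its 32 ETH, while an attack involving a third or more of the set costs every participant the entire deposit. The factor of 3 lines up with the ⅓ Byzantine threshold: the penalty saturates at the point where the colluding group is large enough to threaten finality.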
In his Ethereum Serenity design rationale, Vitalik outlines 4 specific components of a validator’s base reward during each epoch (a period of 64 blocks):
- 1/4 reward for the attestation specifying the correct epoch checkpoint
- 1/4 reward for the attestation specifying the correct chain head
- 1/4 reward for the attestation being included quickly on chain
- 1/4 reward for the attestation specifying the correct shard block
There is an additional reward on top of this base reward depending on the number of validators that participated correctly. This additional reward is to incentivize everyone to do the right thing, creating a collective push for honest activity. The issuance schedule of rewards should be consistent and straightforward. Adding more complexity would only make the system more error prone and harder to understand from a macroeconomic standpoint.
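The four base reward components listed above can be sketched in Python as follows. The `AttestationOutcome` type and the `epoch_base_reward` function are hypothetical names for this example. The additional participation-based reward is left out because this article does not give a formula for it.

```python
from dataclasses import dataclass


@dataclass
class AttestationOutcome:
    """Which of the four conditions a validator's attestation satisfied."""
    correct_checkpoint: bool
    correct_head: bool
    included_quickly: bool
    correct_shard_block: bool


def epoch_base_reward(base_reward: float, outcome: AttestationOutcome) -> float:
    """Each satisfied condition earns one quarter of the base reward."""
    satisfied = sum(
        [
            outcome.correct_checkpoint,
            outcome.correct_head,
            outcome.included_quickly,
            outcome.correct_shard_block,
        ]
    )
    return base_reward * satisfied / 4
```

A validator whose attestation names the right checkpoint and head but is included late and points at the wrong shard block would earn half of the base reward for that epoch. Keeping the split to four equal quarters reflects the goal of a consistent and straightforward issuance schedule.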
Separate Protocol Complexity From Application Complexity
To say the Eth2 roadmap is daunting is an understatement. It is perhaps one of the most ambitious, multi-year plans to take the best lessons from the industry and create a protocol that solves the scalability trilemma elegantly and is built to last. A lot of discussion has been going around about how sharding dramatically degrades the developer experience. The rationale is that it is extremely difficult to abstract away protocol internals from application developers in Eth2, as we have a highly complex system of shards that need to interact with each other (cross-shard transactions). At first sight, this observation makes sense given how daunting Eth2 looks from the outside and how there is not much clarity on several aspects of smart contract execution in the project. However, the truth is far more nuanced.
Application developers will only need to know about a small portion of the Eth2 protocol. There is no need for average smart contract developers to be aware of the validator registry or internals of the beacon chain’s finality gadget. Phase 0 is therefore quite removed from the application layer. Phase 1 and 2 have also had pretty strong proposals lately which advocate for a higher degree of abstraction of execution environments, making Eth2 more powerful and more concise. At worst, wallet/application developers would need to be aware of certain details of cross-shard transactions to show instant transaction settlement via some tricks. Computer operating systems and internals are much more complex today than 10 years ago, yet, most application developers need not understand the hidden internals that make powerful computer architectures the way they are. This separation of concerns is at the heart of good architecture design, and one could argue is a design invariant we should keep in mind when building Eth2.
Building a True World Computer
To wrap things up, Ethereum is Turing complete, which means it can run any sort of conceivable code in the same way computers do today, albeit as a very limited, slow, and single-threaded computer. Ethereum today is akin to the weak processors of the early days. Running applications on Ethereum today is expensive, as the protocol has built-in mechanisms to prevent the tragedy of the commons scenarios that plague public goods. Its vibrant developer community never runs out of innovations for improving the current network, both at its core level and at layer 2. However, scheduled upgrades can be problematic and are painful from a governance standpoint. If, once Eth2 has been live for a few years, we feel constrained by it and desire to build an Eth3, we will have failed at the former’s core design. Upgradeability should be baked into the protocol in a way that does not require risky hard forks. That is, innovation at layer 1 should be minimal or near zero once the system is in production for the long run. We have a long way to go, but by carefully reminding ourselves why we’re building this software and where we want to see it in 10 years, we can write more robust code that can stand the test of time.