Update Documentation to Reflect Beacon Chain Research (#247)

Former-commit-id: 91dea499d9e27b0ca9e22769582eae6f8d042a98 [formerly 9bfb7f7fef2b8e4802cfd5d99be561378a5ada65]
Former-commit-id: 5a2a52d440d9d67857f5ecac173aa721ce46d12c
Raul Jordan 2018-07-12 12:12:11 -05:00 committed by GitHub
parent 923e727819
commit 0444ee81c4
6 changed files with 210 additions and 372 deletions


@ -2,24 +2,20 @@
![Travis Build](https://travis-ci.org/prysmaticlabs/geth-sharding.svg?branch=master)
This is the main repository for the sharding implementation for the go-ethereum project by [Prysmatic Labs](https://prysmaticlabs.com). For the original, go-ethereum project, refer to the following [link](https://github.com/ethereum/go-ethereum).
This is the main repository for the beacon chain and sharding implementation for Ethereum 2.0 by [Prysmatic Labs](https://prysmaticlabs.com).
Before you begin, check out our [Contribution Guidelines](#contribution-guidelines) and join our active chat room on Gitter below:
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/prysmaticlabs/geth-sharding?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)
Also, read our [Sharding Reference Implementation Doc](https://github.com/prysmaticlabs/geth-sharding/blob/master/sharding/README.md). This doc serves as a source of truth for the sharding implementation we follow at Prysmatic Labs.
Also, read our [Sharding Reference Implementation Doc](https://github.com/prysmaticlabs/geth-sharding/blob/master/sharding/README.md). This doc provides a background on the sharding implementation we follow at Prysmatic Labs.
# Table of Contents
- [Installation](#installation)
- [Sharding Instructions](#sharding)
- [Running a Local Geth Node](#running-a-local-geth-node)
- [Transaction Generator](#transaction-generator)
- [Becoming a Notary](#becoming-a-notary)
- [Running a Collation Proposal Node](#running-a-collation-proposal-node)
- [Testing](#testing)
- [Contributing](#contributing)
- [License](#license)
@ -29,36 +25,33 @@ Also, read our [Sharding Reference Implementation Doc](https://github.com/prysma
Create a folder in your `$GOPATH` and navigate to it
```
$ mkdir -p $GOPATH/src/github.com/ethereum && cd $GOPATH/src/github.com/ethereum
$ mkdir -p $GOPATH/src/github.com/prysmaticlabs && cd $GOPATH/src/github.com/prysmaticlabs
```
Clone our repository as `go-ethereum`
Clone our repository:
```
$ git clone https://github.com/prysmaticlabs/geth-sharding ./go-ethereum
$ git clone https://github.com/prysmaticlabs/geth-sharding
```
For prerequisites and detailed build instructions please read the
[Installation Instructions](https://github.com/ethereum/go-ethereum/wiki/Building-Ethereum)
on the wiki.
Download the Bazel build tool by Google [here](https://docs.bazel.build/versions/master/install.html) and ensure it works by typing
```
$ bazel
```
You will also need to download Geth:
```
$ go get -u github.com/ethereum/go-ethereum
```
Building geth requires both a Go (version 1.7 or later) and a C compiler.
You can install them using your favourite package manager.
Once the dependencies are installed, run
```
$ make geth
```
or, to build the full suite of utilities:
```
$ make all
```
# Sharding Instructions
To get started with running the project, follow the instructions to initialize your own private Ethereum blockchain and geth node, as they will be required to run before you can begin proposing collations into shard chains.
To get started with running the project, follow the instructions below to initialize your own private Ethereum blockchain and geth node. Both must be running before you can begin running our system.
## Running a Local Geth Node
@ -82,11 +75,10 @@ To start a local Geth node, you can create your own `genesis.json` file similar
The `alloc` portion specifies account addresses with prefunded ETH when the Ethereum blockchain is created. You can modify this section of the genesis to include your own test address and prefund it with 100ETH.
Then, you can build `geth` and init a new instance of a local, Ethereum blockchain as follows:
Then, you can initialize a new instance of a local Ethereum blockchain as follows:
$ make geth
$ ./build/bin/geth init /path/to/genesis.json -datadir /path/to/your/datadir
$ ./build/bin/geth --nodiscover console --datadir /path/to/your/datadir --networkid 12345
$ geth init /path/to/genesis.json -datadir /path/to/your/datadir
$ geth --nodiscover console --datadir /path/to/your/datadir --networkid 12345
It is **important** to note that the `--networkid` flag must match the `chainId` property in the genesis file.
@ -98,18 +90,24 @@ Then, the geth console can start up and you can start a miner as follows:
Now, save the passphrase you used in the geth node into a text file called password.txt. Then, once you have this private geth node running on your local network, we will need to generate test, pending transactions that can then be processed into collations by proposers. For this, we have created an in-house transaction generator CLI tool.
## Transaction Generator
Work in Progress. To track our current draft of the tx generator cli spec, visit this [link](https://docs.google.com/document/d/1YohsW4R9dIRo0u5RqfNOYjCkYKVCmzjgoBDBYDdu5m0/edit?usp=drive_web&ouid=105756662967435769870). Generating test transactions on a local network will allow for benchmarking of tx throughput within our system.
# Sharding Minimal Protocol
**NOTE**: This section is in flux and will be deprecated in favor of a beacon chain.
Build our system first:
```
$ bazel build //sharding/...
```
## Becoming a Notary
Our system outlined below follows the [Minimal Sharding Protocol](https://ethresear.ch/t/a-minimal-sharding-protocol-that-may-be-worthwhile-as-a-development-target-now/1650) as outlined by Vitalik on ETHResearch where any actor can submit collation headers via the SMC, but only a selected committee of notaries is allowed to vote on collations in each period. Notaries are in charge of data availability checking and consensus is reached upon a collation header receiving >= 2/3 votes in a period.
To deposit ETH and join as a notary in the Sharding Manager Contract, run the following command:
Make sure a geth node is running as a separate process. Then, to deposit ETH and join as a notary in the Sharding Manager Contract, run the following command:
```
geth sharding --actor "notary" --deposit --datadir /path/to/your/datadir --password /path/to/your/password.txt --networkid 12345
$ bazel run //sharding --actor "notary" --deposit --datadir /path/to/your/datadir --password /path/to/your/password.txt --networkid 12345
```
This will extract 1000 ETH from your account balance and insert you into the SMC's notaries. Then, the program will listen for incoming block headers and notify you when you have been selected to vote on proposals for a certain shard in a given period. Once you are selected, your sharding node will download collation information to check for data availability and vote on proposals that have been submitted via the `addHeader` function on the SMC.
@ -119,14 +117,14 @@ Concurrently, you will need to run another service that is tasked with processin
## Running a Collation Proposal Node
```
geth sharding --actor "proposer" --datadir /path/to/your/datadir --password /path/to/your/password.txt --shardid 0 --networkid 12345
$ bazel run //sharding --actor "proposer" --datadir /path/to/your/datadir --password /path/to/your/password.txt --shardid 0 --networkid 12345
```
This node is tasked with processing pending transactions into blobs within collations by serializing data into collation bodies. It is responsible for submitting proposals on shard 0 (collation headers) to the SMC via the `addHeader` function.
## Running an Observer Node
geth sharding --datadir /path/to/your/datadir --password /path/to/your/password.txt --shardid 0 --networkid 12345
$ bazel run //sharding --datadir /path/to/your/datadir --password /path/to/your/password.txt --shardid 0 --networkid 12345
Omitting the `--actor` flag will launch a simple observer service attached to the sharding client that is able to listen to changes happening throughout the sharded Ethereum network on shard 0.
@ -136,19 +134,23 @@ Omitting the `--actor` flag will launch a simple observer service attached to th
The Sharding Manager Contract is built in Solidity and deployed to a running geth node upon launch of the sharding node if it does not exist in the network at a specified address. If there are any changes to the SMC's code, the Golang bindings must be rebuilt with the following command.
go generate github.com/prysmaticlabs/geth-sharding/sharding
go generate github.com/prysmaticlabs/geth-sharding/sharding/contracts
# OR
cd sharding && go generate
cd sharding/contracts && go generate
# Testing
To run the unit tests of our system do:
```
go test github.com/prysmaticlabs/geth-sharding/sharding
$ bazel test //...
```
We will require more complex testing scenarios (fuzz tests) to measure the full integrity of the system as it evolves.
To run our linter, make sure you have [gometalinter](https://github.com/alecthomas/gometalinter) installed and then run
```
$ gometalinter ./
```
# Contributing
@ -158,10 +160,8 @@ We have put all of our contribution guidelines into [CONTRIBUTING.md](https://gi
# License
The go-ethereum library (i.e. all code outside of the `cmd` directory) is licensed under the
[GNU Lesser General Public License v3.0](https://www.gnu.org/licenses/lgpl-3.0.en.html), also
included in our repository in the `COPYING.LESSER` file.
The go-ethereum library is licensed under the
[GNU Lesser General Public License v3.0](https://www.gnu.org/licenses/lgpl-3.0.en.html).
The go-ethereum binaries (i.e. all code inside of the `cmd` directory) is licensed under the
[GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html), also included
in our repository in the `COPYING` file.
The go-ethereum binaries are licensed under the
[GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.en.html).


@ -1,6 +1,6 @@
# Prysmatic Labs Beacon Chain Implementation
This is the main repository for the beacon chain implementation of Ethereum 2.0 in Golang by [Prysmatic Labs](https://prysmaticlabs.com). Before you begin, check out our [Contribution Guidelines](#contribution-guidelines) and join our active chat room on Gitter below:
This is the main project folder for the beacon chain implementation of Ethereum 2.0 in Golang by [Prysmatic Labs](https://prysmaticlabs.com). Before you begin, check out our [Contribution Guidelines](#contribution-guidelines) and join our active chat room on Gitter below:
[![Gitter](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/prysmaticlabs/geth-sharding?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge)

113
beacon-chain/RESEARCH.md Normal file

@ -0,0 +1,113 @@
# Beacon Chain Research Synopsis
This doc will summarize the latest discussions and roadmap updates around integrating Casper/Sharding through a beacon chain and what it means for Ethereum's end-game.
### Research Notes Leading Up To This
- [Offchain Collation Headers](https://ethresear.ch/t/offchain-collation-headers/1679)
- [RANDAO Beacon Exploitability Part. 1](https://ethresear.ch/t/rng-exploitability-analysis-assuming-pure-randao-based-main-chain/1825/9)
- [Leaderless k-of-n Random Beacon](https://ethresear.ch/t/leaderless-k-of-n-random-beacon/2046)
- [Two Ways to do Cross Links](https://ethresear.ch/t/two-ways-to-do-cross-links/2074)
- [Registrations, Shard Count, and Shuffling](https://ethresear.ch/t/registrations-shard-count-and-shuffling/2129)
- [Committee Based Sharded Casper](https://ethresear.ch/t/committee-based-sharded-casper/2197)
- [Attestation Committee Based Full PoS Chains](https://ethresear.ch/t/attestation-committee-based-full-pos-chains/2259)
The beacon chain idea emerged from the research around notarization of shard information by a committee of notaries, similar to a committee of validators in Casper. The key difference in sharding, however, is the pseudorandomness generation required for fast reshuffling of actors across shards within the system.
Piggybacking off the VRF (verifiable random function) research alongside the ideas of BLS signatures put forth by DFINITY, there are many elements that can be taken to create a sidechain that would potentially merge sharding/casper :heart:.
## The Beacon Chain
### A Sidechain Instead of a Sharding Manager Contract
When handling the sharded system via a smart contract on the Ethereum mainchain, we were able to derive implicit finality via the transactions that submit a collation header into the contract, which would then be mined into a block on the mainchain. However, we were _bounded_ by gas and the current functioning of the EVM 1.0. That is, the number of shards could realistically grow only as far as the sharding manager contract could handle the load of incoming transactions.
This system is also more affected by hard-forks and changes occurring in the main Ethereum network. Moreover, the incoming integration of a hybrid Casper PoS system would create a complicated co-existence of two types of validators: namely Casper and sharding validators. Signature verification, in particular, is an extremely expensive operation if done entirely through a smart contract on the mainchain, creating inherent bottlenecks in a hybrid system.
Instead, we propose the creation of a _sidechain_ known as a **beacon chain** that has links to the mainchain by containing hashes of canonical mainchain blocks within its own block construction.
### Desiderata
There are a few important traits we will include in this sidechain construct that are particularly important for sharding.
- Block references to the main chain
- Full Proof of Stake via Casper FFG
- RANDAO for random selection of committees
- One-Way Deposits
- Ability to store verifiable metadata of occurrences across shards (we refer to this information as a cross-link)
In Vitalik Buterin's [words](https://notes.ethereum.org/SCIg8AH5SA-O4C1G1LYZHQ?both):
> A cross-link is a special type of transaction that says “here is the hash of some recent block on shard X. Here are signatures from at least 2/3 of some randomly selected sample of M validators (eg. M = 1024) that attest to the validity of the cross-link”. Every shard (eg. there might be 4000 shards total) is itself a PoS chain, and the shard chains are where the transactions and accounts will be stored. The cross-links serve to “confirm” segments of the shard chains into the main chain, and are also the primary way through which the different shards will be able to talk to each other.
The main purpose of the beacon chain is to handle these shard cross-links as well as the set of validators that are locked into the system. This initial set is seeded by users burning 32 ETH into a contract on the mainchain and specifying their public key, which can then be verified by the beacon chain, which interacts with this contract.
The exact specification of the beacon chain is a work-in-progress being written [here](https://notes.ethereum.org/SCIg8AH5SA-O4C1G1LYZHQ?both). Instead of paraphrasing its spec, we will elaborate on the research that led to this point and what it means moving forward.
## Research History
Recall that the **Sharding Manager Contract** was originally going to be used for management of the sharded system.
> Whole point of this is to give a background and better understanding of the system...I think I won't follow the structure below in the final writing but will help as an outline.
### Limitations of the Sharding Manager Contract
Using a Sharding Manager Contract, albeit attractive from a development standpoint, poses the following challenges:
- miners on the main chain can censor transactions
- the number of shards is limited by gas costs of writes to the storage of this contract
- any upgrades to this contract/sharding system would require a hard fork on the main chain
#### Offchain Collation Headers
[ETHResearch Link](https://ethresear.ch/t/offchain-collation-headers/1679)
One idea that naturally arose was to store collation headers offchain, allowing for more shards to exist as there would be no processing bottleneck and faster finality guarantees. Reminiscent of DFINITY's chain, this process could be done through a construct known as a **random beacon side chain** that is pegged to the main chain via checkpoints on main chain blocks.
Reminiscent of plasma chains without the exit mechanism according to Justin Drake, this construct would provide a solid ground for experimentation of shard designs without modifications to the main chain. The beacon chain would provide the pseudorandomness required for committee selection of the sharding system, through BLS distributed key generation, as well as better finality given that shards only care about finalized deposits from the main chain.
Management and submission of collation headers can now be done within each respective shard, with the beacon chain only serving as a coordination device for summarizing what was voted on within each shard. That is, the beacon chain would be responsible for handling what we call a **cross-link**, which is a piece of metadata summarizing which collations were voted on as canonical within shards, who voted on these collations, and what are their blob merkle roots.
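As a rough illustration of the cross-link metadata described above, here is a minimal Go sketch. The `CrossLink` type, its field names, and the `HasQuorum` helper are hypothetical and do not come from the beacon chain spec. The 2/3 threshold is borrowed from Vitalik's description quoted earlier in this doc, and the voter list is modeled as a plain boolean slice rather than aggregated signatures.

```go
package main

import "fmt"

// CrossLink is a hypothetical summary of what a shard committee voted on.
// Field names are illustrative assumptions, not taken from the spec.
type CrossLink struct {
	ShardID        uint64
	CollationHash  [32]byte // collation voted on as canonical
	BlobMerkleRoot [32]byte // merkle root of the collation's blobs
	Voters         []bool   // Voters[i] is true if committee member i voted
}

// VoteCount returns how many committee members voted for the collation.
func (c CrossLink) VoteCount() int {
	n := 0
	for _, voted := range c.Voters {
		if voted {
			n++
		}
	}
	return n
}

// HasQuorum reports whether at least 2/3 of the committee voted.
// Integer arithmetic (3*votes >= 2*size) avoids floating point rounding.
func (c CrossLink) HasQuorum() bool {
	return len(c.Voters) > 0 && 3*c.VoteCount() >= 2*len(c.Voters)
}

func main() {
	link := CrossLink{
		ShardID: 10,
		Voters:  []bool{true, true, true, true, false, false},
	}
	fmt.Println("votes:", link.VoteCount(), "quorum:", link.HasQuorum())
}
```

A real implementation would verify signatures from the committee instead of trusting a boolean list; the sketch only shows the shape of the data and the threshold check.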
#### Randomness in Committee Selections
[ETHResearch Link](https://ethresear.ch/t/rng-exploitability-analysis-assuming-pure-randao-based-main-chain/1825/9)
While DFINITY's beacon chain uses the BLS Signature scheme integrated into something called Threshold Relay for distributed randomness, this is not needed for the random beacon chain construct we are building for sharding. That is, there are other satisfactory ways to achieve randomness that make more sense for sharding.
Ethereum's sharding beacon chain will use a system called **RANDAO**, a Decentralized Autonomous Organization (DAO) in which randomness is generated by participants contributing a value to a "hash-onion" in the system:
$$H(H(H(\ldots S \ldots)))$$
Participants creating a block have to reveal the pre-image (the value before the hash) of their commitment value and update the current commitment value to this pre-image. In a recursive manner, the next participants will have to do the same and reveal their pre-images when creating a block. This system updates a random value $R$ during each iteration by taking the $XOR$ of it with the revealed pre-image. This value, $R$, is then used for randomization of committee selection. For sharding in particular, Justin Drake mentioned that the source of randomness can be this global $R$ value additionally XORed with the proposer's pre-image commitment for that particular shard, restricting the visibility of $R\_{shard}$ to only proposers participating in that shard.
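The commit-and-reveal loop above can be sketched in Go as follows. This is only a toy model under stated assumptions: SHA-256 stands in for the unspecified hash $H$, the `Participant` type and `hashN` helper are hypothetical names, and a single participant peels its own onion rather than many validators taking turns.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash is a 32-byte digest. SHA-256 is an assumption; the text only says H.
type Hash = [32]byte

// hashN applies H to seed n times, building an n-layer hash onion.
func hashN(seed Hash, n int) Hash {
	h := seed
	for i := 0; i < n; i++ {
		h = sha256.Sum256(h[:])
	}
	return h
}

// Participant holds the publicly known commitment (outermost onion layer).
type Participant struct {
	Commitment Hash
}

// Reveal accepts preimage only if H(preimage) equals the current commitment.
// On success it XORs the preimage into r and peels one layer off the onion.
func (p *Participant) Reveal(preimage Hash, r *Hash) bool {
	if sha256.Sum256(preimage[:]) != p.Commitment {
		return false
	}
	for i := range r {
		r[i] ^= preimage[i]
	}
	p.Commitment = preimage
	return true
}

func main() {
	var seed Hash
	copy(seed[:], "secret seed S")
	const layers = 3

	p := Participant{Commitment: hashN(seed, layers)}
	var r Hash // running random value R, starting at zero
	for i := layers - 1; i >= 0; i-- {
		ok := p.Reveal(hashN(seed, i), &r)
		fmt.Println("reveal accepted:", ok)
	}
	fmt.Printf("R = %x\n", r)
}
```

Because each reveal is checked against the previous commitment, a participant cannot choose a different value at reveal time; the only remaining influence is declining to reveal at all.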
#### Leaderless Random Beacon - Alternative to RANDAO
[ETHResearch Link](https://ethresear.ch/t/leaderless-k-of-n-random-beacon/2046)
The downside of RANDAO is the need for a leader at each step of the hash pre-image reveal to determine the next value of $R$. In an alternative approach, we have a "committee of size $n$ generate random numbers if $k$ participants generate correctly" (Justin Drake). In this approach, every participant commits a temporary (secret key, public key) pair and forms a polynomial via point interpolation. At a reveal phase of the protocol, k-of-n polynomial encrypted shares. At a reveal phase, $k$ participants reveal their private keys, and it is then easy to check which ones did not reveal correctly. Even then, we can reconstruct an appropriate candidate polynomial using the revealed keys. The random output is then defined as the "sum of the secret keys for which the corresponding participants committed correctly", as explained by Justin.
### Shard Metadata & Finality
[ETHResearch Link](https://ethresear.ch/t/two-ways-to-do-cross-links/2074)
The main idea behind sharding is to leverage this beacon chain construct for consensus on **cross-links**, which are the heart of the sharding spec. In a way, cross-links are metadata that summarize the latest occurrences on shards. That is, they compress the results of PoS consensus from each shard into a simple message that uses proposer signatures to confirm this information. They are called **cross-links** because they will eventually be the way shards communicate with each other, as they are linked to the finality of the main chain.
#### Registrations and Committee Reshuffling
[ETHResearch Link](https://ethresear.ch/t/registrations-shard-count-and-shuffling/2129)
Leading up to the beacon chain spec, a few proposals for structuring committees were being explored, including this one by Justin Drake. We refer to participants in the Ethereum protocol 2.0 (Casper + Sharding) as validators which can be in one of three states: either pending_registration, registered, or pending_deregistration (Justin Drake).
The main idea behind structuring committees of validators is that shards will be empty and uninitialized until the beacon chain reaches a certain minimum number of registered validators.
In the post, it is mentioned that
> Proposers and notaries are shuffled (via pseudo-random permutations) across shards in a staggered fashion and at a constant rate. Proposers are assigned to shards for 2^19 periods (~30 days) and the oldest proposer from each shard are shuffled every 2^(19 - 10) periods. Notaries are assigned to a shard for 2^7 periods (~10 minutes) and the oldest 2^(10 - 7) notaries from each shard are shuffled every period.
>
> However, the current spec for the beacon chain mentions a fixed number for the SHARD_COUNT set to 1024.
The entire reshuffling mechanism was revamped given this fixed shard_count number. In the current beacon chain spec, it is mentioned that
> For shard crosslinks, the process is somewhat more complicated. First, we choose the set of shards active during each epoch. We want to target some fixed number of notaries per crosslink, but this means that since there is a fixed number of shards and a variable number of validators, we wont be able to go through every shard in every epoch. Hence, we first need to select which shards we will be crosslinking in some given epoch
Additionally, the current spec forces Casper validators to also be sharding validators, which enforces greater security and takes advantage of the enshrined randomness and full PoS properties of the beacon chain.
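To make the shard-selection constraint quoted above concrete, the following Go sketch works through the arithmetic. It assumes the `SHARD_COUNT` of 1024 mentioned earlier and a target committee size of 1024 notaries per cross-link (the example value of M from Vitalik's description). The function names and the round-robin rotation are hypothetical and are not the selection rule from the spec.

```go
package main

import "fmt"

// shardCount mirrors the fixed SHARD_COUNT of 1024 mentioned above.
const shardCount = 1024

// shardsPerEpoch returns how many shards can each be given a full committee
// in one epoch, capped at the total number of shards.
func shardsPerEpoch(validators, notariesPerCrosslink int) int {
	if notariesPerCrosslink <= 0 {
		return 0
	}
	n := validators / notariesPerCrosslink
	if n > shardCount {
		n = shardCount
	}
	return n
}

// activeShards picks the shards to cross-link in a given epoch using a
// simple round-robin rotation (an illustrative assumption only).
func activeShards(epoch, validators, notariesPerCrosslink int) []int {
	n := shardsPerEpoch(validators, notariesPerCrosslink)
	shards := make([]int, 0, n)
	start := (epoch * n) % shardCount
	for i := 0; i < n; i++ {
		shards = append(shards, (start+i)%shardCount)
	}
	return shards
}

func main() {
	const validators = 100000
	const committee = 1024
	fmt.Println("shards per epoch:", shardsPerEpoch(validators, committee))
	fmt.Println("first shard in epoch 1:", activeShards(1, validators, committee)[0])
}
```

With 100,000 validators and committees of 1024, only 97 of the 1024 shards can be cross-linked per epoch, which is why the spec first has to choose which shards are active.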


@ -8,19 +8,20 @@ You can explore our [Current Projects](https://github.com/prysmaticlabs/geth-sha
**Contribution Steps**
- Create a folder in your `$GOPATH` and navigate to it `mkdir -p $GOPATH/src/github.com/ethereum && cd $GOPATH/src/github.com/ethereum`
- Clone our repository as `go-ethereum`, `git clone https://github.com/prysmaticlabs/geth-sharding ./go-ethereum`
- Follow the setup instructions in our [README.md](https://github.com/prysmaticlabs/geth-sharding/blob/master/README.md)
- Create a folder in your `$GOPATH` and navigate to it `mkdir -p $GOPATH/src/github.com/prysmaticlabs && cd $GOPATH/src/github.com/prysmaticlabs`
- `git clone https://github.com/prysmaticlabs/geth-sharding`
- Fork our repository on GitHub: <https://github.com/prysmaticlabs/geth-sharding>
- Add a remote to your fork: `git remote add YOURNAME https://github.com/YOURNAME/geth-sharding`
Now you should have a remote pointing to the `origin` repo (geth-sharding) and to your forked, go-ethereum repo on Github. To commit changes and start a Pull Request, our workflow is as follows:
Now you should have a remote pointing to the `origin` repo (geth-sharding). To commit changes and start a Pull Request, our workflow is as follows:
- Create a new branch with a clear feature name such as `git checkout -b collations-pool`
- Issue changes with clear commit messages
- Run the linter and CI tester as follows `go run build/ci.go test && go run build/ci.go lint`
- Run the linter and tester as follows `gometalinter && bazel test //...`
- Push to your remote `git push YOURNAME collations-pool`
- Go to the [geth-sharding](https://github.com/prysmaticlabs/geth-sharding) repository on Github and start a PR comparing `geth-sharding:master` with `go-ethereum:collations-pool` (your fork on your profile).
- Go to the [geth-sharding](https://github.com/prysmaticlabs/geth-sharding) repository on Github and start a PR comparing `geth-sharding:master` with `geth-sharding:collations-pool` (your fork on your profile).
- Add a clear PR title along with a description of what this PR encompasses, when it can be closed, and what you are currently working on. Github markdown checklists work great for this.
Pull requests must be cleanly rebased on top of master. If master advances while your PR is in review, please keep rebasing it.
@ -58,4 +59,4 @@ Core contributors are remote contractors of Prysmatic Labs, LLC. and are conside
We love working with people that are autonomous, bring independent thoughts to the team, and are excited for their work! We believe in a merit-based approach to becoming a core contributor, and any part-time contributor that puts in the time, work, and drive can become a core member of our team.
![eth](https://steemitimages.com/DQmV1NASyCJYusDjY1WCvpoWiXh32HyumQHFQhY8zYZ6WDH/source.gif)
![eth](https://steemitimages.com/DQmV1NASyCJYusDjY1WCvpoWiXh32HyumQHFQhY8zYZ6WDH/source.gif)


@ -80,7 +80,6 @@ With respect to knowing enough about sharding, we will cover the requirements fo
- [How to Scale Ethereum: Sharding Explained](https://medium.com/prysmatic-labs/how-to-scale-ethereum-sharding-explained-ba2e283b7fce)
- [Sharding FAQ](https://github.com/ethereum/wiki/wiki/Sharding-FAQ)
- [Sharding Introduction: R&D Compendium](https://github.com/ethereum/wiki/wiki/Sharding-introduction-R&D-compendium)
- [Sharding Minimal Protocol](https://ethresear.ch/t/a-minimal-sharding-protocol-that-may-be-worthwhile-as-a-development-target-now/1650)
### For Core Contributors
@ -97,8 +96,6 @@ After reading the Sharding FAQ, it is important to understand the minimal implem
**Sharding Concepts and Notes**
- [Sharding Concepts Mental Map](https://www.mindomo.com/zh/mindmap/sharding-d7cf8b6dee714d01a77388cb5d9d2a01)
- [Sharding Minimal Protocol](https://ethresear.ch/t/a-minimal-sharding-protocol-that-may-be-worthwhile-as-a-development-target-now/1650)
- [Sharding Roadmap](https://github.com/ethereum/wiki/wiki/Sharding-roadmap)
- [Taiwan Sharding Workshop Notes](https://hackmd.io/s/HJ_BbgCFz#%E2%9F%A0-General-Introduction)
- [Sharding Research Compendium](http://notes.ethereum.org/s/BJc_eGVFM)
- [Torus Shaped Sharding Network](https://ethresear.ch/t/torus-shaped-sharding-network/1720/8)
@ -115,9 +112,6 @@ After reading the Sharding FAQ, it is important to understand the minimal implem
- [Future Compatibility for Sharding](https://ethresear.ch/t/future-compatibility-for-sharding/386)
- [Fork Choice Rule for Collation Proposal Mechanisms](https://ethresear.ch/t/fork-choice-rule-for-collation-proposal-mechanisms/922/8)
- [State Execution](https://ethresear.ch/t/state-execution-scalability-and-cost-under-dos-attacks/1048)
- [Enforcing Windback](https://ethresear.ch/t/enforcing-windback-validity-and-availability-and-a-proof-of-custody/949/5)
- [Fork Free Sharding](https://ethresear.ch/t/fork-free-sharding/1058/12)
- [Merge Blocks](https://ethresear.ch/t/merge-blocks-and-synchronous-cross-shard-state-execution/1240/4)
- [Fast Shard Chains With Notarization](https://ethresear.ch/t/as-fast-as-possible-shard-chains-with-notarization/1806/2)
- [RANDAO Notary Committees](https://ethresear.ch/t/fork-free-randao/1835/3)
- [Safe Notary Pool Size](https://ethresear.ch/t/safe-notary-pool-size/1728/3)


@ -1,6 +1,6 @@
# Prysmatic Labs Main Sharding Reference
This document serves as a main reference for Prysmatic Labs' sharding implementation for the go-ethereum client, along with our roadmap and compilation of active research and approaches to various sharding schemes.
This document serves as a main reference for Prysmatic Labs' sharding and beacon chain implementation in Go, along with our roadmap and compilation of active research.
# Table of Contents
@ -10,39 +10,14 @@ This document serves as a main reference for Prysmatic Labs' sharding implementa
- [The Ruby Release: Local Network](#the-ruby-release-local-network)
- [The Sapphire Release: Ropsten Testnet](#the-sapphire-release-ropsten-testnet)
- [The Diamond Release: Ethereum Mainnet](#the-diamond-release-ethereum-mainnet)
- [Go-Ethereum Sharding Alpha Implementation](#go-ethereum-sharding-alpha-implementation)
- [Beacon Chain and Sharding Alpha Implementation](#beacon-chain-and-sharding-alpha-implementation)
- [System Architecture](#system-architecture)
- [System Start and User Entrypoint](#system-start-and-user-entrypoint)
- [The Sharding Manager Contract](#the-sharding-manager-contract)
- [Notary Sampling](#notary-sampling)
- [The Notary Client](#the-notary-client)
- [Local Shard Storage](#local-shard-storage)
- [The Proposer Client](#the-proposer-client)
- [Collation Headers](#collation-headers)
- [Protocol Modifications](#protocol-modifications)
- [Protocol Primitives: Collations, Blocks, Transactions, Accounts](#protocol-primitives-collations-blocks-transactions-accounts)
- [The EVM: What You Need to Know](#the-evm-what-you-need-to-know)
- [Sharding In-Practice](#sharding-in-practice)
- [Use-Case Stories: Proposers](#use-case-stories-proposers)
- [Use-Case Stories: Notaries](#use-case-stories-notaries)
- [Current Status](#current-status)
- [Notary Sampling](#notary-sampling)
- [Security Considerations](#security-considerations)
- [Not Included in Ruby Release](#not-included-in-ruby-release)
- [Bribing, Coordinated Attack Models](#bribing-coordinated-attack-models)
- [Enforced Windback](#enforced-windback)
- [The Data Availability Problem](#the-data-availability-problem)
- [Introduction and Background](#introduction-and-background)
- [On Uniquely Attributable Faults](#on-uniquely-attributable-faults)
- [Erasure Codes](#erasure-codes)
- [Beyond Phase 1](#beyond-phase-1)
- [Cross-Shard Communication](#cross-shard-communication)
- [Receipts Method](#receipts-method)
- [Merge Blocks](#merge-blocks)
- [Synchronous State Execution](#synchronous-state-execution)
- [Transparent Sharding](#transparent-sharding)
- [Tightly-Coupled Sharding (Fork-Free Sharding)](#tightly-coupled-sharding-fork-free-sharding)
- [Active Questions and Research](#active-questions-and-research)
- [Community Updates and Contributions](#community-updates-and-contributions)
- [Acknowledgements](#acknowledgements)
- [References](#references)
@ -56,73 +31,42 @@ An approach to solving the scalability trilemma is the idea of blockchain shardi
## Basic Sharding Idea and Design
A sharded blockchain system is made possible by having nodes store “signed metadata” in the main chain of latest changes within each shard chain. Through this, we manage to create a layer of abstraction that tells us enough information about the global, synced state of parallel shard chains. These messages are called **collation headers**, which are specific structures that encompass important information about the chainstate of a shard in question. Collations are created by actors known as **proposers** that are tasked with packaging transactions into collation bodies. These collations are then voted on by a party of actors known as **notaries**. These notaries are randomly selected for particular periods of time in certain shards and are then tasked into reaching consensus on these chains via a **proof of stake** system occurring through a smart contract on the Ethereum main chain.
A sharded blockchain system is made possible by having nodes store “signed metadata” in the main chain about the latest changes within each shard chain. Through this, we manage to create a layer of abstraction that tells us enough information about the global, synced state of parallel shard chains. These messages are called **cross-links**, which are specific structures that encompass important information about the shard blocks (known as **collations**) of a shard in question. Collations are created by actors known as **proposers** that are tasked with packaging transactions into collation bodies. These collations are then voted on by a party of actors known as **notaries**. These notaries are randomly selected for particular periods of time in certain shards and are then tasked with reaching consensus on these chains via a **proof of stake** system.
These collations are holistic descriptions of the state and transactions on a certain shard. A collation header at its most basic, high level summary contains the following information:
Cross-links are stored in blocks on a full proof of stake chain known as a **beacon chain**, which will be implemented as a sidechain to the Ethereum main chain initially.
- Information about what shard the collation corresponds to (let's say shard 10)
Cross-links are holistic descriptions of the state and transactions on a certain shard. Transactions in a shard are stored in **collations**, which contain both a collation header and a collation body. At its most basic, a collation header contains information about who created it, when it was added to a shard, and its internal data stored as serialized blobs.
For detailed information on protocol primitives including collations, see: [Protocol Primitives](#protocol-primitives). We will have two types of nodes that do the heavy lifting of our sharding logic: **proposers and notaries**. The basic role of proposers is to fetch pending transactions from the txpool, wrap them into collations, and submit them to a smart contract on the Ethereum main chain.
For detailed information on protocol primitives including collations, see: [Protocol Primitives](#protocol-primitives). We will have a few types of nodes that do the heavy lifting of our sharding logic: **proposers, notaries, and attesters**. The basic role of proposers is to fetch pending transactions from the txpool, wrap them into collations, grow the shard chains, and submit cross-links to the beacon chain.
<!--[Proposer{bg:wheat}]fetch txs-.->[TXPool], [TXPool]-.->[Proposer{bg:wheat}], [Proposer{bg:wheat}]-package txs>[Collation|header|body], [Collation|header|body]-submit header>[Sharding Manager Contract], [Notary{bg:wheat}]downloads collation availability and votes-.->[Sharding Manager Contract]-->
![proposers](https://yuml.me/69cbd7da.png)
We still keep the Ethereum main chain and deploy a smart contract into it known as the **Validator Registration Contract**, where users can deposit and burn 32 ETH. Beacon chain nodes would listen to deposits in this contract and consequently queue up a user with the associated address as a validator in the beacon chain PoS system. Validators then become part of a registered validator set in the beacon chain, and committees of validators are selected to become notaries on shard chains for certain periods of blocks until they are eventually reshuffled into different shards.
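The deposit-then-queue flow described above can be sketched in Go. This is a minimal illustration under stated assumptions, not the project's implementation: the `Deposit` and `ValidatorQueue` types, the exact-amount check, and the rejection of duplicate addresses are all hypothetical choices made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// depositETH is the 32 ETH deposit mentioned in the text, in whole ETH for simplicity.
const depositETH = 32

// Deposit is a hypothetical, simplified view of a log entry emitted by the
// Validator Registration Contract.
type Deposit struct {
	Address   string
	AmountETH uint64
}

// ValidatorQueue holds addresses waiting to join the beacon chain validator set.
type ValidatorQueue struct {
	seen    map[string]bool
	pending []string
}

// NewValidatorQueue returns an empty queue.
func NewValidatorQueue() *ValidatorQueue {
	return &ValidatorQueue{seen: make(map[string]bool)}
}

// Enqueue queues the depositor if the amount is exactly 32 ETH and the
// address has not been queued before (both rules are assumptions).
func (q *ValidatorQueue) Enqueue(d Deposit) error {
	if d.AmountETH != depositETH {
		return errors.New("deposit must be exactly 32 ETH")
	}
	if q.seen[d.Address] {
		return errors.New("address already registered")
	}
	q.seen[d.Address] = true
	q.pending = append(q.pending, d.Address)
	return nil
}

func main() {
	q := NewValidatorQueue()
	deposits := []Deposit{
		{Address: "0xaa", AmountETH: 32},
		{Address: "0xbb", AmountETH: 1},
		{Address: "0xaa", AmountETH: 32},
	}
	for _, d := range deposits {
		if err := q.Enqueue(d); err != nil {
			fmt.Println("rejected", d.Address, "-", err)
			continue
		}
		fmt.Println("queued", d.Address)
	}
}
```

A real beacon node would read these deposits from contract logs over JSON-RPC rather than from an in-memory slice.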
Notaries stake ETH into the contract and vote on collations submitted by proposers during a certain period. Notaries are in charge of checking for data availability of such collations and reach consensus on canonical shard chains.
So then, are proposers in charge of state execution? The short answer is that phase 1 will contain **no state execution**. Instead, proposers will simply package all types of transactions into collations and later down the line, agents known as executors will download, run, and validate state as they need to through possibly different types of execution engines (potentially TrueBit-style, interactive execution).
Notaries are in charge of checking for data availability of such collations and reaching consensus on canonical shard chains. So then, are proposers in charge of state execution? The short answer is that phase 1 will contain **no state execution**. Instead, proposers will simply package all types of transactions into collations and, later down the line, agents known as executors will download, run, and validate state as they need to through possibly different types of execution engines (potentially TrueBit-style, interactive execution).
This separation of concerns between notaries and proposers allows for more computational efficiency within the system, as notaries will not have to do the heavy lifting of state execution and focus solely on consensus through fork-choice rules. In this scheme, it makes sense that eventually **proposers** will become **executors** in later phases of a sharding spec.
Notaries periodically get assigned to different shards, a period is defined as a certain interval of blocks.
Given that we are splitting up the global state of the Ethereum blockchain into shards, new types of attacks arise because fewer resources are required to completely dominate a shard. This is why a **source of randomness** and periods are critical components to ensuring the integrity of the system.
The Ethereum Wiki's [Sharding FAQ](https://github.com/ethereum/wiki/wiki/Sharding-FAQ) suggests pseudorandom sampling of notaries on each shard. The goal is that these notaries will not know in advance which shard they will get. Otherwise, malicious actors could concentrate resources into a single shard and try to overtake it (See: [1% Attack](https://medium.com/@icebearhww/ethereum-sharding-and-finality-65248951f649)).
Casper Proof of Stake (Casper [FFG](https://arxiv.org/abs/1710.09437) and [CBC](https://arxiv.org/abs/1710.09437)) makes this quite trivial because there is already a set of global validators that we can select notaries from. The source of randomness needs to be common to ensure that this sampling is entirely compulsory and can't be gamed by the notaries in question.
In practice, the first phase of sharding will not be a complete overhaul of the network, but rather an implementation through a smart contract on the main chain known as the **Sharding Manager Contract (SMC)**. Its responsibility is to manage submitted collation headers and manage notaries.
Among its basic responsibilities, the SMC is responsible for reconciling notaries across all shards. It is in charge of pseudorandomly sampling notaries from addresses that have staked ETH into the SMC. The SMC is also responsible for providing immediate collation header verification that records a valid collation header hash on the main chain. In essence, sharding revolves around being able to store collation headers and their associated votes in the main chain through this smart contract.
Sharding revolves around being able to store shard metadata in a full proof of stake chain known as a beacon chain. For pseudorandomness generation, a RANDAO mechanism can be used in the beacon chain to shuffle validators securely.
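As a rough illustration of the RANDAO idea mentioned above, the Go sketch below shows a single commit-reveal step: a validator's reveal is accepted only if it hashes to the previously stored commitment, and is then XORed into a running seed. The function names `commitTo` and `mixReveal`, the use of SHA-256, and the 32-byte sizes are assumptions for the example; the beacon chain spec defines its own hash function and state fields.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// commitTo returns the commitment a validator would publish ahead of time:
// the hash of the value it will later reveal. SHA-256 is an assumption here.
func commitTo(reveal [32]byte) [32]byte {
	return sha256.Sum256(reveal[:])
}

// mixReveal verifies that reveal hashes to commitment and, if so, XORs the
// reveal into seed. On a mismatch the seed is returned unchanged with an error.
func mixReveal(seed, commitment, reveal [32]byte) ([32]byte, error) {
	if commitTo(reveal) != commitment {
		return seed, errors.New("reveal does not match commitment")
	}
	var next [32]byte
	for i := range seed {
		next[i] = seed[i] ^ reveal[i]
	}
	return next, nil
}

func main() {
	// Hypothetical secret chosen by a validator.
	secret := sha256.Sum256([]byte("validator secret"))
	commitment := commitTo(secret)

	var seed [32]byte // start from an all-zero seed for the demo
	next, err := mixReveal(seed, commitment, secret)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Printf("new seed prefix: %x\n", next[:4])
}
```

Because each reveal is fixed by an earlier commitment, a validator can withhold its reveal but cannot choose a different value after seeing the others.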
# Roadmap Phases
Prysmatic Labs will follow the parts of the (now deprecated) Phase 1 Spec posted on [ETHResearch](https://ethresear.ch/t/sharding-phase-1-spec/1407) by the Foundation's research team to roll out a local version of quadratic sharding. In essence, the high-level sharding roadmap is as follows as outlined by Justin Drake:
Prysmatic Labs will implement the beacon chain spec posted on [ETHResearch](https://ethresear.ch/t/convenience-link-to-full-casper-chain-v2-spec/2332) by the Foundation's research team and roll out a sharding client that communicates with this beacon.
- Phase 1: Basic sharding without EVM
- Blob shard without transactions
- Proposers
- Notaries
- Phase 2: EVM state transition function
- Full nodes only
- Asynchronous cross-contract calls only
- Account abstraction
- eWASM
- Archive accumulators
- Storage rent
- Phase 3: Light client state protocol
- Executors
- Stateless clients
- Phase 4: Cross-shard transactions
- Internally-synchronous zones
- Phase 5: Tight coupling with main chain security
- Data availability proofs
- Casper integration
- Internally fork-free sharding
- Manager shard
- Phase 6: Super-quadratic sharding
- Load balancing
To concretize these phases, we will be releasing our implementation of sharding for the geth client as follows:
To concretize these phases, we will be releasing our implementation of sharding and the beacon chain as follows:
## The Ruby Release: Local Network
Our current work is focused on creating a localized version of phase 1, quadratic sharding that would include the following:
Our current work is focused on creating a localized version of a beacon chain with a sharding system that would include the following:
- A minimal, **sharding node** system that will interact with a **Sharding Manager Contract** on a locally running geth node
- Ability to deposit ETH into the SMC through the command line and to be selected as a notary by the local **SMC** in addition to the ability to withdraw the ETH staked
- A **proposer** that listens for pending txs, creates collations, and submits them to the SMC
- Ability to inspect the shard states and visualize the working system locally
- A minimal, **beacon chain node** that will interact with a main chain geth node via JSON-RPC
- A **Validator Registration Contract** deployed on the main chain where a beacon node can read logs to check for registered validators
- A minimal, gossipsub shardp2p network
- Ability for proposers/notaries/attesters to be selected by the beacon chain's randomness into committees that work on specific shards
- Ability to serialize blobs into collations on shard chains and advance the growth of the shard chains
- An observer node that can join a network on shardp2p, sync to the latest head, and send tx's to nodes in the network
We will forego several security considerations that will be critical for testnet and mainnet release for the purposes of demonstration and local network testing as part of the Ruby Release (See: [Security Considerations Not Included in Ruby](#not-included-in-ruby-release)).
@ -131,7 +75,7 @@ ETA: To be determined
## The Sapphire Release: Ropsten Testnet
Part 1 of the **Sapphire Release** will focus around getting the **Ruby Release** polished enough to be live on an Ethereum testnet and manage a set of notaries voting on collations through the **on-chain SMC**. This will require a lot more elaborate simulations around the safety of the randomness behind the notary assignments in the SMC. Futhermore we need to pass stress testing against DoS and other sorts of byzantine attacks. Additionally, it will be the first release to have real users proposing collations concurrently with notaries reaching consensus on these collations.
Part 1 of the **Sapphire Release** will focus on getting the **Ruby Release** polished enough to be live on an Ethereum testnet and manage a beacon chain + sharding system. This will require much more elaborate simulations around the safety of the randomness behind the notary assignments in the SMC. Furthermore, we need to pass stress testing against DoS and other sorts of Byzantine attacks. Additionally, it will be the first release to have real users proposing collations concurrently with notaries reaching consensus on these collations, alongside beacon node validators producing blocks via PoS.
Part 2 of the **Sapphire Release** will focus on implementing state execution and defining the State Transition Function for sharding on a local testnet (as outlined in [Beyond Phase 1](#beyond-phase-1)) as an extension to the Ruby Release.
@ -139,84 +83,55 @@ ETA: To be determined
## The Diamond Release: Ethereum Mainnet
The **Diamond Release** will reconcile the best parts of the previous releases and deploy a full-featured, cross-shard transaction system through a Sharding Manager Contract on the Ethereum mainnet. As expected, this is the most difficult and time consuming release on the horizon for Prysmatic Labs. We plan on growing our community effort significantly over the first few releases to get all hands-on deck preparing for real ether to be staked in the SMC.
The **Diamond Release** will reconcile the best parts of the previous releases and deploy a full-featured, cross-shard transaction system through a beacon chain, Casper FFG-enabled, sharding release. As expected, this is the most difficult and time-consuming release on the horizon for Prysmatic Labs. We plan on growing our community effort significantly over the first few releases to get all hands on deck preparing for this.
The Diamond Release should be considered the production release candidate for sharding Ethereum on the mainnet.
ETA: To Be determined
# Go-Ethereum Sharding Alpha Implementation
# Beacon Chain and Sharding Alpha Implementation
Prysmatic Labs will begin by focusing its implementation entirely on the **Ruby Release** from our roadmap. We plan on being as pragmatic as possible to create something that can be locally run by any developer as soon as possible. Our initial deliverable will center around a command line tool that will serve as an entrypoint into a sharding node that allows staking to become a notary, proposer, manages shard state local storage, and does on-chain voting of collation headers via the Sharding Manager Contract.
Prysmatic Labs will begin by focusing its implementation entirely on the **Ruby Release** from our roadmap. We plan on being as pragmatic as possible to create something that can be locally run by any developer as soon as possible. Our initial deliverable will center around a command line tool that will serve as an entrypoint into a beacon chain node that allows for users to become a notary, proposer, and to manage the growth of shard chains.
Here is a full reference spec explaining how our initial system will function:
Here is a reference spec explaining how our initial system will function:
## System Architecture
Our implementation revolves around 5 core components:
Our implementation revolves around the following core components:
- A **locally-running geth node** that spins up an instance of the Ethereum blockchain and mines on the Proof of Work chain
- A **Sharding Manager Contract (SMC)** that is deployed onto this blockchain instance
- A **sharding node** that connects to the running geth node through JSON-RPC, provides bindings to the SMC
- A **notary service** that allows users to stake ETH into the SMC and be selected as a notary in a certain period on a shard
- A **proposer service** that is tasked with processing pending tx's into collations that are then submitted to the SMC. In phase 1, proposers _do not_ execute state, but rather just serialize pending tx data into possibly valid/invalid data blobs.
Our initial implementation will function through simple command line arguments that will allow a user running the local geth node to deposit ETH into the SMC and join as a notary that is randomly assigned to a shard in a certain period.
- A **main chain node** that spins up an instance of the Ethereum blockchain and mines on the Proof of Work chain
- A **beacon chain** that connects to this main chain node via JSON-RPC
- A **shardp2p system** that allows nodes to reshuffle across shard networks
A basic, end-to-end example of the system is as follows:
1. _**User starts a sharding node and deposits 1000ETH into the SMC:**_ the sharding node connects to a locally running geth node and asks the user to confirm a deposit from his/her personal account.
1. _**User deposits 32 ETH into a Validator Registration Contract on the main chain:**_ the beacon chain listens for the logs in the main chain to queue that validator into the beacon chain's main event loop.
2. _**Client connects & listens to incoming headers from the geth node and assigns user as notary on a shard per period:**_ The notary is selected in CURRENT_PERIOD + LOOKAHEAD_PERIOD (which is around 5 minutes of notice) and must download data for collation headers submitted in that time period.
2. _**Registered validator begins PoS process to propose blocks:**_ the PoS validator has the responsibility to participate in the addition of new blocks to the beacon chain.
3. _**Concurrently, a proposer protocol processes pending transaction data into blobs:**_ the proposer client will create collation bodies and submit their headers to the SMC. In Phase 1, it is important to note that we will _not_ have any state execution. Proposers will just serialize pending tx into fixed collation body sizes without executing them for state transition validity.
3. _**RANDAO mechanism selects committees of proposers/notaries/attesters for shards:**_ the beacon chain node will use its RANDAO mechanism to select committees of proposers, notaries, and attesters that each have responsibilities within the sharding system. Refer to the [Full Casper Chain V2 Doc](https://ethresear.ch/t/convenience-link-to-full-casper-chain-v2-spec/2332) for extensive detail on the different fields in the state of the beacon chain related to sharding.
5. _**The set of notaries vote on collation headers as canonical until the period ends:**_ the headers that received >= 2/3 votes are accepted as canonical.
6. _**User is selected as notary again on the SMC in a different period or can withdraw his/her stake:**_ the user can keep staking and voting on incoming collation headers and restart the process, or withdraw his/her stake and be deregistered from the SMC.
Now, we'll explore our architecture and implementation in detail as part of the go-ethereum repository.
4. _**Beacon Chain State Advances, Committees are Reshuffled:**_ upon completing responsibilities, the different actors of the sharding system are then reshuffled into new committees on different shards.
## System Start and User Entrypoint
Our Ruby Release requires users to start a local geth node running a localized, private blockchain to deploy the **SMC** into. Users can spin up a notary client as a command line entrypoint into geth while the node is running as follows:
Our Ruby Release requires users to start a local geth node running a localized, private blockchain to deploy the **Validator Registration Contract**. Then, the deployed address of this contract can be supplied to the beacon chain as an argument:
geth sharding --actor "notary" --datadir /path/to/your/datadir --password /path/to/your/password.txt --networkid 12345 --deposit
beacon-chain --datadir /path/to/your/datadir --password /path/to/your/password.txt --networkid 12345 --vrc VALIDATOR_REGISTRATION_CONTRACT_ADDRESS
This will extract 1000ETH from the user's account balance and insert him/her into the SMC's notaries. Then, the program will listen for incoming block headers and notify the user when he/she has been selected to vote on collations for a certain shard in a given period. Once you are selected, the sharding node will download collation information to check for data availability and vote on proposals that have been submitted via the `addHeader` function on the SMC.
This will kickstart the entire beacon chain sync process and listen for registrations of validators in the main chain VRC. The beacon node begins to work by its main loop, which involves the following steps:
Users can also run a proposer client that is tasked with processing transactions into collations and submitting them to the SMC via the `addHeader` function.
1. _**Sync to the latest block header on the beacon chain:**_ the node will begin a sync process for the beacon chain
geth sharding --actor "proposer" --datadir /path/to/your/datadir --password /path/to/your/password.txt --networkid 12345
2. _**Assign the validator as a proposer/notary/attester based on RANDAO mechanism:**_ on incoming headers, the client will interact with the SMC to check if the current user is an eligible notary for an upcoming period (only a few minutes notice)
This client is tasked with processing pending transactions into blobs within collations by serializing data into collation bodies. It is responsible for submitting proposals (collation headers) to the SMC via the `addHeader` function.
3. _**Process shard cross-links:**_ once a notary is selected, he/she has to download submitted collation headers for the shard in a certain period and check for their data availability
The sharding node begins to work by its main loop, which involves the following steps:
5. _**Reshuffle committees:**_ the notary votes on the available collation header that came first in the submissions.
1. _**Subscribe to incoming block headers:**_ the client will begin by issuing a subscription over JSON-RPC for block headers from the running geth node.
6. _**Propose blocks and finalize incoming blocks via PoS:**_ Once notaries vote, headers that received >=2/3 votes are selected as canonical
2. _**Check shards for notary selection within LOOKAHEAD_PERIOD:**_ on incoming headers, the client will interact with the SMC to check if the current user is an eligible notary for an upcoming period (only a few minutes notice)
3. _**If the notary is selected, check data availability for submitted collation headers:**_ once a notary is selected, he/she has to download submitted collation headers for the shard in a certain period and check for their data availability
5. _**The notary issues a vote:**_ the notary votes on the available collation header that came first in the submissions.
6. _**Other notaries vote, period ends, and header is selected as canonical shard chain header:**_ Once notaries vote, headers that received >=2/3 votes are selected as canonical
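The >= 2/3 rule in the last step above can be checked with integer arithmetic alone, which avoids rounding issues at the boundary. The Go sketch below is illustrative only; the function name `hasSupermajority` and the committee size of 135 used in `main` are assumptions, not values taken from the spec.

```go
package main

import "fmt"

// hasSupermajority reports whether votes is at least two thirds of
// committeeSize. Comparing 3*votes with 2*committeeSize keeps the check
// exact, with no floating point involved.
func hasSupermajority(votes, committeeSize int) bool {
	if committeeSize <= 0 || votes < 0 || votes > committeeSize {
		return false
	}
	return 3*votes >= 2*committeeSize
}

func main() {
	const committeeSize = 135 // hypothetical committee size for the demo
	for _, votes := range []int{89, 90, 135} {
		fmt.Printf("%d/%d votes -> canonical: %v\n",
			votes, committeeSize, hasSupermajority(votes, committeeSize))
	}
}
```

With a committee of 135, the threshold falls at exactly 90 votes, so 89 votes are not enough.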
<!--[Transaction Generator]generate test txs->[Shard TXPool],[Geth Node]-deploys>[Sharding Manager Contract{bg:wheat}], [Shard TXPool]<fetch pending txs-.->[Proposer Client], [Proposer Client]-propose collation>[Sharding Manager Contract],[Notary Client]download availability and vote->[Sharding Manager Contract{bg:wheat}]-->
![system functioning](https://yuml.me/6c2f90a5.png)
## The Sharding Manager Contract
Our solidity implementation of the Sharding Manager Contract follows the reference spec outlined in ETHResearch's [minimal sharding protocol](https://ethresear.ch/t/a-minimal-sharding-protocol-that-may-be-worthwhile-as-a-development-target-now/1650)
<!-- removed old solidity code cause it will be bound to change -->
Our current [solidity implementation](https://github.com/prysmaticlabs/geth-sharding/blob/master/sharding/contracts/sharding_manager.sol) includes all of these functions along with other utilities important for our Ruby Release sharding scheme.
For more details on these methods, please refer to the Phase 1 spec as it details all important requirements and additional functions to be included in the production-ready SMC.
### Notary Sampling
## Notary Sampling
The probability of being selected as a notary on a particular shard is being heavily researched in the latest ETHResearch discussions. As specified in the [Sharding FAQ](https://github.com/ethereum/wiki/wiki/Sharding-FAQ) by Vitalik, “if validators [collators] could choose, then attackers with small total stake could concentrate their stake onto one shard and attack it, thereby eliminating the system's security.”
@ -231,43 +146,7 @@ At that point, the attacker has the ability to conduct 51% attacks against that
However, this problem transcends the sharding scheme itself and goes into the broader problem of fraud detection, which we have yet to comprehensively address.
## The Notary Client
One of the main running threads of our implementation is the notary client, which serves as a bridge between users staking their ETH to become notaries and the **Sharding Manager Contract** that verifies collation headers on the canonical chain.
When we launch the client, The instance connects to a running geth node via JSON-RPC and calls the deposit function on a deployed, Sharding Manager Contract to insert the user into a notary pool. Then, we subscribe for updates on incoming block headers and determine if the user is a notary on receiving each header. Once we are selected within a LOOKAHEAD_PERIOD, our client fetches data associated with submitted collation headers to that shard. The notary votes on the SMC, and if other notaries reach consensus, the collation is accepted as canonical.
### Local Shard Storage
Local shard information is done through a key-value store used to store the mainchain information in the local data directory specified by the running geth node. Adding a collation to a shard will effectively modify this key-value store.
Work in progress.
## The Proposer Client
In addition to launching a notary client, our system requires a user to concurrently launch a proposer client that is tasked with fetching pending txs from the network and creating collations that can be sent to the SMC.
This client connects via JSON-RPC to give the client the ability to call required functions on the SMC. The proposer is tasked with packaging pending transaction data into _blobs_ and **not** executing these transactions. This is very important, we will not consider state execution until later phases of a sharding roadmap.
Then, the proposer node calls the `addHeader` function on the SMC by submitting this collation header. We'll explore the structure of collation headers in this next section.
### Collation Headers
Work in progress.
## Peer Discovery and Shard Wire Protocol
Work in progress.
## Protocol Modifications
### Protocol Primitives: Collations, Blocks, Transactions, Accounts
(Outline the interfaces for each of these constructs, mention crucial changes in types or receiver methods in Go for each, mention transaction access lists)
Work in progress.
### The EVM: What You Need to Know
## The EVM: What You Need to Know
As an important aside, we'll take a brief detour into the EVM and what we need to understand before we modify it for a sharded blockchain. At its core, the functionality of the EVM optimizes for _security_ and not for computational power with the following restrictions:
@ -291,39 +170,13 @@ It is important to note that the merkle root of an Ethereum account is updated a
How is this relevant to sharding? It is important to note the importance of certain opcodes in our implementation and how we will need to introduce and modify several of them for both security and scalability considerations in a sharded chain.
Work in progress.
## Sharding In-Practice
### Use-Case Stories: Proposers
The primary purpose of proposers is to package transaction data into collations that can then be submitted to the SMC.
The primary incentive for proposers to generate these collations is to receive a payout to their coinbase address from transactions fees once these collations are added to a block in the canonical chain. This process, however, cannot occur until we have state execution in our protocol, so proposers will be running at a loss for our Phase 1 implementation.
### Use-Case Stories: Notaries
The primary purpose of notaries is to use Proof of Stake and reach **consensus** on valid shard chains based on the collations they process and add to the Sharding Manager Contract. They have three primary things to do:
- They can deposit ETH into the SMC and become a notary. They then have to wait to be selected by the SMC on a particular period to vote on collation headers in the SMC.
- They download availability of collation headers submitted to their assigned shard in the period.
- They vote on available collation headers
## Current Status
Currently, Prysmatic Labs is focusing its initial implementation around the logic of the notary and proposer clients, as well as shard state local storage and p2p networking. We have built the command line entrypoints as well as the minimum, required functions of the Sharding Manager Contract that is deployed to a local Ethereum blockchain instance. Our notary client is able to subscribe for block headers from the running Geth node and determine when we are selected as an eligible notary in a given period if we have deposited ETH into the contract.
You can track our progress, open issues, and projects in our repository [here](https://github.com/prysmaticlabs/geth-sharding).
# Security Considerations
## Not Included in Ruby Release
We will not be considering data availability proofs (part of the stateless client model) as part of the Ruby Release. We will not be implementing them just yet, as they are an area of active research.
## Bribing, Coordinated Attack Models
Work in progress.
Additionally, we will be using simple blockhashes for randomness in committee selections instead of a full RANDAO mechanism.
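One way to picture blockhash-based committee selection is a deterministic shuffle seeded by a block hash, followed by splitting the shuffled validators across shards. The Go sketch below assumes a Fisher-Yates shuffle whose swap indices come from SHA-256 over the seed and a counter; the function names, the hash choice, and the round-robin split are hypothetical and are not taken from the project or the spec. The modulo step also carries a small bias that a production design would need to address.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// shuffle returns a copy of validators permuted deterministically by seed.
// Every node that sees the same seed computes the same ordering.
func shuffle(validators []string, seed [32]byte) []string {
	out := make([]string, len(validators))
	copy(out, validators)
	for i := len(out) - 1; i > 0; i-- {
		// Derive a swap index for this step from hash(seed || i).
		var buf [40]byte
		copy(buf[:32], seed[:])
		binary.BigEndian.PutUint64(buf[32:], uint64(i))
		h := sha256.Sum256(buf[:])
		j := int(binary.BigEndian.Uint64(h[:8]) % uint64(i+1))
		out[i], out[j] = out[j], out[i]
	}
	return out
}

// committees deals the shuffled validators into shardCount groups, round robin.
func committees(shuffled []string, shardCount int) [][]string {
	if shardCount <= 0 {
		return nil
	}
	out := make([][]string, shardCount)
	for i, v := range shuffled {
		out[i%shardCount] = append(out[i%shardCount], v)
	}
	return out
}

func main() {
	validators := []string{"v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7"}
	// Stand-in for a recent block hash; a real node would read it from the chain.
	blockHash := sha256.Sum256([]byte("stand-in for a recent block hash"))
	for shard, c := range committees(shuffle(validators, blockHash), 2) {
		fmt.Println("shard", shard, "committee", c)
	}
}
```

A block hash can be influenced by whoever produces the block, which is one reason this approach suits local testing rather than an adversarial network.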
## Enforced Windback
@ -335,66 +188,8 @@ One way to enforce **validity** during the windback process is for nodes to prod
On the other hand, to enforce **availability** for the windback process, a possible approach is for nodes to produce “proofs of custody” in collation headers that prove the notary was in possession of the full data of a collation when it was produced. Drake proposes a constant time, non-interactive zkSNARK method for notaries to check these proofs of custody. In his construction, he mentions splitting up a collation body into “chunks” that are then mixed with the node's private key through a hashing scheme. The security of this scheme relies on the idea that a node would not leak its private key without compromising itself, so it provides a succinct way of checking that the full data was available when a node processed the collation body and the proof was created.
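To make the “chunks mixed with a private key” idea more concrete, here is a loose Go sketch. It is not Drake's construction and omits the zkSNARK verification entirely; it only shows how a digest can depend on every chunk of a body and on a node-held secret. The chunk size, the use of SHA-256, and the chaining order are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// chunkSize is an arbitrary value chosen for the example.
const chunkSize = 32

// custodyDigest walks the body chunk by chunk and folds each chunk, together
// with the secret, into a running hash. Producing the digest therefore
// requires both the full body and the secret.
func custodyDigest(body, secret []byte) [32]byte {
	var acc [32]byte
	for start := 0; start < len(body); start += chunkSize {
		end := start + chunkSize
		if end > len(body) {
			end = len(body)
		}
		h := sha256.New()
		h.Write(acc[:])          // chain in the previous state
		h.Write(secret)          // mix in the node's secret
		h.Write(body[start:end]) // mix in the current chunk
		copy(acc[:], h.Sum(nil))
	}
	return acc
}

func main() {
	body := []byte("serialized blobs that make up a collation body, split into chunks")
	digest := custodyDigest(body, []byte("hypothetical-node-secret"))
	fmt.Printf("custody digest: %x\n", digest)
}
```

In this sketch a node that skipped even one chunk could not reproduce the digest, which is the intuition behind tying availability to possession of the data.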
## The Data Availability Problem
### Introduction and Background
Work in progress.
### On Uniquely Attributable Faults
Work in progress.
### Erasure Codes
Work in progress.
# Beyond Phase 1
## Cross-Shard Communication
### Receipts Method
Work in progress.
### Merge Blocks
Work in progress.
### Synchronous State Execution
Work in progress.
## Transparent Sharding
One of the first questions dApp developers ask about sharding is how much they will need to change their workflow and smart contract development to adopt the sharded blockchain scheme. An idea tangentially explored by Vitalik in his [Sharding FAQ](https://github.com/ethereum/wiki/wiki/Sharding-FAQ) was the concept of **“transparent sharding”** which means that sharding will exist exclusively at the protocol layer and will not be exposed to developers. The Ethereum state system will continue to look as it currently does, but the protocol will have a built-in system that creates shards, balances state across shards, gets rid of shards that are too small, and more. This will all be done behind the scenes, allowing devs to continue their current workflow on Ethereum. This was only briefly mentioned, but will be critical to ensure a better user experience moving forward after security considerations are addressed.
## Tightly-Coupled Sharding (Fork-Free Sharding)
A current problem with the scheme we are following for sharding is the reliance on **two fork-choice rules**. When we are reaching consensus on the best shard chain, we not only have to check for the longest canonical, main chain, but also the longest shard chain **within** this longest main chain. Fork-choice rules have long been an approach to solve the constraints that distributed systems impose on us due to factors outside of our control (Byzantine faults) and are the current standard in most public blockchains.
A problem that can occur with current distributed fork-choice ledgers is the possibility of choosing a wrong fork and continuing to do PoW on it, thereby wasting potential profits of mining on the canonical chain. Another current burden is the large amount of data that needs to be downloaded in order to validate which fork is potentially the best one to follow in any situation, opening up avenues for spam DDoS attacks.
Fortunately, there is a potential method of creating a fork-free sharding mechanism that relies on what we are currently implementing through the Sharding Manager Contract that has been explored by Justin Drake and Vitalik in [this](https://ethresear.ch/t/fork-free-sharding/1058) and this [other post](https://ethresear.ch/t/a-model-for-stage-4-tightly-coupled-sharding-plus-full-casper/1065), respectively.
The current spec of the Sharding Manager Contract __already does a canonical ordering of collation headers for us__ (i.e. we can track the timestamped logs of collation headers being added). Because the data for the SMC lives on the canonical main chain, we are able to easily extract an exact ordering and validity from headers added through the contract.
To add validity to our current SMC spec, Drake mentions that we can use a succinct zkSNARK in the collation root proving validity upon construction that can be checked directly by the `addHeader` function on the SMC.
The other missing piece is the guarantee of data availability within collation headers submitted to the SMC, which can once again be done through zero-knowledge proofs and erasure codes (See: The Data Availability Problem). By escalating this up to the SMC, we can ensure what Vitalik calls “tightly-coupled” sharding, in which case breaking a single shard would entail also breaking the progression of the canonical chain, enabling easier cross-shard communication due to having a single source of truth being the SMC and the associated collation headers it has processed. In Justin Drake's words, “there is no fork-choice rule within the SMC”.
It is important to note that this “tightly coupled” sharding has been relegated to the latter phases of the roadmap.
Work in progress.
# Active Questions and Research
## Selecting Notaries Via Random Beacon Chains
In our current implementation for the Ruby Release, we are selecting notaries through a pseudorandom method built directly into the SMC. Inspired by Dfinity's random beacon chains, the Ethereum Research team has been proposing [better solutions](https://github.com/ethereum/research/tree/master/sharding_fork_choice_poc) that have faster finality guarantees. The random beacon chain would be in charge of pseudorandomly sampling notaries and would enable features that were not possible before, such as off-chain collation headers. With this approach, no gas would need to be paid for including collation headers, and finality could be reached faster than in the current scheme.
<https://ethresear.ch/t/posw-random-beacon/1814/6>
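To make the idea of pseudorandom sampling concrete, here is a minimal sketch that derives a committee from a seed. It assumes SHA-256 as a stand-in hash and samples with replacement; `sample_notaries` and its parameters are hypothetical and do not describe the SMC or any beacon chain proposal.

```python
import hashlib


def sample_notaries(seed: bytes, shard_id: int, period: int,
                    pool: list, committee_size: int) -> list:
    """Deterministically pick committee_size notaries (with replacement)."""
    if not pool:
        raise ValueError("notary pool is empty")
    committee = []
    for index in range(committee_size):
        # Mix the seed with the shard, the period, and the committee slot.
        material = (seed
                    + shard_id.to_bytes(8, "big")
                    + period.to_bytes(8, "big")
                    + index.to_bytes(8, "big"))
        digest = hashlib.sha256(material).digest()
        # Reduce the digest to an index into the pool.
        committee.append(pool[int.from_bytes(digest, "big") % len(pool)])
    return committee
```

Anyone who knows the seed can recompute the same committee, so the quality of the scheme depends entirely on how hard the seed is to predict or bias, which is what the random beacon research is about.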
## Leaderless Random Beacons
In the previous research on random beacons, committees are able to generate random numbers if a certain number of participants participate correctly. This is similar to the random beacon used in Dfinity, but without the use of BLS threshold signatures. The scheme is separated into two sections.
@ -420,56 +215,9 @@ With a sharded network comes sharded state storage. State sync today is difficul
<https://ethresear.ch/t/data-availability-proof-friendly-state-tree-transitions/1453>
## Proof of Custody
A critique of the notary scheme currently followed in the minimal sharding protocol is the susceptibility of these agents to the “validator's dilemma”, wherein agents are incentivized to be “lazy” and trust the work of other validators when making coordinated decisions. Specifically, notaries are tasked with checking data availability of collation headers submitted to the SMC during their assigned period. This means notaries have to download headers via a shardp2p network and commit their votes after confirming availability. Proposers can try to game validators by publishing unavailable proposals and then challenging lazy validators to take their deposits. To prevent abuse of collation availability traps, the responsibility of notaries is extended to also provide a “Merkle root of a signature tree, where each signature in the signature tree is the signature of the corresponding chunk of the original collation data.” (ETHResearch) This means that at challenge time, notaries must have the fully available collation data in order to construct a signature tree of all its chunks.
<https://ethresear.ch/t/extending-skin-in-the-game-of-notarization-with-proofs-of-custody/1639>
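The signature tree quoted above can be sketched as follows. This is only an illustration under loud assumptions: an HMAC stands in for a real signature scheme, SHA-256 is the tree hash, the chunk size is arbitrary, and odd-sized levels duplicate their last node. None of these choices come from the linked proposal.

```python
import hashlib
import hmac


def split_chunks(data: bytes, chunk_size: int) -> list:
    """Split the collation body into fixed-size chunks (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def sign_chunk(secret_key: bytes, chunk: bytes) -> bytes:
    # Stand-in for a real signature: an HMAC keyed with the notary's secret.
    return hmac.new(secret_key, chunk, hashlib.sha256).digest()


def merkle_root(leaves: list) -> bytes:
    """Compute a binary Merkle root over the given leaves."""
    if not leaves:
        raise ValueError("cannot build a tree with no leaves")
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


def custody_root(secret_key: bytes, collation_body: bytes,
                 chunk_size: int = 32) -> bytes:
    """Merkle root of the per-chunk signatures of the collation body."""
    signatures = [sign_chunk(secret_key, chunk)
                  for chunk in split_chunks(collation_body, chunk_size)]
    return merkle_root(signatures)
```

The root depends on both the notary's key and every chunk of the body, so a notary who never downloaded the data cannot produce it and cannot copy it from another notary.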
## Safe Notary Pool Sizes: RANDAO Exploration
When notary pool sizes are too small, two problems arise. First, a small pool requires each notary to have a large amount of bandwidth. The bandwidth required of each notary is inversely proportional to the size of the pool, so to be sufficiently decentralized the notary pool should be large enough that the required bandwidth is manageable even on a poor internet connection. Second, the notary pool size has a direct effect on the capital required to take over notarization and revert or censor transactions. An acceptable notary pool size is one that imposes a minimum acceptable capital threshold for a takeover of the chain. In Vitalik's RANDAO analysis, he looked at how vulnerable the RANDAO chain is compared to a PoW (Proof of Work) chain. The result of the exercise was that an attacker with 40% of the stake on the RANDAO chain can effectively revert transactions; to achieve the same result on a PoW chain, they would require 50% of the hashpower. On the other hand, if the chain used a 2/2 notarization committee, the attacker would need to raise their stake to 46% to be able to effectively censor transactions.
<https://ethresear.ch/t/safe-notary-pool-size/1728>
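The inverse relationship between pool size and per-notary bandwidth can be written out as simple arithmetic. The sketch below assumes the total load is spread evenly across the pool; the function names and the figures in the usage note are hypothetical.

```python
import math


def bandwidth_per_notary(total_bytes_per_period: float, pool_size: int) -> float:
    """Average bandwidth each notary needs if load is spread evenly."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return total_bytes_per_period / pool_size


def min_pool_size(total_bytes_per_period: float, max_bytes_per_notary: float) -> int:
    """Smallest pool that keeps every notary at or below the bandwidth cap."""
    if max_bytes_per_notary <= 0:
        raise ValueError("max_bytes_per_notary must be positive")
    return math.ceil(total_bytes_per_period / max_bytes_per_notary)
```

For example, with a made-up total of 1000 units per period and a cap of 30 units per notary, `min_pool_size(1000, 30)` gives 34, and doubling the pool halves the result of `bandwidth_per_notary`.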
## Cross Links Between Shard Chain and Main Chain
To synchronize cross-shard communication, we are researching how to properly link the shard chains to the beacon chain. To accomplish this, a randomly sampled committee will vote to approve a collation in a shard chain per period and per shard. As Vitalik wrote, there are two ways to create cross links between the main chain and the shards: on-chain aggregation and off-chain aggregation. For on-chain aggregation, the state of the beacon chain keeps track of the randomly sampled committee as validators. Each validator can cast one Casper FFG-style vote, which also contains the cross-link for that committee. For off-chain aggregation, every beacon chain block creator chooses one CAS to link the shard chain to the main chain. Off-chain aggregation has the benefit that the beacon chain does not need to keep track of vote counts.
<https://ethresear.ch/t/extending-minimal-sharding-with-cross-links/1989/8>
<https://ethresear.ch/t/two-ways-to-do-cross-links/2074/2>
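A minimal sketch of the on-chain aggregation idea: the chain tracks the committee, counts at most one vote per member, and accepts a cross-link once enough members agree. The 2/3 threshold and the `tally_crosslink` name are assumptions made for illustration; the linked posts should be consulted for the real acceptance rule.

```python
def tally_crosslink(committee, votes):
    """Return the collation root backed by at least 2/3 of the committee, or None.

    committee: iterable of validator ids sampled for this shard and period.
    votes: iterable of (validator_id, collation_root) pairs in arrival order.
    """
    members = set(committee)
    seen = set()
    counts = {}
    for validator, root in votes:
        # Ignore votes from outside the committee and repeated votes.
        if validator not in members or validator in seen:
            continue
        seen.add(validator)
        counts[root] = counts.get(root, 0) + 1
    for root, count in counts.items():
        # Assumed threshold: at least 2/3 of the committee (integer arithmetic).
        if count * 3 >= len(members) * 2:
            return root
    return None
```

The `seen` and `counts` bookkeeping is the per-vote state that on-chain aggregation has to store, and that off-chain aggregation avoids.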
## Fixed ETH Deposit Size for Notaries
A notary must submit a deposit to the Sharding Manager Contract in order to be randomly selected to vote on a block. A fixed-size deposit makes random selection convenient and works well with slashing, as slashing can always destroy at least a minimum amount of ether. However, a fixed-size deposit does not work well with rewards and penalties. An alternative is to design an incentive system where rewards and penalties are tracked in a separate variable, and when penalties minus rewards reach a threshold, the notary can be voted out. Such a design might neglect an important function, which is to reduce the influence of notaries that are offline. In Casper FFG, if more than 1/3 of validators go offline around the same time, their deposits begin to leak quickly. This is called the quadratic leak.
<https://ethresear.ch/t/fixed-size-deposits-and-rewards-penalties-quad-leak/2073/7>
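The separate-variable design can be sketched as below. The `NotaryAccount` fields, the ejection rule, and the leak formula are illustrative assumptions: the leak is modelled so that the per-epoch penalty grows linearly with time offline, which makes the cumulative loss quadratic. This is a simplification, not the Casper FFG formula.

```python
from dataclasses import dataclass


@dataclass
class NotaryAccount:
    deposit: int        # fixed-size deposit; never modified by rewards or penalties
    rewards: int = 0    # tracked separately from the deposit
    penalties: int = 0  # tracked separately from the deposit

    def net_penalty(self) -> int:
        return self.penalties - self.rewards


def should_eject(account: NotaryAccount, threshold: int) -> bool:
    """Vote the notary out once penalties minus rewards reach the threshold."""
    return account.net_penalty() >= threshold


def cumulative_leak(epochs_offline: int, leak_unit: int) -> int:
    """Total leak when the penalty in epoch k is k * leak_unit (quadratic growth)."""
    return leak_unit * epochs_offline * (epochs_offline + 1) // 2
```

In this toy model, staying offline twice as long costs roughly four times as much, which is how a leak of this kind reduces the influence of offline participants.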
# Community Updates and Contributions
Excited by our work and want to get involved in building out our sharding releases? We created this document as a single source of reference for all things related to sharding Ethereum, and we need as much help as we can get!
You can explore our [Current Projects](https://github.com/prysmaticlabs/geth-sharding/projects) in the works for the Ruby release. Each of the project boards contains a full collection of open and closed issues relevant to the different parts of our first implementation, which we use to track our open source progress. Feel free to fork our repo and start creating PRs after assigning yourself to an issue of interest. We are always chatting on [Gitter](https://gitter.im/prysmaticlabs/geth-sharding), so drop us a line there if you want to get more involved or have any questions on our implementation!
**Contribution Steps**
- Create a folder in your `$GOPATH` and navigate to it: `mkdir -p $GOPATH/src/github.com/ethereum && cd $GOPATH/src/github.com/ethereum`
- Clone our repository as `go-ethereum`: `git clone https://github.com/prysmaticlabs/geth-sharding ./go-ethereum`
- Fork the `go-ethereum` repository on Github: <https://github.com/ethereum/go-ethereum>
- Add a remote to your fork
  `git remote add YOURNAME https://github.com/YOURNAME/go-ethereum`
Now you should have a remote pointing to the `origin` repo (geth-sharding) and to your forked, go-ethereum repo on Github. To commit changes and start a Pull Request, our workflow is as follows:
- Create a new branch with a clear feature name such as `git checkout -b collations-pool`
- Issue changes with clear commit messages
- Push to your remote `git push YOURNAME collations-pool`
- Go to the [geth-sharding](https://github.com/prysmaticlabs/geth-sharding) repository on Github and start a PR comparing `geth-sharding:master` with `go-ethereum:collations-pool` (your fork on your profile).
- Add a clear PR title along with a description of what this PR encompasses, when it can be closed, and what you are currently working on. Github markdown checklists work great for this.
# Acknowledgements
A special thanks to the entire [Prysmatic Labs](https://gitter.im/prysmaticlabs/geth-sharding) team for helping put this together and to Ethereum Research (Hsiao-Wei Wang) for the help and guidance in our approach.
A special thanks to the entire [Prysmatic Labs](https://gitter.im/prysmaticlabs/geth-sharding) team for helping put this together and to Ethereum Research (Hsiao-Wei Wang, Vitalik, Justin Drake) for the help and guidance in our approach.
# References
@ -477,34 +225,16 @@ A special thanks for entire [Prysmatic Labs](https://gitter.im/prysmaticlabs/get
[Sharding Reference Spec](https://github.com/ethereum/sharding/blob/develop/docs/doc.md)
[Ethereum Sharding and Finality - Hsiao-Wei Wang](https://medium.com/@icebearhww/ethereum-sharding-and-finality-65248951f649)
[Data Availability and Erasure Coding](https://github.com/ethereum/research/wiki/A-note-on-data-availability-and-erasure-coding)
[Proof of Visibility for Data Availability](https://ethresear.ch/t/proof-of-visibility-for-data-availability/1073)
[Enforcing Windback and Proof of Custody](https://ethresear.ch/t/enforcing-windback-validity-and-availability-and-a-proof-of-custody/949)
[Fork-Free Sharding](https://ethresear.ch/t/fork-free-sharding/1058)
[Delayed State Execution](https://ethresear.ch/t/delayed-state-execution-finality-and-cross-chain-operations/987)
[State Execution Scalability and Cost Under DDoS Attacks](https://ethresear.ch/t/state-execution-scalability-and-cost-under-dos-attacks/1048)
[Guaranteed Collation Subsidies](https://ethresear.ch/t/guaranteed-collation-subsidies/1016)
[Fork Choice Rule for Collation Proposals](https://ethresear.ch/t/fork-choice-rule-for-collation-proposal-mechanisms/922)
[Model for Phase 4 Tightly-Coupled Sharding](https://ethresear.ch/t/a-model-for-stage-4-tightly-coupled-sharding-plus-full-casper/1065)
[History, State, and Asynchronous Accumulators in the Stateless Model](https://ethresear.ch/t/history-state-and-asynchronous-accumulators-in-the-stateless-model/287)
[Torus Shaped Sharding Network](https://ethresear.ch/t/torus-shaped-sharding-network/1720)
[Data Availability Proof-friendly State Tree Transitions](https://ethresear.ch/t/data-availability-proof-friendly-state-tree-transitions/1453)
[General Framework of Overhead and Finality Time in Sharding](https://ethresear.ch/t/a-general-framework-of-overhead-and-finality-time-in-sharding-and-a-proposal/1638)
[Safe Notary Pool Size](https://ethresear.ch/t/safe-notary-pool-size/1728)
[Fixed Size Deposits and Rewards Penalties Quadleak](https://ethresear.ch/t/fixed-size-deposits-and-rewards-penalties-quad-leak/2073/7)