Staged Sync

Staged Sync is a version of Go-Ethereum's Full Sync that was rearchitected for better performance.

It is I/O intensive, and even though our goal is to be able to sync a node on an HDD, we still recommend using fast SSDs.

Staged Sync, as its name suggests, consists of 10 stages that are executed in order, one after another.

How The Sync Works

For each peer, Erigon learns what the peer's HEAD block is and executes each stage in order for the missing blocks between the local HEAD block and the peer's HEAD block.

The first stage (downloading headers) sets the local HEAD block.

Stages run in order, and stage N does not stop until it has reached the local HEAD.

This means that in the ideal scenario (no network interruptions, the app isn't restarted, etc.), each stage is executed exactly once during the full initial sync.

After the last stage is finished, the process starts from the beginning, by looking for the new headers to download.

If the app is restarted in between stages, it restarts from the first stage.

If the app is restarted in the middle of the stage execution, it restarts from that stage, giving it the opportunity to complete.
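The cycle described above can be sketched roughly as follows. This is a minimal illustration rather than Erigon's actual code: the `stage` type, `runCycle`, and the in-memory `progress` map are hypothetical names, and the target `head` is passed in directly instead of being discovered by the headers stage.

```go
package main

import "fmt"

// stage is a hypothetical, simplified stand-in for one sync stage.
type stage struct {
	name string
	// exec moves the stage forward from block `from` up to block `to`.
	exec func(from, to uint64) error
}

// runCycle runs every stage in order. Each stage resumes from its own
// recorded progress and is asked to catch up to `head` before the next
// stage starts.
func runCycle(stages []stage, progress map[string]uint64, head uint64) error {
	for _, s := range stages {
		from := progress[s.name]
		if from >= head {
			continue // this stage is already caught up
		}
		if err := s.exec(from, head); err != nil {
			return fmt.Errorf("stage %s: %w", s.name, err)
		}
		// Kept in memory here; assumed to be persisted by a real node so
		// that a restart can resume from the right stage.
		progress[s.name] = head
	}
	return nil
}

func main() {
	var log []string
	mk := func(name string) stage {
		return stage{name: name, exec: func(from, to uint64) error {
			log = append(log, fmt.Sprintf("%s %d->%d", name, from, to))
			return nil
		}}
	}
	stages := []stage{mk("headers"), mk("bodies"), mk("execution")}
	progress := map[string]uint64{}
	if err := runCycle(stages, progress, 100); err != nil {
		panic(err)
	}
	fmt.Println(log)
}
```

Calling `runCycle` a second time with the same `head` does no work, which mirrors the "each stage is executed exactly once" behaviour of an uninterrupted initial sync.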

How long do the stages take?

Here is a pie chart showing the proportional time spent on each stage (it was taken from a full sync). It is only an estimate, but it gives an idea.

Reorgs / Unwinds

Sometimes the chain makes a reorg and we need to "undo" some parts of our sync.

Unwinding proceeds backward, from the last stage to the first, with one caveat: the tx pool is updated only after execution has been unwound, so that we know the new nonces.

Here is an example of the order in which stages are unwound (unwinding proceeds from right to left).

state.unwindOrder = []*Stage{
	// Unwinding of tx pool (reinjecting transactions into the pool needs to happen after unwinding execution)
	stages[0], stages[1], stages[2], stages[9], stages[3], stages[4], stages[5], stages[6], stages[7], stages[8],
}
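To make the right-to-left reading concrete, here is a small sketch that walks the same index layout backward. `unwindSequence` is a hypothetical helper, and treating index 9 as the tx pool stage is an assumption based on the comment in the snippet above.

```go
package main

import "fmt"

// unwindSequence returns stage indexes in the order they are actually
// unwound when unwindOrder is walked from right to left.
func unwindSequence(unwindOrder []int) []int {
	seq := make([]int, 0, len(unwindOrder))
	for i := len(unwindOrder) - 1; i >= 0; i-- {
		seq = append(seq, unwindOrder[i])
	}
	return seq
}

func main() {
	// Same layout as the snippet above; index 9 is assumed to be the tx pool.
	unwindOrder := []int{0, 1, 2, 9, 3, 4, 5, 6, 7, 8}
	fmt.Println(unwindSequence(unwindOrder))
}
```

This prints `[8 7 6 5 4 3 9 2 1 0]`: index 9 is unwound only after indexes 8 down to 3, and before 2, 1, and 0.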

Preprocessing with ETL

Some stages use our ETL framework to sort data by keys before inserting it into the database.

This significantly reduces db write amplification.

So, when we are generating indexes or hashed state, we do a multi-step process.

  1. We write the processed data into a couple of temp files in your data directory;
  2. We then use a heap to insert data from the temp files into the database, in the order that minimizes db write amplification.

This optimization sometimes leads to dramatic (orders of magnitude) write speed improvements.
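The two steps can be sketched as below. This is only an illustration of the sort-then-heap-merge idea, not the ETL framework's API: `extract`, `load`, `entry`, and `cursor` are hypothetical names, and in-memory slices stand in for the temp files.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

type entry struct{ key, value string }

// cursor points at the next unread entry of one sorted chunk.
type cursor struct {
	chunk []entry
	pos   int
}

// mergeHeap is a min-heap of cursors ordered by their current key.
type mergeHeap []*cursor

func (h mergeHeap) Len() int { return len(h) }
func (h mergeHeap) Less(i, j int) bool {
	return h[i].chunk[h[i].pos].key < h[j].chunk[h[j].pos].key
}
func (h mergeHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *mergeHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() interface{} {
	old := *h
	c := old[len(old)-1]
	*h = old[:len(old)-1]
	return c
}

// extract splits the input into chunks of at most chunkSize entries and
// sorts each chunk by key (step 1; each chunk plays the role of a temp file).
func extract(input []entry, chunkSize int) [][]entry {
	if chunkSize < 1 {
		chunkSize = 1
	}
	var chunks [][]entry
	for start := 0; start < len(input); start += chunkSize {
		end := start + chunkSize
		if end > len(input) {
			end = len(input)
		}
		chunk := append([]entry(nil), input[start:end]...)
		sort.Slice(chunk, func(i, j int) bool { return chunk[i].key < chunk[j].key })
		chunks = append(chunks, chunk)
	}
	return chunks
}

// load merges the sorted chunks with a heap and calls insert in ascending
// key order (step 2; insert stands in for the database write).
func load(chunks [][]entry, insert func(entry)) {
	h := &mergeHeap{}
	for _, c := range chunks {
		if len(c) > 0 {
			heap.Push(h, &cursor{chunk: c})
		}
	}
	for h.Len() > 0 {
		c := (*h)[0] // cursor with the smallest current key
		insert(c.chunk[c.pos])
		c.pos++
		if c.pos == len(c.chunk) {
			heap.Pop(h) // chunk exhausted
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
	}
}

func main() {
	input := []entry{{"d", "4"}, {"a", "1"}, {"c", "3"}, {"b", "2"}, {"e", "5"}}
	load(extract(input, 2), func(e entry) { fmt.Println(e.key, e.value) })
}
```

Because the database only ever sees keys in ascending order, writes land sequentially rather than scattered across the key space.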

Stages (for the up-to-date list see stagedsync.go and stagebuilder.go):

Each stage consists of 2 functions: ExecFunc, which progresses the stage forward, and UnwindFunc, which unwinds the stage backwards.

Some of the stages can theoretically work offline, though this isn't implemented in the current version.
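As a rough sketch of that pairing, the example below defines a stage whose forward and backward functions are exact inverses. The names `ExecFunc` and `UnwindFunc` come from the text above, but the signatures, the `Stage` fields, and `newMarkingStage` are assumptions made for illustration only.

```go
package main

import "fmt"

// Hypothetical signatures: both take the current progress and a target
// block, and return the new progress.
type ExecFunc func(progress, target uint64) (uint64, error)
type UnwindFunc func(progress, unwindPoint uint64) (uint64, error)

type Stage struct {
	ID         string
	ExecFunc   ExecFunc
	UnwindFunc UnwindFunc
}

// newMarkingStage builds a toy stage that marks blocks as processed when
// moving forward and removes the marks again when unwinding.
func newMarkingStage(id string, processed map[uint64]bool) Stage {
	return Stage{
		ID: id,
		ExecFunc: func(progress, target uint64) (uint64, error) {
			if target <= progress {
				return progress, nil // nothing new to process
			}
			for b := progress + 1; b <= target; b++ {
				processed[b] = true
			}
			return target, nil
		},
		UnwindFunc: func(progress, unwindPoint uint64) (uint64, error) {
			if unwindPoint >= progress {
				return progress, nil // nothing to undo
			}
			for b := progress; b > unwindPoint; b-- {
				delete(processed, b)
			}
			return unwindPoint, nil
		},
	}
}

func main() {
	processed := map[uint64]bool{}
	s := newMarkingStage("example", processed)
	p, _ := s.ExecFunc(0, 5)
	p, _ = s.UnwindFunc(p, 3)
	fmt.Println(p, len(processed))
}
```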

Stage 1: Download Headers Stage

During this stage we download all the headers between the local HEAD and our peer's head.

This stage is CPU intensive and can benefit from a multicore processor due to verifying PoW of the headers.

Most unwinds are initiated in this stage, due to chain reorgs.

This stage advances the local HEAD pointer.

Stage 2: Block Hashes

Creates an index of blockHash -> blockNumber extracted from the headers for faster lookups and making the sync friendlier for HDDs.
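A minimal sketch of such an index, using an in-memory map where Erigon uses a database table. The `header` type and `buildBlockHashIndex` are hypothetical, and the hash is stored as a field here rather than computed from the header contents.

```go
package main

import "fmt"

type hash [32]byte

// header is a simplified stand-in; a real header carries many more fields.
type header struct {
	Number uint64
	Hash   hash
}

// buildBlockHashIndex maps each block hash to its block number.
func buildBlockHashIndex(headers []header) map[hash]uint64 {
	index := make(map[hash]uint64, len(headers))
	for _, h := range headers {
		index[h.Hash] = h.Number
	}
	return index
}

func main() {
	headers := []header{
		{Number: 100, Hash: hash{0xaa}},
		{Number: 101, Hash: hash{0xbb}},
	}
	index := buildBlockHashIndex(headers)
	fmt.Println(index[hash{0xbb}])
}
```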

Stage 4: Download Block Bodies Stage

In this stage, we download the bodies for the block headers that we have already downloaded.

This is the most network-intensive stage: the vast majority of the data is downloaded here.

Stage 6: Recover Senders Stage

This stage recovers and stores senders for each transaction in each downloaded block.

This stage is also CPU intensive and benefits from multi-core CPUs.

This stage doesn't use any network connection.
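The reason this stage scales with core count is that each transaction's sender can be recovered independently. The sketch below shows one way to spread that work across goroutines. It is an assumption-laden illustration: `recoverSenders` and the `tx` type are hypothetical, and the signature recovery itself is passed in as a function because the real cryptography is outside the scope of this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// tx is a placeholder transaction; sig stands in for its signature data.
type tx struct{ sig string }

// recoverSenders runs recoverSender for every transaction on a pool of
// workers and returns the senders in the same order as txs.
// If workers < 1, one worker per CPU is used.
func recoverSenders(txs []tx, workers int, recoverSender func(tx) string) []string {
	if workers < 1 {
		workers = runtime.NumCPU()
	}
	senders := make([]string, len(txs))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so the
				// writes to senders never overlap.
				senders[i] = recoverSender(txs[i])
			}
		}()
	}
	for i := range txs {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return senders
}

func main() {
	txs := []tx{{"s1"}, {"s2"}, {"s3"}}
	// Fake recovery function, used only to make the example runnable.
	fake := func(t tx) string { return "sender-of-" + t.sig }
	fmt.Println(recoverSenders(txs, 0, fake))
}
```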

Stage 7: Execute Blocks Stage

During this stage, we execute block-by-block everything that we downloaded before.

One important point: we don't check root hashes during this execution, and we don't even build a Merkle trie here.

This stage is single threaded.

This stage doesn't use a network connection.

This stage is disk intensive.

This stage can spawn unwinds if the block execution fails.

Stage 8: Transpile marked VM contracts to TEVM

[TODO]

Stage 10: Generate Hashed State Stage

During execution, Erigon uses Plain State storage.

Plain State: Instead of the normal layout (we call it "Hashed State"), where accounts and storage items are addressed as keccak256(address), in the plain state they are addressed by the address itself.

However, to make sure that some APIs work and to stay compatible with other clients, we generate the Hashed State as well.

If the hashed state is not empty, we look at the History ChangeSets and update only the items that were changed.

This stage doesn't use a network connection.
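The full versus incremental behaviour can be sketched as follows. Note the assumptions: `promote` and `hashKey` are hypothetical names, state is modelled as simple maps, and SHA-256 is used purely as a stand-in because keccak256 is not available in the Go standard library. A real implementation would use keccak256.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type hashedKey [32]byte

// hashKey is a stand-in for keccak256(address). SHA-256 is NOT what
// Ethereum clients use; it only keeps this sketch self-contained.
func hashKey(address string) hashedKey {
	return sha256.Sum256([]byte(address))
}

// promote brings the hashed state in line with the plain state and returns
// the number of addresses it touched. If hashed is empty, everything is
// converted; otherwise only the addresses listed in changed are updated.
func promote(plain map[string]uint64, hashed map[hashedKey]uint64, changed []string) int {
	if len(hashed) == 0 {
		for addr, v := range plain {
			hashed[hashKey(addr)] = v
		}
		return len(plain)
	}
	for _, addr := range changed {
		if v, ok := plain[addr]; ok {
			hashed[hashKey(addr)] = v
		} else {
			delete(hashed, hashKey(addr)) // item no longer exists in plain state
		}
	}
	return len(changed)
}

func main() {
	plain := map[string]uint64{"0xaaa": 1, "0xbbb": 2, "0xccc": 3}
	hashed := map[hashedKey]uint64{}
	fmt.Println("first run touched:", promote(plain, hashed, nil))

	plain["0xbbb"] = 20
	fmt.Println("second run touched:", promote(plain, hashed, []string{"0xbbb"}))
}
```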

Stage 11: Compute State Root Stage

This stage builds the Merkle trie and checks the root hash for the current state.

It also builds Intermediate Hashes along the way and stores them into the database.

If no intermediate hashes were stored before (which can happen during the first initial sync), it builds the full Merkle trie and its root hash.

If there are intermediate hashes in the database, it uses the block history to figure out which ones are outdated and which ones are still up to date. Then it builds a partial Merkle trie, reusing the up-to-date hashes and rebuilding only the outdated ones.

If the root hash doesn't match, it initiates an unwind one block backwards.

This stage doesn't use a network connection.
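The idea of reusing stored hashes can be illustrated with a deliberately simplified model. This is not a Merkle Patricia trie: keys are grouped into buckets by their first byte, each bucket's hash plays the role of an intermediate hash, and SHA-256 stands in for keccak256. All names (`stateRoot`, `hashBucket`, `digest`) are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

type digest [32]byte

// hashBucket hashes every key/value pair whose key starts with byte b,
// in sorted key order. ok is false when the bucket is empty.
func hashBucket(state map[string]string, b byte) (d digest, ok bool) {
	var keys []string
	for k := range state {
		if k != "" && k[0] == b {
			keys = append(keys, k)
		}
	}
	if len(keys) == 0 {
		return d, false
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		fmt.Fprintf(h, "%q=%q;", k, state[k])
	}
	copy(d[:], h.Sum(nil))
	return d, true
}

// stateRoot returns the root and the number of buckets it had to rehash.
// With an empty cache every bucket is hashed; otherwise only the buckets
// that contain a changed key are rehashed and the rest are reused.
func stateRoot(state map[string]string, cache map[byte]digest, changedKeys []string) (digest, int) {
	dirty := map[byte]bool{}
	if len(cache) == 0 {
		for k := range state {
			if k != "" {
				dirty[k[0]] = true
			}
		}
	} else {
		for _, k := range changedKeys {
			if k != "" {
				dirty[k[0]] = true
			}
		}
	}
	for b := range dirty {
		if d, ok := hashBucket(state, b); ok {
			cache[b] = d
		} else {
			delete(cache, b)
		}
	}
	// Combine the bucket hashes in a fixed order to get the root.
	buckets := make([]int, 0, len(cache))
	for b := range cache {
		buckets = append(buckets, int(b))
	}
	sort.Ints(buckets)
	h := sha256.New()
	for _, b := range buckets {
		d := cache[byte(b)]
		h.Write([]byte{byte(b)})
		h.Write(d[:])
	}
	var root digest
	copy(root[:], h.Sum(nil))
	return root, len(dirty)
}

func main() {
	state := map[string]string{"a1": "x", "a2": "y", "b1": "z"}
	cache := map[byte]digest{}
	_, n := stateRoot(state, cache, nil)
	fmt.Println("buckets hashed on first run:", n)

	state["b1"] = "changed"
	_, n = stateRoot(state, cache, []string{"b1"})
	fmt.Println("buckets hashed on second run:", n)
}
```

The property worth checking in a model like this is that the incrementally computed root equals the root obtained by rebuilding everything from scratch.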

Stage 12: Generate call traces index

[TODO]

Stages 13, 14, 15, 16: Generate Indexes Stages

There are 4 indexes that are generated during sync.

They can be disabled, because not all APIs use them.

These stages do not use a network connection.

Account History Index

This index stores the mapping from the account address to the list of blocks where this account was changed in some way.

Storage History Index

This index stores the mapping from the storage item address to the list of blocks where this storage item was changed in some way.
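Both history indexes have the same shape: a key mapped to the sorted list of blocks in which it changed. A minimal sketch for the account case is shown below. `changeSet` and `buildHistoryIndex` are hypothetical names, and the per-block change lists are assumed to be available as input.

```go
package main

import (
	"fmt"
	"sort"
)

// changeSet lists the accounts that were modified in one block.
type changeSet struct {
	Block    uint64
	Accounts []string
}

// buildHistoryIndex maps each account to the ascending list of block
// numbers in which it changed, with each block listed at most once.
func buildHistoryIndex(changeSets []changeSet) map[string][]uint64 {
	sorted := append([]changeSet(nil), changeSets...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Block < sorted[j].Block })

	index := map[string][]uint64{}
	for _, cs := range sorted {
		for _, acc := range cs.Accounts {
			blocks := index[acc]
			if n := len(blocks); n > 0 && blocks[n-1] == cs.Block {
				continue // already recorded for this block
			}
			index[acc] = append(blocks, cs.Block)
		}
	}
	return index
}

func main() {
	index := buildHistoryIndex([]changeSet{
		{Block: 5, Accounts: []string{"alice", "bob"}},
		{Block: 9, Accounts: []string{"alice"}},
	})
	fmt.Println(index["alice"], index["bob"])
}
```

The storage history index would follow the same pattern with the storage item address as the key.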

Log Index

This index sets up a link from the [TODO] to [TODO].

Tx Lookup Index

This index sets up a link from the transaction hash to the block number.

Stage 17: Transaction Pool Stage

During this stage we start the transaction pool or update its state. For instance, we remove from the pool the transactions that are included in the blocks we have downloaded.

On unwinds, we add the transactions from the unwound blocks back to the pool.
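The forward and unwind behaviour of the pool can be sketched as two inverse operations. The `pool` type and its methods are hypothetical and far simpler than a real transaction pool, which also tracks nonces, gas prices, and validity.

```go
package main

import "fmt"

// pool is a toy transaction pool keyed by transaction hash.
type pool map[string]bool

// removeMined drops transactions that were included in a processed block.
func (p pool) removeMined(blockTxs []string) {
	for _, h := range blockTxs {
		delete(p, h)
	}
}

// reinject puts the transactions of an unwound block back into the pool.
func (p pool) reinject(blockTxs []string) {
	for _, h := range blockTxs {
		p[h] = true
	}
}

func main() {
	p := pool{"tx1": true, "tx2": true, "tx3": true}
	block := []string{"tx1", "tx2"}

	p.removeMined(block) // moving forward: the block included tx1 and tx2
	fmt.Println(len(p))

	p.reinject(block) // unwind: the block is no longer part of the chain
	fmt.Println(len(p))
}
```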

This stage doesn't use a network connection.

Stage 18: Finish

This stage sets the current block number that is then used by RPC calls, such as eth_blockNumber.