ledgerwatch 5ea590c18e
State cache switching writes to reads during commit (#1368)
* State cache init

* More code

* Fix lint

* More tests

* More tests

* More tests

* Fix test

* Transformations

* remove writeQueue, before fixing the tests

* Fix tests

* Add more tests, incarnation to the code items

* Fix lint

* Fix lint

* Remove shards prototype, add incarnation to the state reader code

* Clean up and replace cache in call_traces stage

* fix flaky test

* Save changes

* Readers to use addrHash, writes - addresses

* Fix lint

* Fix lint

* More accurate tracking of size

* Optimise for smaller write batches

* Attempt to integrate state cache into Execution stage

* cacheSize to default flags

* Print correct cache sizes and batch sizes

* cacheSize in the integration

* Fix tests

* Fix lint

* Remove print

* Fix exec stage

* Fix test

* Refresh sequence on write

* No double increment

* heap.Remove

* Try to fix alignment

* Refactoring, adding hashItems

* More changes

* Fix compile errors

* Fix lint

* Wrapping cached reader

* Wrap writer into cached writer

* Turn state cache off by default

* Fix plain state writer

* Fix for code/storage mixup

* Fix tests

* Fix clique test

* Better fix for the tests

* Add test and fix some more

* Fix compile error|

* More functions

* Fixes

* Fix for the tests

* sepatate DeletedFlag and AbsentFlag

* Minor fixes

* Test refactoring

* More changes

* Fix some tests

* More test fixes

* More test fixes

* Fix lint

* Move blockchain_test to be able to use stagedsync

* More fixes

* Fixes and cleanup

* Fix tests in turbo/stages

* Fix lint

* Fix lint

* Intemediate

* Fix tests

* Intemediate

* More fixes

* Compilation fixes

* More fixes

* Fix compile errors

* More test fixes

* More fixes

* More test fixes

* Fix compile error

* Fixes

* Fix

* Fix

* More fixes

* Fixes

* More fixes and cleanup

* Further fix

* Check gas used and bloom with header

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2020-12-08 09:44:29 +00:00
.buildkite Nightly tests (#444) 2020-04-12 19:37:15 +01:00
.circleci Remove z3 and semantics (#1314) 2020-10-28 17:52:00 +00:00
.github .github: remove whisper from CODEOWNERS (#21527) 2020-09-11 16:29:50 +02:00
.golangci adopt --metrics.addr flag in integration (#889) 2020-08-11 06:38:34 +07:00
accounts State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
cmd State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
common State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
consensus State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
console/prompt Granular rpc control (Allow list for RPC daemon) (#1341) 2020-11-10 10:08:42 +01:00
contracts/checkpointoracle accounts/abi: ABI explicit difference between Unpack and UnpackIntoInterface (#21091) 2020-10-26 17:16:00 +01:00
core State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
crypto Fix parallel recovery senders (#962) 2020-08-22 12:08:47 +02:00
debug-web-ui Docker compose (#841) 2020-08-01 09:39:04 +02:00
design auto-format code by prettier (similar to gofmt) (#405) 2020-03-25 12:45:21 +07:00
docs Update docs testing (#1385) 2020-12-04 10:24:49 +00:00
eth State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
ethclient State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
ethdb subscription_doesnt_preserve_shutdown (#1391) 2020-12-04 21:17:13 +00:00
ethstats Rpcdaemon as lib (#940) 2020-08-19 12:46:20 +01:00
event event, whisper/whisperv6: use defer where possible (#20940) 2020-05-20 15:26:22 +03:00
graphql graphql: add support for retrieving the chain id (#21451) 2020-08-29 13:28:52 +02:00
interfaces Remote snapshot downloader (#1343) 2020-11-13 16:16:47 +00:00
internal post-rebase fixes 2020-12-03 18:59:17 +01:00
log Rpcdaemon as lib (#940) 2020-08-19 12:46:20 +01:00
metrics Turning off failing test (#1226) 2020-10-11 19:39:01 +01:00
migrations History bitmap 64 (#1374) 2020-12-04 21:16:51 +00:00
miner State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
node need handle empty db path special way in couple places - with meaning "in-mem db" (#1389) 2020-12-04 10:27:59 +00:00
p2p p2p/simulations/adapters/exec: fix some issues (#21801) 2020-12-03 18:59:17 +01:00
params params: update yolov2 bootnode with elastic ip 2020-12-03 18:59:17 +01:00
rlp core/types, rlp: optimize derivesha (#21728) 2020-12-03 18:59:17 +01:00
rpc Granular rpc control (Allow list for RPC daemon) (#1341) 2020-11-10 10:08:42 +01:00
signer geth-1.9.23: post-rebase fixups 2020-10-26 17:16:00 +01:00
tests State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
turbo State cache switching writes to reads during commit (#1368) 2020-12-08 09:44:29 +00:00
visual Continue comparison of genesis block with geth, expand long values (#223) 2019-12-06 12:03:12 +00:00
.dockerignore dont send .git to Docker (#1319) 2020-10-29 16:39:05 +00:00
.gitattributes .gitattributes: enable solidity highlighting on github (#16425) 2018-04-03 15:21:24 +02:00
.gitignore post-rebase fixes 2020-12-03 18:59:17 +01:00
.gitmodules Remove z3 and semantics (#1314) 2020-10-28 17:52:00 +00:00
.golangci.yml adopt --metrics.addr flag in integration (#889) 2020-08-11 06:38:34 +07:00
.mailmap all: update license information (#16089) 2018-02-14 13:49:11 +01:00
.readthedocs.yml first sphinx doc portion (#1144) 2020-09-27 20:40:48 +01:00
.travis.yml params: release Geth v1.9.24 with Go 1.15.5 (#21842) 2020-12-03 18:59:17 +01:00
appveyor.yml params: release Geth v1.9.24 with Go 1.15.5 (#21842) 2020-12-03 18:59:17 +01:00
AUTHORS build: deduplicate same authors with different casing 2019-07-22 12:31:11 +03:00
circle.yml circleci: enable docker based hive testing 2016-07-15 16:07:34 +03:00
COPYING COYPING: restore the full text text of GPL (#21568) 2020-10-06 14:12:09 +02:00
COPYING.LESSER all: update license information 2015-07-07 14:12:44 +02:00
docker-compose.yml Mdbx v0.9.2 (#1373) 2020-11-28 14:26:28 +00:00
Dockerfile fix git commit when using make docker (#1323) 2020-10-30 08:43:24 +00:00
go.mod History bitmap 64 (#1374) 2020-12-04 21:16:51 +00:00
go.sum History bitmap 64 (#1374) 2020-12-04 21:16:51 +00:00
interfaces.go [GC] uint256 rather than big.Int in Transaction (#614) 2020-06-04 08:43:08 +01:00
Makefile Mdbx v0.9.2 (#1373) 2020-11-28 14:26:28 +00:00
nightly.sh Nightly tests (#444) 2020-04-12 19:37:15 +01:00
oss-fuzz.sh scripts: create oss-fuzz script in go-ethereum (#21808) 2020-12-03 18:59:17 +01:00
README.geth.md eth_syncing (#991) 2020-08-29 08:24:50 +01:00
README.md Updating RPC tests in Postman (#1340) 2020-11-09 09:52:18 +01:00
RELEASE_INSTRUCTIONS.md Jumpdest skip optimisation (#851) 2020-08-01 17:56:57 +01:00
SECURITY.md SECURITY.md: create security policy (#19666) 2019-06-06 14:40:52 +02:00
UPGRADE_INFO.md prepare for merging 2020-02-27 17:20:35 +03:00

Turbo-Geth

Turbo-Geth is a fork of Go-Ethereum with a focus on performance.

NB! In-depth links are marked by the microscope sign (🔬)

Disclaimer: this software is currently a tech preview. We will do our best to keep it stable and avoid breaking changes, but we don't guarantee anything. Things can and will break.

The current version is based on Go-Ethereum 1.9.15.

System Requirements

About 830 GB of free disk storage (630 GB state storage, 200 GB temp files)

16 or 32 GB of RAM is recommended

🔬 More info on disk storage is here.

Usage

> git clone --recurse-submodules -j8 https://github.com/ledgerwatch/turbo-geth.git && cd turbo-geth
> make tg
> ./build/bin/tg

Key features

🔬 See a more detailed overview of functionality and current limitations. It is updated on a recurring basis.

More Efficient State Storage

Flat KV storage. Turbo-Geth uses a key-value database and stores accounts and storage in a simple way.

🔬 See our detailed DB walkthrough here.

Preprocessing. For some operations, turbo-geth uses temporary files to preprocess data before inserting it into the main DB. That reduces write amplification, and DB inserts are sometimes orders of magnitude quicker.

🔬 See our detailed ETL explanation here.
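The preprocessing idea can be illustrated with a minimal sketch: collect entries in arbitrary order, sort them by key, and only then write them out, so that the inserts arrive in key order rather than at random positions. This is not turbo-geth's ETL code; the `entry` type and `sortForLoad` function are hypothetical names, and an in-memory slice stands in for the temporary files.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// entry is a hypothetical key/value pair destined for the main DB.
type entry struct {
	key, value []byte
}

// sortForLoad orders entries by key so that the subsequent inserts
// happen in key order instead of at random positions.
func sortForLoad(entries []entry) {
	sort.SliceStable(entries, func(i, j int) bool {
		return bytes.Compare(entries[i].key, entries[j].key) < 0
	})
}

func main() {
	// Entries arrive in the order they were produced, not in key order.
	batch := []entry{
		{key: []byte("c"), value: []byte("3")},
		{key: []byte("a"), value: []byte("1")},
		{key: []byte("b"), value: []byte("2")},
	}
	sortForLoad(batch)
	// A real loader would now insert into the DB; here we only print.
	for _, e := range batch {
		fmt.Printf("%s=%s\n", e.key, e.value)
	}
}
```

Sorting before loading is a general technique for key-value stores; the linked ETL explanation describes what turbo-geth actually does.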

Plain state.

Single accounts/state trie. Turbo-Geth uses a single Merkle trie for both accounts and the storage.

Faster Initial Sync

Turbo-Geth uses a rearchitected full sync algorithm from Go-Ethereum that is split into "stages".

🔬 See a more detailed explanation in the Staged Sync Readme

It uses the same network primitives and is compatible with regular go-ethereum nodes that are using full sync. You do not need any special sync capabilities for turbo-geth to sync.

When reimagining the full sync, we focused on batching data together and minimizing DB overwrites. That makes it possible to sync Ethereum mainnet in under 2 days if you have a fast enough network connection and an SSD drive.

Examples of stages are:

  • Downloading headers;

  • Downloading block bodies;

  • Executing blocks;

  • Validating root hashes and building intermediate hashes for the state Merkle trie;

  • And more...

JSON-RPC daemon

In turbo-geth, RPC calls are extracted out of the main binary into a separate daemon. This daemon can use either a local or a remote DB. That means that the RPC daemon doesn't have to run on the same machine as the main turbo-geth binary, or it can run from a snapshot of a database for read-only calls.

🔬 See RPC-Daemon docs

For local DB

> make rpcdaemon
> ./build/bin/rpcdaemon --chaindata ~/Library/TurboGeth/tg/chaindata --http.api=eth,debug,net

For remote DB

Run turbo-geth in one terminal window

> ./build/bin/tg --private.api.addr=localhost:9090

Run RPC daemon

> ./build/bin/rpcdaemon --private.api.addr=localhost:9090

Supported JSON-RPC calls (eth, debug, net, web3):

For details on the implementation status of each command, see this table.

Grafana dashboard:

docker-compose up prometheus grafana, detailed docs.

Or run all components with docker-compose

The following commands start turbo-geth on port 30303, rpcdaemon on 8545, prometheus on 9090, and grafana on 3000:

docker-compose build
XDG_DATA_HOME=/preferred/data/folder docker-compose up

Getting in touch

Turbo-Geth Discord Server

The main discussions are happening on our Discord server. To get an invite, send an email to tg [at] torquem.ch with your name, occupation, a brief explanation of why you want to join the Discord, and how you heard about Turbo-Geth.

Reporting security issues/concerns

Send an email to security [at] torquem.ch.

Team

Core contributors:

Thanks to:

  • All contributors of Turbo-Geth

  • All contributors of Go-Ethereum

  • Our special respect and gratitude go to the core team of Go-Ethereum. Keep up the great job!

Happy testing! 🥤

Known issues

1. htop shows incorrect memory usage

TurboGeth's internal DB (LMDB) uses memory-mapped files: the OS, rather than the application, manages all read, write, and cache operations (Linux, Windows).

In the RES column, htop shows the memory of "the app + the OS page cache held for that app". This is not informative: if htop says the app is using 90% of memory, you can still run 3 more instances of the app on the same machine, because most of that 90% is OS page cache.
The OS automatically frees this cache whenever it needs memory. A smaller page cache may not impact the performance of TurboGeth at all.

The following tools show the correct memory usage of TurboGeth:

  • vmmap -summary PID | grep -i "Physical footprint". Without grep you can see details - section MALLOC ZONE column Resident Size shows App memory usage, section REGION TYPE column Resident Size shows OS pages cache size.
  • Prometheus dashboard shows memory of Go app without OS pages cache (make prometheus, open in browser localhost:3000, credentials admin/admin)
  • cat /proc/<PID>/smaps

TurboGeth uses ~4 GB of RAM during genesis sync and < 1 GB during normal work. The OS page cache can use an unlimited amount of memory.

Warning: Multiple instances of TG on the same machine will access the disk concurrently, which impacts performance; one of the main TG optimisations is to reduce random disk access. The "Blocks Execution stage" still does many random reads, which is why it is the slowest stage. We do not recommend running multiple genesis syncs on the same disk. Once genesis sync has completed, it is fine to run multiple TG instances on the same disk.