This change introduces additional processes to manage snapshot uploading
for E2 snapshots:
## erigon snapshots upload
The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.
It breaks the stage execution process after the senders stage and then uses the snapshot stage to write headers, bodies and (in the case of Polygon) bor spans and events to snapshot files. Because this process avoids execution, it runs significantly faster than a standard erigon configuration.
The uploader uses rclone to send seedable snapshot files (100K or 500K blocks) to a remote storage location specified in the rclone config file.
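As an illustration, a remote such as an S3 bucket could be declared in the rclone config file roughly as follows; the remote name, provider and credentials setup below are placeholders, not values required by erigon:

```
# ~/.config/rclone/rclone.conf -- illustrative remote definition
[snapshots-remote]
type = s3
provider = AWS
env_auth = true
region = us-east-1
```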
The **uploader** is configured to minimize disk usage by doing the
following:
* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots
In addition to this, it has the following performance-related features:
* Maximizes the workers allocated to snapshot processing to improve throughput
* Can be started from scratch by downloading the latest snapshots from the remote location to seed processing
## snapshots command
This is a standalone command for managing remote snapshots. It has the following sub-commands:
* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
* Testing Beacon API
* Fixed sentinel code (a little bit)
* Fixed sentinel tests
* Added historical state support
* Fixed state-related endpoints (I was drunk when writing them)
* Reconstruct the previous epoch without looking at the DB: no hindrance to performance -> removed 15 GB of stored data
* Store inactivity scores and slashings in MDBX and do not store diffs for them (they are tiny: 700/400 bytes)
* Reduced the dump interval from every 2048 slots to every 1024 -> added 5 GB (maybe we should lower it to 768)
* Parallel processing of shuffled sets, 2x performance boost in reading
participation.
* Store balance diffs in a B-tree diff manner (a simplified sketch follows this list); see: https://github.com/ledgerwatch/erigon-documents/blob/master/caplin/design/data-model.md#uint64listuint64vector
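For intuition only, here is a minimal sketch of diffing a balances list per index against the previous dump and replaying the diff back. The real Caplin format described in the linked document chunks the data B-tree style, so the types and names below are assumptions:

```
package diffs

// Diff records which validator indices changed between two balance dumps
// and their new values. This is a simplified dense-delta sketch, not the
// actual B-tree-chunked Caplin format.
type Diff struct {
	Indices []uint64
	Values  []uint64
}

// ComputeDiff compares two dumps; validators appended in `next` show up as
// changed entries too.
func ComputeDiff(prev, next []uint64) Diff {
	var d Diff
	for i, v := range next {
		if i >= len(prev) || prev[i] != v {
			d.Indices = append(d.Indices, uint64(i))
			d.Values = append(d.Values, v)
		}
	}
	return d
}

// Apply reconstructs the next dump from the previous one plus a diff.
func Apply(prev []uint64, d Diff) []uint64 {
	out := append([]uint64(nil), prev...)
	for j, idx := range d.Indices {
		for uint64(len(out)) <= idx {
			out = append(out, 0)
		}
		out[idx] = d.Values[j]
	}
	return out
}
```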
"whitelisting" mechanism (list of files - stored in DB) - which
protecting us from downloading new files after upgrade/downgrade was
broken. And seems it became over-complicated with time.
I replacing it by 1 persistent flag inside downloader:
"prohibit_new_downloads.lock"
Erigon will turn downloader into this mode after
downloading/verification of first snapshots.
```
//Corner cases:
// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
// - After "download once" - Erigon will produce and seed new files
```
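A minimal sketch of the lock-file idea is shown below; it is not Erigon's actual downloader code, and the function names and directory parameter are illustrative only:

```
package downloader

import (
	"os"
	"path/filepath"
)

// prohibitNewDownloadsFileName is the persistent marker file; because it
// lives on disk it survives restarts, upgrades and downgrades.
const prohibitNewDownloadsFileName = "prohibit_new_downloads.lock"

// ProhibitNewDownloads creates the marker file after the first
// download/verification has completed.
func ProhibitNewDownloads(snapDir string) error {
	f, err := os.Create(filepath.Join(snapDir, prohibitNewDownloadsFileName))
	if err != nil {
		return err
	}
	return f.Close()
}

// NewDownloadsProhibited reports whether the downloader is already in
// "download once" mode.
func NewDownloadsProhibited(snapDir string) bool {
	_, err := os.Stat(filepath.Join(snapDir, prohibitNewDownloadsFileName))
	return err == nil
}
```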
------
`downloader --seedbox` never enters "prohibit new downloads" mode
* Most of the files changed in this PR are additional, slightly more complicated unit tests.
* Fixed Eth1DataVotes not inheriting genesis
* Fixed attestation simulation using the wrong slot when reconstructing participation
* Fixed Copy() operation on BeaconState on Eth1DataVotes
* Used correct ListSSZ type for Eth1DataVotes and HistoricalSummaries
* Fixed wrong []uint64 deltas on empty slots
What does this PR do:
* Optional Backfilling and Caplin Archive Node
* Created an antiquary for historical states
* Fixed chain gaps related to the head of the chain and the anchor of the chain
* Added a basic reader object to read the historical state (a simplified sketch follows)
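For illustration, reading a historical state could combine the nearest full dump with the per-slot diffs described above. The interface, names and the 1024-slot dump interval below are assumptions made for this sketch, not Erigon's actual API:

```
package historical

// dumpInterval mirrors the "dump every 1024 slots" scheme mentioned above;
// the real value and on-disk layout are implementation details.
const dumpInterval = 1024

// Store is a hypothetical accessor over the antiquary's persisted data.
type Store interface {
	// ReadDump returns the full balances dump taken at a dump slot.
	ReadDump(slot uint64) ([]uint64, error)
	// ReadDiff returns the indices and new values that changed at a slot.
	ReadDiff(slot uint64) (indices []uint64, values []uint64, err error)
}

// BalancesAt reconstructs validator balances at an arbitrary slot by loading
// the preceding dump and replaying diffs forward up to the requested slot.
func BalancesAt(s Store, slot uint64) ([]uint64, error) {
	dumpSlot := slot - slot%dumpInterval
	balances, err := s.ReadDump(dumpSlot)
	if err != nil {
		return nil, err
	}
	for cur := dumpSlot + 1; cur <= slot; cur++ {
		indices, values, err := s.ReadDiff(cur)
		if err != nil {
			return nil, err
		}
		for j, idx := range indices {
			for uint64(len(balances)) <= idx {
				balances = append(balances, 0)
			}
			balances[idx] = values[j]
		}
	}
	return balances, nil
}
```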