This change introduces additional processes to manage snapshot uploading
for E2 snapshots:
## erigon snapshots upload
The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.
It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send
uploaded headers, bodies and (in the case of polygon) bor spans and
events to snapshot files. Because
this process avoids execution in run signifigantly faster than a
standard erigon configuration.
The uploader uses rclone to send seedable (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.
The **uploader** is configured to minimize disk usage by doing the
following:
* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots
in addition to this it has the following performance related features:
* maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing
## snapshots command
Is a stand alone command for managing remote snapshots it has the
following sub commands
* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
* Chunked format -> blinded
* LZ4 -> ZSTD
* Implemented parent block root support for history download
* Rationale: Allows to optimize GC collection easily on state
reconstruction and it allows to read fast attestations in historical
states reader
Found the different log style for announced and broadcasted tx:
```
[INFO] [12-22|05:18:01.363] Local tx broadcasted txHash=ec6b1c87aafd7f8ead5794477be50bda696f2ce17271ad4f6022a756722fa0be to peer=10
[INFO] [12-22|05:18:01.363] local tx announced tx_hash=ec6b1c87aafd7f8ead5794477be50bda696f2ce17271ad4f6022a756722fa0be to peer=40 baseFee=1
```
adjust them to the same style
Signed-off-by: jsvisa <delweng@gmail.com>
"whitelisting" mechanism (list of files - stored in DB) - which
protecting us from downloading new files after upgrade/downgrade was
broken. And seems it became over-complicated with time.
I replacing it by 1 persistent flag inside downloader:
"prohibit_new_downloads.lock"
Erigon will turn downloader into this mode after
downloading/verification of first snapshots.
```
//Corner cases:
// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
// - After "download once" - Erigon will produce and seed new files
```
------
`downloader --seedbox` is never "prohibit new downloads"
What does this PR do:
* Optional Backfilling and Caplin Archive Node
* Create antiquary for historical states
* Fixed gaps of chain gap related to the Head of the chain and anchor of
the chain.
* Added basic reader object to Read the Historical state
eventsource is required for the validator api. this implements the
eventsource sink/server handler
the implementation is based off of this document:
https://html.spec.whatwg.org/multipage/server-sent-events.html
note that this is a building block for the full eventsource server.
there still needs to be work done
prysm has their own custom solution based off of protobuf/grpc:
https://hackmd.io/@prysmaticlabs/eventstream-api using that would be not
good
existing eventsource implementations for golang are not good for our
situation. options are:
1. https://github.com/r3labs/sse - has most stars - this is the best
contender, since it uses []byte and not string, but it allocates and
copies extra times in the server (because of use of fprintf) and makes
an incorrect assumption about Last-Event-ID needing to be a number (i
can't find this in the specification).
2. https://github.com/antage/eventsource -requires full buffers, copies
many times, does not provide abstraction for headers. relatively
unmaintained
3. https://github.com/donovanhide/eventsource - missing functionality
around sending ids, requires full buffers, etc
4. https://github.com/bernerdschaefer/eventsource - 10 years old,
unmaintained.
additionally, implemetations other than r3labs/sse are very incorrect
because they do not split up the data field correctly when newlines are
sent. (parsers by specification will fail to encode messages sent by
most of these implementations that have newlines, as i understand it).
the implementation by r3labs/sse is also incorrect because it does not
respect \r
finally, all these implementations have very heavy implementation of the
server, which we do not need since we will use fixed sequence ids.
r3labs/sse for instance hijacks the entire handler and ties that to the
server, losing a lot of flexiblity in how we implement our server
for the beacon api, we need to stream:
```head, block, attestation, voluntary_exit, bls_to_execution_change, finalized_checkpoint, chain_reorg, contribution_and_proof, light_client_finality_update, light_client_optimistic_update, payload_attributes```
some of these are rather big json payloads, and the ability to simultaneously stream them from io.Readers instead of making a full copy of the payload every time we wish to rebroadcast it will save a lot of heap size for both resource constrained environments and serving at scale.
the protocol itself is relatively simple, there are just a few gotchas
This PR adds support to store the transaction dependency (generated by
the block producer) in the block header for bor. This transaction
dependency will then be used by the parallel processor
([Block-STM](https://github.com/ledgerwatch/erigon/pull/7812/)).
I have created another
[PR](https://github.com/ledgerwatch/erigon-lib/pull/1064) in the
erigon-lib repo which adds the `IsParallelUniverse()` function.
# Background
Erigon currently uses a combination of Victoria Metrics and Prometheus
client for providing metrics.
We want to rationalize this and use only the Prometheus client library,
but we want to maintain the simplified Victoria Metrics methods for
constructing metrics.
This task is currently partly complete and needs to be finished to a
stage where we can remove the Victoria Metrics module from the Erigon
code base.
# Summary of changes
- Remove `UsePrometheusClient` boolean flag
- Remove `VictoriaMetrics` client lib and related code (simplifies
registry and prometheus http handler initialisation since now we have
only 1 registry and can use default `promhttp.Handler`)