While working on fixing the bor mining loop I stumbled across an error
in `ChainReader.BorSpan` - a "not implemented" panic. I also hit a few other
panics caused by a missing logger in `ChainReaderImpl` struct initialisations.
This PR fixes both.
A crash on startup happens with --chain=mumbai, because I had confused
chainConfig.Bor (a field of chain.Config) and config.Bor (a field of
ethconfig.Config) in the setup code.
The ethconfig.Config.Bor property contained bogus values and was used only
for a type check in CreateConsensusEngine(); its value was never read
(before PR #9117).
This change removes the property to avoid confusion and fix the crash.
The devnet network.BorStateSyncDelay was implemented using
ethconfig.Config.Bor, but it had no effect. It should be fixed separately
in a different way.
I was getting an error in one of the bor nodes in the devnet when trying to run
the "state-sync" scenario:
```
[EROR] [01-03|16:55:44.179] cli.StartRpcServer error err="could not start separate Websocket RPC api at port 8546: listen tcp 127.0.0.1:8546: bind: address already in use"
```
This happens for scenarios with more than 1 node.
Digging further, this regression was introduced by this change:
https://github.com/ledgerwatch/erigon/pull/8909
This PR fixes it by updating the devnet `NodeArgs` struct to set the
corresponding `--ws.port` `arg` tag, which now exists.
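A rough sketch of the shape of the fix (the field name and surrounding code are assumptions, not copied from the actual devnet args package):

```go
// Sketch only: the devnet arg builder turns tagged struct fields into CLI
// flags, so giving each node its own websocket port stops every node from
// trying to bind the default 8546.
type NodeArgs struct {
	// ...existing fields...
	WSPort int `arg:"--ws.port"`
}
```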
Hi, I made three suggestions for this section:
- "devenet" should be "devnet" (typo).
- "are currently build" should be "are currently built" (grammatical
error).
- "sptep" should be "step" (typo).
Thanks.
Mdbx now takes a logger, but this has not been pushed to all callers,
meaning some of them had an invalid logger.
This fixes the log propagation.
It also fixes a start-up issue for http.enabled and txpool.disable
created by a previous merge.
Users reported this error:
```
[bor.heimdall] an error while trying fetching path=clerk/event-record/list attempt=5 error="unexpected end of JSON input"
```
This may happen if:
1. Heimdall is behind and not fully synced - for more info check
https://github.com/maticnetwork/heimdall/pull/993
2. The header time erigon is sending is far into the future
The logs in this PR will help us see which of the two is the culprit, but
most likely it is the first. We will investigate the second further if it ever happens.
Changes:
1. Improves logging upon heimdall client retries - prints out the full
url that failed.
2. Fixes a bug where the check for an empty body was incorrect -
`len(body) == 0` vs `body == nil` (see the sketch after this list)
3. Adds a unit test to guard against a regression of the bug
4. Adds a log advising users to check their heimdall process if they run
into this scenario, since that may be the culprit
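A minimal, self-contained illustration of the distinction (plain Go, not the actual heimdall client code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// An empty but non-nil response body, as an HTTP client can return.
	body := []byte{}

	fmt.Println(body == nil)    // false - a nil check misses this case
	fmt.Println(len(body) == 0) // true  - catches both nil and empty bodies

	var v map[string]interface{}
	// Unmarshalling an empty body is what yields
	// "unexpected end of JSON input".
	fmt.Println(json.Unmarshal(body, &v))
}
```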
Example output with the new logs:
<img width="1465" alt="Screenshot 2023-12-29 at 20 16 57"
src="https://github.com/ledgerwatch/erigon/assets/94537774/1ebfde68-aa93-41d6-889a-27bef5414f25">
Heimdall prepares the next span a number of sprints before the current
span ends. Currently we always fetch the next span regardless of which
sprint we are in during the current span. This causes a liveness issue
due to how the Heimdall client works (it infinitely retries until it
fetches a span - this issue will be fixed in a separate PR). This PR
fixes this by matching what bor does - it fetches the next span only in
the last sprint of the current span.
Changes:
- Adds a unit test for the above
- Adds a new function BlockInLastSprintOfSpan (a rough sketch of the idea is below)
- Some code reorg and cleanup - moves the span num related functions
from the bor package to the span sub package for better logical grouping
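A rough sketch of the idea behind BlockInLastSprintOfSpan (the span and sprint lengths are illustrative constants; erigon takes them from the bor chain config, and span 0 is shorter on real networks):

```go
// Illustrative constants, not the values erigon reads from the chain config.
const (
	spanLength   = 6400 // blocks per span (assumed)
	sprintLength = 64   // blocks per sprint (assumed)
)

// blockInLastSprintOfSpan reports whether blockNum falls into the last sprint
// of its span - the only time the next span needs to be fetched.
func blockInLastSprintOfSpan(blockNum uint64) bool {
	spanEnd := (blockNum/spanLength+1)*spanLength - 1 // last block of this span
	lastSprintStart := spanEnd + 1 - sprintLength     // first block of its last sprint
	return blockNum >= lastSprintStart
}
```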
Adds unit tests for:
- Bor Heimdall Stage - `checkHeaderExtraData`
  - at the end of each sprint, verifies that the validators in the header extra
data match the selected proposers from the heimdall span (a rough sketch of
this check follows the list)
  - 1 test for selected proposers length mismatch
  - 1 test for selected proposers bytes mismatch
- Bor Heimdall Stage - `persistValidatorSets`
  - verifies that each header was created by a validator in the validator set
  - if a header was not created by a validator in the set, we set the unwind point
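For illustration, the property the `checkHeaderExtraData` tests exercise amounts to something like the following (names and signature are made up, not the actual stage code):

```go
import (
	"bytes"
	"errors"
	"fmt"
)

// validateExtraDataValidators is a hypothetical stand-in: the validator bytes
// embedded in the header extra data must exactly match the span's selected
// producers, otherwise the header is rejected.
func validateExtraDataValidators(headerValidators, spanProducers []byte) error {
	if len(headerValidators) != len(spanProducers) {
		return fmt.Errorf("invalid validators length: header=%d span=%d",
			len(headerValidators), len(spanProducers))
	}
	if !bytes.Equal(headerValidators, spanProducers) {
		return errors.New("header extra data validators do not match span producers")
	}
	return nil
}
```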
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:
## erigon snapshots upload
The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to a remote location.
It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send uploaded headers, bodies and (in the case
of polygon) bor spans and events to snapshot files. Because this process
avoids execution it runs significantly faster than a standard erigon
configuration.
The uploader uses rclone to send seedable snapshot files (100K or 500K blocks)
to a remote storage location specified in the rclone config file.
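For illustration, a minimal rclone remote definition of the kind the uploader could point at (the remote name and provider here are assumptions, not part of this change):

```
# rclone refers to this remote as snapshots:<bucket>/<path>
[snapshots]
type = s3
provider = AWS
env_auth = true
region = us-east-1
```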
The **uploader** is configured to minimize disk usage by doing the
following:
* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots
In addition to this it has the following performance-related features:
* Maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing
## snapshots command
This is a standalone command for managing remote snapshots. It has the
following sub commands:
* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
This adds a simulator object which implements the SentryServer API but
takes objects from a pre-existing snapshot file.
If the snapshot is not available locally it will download and index the
.seg file for the header range being asked for.
It is created as follows:
```go
sim, err := simulator.NewSentry(ctx, "mumbai", dataDir, 1, logger)
```
Where the arguments are:
* ctx - a cancellable context; cancelling it will close the simulator's torrent
and file connections (the simulator also has a Close method)
* chain - the name of the chain to take the snapshots from
* datadir - a directory potentially containing snapshot .seg files. If
no files exist in this directory they will be downloaded
* num peers - the number of peers the simulator should create
* logger - the logger to log actions to
It can be attached to a client as follows:
```go
simClient := direct.NewSentryClientDirect(66, sim)
```
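Putting the two snippets together, a minimal usage sketch might look like this (the error handling and deferred Close are the only additions):

```go
sim, err := simulator.NewSentry(ctx, "mumbai", dataDir, 1, logger)
if err != nil {
	return err
}
// Close releases the simulator's torrent and file connections
// (cancelling ctx has the same effect).
defer sim.Close()

// Attach an eth/66 sentry client directly to the simulator.
simClient := direct.NewSentryClientDirect(66, sim)
_ = simClient // use wherever a sentry client is expected
```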
At the moment only very basic functionality is implemented:
* get headers will return headers by range or hash (hash assumes a
pre-downloaded .seg as it needs an index)
* the header replay semantics need to be confirmed
* eth 65 and 66(+) messaging is supported
* for details see: `simulator_test.go`
More advanced peer behavior (e.g. header rewriting) can be added.
Bodies/transactions handling can be added.
* Testing Beacon API
* Fixed sentinel code (a little bit)
* Fixed sentinel tests
* Added historical state support
* Fixed state-related endpoints (I was drunk when writing them)
* Chunked format -> blinded
* LZ4 -> ZSTD
* Implemented parent block root support for history download
* Rationale: allows GC to be optimized easily during state reconstruction
and allows fast reading of attestations in the historical states reader