Reason:
- produce and seed snapshots earlier on chain tip. reduce depnedency on
"good peers with history" at p2p-network.
Some networks have no much archive peers, also ConsensusLayer clients
are not-good(not-incentivised) at serving history.
- avoiding having too much files:
more files(shards) - means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
less files - means small files will be removed after merge (no peers for
this files).
ToDo:
[x] Recent 500K - merge up to 100K
[x] Older than 500K - merge up to 500K
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)
In next PR:
[] Old version of Erigon must be able download recent hashes. To achieve
it - at first start erigon will download preverified hashes .toml from
s3 - if it's newer that what we have (build-in) - use it.
If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
This introduces _experimental_ block execution run by embedded Silkworm
API library:
- new command-line option `silkworm.path` to enable the feature by
specifying the path to the Silkworm library
- the Silkworm API shared library is dynamically loaded on-demand
- currently requires to build Silkworm library on the target machine
- available only on Linux at the moment: macOS has issue with [stack
size](https://github.com/golang/go/issues/28024) and Windows would
require [TDM-GCC-64](https://jmeubank.github.io/tdm-gcc/), both need
dedicated effort for an assessment
Fixes for label discrepancies in collector for summaries etc which have
a template which includes a quantile.
Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
Closes#8078
This change is primarily intended to support go 1.21, but as a
side-effect requires updating libp2p, which in turn triggers an update
of golang.org/x/exp which creates quite a bit of (simple) churn in the
slice sorting.
This change introduces a new `cmp.Compare` function which can be used to
return an integer satisfying the compare interface for slice sorting.
In order to continue to support mplex for libp2p, the change references
github.com/libp2p/go-libp2p-mplex instead. Please see the PR at
https://github.com/libp2p/go-libp2p/pull/2498 for the official usptream
comment that indicates official support for mplex being moved to this
location.
Co-authored-by: Jason Yellick <jason@enya.ai>
Add code to the headers state to break processing if a bor milestone
rewind is detected.
The rewind processing happens in the bor/heimdall stage - this change
just avoids unnecessary header loading
if a milestone fork is likely to be detected
---------
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
This is the second part of the bor milestone release it contains the
following changes:
* Initialize services
* This is a change from the initial pull request I have moved all of the
initialization to the bor engine. To facilitate this I have just passed
in the heimdall client interface, rather than the whole engine
* Stage processing
* This is also a change from the original PR - the code is contained in
the bor heimdall stage rather than in headers - the effect should be the
same, but this needs testing
---------
Co-authored-by: Mark Holt <mark@disributed.vision>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
For example, erigon on devnet8 marked a block as bad due to
"mdbx_cursor_open: cannot allocate memory":
```
[INFO] [09-12|04:57:36.041] [NewPayload] Handling new payload height=171035 hash=0x321dea00c4853ee354bebaf8aef3e63fbe06c4508271c0db4c92b0f087aedc3b
171034
[WARN] [09-12|04:57:36.069] Could not validate block err="[3/7 BlockHashes] table: Header, mdbx_cursor_open: cannot allocate memory, stack: [kv_mdbx.go:1057 kv_mdbx.
go:1069 kv_mdbx.go:1077 memory_mutation.go:473 memory_mutation.go:502 etl.go:123 etl.go:96 block_writer.go:40 stage_blockhashes.go:49 default_stages.go:457 sync.go:425 sync.go:258 s
tageloop.go:414 backend.go:476 fork_validator.go:250 fork_validator.go:156 ethereum_execution.go:151 execution_client.go:51 chain_reader.go:252 engine_server.go:741 engine_server.go
:235 engine_server.go:600 value.go:586 value.go:370 service.go:224 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_amd64.s:1598]"
[WARN] [09-12|04:57:36.069] ethereumExecutionModule.ValidateChain: chain is invalid hash=0x321dea00c4853ee354bebaf8aef3e63fbe06c4508271c0db4c92b0f087aedc3b
```
With this PR blocks are marked as bad only on genuine protocol errors.
This is a non functional change which consolidates the various packages
under metrics into the top level package now that the dead code is
removed.
It is a precursor to the removal of Victoria metrics after which all
erigon metrics code will be contained in this single package.
ergon/metrics contains a lot of unused types - the pull is removing them
prior to rationalizing between Prometheus and Victoria Metrics types.
It also have upgrades of golang-lru + sets types to v2 when possible.
This is an update of:
https://github.com/ledgerwatch/erigon/pull/7846
which uses a local fork of victoria metrics to include the changes that
https://github.com/anshalshukla added to the original for we where
using.
It also includes code to address the duplicate metrics issue identified
here:
https://github.com/ledgerwatch/erigon/issues/8053
It has one more associated fix which is to correctly add a metadata
label to counters, these where previously labelled as gauges.
e.g.
```
# TYPE p2p_peers counter
p2p_peers 0
```
rather than
```
# TYPE p2p_peers gauge
p2p_peers 0
```
---------
Co-authored-by: Anshal Shukla <53994948+anshalshukla@users.noreply.github.com>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
I have added:
```go
{
ID: stages.BorHeimdall,
Description: "Download Bor-specific data from Heimdall",
Forward: func(firstCycle bool, badBlockUnwind bool, s *StageState, u Unwinder, tx kv.RwTx, logger log.Logger) error {
if badBlockUnwind {
return nil
}
return BorHeimdallForward(s, u, ctx, tx, borHeimdallCfg, true, logger)
},
Unwind: func(firstCycle bool, u *UnwindState, s *StageState, tx kv.RwTx, logger log.Logger) error {
return BorHeimdallUnwind(u, ctx, s, tx, borHeimdallCfg)
},
Prune: func(firstCycle bool, p *PruneState, tx kv.RwTx, logger log.Logger) error {
return BorHeimdallPrune(p, ctx, tx, borHeimdallCfg)
},
},
```
To MiningStages as well as Default as otherwise bor events are not added
when the block producer creates new blocks.
There are a couple of questions I have around this implementation:
* Is this the right place to add this
* As the state is also executed when the default stage is processed ther
is some duplicate processing for the block producing node.
* There is a duplicated call to heimdall which could be removed if the
stages share state - but its not clear if we want to do this.
* I don't think the mining stage needs to prune as this will be
replicated in the default iteration
This can be tested using the devnet with the following arguments:
```
--chain bor-devnet --bor.localheimdall --scenarios state-sync
```
This will generate sync events via an ethereum devnet which are
transmitted to bor chain and will be executed at the end of the snapshot
delay, which results in events generated from the bor chain. This tests
the whole sync, block generation, event lifecycle. As it needs to wait
for sprints to end after a sufficient delay it is quite slow to run.