* Chunked format -> blinded
* LZ4 -> ZSTD
* Implemented parent block root support for history download
* Rationale: makes it easy to optimize garbage collection during state
reconstruction and allows fast reads of attestations in the historical
states reader
1. Adds an eth/stagedsync/test package which provides a test Harness
object
2. Adds the first automated test to the bor-heimdall stage regarding
span persistence (more to come in subsequent PRs)
3. Fixes a bug in the bor-heimdall stage which was uncovered by the
test: span 0 was not fetched when syncing straight from blockNum=0
without snapshots
4. Reorganises all mocks to be placed under ./mock sub-package within
their respective packages
The "whitelisting" mechanism (a list of files stored in the DB), which
protected us from downloading new files after an upgrade/downgrade, was
broken, and it seems to have become over-complicated with time.
I am replacing it with one persistent flag inside the downloader:
"prohibit_new_downloads.lock"
Erigon will switch the downloader into this mode after the first
snapshots have been downloaded and verified.
```
//Corner cases:
// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
// - After "download once" - Erigon will produce and seed new files
```
------
`downloader --seedbox` never enters the "prohibit new downloads" mode
Because access lists use maps with the `StorageKey` as the key, they are
subject to inconsistent ordering in the results of the `.accessList()`
method.
To get around this, an `accessListSorted` method has been added, and
exposed with the same name. The `equal` method has also been exposed to
allow for equality checks at this level outside of this module.
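A minimal sketch of why sorting is needed, assuming simplified stand-in types (`Address`, `StorageKey`, `accessList`, `AccessTuple` and the `sorted` helper below are hypothetical and are not the tracer's real definitions). Go map iteration order is unspecified, so the flattened list has to be ordered explicitly to be deterministic:

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Address and StorageKey are simplified stand-ins for the real types (assumption).
type Address [20]byte
type StorageKey [32]byte

// accessList maps each address to the set of storage keys it touched.
type accessList map[Address]map[StorageKey]struct{}

// AccessTuple is one entry of the flattened list.
type AccessTuple struct {
	Address     Address
	StorageKeys []StorageKey
}

// sorted flattens the map into a slice ordered by address, with each
// tuple's storage keys ordered as well, so the result does not depend
// on map iteration order.
func (al accessList) sorted() []AccessTuple {
	out := make([]AccessTuple, 0, len(al))
	for addr, slots := range al {
		keys := make([]StorageKey, 0, len(slots))
		for k := range slots {
			keys = append(keys, k)
		}
		sort.Slice(keys, func(i, j int) bool {
			return bytes.Compare(keys[i][:], keys[j][:]) < 0
		})
		out = append(out, AccessTuple{Address: addr, StorageKeys: keys})
	}
	sort.Slice(out, func(i, j int) bool {
		return bytes.Compare(out[i].Address[:], out[j].Address[:]) < 0
	})
	return out
}

func main() {
	al := accessList{
		Address{2}: {StorageKey{9}: {}, StorageKey{1}: {}},
		Address{1}: {StorageKey{5}: {}},
	}
	for _, tuple := range al.sorted() {
		fmt.Printf("address %x has %d storage key(s)\n", tuple.Address[:1], len(tuple.StorageKeys))
	}
}
```

With a deterministic ordering, two access lists can be compared element by element, which is what an equality check at this level relies on.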
Co-authored-by: 3commascapital <8562488-3commascapital@users.noreply.gitlab.com>
What does this PR do:
* Optional Backfilling and Caplin Archive Node
* Create antiquary for historical states
* Fixed chain gaps related to the head of the chain and the anchor of
the chain.
* Added a basic reader object to read the historical state
This PR has fixes for a number of instances in the bor heimdall stage
where nil headers are either ignored or inadvertently processed.
It also demotes to debug the milestone-related logging messages about
blocks that are missing because the process is not at the head of the
chain, and generally reduces periodic logging to every 30 secs rather
than 20 to reduce the log output on long runs.
In addition there is a refactor of persistValidatorSets to perform
validator set initiation in a separate function. This is intended to
clarify the operation of persistValidatorSets, which is still performing
2 actions: persisting the snapshot and then using it to check the header
against the synthesized validator set in the snapshot.
Changed httpcfg.HttpCfg to be passed around as a pointer.
Added new flags:
rpc.slow.log - false by default; this flag enables logging of slow RPC
requests
rpc.slow.log.threshold - 100 by default; this flag specifies the slow
request threshold in milliseconds
Updated rpc handler to log slow requests:
- added map[request id] {method, timestamp}
- put the details of every request into the map above
- delete the request details from the map above
- added a time interval check for the elements in the map; if the time
difference is more than the given threshold, print the request id and
the method
- the app will print slow requests in the following cases:
1. As soon as a request takes more than the given threshold
2. Every 20 seconds if the request is still in progress
3. After the request has finished, if it took more than the given threshold
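The bookkeeping described above can be sketched roughly as follows. This is an illustration under assumptions, not the handler code from this PR: `slowTracker`, `begin`, `end` and `stillSlow` are hypothetical names, and timestamps are plain millisecond counts so that the behaviour is easy to follow.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
	"time"
)

// requestInfo is what is remembered about an in-flight request.
type requestInfo struct {
	method  string
	startMs int64
}

// slowTracker is a hypothetical tracker keyed by request id.
type slowTracker struct {
	mu          sync.Mutex
	thresholdMs int64
	inFlight    map[uint64]requestInfo
}

func newSlowTracker(thresholdMs int64) *slowTracker {
	return &slowTracker{thresholdMs: thresholdMs, inFlight: make(map[uint64]requestInfo)}
}

// begin records the method and start time of a request.
func (t *slowTracker) begin(id uint64, method string, nowMs int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.inFlight[id] = requestInfo{method: method, startMs: nowMs}
}

// end forgets the request and reports whether it exceeded the threshold.
func (t *slowTracker) end(id uint64, nowMs int64) (method string, tookMs int64, slow bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	info, ok := t.inFlight[id]
	if !ok {
		return "", 0, false
	}
	delete(t.inFlight, id)
	tookMs = nowMs - info.startMs
	return info.method, tookMs, tookMs > t.thresholdMs
}

// stillSlow lists, in ascending order, the ids of in-flight requests that
// have been running for longer than the threshold.
func (t *slowTracker) stillSlow(nowMs int64) []uint64 {
	t.mu.Lock()
	defer t.mu.Unlock()
	var ids []uint64
	for id, info := range t.inFlight {
		if nowMs-info.startMs > t.thresholdMs {
			ids = append(ids, id)
		}
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	tr := newSlowTracker(100) // 100 ms, the default of rpc.slow.log.threshold
	start := time.Now().UnixMilli()
	tr.begin(1, "eth_call", start)
	tr.begin(2, "eth_getLogs", start)

	method, took, slow := tr.end(1, start+30)
	fmt.Println(method, took, slow)

	for _, id := range tr.stillSlow(start + 250) {
		fmt.Println("slow request still running:", id)
	}
}
```

A periodic check would call `stillSlow` on a ticker, while `end` covers the case of a request that finished over the threshold.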
---------
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
This PR adds support to store the transaction dependency (generated by
the block producer) in the block header for bor. This transaction
dependency will then be used by the parallel processor
([Block-STM](https://github.com/ledgerwatch/erigon/pull/7812/)).
I have created another
[PR](https://github.com/ledgerwatch/erigon-lib/pull/1064) in the
erigon-lib repo which adds the `IsParallelUniverse()` function.
# Background
Erigon currently uses a combination of Victoria Metrics and Prometheus
client for providing metrics.
We want to rationalize this and use only the Prometheus client library,
but we want to maintain the simplified Victoria Metrics methods for
constructing metrics.
This task is currently partly complete and needs to be finished to a
stage where we can remove the Victoria Metrics module from the Erigon
code base.
# Summary of changes
- Adds missing `NewCounter`, `NewSummary`, `NewHistogram`,
`GetOrCreateHistogram` functions to `erigon-lib/metrics` similar to the
interface VictoriaMetrics lib provides
- Minor tidy up for consistency inside `erigon-lib/metrics/set.go`
around return types (panic vs err consistency for funcs inside the
file), error messages, comments
- Replace all remaining usages of `github.com/VictoriaMetrics/metrics`
with `github.com/ledgerwatch/erigon-lib/metrics` - seamless (only import
changes) since interfaces match
This fixes an issue where mumbai testnet nodes struggle to find
peers. Before this fix, peer numbers in general testing are typically
around 20 in total across eth66, eth67 and eth68. Some new nodes can
struggle to find even a single peer after days of operation.
These are the numbers after 12 hours of running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.
The root cause of this issue is the following:
- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are all consequently busy and return 'too many
peers' for long periods
These issues cause a significant number of discovery timeouts; some of
the queries will never receive a response.
This causes the discovery read loop to enter a channel deadlock - which
means that no responses are processed, nor timeouts fired. This causes
the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.
This check-in fixes this situation with the following changes:
- Remove the deadlock by running the timer in a separate goroutine so
it can run independently of the main request processing.
- Allow the discovery process matcher to match on port if no id match
can be established on initial ping. This allows subsequent node
validation to proceed and, if the node proves to be valid via the
remainder of the look-up and handshake process, it is used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination continue to be ignored.
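The first change can be illustrated with a small sketch. This is not the discovery code itself: `pendingReplies`, `add`, `expire` and `runTimeouts` are hypothetical names, and deadlines are plain millisecond values. The point is only that timeout expiry runs on its own goroutine and shares nothing but a mutex with the reply-processing loop, so a stalled loop cannot stop timeouts from firing.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
	"time"
)

// pendingReplies tracks outstanding requests by id with an absolute
// deadline in milliseconds.
type pendingReplies struct {
	mu        sync.Mutex
	deadlines map[string]int64
}

func newPendingReplies() *pendingReplies {
	return &pendingReplies{deadlines: make(map[string]int64)}
}

// add registers a request that expects a reply before deadlineMs.
func (p *pendingReplies) add(id string, deadlineMs int64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.deadlines[id] = deadlineMs
}

// expire removes and returns, in sorted order, every request whose
// deadline has passed.
func (p *pendingReplies) expire(nowMs int64) []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	var expired []string
	for id, deadline := range p.deadlines {
		if deadline <= nowMs {
			expired = append(expired, id)
			delete(p.deadlines, id)
		}
	}
	sort.Strings(expired)
	return expired
}

// runTimeouts calls expire on its own ticker until ctx is cancelled.
// It is meant to run in a separate goroutine from the reply loop.
func (p *pendingReplies) runTimeouts(ctx context.Context, every time.Duration, onTimeout func(id string)) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case now := <-ticker.C:
			for _, id := range p.expire(now.UnixMilli()) {
				onTimeout(id)
			}
		}
	}
}

func main() {
	p := newPendingReplies()
	p.add("ping-1", time.Now().UnixMilli()+20)

	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	fired := make(chan string, 1)
	go p.runTimeouts(ctx, 5*time.Millisecond, func(id string) { fired <- id })

	// The main loop is free to block here; the timeout still arrives.
	fmt.Println("timed out:", <-fired)
}
```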
Reason:
- produce and seed snapshots earlier on chain tip. Reduce dependency on
"good peers with history" in the p2p network.
Some networks do not have many archive peers; also, ConsensusLayer
clients are not good (not incentivised) at serving history.
- avoid having too many files:
more files (shards) means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
fewer files means small files will be removed after merge (no peers for
these files).
ToDo:
[x] Recent 500K - merge up to 100K
[x] Older than 500K - merge up to 500K
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)
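One possible reading of the two merge rules above, as a sketch only. The `mergeLimit` helper, its constants and the choice to measure "recent" from the chain tip are assumptions for illustration and are not taken from the actual implementation:

```go
package main

import "fmt"

const (
	recentWindow   = 500_000 // assumed: blocks counted back from the chain tip
	recentMergeMax = 100_000 // recent files are merged up to 100K blocks
	oldMergeMax    = 500_000 // older files are merged up to 500K blocks
)

// mergeLimit returns the largest span, in blocks, that a merged file
// ending at block `to` may cover, given the current chain tip.
func mergeLimit(to, tip uint64) uint64 {
	// The first condition also guards against unsigned underflow.
	if to >= tip || tip-to <= recentWindow {
		return recentMergeMax
	}
	return oldMergeMax
}

func main() {
	tip := uint64(18_000_000)
	fmt.Println("recent range limit:", mergeLimit(17_900_000, tip))
	fmt.Println("old range limit:", mergeLimit(10_000_000, tip))
}
```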
In next PR:
[] Old versions of Erigon must be able to download recent hashes. To
achieve this, at first start Erigon will download the preverified hashes
.toml from s3; if it is newer than what we have (built-in), use it.
Newly introduced `t.logGaps` was being set to `nil` and still accessed
within `clearFailedLogs`. This PR changes the ordering, moving the nil
setting to `CaptureTxEnd`.