Commit Graph

3199 Commits

Author SHA1 Message Date
Mark Holt
509a7af26a
Discovery zero refresh timer (#8661)
This fixes an issue where the mumbai testnet node struggle to find
peers. Before this fix in general test peer numbers are typically around
20 in total between eth66, eth67 and eth68. For new peers some can
struggle to find even a single peer after days of operation.

These are the numbers after 12 hours or running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.

The root cause of this issue is the following:

- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are all consequently busy and return 'too many
peers' for long periods

These issues case a significant number of discovery timeouts, some of
the queries will never receive a response.

This causes the discovery read loop to enter a channel deadlock - which
means that no responses are processed, nor timeouts fired. This causes
the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.

This check in fixes this situation with the following changes:

- Remove the deadlock by running the timer in a separate go-routine so
it can run independently of the main request processing.
- Allow the discovery process matcher to match on port if no id match
can be established on initial ping. This allows subsequent node
validation to proceed and if the node proves to be valid via the
remainder of the look-up and handshake process it us used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination continue to be ignored.
-
2023-11-07 08:48:58 +00:00
Giulio rebuffo
4b580dcc2f
update caplin snapshots hashes (#8663)
This PR also adds snippets to download caplin snapshots
2023-11-06 21:05:07 +01:00
ledgerwatch
1185587b20
Move validator set snapshot computation to bor_heimdall stage (#8646)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-11-06 08:24:33 +00:00
ledgerwatch
2064edc5e6
Add arguments (no-op) (#8653) 2023-11-04 17:44:34 +00:00
ledgerwatch
a77e33e7c4
Introduce extra functions for BorSpans (no-op) (#8648) 2023-11-04 10:59:07 +00:00
Giulio rebuffo
36595994e8
Bumps down Bodies Timeout from 30 secs to 2 secs (#8619)
Now our timings are 8.6 MB/sec
2023-11-03 21:44:35 +01:00
battlmonstr
d92898a508
p2p: silkworm sentry (#8527) 2023-11-02 08:35:13 +07:00
Alex Sharov
329d18ef6f
snapshots: reduce merge limit of blocks to 100K (#8614)
Reason: 
- produce and seed snapshots earlier on chain tip. reduce depnedency on
"good peers with history" at p2p-network.
Some networks have no much archive peers, also ConsensusLayer clients
are not-good(not-incentivised) at serving history.
- avoiding having too much files:
more files(shards) - means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
less files - means small files will be removed after merge (no peers for
this files).


ToDo:
[x] Recent 500K - merge up to 100K 
[x] Older than 500K - merge up to 500K 
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)

In next PR: 
[] Old version of Erigon must be able download recent hashes. To achieve
it - at first start erigon will download preverified hashes .toml from
s3 - if it's newer that what we have (build-in) - use it.
2023-11-01 23:22:35 +07:00
battlmonstr
eb65ffbaa7
config: avoid OOM in docker using cgroups v2 limit (#6646) (#8632)
Tested on Debian 11 by:

    $ git revert HEAD
    $ make docker
    $ docker run --memory=8.8G <sha>

before the fix outputs:

    15.6 GB

after the fix outputs:

    8.8 GB
2023-11-01 09:02:34 +07:00
Giulio rebuffo
513fd50fa5
Compress snapshots for Caplin with lz4 level=1 (#8609) 2023-10-30 13:48:14 +01:00
Alex Sharov
c23e5a1abf
downloader: preparations for reducing blocks merge limit (#8612) 2023-10-30 13:46:35 +07:00
Giulio rebuffo
0e5af0a69c
Added beacon snapshots download (#8601) 2023-10-28 17:41:50 +02:00
Dmytro
9adf31b8eb
bytes transfet separated by capability and category (#8568)
Co-authored-by: Mark Holt <mark@distributed.vision>
2023-10-27 22:30:28 +03:00
Somnath
043ccef4ca
Fix null ptr in debug_traceTx (#8585)
Newly introduced `t.logGaps` was being set to `nil` and still accessed
within `clearFailedLogs`. This PR changes the ordering, moving the nil
setting to `CaptureTxEnd`.
2023-10-26 12:56:27 +07:00
Alex Sharov
0e1fa8dc3e
blocksReadAheadFunc: to calc engine.Author in background (#8499)
Bor consensus: this calc is heavy and has cache
2023-10-25 20:09:09 +07:00
Andrew Ashikhmin
38e91809f9
Revert "Move validator set snapshot computation to bor_heimdall stage… (#8580)
PR #8202 might cause Issue #8550, so reverting it until Alexey's return.

This reverts commit 2ce98f8337.
2023-10-25 14:02:31 +02:00
Anshal Shukla
95424b5ccc
fix nil error in bor devnet (#8578) 2023-10-25 10:16:43 +01:00
Dmytro
ec59be2261
Dvovk/sentinel and sentry peers data collect (#8533) 2023-10-23 17:33:08 +03:00
a
436493350e
Sentinel refactor (#8296)
1. changes sentinel to use an http-like interface

2. moves hexutil, crypto/blake2b, metrics packages to erigon-lib
2023-10-22 01:17:18 +02:00
ledgerwatch
2ce98f8337
Move validator set snapshot computation to bor_heimdall stage (#8202)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-10-20 18:31:00 +01:00
Andrew Ashikhmin
f26be2c4c3
Simplify RLP size calculations (#8506)
Use `ListPrefixLen` & `StringLen` from erigon-lib more
2023-10-18 09:28:11 +02:00
Alex Sharov
33d5399436
downloader: support token (#8507) 2023-10-18 14:24:09 +07:00
Alex Sharov
3ac9f493b6
move chainname and snapcfg packages to erigon-lib (#8508) 2023-10-18 13:37:39 +07:00
canepat
9568567eda
Add RPC daemon using Silkworm (#8486)
This introduces _experimental_ RPC daemon run by embedded Silkworm
library. Same notes as in PR #8353 apply here plus the following ones:

- activated if `http` command-line option is enabled and `silkworm.path`
option is present, nothing more is required (i.e. currently, both block
execution and RPC daemon run by Silkworm when specifying
`silkworm.path`, just to keep things as simple as possible)
- only Execution API endpoints are implemented by Silkworm RPCDaemon,
whilst Engine API endpoints are still served by Erigon RPCDaemon
- some features are still missing, in particular:
  - state change notification handling
- custom JSON RPC settings (i.e. Erigon RPC settings are not passed to
Silkworm yet)
2023-10-18 06:37:16 +07:00
canepat
19c428dd7d
Use Erigon execution for memory mutations (#8503) 2023-10-17 22:52:52 +02:00
Alex Sharov
987d35a8b5
GOMEMLIMIT: support (#8498)
related to https://github.com/ledgerwatch/erigon/issues/8485 and
https://github.com/ledgerwatch/erigon/issues/6646
2023-10-17 10:38:49 +07:00
Giulio rebuffo
9e42b705ce
Caplin: under the hood block downloading (#8459) 2023-10-16 15:35:26 +02:00
Alex Sharov
f265bd8bfc
add "integration stage_headers --integrity.slow" to check gaps in headers or canonical markers (#8489) 2023-10-16 19:52:04 +07:00
Giulio rebuffo
2e9436a7d0
Verbose --internalcl for whenever we fail at startup (#8487)
Just does not make it die silently
2023-10-16 11:09:43 +02:00
battlmonstr
757a91c44d
sync: fix a memory leak when header verification fails (#8431)
If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
2023-10-14 08:39:43 +07:00
Giulio rebuffo
cfc2bcf707
Proper use of Dirs in Caplin (#8458) 2023-10-13 15:22:48 +02:00
Andrew Ashikhmin
b60642fa5a
Configure EIP-4844 parameters for Gnosis (#8464)
See https://github.com/gnosischain/specs/pull/20 &
https://github.com/gnosischain/specs/pull/24
2023-10-13 11:43:16 +02:00
Mark Holt
7b3570c019
Add block producer progress check (#8447)
Add a check against the inbound headers before publishing a newly mined
block after the wait delay.

If the node received a block while it was processing transactions, or
waiting for its publish slot, do a final check that another node hasn't
already published a block.
2023-10-12 19:08:05 +01:00
Alex Sharov
6d9a4f4d94
rpcdaemon: must not create db - because doesn't know right parameters (#8445) 2023-10-12 14:11:46 +07:00
Mark Holt
3fb89cc5c4
Prune all only on unwindBlock (#8430)
TruncateCanonicalHash should only have all pruned in the case of a
badBlock or a fork unwind
2023-10-10 17:13:19 +01:00
Mark Holt
0d190ff9e9
Bor rpc config fix (#8413)
This is an additional fix for BorRo to add bor config in the constructor
- otherwise code which accesses chain config will panic.
2023-10-10 15:26:02 +01:00
Mark Holt
7deb69967f
Avoid marking blocks as bad at rewind (#8414)
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-10-10 19:47:51 +05:30
Anshal Shukla
076dc33232
move borfinality package out of eth (#8407)
- Move borfinality out of eth package
- Adds nil pointer check in bor_verifier
2023-10-09 19:13:31 +01:00
Giulio rebuffo
2a5c51dc57
ValidateChain not reliant on stageBodies (#8411) 2023-10-09 17:24:12 +02:00
Alex Sharov
62395c767e
move NewHashBatch to erigon-lib, remove oddb package (#8408) 2023-10-09 15:47:54 +07:00
Giulio rebuffo
d90572b786
Hopefully an even faster version of mocked sentry (#8402) 2023-10-07 22:30:10 +02:00
Mark Holt
d4d5cb9561
Bor reduce logging verbosity (#8401)
Don't log every sync event action, but gather stats and print every 30
secs
2023-10-07 14:40:16 +01:00
Anshal Shukla
bdc9d5577b
add milestone metrics (#8386) 2023-10-06 09:53:04 +05:30
Anshal Shukla
da4e1c1ba4
fix bor.milestone flag (#8384)
- Made milestones enabled by default, since it has now been tested on
devnets
- Small bug fix which fixes nil panic while minning
2023-10-05 17:55:32 +01:00
Giulio rebuffo
2294c8c66c
EthereumExecutionService in MockSentry (#8373)
Now we use the ethereum execution service directly:

* Changed sig of InsertChain
* Use of the service in case of PoS
2023-10-05 18:30:19 +02:00
Anshal Shukla
c6f532d68e
Rewind fixes (#8380) 2023-10-05 10:12:47 +01:00
Alex Sharov
c4399da0f1
evm: no interface on contract creation (#8378) 2023-10-05 13:42:20 +07:00
Alex Sharov
c293883ec0
evm: no interface (#8376)
after removal of tevm experiment - we left interfaces everywhere 
removing it for performance and for geth compatibility
2023-10-05 12:23:08 +07:00
Manav Darji
b107a97e49
eth/borfinality: handle errors to prevent access logging (#8372)
This PR adds a few checks on top of
https://github.com/ledgerwatch/erigon/pull/8364 to prevent access
logging. Moreover, fixes a bug which sent wrong error to parent
function.
2023-10-05 10:45:40 +07:00
canepat
47690db676
Block execution using embedded Silkworm (#8353)
This introduces _experimental_ block execution run by embedded Silkworm
API library:

- new command-line option `silkworm.path` to enable the feature by
specifying the path to the Silkworm library
- the Silkworm API shared library is dynamically loaded on-demand
- currently requires to build Silkworm library on the target machine
- available only on Linux at the moment: macOS has issue with [stack
size](https://github.com/golang/go/issues/28024) and Windows would
require [TDM-GCC-64](https://jmeubank.github.io/tdm-gcc/), both need
dedicated effort for an assessment
2023-10-05 09:27:37 +07:00