This PR removes the empty headers check when responding to
`GetBlockHeaders` p2p request. Currently, as we return from the
function, the requests on the caller's end gets timed out causing peer
drop. This will prevent that from happening by sending empty header list
which will be of no use but still helps in maintaining the peer
connection.
With this change, those running erigon 2.48 with
`--sentry.drop-useless-peers` will drop peers as they consider receiving
empty headers a harsh move which is fine.
We have also discovered that if we don't remove this check maintaining
connectivity to Bor is not maintained. This is significant as Erigon
becomes a minority validator, as bor chain stability (reducing forks) is
dependent upon good validator connectivity.
Fixes and issue with Polygon validators where locally mined blocks are
broadcast with invalid header hashes because the NewBlock message
constructor was removing the ReceiptHash which contributed to the header
hash.
The results in the bor header validation code not being able to
correctly identify the signer of the header - so header validation
fails.
This also likely fixes part of the bogon-block issue which was
identified by the polygon team.
Improve p2p error handling to propagate errors
from the origin up the call chain the Server peer removal code
using a new PeerError type containing a DiscReason and a more detailed
description.
The origin can be tracked down using PeerErrorCode (code) and DiscReason
(reason)
which looks like this in the log:
> [TRACE] [08-28|16:33:40.205] Removing p2p peer peercount=0
url=enode://d399f4b...@1.2.3.4:30303 duration=6.901ms
err="PeerError(code=remote disconnect reason, reason=too many peers,
err=<nil>, message=Peer.run got a remote DiscReason)"
The current logic is flawed, because it drops all peers that are less
synced.
It is valid to return empty responses by the eth spec.
A proper logic should penalize from the context of the sync process,
where enough "reputation" data is collected about a peer.
In order to be able to connect to erigon 2.48 peers that have
--sentry.drop-useless-peers enabled,
this adds a check to not reply with an empty headers list.
If we reply with an empty list, we're going to be considered useless and
kicked.
Once enough of erigon nodes are updated in the network past this commit,
this check should be removed,
because it is totally acceptable to return an empty list by the eth
spec.
Currently PropagateNewBlockHashes and BroadcastNewBlock
selects a subset of all sentries by taking a `Sqrt(len(sentries))`,
and then for each sentry SendMessageToRandomPeers
selects a subset of its peers by taking `Sqrt(len(peerInfos))`.
This behaviour limits the broadcast scope with a lot of peers, e.g. 100
becomes 10,
but is not great with very few peers, or if the message is very
important
to broadcast to everyone, which is the case of bor validator/proposer
nodes.
* send to all sentries in both BroadcastNewBlock and PropagateNewBlockHashes
* remove peerCountConstrained sqrt logic in SendMessageToRandomPeers
* add maxPeers provider func as a parameter to MultiClient
* default it to 10 for eth and 0 (unlimited) for bor validators
---------
Co-authored-by: Mark Holt <mark@distributed.vision>
problem: it was possible to call startSync
and start sending messages before our Status is sent
solution: wait for the sender goroutine to finish
before calling startSync
refactor handShake parameters to not require peerID and a startSync
callback
Co-authored-by: Mark Holt <mark@distributed.vision>
This request implements the insertion of Bor ephemeral transactions into
snapshot indexes.
I does this by taking the block hash from the header index and passing
it to the transaction indexer to add an additional index entry per block
into the transaction hash -> block index.
The passed entries are currently contained in an in memory array which
is (32 * number of blocks / sprint size) bytes.
In addition to the functional code there is also an update to the
`dump_test.go` so that it runs `DumpBlocks` to exercise the indexing
code. To facilitate this the `InsertChain` method in `mock_sentry` has
been modified so that it can process >128 blocks.
The code in this request also includes additional bor/consensus code
with the following functions:
`CalculateSprint`
`CalculateSprintCount`
The first function is a modification of the code in erigon-lib so that
the sprints are numerically rather than lexically ordered. This code
should be migrated to erigon-lib and should have its sprint set
calculated once from its underlying map rather than this process being
repeated every calculation.
---------
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: ledgerwatch <akhounov@gmail.com>
Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com>
Co-authored-by: Giulio <giulio.rebuffo@gmail.com>
This PR separates ENGINE from Ethbackend. It makes it so:
1) EthBackend not a god class
2) We can abstract away engine API so that we can make it CL-like and
enable Consensus-Execution driven design
3) Objective is Json-RPC -> Engine Consensus Module -> Execution module.
Blob transactions are SSZ encoded, so it had to be added to decoding.
There are 2 encoding forms: `network` and `minimal` (usual). Network
encoded blob transactions include "wrapper data" which are `kzgs`,
`blobs` and `proofs`, and decoded by `DecodeWrappedTransaction`. For
previous types of transactions the network encoding is no different.
Execution-payloads / blocks use the minimal encoding of transactions. In
the transaction-pool and local transaction-journal the network encoding
is used.
Concerns:
1. Possible performance reduction caused by these changes, not sure if
streams are better then slices. Go slices in this modifications are
read-only, so they should be referred to the same underlying array and
passed by a reference.
2. If `DecodeWrappedTransaction` and `DecodeTransaction` will create
confusion and should be merged into one function.
This is the beginning of the series of changes to make it possible to
run multiple instances of erigon inside a single process (as devnet tool
does), with the logging from these processes going to respective log
files correctly.
This is the first part where the initial infrastructure is being
established
---------
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
It turns out that "standard" BSC nodes based on Geth, do not propagate
new block hashes and blocks, at least towards Erigon nodes. This is a
workaround
---------
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
There are 3 changes:
1. Replace `anchorQueue` with `anchorTree` to be able to always walk the
anchors in the order of increasing blockHeights (not possible with the
queue) to prioritise making progress on the lowest block heights
2. Not increment `nextRetryTime` if the request was not sent
3. Reduce the strides in skeleton from `8*192` to `192` to reduce
reliance of the long series of requests to make progress
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
1. Replacing temporary MBDX table with limited-size btree
2. Always scan block numbers from the start to prioritise low-number
blocks
3. Other fixes and simplifications
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>