If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
Add a check against the inbound headers before publishing a newly mined
block after the wait delay.
If the node received a block while it was processing transactions, or
waiting for its publish slot, do a final check that another node hasn't
already published a block.
Fixes and issue with Polygon validators where locally mined blocks are
broadcast with invalid header hashes because the NewBlock message
constructor was removing the ReceiptHash which contributed to the header
hash.
The results in the bor header validation code not being able to
correctly identify the signer of the header - so header validation
fails.
This also likely fixes part of the bogon-block issue which was
identified by the polygon team.
This fixes 2 related issues:
* Now that the bor consensus engine is required for queries it can't be
created based on the pretense of a db directory, but must be based on
chain config read from the db. Using the DB presence causes Bor to get
instantiated for non bor chains which breaks.
* At the moment eth_calls on a remote daemon don't check Bor headers
prior to calling the EVM code as it was just using a fake ETHash
instance - which performs ETH header validation only.
The current version is mostly working but needs adapting to perform lazy
initialization of the engine.
System contract upgrades for Polygon are already handled by the
`BlockAlloc` logic and there's no need to duplicate it with the
`CalcuttaBlock` logic (there's no Calcutta in
https://github.com/maticnetwork/bor).
When a new feature (like for the upcoming `Aalborg` hard fork) for
Polygon hasn't kicked in (in heimdall), the endpoints of heimdall will
now return 503 (Service Unavailable) status code. This PR makes sure
that erigon handles that code separately and doesn't keep retying to
fetch info. It also acts as a notifier of the HF in erigon.
Similar reference PR in bor:
https://github.com/maticnetwork/bor/pull/1023
Whitelisting calculation of the roothash should not be dependent on the
bor api running. This will not always be the case, for example when
erigon is configured with a separate rpc deamon.
To fix this the calculation has been moved to Bor.
Additionally the redundant Bor API code has been removed as this is not
called by any code and the functionality looks to have migrated to the
turbo/jsonrpc package.
Fixes for label discrepancies in collector for summaries etc which have
a template which includes a quantile.
Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
This is a fix for at least one cause of this isssue:
https://github.com/ledgerwatch/erigon/issues/8212.
It happens after the end of the snapshot load, because the snapshot
processing which was introduced a couple of month ago does not deal with
validation of the headers at the start of the start of the chain.
I have also added a fix to the persistence so that the last snapshot is
recorded so that subsequent runs are not forced to process the whole
snapshot run from start.
The relationship between this an memory usage is that the fact that
headers are not processed leads to a queue of pending headers with size
of around 5GB. I have not changed any header parameters - so likely a
prolonged stop to header processing will result in a similar level of
memory growth.
Add code to the headers state to break processing if a bor milestone
rewind is detected.
The rewind processing happens in the bor/heimdall stage - this change
just avoids unnecessary header loading
if a milestone fork is likely to be detected
---------
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
This is the second part of the bor milestone release it contains the
following changes:
* Initialize services
* This is a change from the initial pull request I have moved all of the
initialization to the bor engine. To facilitate this I have just passed
in the heimdall client interface, rather than the whole engine
* Stage processing
* This is also a change from the original PR - the code is contained in
the bor heimdall stage rather than in headers - the effect should be the
same, but this needs testing
---------
Co-authored-by: Mark Holt <mark@disributed.vision>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
This is a non functional change which consolidates the various packages
under metrics into the top level package now that the dead code is
removed.
It is a precursor to the removal of Victoria metrics after which all
erigon metrics code will be contained in this single package.
This is an update of:
https://github.com/ledgerwatch/erigon/pull/7846
which uses a local fork of victoria metrics to include the changes that
https://github.com/anshalshukla added to the original for we where
using.
It also includes code to address the duplicate metrics issue identified
here:
https://github.com/ledgerwatch/erigon/issues/8053
It has one more associated fix which is to correctly add a metadata
label to counters, these where previously labelled as gauges.
e.g.
```
# TYPE p2p_peers counter
p2p_peers 0
```
rather than
```
# TYPE p2p_peers gauge
p2p_peers 0
```
---------
Co-authored-by: Anshal Shukla <53994948+anshalshukla@users.noreply.github.com>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
Currently PropagateNewBlockHashes and BroadcastNewBlock
selects a subset of all sentries by taking a `Sqrt(len(sentries))`,
and then for each sentry SendMessageToRandomPeers
selects a subset of its peers by taking `Sqrt(len(peerInfos))`.
This behaviour limits the broadcast scope with a lot of peers, e.g. 100
becomes 10,
but is not great with very few peers, or if the message is very
important
to broadcast to everyone, which is the case of bor validator/proposer
nodes.
* send to all sentries in both BroadcastNewBlock and PropagateNewBlockHashes
* remove peerCountConstrained sqrt logic in SendMessageToRandomPeers
* add maxPeers provider func as a parameter to MultiClient
* default it to 10 for eth and 0 (unlimited) for bor validators
---------
Co-authored-by: Mark Holt <mark@distributed.vision>
This request is extending the devnet functionality to more fully handle
contract processing by adding support for the following calls:
* trace_call,
* trace_transaction
* debug_accountAt,
* eth_getCode
* eth_estimateGas
* eth_gasPrice
It also contains an initial rationalization of the devnet subscription
code to use the erigon client code directly rather than using its own
intermediate subscription management.
This is used to implement a general purpose block waiter - which can be
used in any scenario step - rather than being specific to transaction
processing.
This pull also contains an end to end tested sync processor for bor and
associated support services:
* Heimdall (supports sync event transfer)
* Faucet - allows the creation and funding of arbitary test specific
accounts (cross chain)
Notes and Caveats:
* Code generation for contracts requires `--evm-version paris`. For
chains which don't support push0 for solc over 0.8.19
* The bor log processing post the application of sync events causes a
panic - this will be the subject of a seperate smaller push as it is not
devnet specific
* The bor code seems to make repeated calls for the same sync events and
also reverts requests - this needs further investigation. This is the
behaviour of the current implementation and may be required - although
it does seem to generate repeat processing - which could be avoided.
An update to the devnet to introduce a local heimdall to facilitate
multiple validators without the need for an external process, and hence
validator registration/staking etc.
In this initial release only span generation is supported.
It has the following changes:
* Introduction of a local grpc heimdall interface
* Allocation of accounts via a devnet account generator ()
* Introduction on 'Services' for the network config
"--chain bor-devnet --bor.localheimdall" will run a 2 validator network
with a local service
"--chain bor-devnet --bor.withoutheimdall" will sun a single validator
with no heimdall service as before
---------
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
This request implements the insertion of Bor ephemeral transactions into
snapshot indexes.
I does this by taking the block hash from the header index and passing
it to the transaction indexer to add an additional index entry per block
into the transaction hash -> block index.
The passed entries are currently contained in an in memory array which
is (32 * number of blocks / sprint size) bytes.
In addition to the functional code there is also an update to the
`dump_test.go` so that it runs `DumpBlocks` to exercise the indexing
code. To facilitate this the `InsertChain` method in `mock_sentry` has
been modified so that it can process >128 blocks.
The code in this request also includes additional bor/consensus code
with the following functions:
`CalculateSprint`
`CalculateSprintCount`
The first function is a modification of the code in erigon-lib so that
the sprints are numerically rather than lexically ordered. This code
should be migrated to erigon-lib and should have its sprint set
calculated once from its underlying map rather than this process being
repeated every calculation.
---------
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: ledgerwatch <akhounov@gmail.com>
Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com>
Co-authored-by: Giulio <giulio.rebuffo@gmail.com>
Changes summary:
- Continue with the gasLimit check skip in ``verifyHeader`` of
``merge.go`` for unless pre-merge block and blockGasLimitContract
present
- Refactor ``aura.go`` a bit
- Have ``sysCall`` method customized to be able to call state (contract)
at a parent (or any other) header state
This PR does the following things.
- Optimises the get span for bor function. The function was responsible
to fetch span from heimdall and store in cache. If not found, it would
iterate back (or front) depending on the last found span in cache. In
this iteration it would also fetch the span every time it moves front /
back which is not necessary at all. As we know the `spanLength` we can
leverage it to directly jump to the required span ID without fetching
all the intermediate ones.
- Adds a check for `number > 255` in validating producers from headers'
extra data with ones in span. As bor fetches this data from contract, it
used to give correct results for 0th span (i.e. 0-255 blocks) and hence
no error occurs. Erigon on the other hand directly uses span to get
producers for all blocks. Hence, the data in 0th span turns out to be
wrong (as it's hardcoded in contract). We can skip validation for 0th
span blocks until we start fetching data from contract.
- As we're planning to use erigon as a validator, it will also be
responsible for preparing headers. It used to write all the validators
in the `header.Extra` field instead of just the selected producers. As
we have `GetCurrentProducers` function available now, we can use it
instead of `GetCurrentValidators`.
I realised that the term `checkpoint` is used in 2 different meanings in
the code, which are distinct. Renaming one of the to persistentSnapshots
to reduce confusion
This PR does the following things:
- Updates the hardfork number of the upcoming Indore hardfork schedule
at block 36877056.
- Refactoring to `CommitStates` method of bor consensus
- Fixes a bug in triggering mining
I've added a non root logger to bor.ValidatorSet validator set. This
creates a signature change on a number of calling functions to propagate
the logger. This is mostly constrained to the bor package but impacts a
number of tests and utilities which call the validators set.