113 Commits

Author SHA1 Message Date
milen
1a6b83b82c
borheimdall: add test for span persistence (#8988)
1. Adds an eth/stagedsync/test package which provides a test Harness
object
2. Adds the first automated test to the bor-heimdall stage regarding
span persistence (more to come in subsequent PRs)
3. Fixes a bug in the bor-heimdall stage which was uncovered with the
test - we do not fetch span 0 when we sync straight from blockNum=0
without snapshots
4. Reorganises all mocks to be placed under ./mock sub-package within
their respective packages
2023-12-14 22:50:59 +02:00
Mark Holt
85ade6b49a
Fix outstanding known header==nil errors + reduce bor heimdall logging (#8878)
This PR has fixes for a number of instances in the bor heimdall stage
where nil headers are either ignored or inadvertently processed.

It also has a demotion of milestone related logging messages to debug
for missing blocks because the process is not at the head of the chain +
a general reduction in periodic logging to 30 secs rather than 20 to
reduce the log output on long runs.

In addition there is a refactor of persistValidatorSets to perform
validator set initiation in a separate function. This is intended to
clarify the operation of persistValidatorSets - which is still performing
2 actions: persisting the snapshot and then using it to check the header
against the synthesized validator set in the snapshot.
2023-12-01 17:52:50 +00:00
milen
9b74cf0384
metrics: use prometheus histogram and summary interfaces (#8808) 2023-11-24 17:50:57 +00:00
milen
230b013096
metrics: separate usage of prometheus counter and gauge interfaces (#8793) 2023-11-24 16:15:12 +01:00
Pratik Patil
59909a7efe
Added TxDependency Metadata to ExtraData in Block Header in Bor for Block-STM (#8037)
This PR adds support to store the transaction dependency (generated by
the block producer) in the block header for bor. This transaction
dependency will then be used by the parallel processor
([Block-STM](https://github.com/ledgerwatch/erigon/pull/7812/)).

I have created another
[PR](https://github.com/ledgerwatch/erigon-lib/pull/1064) in the
erigon-lib repo which adds the `IsParallelUniverse()` function.
2023-11-24 10:26:33 +00:00
Alex Sharov
fdc75df6b5
Bor: increase client timeout from 5 to 10sec (to cover remote server case) (#8801)
I am using `https://heimdall-api-testnet.polygon.technology/` and it seems
a 5sec timeout is sometimes not enough - even though the remote service is
working well (the node is syncing well).

Most of the timeouts come from the same endpoint:
```
 [bor.heimdall] request canceled          reason="context deadline exceeded" path=/milestone/lastNoAck attempt=2
```
2023-11-23 16:32:30 +00:00
Alex Sharov
34b9a70b02
bor: add more context to error - to understand where it happened (#8811) 2023-11-23 16:31:45 +00:00
ledgerwatch
19451ac610
Return difficulty check into bor header validation (#8815) 2023-11-23 16:30:58 +00:00
Alex Sharov
43b8cbbdeb
bor: don't hide ctx.Err() (#8792)
log `ctx.Err()` - the context can be canceled for many reasons: timeout, etc...
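
As a rough illustration of why surfacing `ctx.Err()` helps, the sketch below wraps it with `%w` so the caller can tell a timeout from an explicit cancellation. `waitOrFail` and `demoErrors` are hypothetical names made up for this example, not functions from the bor code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitOrFail is a hypothetical helper: it blocks until ctx is done and
// returns ctx.Err() wrapped with extra context instead of hiding it.
func waitOrFail(ctx context.Context) error {
	<-ctx.Done()
	return fmt.Errorf("request canceled: %w", ctx.Err())
}

// demoErrors produces one error caused by a timeout and one caused by an
// explicit cancel, so the two reasons can be compared.
func demoErrors() (timeoutErr, cancelErr error) {
	timeoutCtx, stop := context.WithTimeout(context.Background(), 10*time.Millisecond)
	defer stop()
	timeoutErr = waitOrFail(timeoutCtx)

	cancelCtx, cancel := context.WithCancel(context.Background())
	cancel() // cancel immediately, before waiting
	cancelErr = waitOrFail(cancelCtx)
	return timeoutErr, cancelErr
}

func main() {
	timeoutErr, cancelErr := demoErrors()
	fmt.Println(timeoutErr) // request canceled: context deadline exceeded
	fmt.Println(cancelErr)  // request canceled: context canceled
}
```

Because the error is wrapped rather than replaced, `errors.Is(err, context.DeadlineExceeded)` would also work on the caller side.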
2023-11-22 09:27:50 +07:00
Alex Sharov
f476fe690f
bor: check nil-blocks in other places (#8788) 2023-11-20 12:46:32 +07:00
Alex Sharov
2f17848b76
bor: logs prefix, grep-friendly (#8787) 2023-11-20 12:16:06 +07:00
Mark Holt
f3ce5f8a36
Bor proofgen tests (#8751)
Added initial proof generation tests for polygon reverse flow for devnet

Blocks tested, receipts need trie proof clarification
2023-11-17 10:41:45 +00:00
ledgerwatch
1185587b20
Move validator set snapshot computation to bor_heimdall stage (#8646)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-11-06 08:24:33 +00:00
ledgerwatch
138dceb639
Make some functions in bor exportable (no-op) (#8650) 2023-11-04 12:54:31 +00:00
ledgerwatch
a77e33e7c4
Introduce extra functions for BorSpans (no-op) (#8648) 2023-11-04 10:59:07 +00:00
Andrew Ashikhmin
8288dcb5c2
erigon-lib: remove unused constants from protocol.go (#8644) 2023-11-03 10:20:36 +01:00
battlmonstr
3698e7f476
devnet: configuration fixes (#8592)
* fix "genesis hash does not match" when dev nodes connect  
The "dev" nodes need to have the same --miner.etherbase in order to
generate the same genesis ExtraData by DeveloperGenesisBlock(). Override
DevnetEtherbase global var that's used if --miner.etherbase is not
passed. (for NonBlockProducer case)

* fix missing private key for the hardcoded DevnetEtherbase  
Fixes panic if SigKey is not found. Bor non-producers will use a default
`DevnetEtherbase` while Dev nodes modify it. Save hardcoded
DevnetEtherbase/DevnetSignPrivateKey into accounts so that SigKey can
recover it.

* refactor devnet.node to contain Node config  
This avoids interface{} type casts and fixes an error with
Heimdall.validatorSet == nil

* add connection retries to rpcCall and Subscribe of requestGenerator  
Fixes "connection refused" errors due to node not ready to handle early
RPC requests.

* fix deadlock in Heimdall.NodeStarted

* fix GetBlockByNumber
Fixes "cannot unmarshal string into Go struct field body.transactions of
type jsonrpc.RPCTransaction"

* demote "no of blocks on childchain is less than confirmations
required" to Info (#8626)

* demote "mismatched pending subpool size" to Debug (#8615)

* revert wiggle testing code
2023-11-01 11:08:47 +01:00
Andrew Ashikhmin
38e91809f9
Revert "Move validator set snapshot computation to bor_heimdall stage… (#8580)
PR #8202 might cause Issue #8550, so reverting it until Alexey's return.

This reverts commit 2ce98f8337f15a07424b43f939def47d7a546778.
2023-10-25 14:02:31 +02:00
Andrew Ashikhmin
a226b6ca29
Fix wiring of AgraBlock into tx pool (#8555)
Fixes and simplifications to PR #8504
2023-10-23 11:03:46 +02:00
a
436493350e
Sentinel refactor (#8296)
1. changes sentinel to use an http-like interface

2. moves hexutil, crypto/blake2b, metrics packages to erigon-lib
2023-10-22 01:17:18 +02:00
Anshal Shukla
7dce1268ab
Agra HF (#8504)
Adds agra HF to the bor consensus
2023-10-21 01:16:19 +05:30
ledgerwatch
2ce98f8337
Move validator set snapshot computation to bor_heimdall stage (#8202)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-10-20 18:31:00 +01:00
Alex Sharov
3ac9f493b6
move chainname and snapcfg packages to erigon-lib (#8508) 2023-10-18 13:37:39 +07:00
Alex Sharov
21ebaab208
bor initFrozenSnapshot: parallel erecover (#8488)
On a 16-core machine, mumbai's initFrozenSnapshot took 10min (up to block 38M)
2023-10-16 13:05:10 +01:00
battlmonstr
757a91c44d
sync: fix a memory leak when header verification fails (#8431)
If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
2023-10-14 08:39:43 +07:00
Mark Holt
7b3570c019
Add block producer progress check (#8447)
Add a check against the inbound headers before publishing a newly mined
block after the wait delay.

If the node received a block while it was processing transactions, or
waiting for its publish slot, do a final check that another node hasn't
already published a block.
2023-10-12 19:08:05 +01:00
Mark Holt
6f7186e0f4
Fix invalid pre-fetched header broadcast (#8442)
Fixes an issue with Polygon validators where locally mined blocks are
broadcast with invalid header hashes because the NewBlock message
constructor was removing the ReceiptHash which contributed to the header
hash.

This results in the bor header validation code not being able to
correctly identify the signer of the header - so header validation
fails.

This also likely fixes part of the bogon-block issue which was
identified by the polygon team.
2023-10-12 08:27:02 +01:00
Mark Holt
0d190ff9e9
Bor rpc config fix (#8413)
This is an additional fix for BorRo to add bor config in the constructor
- otherwise code which accesses chain config will panic.
2023-10-10 15:26:02 +01:00
Anshal Shukla
076dc33232
move borfinality package out of eth (#8407)
- Move borfinality out of eth package
- Adds nil pointer check in bor_verifier
2023-10-09 19:13:31 +01:00
Mark Holt
ca3ad096e1
Bor fix rpcdaemon engine initialization (#8390)
This fixes 2 related issues:

* Now that the bor consensus engine is required for queries it can't be
created based on the presence of a db directory, but must be based on
chain config read from the db. Using the DB presence causes Bor to get
instantiated for non bor chains which breaks.
* At the moment eth_calls on a remote daemon don't check Bor headers
prior to calling the EVM code as it was just using a fake ETHash
instance - which performs ETH header validation only.

The current version is mostly working but needs adapting to perform lazy
initialization of the engine.
2023-10-06 11:58:08 +01:00
Andrew Ashikhmin
0bd6d77acd
Remove CalcuttaBlock in favour of BlockAlloc (#8371)
System contract upgrades for Polygon are already handled by the
`BlockAlloc` logic and there's no need to duplicate it with the
`CalcuttaBlock` logic (there's no Calcutta in
https://github.com/maticnetwork/bor).
2023-10-05 15:39:57 +02:00
Manav Darji
2d0e091a6e
eth, consensus/bor: handle 503 response from heimdall (#8364)
When a new feature (like for the upcoming `Aalborg` hard fork) for
Polygon hasn't kicked in (in heimdall), the endpoints of heimdall will
now return 503 (Service Unavailable) status code. This PR makes sure
that erigon handles that code separately and doesn't keep retrying to
fetch info. It also acts as a notifier of the HF in erigon.

Similar reference PR in bor:
https://github.com/maticnetwork/bor/pull/1023
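
The behaviour described above can be sketched as follows. This is a minimal illustration, not the erigon implementation: `fetchWithRetry`, `ErrServiceUnavailable`, `stubClient` and the `heimdall.example` host are all hypothetical, and a stub transport stands in for a real Heimdall endpoint so that no network access is needed.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrServiceUnavailable is a hypothetical sentinel error for this sketch.
var ErrServiceUnavailable = errors.New("heimdall: service unavailable")

// roundTripFunc lets a plain function act as an http.RoundTripper.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// stubClient returns a client whose every response has the given status code.
func stubClient(status int) *http.Client {
	return &http.Client{Transport: roundTripFunc(func(r *http.Request) (*http.Response, error) {
		return &http.Response{StatusCode: status, Body: http.NoBody, Header: make(http.Header), Request: r}, nil
	})}
}

// fetchWithRetry retries failed requests up to maxAttempts times, but gives
// up immediately on a 503. It returns the number of attempts made.
func fetchWithRetry(client *http.Client, url string, maxAttempts int) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		resp, err := client.Get(url)
		if err != nil {
			lastErr = err
			continue
		}
		resp.Body.Close()
		switch resp.StatusCode {
		case http.StatusServiceUnavailable:
			// Feature not active yet: retrying will not help.
			return attempt, ErrServiceUnavailable
		case http.StatusOK:
			return attempt, nil
		default:
			lastErr = fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
	}
	return maxAttempts, lastErr
}

func main() {
	url := "http://heimdall.example/milestone/lastNoAck" // placeholder host
	attempts, err := fetchWithRetry(stubClient(http.StatusServiceUnavailable), url, 5)
	fmt.Println(attempts, errors.Is(err, ErrServiceUnavailable)) // 1 true

	attempts, err = fetchWithRetry(stubClient(http.StatusInternalServerError), url, 3)
	fmt.Println(attempts, errors.Is(err, ErrServiceUnavailable)) // 3 false
}
```

Returning a distinct sentinel lets the caller treat "feature not yet active" differently from a transient failure.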
2023-10-04 13:43:07 +01:00
Mark Holt
3d6d2a7c25
Added fix to allow getroothash to work with no api running (#8342)
Whitelisting calculation of the roothash should not be dependent on the
bor api running. This will not always be the case, for example when
erigon is configured with a separate rpc daemon.

To fix this the calculation has been moved to Bor.

Additionally the redundant Bor API code has been removed as this is not
called by any code and the functionality looks to have migrated to the
turbo/jsonrpc package.
2023-10-02 18:55:31 +01:00
Mark Holt
0bdca6c457
Metrics label fixes (#8339)
Fixes for label discrepancies in collector for summaries etc which have
a template which includes a quantile.

Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
2023-10-02 17:19:02 +01:00
Mark Holt
f99f326363
Bor fix frozen snapshot load (#8305)
This is a fix for at least one cause of this issue:
https://github.com/ledgerwatch/erigon/issues/8212.

It happens after the end of the snapshot load, because the snapshot
processing which was introduced a couple of months ago does not deal with
validation of the headers at the start of the chain.

I have also added a fix to the persistence so that the last snapshot is
recorded so that subsequent runs are not forced to process the whole
snapshot run from start.

The relationship between this and memory usage is that headers not being
processed leads to a queue of pending headers with a size of around 5GB.
I have not changed any header parameters - so a prolonged stop to header
processing will likely result in a similar level of memory growth.
2023-09-27 13:45:09 +01:00
Mark Holt
f26c7b389e
Bor break loop on rewind (#8302)
Add code to the headers state to break processing if a bor milestone
rewind is detected.

The rewind processing happens in the bor/heimdall stage - this change
just avoids unnecessary header loading
if a milestone fork is likely to be detected

---------

Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-09-27 13:17:54 +01:00
Andrew Ashikhmin
d8d16c3c7c
Fix Errorf (#8241)
Small fix after PR #8239
2023-09-19 17:15:23 +02:00
Mark Holt
f3902ef589
Add flag test to prevent milestone services from starting (#8239) 2023-09-19 14:10:09 +01:00
Mark Holt
3b45f53f3d
Milestone stage processing (#8187)
This is the second part of the bor milestone release. It contains the
following changes:

* Initialize services
* This is a change from the initial pull request: I have moved all of the
initialization to the bor engine. To facilitate this I have just passed
in the heimdall client interface, rather than the whole engine
* Stage processing 
* This is also a change from the original PR - the code is contained in
the bor heimdall stage rather than in headers - the effect should be the
same, but this needs testing

---------

Co-authored-by: Mark Holt <mark@disributed.vision>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-09-18 18:05:33 +01:00
Mark Holt
33d8c08c1c
Get vote on hash (#8172)
This is the initial merge for polygon milestones. It implements an rpc
call used by heimdall but does not directly impact any chain processing
2023-09-13 11:49:49 +01:00
Mark Holt
8ea0096d56
moved metrics sub packages types to metrics (#8119)
This is a non functional change which consolidates the various packages
under metrics into the top level package now that the dead code is
removed.

It is a precursor to the removal of Victoria metrics after which all
erigon metrics code will be contained in this single package.
2023-09-03 08:09:27 +07:00
Mark Holt
a4cfbe0d56
Heimdall metrics + Metrics HTTP server rationalization (#8094)
This is an update of:

https://github.com/ledgerwatch/erigon/pull/7846

which uses a local fork of victoria metrics to include the changes that
https://github.com/anshalshukla added to the original fork we were
using.

It also includes code to address the duplicate metrics issue identified
here:

https://github.com/ledgerwatch/erigon/issues/8053

It has one more associated fix, which is to correctly add a metadata
label to counters; these were previously labelled as gauges.

e.g. 

```
# TYPE p2p_peers counter
p2p_peers 0
```
rather than

```
# TYPE p2p_peers gauge
p2p_peers 0
```

---------

Co-authored-by: Anshal Shukla <53994948+anshalshukla@users.noreply.github.com>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-08-31 09:04:27 +01:00
Andrew Ashikhmin
9b63764b16
Move ApplyDAOHardFork & UpgradeBuildInSystemContract to engine.Initialize (#8095)
Now all protocol-stipulated changes at the beginning of the block (AuRa
stuff,
[DAO](https://github.com/ethereum/execution-specs/blob/master/network-upgrades/mainnet-upgrades/dao-fork.md)
irregular state change, Calcutta system contract upgrade,
[EIP-4788](https://eips.ethereum.org/EIPS/eip-4788) beacon root) are
handled by consensus engine `Initialize()`.
2023-08-30 15:51:19 +02:00
Mark Holt
f05a6ab43e
Bor mining benchmark (#8096)
Replacement for: https://github.com/ledgerwatch/erigon/pull/7998 with
windows fixes

---------

Co-authored-by: SHIVAM SHARMA <shivam691999@gmail.com>
2023-08-30 10:25:02 +01:00
Alex Sharov
e5cde45936
[wip]: test non-nil compress.Next (#8072)
Co-authored-by: Mark Holt <mark@distributed.vision>
2023-08-25 12:53:05 +01:00
Mark Holt
c51573f333
Bor eth event flow (#8068)
Implemented polygon->eth flow
2023-08-25 12:19:39 +01:00
Andrew Ashikhmin
1fd9d20e14
EIP-4788 v2 (no precompile) (#8038)
See https://github.com/ethereum/EIPs/pull/7456 &
https://github.com/ethereum/go-ethereum/pull/27849. Also set the gas
limit for system calls to 30M (previously 2^64-1), which is in line with
the [Gnosis
spec](https://github.com/gnosischain/specs/blob/master/execution/withdrawals.md#specification),
but should be double-checked for Gnosis Chain.
2023-08-24 17:10:50 +02:00
battlmonstr
2e29ff33e1
bor: BroadcastNewBlock to all peers from validator nodes (#8030)
Currently PropagateNewBlockHashes and BroadcastNewBlock
select a subset of all sentries by taking a `Sqrt(len(sentries))`,
and then for each sentry SendMessageToRandomPeers
selects a subset of its peers by taking `Sqrt(len(peerInfos))`.

This behaviour limits the broadcast scope when there are a lot of peers
(e.g. 100 becomes 10), but is not great with very few peers, or if the
message is very important to broadcast to everyone, which is the case
for bor validator/proposer nodes.

* send to all sentries in both BroadcastNewBlock and PropagateNewBlockHashes
* remove peerCountConstrained sqrt logic in SendMessageToRandomPeers
* add maxPeers provider func as a parameter to MultiClient
* default it to 10 for eth and 0 (unlimited) for bor validators
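
The difference between the two selection rules can be sketched like this. It is only an illustration under stated assumptions: `sqrtSubset` and `cappedSubset` are hypothetical names, rounding the square root down is an assumption, and treating `maxPeers` as a simple upper bound (with 0 meaning unlimited) is one reading of the bullets above.

```go
package main

import (
	"fmt"
	"math"
)

// sqrtSubset models the old behaviour: pick roughly sqrt(n) of n peers
// (rounded down in this sketch).
func sqrtSubset(n int) int {
	return int(math.Sqrt(float64(n)))
}

// cappedSubset models the new behaviour as assumed here: send to at most
// maxPeers of n peers, where maxPeers == 0 means "no limit".
func cappedSubset(n, maxPeers int) int {
	if maxPeers == 0 || n < maxPeers {
		return n
	}
	return maxPeers
}

func main() {
	fmt.Println(sqrtSubset(100))       // 10
	fmt.Println(sqrtSubset(3))         // 1: most of a small peer set is skipped
	fmt.Println(cappedSubset(100, 10)) // 10: the default for eth
	fmt.Println(cappedSubset(3, 10))   // 3: small peer sets are fully covered
	fmt.Println(cappedSubset(100, 0))  // 100: unlimited, as for bor validators
}
```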

---------

Co-authored-by: Mark Holt <mark@distributed.vision>
2023-08-23 14:28:39 +02:00
ledgerwatch
6b6c0caad0
Snapshots of Bor events (#7901)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp-2.home>
2023-08-18 17:10:35 +01:00
Andrew Ashikhmin
03927d3e27
Call InitializeBlockExecution in SpawnMiningExecStage (EIP-4788) (#7999)
This fixes the trie state root issue that was occurring in the Hive
tests for Cancun.
2023-08-11 14:04:53 +02:00