Commit Graph

482 Commits

Author SHA1 Message Date
milen
98cc1ee808
stagedsync: implement bor span for chain reader and fix loggers (#9146)
While working on fixing the bor mining loop I stumbled across an error
in `ChainReader.BorSpan` - not implemented panic. Also hit a few other
panics due to missed logger in `ChainReaderImpl` struct initialisations.
This PR fixes both.
2024-01-05 14:20:21 +00:00
battlmonstr
b57cbdcff7
polygon/sync: canonical chain builder (#9117) 2024-01-04 10:44:57 +01:00
Mark Holt
19bc328a07
Added db loggers to all db callers and fixed flag settings (#9099)
Mdbx now takes a logger - but this has not been pushed to all callers -
meaning it had an invalid logger

This fixes the log propagation.

It also fixed a start-up issue for http.enabled and txpool.disable
created by a previous merge
2023-12-31 17:10:08 +07:00
milen
b562eff482
heimdall: better error logging for clerk/event-record/list nil response (#9103)
Users reported this error
```
[bor.heimdall] an error while trying fetching path=clerk/event-record/list attempt=5 error="unexpected end of JSON input"
```

Which may happen if:

1. Heimdall is behind and not sync-ed - for more info check
https://github.com/maticnetwork/heimdall/pull/993
2. Or the header time erigon is sending is far into the future

The logs in this PR will help us see which of the 2 is the culprit but
most likely it is 1. We will investigate further 2. if it ever happens.

Changes:
1. Improves logging upon heimdall client retries - prints out the full
url that failed.
2. Fixes a bug where the body was incorrectly checked if it is empty -
`len(body) == 0` vs `body == nil`
3. Unit test for the bug regression
4. Adds a log to indicate to users to check their heimdall process if
they run into this scenario since that may be the culprit


Example output with new logs
<img width="1465" alt="Screenshot 2023-12-29 at 20 16 57"
src="https://github.com/ledgerwatch/erigon/assets/94537774/1ebfde68-aa93-41d6-889a-27bef5414f25">
2023-12-30 11:23:25 +00:00
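The `len(body) == 0` versus `body == nil` distinction mentioned in #9103 can be illustrated with a minimal sketch. The commit message does not say which check was the old one, so the helper `isEmptyBody` below is a hypothetical name and not the erigon code; it only shows that the length check covers both a nil slice and an empty non-nil slice, and that unmarshalling an empty body produces the error users reported.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isEmptyBody reports whether a response body carries no bytes.
// Hypothetical helper for illustration: len(body) == 0 is true for both a
// nil slice and an empty non-nil slice, whereas body == nil is true only
// for the nil slice.
func isEmptyBody(body []byte) bool {
	return len(body) == 0
}

func main() {
	var nilBody []byte
	emptyBody := []byte{}

	fmt.Println(nilBody == nil, isEmptyBody(nilBody))     // true true
	fmt.Println(emptyBody == nil, isEmptyBody(emptyBody)) // false true

	// An empty body that slips past a nil-only check fails during decoding.
	var v map[string]interface{}
	err := json.Unmarshal(emptyBody, &v)
	fmt.Println(err) // unexpected end of JSON input
}
```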
milen
fc9dae1783
heimdall: add max retries to heimdall client (#9098)
Corresponds to the client fix in this PR description -
https://github.com/ledgerwatch/erigon/pull/9096#issue-2058506765
2023-12-28 17:57:44 +00:00
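A bounded retry loop of the kind #9098 describes might look like the sketch below. All names here (`fetchWithRetry`, `failUntil`, `errMaxRetries`) are assumptions made for illustration and are not taken from the heimdall client; a real client would also wait between attempts and honour context cancellation.

```go
package main

import (
	"errors"
	"fmt"
)

// errMaxRetries is a hypothetical sentinel returned once the budget is spent.
var errMaxRetries = errors.New("max retries exceeded")

// fetchWithRetry calls fetch until it succeeds or maxRetries attempts have
// been made. It returns the number of attempts used.
func fetchWithRetry(maxRetries int, fetch func(attempt int) error) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if lastErr = fetch(attempt); lastErr == nil {
			return attempt, nil
		}
	}
	return maxRetries, fmt.Errorf("%w: last error: %v", errMaxRetries, lastErr)
}

// failUntil returns a fetch function that fails on every attempt before n.
func failUntil(n int) func(int) error {
	return func(attempt int) error {
		if attempt < n {
			return errors.New("temporary failure")
		}
		return nil
	}
}

func main() {
	attempts, err := fetchWithRetry(5, failUntil(3))
	fmt.Println(attempts, err) // 3 <nil>

	attempts, err = fetchWithRetry(5, failUntil(10))
	fmt.Println(attempts, errors.Is(err, errMaxRetries)) // 5 true
}
```

Compared with retrying forever, the caller gets an error it can act on instead of blocking indefinitely.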
milen
f8cc27aebd
heimdall: use span id as naming (#9097)
follow up on naming as suggested here
https://github.com/ledgerwatch/erigon/pull/9096#pullrequestreview-1798218317
2023-12-28 17:49:31 +00:00
milen
1f237c0aaf
borheimdall: only fetch next span when in last sprint of current span (#9096)
Heimdall prepares the next span a number of sprints before the current
span ends. Currently we always fetch the next span regardless of which
sprint we are in during the current span. This causes a liveness issue
due to how the Heimdall client works (it infinitely retries until it
fetches a span - this issue will be fixed in a separate PR). This PR
fixes this by matching what bor does - it fetches the next span only in
the last sprint of the current span.

Changes:

- Adds a unit test for the above
- Adds a new function BlockInLastSprintOfSpan
- Some code reorg and cleanup - moves the span num related functions
from the bor package to the span sub package for better logical grouping
2023-12-28 15:52:49 +00:00
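The "last sprint of the current span" rule from #9096 can be sketched as a simple range check. This is an illustration under stated assumptions, not the signature of erigon's `BlockInLastSprintOfSpan`: the helper name, its parameters, the inclusive span end and the numeric values are all hypothetical.

```go
package main

import "fmt"

// inLastSprintOfSpan reports whether blockNum falls within the final
// sprintLen blocks of a span whose last block is spanEnd (inclusive).
// Hypothetical helper; parameters are assumptions for illustration.
func inLastSprintOfSpan(blockNum, spanEnd, sprintLen uint64) bool {
	if blockNum > spanEnd {
		return false
	}
	// Written as an addition to avoid uint64 underflow when sprintLen > spanEnd.
	return blockNum+sprintLen > spanEnd
}

func main() {
	const spanEnd, sprintLen = 6655, 16 // illustrative values only
	for _, b := range []uint64{6639, 6640, 6655, 6656} {
		fmt.Printf("block %d: fetch next span = %v\n", b, inLastSprintOfSpan(b, spanEnd, sprintLen))
	}
}
```

With these values only blocks 6640 through 6655 trigger a fetch of the next span.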
milen
67704871c0
borheimdall: add tests for validator set and selected proposers validation (#9089)
Adds unit tests for:
- Bor Heimdall Stage - `checkHeaderExtraData`
- at end of each sprint verifies that the validators in the header extra
data matches the selected proposers from the heimdall span
   - 1 test for selected proposers length mismatch
   - 1 test for selected proposers bytes mismatch
- BorHeimdall Stage - `persistValidatorSets`
- verifies that each header is created by a validator in the validator
set
   - in such situation we set the unwind point
2023-12-28 14:00:09 +00:00
Mark Holt
a3a61701e2
Allow proxy paths in Heimdall URL (#8940)
Add paths to the heimdall config URL when creating calls so that extra
paths needed by, for example, proxy servers are not stripped from the flag
value passed into the process.
2023-12-22 10:48:25 +00:00
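One way to keep a proxy path prefix when building request URLs, as #8940 describes, is to join the endpoint onto the base path rather than overwrite it. The sketch below is an assumption about the general technique; `makeURL` and the example hosts are hypothetical and not taken from the erigon code.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// makeURL joins an endpoint path onto a base URL, keeping any path prefix
// that the base already carries (for example one required by a proxy).
// Hypothetical helper for illustration.
func makeURL(base, endpoint string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	// Assigning u.Path = endpoint directly would drop the prefix.
	u.Path = path.Join("/", u.Path, endpoint)
	return u.String(), nil
}

func main() {
	for _, base := range []string{"http://localhost:1317", "http://proxy.example/heimdall"} {
		full, err := makeURL(base, "clerk/event-record/list")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(full)
	}
}
```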
battlmonstr
55d37b938c
bor: spanID calculation refactoring (#9040) 2023-12-21 09:52:00 +01:00
battlmonstr
2760eeb961
polygon: astrid sync heimdall wrapper (#9017) 2023-12-20 16:48:37 +01:00
milen
1a6b83b82c
borheimdall: add test for span persistence (#8988)
1. Adds an eth/stagedsync/test package which provides a test Harness
object
2. Adds the first automated test to the bor-heimdall stage regarding
span persistence (more to come in subsequent PRs)
3. Fixes a bug in the bor-heimdall stage which was uncovered with the
test - we do not fetch span 0 when we sync straight from blockNum=0
without snapshots
4. Reorganises all mocks to be placed under ./mock sub-package within
their respective packages
2023-12-14 22:50:59 +02:00
Mark Holt
85ade6b49a
Fix outstanding known header==nil errors + reduce bor heimdall logging (#8878)
This PR has fixes for a number of instances in the bor heimdall stage
where nil headers are either ignored or inadvertently processed.

It also has a demotion of milestone related logging messages to debug
for missing blocks because the process is not at the head of the chain +
a general reduction in periodic logging to 30 secs rather than 20 to
reduce the log output on long runs.

In addition there is a refactor of persistValidatorSets to perform
validator set initiation in a separate function. This is intended to
clarify the operation of persistValidatorSets - which is still performing
2 actions, persisting the snapshot and then using it to check the header
against the synthesized validator set in the snapshot.
2023-12-01 17:52:50 +00:00
milen
9b74cf0384
metrics: use prometheus histogram and summary interfaces (#8808) 2023-11-24 17:50:57 +00:00
milen
230b013096
metrics: separate usage of prometheus counter and gauge interfaces (#8793) 2023-11-24 16:15:12 +01:00
Pratik Patil
59909a7efe
Added TxDependency Metadata to ExtraData in Block Header in Bor for Block-STM (#8037)
This PR adds support to store the transaction dependency (generated by
the block producer) in the block header for bor. This transaction
dependency will then be used by the parallel processor
([Block-STM](https://github.com/ledgerwatch/erigon/pull/7812/)).

I have created another
[PR](https://github.com/ledgerwatch/erigon-lib/pull/1064) in the
erigon-lib repo which adds the `IsParallelUniverse()` function.
2023-11-24 10:26:33 +00:00
Alex Sharov
fdc75df6b5
Bor: increase client timeout from 5 to 10sec (to cover remote server case) (#8801)
I am using `https://heimdall-api-testnet.polygon.technology/` and it seems
a 5sec timeout is sometimes not enough, even though the remote service is
working well (node syncing well)

most timeouts come from the same endpoint:
```
 [bor.heimdall] request canceled          reason="context deadline exceeded" path=/milestone/lastNoAck attempt=2
```
2023-11-23 16:32:30 +00:00
Alex Sharov
34b9a70b02
bor: add more context to error - to understand where it happened (#8811) 2023-11-23 16:31:45 +00:00
ledgerwatch
19451ac610
Return difficulty check into bor header validation (#8815) 2023-11-23 16:30:58 +00:00
Alex Sharov
43b8cbbdeb
bor: don't hide ctx.Err() (#8792)
log `ctx.Err()` - it can be canceled for many reasons: timeout, etc...
2023-11-22 09:27:50 +07:00
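The point of #8792, that `ctx.Err()` says why a context ended, can be shown with the standard `context` package. The helper names below (`describeCtxErr`, `expiredCtx`, `canceledCtx`) are hypothetical and only serve the illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// describeCtxErr turns ctx.Err() into a short reason suitable for a log line.
func describeCtxErr(ctx context.Context) string {
	err := ctx.Err()
	switch {
	case err == nil:
		return "active"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return err.Error()
	}
}

// expiredCtx returns a context whose deadline has already passed.
func expiredCtx() context.Context {
	ctx, cancel := context.WithDeadline(context.Background(), time.Now().Add(-time.Second))
	cancel() // releases resources; the first error (deadline exceeded) is kept
	return ctx
}

// canceledCtx returns a context that was canceled explicitly.
func canceledCtx() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	fmt.Println(describeCtxErr(context.Background())) // active
	fmt.Println(describeCtxErr(expiredCtx()))         // timeout
	fmt.Println(describeCtxErr(canceledCtx()))        // canceled
}
```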
Alex Sharov
f476fe690f
bor: check nil-blocks in other places (#8788) 2023-11-20 12:46:32 +07:00
Alex Sharov
2f17848b76
bor: logs prefix, grep-friendly (#8787) 2023-11-20 12:16:06 +07:00
Mark Holt
f3ce5f8a36
Bor proofgen tests (#8751)
Added initial proof generation tests for polygon reverse flow for devnet

Blocks tested, receipts need trie proof clarification
2023-11-17 10:41:45 +00:00
ledgerwatch
1185587b20
Move validator set snapshot computation to bor_heimdall stage (#8646)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-11-06 08:24:33 +00:00
ledgerwatch
138dceb639
Make some functions in bor exportable (no-op) (#8650) 2023-11-04 12:54:31 +00:00
ledgerwatch
a77e33e7c4
Introduce extra functions for BorSpans (no-op) (#8648) 2023-11-04 10:59:07 +00:00
Andrew Ashikhmin
8288dcb5c2
erigon-lib: remove unused constants from protocol.go (#8644) 2023-11-03 10:20:36 +01:00
battlmonstr
3698e7f476
devnet: configuration fixes (#8592)
* fix "genesis hash does not match" when dev nodes connect  
The "dev" nodes need to have the same --miner.etherbase in order to
generate the same genesis ExtraData by DeveloperGenesisBlock(). Override
DevnetEtherbase global var that's used if --miner.etherbase is not
passed. (for NonBlockProducer case)

* fix missing private key for the hardcoded DevnetEtherbase  
Fixes panic if SigKey is not found. Bor non-producers will use a default
`DevnetEtherbase` while Dev nodes modify it. Save hardcoded
DevnetEtherbase/DevnetSignPrivateKey into accounts so that SigKey can
recover it.

* refactor devnet.node to contain Node config  
This avoids interface{} type casts and fixes an error with
Heimdall.validatorSet == nil

* add connection retries to rpcCall and Subscribe of requestGenerator  
Fixes "connection refused" errors due to node not ready to handle early
RPC requests.

* fix deadlock in Heimdall.NodeStarted

* fix GetBlockByNumber
Fixes "cannot unmarshal string into Go struct field body.transactions of
type jsonrpc.RPCTransaction"

* demote "no of blocks on childchain is less than confirmations
required" to Info (#8626)

* demote "mismatched pending subpool size" to Debug (#8615)

* revert wiggle testing code
2023-11-01 11:08:47 +01:00
Andrew Ashikhmin
38e91809f9
Revert "Move validator set snapshot computation to bor_heimdall stage… (#8580)
PR #8202 might cause Issue #8550, so reverting it until Alexey's return.

This reverts commit 2ce98f8337.
2023-10-25 14:02:31 +02:00
Andrew Ashikhmin
a54939633e
Numerical instead of lexicographic sorting in borKeyValueConfigHelper (#8560)
Corresponds to Item 3 of https://github.com/maticnetwork/bor/pull/1055
2023-10-23 14:11:32 +02:00
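The difference between lexicographic and numerical ordering that #8560 addresses shows up as soon as block-number keys have different digit counts. The sketch below is a generic illustration; `sortNumerically` is a hypothetical helper and not the `borKeyValueConfigHelper` implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// sortNumerically parses decimal string keys and returns them as numbers in
// ascending order. Hypothetical helper for illustration.
func sortNumerically(keys []string) ([]uint64, error) {
	nums := make([]uint64, 0, len(keys))
	for _, k := range keys {
		n, err := strconv.ParseUint(k, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid block number key %q: %w", k, err)
		}
		nums = append(nums, n)
	}
	sort.Slice(nums, func(i, j int) bool { return nums[i] < nums[j] })
	return nums, nil
}

func main() {
	keys := []string{"80", "0", "1000"}

	lex := append([]string(nil), keys...)
	sort.Strings(lex)
	fmt.Println(lex) // [0 1000 80]

	num, err := sortNumerically(keys)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(num) // [0 80 1000]
}
```

With string sorting, a value configured for block 1000 would be ordered before the one for block 80, which is the kind of mismatch numerical sorting avoids.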
Andrew Ashikhmin
a226b6ca29
Fix wiring of AgraBlock into tx pool (#8555)
Fixes and simplifications to PR #8504
2023-10-23 11:03:46 +02:00
a
436493350e
Sentinel refactor (#8296)
1. changes sentinel to use an http-like interface

2. moves hexutil, crypto/blake2b, metrics packages to erigon-lib
2023-10-22 01:17:18 +02:00
Anshal Shukla
7dce1268ab
Agra HF (#8504)
Adds agra HF to the bor consensus
2023-10-21 01:16:19 +05:30
ledgerwatch
2ce98f8337
Move validator set snapshot computation to bor_heimdall stage (#8202)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-10-20 18:31:00 +01:00
Alex Sharov
3ac9f493b6
move chainname and snapcfg packages to erigon-lib (#8508) 2023-10-18 13:37:39 +07:00
Alex Sharov
21ebaab208
bor initFrozenSnapshot: parallel erecover (#8488)
on 16-core mumbai's initFrozenSnapshot took 10min (to 38M block)
2023-10-16 13:05:10 +01:00
battlmonstr
757a91c44d
sync: fix a memory leak when header verification fails (#8431)
If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
2023-10-14 08:39:43 +07:00
Andrew Ashikhmin
b60642fa5a
Configure EIP-4844 parameters for Gnosis (#8464)
See https://github.com/gnosischain/specs/pull/20 &
https://github.com/gnosischain/specs/pull/24
2023-10-13 11:43:16 +02:00
Mark Holt
7b3570c019
Add block producer progress check (#8447)
Add a check against the inbound headers before publishing a newly mined
block after the wait delay.

If the node received a block while it was processing transactions, or
waiting for its publish slot, do a final check that another node hasn't
already published a block.
2023-10-12 19:08:05 +01:00
Mark Holt
6f7186e0f4
Fix invalid pre-fetched header broadcast (#8442)
Fixes an issue with Polygon validators where locally mined blocks are
broadcast with invalid header hashes because the NewBlock message
constructor was removing the ReceiptHash which contributed to the header
hash.

This results in the bor header validation code not being able to
correctly identify the signer of the header - so header validation
fails.

This also likely fixes part of the bogon-block issue which was
identified by the polygon team.
2023-10-12 08:27:02 +01:00
Mark Holt
0d190ff9e9
Bor rpc config fix (#8413)
This is an additional fix for BorRo to add bor config in the constructor
- otherwise code which accesses chain config will panic.
2023-10-10 15:26:02 +01:00
Anshal Shukla
076dc33232
move borfinality package out of eth (#8407)
- Move borfinality out of eth package
- Adds nil pointer check in bor_verifier
2023-10-09 19:13:31 +01:00
Mark Holt
ca3ad096e1
Bor fix rpcdaemon engine initialization (#8390)
This fixes 2 related issues:

* Now that the bor consensus engine is required for queries it can't be
created based on the presence of a db directory, but must be based on
chain config read from the db. Using the DB presence causes Bor to get
instantiated for non bor chains which breaks.
* At the moment eth_calls on a remote daemon don't check Bor headers
prior to calling the EVM code as it was just using a fake ETHash
instance - which performs ETH header validation only.

The current version is mostly working but needs adapting to perform lazy
initialization of the engine.
2023-10-06 11:58:08 +01:00
Giulio rebuffo
2294c8c66c
EthereumExecutionService in MockSentry (#8373)
Now we use the ethereum execution service directly:

* Changed sig of InsertChain
* Use of the service in case of PoS
2023-10-05 18:30:19 +02:00
Andrew Ashikhmin
0bd6d77acd
Remove CalcuttaBlock in favour of BlockAlloc (#8371)
System contract upgrades for Polygon are already handled by the
`BlockAlloc` logic and there's no need to duplicate it with the
`CalcuttaBlock` logic (there's no Calcutta in
https://github.com/maticnetwork/bor).
2023-10-05 15:39:57 +02:00
Manav Darji
2d0e091a6e
eth, consensus/bor: handle 503 response from heimdall (#8364)
When a new feature (like for the upcoming `Aalborg` hard fork) for
Polygon hasn't kicked in (in heimdall), the endpoints of heimdall will
now return 503 (Service Unavailable) status code. This PR makes sure
that erigon handles that code separately and doesn't keep retrying to
fetch info. It also acts as a notifier of the HF in erigon.

Similar reference PR in bor:
https://github.com/maticnetwork/bor/pull/1023
2023-10-04 13:43:07 +01:00
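Treating 503 separately from other failures, as #8364 describes, amounts to classifying the status code before deciding whether to retry. The sketch below assumes a simple three-way split; `shouldRetry` and `errServiceUnavailable` are hypothetical names and the actual erigon handling may differ.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errServiceUnavailable is a hypothetical sentinel signalling that the
// endpoint exists but the feature behind it is not active yet.
var errServiceUnavailable = errors.New("service unavailable")

// shouldRetry maps an HTTP status code to a retry decision.
func shouldRetry(statusCode int) (bool, error) {
	switch statusCode {
	case http.StatusOK:
		return false, nil
	case http.StatusServiceUnavailable:
		// Retrying will not help until the feature is enabled upstream.
		return false, errServiceUnavailable
	default:
		return true, fmt.Errorf("unexpected status code %d", statusCode)
	}
}

func main() {
	for _, code := range []int{http.StatusOK, http.StatusServiceUnavailable, http.StatusBadGateway} {
		retry, err := shouldRetry(code)
		fmt.Printf("status=%d retry=%v err=%v\n", code, retry, err)
	}
}
```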
Mark Holt
3d6d2a7c25
Added fix to allow getroothash to work with no api running (#8342)
Whitelisting calculation of the roothash should not be dependent on the
bor api running. This will not always be the case, for example when
erigon is configured with a separate rpc daemon.

To fix this the calculation has been moved to Bor.

Additionally the redundant Bor API code has been removed as this is not
called by any code and the functionality looks to have migrated to the
turbo/jsonrpc package.
2023-10-02 18:55:31 +01:00
Mark Holt
0bdca6c457
Metrics label fixes (#8339)
Fixes for label discrepancies in collector for summaries etc which have
a template which includes a quantile.

Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
2023-10-02 17:19:02 +01:00
Mark Holt
f99f326363
Bor fix frozen snapshot load (#8305)
This is a fix for at least one cause of this issue:
https://github.com/ledgerwatch/erigon/issues/8212.

It happens after the end of the snapshot load, because the snapshot
processing which was introduced a couple of months ago does not deal with
validation of the headers at the start of the chain.

I have also added a fix to the persistence so that the last snapshot is
recorded so that subsequent runs are not forced to process the whole
snapshot run from start.

The relationship between this and memory usage is that unprocessed
headers lead to a queue of pending headers with a size
of around 5GB. I have not changed any header parameters - so likely a
prolonged stop to header processing will result in a similar level of
memory growth.
2023-09-27 13:45:09 +01:00
Mark Holt
f26c7b389e
Bor break loop onrewind (#8302)
Add code to the headers state to break processing if a bor milestone
rewind is detected.

The rewind processing happens in the bor/heimdall stage - this change
just avoids unnecessary header loading
if a milestone fork is likely to be detected

---------

Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-09-27 13:17:54 +01:00