Commit Graph

423 Commits

Author SHA1 Message Date
battlmonstr
2793ef6ec1
polygon: flatten redundant packages (#9241)
* move mocks to the owner packages
* squash single file packages
* move types to more appropriate files
* remove unused mocks
2024-01-16 09:23:02 +01:00
Mark Holt
d8b91c4d02
Fix startup sync for txpool processing for bor block production (#9219)
When the sync loop first runs it suppresses block sync events both in
the initial loop and when the blocks being processed are greater than
1000.

This fix removes the first check, because otherwise the first block
received by the process is never sent to the tx pool, which
means no new block is produced for polygon.

In addition to this fix, I have also moved the gas initialization to the
txpool start method rather than prompting it with a 'synthetic block
event'.

As the txpool start has access to the core & tx DBs, it can find the
current block and chain config internally, so it does not need to be
externally activated and can do this itself on start up. This has
the advantage of making the txpool more self-contained.
2024-01-13 10:33:34 +00:00
Alex Sharov
3bb1917e8a
recsplit: reduce ram pressure (#9218)
reasons: 
- indexing done in background (or in many workers)
- `recsplit` has 2 etl collectors
2024-01-12 11:26:20 +01:00
Mark Holt
b05ffc909d
Fixes for Bor Block Production Synchronization (#9162)
This PR contains 3 fixes for interaction between the Bor mining loop and
the TX pool which were causing the regular creation of blocks with zero
transactions.

* Mining/Tx pool block synchronization
The synchronization of the tx pool between the sync loop and the mining
loop has been changed so that both are triggered by the same event and
synchronized via a sync.Cond rather than a polling loop with a hard
coded loop limit. This means that mining now waits for the pool to be
updated from the previous block before it starts the mining process.
* Txpool Startup consolidated into its MainLoop
Previously the tx pool start process was dynamically triggered at
various points in the code. This has all now been moved to the start of
the main loop. This is necessary to avoid a timing hole which can leave
the mining loop hanging, waiting for a previous block broadcast which
it missed due to its delayed start.
* Mining listens for block broadcast to avoid duplicate mining
operations
The mining loop for bor has a recommit timer in case blocks are not
produced on time. However in the case of sprint transitions where the
seal publication is delayed this can lead to duplicate block production.
This is suppressed by introducing a `waiting` state which is exited upon
the block being broadcast from the sealing operation.
2024-01-10 17:12:15 +00:00
battlmonstr
9c47cce62c
bor: move to polygon directory (#9174) 2024-01-09 19:20:42 +01:00
Dmytro
ff92b701c3
dvovk/updsync (#9134)
refactored data structure for sync statistics
2024-01-08 10:43:04 +01:00
Mark Holt
15ff41876c
Change retire progress log level to debug (#9153)
This moves the log level of retire progress messaging to debug, to avoid
log noise on qa and test runs
2024-01-08 07:10:45 +07:00
battlmonstr
b57cbdcff7
polygon/sync: canonical chain builder (#9117) 2024-01-04 10:44:57 +01:00
Alex Sharov
82822ee602
erigon snapshots integrity: add check for body.BaseTxnID (#9121) 2024-01-04 14:19:37 +07:00
Dmytro
777f5dcd61
added collection for log prefix (#9118) 2024-01-03 08:13:56 +07:00
Giulio rebuffo
46ecf030f5
Added GET /eth/v1/beacon/rewards/blocks/{block_id} and POST /eth/v1/beacon/rewards/sync_committee/{block_id} (#9102)
* Changed slightly archive format (again)
* Added all of the remaining rewards endpoints
2023-12-30 20:51:28 +01:00
milen
f8cc27aebd
heimdall: use span id as naming (#9097)
follow up on naming as suggested here
https://github.com/ledgerwatch/erigon/pull/9096#pullrequestreview-1798218317
2023-12-28 17:49:31 +00:00
milen
1f237c0aaf
borheimdall: only fetch next span when in last sprint of current span (#9096)
Heimdall prepares the next span a number of sprints before the current
span ends. Currently we always fetch the next span regardless of which
sprint we are in during the current span. This causes a liveness issue
due to how the Heimdall client works (it infinitely retries until it
fetches a span - this issue will be fixed in a separate PR). This PR
fixes this by matching what bor does - it fetches the next span only in
the last sprint of the current span.

Changes:

- Adds a unit test for the above
- Adds a new function BlockInLastSprintOfSpan
- Some code reorg and cleanup - moves the span num related functions
from the bor package to the span sub package for better logical grouping
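
As an illustration of the check, a simplified sketch assuming a span is given by inclusive start and end block numbers and that sprints have a fixed length. The function name `inLastSprintOfSpan`, its parameters and the numbers in `main` are hypothetical; the real `BlockInLastSprintOfSpan` signature is not shown here:

```go
package main

import "fmt"

// inLastSprintOfSpan reports whether blockNum falls within the final
// sprintLen blocks of the span ending at spanEnd (inclusive).
// Hypothetical helper: the caller supplies the span end and sprint length.
func inLastSprintOfSpan(blockNum, spanEnd, sprintLen uint64) bool {
	if blockNum > spanEnd {
		return false // block belongs to a later span
	}
	// Written as an addition to avoid unsigned underflow on small spanEnd.
	return blockNum+sprintLen > spanEnd
}

func main() {
	const spanEnd, sprintLen = 6655, 16 // illustrative values only
	for _, n := range []uint64{6639, 6640, 6655, 6656} {
		fmt.Println(n, inLastSprintOfSpan(n, spanEnd, sprintLen))
	}
}
```

With these values only blocks 6640 through 6655 would trigger a fetch of the next span.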
2023-12-28 15:52:49 +00:00
Mark Holt
79ed8cad35
E2 snapshot uploading (#9056)
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:

## erigon snapshots upload

The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.  

It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send
uploaded headers, bodies and (in the case of polygon) bor spans and
events to snapshot files. Because
this process avoids execution, it runs significantly faster than a
standard erigon configuration.

The uploader uses rclone to send seedable snapshots (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.

The **uploader** is configured to minimize disk usage by doing the
following:

* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots

In addition to this, it has the following performance-related features:

* Maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing

## snapshots command

A standalone command for managing remote snapshots. It has the
following subcommands:

* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
2023-12-27 22:05:09 +00:00
Mark Holt
df0699a12b
Added sentry simulator implementation (#9087)
This adds a simulator object which implements the SentryServer api but
takes objects from a pre-existing snapshot file.

If the snapshot is not available locally it will download and index the
.seg file for the header range being asked for.

It is created as follows: 

```go
sim, err := simulator.NewSentry(ctx, "mumbai", dataDir, 1, logger)
```

Where the arguments are:

* ctx - a callable context where cancel will close the simulator torrent
and file connections (it also has a Close method)
* chain - the name of the chain to take the snapshots from
* datadir - a directory potentially containing snapshot .seg files. If
no files exist in this directory they will be downloaded
* num peers - the number of peers the simulator should create
* logger - the logger to log actions to

It can be attached to a client as follows:

```go
simClient := direct.NewSentryClientDirect(66, sim)
```

At the moment only very basic functionality is implemented:

* get headers will return headers by range or hash (hash assumes a
pre-downloaded .seg as it needs an index)
* the header replay semantics need to be confirmed
* eth 65 and 66(+) messaging is supported
* For details see: `simulator_test.go`

More advanced peer behavior (e.g. header rewriting) can be added
Bodies/Transactions handling can be added
2023-12-27 14:56:57 +00:00
Alex Sharov
e08003f1f7
block retire: merge all possible files (even bor) even if nothing to retire (#9068)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-12-24 07:32:52 +00:00
Giulio rebuffo
a4d7b6d33f
Switched Caplin snapshot format to ZSTD blinded blocks (#9058)
* Chunked format -> blinded
* LZ4 -> ZSTD
* Implemented parent block root support for history download
* Rationale: makes it easy to optimize GC during state
reconstruction and allows attestations to be read quickly in the
historical states reader
2023-12-23 15:56:35 +01:00
Alex Sharov
2b87d65285
retire: handle case when bor snaps are behind block snaps (#9061) 2023-12-23 16:37:30 +07:00
Dmytro
a36071e7ff
dvovk/snapidx (#9049) 2023-12-22 11:25:55 +00:00
battlmonstr
55d37b938c
bor: spanID calculation refactoring (#9040) 2023-12-21 09:52:00 +01:00
Alex Sharov
7107cfed0f
snaps: stop merge to 500K and enjoy immutability (#9034) 2023-12-21 08:04:46 +07:00
milen
4f95342036
freezeblocks: fix blockreader last frozen bor span and event ids (#9018)
During testing we ran into a "span 7813 not found (db)" error due to a very
large unwind (1 million blocks).

This is because the block reader's `LastFrozenSpanID` and
`LastFrozenEventID` returned results that are not consistent with
`FrozenBorBlocks`. The latter is taking into account the existence of
`.idx` files while the former 2 functions were not.

Note such a large unwind is not likely to happen normally unless there
is a bug in our unwind logic or an operator is manually unwinding very
far back for reasons like chain halts (e.g. the mumbai bug from a few
months ago), devel testing or anything else along these lines.
Regardless, it exposed the above discrepancy, which is best fixed.
2023-12-18 19:13:21 +02:00
Alex Sharov
1468317efd
erigon snapshots index: build bor indices (#9009) 2023-12-18 17:46:50 +07:00
Dmytro
e82147caf3
added collecting info about snapshot indexing, renamed downloading prop (#8987) 2023-12-15 07:23:26 +07:00
Dmytro
ac1e42b68d
added grabbing info about downloaded metadata (#8972) 2023-12-13 21:04:14 +07:00
Alex Sharov
d41d523050
Downloader: add ProhibitNewDownloads() (#8939)
"whitelisting" mechanism (list of files - stored in DB) - which
protected us from downloading new files after upgrade/downgrade - was
broken. It also seems to have become over-complicated with time.
I am replacing it with 1 persistent flag inside the downloader:
"prohibit_new_downloads.lock"
Erigon will turn downloader into this mode after
downloading/verification of first snapshots.


```
//Corner cases:
	// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
	// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
	// - After "download once" - Erigon will produce and seed new files
```

------
`downloader --seedbox` is never "prohibit new downloads"
2023-12-12 16:05:56 +07:00
Dmytro
4696769d25
dvovk/snapshotsstats (#8935)
Updated collecting snapshots, renamed keys
2023-12-08 21:07:59 +07:00
Alex Sharov
754276909b
bor snaps: "erigon snapshots retire" to build bor files (#8912) 2023-12-06 12:12:43 +00:00
Giulio rebuffo
c477281362
Caplin: Parallel historical states reconstruction (#8817)
What does this PR do:
* Optional Backfilling and Caplin Archive Node
* Create antiquary for historical states
* Fixed chain gaps related to the head of the chain and the anchor of
the chain.
* Added basic reader object to Read the Historical state
2023-12-06 10:48:36 +01:00
Alex Sharov
9bea4e3a9c
blockReader: read blockNum == r.FrozenBlocks() from files (#8890)
Example value of `r.FrozenBlocks()`: `499999`
In future PR I will rename this method to something like
`MaxBlockNumInFiles()`
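
The boundary condition can be sketched as follows, assuming `FrozenBlocks()` returns the highest block number present in files (as in the `499999` example). The function `readFromFiles` is a hypothetical stand-in, not the actual block reader code:

```go
package main

import "fmt"

// readFromFiles decides where a block should be read from, given the highest
// block number available in snapshot files. Using "<" instead of "<=" here
// would send the boundary block itself to the DB.
func readFromFiles(blockNum, frozenBlocks uint64) bool {
	return blockNum <= frozenBlocks
}

func main() {
	const frozen = 499999 // example value from the description above
	for _, n := range []uint64{499998, 499999, 500000} {
		fmt.Println(n, readFromFiles(n, frozen))
	}
}
```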
2023-12-05 13:59:21 +07:00
Giulio rebuffo
a2433455f9
Keep few beacon block headers in mdbx (#8809)
Now keep few beacon block headers in mdbx
2023-11-22 01:45:15 +01:00
Alex Sharov
06c508c02c
downloader: don't create .torrent for too small files (#8785) 2023-11-20 11:14:05 +07:00
Alex Sharov
cc8bdc5185
BlockReader: handle nil-body (#8739)
```
EROR] [11-16|07:33:02.592] RPC method eth_getLogs crashed: runtime error: invalid memory address or nil pointer dereference
Nov 16 07:33:02 i-0e4d4ce2636f49d8a erigon.sh[1739584]: [service.go:219 panic.go:914 panic.go:261 signal_unix.go:861 block_reader.go:615 block_reader.go:405 eth_receipts.go:228 value.go:596 value.go:380 service.go:224 handler.go:493 handler.go:443 handler.go:391 handler.go:222 handler.go:315 asm_amd64.s:1650]
```
2023-11-16 15:54:17 +07:00
Giulio rebuffo
51af060450
Added --beacon.api flags to enable experimental beacon api. (#8727)
Make it so that erigon can enable the beacon api.
2023-11-15 15:07:16 +01:00
Giulio rebuffo
8d8368091c
Add full support to beacon snapshots (#8665)
This PR adds beacon blocks snapshots for the following chains:

* Mainnet snapshots
* Sepolia snapshots
2023-11-13 14:10:57 +01:00
Dmytro
466031ab8f
add fixes (#8673) 2023-11-08 17:02:30 +03:00
Dmytro
9c7c758bda
added snapshot sync diagnostic information, updated diagnostic channel (#8645) 2023-11-07 12:50:36 +00:00
Giulio rebuffo
e67db34145
Caplin: Bumped down snapshots from 500 to 100k (#8657) 2023-11-06 15:41:19 +01:00
ledgerwatch
a77e33e7c4
Introduce extra functions for BorSpans (no-op) (#8648) 2023-11-04 10:59:07 +00:00
ledgerwatch
1ffa3fcf94
Properly remove borspans snapshots after merges (#8647) 2023-11-04 09:58:56 +00:00
Alex Sharov
329d18ef6f
snapshots: reduce merge limit of blocks to 100K (#8614)
Reason: 
- produce and seed snapshots earlier on chain tip. Reduce dependency on
"good peers with history" in the p2p network.
Some networks do not have many archive peers, and ConsensusLayer clients
are not good (not incentivised) at serving history.
- avoid having too many files:
more files (shards) means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
fewer files means small files will be removed after merge (no peers for
these files).


ToDo:
[x] Recent 500K - merge up to 100K 
[x] Older than 500K - merge up to 500K 
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)

In next PR: 
[ ] Old versions of Erigon must be able to download recent hashes. To achieve
this, at first start erigon will download the preverified hashes .toml from
s3 and use it if it is newer than what we have (built-in).
2023-11-01 23:22:35 +07:00
Giulio rebuffo
513fd50fa5
Compress snapshots for Caplin with lz4 level=1 (#8609) 2023-10-30 13:48:14 +01:00
Alex Sharov
c23e5a1abf
downloader: preparations for reducing blocks merge limit (#8612) 2023-10-30 13:46:35 +07:00
Alex Sharov
b311da959f
downloader: webseed better error messages (#8611) 2023-10-30 12:13:45 +07:00
Giulio rebuffo
0e5af0a69c
Added beacon snapshots download (#8601) 2023-10-28 17:41:50 +02:00
Andrew Ashikhmin
38e91809f9
Revert "Move validator set snapshot computation to bor_heimdall stage… (#8580)
PR #8202 might cause Issue #8550, so reverting it until Alexey's return.

This reverts commit 2ce98f8337.
2023-10-25 14:02:31 +02:00
Alex Sharov
52c4489b04
snapshots: remove modtime protection (#8573)
Historically we had several cases when:
- erigon downloaded a new version of a .seg file
- or didn't finish the download and started indexing

this was a "quick-fix protection" against these cases
but now we have other protections for these cases
let's try to remove this one - because it's not compatible with "copy
datadir" and "restore datadir from backup" scenarios
2023-10-25 11:35:45 +07:00
Giulio rebuffo
d7448fdb3f
Added functional beacon snapshots reader and generator to Caplin (#8570)
This PR adds beacon blocks snapshots and beacon blocks snapshot
generator to Caplin, plus a snapshot verifier CLI
2023-10-24 21:32:29 +02:00
Giulio rebuffo
995009ac7b
Added cli tool for Snapshots Generations for Caplin (#8543) 2023-10-22 19:21:37 +02:00
ledgerwatch
2ce98f8337
Move validator set snapshot computation to bor_heimdall stage (#8202)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-10-20 18:31:00 +01:00