Commit Graph

584 Commits

Author SHA1 Message Date
battlmonstr
ebe16d8360
bor: BorConfig setup fix (#9145)
A crash on startup happens on --chain=mumbai , because I've confused
chainConfig.Bor (from type chain.Config) and config.Bor (from type
ethconfig.Config) in the setup code.

ethconfig.Config.Bor property contained bogus values, and was used only
to check its type in CreateConsensusEngine(). Its value was never read
(before PR #9117).

This change removes the property to avoid confusion and fix the crash.

Devnet network.BorStateSyncDelay was implemented using
ethconfig.Config.Bor, but it wasn't taking any effect. It should be
fixed separately in a different way.
2024-01-05 20:03:19 +07:00
Mark Holt
19bc328a07
Added db loggers to all db callers and fixed flag settings (#9099)
Mdbx now takes a logger - but this has not been pushed to all callers -
meaning it had an invalid logger

This fixes the log propagation.

It also fixed a start-up issue for http.enabled and txpool.disable
created by a previous merge
2023-12-31 17:10:08 +07:00
Mark Holt
79ed8cad35
E2 snapshot uploading (#9056)
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:

## erigon snapshots upload

The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.  

It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send
uploaded headers, bodies and (in the case of polygon) bor spans and
events to snapshot files. Because
this process avoids execution in run signifigantly faster than a
standard erigon configuration.

The uploader uses rclone to send seedable (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.

The **uploader** is configured to minimize disk usage by doing the
following:

* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots

in addition to this it has the following performance related features:

* maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing

## snapshots command

Is a stand alone command for managing remote snapshots it has the
following sub commands

* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
2023-12-27 22:05:09 +00:00
Alex Sharov
4eecd8c86c
print_stages: bor snaps info (#9008) 2023-12-21 08:08:55 +07:00
Alex Sharov
1468317efd
erigon snapshots index: build bor indices (#9009) 2023-12-18 17:46:50 +07:00
Alex Sharov
2991a6b645
integration: run_migration must not use Accede mode (#8867) 2023-12-02 18:09:53 +07:00
Alex Sharov
5ff9ce802b
prnt_stages: to show value of --snapshots flag (#8862) 2023-11-30 16:16:53 +07:00
Alex Sharov
9cfed88082
integration run_migrations: open in non-accede exclusive mode - to create new tables (#8821)
sometimes need tool to apply migrations - without starting erigon (if
erigon is buggy, or to prevent snapshots downloading, etc...)
2023-11-25 09:12:44 +07:00
ledgerwatch
2064edc5e6
Add arguments (no-op) (#8653) 2023-11-04 17:44:34 +00:00
battlmonstr
d92898a508
p2p: silkworm sentry (#8527) 2023-11-02 08:35:13 +07:00
Alex Sharov
329d18ef6f
snapshots: reduce merge limit of blocks to 100K (#8614)
Reason: 
- produce and seed snapshots earlier on chain tip. reduce depnedency on
"good peers with history" at p2p-network.
Some networks have no much archive peers, also ConsensusLayer clients
are not-good(not-incentivised) at serving history.
- avoiding having too much files:
more files(shards) - means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
less files - means small files will be removed after merge (no peers for
this files).


ToDo:
[x] Recent 500K - merge up to 100K 
[x] Older than 500K - merge up to 500K 
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)

In next PR: 
[] Old version of Erigon must be able download recent hashes. To achieve
it - at first start erigon will download preverified hashes .toml from
s3 - if it's newer that what we have (build-in) - use it.
2023-11-01 23:22:35 +07:00
Alex Sharov
c23e5a1abf
downloader: preparations for reducing blocks merge limit (#8612) 2023-10-30 13:46:35 +07:00
Andrew Ashikhmin
38e91809f9
Revert "Move validator set snapshot computation to bor_heimdall stage… (#8580)
PR #8202 might cause Issue #8550, so reverting it until Alexey's return.

This reverts commit 2ce98f8337.
2023-10-25 14:02:31 +02:00
Alex Sharov
dd09762ebe
integration: open db in Accede mode (#8535) 2023-10-22 08:57:07 +07:00
a
436493350e
Sentinel refactor (#8296)
1. changes sentinel to use an http-like interface

2. moves hexutil, crypto/blake2b, metrics packages to erigon-lib
2023-10-22 01:17:18 +02:00
ledgerwatch
2ce98f8337
Move validator set snapshot computation to bor_heimdall stage (#8202)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-10-20 18:31:00 +01:00
Alex Sharov
f265bd8bfc
add "integration stage_headers --integrity.slow" to check gaps in headers or canonical markers (#8489) 2023-10-16 19:52:04 +07:00
Alex Sharov
6d9a4f4d94
rpcdaemon: must not create db - because doesn't know right parameters (#8445) 2023-10-12 14:11:46 +07:00
Mark Holt
7deb69967f
Avoid marking blocks as bad at rewind (#8414)
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-10-10 19:47:51 +05:30
Alex Sharov
7dd678896a
downloader: move from snapshots/db to snapshots/downloader (#8375) 2023-10-05 14:25:00 +07:00
canepat
47690db676
Block execution using embedded Silkworm (#8353)
This introduces _experimental_ block execution run by embedded Silkworm
API library:

- new command-line option `silkworm.path` to enable the feature by
specifying the path to the Silkworm library
- the Silkworm API shared library is dynamically loaded on-demand
- currently requires to build Silkworm library on the target machine
- available only on Linux at the moment: macOS has issue with [stack
size](https://github.com/golang/go/issues/28024) and Windows would
require [TDM-GCC-64](https://jmeubank.github.io/tdm-gcc/), both need
dedicated effort for an assessment
2023-10-05 09:27:37 +07:00
Alex Sharov
dbdb486dd3
downloader: download .torrent files from webseeds provider (#8346) 2023-10-03 11:21:41 +07:00
Mark Holt
0bdca6c457
Metrics label fixes (#8339)
Fixes for label discrepancies in collector for summaries etc which have
a template which includes a quantile.

Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
2023-10-02 17:19:02 +01:00
Alex Sharov
6cc2bd5751
downloader: non-readonly open db (so it can auto-recover if need) (#8312) 2023-09-28 11:38:29 +07:00
Mark Holt
3b45f53f3d
Milestone stage processing (#8187)
This is the second part of the bor milestone release it contains the
following changes:

* Initialize services
* This is a change from the initial pull request I have moved all of the
initialization to the bor engine. To facilitate this I have just passed
in the heimdall client interface, rather than the whole engine
* Stage processing 
* This is also a change from the original PR - the code is contained in
the bor heimdall stage rather than in headers - the effect should be the
same, but this needs testing

---------

Co-authored-by: Mark Holt <mark@disributed.vision>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-09-18 18:05:33 +01:00
Alex Sharov
ff54df073e
etl: do sort and file flush in another goroutine (#7915) 2023-09-06 14:51:23 +07:00
Mark Holt
8ea0096d56
moved metrics sub packages types to metrics (#8119)
This is a non functional change which consolidates the various packages
under metrics into the top level package now that the dead code is
removed.

It is a precursor to the removal of Victoria metrics after which all
erigon metrics code will be contained in this single package.
2023-09-03 08:09:27 +07:00
Andrew Ashikhmin
9895b66b3d
Fix applyBlock in commands/state_domains (#8115)
Fix an issue introduced by PR #8095
2023-09-01 17:33:05 +01:00
Mark Holt
a4cfbe0d56
Heimdall metrics + Metrics HTTP server rationalization (#8094)
This is an update of:

https://github.com/ledgerwatch/erigon/pull/7846

which uses a local fork of victoria metrics to include the changes that
https://github.com/anshalshukla added to the original for we where
using.

It also includes code to address the duplicate metrics issue identified
here:

https://github.com/ledgerwatch/erigon/issues/8053

It has one more associated fix which is to correctly add a metadata
label to counters, these where previously labelled as gauges.

e.g. 

```
# TYPE p2p_peers counter
p2p_peers 0
```
rather than

```
# TYPE p2p_peers gauge
p2p_peers 0
```

---------

Co-authored-by: Anshal Shukla <53994948+anshalshukla@users.noreply.github.com>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
2023-08-31 09:04:27 +01:00
Andrew Ashikhmin
9b63764b16
Move ApplyDAOHardFork & UpgradeBuildInSystemContract to engine.Initialize (#8095)
Now all protocol-stipulated changes at the beginning of the block (AuRa
stuff,
[DAO](https://github.com/ethereum/execution-specs/blob/master/network-upgrades/mainnet-upgrades/dao-fork.md)
irregular state change, Calcutta system contract upgrade,
[EIP-4788](https://eips.ethereum.org/EIPS/eip-4788) beacon root) are
handled by consensus engine `Initialize()`.
2023-08-30 15:51:19 +02:00
Mark Holt
f2d0118a33
Bor snapshot block production (#8065)
I have added:

```go
{
	ID:          stages.BorHeimdall,
	Description: "Download Bor-specific data from Heimdall",
	Forward: func(firstCycle bool, badBlockUnwind bool, s *StageState, u Unwinder, tx kv.RwTx, logger log.Logger) error {
		if badBlockUnwind {
			return nil
		}
		return BorHeimdallForward(s, u, ctx, tx, borHeimdallCfg, true, logger)
	},
	Unwind: func(firstCycle bool, u *UnwindState, s *StageState, tx kv.RwTx, logger log.Logger) error {
		return BorHeimdallUnwind(u, ctx, s, tx, borHeimdallCfg)
	},
	Prune: func(firstCycle bool, p *PruneState, tx kv.RwTx, logger log.Logger) error {
		return BorHeimdallPrune(p, ctx, tx, borHeimdallCfg)
	},
},
```
To MiningStages as well as Default as otherwise bor events are not added
when the block producer creates new blocks.

There are a couple of questions I have around this implementation:

* Is this the right place to add this
* As the state is also executed when the default stage is processed ther
is some duplicate processing for the block producing node.
* There is a duplicated call to heimdall which could be removed if the
stages share state - but its not clear if we want to do this.
* I don't think the mining stage needs to prune as this will be
replicated in the default iteration

This can be tested using the devnet with the following arguments:

```
--chain bor-devnet --bor.localheimdall --scenarios state-sync
```

This will generate sync events via an ethereum devnet which are
transmitted to bor chain and will be executed at the end of the snapshot
delay, which results in events generated from the bor chain. This tests
the whole sync, block generation, event lifecycle. As it needs to wait
for sprints to end after a sufficient delay it is quite slow to run.
2023-08-30 08:06:09 +01:00
battlmonstr
32ca0e5ab1
sync: revert flawed dropUselessPeers logic and alleviate its issues (#8062)
The current logic is flawed, because it drops all peers that are less
synced.
It is valid to return empty responses by the eth spec.
A proper logic should penalize from the context of the sync process,
where enough "reputation" data is collected about a peer.

In order to be able to connect to erigon 2.48 peers that have
--sentry.drop-useless-peers enabled,
this adds a check to not reply with an empty headers list.
If we reply with an empty list, we're going to be considered useless and
kicked.
Once enough of erigon nodes are updated in the network past this commit,
this check should be removed,
because it is totally acceptable to return an empty list by the eth
spec.
2023-08-25 11:42:54 +02:00
Alex Sharov
2b6c21fddb
move mdbx to new org (#8061) 2023-08-24 18:00:24 +07:00
battlmonstr
2e29ff33e1
bor: BroadcastNewBlock to all peers from validator nodes (#8030)
Currently PropagateNewBlockHashes and BroadcastNewBlock
selects a subset of all sentries by taking a `Sqrt(len(sentries))`,
and then for each sentry SendMessageToRandomPeers
selects a subset of its peers by taking `Sqrt(len(peerInfos))`.

This behaviour limits the broadcast scope with a lot of peers, e.g. 100
becomes 10,
but is not great with very few peers, or if the message is very
important
to broadcast to everyone, which is the case of bor validator/proposer
nodes.

* send to all sentries in both BroadcastNewBlock and PropagateNewBlockHashes
* remove peerCountConstrained sqrt logic in SendMessageToRandomPeers
* add maxPeers provider func as a parameter to MultiClient
* default it to 10 for eth and 0 (unlimited) for bor validators

---------

Co-authored-by: Mark Holt <mark@distributed.vision>
2023-08-23 14:28:39 +02:00
alex.sharov
02b7c28a94 bor: integration cmd support 2023-08-23 12:07:36 +07:00
Andrew Ashikhmin
6bc0ca9e85
Correctly compute fork id when timestamp fork is activated in genesis (#8046)
See https://github.com/ethereum/go-ethereum/pull/27895
2023-08-21 15:35:13 +02:00
ledgerwatch
6b6c0caad0
Snapshots of Bor events (#7901)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp-2.home>
2023-08-18 17:10:35 +01:00
alex.sharov
b50f427641 save 2023-07-29 12:08:38 +07:00
Andrew Ashikhmin
7d35c6b737
EIP-4844: Rename "data gas" to "blob gas" (#7937)
See https://github.com/ethereum/EIPs/pull/7354 &
https://github.com/ethereum/consensus-specs/pull/3461. Prerequisite:
https://github.com/ledgerwatch/erigon-lib/pull/1058
2023-07-28 12:12:05 +02:00
gladcow
7e3c6ea2f8
Adds clear_bad_blocks command (#7934)
Adds `clear_bad_blocks` command to integration tool. This command allows
to re-process blocks that were erroneously marked as bad.

Command just clears `BadHeaderNumber` table. It can be safer in some
cases than
```
./integration state_stages —unwind=<some_number>
./integration stage_headers —unwind=<some_number>
```
and can be used in the cases like this one
https://github.com/ledgerwatch/erigon/issues/7892%20

Command syntax:
```
./integration clear_bad_blocks --datadir=<datadir>
```
2023-07-28 08:28:47 +07:00
Giulio rebuffo
b2442a7b41
Removed unused stages (Cumulative index + Translation) (#7884) 2023-07-13 16:55:48 +02:00
Mark Holt
bd9896bf4b
added bor tx indexing with tests (#7826)
This request implements the insertion of Bor ephemeral transactions into
snapshot indexes.

I does this by taking the block hash from the header index and passing
it to the transaction indexer to add an additional index entry per block
into the transaction hash -> block index.

The passed entries are currently contained in an in memory array which
is (32 * number of blocks / sprint size) bytes.

In addition to the functional code there is also an update to the
`dump_test.go` so that it runs `DumpBlocks` to exercise the indexing
code. To facilitate this the `InsertChain` method in `mock_sentry` has
been modified so that it can process >128 blocks.

The code in this request also includes additional bor/consensus code
with the following functions:

`CalculateSprint`
`CalculateSprintCount`

The first function is a modification of the code in erigon-lib so that
the sprints are numerically rather than lexically ordered. This code
should be migrated to erigon-lib and should have its sprint set
calculated once from its underlying map rather than this process being
repeated every calculation.

---------

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: ledgerwatch <akhounov@gmail.com>
Co-authored-by: Enrique Jose  Avila Asapche <eavilaasapche@gmail.com>
Co-authored-by: Giulio <giulio.rebuffo@gmail.com>
2023-07-12 23:31:38 +01:00
gladcow
8809208605
Expand home dir from tilda for integration tool settings (#7877)
Fixes https://github.com/ledgerwatch/erigon/issues/7814, it can be
really confusing when `erigon` and `integration` process `~` in path in
different ways.

I can't fix it for `integration` the same way it is done for `erigon`,
because in main app it is done with `urfave/cli` package that is not
used in `integration`, so I've used other simplest way to do it.
2023-07-13 00:45:38 +07:00
Giulio rebuffo
6272559fb7
Separated PendingBlock behaviour to be chain agnostic (#7859)
Instead of getting the pending block with latest payload id, we just
store the latest block built and serve it in remote backend
2023-07-10 19:22:03 +02:00
Alex Sharov
bf8ac9a38f
mdbx_to_mdbx: to use logger (#7860) 2023-07-09 08:04:20 +01:00
Alex Sharov
62b8daca21
integration stage_exec: add flag --no-commit (#7851) 2023-07-06 11:31:59 +07:00
alex.sharov
499680c5d2 docs 2023-07-05 09:35:04 +07:00
Alex Sharov
a7b9850ff4
mdbx_to_mdbx: docs (#7841) 2023-07-04 09:45:36 +07:00
Mark Holt
5935b11b24
Devnet macos startup and windows committed memory startup fixes (#7832)
The fixes here fix a couple of issues related to devnet start-up

1. macos threading and syscall error return where causing multi node
start to both not wait and fail
2. On windows creating DB's with the default 2 TB mapsize causes the os
to reserve about 4GB of committed memory per DB. This may not be used -
but is reserved by the OS - so a default bor node reserves around 10GB
of storage. Starting many nodes causes the OS page file to become
exhausted.

To fix this the consensus DB's now use the node's OpenDatabase function
rather than their own, which means that the consensus DB's take notice
of the config.MdbxDBSizeLimit.

This fix leaves one 4GB committed memory allocation in the TX pool which
needs its own MapSize setting.

---------

Co-authored-by: Alex Sharp <akhounov@gmail.com>
2023-07-02 22:37:23 +01:00
Andrew Ashikhmin
a24eae8d6c
EIP-4844: Handle data gas in txpool (#7779)
Prerequisite: https://github.com/ledgerwatch/erigon-lib/pull/1029
2023-06-23 11:10:23 +02:00