177 Commits

Author SHA1 Message Date
Alex Sharov
82822ee602
erigon snapshots integrity: add check for body.BaseTxnID (#9121) 2024-01-04 14:19:37 +07:00
Mark Holt
19bc328a07
Added db loggers to all db callers and fixed flag settings (#9099)
Mdbx now takes a logger - but this has not been pushed to all callers -
meaning it had an invalid logger

This fixes the log propagation.

It also fixed a start-up issue for http.enabled and txpool.disable
created by a previous merge
2023-12-31 17:10:08 +07:00
Mark Holt
79ed8cad35
E2 snapshot uploading (#9056)
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:

## erigon snapshots upload

The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.  

It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send
uploaded headers, bodies and (in the case of polygon) bor spans and
events to snapshot files. Because
this process avoids execution in run signifigantly faster than a
standard erigon configuration.

The uploader uses rclone to send seedable (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.

The **uploader** is configured to minimize disk usage by doing the
following:

* It removes snapshots once they are loaded
* It aggressively prunes the database once entities are transferred to
snapshots

in addition to this it has the following performance related features:

* maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing

## snapshots command

Is a stand alone command for managing remote snapshots it has the
following sub commands

* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
2023-12-27 22:05:09 +00:00
Alex Sharov
1468317efd
erigon snapshots index: build bor indices (#9009) 2023-12-18 17:46:50 +07:00
Alex Sharov
754276909b
bor snaps: "erigon snapshots retire" to build bor files (#8912) 2023-12-06 12:12:43 +00:00
Alex Sharov
2e0f9b0411
"erigon snapshots diff": setup logger (#8834) 2023-11-29 16:27:58 +00:00
a
a2673c62c5
[caplin/sentinel] config reorganizations (#8844) 2023-11-28 22:45:58 -06:00
Dmytro
a6b5297b3e
dvovk/tunnelwws (#8745)
- changed communication tunnel to web socket in order to connect to
remote nodes
- changed diagnostics.url flag to diagnostics.addr as now user need to
enter only address and support command will connect to it through
websocket
- changed flag debug.urls to debug.addrs in order to have ability to
change connection type between erigon and support to websocket and don't
change user API
- added auto trying to connect to connect to ws if connection with was
failed
2023-11-16 16:37:29 +00:00
Alex Sharov
329d18ef6f
snapshots: reduce merge limit of blocks to 100K (#8614)
Reason: 
- produce and seed snapshots earlier on chain tip. reduce depnedency on
"good peers with history" at p2p-network.
Some networks have no much archive peers, also ConsensusLayer clients
are not-good(not-incentivised) at serving history.
- avoiding having too much files:
more files(shards) - means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
less files - means small files will be removed after merge (no peers for
this files).


ToDo:
[x] Recent 500K - merge up to 100K 
[x] Older than 500K - merge up to 500K 
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)

In next PR: 
[] Old version of Erigon must be able download recent hashes. To achieve
it - at first start erigon will download preverified hashes .toml from
s3 - if it's newer that what we have (build-in) - use it.
2023-11-01 23:22:35 +07:00
Alex Sharov
c23e5a1abf
downloader: preparations for reducing blocks merge limit (#8612) 2023-10-30 13:46:35 +07:00
Alex Sharov
b311da959f
downloader: webseed better error messages (#8611) 2023-10-30 12:13:45 +07:00
Alex Sharov
5bb91bb77c
"erigon snapshots retire": prune, then retire, then prune (#8606) 2023-10-29 12:33:31 +07:00
Alex Sharov
6d9a4f4d94
rpcdaemon: must not create db - because doesn't know right parameters (#8445) 2023-10-12 14:11:46 +07:00
Alex Sharov
8850f3a76d
memstat: to use helper func (#8377) 2023-10-05 13:42:30 +07:00
Alex Sharov
fa3b8c23b2
Downloader: step towards more complex datadir (#8286)
migration included - no manual actions required
2023-10-04 11:01:02 +07:00
Alex Sharov
e993ee984b
stage loop: allow nil in hook (#8318) 2023-09-29 09:03:19 +07:00
Mark Holt
f794438335
Diag session routing (#8232)
Code to support react based UI for diagnostics:

* pprof, prometheus and diagnistics rationalized to use a single router
(i.e. they can all run in the same port)
* support_cmd updated to support node routing (was only first node)
* Multi content support in router tunnel (application/octet-stream &
appliaction/json)
* Routing requests changed from using http forms to rest + query params
* REST query requests can now be made against erigon base port and
diagnostics with the same url format/params

---------

Co-authored-by: dvovk <vovk.dimon@gmail.com>
Co-authored-by: Mark Holt <mark@disributed.vision>
2023-09-25 16:24:17 +01:00
Alex Sharov
3cea1b9b9e
torrent: add --webseeds cli arg (#8176) 2023-09-12 12:18:47 +07:00
alex.sharov
66c94d9760 retire snapshots - nil ptr fix 2023-09-09 08:33:42 +07:00
alex.sharov
253c232f77 pprof flags in erigon sub-commands 2023-09-01 12:22:24 +07:00
Alex Sharov
d1d348211f
add flag --force.partial.commit: to workaround problem "start from backup takes long time and can't save partial progress" (#8090) 2023-08-30 08:49:16 +07:00
ledgerwatch
6b6c0caad0
Snapshots of Bor events (#7901)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp-2.home>
2023-08-18 17:10:35 +01:00
a
db5b348673
Caplin block persistence (#7941)
Co-authored-by: Giulio <giulio.rebuffo@gmail.com>
2023-08-09 01:21:19 +02:00
Giulio rebuffo
a0865b489f
Prototyped experimental engine reverse downloader (#7936)
Erigon lib now good
2023-07-28 01:32:19 +02:00
Mark Holt
bd9896bf4b
added bor tx indexing with tests (#7826)
This request implements the insertion of Bor ephemeral transactions into
snapshot indexes.

I does this by taking the block hash from the header index and passing
it to the transaction indexer to add an additional index entry per block
into the transaction hash -> block index.

The passed entries are currently contained in an in memory array which
is (32 * number of blocks / sprint size) bytes.

In addition to the functional code there is also an update to the
`dump_test.go` so that it runs `DumpBlocks` to exercise the indexing
code. To facilitate this the `InsertChain` method in `mock_sentry` has
been modified so that it can process >128 blocks.

The code in this request also includes additional bor/consensus code
with the following functions:

`CalculateSprint`
`CalculateSprintCount`

The first function is a modification of the code in erigon-lib so that
the sprints are numerically rather than lexically ordered. This code
should be migrated to erigon-lib and should have its sprint set
calculated once from its underlying map rather than this process being
repeated every calculation.

---------

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
Co-authored-by: ledgerwatch <akhounov@gmail.com>
Co-authored-by: Enrique Jose  Avila Asapche <eavilaasapche@gmail.com>
Co-authored-by: Giulio <giulio.rebuffo@gmail.com>
2023-07-12 23:31:38 +01:00
Alex Sharov
bf8ac9a38f
mdbx_to_mdbx: to use logger (#7860) 2023-07-09 08:04:20 +01:00
Mark Holt
5935b11b24
Devnet macos startup and windows committed memory startup fixes (#7832)
The fixes here fix a couple of issues related to devnet start-up

1. macos threading and syscall error return where causing multi node
start to both not wait and fail
2. On windows creating DB's with the default 2 TB mapsize causes the os
to reserve about 4GB of committed memory per DB. This may not be used -
but is reserved by the OS - so a default bor node reserves around 10GB
of storage. Starting many nodes causes the OS page file to become
exhausted.

To fix this the consensus DB's now use the node's OpenDatabase function
rather than their own, which means that the consensus DB's take notice
of the config.MdbxDBSizeLimit.

This fix leaves one 4GB committed memory allocation in the TX pool which
needs its own MapSize setting.

---------

Co-authored-by: Alex Sharp <akhounov@gmail.com>
2023-07-02 22:37:23 +01:00
Andrew Ashikhmin
1f7de0e5cf
Rename StageLoopStep to StageLoopIteration (#7820)
Rename for clarity to highlight that not only one stage is executed, but
the entire loop.
2023-06-29 17:53:23 +02:00
Mark Holt
051cad0fdd
Devnet diagnostics (#7762)
Added support tunnel to the devnet cmd. In order to get this to run I
made the following changes:

* Create a public function 
* Added non root logging

I have also added commentary to the readme to explain the additional
command line arguments needed to integrate with diagnostics. In summary,
if you set the --diagnostics.url the devenet will wait for diagnostic
requests rather than exiting

---------

Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
2023-06-20 10:11:55 +01:00
Alex Sharov
d3c3be9c91
e2: optimize tests speed (#7738) 2023-06-15 16:09:11 +07:00
Alex Sharov
e5023775aa
Enforce blockReader interface (#7737)
- breaks dependency from staged_sync to package with block_reader
implementation
- breaks dependency from snap_sync to package with block_reader
implementation
- breaks dependency from mining to txpool implementation
2023-06-15 13:11:51 +07:00
Alex Sharov
63c92010cd
remove txsV3 cli flag (#7644) 2023-06-03 15:54:27 +07:00
Alex Sharov
ad72b7178e
prune speedup. stage_senders: don't re-calc existing senders (#7643)
- stage_senders: don't re-calc existing senders
- stage_tx_lookup: prune less blocks per iteration - because
random-deletes are expensive. pruning must not slow-down sync.
- prune data even if --snap.stop is set
- "prune as-much-as-possible at startup" is not very good idea: at
initialCycle machine can be cold and prune will cause big downtime, no
reason to produce much freelist in 1 tx. People may also restart erigon
- because of some bug - and it will cause unexpected downtime (usually
Erigon startup very fast). So, I just remove all `initialSync`-related
logic in pruning.
- fix lost metrics about disk write byte/sec
2023-06-03 12:30:53 +07:00
Alex Sharov
d40317c905
preparation for --txs.v3, step 2 (#7636) 2023-06-03 07:20:22 +07:00
Alex Sharov
c6b12ed7ca
stageLoop: unbound canRunCycleInOneTransaction logic from initialCycle variable (#7616) 2023-06-01 16:50:19 +07:00
Alex Sharov
6c0b531eca
add "erigon snapshots diff" sub-command to find difference between 2 snapshots (#7619) 2023-06-01 16:29:26 +07:00
Alex Sharov
279e1bec33
use blockReader as service-provider of RoSnapshots (#7584) 2023-05-26 17:12:33 +07:00
Alex Sharov
23bd14744b
blockReader: use in ethstat (#7565) 2023-05-23 16:30:47 +07:00
Alex Sharov
18990ff91a
e2: reduce StageSyncStep interface size (#7561) 2023-05-23 12:28:31 +07:00
Anshul Yadav
2c194e10a1
Args usage msg bug fix (#7554)
## What's this PR is about?
Minor fix in args usage message of support flag. The current message
says that the flag should be 'metrics.url' but it reality it should be
'metrics.urls'
2023-05-20 21:57:23 +01:00
ledgerwatch
b382f96f6c
Introduce logger into etl (#7537)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-18 21:20:07 +01:00
Alex Sharov
c0096ee8c8
StageLoopStep: if node synced - then run initialCycle in 1 tx also (for data consistency) (#7532) 2023-05-17 13:39:21 +01:00
ledgerwatch
e75ea786c0
[devnet tool] separate logging (#7526)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-17 07:36:06 +01:00
ledgerwatch
3f9ae3ec77
[devnet tool] separate logging (#7525)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-16 10:53:50 +01:00
ledgerwatch
6ef3fc3a3c
[devnet tool] Side-quest logging, step 5 (#7484)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-10 19:36:27 +01:00
ledgerwatch
f38ec1e772
[devnet tool] side-quest: logging, step 3 (#7471)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-09 18:11:31 +01:00
ledgerwatch
3c1448afed
[devnet tool] Side-quest logging - replace quiet parameter (#7464)
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-08 17:52:31 +01:00
ledgerwatch
fdd385cef1
[Devnet tool] Side-quest to improve logging - part 1 (#7445)
This is the beginning of the series of changes to make it possible to
run multiple instances of erigon inside a single process (as devnet tool
does), with the logging from these processes going to respective log
files correctly.
This is the first part where the initial infrastructure is being
established

---------

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro-2.local>
2023-05-07 07:28:15 +01:00
alex.sharov
061d3ff1a4 fix cli metrics flag 2023-05-05 09:36:02 +07:00
a
30430d585a
begin refactor of beacon state (#7433)
this first major move separates the transient beacon state cache from
the underlying tree.

leaf updates are enforced in the setters, which should make programming
easier.

all exported methods of the raw.BeaconState should be safe to call
(without disrupting internal state)

changes many functions to consume *raw.BeaconState in perparation for
interface


beyond refactor it also:

adds a pool for the leaves of the validator ssz hash 

adds a pool for the snappy writers
  
removed the parallel hash experiment (high memory use)
2023-05-04 15:18:42 +02:00