21 Commits

Author SHA1 Message Date
Mark Holt
79ed8cad35
E2 snapshot uploading (#9056)
This change introduces additional processes to manage snapshot uploading
for E2 snapshots:

## erigon snapshots upload

The `snapshots uploader` command starts a version of erigon customized
for uploading snapshot files to
a remote location.  

It breaks the stage execution process after the senders stage and then
uses the snapshot stage to send
uploaded headers, bodies and (in the case of polygon) bor spans and
events to snapshot files. Because
this process avoids execution, it runs significantly faster than a
standard erigon configuration.

The uploader uses rclone to send seedable files (100K or 500K blocks) to a
remote storage location specified
in the rclone config file.

The **uploader** is configured to minimize disk usage by doing the
following:

* It removes snapshots once they are uploaded
* It aggressively prunes the database once entities are transferred to
snapshots

In addition, it has the following performance-related features:

* It maximizes the workers allocated to snapshot processing to improve
throughput
* Can be started from scratch by downloading the latest snapshots from
the remote location to seed processing

## snapshots command

A standalone command for managing remote snapshots. It has the
following subcommands:

* **cmp** - compare snapshots
* **copy** - copy snapshots
* **verify** - verify snapshots
* **manifest** - manage the manifest file in the root of remote snapshot
locations
* **torrent** - manage snapshot torrent files
2023-12-27 22:05:09 +00:00
Alex Sharov
657aafd5b7
allow erigon download .torrent from webseed by default (#9052) 2023-12-22 11:42:35 +07:00
Alex Sharov
d41d523050
Downloader: add ProhibitNewDownloads() (#8939)
The "whitelisting" mechanism (a list of files stored in the DB), which
protected us from downloading new files after upgrade/downgrade, was
broken, and it seems to have become over-complicated with time.
I am replacing it with one persistent flag inside the downloader:
"prohibit_new_downloads.lock".
Erigon will turn the downloader into this mode after
downloading/verification of the first snapshots.


```
//Corner cases:
	// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
	// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
	// - After "download once" - Erigon will produce and seed new files
```

------
`downloader --seedbox` is never in "prohibit new downloads" mode
2023-12-12 16:05:56 +07:00
Alex Sharov
0fbcd5b5d8
downloader: use manifest.txt for public bucket (#8863)
use manifest.txt instead of webseed.toml in public buckets
2023-11-30 16:58:23 +07:00
Alex Sharov
748359cf72
webseed: .torrent file validation must check - fileName and hash (#8820)
because files with different names can have the same hash: BitTorrent is
content-addressable.
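
The check can be sketched roughly as follows. This is an illustration under assumptions, not the webseed code: the `torrentMeta` type, the `validateTorrent` function, the file names and the hash strings are all hypothetical, and the expected entries are modelled as a plain name-to-hash map.

```go
package main

import "fmt"

// torrentMeta is a hypothetical, simplified view of a .torrent file.
type torrentMeta struct {
	Name     string // file name the torrent claims to describe
	InfoHash string // hex-encoded hash of the torrent content
}

// validateTorrent accepts a torrent only if BOTH its name and its hash
// match an expected entry. Checking the hash alone is not enough:
// identical content under a different name yields the same hash.
func validateTorrent(expected map[string]string, t torrentMeta) error {
	wantHash, ok := expected[t.Name]
	if !ok {
		return fmt.Errorf("unexpected file name %q", t.Name)
	}
	if wantHash != t.InfoHash {
		return fmt.Errorf("hash mismatch for %q: want %s, got %s", t.Name, wantHash, t.InfoHash)
	}
	return nil
}

func main() {
	// Placeholder names and hashes, for illustration only.
	expected := map[string]string{"headers-a.seg": "aaaa"}

	fmt.Println(validateTorrent(expected, torrentMeta{Name: "headers-a.seg", InfoHash: "aaaa"}))
	// Same hash, different name: must be rejected.
	fmt.Println(validateTorrent(expected, torrentMeta{Name: "headers-b.seg", InfoHash: "aaaa"}))
}
```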
2023-11-24 18:46:17 +07:00
Giulio rebuffo
8d8368091c
Add full support to beacon snapshots (#8665)
This PR adds beacon blocks snapshots for the following chains:

* Mainnet snapshots
* Sepolia snapshots
2023-11-13 14:10:57 +01:00
Alex Sharov
bba91e90ec
downloader: demote webseed request errors (#8662) 2023-11-06 14:47:57 +01:00
Alex Sharov
329d18ef6f
snapshots: reduce merge limit of blocks to 100K (#8614)
Reason: 
- Produce and seed snapshots earlier on the chain tip. Reduce dependency on
"good peers with history" in the p2p network.
Some networks do not have many archive peers, and ConsensusLayer clients
are not good (not incentivised) at serving history.
- Avoid having too many files:
more files (shards) means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
fewer files means small files will be removed after merge (no peers for
these files).


ToDo:
[x] Recent 500K - merge up to 100K 
[x] Older than 500K - merge up to 500K 
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)
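
One way to read the two merge rules above, as a sketch only. It assumes "recent" means within 500K blocks of the current chain tip; the function `mergeLimit` and the constant names are hypothetical and may not match how erigon computes the limit.

```go
package main

import "fmt"

const (
	recentWindow = 500_000 // blocks counted back from the tip that are treated as "recent" (assumption)
	recentLimit  = 100_000 // merge limit for recent blocks
	oldLimit     = 500_000 // merge limit for older blocks
)

// mergeLimit (hypothetical) returns the largest merged file size, in
// blocks, allowed for a range starting at fromBlock given the chain tip.
func mergeLimit(fromBlock, tip uint64) uint64 {
	// The tip < recentWindow check also guards the unsigned subtraction.
	if tip < recentWindow || fromBlock >= tip-recentWindow {
		return recentLimit
	}
	return oldLimit
}

func main() {
	tip := uint64(2_000_000)
	fmt.Println(mergeLimit(1_900_000, tip)) // within the recent window
	fmt.Println(mergeLimit(1_000_000, tip)) // older than the recent window
}
```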

In next PR: 
[] Old versions of Erigon must be able to download recent hashes. To
achieve this, at first start erigon will download the preverified hashes
.toml from s3 and, if it is newer than what we have (built-in), use it.
2023-11-01 23:22:35 +07:00
Alex Sharov
b311da959f
downloader: webseed better error messages (#8611) 2023-10-30 12:13:45 +07:00
Andrew Ashikhmin
8231cdaede
downloader: less webseed logs (#8586) (#8607)
Merge PR #8586 into `devel`

Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com>
2023-10-29 10:37:50 +01:00
Alex Sharov
3d3c0bec18
downloader: log.Trace non-important log messages instead of skipping it (#8574) 2023-10-25 11:35:22 +07:00
Alex Sharov
12d33d516c
downloader: suppress some "EOF" logs (#8519) 2023-10-19 11:29:35 +07:00
Giulio rebuffo
82f1e9f342
Skip blockhashes stage (#8497) 2023-10-18 16:39:11 +02:00
Alex Sharov
6edaee1853
downloader: less webseed logs (#8510) 2023-10-18 15:27:41 +07:00
Alex Sharov
33d5399436
downloader: support token (#8507) 2023-10-18 14:24:09 +07:00
Alex Sharov
9db82fee5b
downloader: don't lose debug logs (#8473) 2023-10-14 17:11:39 +07:00
Alex Sharov
7d357b4327
better webseed support (#8466) 2023-10-13 18:03:52 +07:00
Alex Sharov
fe263ae02e
downloader: smaller ratelimit burst (#8435) 2023-10-11 16:18:21 +07:00
Alex Sharov
fa3b8c23b2
Downloader: step towards more complex datadir (#8286)
migration included - no manual actions required
2023-10-04 11:01:02 +07:00
Alex Sharov
47f89c2b7e
torrent: allow --downloader.verify use all CPU's (#8253) 2023-09-22 08:47:27 +07:00
battlmonstr
231e468e19
Add 'erigon-lib/' from commit '93d9c9d9fe4bd8a49f7a98a6bce0f0da7094c7d3'
git-subtree-dir: erigon-lib
git-subtree-mainline: 3c8cbda8098cc073a668b9e9b0aafe6c361f17da
git-subtree-split: 93d9c9d9fe4bd8a49f7a98a6bce0f0da7094c7d3
2023-09-20 14:50:25 +02:00