# Downloader
A service that seeds and downloads historical data (snapshots: immutable .seg files) over the BitTorrent protocol.
## Start Erigon with snapshots support
Like many other Erigon components (txpool, sentry, rpc daemon), it can be built into Erigon or run as a separate process.
```shell
# 1. By default, Downloader runs inside Erigon when the `--snapshots` flag is set:
erigon --snapshots --datadir=<your_datadir>
```
```shell
# 2. Downloader can also be started as an independent process; Erigon then connects to it with the `--snapshots --downloader.api.addr=127.0.0.1:9093` flags:
make erigon downloader

# Start Downloader (network usage can be limited, e.g. to 512mb/sec: --torrent.download.rate=512mb --torrent.upload.rate=512mb)
downloader --downloader.api.addr=127.0.0.1:9093 --torrent.port=42068 --datadir=<your_datadir>
# --downloader.api.addr - for internal communication with Erigon
# --torrent.port=42068 - public listen port for the BitTorrent protocol

# On startup Erigon sends the list of .torrent files to Downloader and waits until the download is 100% complete
erigon --snapshots --downloader.api.addr=127.0.0.1:9093 --datadir=<your_datadir>
```
Use `--snap.keepblocks=true` to keep retired blocks in the DB instead of deleting them.

Any network/chain can start with snapshot sync:

- The node downloads only snapshots registered in the following repo: https://github.com/ledgerwatch/erigon-snapshot
- The node moves old blocks from the DB into snapshots of 1K blocks each, then merges them into progressively larger ranges, up to snapshots of 500K blocks, and then automatically starts seeding the new snapshot

The `--snapshots` flag is compatible with the `--prune` flag.
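
The snapshot sizes mentioned above (1K-block snapshots merged up to 500K-block ranges) can be made concrete with a little arithmetic. The sketch below is only an illustration, not Erigon's code: it assumes merged snapshots are aligned to multiples of 500K blocks (the text gives the sizes but not the alignment), and `segment_range` is a hypothetical helper name.

```shell
#!/bin/sh
# Illustrative only: assumes merged snapshots start at multiples of 500_000 blocks.
SMALL=1000        # size of a freshly retired snapshot, in blocks
LARGE=500000      # largest merged snapshot size, in blocks

# segment_range BLOCK: print the FROM-TO range (TO exclusive) of the
# 500K-block snapshot that would contain BLOCK under the alignment assumption.
segment_range() {
    from=$(( $1 / LARGE * LARGE ))
    echo "${from}-$(( from + LARGE ))"
}

segment_range 1234567    # prints 1000000-1500000
echo "$(( LARGE / SMALL )) snapshots of ${SMALL} blocks make up one merged snapshot"
```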
## How to create a new network or bootnode
```shell
# A new network or bootnode needs new snapshots, which then have to be seeded

# Create new snapshots (the range and snapshot size can be changed with: --from=0 --to=1_000_000 --segment.size=500_000)
# This dumps blocks from the database to .seg files:
erigon snapshots retire --datadir=<your_datadir>

# Create .torrent files (Downloader automatically seeds all .torrent files)
# The output format is compatible with https://github.com/ledgerwatch/erigon-snapshot
downloader torrent_hashes --rebuild --datadir=<your_datadir>

# Start Downloader (it seeds automatically)
downloader --downloader.api.addr=127.0.0.1:9093 --datadir=<your_datadir>

# Erigon is not required for seeding snapshots, but Erigon with --snapshots also seeds.
```
Additional info:
```shell
# Creating snapshots does not require a fully synced Erigon; the first few stages are enough. For example:
STOP_AFTER_STAGE=Senders ./build/bin/erigon --snapshots=false --datadir=<your_datadir>
# For security, however, a fully synced Erigon is preferable

# Erigon can use snapshots only after indexing them. Erigon indexes them automatically, but indexing can also be run manually (this step is not required for seeding):
erigon snapshots index --datadir=<your_datadir>
```
## Architecture
Downloader works from the `<your_datadir>/snapshots/*.torrent` files. Such files can be created in 4 ways:

- Erigon can make the gRPC call `downloader.Download(list_of_hashes)`, which triggers creation of the .torrent files
- Erigon can create a new .seg file; Downloader will scan the .seg file and create a .torrent
- An operator can manually copy .torrent files (rsync from another server or restore from a backup)
- An operator can manually copy a .seg file; Downloader will scan the .seg file and create a .torrent

Erigon does:

- connect to Downloader
- share the list of hashes (see https://github.com/ledgerwatch/erigon-snapshot)
- wait for all snapshots to download
- when a .seg file becomes available, automatically create .idx files (secondary indices, for example to find a block by hash)
- then switch to normal staged sync (which doesn't require a connection to Downloader)
- ensure that snapshot downloading happens only once: even if a new Erigon version includes new pre-verified snapshot hashes, Erigon will not download them (to avoid unpredictable downtime), but Erigon may produce them itself

Downloader does:

- read .torrent files and download everything they describe
- use the tracker list from https://github.com/ngosang/trackerslist, see [./trackers/embed.go](../../../erigon-lib/downloader/trackers/embed.go)
- seed automatically

Technical details:

- To prevent attacks, .idx files are created using a random seed, so every node has different .idx files (and the same .seg files)
- If you add or remove any .seg file manually, you also need to remove the `<your_datadir>/snapshots/db` folder
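
Since Downloader works from .torrent files, it can be handy to see which .seg files in the snapshots directory do not have one yet (for example right after copying .seg files by hand). The following is a minimal sketch, not part of Erigon; it assumes each .torrent sits next to its .seg file and is named `<file>.seg.torrent`, and `list_unseeded` is a hypothetical helper name.

```shell
#!/bin/sh
# list_unseeded DIR: print every .seg file in DIR that has no matching
# "<file>.torrent" next to it (this naming convention is an assumption).
list_unseeded() {
    for seg in "$1"/*.seg; do
        [ -e "$seg" ] || continue            # the glob matched nothing
        [ -e "$seg.torrent" ] || echo "$seg"
    done
}

# Example call: list_unseeded "<your_datadir>/snapshots"
```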
## How to verify that .seg files have the same checksum as the current .torrent files
```
# Use it if you see weird behavior, bugs, bans, hardware issues, etc...
downloader torrent_hashes --verify --datadir=<your_datadir>
```
## Faster rsync
```
rsync -aP --delete -e "ssh -T -o Compression=no -x" <src> <dst>
```
## Release details
To start automatically committing new hashes to branch `master`, add a cron job:
```
crontab -e
@hourly cd <erigon_source_dir> && ./cmd/downloader/torrent_hashes_update.sh <your_datadir> <network_name> 1>&2 2>> ~/erigon_cron.log
```
The script pushes to branch `auto`; before a release, merge `auto` into `main` manually.
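
One detail of the cron line above is the order of its redirections, which the shell applies left to right. The sketch below uses a hypothetical `emit` function and temporary files to show the difference: `1>&2 2>> file` appends only stderr to the file (stdout goes wherever stderr pointed before), while `>> file 2>&1` appends both streams. Whether the order in the cron line is intentional is not stated here, so this is background rather than a correction.

```shell
#!/bin/sh
# emit writes one line to stdout and one line to stderr.
emit() { echo "to-stdout"; echo "to-stderr" >&2; }

log_a=$(mktemp)
log_b=$(mktemp)

# Same order as in the cron line: stdout is first pointed at the current
# stderr, then stderr alone is redirected to the file.
# ("to-stdout" therefore shows up on this script's stderr.)
emit 1>&2 2>> "$log_a"

# Reversed order: stdout goes to the file, then stderr follows stdout.
emit >> "$log_b" 2>&1

a=$(cat "$log_a")
b=$(cat "$log_b")
rm -f "$log_a" "$log_b"

echo "1>&2 2>> file captured: $a"
echo ">> file 2>&1 captured:"
echo "$b"
```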