39 Commits

Author SHA1 Message Date
Alex Sharov
62fe81e4be
IH stage speedup and lmdb custom comparators support (#1080)
* etl.Loader - allow use of custom comparator

* log timing

* try now

* try now

* more performance

* etl.Loader - allow use of custom comparator

* working version

* simplify IH cursor

* clean

* squash

* squash

* squash

* squash

* squash

* squash

* squash

* clean

* add only unwind support

* squash

* squash

* clean

* fix test

* clean

* clean

* clean
2020-09-10 13:35:58 +01:00
Alex Sharov
522287ac18
Transactional cycle (#966)
* v0

* v1

* v3

* v4

* clean

* temporary fix of txpool

* Add debug logs about tx start/commit

* Add debug logs about tx start/commit

* Add debug logs about tx start/commit

* add condition

* tx pool to not hold own db

* try to enable TxPool in integration

* exclude tx pool from tx

* exclude tx pool from integration

* reduce limit

* fix integration

* clean

* clean

* clean

* clean

* clean

* exclude tx pool unwind

* exclude tx pool unwind in integration

* fix integration tx pool

* fix commit

* fix current stage after unwind

* fix current stage after unwind

* fix linter

* move unwind of tx_pool after unwind of unwind of senders, then all stages from body to tx_pool will be inside tx.

* move body and headers unwind out of tx

* fix unwind order after reboot

* add support external tx to exec stage

* clean

* clean

* clean

* clean

* clean

* add logs

* better id check

* better id check
2020-08-26 07:02:10 +01:00
Igor Mandrigin
9b81829616
etl: create a subfolder in datadir for temp files. (#965) 2020-08-23 10:53:01 +01:00
Alex Sharov
a6be18b915
ticker-based logs (#954)
* timer-based logs

* timer-based logs

* delegate progress calculation to user

* delegate progress calculation to user

* delegate progress calculation to user

* clear

* add logs to senders recovery

* use default dir in integration

* more logs

* more logs
2020-08-22 12:12:33 +02:00
Igor Mandrigin
f46a72bdd9
Print approximate progress for ETL stages (#950) 2020-08-21 07:30:30 +01:00
Alex Sharov
0e253e7336
lmdb transactions of unlimited size (#918)
* add logging to loader

* use pure tx in etl loading, logs in mutation commit

* clean

* bletter logging and more cleanup

* bletter logging and more cleanup

* increase batch size to 500M

* better batch commit logging

* async fsync

* sync fsync

* sync fsync

* unify logging

* fix corner-case when etl can use empty bucket name

* fix tests

* better logging

* better logging

* rebase master

* remove lmdb.NoMetaSync flag for now

* consistent walk and multi-walk

* clean

* sub tx

* add consistent multi-put

* implement dupsort support in one new cursor method

* clear
2020-08-17 07:45:52 +01:00
Alex Sharov
16f09be0a8
add method .Last() (#909) 2020-08-12 10:49:52 +07:00
Alex Sharov
d9d9e14f45
change bucket type to string (#894) 2020-08-11 06:55:32 +07:00
Igor Mandrigin
bbb918ea6c etl: update docs with the correct sort method 2020-08-06 18:22:58 +02:00
Igor Mandrigin
ba9f9ee7dd
Add docs for common/etl (#878)
* ETL readme

* add a link to the explainer

* fixups

* a note on marshaling
2020-08-06 14:02:41 +02:00
Igor Mandrigin
1495979983
etl: only print file sizes for actual files, not RAM (#877) 2020-08-06 09:33:09 +02:00
Igor Mandrigin
fa89cdefd5
log ETL temp files datasize (#873) 2020-08-05 16:33:58 +01:00
Alex Sharov
d458c4cc1e
Migrations: use stage name as db key (#868) 2020-08-05 17:13:35 +07:00
ledgerwatch
0514501937
Revert "Use stage name as db key (#858)" (#867)
This reverts commit 7290bf519edd9d8f66aa444e1866d47f10deb5c7.
2020-08-04 11:05:27 +01:00
Alex Sharov
7290bf519e
Use stage name as db key (#858) 2020-08-04 16:25:28 +07:00
Alex Sharov
7f60d1b535
Remove load start key from etl (#757)
* limit incremental step size of stages 5 and 6

* remove limit, make incremental

* remove_load_start_key_from_etl

* switch to "struct of arrays" from "array of structs"

* switch to "struct of arrays" from "array of structs"

Co-authored-by: Alexey Akhunov <akhounov@gmail.com>
2020-07-18 10:17:04 +01:00
Alex Sharov
6043c09ddf
Replace global buff pool by local, because of buffer leaks to larger pools (#739) 2020-07-14 08:56:47 +07:00
ledgerwatch
665b1d88b9
Speed up GenerateChain by using intermediate hashes (#736)
* GenerateChain to support intermediate hashes

* Fix formatting

* Changeset stats

* Fix formatting
2020-07-10 22:37:34 +01:00
Alex Sharov
305f8aff02
Integration tests2 (#714)
```
go run ./cmd/integration reset --chaindata=...
go run ./cmd/integration state_stages -h
go run ./cmd/integration state_stages  --chaindata=... --verbosity=3 --block=2_000_000 --unwind=10 --unwind_every=1_000 
```
Also, it inherits flags from geth:
```
--pprof.cpuprofile=./cpu.out   // to file
--pprof --pprof.port=6060     // launch pprof server
--metrics                  //  sends to prometheus 
```
2020-07-07 11:00:25 +07:00
ledgerwatch
d713de4713
Fix index generation for storage when incarnation changes (#716)
* Fixing history generation

* Try fixes

* Reset from 0

* Cleanup
2020-07-06 07:34:24 +01:00
Alex Sharov
98f8ccc561
Command for long and heavy integration tests (#712)
* cmd for integration tests

* cmd for integration tests
2020-07-05 07:18:21 +01:00
ledgerwatch
b498419dcb
Fix loader crash (kludge) (#706)
* Kludge to remove duplicates

* Fix

* Fix formatting

* Fix in Len function
2020-07-03 08:08:35 +01:00
ledgerwatch
137daa6c67
Fixes in generated changeSet and checkChangeSets (#698)
* Debugging changesets

* Kinda works

* Fix compile error

* Fix formatting

* Fix lint

* Duplicate entries kludge

* Fix compile error

* Cleanup
2020-07-01 15:56:56 +01:00
ledgerwatch
ed866e6934
non-concurrent ETL, debug_traceTransaction in rcpdaemon (#692)
* Fixing history index

* Remove chunk generation, fix formatting

* Fix compile error, clean up hack.go

* Fix output tests

* Fix index generator test

* Fixed checkChangeSets

* Fix linter
2020-06-28 07:10:27 +01:00
Alex Sharov
b26b00f72b
increase IO buffers size (#686)
* increase IO buffers size

* found SSD threshold
2020-06-25 08:34:40 +01:00
b00ris
7ac009a7da
Fix last chunk order (#683)
* fix commit order

* fix lint
2020-06-21 15:13:17 +01:00
ledgerwatch
fd98914c28
Stage6 - intermediate hashes (#677)
* First cut of the stage6

* Fix formatter

* Introduce state6

* Fix target number for stage6

* Fix linter

* Fix linter

* Reset in regenerate

* Correct block number

* Fix linter

* Fix linter

* Fix

* Upper case

* Skeleton to debug'

* Fix formatting

* Added codehash correction

* Fix TestUnwind

* fix test

* Fix linter

* Introduce unwind6

* Report adjustment error

* Code hashes included into stage5 incremental promotion and unwind

* Fix formatting

* Cleanup

* fix TestUnwind

* Cleanup

* Fix compile error

* Fix formatting

* unwind4

* Disable verifyRoot in stage5, fix hacks

* Remove verifyRoot function

Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
2020-06-18 22:27:11 +01:00
b00ris
7df7880a5a
[etl] Split collect&commit to separate threads (#676)
* fast search

* fix

* remove copy bytes

* speedup collector

* move logs to commit gr

* fix lint

* fix test
2020-06-18 19:14:10 +01:00
Igor Mandrigin
96750b3711
Support ETL interruptions while promoting plain state (#662)
* lil etl changez

* fix test compile

* fixups to the unwinds

* add new buffer mode

* support unfinished transitions

* fixups

* fix tests

* linters

* linters
2020-06-13 16:03:38 +01:00
b00ris
b4ba764fb1
[WIP] TxLookup stage (#646)
* save state

* txlookup full results

* save state

* save state

* remove experiments

* some fix&lint

* add end key to txLookup and index generation

* change log message

* change log

* fix lint

* lint

* fix test
2020-06-10 23:07:14 +03:00
ledgerwatch
a8b438d21d
Incremental promotion for stage 5 (#628)
* Incremental promotion

* Fix compile errors

* Fix linter

* Transform state keys before loading

* Re-enable and fix the tests

* Fix formatting
2020-06-06 06:36:29 +01:00
Alex Sharov
f0bc2b2146
Run tests on lmdb and badger (#624)
* lmdb tests

* trigger ci

* fix tests

* disable parallelism

* disable parallelism

* cleanup resources

* cleanup resources

* reduce concurency

* try run tests on bolt

* try run tests on bolt

* fix downloader test

* run bolt tests

* rely on interface instead of exact instance

* Rename AbstractKV to KV

* don't use separator for badger

* don't initialize badger cursor - because it not used here

* fix linter

* try reduce badger compactors

* compat with master

* try lmdb

* try lmdb

* try lmdb

* reduce badger's MaxTableSize, reduce badger's minGoMaxProc for inMem option

* allow to close closed db

* release

* release

* ideal batch size for badger

* ideal batch size for badger
2020-06-05 10:25:33 +01:00
ledgerwatch
e5692d1726
Various fixes to staged sync and experiments (#608)
* First commit

* Fix the one-off error

* Fix formatting

* Ability to execute stage5 separately

* Clean up intermediate hashes and stage5 progress

* Fix linter

* Print original keys when extracting

* channel

* More logging

* More logging

* Remove excess logging

* fix stage2

* Revert

* Fix stage2

* Add provider exhausted

* Sort sortable buffer

* Fix test

* Another cleanup

* Remove exhaust log
2020-06-03 13:03:40 +01:00
Igor Mandrigin
c16d3da1b4
etl: startkey for Load + OnLoadCommit callback (#601) 2020-06-01 17:14:40 +03:00
ledgerwatch
7ab10c85af
Separate Collector pattern out of ETL, hash collector for rebuilding Intermediate Hashes (#597)
* Introduce hashCollector

* Add HashCollector to SubTrieLoader

* Fix linter

* Reset hashed state

* Not to regenerate the hashed state

* Not to delete state

* Fix linter

* Print expected hash in the beginning

* Simplify

* Remove initialTrie

* Use etl to buffer intermediate hashes

* Copy values, not insert empty key

* Compress instead of decompress

* Enhance file buffer logging, fix linter

* Fix compile errors

* Fix log

* Fix logging

* Exclude zero key again

* Add rewind

* Restrict timestamps

* Fix

* Fix formatting

* Incorporate separation

* Extract identityLoadFunction

* Fix formatting
2020-05-31 13:23:34 +01:00
Igor Mandrigin
786561e65e
don't write files if all data fits into one buffer. (#594) 2020-05-31 08:32:33 +01:00
b00ris
266f9a5208
Plain state index (#595)
* plain state indexes

* generalize index generation&tests

* add plain state tests

* fix regenerate index

* сoncurrent changeset chunks processing

* remove concurrency

* fix lint

* remove comments

* add test to truncate

* fix conflicts

* fmt

* remove shadowing
2020-05-31 07:57:47 +01:00
Evgeny Danilenko
32ec443572
Restore graceful shutdown (#591)
* initial

* gracefull shutdown for staged

* more wg fixes

* fmt

* linters

* remove generalization

* linters

* linters

* fix

* fix

* fmt

* quit into etl

* after CR

* after CR
2020-05-30 16:44:54 +03:00
Igor Mandrigin
ebe7aec14e
Generalize "insert into DB though sorted files" pattern. (#592) 2020-05-30 10:00:35 +03:00