Commit Graph

27 Commits

Author SHA1 Message Date
awskii
d0efd3c1ca
E3/4 restore state and commitment fix (#670)
- Fixed commitment issues in both erigon3 and erigon4
- Restored the update-based commitments approach
- Partially fixed state seeking
2022-10-11 07:24:25 +01:00
Alex Sharov
a63b054c1c
e3: prune limited amount before commit (#675) 2022-10-11 11:25:08 +07:00
Alex Sharov
b683ed435c
Compress params change (#651)
Main target: reduce RAM usage of Huffman tables and, if possible, improve
decompression speed. Compression speed is less important.

Experiments on a 74Gb uncompressed file (bsc
012500-013000-transactions.seg):
Ram - RAM needed just to open the compressed file (Huffman tables, etc.)
dec_speed - time of a loop with `word, _ = g.Next(word[:0])`
skip_speed - time of a loop with `g.Skip()`
```
| DictSize | Ram  | file_size | dec_speed | skip_speed |
| -------- | ---- | --------- | --------- | ---------- |
| 1M       | 70Mb | 35871Mb   | 4m06s     | 1m58s      |
| 512K     | 42Mb | 36496Mb   | 3m49s     | 1m51s      |
| 256K     | 21Mb | 37100Mb   | 3m44s     | 1m48s      |
| 128K     | 11Mb | 37782Mb   | 3m25s     | 1m44s      |
| 64K      | 7Mb  | 38597Mb   | 3m16s     | 1m34s      |
| 32K      | 5Mb  | 39626Mb   | 3m0s      | 1m29s      |
```
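
To read the table as relative changes, here is a minimal Go sketch that compares the 1M and 64K rows. The row values are copied from the table; the `row` type and `pctChange` helper are hypothetical names introduced only for this illustration. The table measures only the dictionary-size change, so the results need not match figures that also include sampling or compression speed.

```go
package main

import (
	"fmt"
	"time"
)

// row holds one line of the experiment table (hypothetical type, for illustration).
type row struct {
	dictSize   string
	ramMb      float64
	fileSizeMb float64
	dec        time.Duration // time of the g.Next loop
	skip       time.Duration // time of the g.Skip loop
}

// pctChange returns the change from base to v as a percentage of base.
func pctChange(base, v float64) float64 {
	return (v - base) / base * 100
}

func main() {
	// Values copied from the 1M and 64K rows of the table.
	base := row{"1M", 70, 35871, 4*time.Minute + 6*time.Second, 1*time.Minute + 58*time.Second}
	cand := row{"64K", 7, 38597, 3*time.Minute + 16*time.Second, 1*time.Minute + 34*time.Second}

	fmt.Printf("%s -> %s\n", base.dictSize, cand.dictSize)
	fmt.Printf("RAM:       %.1fx smaller\n", base.ramMb/cand.ramMb)
	fmt.Printf("file size: %+.1f%%\n", pctChange(base.fileSizeMb, cand.fileSizeMb))
	fmt.Printf("dec time:  %+.1f%%\n", pctChange(base.dec.Seconds(), cand.dec.Seconds()))
	fmt.Printf("skip time: %+.1f%%\n", pctChange(base.skip.Seconds(), cand.skip.Seconds()))
}
```

Swapping in another row for `cand` gives the same comparison for any other dictionary size in the table.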
 
Also, on sparser sampling: skipping superstrings where
`superstringNumber % 4 != 0` reduces the compression ratio by 1%
(checked on a big BSC file and a small (1gb) goerli file).
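
A minimal Go sketch of the sampling rule described above, assuming the superstrings are available as a slice. `sampleSuperstrings` and its parameters are hypothetical names for illustration, not the compressor's actual code.

```go
package main

import "fmt"

// sampleSuperstrings keeps only superstrings whose index is a multiple of
// samplingFactor, mirroring the rule "skip if superstringNumber % 4 != 0".
// Hypothetical helper for illustration; not taken from the compressor.
func sampleSuperstrings(superstrings [][]byte, samplingFactor int) [][]byte {
	if samplingFactor <= 1 {
		return superstrings // no sampling: every superstring is used
	}
	sampled := make([][]byte, 0, len(superstrings)/samplingFactor+1)
	for superstringNumber, s := range superstrings {
		if superstringNumber%samplingFactor != 0 {
			continue // skipped: this superstring is not used for pattern collection
		}
		sampled = append(sampled, s)
	}
	return sampled
}

func main() {
	superstrings := make([][]byte, 10)
	for i := range superstrings {
		superstrings[i] = []byte(fmt.Sprintf("superstring-%d", i))
	}
	// With samplingFactor=4, only indices 0, 4 and 8 are kept.
	for _, s := range sampleSuperstrings(superstrings, 4) {
		fmt.Println(string(s))
	}
}
```

With `samplingFactor=4` roughly a quarter of the input is scanned for patterns, which is where a compression speedup would come from under this assumption.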

So I think it is reasonable to use:
maxDictPatterns=64k
samplingFactor=4

Tradeoffs: sacrifice 5% of compression ratio for a 4x compression speedup
(I think even more), a 30% decompression speedup, and a 10x RAM reduction.

Release: I will not change existing snapshots. For now I will focus on
releasing new block snapshots and new history snapshots (Erigon3).
If I have time, I will re-compress existing snapshots later.
2022-10-05 17:54:48 +07:00
Alex Sharov
ca2ebac0f9
erigon3: step toward background snapshots build #663 2022-10-02 10:03:49 +07:00
Alex Sharov
784b6cc904
erigon3: build .vi after downloading (#659) 2022-09-29 12:14:45 +07:00
Artem Tsebrovskiy
4f5232504f
E3 agg commitment (#647)
* added commitment to aggregator

* added commitment evaluation by updates, fixed mainnet roothash mismatch

* added ability to change starting state of hph

* replayable erigon23 with commitment

* possible fix for eliasfano index read after close

* fixed db pruning and restart

* Initial fixes

* Debug

* clear downHashedLen for branch nodes

* Fix key length, cleanup

* Cleanup

* Cleanup

* picked aggregator updates

* fixed empty cell hash for ProcessUpdate evaluation

* hashBuffer moved from Cell to HexPatriciaHashed

* fixed codeHash incorrect renewal

* lint

* removed valuemergefn from history

* fixed lint

* fixed test

* rewritten fuzz test on hph

* fix for Win tests - do not remove tmp dir after test

* win

* fixup after merge

* close aggregator after test

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-09-26 15:59:24 +01:00
Alex Sharov
417cea6485
erigon22: non-pointer btree (#653) 2022-09-26 09:42:44 +07:00
Alex Sharov
f05cd214bd
aggregator22: read dir without idx (#638) 2022-09-18 17:38:43 +07:00
Alex Sharov
aad257bc0c
erigon22: skip tmp files by regexp (#637) 2022-09-13 16:01:41 +07:00
Alex Sharov
d93972c581
domain: docs of tables format (#595) 2022-08-18 15:02:24 +07:00
ledgerwatch
fadc9b21d1
[erigon2.2] Split 2.2 and 2.3 prototype (#548)
* Introduce access functions to history

* Add missing functions

* Add missing functions

* Add missing functions

* Changeover in the aggregator

* Intermediate

* Fix domain tests

* Fix lint

* Fix lint

* Fix lint

* Close files

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-28 08:47:13 +01:00
Alex Sharov
471d790348
kv.Del() remove second parameter (#554)
* save

* save

* save

* save

* save

* save

* save

* save
2022-07-26 12:47:08 +07:00
ledgerwatch
596d10ea2e
Split aggregator to 2.2 and 2.3 versions (#539)
* Split History from Domain

* Add History.prune

* More on history

* Fix HistoryHistory test

* Merge history files

* Scan file test for history

* Add aggregator for erigon 2.2

* Change to generics, introduce contexts

* Delete to belong to Aggregator

* Fix lint

* Fix lint

* Fix lint

* Fix lint

* Use pointers to InvertedIndex again

* Remove prints

* Close embedded InvertedIndex

* Fix closing files

* Print

* Update ci.yml

* More printing

* Fix

* Make InvertedIndex pointer inside History

* Fix

* Update ci.yml

* Remove print

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-23 09:06:52 +01:00
Alex Sharov
ebea2863c1
domain: files generic btree 2022-07-18 16:05:04 +07:00
ledgerwatch
707a89842d
Add function to get history without state (#501)
* Add function to get history without state

* Add recon functions

* Expose endMinimax

* Recon prints

* Add NoState access methods

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* History iterator

* Iterator

* history iterators to aggregator

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Print

* Print

* Add stats

* Remove time measurement

* Contexts for thread safety

* Partial iterators

* Fix

* Fix

* Not use SkipUncompressed

* Print

* Print

* Pass empty vals

* Parallel bitmap collection

* Print

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* Print

* Print

* Remove print

* Print

* Print

* Print

* Print

* Print

* Print

* Dedicated getter for Iterate

* Fix for storage 0

* Remove print

* do not perform unnecessary changes

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-02 19:38:34 +01:00
ledgerwatch
46bebb3317
[erigon2.2] Add ReadIndices aggregator to collect data (#500)
* [erigon2.2] Add ReadIndices aggregator to collect data

* Try

* Fix for history access

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-20 08:39:29 +01:00
ledgerwatch
234be664fc
Optimise history access for multiple files (#498)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-18 22:54:36 +01:00
ledgerwatch
945b0e9e0f
Fix merge of code files (#495)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-17 19:24:56 +01:00
ledgerwatch
df49481ddc
[erigon 2.2] Make keys always uncompressed, values compressed only for code (#492)
* Reduce allocations in domain and aggregator

* Make keys always uncompressed, values compressed only for code

* Functions to remake index

* Fix index recreation

* Test for reindex, fix

* Use uncompress vals in history

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-17 12:39:49 +01:00
ledgerwatch
e2c6ef0058
[erigon2.2] Fixes for inverted indices and domains for the prototype (#489)
* Better control of compress/uncompressed

* Add new function

* more careful pruning

* Printf

* Printf

* Fix DupSort

* Remove copying in prune

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-13 19:32:13 +01:00
ledgerwatch
6cad65e62b
[erigon2.2] Parallel build files and merge, change file names (#487)
* Parallel build files and merge, change file names

* Update ci.yml

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-12 10:14:18 +01:00
ledgerwatch
74ea75f9b8
[erigon2.2] Merge fixes, add historical access (#482)
* Merge fixes, add historical access

* Change API from AfterTxNum to BeforeTxNum

* Change API functions

* Change API functions

* Print

* Fix for non-existent items

* Remove prints

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-09 14:48:16 +01:00
ledgerwatch
a77e6425eb
Fixes for the Erigon 2 upgrade 2 prototype (#479)
* Print

* Remove print

* Remove print

* Fix one panic

* Fix duplicate collation

* Print

* Fix print

* fix maxSpan

* Reduce maxSpan

* Remove duplicate join

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-05 22:32:34 +01:00
ledgerwatch
157b4299e4
[erigon2] Continuation on domains and inverted indices, putting things together (#476)
* Add scan files tests, create new aggregator type

* Fix lint

* windows test fix

* Add delete test

* AggCollation

* More functions to Aggregator

* More aggregator functions

* Update

* More functions

* More functions

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-06-02 21:40:58 +01:00
ledgerwatch
c5a10975ab
[erigon2] Introduce inverted index type (#473)
* [erigon2] Introduce inverted index type

* More inverted index code

* More tests for inverted index

* Think about public and non-public APIs

* Minimise DB access when accessing history

* Work on iterator

* Implementation of inverted iterator

* Test for inverted index

* Assert end of iterators

* Merge of inverted index files and test

* Fix lint

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-05-31 18:42:04 +01:00
ledgerwatch
990e586823
[erigon2] Continuation of work on domains - merge static files (#466)
* Iteration over files - initial

* Fix iterator for multistep

* Add function

* More functions for merge

* Merge files

* More work on the merge

* Fix buildIndex

* Fix history test for test of not completely pruned db

* Prepare for merge test

* Merge file test

* Close files

* Move functions into separate file

* Print

* fix for closing index

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp.lan>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-05-29 19:57:09 +01:00
ledgerwatch
37d9944da9
[erigon2] State domains (move functionality out of aggregator) (#436)
* Domain

* First functions

* change year

* More on domain

* More to test

* More on test

* More on domains

* buildFiles

* More on domains

* Collation test

* Fix collate

* Add test for decompressors

* Restructure history tables

* Split history into 2 tables

* Fix lint

* Check index files in the test

* Close files

* Add file scanning

* Fix lint

* Fix lint

* Add readFromFiles

* Add ef history idx file

* Start cleanup

* More to cleanup, test for ef history

* More test

* Add prune to test

* Test for prune and fix

* Start history access

* History test

* Test for LastDup

* Fix one lint

* Workaround

* History tests

* Debug

* Fix

* Fix in history

* Fix lint

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-macbook-pro.home>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp.lan>
2022-05-24 18:59:57 +01:00