Commit Graph

123 Commits

Author SHA1 Message Date
Alex Sharov
06cacb67a0
some fail-fast asserts about merge files (#678) 2022-10-12 17:23:34 +07:00
Alex Sharov
0eab2a3dd1
e3: prevent files ranges overlap (kill -9 during merge handle) (#674) 2022-10-12 10:18:51 +07:00
awskii
d0efd3c1ca
E3/4 restore state and commitment fix (#670)
- Fixed commitment issues both erigon3/erigon4
- get back update-based commitments approach
- partially fixed state seeking
2022-10-11 07:24:25 +01:00
Alex Sharov
a63b054c1c
e3: prune limited amount before commit (#675) 2022-10-11 11:25:08 +07:00
Alex Sharov
91cc20a34b
erigon3: cli command to force merge snapshots (#672) 2022-10-10 09:47:05 +07:00
Alex Sharov
1ce5610eea
e3: agg atomic (#671) 2022-10-09 20:16:26 +07:00
alex.sharov
77d3a90936 Revert "save"
This reverts commit f24d3231ac.
2022-10-09 18:51:13 +07:00
alex.sharov
f24d3231ac save 2022-10-09 18:50:51 +07:00
Alex Sharov
2204990464
e3: fix close nil ptr (#669) 2022-10-06 12:20:28 +07:00
Alex Sharov
b683ed435c
Compress params change (#651)
Main target: reduce RAM usage of Huffman tables and, if possible, improve
decompression speed. Compression speed is less important.

Experiments on a 74Gb uncompressed file (bsc
012500-013000-transactions.seg):
Ram - memory needed just to open the compressed file (Huffman tables, etc.)
dec_speed - loop with `word, _ = g.Next(word[:0])`
skip_speed - loop with `g.Skip()`
```
| DictSize | Ram  | file_size | dec_speed | skip_speed |
| -------- | ---- | --------- | --------- | ---------- |
| 1M       | 70Mb | 35871Mb   | 4m06s     | 1m58s      |
| 512K     | 42Mb | 36496Mb   | 3m49s     | 1m51s      |
| 256K     | 21Mb | 37100Mb   | 3m44s     | 1m48s      |
| 128K     | 11Mb | 37782Mb   | 3m25s     | 1m44s      |
| 64K      | 7Mb  | 38597Mb   | 3m16s     | 1m34s      |
| 32K      | 5Mb  | 39626Mb   | 3m0s      | 1m29s      |
```
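
The two timing loops named above (dec_speed and skip_speed) can be sketched as follows. This is a minimal illustration, not the benchmark that produced the table: `wordGetter`, its `HasNext` and `Reset` methods, and the in-memory word list are hypothetical stand-ins for the getter `g`; only the calls `g.Next(word[:0])` and `g.Skip()` come from the message.

```go
package main

import (
	"fmt"
	"time"
)

// wordGetter is a hypothetical in-memory stand-in for the getter `g`.
// It is an assumption made for illustration, not the library's type.
type wordGetter struct {
	words [][]byte
	pos   int
}

// HasNext reports whether any words remain.
func (g *wordGetter) HasNext() bool { return g.pos < len(g.words) }

// Next appends the next word to buf and returns the extended slice.
// The second return value here is just the new position (an assumption).
func (g *wordGetter) Next(buf []byte) ([]byte, uint64) {
	w := g.words[g.pos]
	g.pos++
	return append(buf, w...), uint64(g.pos)
}

// Skip advances past the next word without copying it.
func (g *wordGetter) Skip() uint64 {
	g.pos++
	return uint64(g.pos)
}

// Reset rewinds the getter to the first word.
func (g *wordGetter) Reset() { g.pos = 0 }

// decodeAll mirrors the dec_speed loop: every word is copied into a
// buffer that is reused between iterations via word[:0].
func decodeAll(g *wordGetter) (int, time.Duration) {
	start := time.Now()
	count := 0
	var word []byte
	for g.HasNext() {
		word, _ = g.Next(word[:0])
		count++
	}
	return count, time.Since(start)
}

// skipAll mirrors the skip_speed loop: words are stepped over, not copied.
func skipAll(g *wordGetter) (int, time.Duration) {
	start := time.Now()
	count := 0
	for g.HasNext() {
		g.Skip()
		count++
	}
	return count, time.Since(start)
}

func main() {
	g := &wordGetter{words: [][]byte{[]byte("alpha"), []byte("beta"), []byte("gamma")}}

	n, d := decodeAll(g)
	fmt.Printf("decoded %d words in %s\n", n, d)

	g.Reset()
	n, d = skipAll(g)
	fmt.Printf("skipped %d words in %s\n", n, d)
}
```

Reusing the buffer with `word[:0]` keeps allocations out of the measured loop, so the timing reflects decoding work rather than garbage collection.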
 
On sampling: skipping superstrings where superstringNumber % 4 != 0
reduces the compression ratio by 1% (checked on a big BSC file and a
small (1gb) goerli file).
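
The sampling rule above can be sketched as a simple filter. This is an illustration under assumptions, not the compressor's code: `sampleSuperstrings` is a hypothetical helper, and superstrings are modeled as plain byte slices numbered from 0.

```go
package main

import "fmt"

// samplingFactor matches the value proposed in the message (samplingFactor=4).
const samplingFactor = 4

// sampleSuperstrings is a hypothetical helper: it keeps a superstring only
// when its sequence number is a multiple of factor, i.e. it skips every
// superstring for which superstringNumber % factor != 0.
func sampleSuperstrings(superstrings [][]byte, factor int) [][]byte {
	if factor <= 1 {
		return superstrings // no sampling requested
	}
	sampled := make([][]byte, 0, len(superstrings)/factor+1)
	for superstringNumber, s := range superstrings {
		if superstringNumber%factor != 0 {
			continue
		}
		sampled = append(sampled, s)
	}
	return sampled
}

func main() {
	in := make([][]byte, 10)
	for i := range in {
		in[i] = []byte(fmt.Sprintf("superstring-%d", i))
	}
	out := sampleSuperstrings(in, samplingFactor)
	fmt.Println(len(out)) // prints 3: numbers 0, 4 and 8 are kept
}
```

Only a quarter of the superstrings reach pattern extraction in this sketch, which is where a compression speedup would come from.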

So I think it is a reasonable idea to use:
maxDictPatterns=64k
samplingFactor=4

Tradeoffs: sacrifice 5% compression ratio for a 4x compression speedup (I
think even more), a 30% decompression speedup, and a 10x RAM reduction.

Release: I will not change existing snapshots. For now I will focus on
releasing new block snapshots and new history snapshots (Erigon3).
If I have time, I will re-compress existing snapshots later.
2022-10-05 17:54:48 +07:00
Alex Sharov
746b31def2
agg22 madv helpers (#668) 2022-10-05 13:17:23 +07:00
Alex Sharov
980eeacbd0
eliasfano32.Max() method on serialized bytes (#664) 2022-10-04 10:51:34 +01:00
Alex Sharov
ca2ebac0f9
erigon3: step toward background snapshots build #663 2022-10-02 10:03:49 +07:00
Alex Sharov
8d5cf0170a
agg print stats at startup #662 2022-10-01 09:25:59 +07:00
Alex Sharov
784b6cc904
erigon3: build .vi after downloading (#659) 2022-09-29 12:14:45 +07:00
Alex Sharov
ec49625cd9
erigon3: allow set workers amount for history compress and merge #657 2022-09-28 14:31:28 +07:00
Alex Sharov
6c929b7771
erigon3: simplify history reader (fixing edge case of reading history from files) (#658) 2022-09-28 13:48:13 +07:00
awskii
e1860348b2
reverted minHeap at elias-fano merge (#655)
* reverted minHeap at elias-fano merge

* skip ef merge test for now
2022-09-27 11:54:29 +01:00
Artem Tsebrovskiy
4f5232504f
E3 agg commitment (#647)
* added commitment to aggregator

* added commitment evaluation by updates, fixed mainnet roothash mismatch

* added ability to change starting state of hph

* replayable erigon23 with commitment

* possible fix for eliasfano index read after close

* fixed db pruning and restart

* Initial fixes

* Debug

* clear downHashedLen for branch nodes

* Fix key length, cleanup

* Cleanup

* Cleanup

* picked aggregator updates

* fixed empty cell hash for ProcessUpdate evaluation

* hashBuffer moved from Cell to HexPatriciaHashed

* fixed codeHash incorrect renewal

* lint

* removed valuemergefn from history

* fixed lint

* fixed test

* rewritten fuzz test on hph

* fix for Win tests - do not remove tmp dir after test

* win

* fixup after merge

* close aggregator after test

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-09-26 15:59:24 +01:00
Alex Sharov
7790688724
erigon3: build .efi after download #654 2022-09-26 15:26:58 +07:00
Alex Sharov
417cea6485
erigon22: non-pointer btree (#653) 2022-09-26 09:42:44 +07:00
Alex Sharov
f05cd214bd
aggregator22: read dir without idx (#638) 2022-09-18 17:38:43 +07:00
ledgerwatch
10a15edebc
[erigon22] not to overwrite files after state reconstitution (#642)
* Print

* Skip finishTx

* Correct skip

* Fix

* Fix

* Remove print

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-09-16 08:34:11 +01:00
Alex Sharov
aad257bc0c
erigon22: skip tmp files by regexp (#637) 2022-09-13 16:01:41 +07:00
Alex Sharov
4858acfb2e
fix lint (#632) 2022-09-09 21:07:39 +07:00
Alex Sharov
4fea8e9ba2
erigon22: history iterator v3 #630 2022-09-08 14:01:32 +07:00
Alex Sharov
6db97dbe2d
enable some test (#629) 2022-09-08 11:19:32 +07:00
Alex Sharov
e6276aeea8
erigon22: history iterator v2 (#628) 2022-09-08 11:09:54 +07:00
Alex Sharov
c22f737b87
Erigon22: use history iterator #627 2022-09-07 15:57:28 +07:00
Alex Sharov
f8060aa75d
erigon22: HistoryIterator1 v1 (#626) 2022-09-07 14:40:39 +07:00
Alex Sharov
841fe604f9
erigon22: fix infinite loop #624 2022-09-06 13:56:07 +07:00
Alex Sharov
775ace2e37
erigon22: historyReader22 and more tests #623 2022-09-06 13:54:58 +07:00
Alex Sharov
e40691a4ad
history22: small renames #608 2022-08-29 11:07:10 +07:00
Alex Sharov
588519a33b
erigon22: recent history read (#605) 2022-08-28 11:25:53 +07:00
Alex Sharov
cfd14d0297
erigon22: step toward /tests 2022-08-25 15:31:59 +07:00
Andrew Ashikhmin
23c7f503e0
WithTablessCfg -> WithTableCfg (#601) 2022-08-24 11:02:47 +02:00
Alex Sharov
c7cf5b6530
clean (#599) 2022-08-22 15:56:18 +07:00
Alex Sharov
eab2010195
InvertedIndex don't lose last key (#597)
* save

* save
2022-08-22 15:45:59 +07:00
alex.sharov
36778a2db3 save 2022-08-22 10:33:14 +07:00
alex.sharov
abcfb230fc save 2022-08-19 11:23:56 +07:00
Alex Sharov
d93972c581
domain: docs of tables format (#595) 2022-08-18 15:02:24 +07:00
Alex Sharov
0b4dcfb43d
erigon22: unwind code (#591)
* save

* save
2022-08-17 16:37:42 +07:00
Alex Sharov
59dfcc471c
erigon22: prune - check key existence (#588) 2022-08-15 14:33:32 +07:00
Alex Sharov
4945162dd7
erigon22: unwind code #587 2022-08-15 10:27:08 +07:00
ledgerwatch
e160c1ad9c
Optimise state erigon2.2 reconstitution (#570)
* Start iterator1

* No parallel buildFiles and mergeFiles

* Optimise GetNoState

* Fixes

* Fix 2

* Another fix

* Fix

* More changes iter

* Provide keys in ScanIterator

* Tables for bitmaps

* Add X tables

* Change signature of GetNoState

* More on changes iterator

* Test for changed keys iterator

* ReconDb tables

* Changed key iterator

* Fix lint

* Fix lint

* uncovert

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp.lan>
2022-08-14 14:56:47 +01:00
Alex Sharov
404276494a
state22.Unwind() (#586) 2022-08-14 17:53:53 +07:00
Alex Sharov
0b68b61b52
fix for loop 2022-08-14 10:21:38 +07:00
Alex Sharov
27ce06026f
Aggregator22.Unwind() (#584) 2022-08-13 18:51:23 +07:00
Alex Sharov
95e94b2eb5
erigon22: optimize index.add (#571)
* save

* save
2022-08-09 10:28:29 +07:00
Alex Sharov
2be46669d5
Progress type (#568) 2022-08-04 12:31:17 +07:00
ledgerwatch
fadc9b21d1
[erigon2.2] Split 2.2 and 2.3 prototype (#548)
* Introduce access functions to history

* Add missing functions

* Add missing functions

* Add missing functions

* Changeover in the aggregator

* Intermediate

* Fix domain tests

* Fix lint

* Fix lint

* Fix lint

* Close files

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-28 08:47:13 +01:00
Alex Sharov
f9164fdc82
Log readable (#556)
* save

* save

* save
2022-07-27 12:09:07 +07:00
Alex Sharov
471d790348
kv.Del() remove second parameter (#554)
* save

* save

* save

* save

* save

* save

* save

* save
2022-07-26 12:47:08 +07:00
ledgerwatch
596d10ea2e
Split aggregator to 2.2 and 2.3 versions (#539)
* Split History from Domain

* Add History.prune

* More on history

* Fix HistoryHistory test

* Merge history files

* Scan file test for history

* Add aggregator for erigon 2.2

* Change to generics, introduce contexts

* Delete to belong to Aggregator

* Fix lint

* Fix lint

* Fix lint

* Fix lint

* Use pointers to InvertedIndex again

* Remove prints

* Close embedded InvertedIndex

* Fix closing files

* Print

* Update ci.yml

* More printing

* Fix

* Make InvertedIndex pointer inside History

* Fix

* Update ci.yml

* Remove print

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-23 09:06:52 +01:00
Alex Sharov
ebea2863c1
domain: files generic btree 2022-07-18 16:05:04 +07:00
ledgerwatch
9e7f22667e
[erigon2.2] FinishTx to aggregate with delay (to avoid MDBX panic) (#513)
* Add temporary table for Plain state reconstitution

* Add 2 more temp tables

* FinishTx with delay

* Fix search in history

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-06 06:42:40 +01:00
ledgerwatch
707a89842d
Add function to get history without state (#501)
* Add function to get history without state

* Add recon functions

* Expose endMinimax

* Recon prints

* Add NoState access methods

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* History iterator

* Iterator

* history iterators to aggregator

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Print

* Print

* Add stats

* Remove time measurement

* Contexts for thread safety

* Partial iterators

* Fix

* Fix

* Not use SkipUncompressed

* Print

* Print

* Pass empty vals

* Parallel bitmap collection

* Print

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* Print

* Print

* Remove print

* Print

* Print

* Print

* Print

* Print

* Print

* Dedicated getter for Iterate

* For for storage 0

* Remove print

* do not perform unnecessary changes

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-02 19:38:34 +01:00
ledgerwatch
46bebb3317
[erigon2.2] Add ReadIndices aggregator to collect data (#500)
* [erigon2.2] Add ReadIndices aggregator to collect data

* Try

* Fix for history access

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-20 08:39:29 +01:00
ledgerwatch
234be664fc
Optimise history access for multiple files (#498)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-18 22:54:36 +01:00
ledgerwatch
945b0e9e0f
Fix merge of code files (#495)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-17 19:24:56 +01:00
ledgerwatch
df49481ddc
[erigon 2.2] Make keys always uncompressed, values compressed only for code (#492)
* Reduce allocations in domain and aggregator

* Make keys always uncompressed, values compressed only for code

* Functions to remake index

* Fix index recreation

* Test for reindex, fix

* Use uncompress vals in history

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-17 12:39:49 +01:00
ledgerwatch
bbf96d0580
Close compressor (#491)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-14 22:38:18 +01:00
ledgerwatch
e2c6ef0058
[erigon2.2] Fixes for inverted indices and domains for the prototype (#489)
* Better control of compress/uncompressed

* Add new function

* more careful pruning

* Printf

* Printf

* Fix DupSort

* Remove copying in prune

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-13 19:32:13 +01:00
ledgerwatch
6cad65e62b
[erigon2.2] Parallel build files and merge, change file names (#487)
* Parallel build files and merge, change file names

* Update ci.yml

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-12 10:14:18 +01:00
ledgerwatch
45d4c21490
Expose inverted index ranges in aggregator (#486)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-11 12:09:24 +01:00
ledgerwatch
7ce8bd589f
[erigon 2.2] Add functions for traces and event logs (#485)
* Add functions for traces and event logs

* Add functions for traces and event logs

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-06-10 06:51:00 +01:00
ledgerwatch
74ea75f9b8
[erigon2.2] Merge fixes, add historical access (#482)
* Merge fixes, add historical access

* Change API from AfterTxNum to BeforeTxNum

* Change API functions

* Change API functions

* Print

* Fix for non-existent items

* Remove prints

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-09 14:48:16 +01:00
ledgerwatch
a77e6425eb
Fixes for the Erigon 2 upgrade 2 prototype (#479)
* Print

* Remove print

* Remove print

* Fix one panic

* Fix duplicate collation

* Print

* Fix print

* fix maxSpan

* Reduce maxSpan

* Remove duplicate join

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-05 22:32:34 +01:00
ledgerwatch
f16b285631
Adjustments for erigon 2 upgrade prototype (#477)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-05 16:34:38 +01:00
ledgerwatch
157b4299e4
[erigon2] Continuation on domains and inverted indices, putting things together (#476)
* Add scan files tests, create new aggregator type

* Fix lint

* windows test fix

* Add delete test

* AggCollation

* More functions to Aggregator

* More aggregator functions

* Update

* More functions

* More functions

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-06-02 21:40:58 +01:00
ledgerwatch
c5a10975ab
[erigon2] Introduce inverted index type (#473)
* [erigon2] Introduce inverted index type

* More inverted index code

* More tests for inverted index

* Think about public and non-public APIs

* Minimise DB access when accessing history

* Work on iterator

* Implementation of inverted iterator

* Test for inverted index

* Assert end of iterators

* Merge of inverted index files and test

* Fix lint

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-05-31 18:42:04 +01:00
ledgerwatch
990e586823
[erigon2] Continuation of work on domains - merge static files (#466)
* Iteration over files - initial

* Fix iterator for multistep

* Add function

* More functions for merge

* Merge files

* More work on the merge

* Fix buildIndex

* Fix history test for test of not completely pruned db

* Prepare for merge test

* Merge file test

* Close files

* Move functions into separate file

* Print

* fix for closing index

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp.lan>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-05-29 19:57:09 +01:00
ledgerwatch
37d9944da9
[erigon2] State domains (move functionality out of aggregator) (#436)
* Domain

* First functions

* change year

* More on domain

* More to test

* More on test

* More on domains

* buildFiles

* More on domains

* Collation test

* Fix collate

* Add test for decompressors

* Restructure history tables

* Split history into 2 tables

* Fix lint

* Check index files in the test

* Close files

* Add file scanning

* Fix lint

* Fix lint

* Add readFromFiles

* Add ef history idx file

* Start cleanup

* More to cleanup, test for ef history

* More test

* Add prune to test

* Test for prune and fix

* Start history access

* History test

* Test for LastDup

* Fix one lint

* Workaround

* History tests

* Debug

* Fix

* Fix in history

* Fix lint

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-macbook-pro.home>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alex Sharp <alexsharp@alexs-mbp.lan>
2022-05-24 18:59:57 +01:00