81 Commits

Author SHA1 Message Date
Alex Sharov
127d1bac5b
decompress: catch maxDepth underflow 2022-08-01 12:37:10 +07:00
Håvard Anda Estensen
ad2344a6cc
Replace ioutil with io and os (#560) 2022-08-01 11:03:48 +07:00
ledgerwatch
fadc9b21d1
[erigon2.2] Split 2.2 and 2.3 prototype (#548)
* Introduce access functions to history

* Add missing functions

* Add missing functions

* Add missing functions

* Changeover in the aggregator

* Intermediate

* Fix domain tests

* Fix lint

* Fix lint

* Fix lint

* Close files

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-28 08:47:13 +01:00
ledgerwatch
596d10ea2e
Split aggregator to 2.2 and 2.3 versions (#539)
* Split History from Domain

* Add History.prune

* More on history

* Fix HistoryHistory test

* Merge history files

* Scan file test for history

* Add aggregator for erigon 2.2

* Change to generics, introduce contexts

* Delete to belong to Aggregator

* Fix lint

* Fix lint

* Fix lint

* Fix lint

* Use pointers to InvertedIndex again

* Remove prints

* Close embedded InvertedIndex

* Fix closing files

* Print

* Update ci.yml

* More printing

* Fix

* Make InvertedIndex pointer inside History

* Fix

* Update ci.yml

* Remove print

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-23 09:06:52 +01:00
Alex Sharov
f23061eed9
compressor: generic sort (#524) 2022-07-18 17:12:39 +07:00
Alex Sharov
e824fdff60
remove fuzzbeta build tag, because now go1.18 is minimum requirement (#428) 2022-07-03 14:38:53 +06:00
ledgerwatch
707a89842d
Add function to get history without state (#501)
* Add function to get history without state

* Add recon functions

* Expose endMinimax

* Recon prints

* Add NoState access methods

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* MaxTxNum functions

* History iterator

* Iterator

* history iterators to aggregator

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Fix

* Fix

* Fix

* Fix

* Fix

* Print

* Print

* Print

* Print

* Print

* Add stats

* Remove time measurement

* Contexts for thread safety

* Partial iterators

* Fix

* Fix

* Not use SkipUncompressed

* Print

* Print

* Pass empty vals

* Parallel bitmap collection

* Print

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* ReconTx iterator

* Print

* Print

* Remove print

* Print

* Print

* Print

* Print

* Print

* Print

* Dedicated getter for Iterate

* For for storage 0

* Remove print

* do not perform unnecessary changes

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-07-02 19:38:34 +01:00
Alex Sharov
ceafdded8f
Compress: reduce etl buffers to save RAM (#502) 2022-06-25 19:39:36 +06:00
ledgerwatch
df49481ddc
[erigon 2.2] Make keys always uncompressed, values compressed only for code (#492)
* Reduce allocations in domain and aggregator

* Make keys always uncompressed, values compressed only for code

* Functions to remake index

* Fix index recreation

* Test for reindex, fix

* Use uncompress vals in history

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-06-17 12:39:49 +01:00
Artem Tsebrovskiy
f8bdadf3e0
HPH with direct reading from state by plainKey (#472)
* dirty trie with direct reading of account/storage data from state

run with fixes

implemented trie with direct reading from state

* cleaner version without updates
2022-06-09 13:46:11 +01:00
Alex Sharov
fdf7c6598b
compress.Count() method (#478) 2022-06-03 12:14:58 +07:00
Artem Tsebrovskiy
49e3522a05
added print of decompressed file at panic (#468)
* added print of decompressed file at panic

* more info for recovered decompressing
2022-05-27 08:20:53 +07:00
Artem Tsebrovskiy
6de4ac4ba9
reduced memory footprint on building huffman table (#459) 2022-05-20 11:23:05 +07:00
Alex Sharov
7908982ed9
MatchPrefix: limit 2nd loop iterations (#458)
* sf

* sf

* save
2022-05-19 12:27:36 +07:00
Alex Sharov
a8ce14e8cc
option to disable runtime.ReadMemStats (#457)
* save

* save

* save

* save
2022-05-19 11:46:55 +07:00
Alex Sharov
e304418d5a
MatchPrefix: working version (#456) 2022-05-18 14:36:01 +07:00
Alex Sharov
b4776607dc
MatchPrefix: don't compare if prefix longer than word (#455)
* save

* save

* save

* save

* save

* fd
2022-05-18 10:29:19 +07:00
Artem Tsebrovskiy
6d2181968a
reduce memory footprint during decompression (#452) 2022-05-17 12:38:48 +07:00
Alex Sharov
a86660187d
Test: support of nil value for prefixMatch (#450)
* save

* save

* save

* save
2022-05-16 20:59:29 +01:00
Alex Sharov
91f7d84e60
Generic sort of slices (no allocs, inlinable) (#449)
* save

* save
2022-05-16 08:23:43 +01:00
Alex Sharov
d882a11c67
up linter version (#443)
* save

* save

* save

* save
2022-05-10 10:14:02 +07:00
ledgerwatch
dd3e7fd537
Update decompress.go (#439) 2022-05-06 14:55:11 +01:00
Artem Tsebrovskiy
abd93fe9c9
implement bin_patricia_hashed trie (#430)
* commitment: implemented semi-working bin patricia trie

* commitment: added initialize function to select commitment implementation

* deleted reference implementation of binary trie

* added branch merge function selection in accordance with current commitment type

* smarter branch prefix convolution to reduce disk usage

* implemented DELETE update

* commitment/bin-trie: fixed merge processing and storage encoding

* added changed hex to bin patricia trie

* fixed trie variant select

* allocate if bufPos larger than buf size

* added tracing code

* Fix lint

* Skip test

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-05-05 13:08:58 +01:00
Alex Sharov
04337fd090
Compress: reduce maxlen to 512 (#416) 2022-04-17 07:59:29 +07:00
ledgerwatch
f18e05186d
Compact huffman representation in files (#414)
* More compact huffman representation

* Intermediate

* Intermediate

* fix

* Fix lint

* Fix lint

* Fix lint

* Change min file size

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-04-13 12:55:15 +01:00
Alex Sharov
75b64f01a3
compressor: log lvl (#408) 2022-04-01 10:44:25 +07:00
Alex Sharov
83951a1d62
Enable more linters (#381) 2022-03-19 11:38:37 +07:00
ledgerwatch
f93ea948d0
[erigon2] Optimise Huffman decoder (#374)
* Update

* Intermediate

* Huffman decoding

* Fix lint

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-03-18 09:10:18 +00:00
ledgerwatch
77eb94b53e
Elias fano search and merge (#357)
* Elias fano search and merge

* Add first cut of search

* Iterator and test

* Changes in aggregator

* Elias fano bitmap

* Fix uncompress decompress

* Print

* Print

* No print

* Print

* Print

* Print

* Change to AppendBytes

* Print

* Fix NextUncompressed

* Remove print

* Fix history search

* Fix in history search

* More tracing

* More tracing

* Fix

* Print

* Print key

* More print

* Print

* No deletion for history records

* Remove print

* Fix

* Fix

* Fix test

* Fix lint

Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-03-13 22:46:17 +00:00
Alex Sharov
c1f1365f92
cancel compress (#362) 2022-03-12 16:34:58 +07:00
Alex Sharov
c0fcdabf91
compress: less allocs (#361) 2022-03-12 15:33:01 +07:00
Alex Sharov
6512e3c941
add emptyWordsCount field to .seg file header (breaking .seg format) (#355)
* up torrent

* save

* save

* save

* save

* save

* save

* save
2022-03-10 07:48:37 +00:00
ledgerwatch
75b52ac25e
[compress] Allow uncompressed words (#350)
* Intermediate work

* Allow uncompressed words

* Fix

* Fix tests

* Add NextUncompressed, remove g.word buffer

* Code simplifications, no goroutines when workers == 1

* Fix lint

* Add test for MatchPrefix

* Work on patricia

* Beginning of new matcher

* Fuzz test for new longest match

* No skip

* Fixes

* Fixes

* More tracing

* Fixes

* Fixes

* Change back to old FindLongestMatches

* Switch to old match finder

* Print mismatches

* Fix

* After fix

* After fix

* After fix

* Print pointers

* Fixes and tests

* Print

* Print

* Print

* More tests

* Intermediate

* Fix

* Fix

* Prints

* Fix

* Fix

* Initialise matchStack

* Compute only once

* Compute only once

* Switch back

* Switch to old Find

* Introduce sais

* Switch patricia to sais

* Use sais in compressor

* Use sais in compressor

* Remove unused code

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-03-09 17:25:22 +00:00
racytech
7763945374
Reverted 3 last commits (#348)
* Revert "unnecessary includes removed"

This reverts commit 76406bb78b144cfd406b75fae0beadff719ea780.

* Revert "local dev setup"

This reverts commit ac06fd9400bf4feaa65edca0f45b250ee0b132a0.

* Revert "compress/cgo-addition"

This reverts commit fae7683d46ea48b6c076b79c41430f41a89be2eb, reversing
changes made to e3e108c6c4775d9630ca801988eb273c5e168b8c.
2022-02-24 14:39:42 +00:00
Kairat Abylkasymov
76406bb78b
unnecessary includes removed 2022-02-24 06:21:25 -05:00
Kairat Abylkasymov
ac06fd9400
local dev setup 2022-02-24 06:15:14 -05:00
Alex Sharov
3205770ee0
snapshots: fix test (#346) 2022-02-24 08:35:13 +07:00
ledgerwatch
c71ac02a0f
[erigon2] Optimisations in etl collector and compressor (#339)
* Optimisations in etl collector and compressor

* Not copy k and v in the collector

* Fix lint

* Optimisations

* Change Load1 back to Load

* Reduce allocations for tests

* preallocate inv

* counting hits and misses

* Try to fix

* Try to fix

* Relaxation 1

* Relaxation 2

* Add arch tables

* Fix

* Update arch tables and use them

* Not to override larger value

* Increase arch table size

* Increase arch table size

* Fixes to arch

* Print

* Off by one

* Print

* Fix

* Remove print

* Perform update of arch in the background

* Build up huffman tree

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-02-20 22:14:06 +00:00
Alex Sharov
1f5a1ab9cd
fuzz cases (#328) 2022-02-14 11:53:20 +07:00
Alex Sharov
6f85066c7e
path -> filepath (path package is for urls) (#321) 2022-02-12 20:11:30 +07:00
Alex Sharov
e649f7ea91
Less alloc etl recsplit (#307)
* less allocs recsplit

* save

* save
2022-02-09 13:22:45 +07:00
Alex Sharov
567d9ddfed
ParallelCompressor: Remove intermediate ETL collectors (#302) 2022-02-04 16:48:02 +07:00
ledgerwatch
55080d5c01
Proper reset of decompressor getter (#299)
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-02-03 17:58:56 +00:00
Alex Sharov
0feb7fd591
Decompressor.WithReadAhead (#290) 2022-02-01 11:19:11 +07:00
ledgerwatch
4e8840256e
[erigon2] Use shorter references instead of full plain keys in the commitment files (#289)
* Rearrange aggregations

* More rearranging before introducing 3 threads

* Background aggregation

* Concurrency fixes

* Remove files under lock

* Better logging

* Remove files without lock

* Fix lint

* Fix locking

* Try

* Fix background Merge

* Log merging

* Log merging

* Less logging

* Millisecond

* Add Stats function

* Log merge only after 1m

* Wrong counting

* plain key extract and replace functions

* Insert valTransform function

* Not parse first byte

* Not parse first byte

* Fix lint

* Switch to thin state references

* Fix lint

* Fix lint

* Debug print

* Fix decoding

* Turn off valTransform

* Not to reuse transformer

* Print

* Print

* Print

* Derive hashed keys later

* Fix

* Fix log

* Fix

* Debug

* Another fix

* Fix

* Fix

* Print

* Print

* Data race

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-01-31 22:32:00 +00:00
ledgerwatch
586ab3e6b3
Separate state btree files (#287)
* Separate state file btrees, fix Match in the decompressor

* fix match

* Fix to match

* Switch back from Match

* Try to use match, close indices

* Fixing Match

* Use Skip

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
2022-01-29 11:12:38 +00:00
Alex Sharov
dfdf7c8a66
[wip] parallel compress: less read of dat file (#284)
* save

* save

* save
2022-01-27 17:13:26 +07:00
Alex Sharov
ec11eb3d91
parallel compressor: don't save dict (#283)
* save

* save
2022-01-27 12:54:38 +07:00
ledgerwatch
7ec016b160
Fixes in compress (#260)
* Fixes in compress

* Reuse outputFile also as uncompressed file

* Close file before renaming

* Trace

* Untrace

* Use 8 threads

* Print aggregations

* Print merge and timing

* Print merge and timing

* readonly mode for patricia

* Fix to infinite loop

* Fix file names

* Cleanup

* Cleanup

Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
2022-01-24 22:13:48 +00:00
primal_concrete_sledge
d8a33270e8
issue/issue-249-add_index_reader (#273)
* issue/issue-249-add_index_reader

* Add licence
2022-01-24 20:39:04 +00:00