erigon-pulse/erigon-lib/kv
battlmonstr 231e468e19 Add 'erigon-lib/' from commit '93d9c9d9fe4bd8a49f7a98a6bce0f0da7094c7d3'
git-subtree-dir: erigon-lib
git-subtree-mainline: 3c8cbda809
git-subtree-split: 93d9c9d9fe
2023-09-20 14:50:25 +02:00
bitmapdb
iter
kvcache
kvcfg
mdbx
memdb
order
rawdbv3
remotedb
remotedbserver
temporal/historyv2
helpers.go
kv_interface.go
Readme.md
tables.go

The ethdb package holds a collection of objects for accessing the DB.

The words "KV" and "DB" have special meanings here:

  • KV - a key-value-style API for accessing data: it lets the developer manage transactions and stateful cursors.
  • DB - an object-oriented-style API for accessing data: Get/Put/Delete/WalkOverTable/MultiPut, with transactions managed internally.

So the DB abstraction fits about 95% of cases and leads to more maintainable code, because it looks stateless.

About "key-value-style": modern key-value databases don't provide Get/Put/Delete methods, because they are very hard-drive-unfriendly: they push developers toward random disk access, which is an order of magnitude slower than sequential reads. To enforce sequential reads, stateful cursors/iterators were introduced. They intentionally look like a file API: open_cursor/seek/write_data_from_current_position/move_to_end/step_back/step_forward/delete_key_on_current_position/append.
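
To make the file-like cursor idea concrete, here is a minimal, self-contained sketch of a stateful cursor over an in-memory sorted table. The names pair, memCursor, Seek, Next and collectFrom are hypothetical and only mimic the shape of such an API; this is not the Erigon interface.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// pair is one key-value entry of the hypothetical table.
type pair struct{ k, v []byte }

// memCursor is a hypothetical stateful cursor over pairs sorted by key.
type memCursor struct {
	data []pair // sorted by key, ascending
	pos  int    // current position; len(data) means "past the end"
}

// Seek moves the cursor to the first key >= seek and returns that entry.
// A nil key means nothing was found.
func (c *memCursor) Seek(seek []byte) (k, v []byte) {
	c.pos = sort.Search(len(c.data), func(i int) bool {
		return bytes.Compare(c.data[i].k, seek) >= 0
	})
	return c.current()
}

// Next steps forward by one entry, like reading the next record of a file.
func (c *memCursor) Next() (k, v []byte) {
	if c.pos < len(c.data) {
		c.pos++
	}
	return c.current()
}

// current returns the entry under the cursor, or nils when past the end.
func (c *memCursor) current() (k, v []byte) {
	if c.pos >= len(c.data) {
		return nil, nil
	}
	return c.data[c.pos].k, c.data[c.pos].v
}

// collectFrom seeks once, then reads sequentially while keys share the prefix.
func collectFrom(c *memCursor, prefix []byte) []string {
	var out []string
	for k, v := c.Seek(prefix); k != nil && bytes.HasPrefix(k, prefix); k, v = c.Next() {
		out = append(out, string(k)+"="+string(v))
	}
	return out
}

func main() {
	c := &memCursor{data: []pair{
		{[]byte("a1"), []byte("x")},
		{[]byte("b1"), []byte("y")},
		{[]byte("b2"), []byte("z")},
		{[]byte("c1"), []byte("w")},
	}}
	fmt.Println(collectFrom(c, []byte("b"))) // prints [b1=y b2=z]
}
```

The point of the sketch is that one Seek positions the cursor and every following read is a cheap step forward, which is the access pattern the paragraph above describes as disk-friendly.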

Class diagram:

// This is not a call graph; it just shows classes from low-level to high-level,
// and which classes satisfy which interfaces.

+-----------------------------------+   +-----------------------------------+
|  github.com/erigontech/mdbx-go    |   | google.golang.org/grpc.ClientConn |
|  (app-agnostic MDBX go bindings)  |   | (app-agnostic RPC and streaming)  |
+-----------------------------------+   +-----------------------------------+
                  |                                      |
                  |                                      |
                  v                                      v
+-----------------------------------+   +-----------------------------------+
|       ethdb/kv_mdbx.go            |   |       ethdb/kv_remote.go          |
| (tg-specific MDBX implementation) |   |   (tg-specific remote DB access)  |
+-----------------------------------+   +-----------------------------------+
                  |                                      |
                  |                                      |
                  v                                      v
+----------------------------------------------------------------------------------------------+
|                                    ethdb/kv_interface.go                                     |
|         (Common KV interface. DB-friendly, disk-friendly, cpu-cache-friendly.                |
|           Same app code can work with local or remote database.                              |
|           Allows experimenting with other database implementations.                          |
|          Supports context.Context for cancelation. Any operation can return an error)        |
+----------------------------------------------------------------------------------------------+

Then:
turbo/snapshotsync/block_reader.go
erigon-lib/state/aggregator_v3.go

Then:
kv_temporal.go

ethdb.AbstractKV design:

  • InMemory, ReadOnly: NewMDBX().Flags(mdbx.ReadOnly).InMem().Open()

  • MultipleDatabases, Customization: NewMDBX().Path(path).WithBucketsConfig(config).Open()

  • A transaction object can be used only within a single goroutine.

  • Only 1 write transaction can be active at a time (others will wait).

  • An unlimited number of read transactions can be active concurrently (they are not blocked by the write transaction).

  • The methods db.Update and db.View can be used to open and close a short transaction.

  • The methods Begin/Commit/Rollback are for long transactions.

  • It's safe to call .Rollback() after .Commit(); multiple rollbacks are also safe. The common transaction pattern:

tx, err := db.Begin(true, ethdb.RW)
if err != nil {
    return err
}
defer tx.Rollback() // important: avoids leaking the transaction on panic or early return

// ... code which uses the database inside the transaction

// plain assignment: err is already declared above, so ":=" would not compile here
err = tx.Commit()
if err != nil {
    return err
}
  • No internal copies/allocations. This means: 1. the app must copy keys/values before putting them into the database; 2. data read from the db is valid only during the current transaction - copy it if you plan to use it after the transaction's Commit/Rollback.

  • The methods .Bucket() and .Cursor() can't return nil and can't return an error.

  • Bucket and Cursor are interfaces, which means different classes can satisfy them: for example, the MdbxCursor and MdbxDupSortCursor classes do. If you are not familiar with the "DupSort" concept, please read dupsort.md

  • If a Cursor returns err != nil, then the key SHOULD be != nil (it can be []byte{}, for example). Traversal code then looks like this:

for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
    if err != nil {
        return err
    }
    // logic
}
  • Move cursor: cursor.Seek(key)
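
The transaction pattern and the "no internal copies" rule from the list above can be illustrated together. The following is a minimal sketch under stated assumptions: toyDB and toyTx are hypothetical stand-ins, not Erigon types, and zeroing the page on finish merely simulates the storage engine reusing memory once a transaction ends.

```go
package main

import "fmt"

// toyDB is a hypothetical database holding a single value in one "page".
type toyDB struct{ page []byte }

// toyTx is a hypothetical transaction over toyDB.
type toyTx struct {
	db   *toyDB
	done bool
}

// Begin opens a new toy transaction.
func (db *toyDB) Begin() *toyTx { return &toyTx{db: db} }

// Get returns a slice that points straight into the database page (no copy).
func (tx *toyTx) Get() []byte { return tx.db.page }

// Commit finishes the transaction; afterwards the page may be reused.
func (tx *toyTx) Commit() error {
	tx.finish()
	return nil
}

// Rollback is a no-op when the transaction has already finished,
// which is what makes "defer tx.Rollback()" safe after Commit.
func (tx *toyTx) Rollback() { tx.finish() }

// finish ends the transaction once and invalidates slices handed out by Get.
func (tx *toyTx) finish() {
	if tx.done {
		return
	}
	tx.done = true
	for i := range tx.db.page {
		tx.db.page[i] = 0 // simulate memory reuse by the storage engine
	}
}

// readValue returns an uncopied slice and a copy made inside the transaction.
func readValue(db *toyDB) (raw, copied []byte, err error) {
	tx := db.Begin()
	defer tx.Rollback() // safe even though Commit runs below

	raw = tx.Get()
	copied = append([]byte(nil), raw...) // copy before the transaction ends

	if err = tx.Commit(); err != nil {
		return nil, nil, err
	}
	return raw, copied, nil
}

func main() {
	db := &toyDB{page: []byte("value")}
	raw, copied, err := readValue(db)
	fmt.Println(raw, string(copied), err) // prints [0 0 0 0 0] value <nil>
}
```

In this toy model the uncopied slice no longer holds the value after Commit, while the copy does. A real engine gives no guarantee about what stale slices contain, which is why copying is the only safe option.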

ethdb.Database design:

  • Allows passing multiple implementations
  • Allows traversing tables with db.Walk
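
As an illustration of callback-style traversal, here is a minimal sketch of a Walk method on an in-memory stand-in. The type toyDB, the walkFunc callback and the simplified Walk signature are assumptions made for this example; they are not the actual db.Walk signature.

```go
package main

import (
	"fmt"
	"sort"
)

// walkFunc is called for every entry; returning false stops the walk.
type walkFunc func(k, v string) (bool, error)

// toyDB is a hypothetical DB-style object: named tables of key-value pairs.
type toyDB struct{ tables map[string]map[string]string }

// Walk visits the entries of one table in key order, starting at startKey.
// The caller never sees a transaction or a cursor.
func (db *toyDB) Walk(table, startKey string, fn walkFunc) error {
	t := db.tables[table]
	keys := make([]string, 0, len(t))
	for k := range t {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		if k < startKey {
			continue
		}
		goOn, err := fn(k, t[k])
		if err != nil {
			return err
		}
		if !goOn {
			return nil
		}
	}
	return nil
}

// keysFrom collects at most limit keys that are >= startKey.
func keysFrom(db *toyDB, table, startKey string, limit int) ([]string, error) {
	var out []string
	err := db.Walk(table, startKey, func(k, v string) (bool, error) {
		out = append(out, k)
		return len(out) < limit, nil
	})
	return out, err
}

func main() {
	db := &toyDB{tables: map[string]map[string]string{
		"accounts": {"a": "1", "b": "2", "c": "3", "d": "4"},
	}}
	keys, err := keysFrom(db, "accounts", "b", 2)
	fmt.Println(keys, err) // prints [b c] <nil>
}
```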

ethdb.TxDb design:

  • Holds 1 long-running transaction and 1 cursor per table inside.
  • The Begin method DOESN'T create a new TxDb object. This means the object can be passed to other objects by pointer, and high-level app code can start/commit transactions when it needs to, without re-creating all the objects that hold the TxDb pointer.
  • This is the reason the txDb.CommitAndBegin() method works: it creates a new transaction object internally, while the pointer to TxDb stays valid.
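
The pointer-stability argument above can be sketched in a few lines. The types toyTx, toyTxDb and reader are hypothetical and exist only for this illustration; the real TxDb does much more.

```go
package main

import "fmt"

// toyTx stands in for a real database transaction (hypothetical).
type toyTx struct{ id int }

// toyTxDb is a hypothetical wrapper that owns the current transaction.
type toyTxDb struct {
	tx     *toyTx
	nextID int
}

// Begin opens a transaction inside the existing wrapper; it does not
// allocate a new toyTxDb, so pointers to the wrapper stay valid.
func (d *toyTxDb) Begin() {
	d.nextID++
	d.tx = &toyTx{id: d.nextID}
}

// CommitAndBegin finishes the current transaction and swaps in a new one.
func (d *toyTxDb) CommitAndBegin() {
	// a real implementation would commit d.tx here
	d.Begin()
}

// reader is a long-lived component that holds the wrapper by pointer.
type reader struct{ db *toyTxDb }

// currentTxID reports which transaction the reader currently sees.
func (r reader) currentTxID() int { return r.db.tx.id }

func main() {
	db := &toyTxDb{}
	db.Begin()
	r := reader{db: db}
	fmt.Println(r.currentTxID()) // prints 1
	db.CommitAndBegin()
	fmt.Println(r.currentTxID()) // prints 2: same reader, new transaction
}
```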

How to dump/load table

Install all database tools: make db-tools

./build/bin/mdbx_dump -a <datadir>/erigon/chaindata | lz4 > dump.lz4
lz4 -d < dump.lz4 | ./build/bin/mdbx_load -an <datadir>/erigon/chaindata

How to get table checksum

./build/bin/mdbx_dump -s table_name <datadir>/erigon/chaindata | tail -n +4 | sha256sum # tail here is for excluding header 

Header example:
VERSION=3
geometry=l268435456,c268435456,u25769803776,s268435456,g268435456
mapsize=756375552
maxreaders=120
format=bytevalue
database=TBL0001
type=btree
db_pagesize=4096
duplicates=1
dupsort=1
HEADER=END