erigon-pulse/ethdb

The ethdb package holds a collection of objects for accessing the DB.

The words "KV" and "DB" have special meanings here:

  • KV - key-value-style API for accessing data: lets the developer manage transactions and stateful cursors.
  • DB - object-oriented-style API for accessing data: Get/Put/Delete/WalkOverTable/MultiPut; manages transactions internally.

The DB abstraction fits 95% of the time and leads to more maintainable code, because it looks stateless.

About "key-value-style": modern key-value databases don't provide Get/Put/Delete methods, because such methods are very hard-drive-unfriendly: they push developers toward random disk access, which is an order of magnitude slower than sequential reads. To encourage sequential reads, stateful cursors/iterators were introduced. They intentionally look like a file API: open_cursor/seek/write_data_from_current_position/move_to_end/step_back/step_forward/delete_key_on_current_position/append.
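
The cursor idea above can be illustrated with a minimal sketch. It is not the ethdb API: `memCursor`, `entry` and `collectPrefix` are hypothetical names, and a sorted in-memory slice stands in for the on-disk table. The point is only that a range read costs one seek followed by sequential steps, rather than one random lookup per key.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// entry is one key/value pair; entries are kept sorted by key.
type entry struct{ k, v []byte }

// memCursor is a hypothetical, illustration-only cursor over a sorted slice.
type memCursor struct {
	data []entry
	pos  int
}

// Seek positions the cursor at the first key >= key and returns that pair
// (a nil key means there is no such pair).
func (c *memCursor) Seek(key []byte) ([]byte, []byte) {
	c.pos = sort.Search(len(c.data), func(i int) bool {
		return bytes.Compare(c.data[i].k, key) >= 0
	})
	return c.current()
}

// Next steps forward by one pair.
func (c *memCursor) Next() ([]byte, []byte) {
	c.pos++
	return c.current()
}

func (c *memCursor) current() ([]byte, []byte) {
	if c.pos >= len(c.data) {
		return nil, nil
	}
	return c.data[c.pos].k, c.data[c.pos].v
}

// collectPrefix performs one Seek and then only sequential Next calls.
func collectPrefix(c *memCursor, prefix []byte) []string {
	var out []string
	for k, v := c.Seek(prefix); k != nil && bytes.HasPrefix(k, prefix); k, v = c.Next() {
		out = append(out, string(k)+"="+string(v))
	}
	return out
}

func main() {
	c := &memCursor{data: []entry{
		{[]byte("a1"), []byte("x")},
		{[]byte("b1"), []byte("y")},
		{[]byte("b2"), []byte("z")},
		{[]byte("c1"), []byte("w")},
	}}
	fmt.Println(collectPrefix(c, []byte("b")))
}
```

With a real database the same loop shape applies, but the cursor would come from a transaction and every step could also return an error.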

Class diagram:

// This is not a call graph; it just shows classes from low-level to high-level,
// and which classes satisfy which interfaces.

+-----------------------------------+   +-----------------------------------+   +-----------------------------------+ 
|  github.com/ledgerwatch/lmdb-go   |   |  github.com/torquem-ch/mdbx-go    |   | google.golang.org/grpc.ClientConn |                    
|  (app-agnostic LMDB go bindings)  |   |  (app-agnostic MDBX go bindings)  |   | (app-agnostic RPC and streaming)  |
+-----------------------------------+   +-----------------------------------+   +-----------------------------------+
                 |                                        |                                      |
                 |                                        |                                      |
                 v                                        v                                      v
+-----------------------------------+   +-----------------------------------+   +-----------------------------------+
|      ethdb/kv_lmdb.go             |   |       ethdb/kv_mdbx.go            |   |       ethdb/kv_remote.go          |                
| (tg-specific LMDB implementation) |   | (tg-specific MDBX implementation) |   |   (tg-specific remote DB access)  |
+-----------------------------------+   +-----------------------------------+   +-----------------------------------+
                 |                                        |                                      |
                 |                                        |                                      |
                 v                                        v                                      v
            +----------------------------------------------------------------------------------------------+
            |                                       ethdb/kv_abstract.go                                   |  
            |         (Common KV interface. DB-friendly, disk-friendly, cpu-cache-friendly.                |
            |           Same app code can work with local or remote database.                              |
            |           Allows experimenting with other database implementations.                          |
            |          Supports context.Context for cancelation. Any operation can return error)           |
            +----------------------------------------------------------------------------------------------+
                 |                                        |                                      |
                 |                                        |                                      |
                 v                                        v                                      v
+-----------------------------------+   +-----------------------------------+   +-----------------------------------+
|       ethdb/object_db.go          |   |          ethdb/tx_db.go           |   |    ethdb/remote/remotedbserver    |                
|     (thread-safe, stateless,      |   | (non-thread-safe, more performant |   | (grpc server, using kv_abstract,  |
|  opens/closes short transactions  |   |   than object_db, method Begin    |   |    kv_remote calls this server,   |
|      internally when needed)      |   |  DOESN'T create new TxDb object)  |   |    1 tx maps to 1 grpc stream)    |
+-----------------------------------+   +-----------------------------------+   +-----------------------------------+
                |                                          |                                     
                |                                          |                                     
                v                                          v                                     
            +-----------------------------------------------------------------------------------------------+
            |                                    ethdb/interface.go                                         |  
            |     (Common DB interfaces. ethdb.Database and ethdb.DbWithPendingMutations are widely used)   |
            +-----------------------------------------------------------------------------------------------+
                |                      
                |                      
                v                      
+--------------------------------------------------+ 
|             ethdb/mutation.go                    |                 
| (also known as "batch": records all writes and   |
|  flushes them to DB in sorted order only when    |
|  .Commit() is called; use it to avoid random     |
|  writes. It uses and satisfies ethdb.Database)   |
+--------------------------------------------------+ 
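
The "batch" idea in the last box can be sketched roughly as follows. This is not the code in ethdb/mutation.go: `batch`, `newBatch` and the `write` callback are hypothetical, and the sketch assumes that buffering writes in a map and flushing them in ascending key order is enough to show why the underlying store sees sequential rather than random writes.

```go
package main

import (
	"fmt"
	"sort"
)

// batch is a hypothetical stand-in for the idea behind a mutation:
// it records writes in memory and flushes them in sorted key order.
type batch struct {
	puts map[string][]byte
}

func newBatch() *batch { return &batch{puts: map[string][]byte{}} }

// Put records a write; a later write to the same key overwrites the earlier one.
// The value is copied so the caller may reuse its buffer.
func (b *batch) Put(k, v []byte) {
	b.puts[string(k)] = append([]byte(nil), v...)
}

// Commit passes all recorded writes to write in ascending key order,
// then clears the batch. It stops at the first error.
func (b *batch) Commit(write func(k, v []byte) error) error {
	keys := make([]string, 0, len(b.puts))
	for k := range b.puts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		if err := write([]byte(k), b.puts[k]); err != nil {
			return err
		}
	}
	b.puts = map[string][]byte{}
	return nil
}

func main() {
	b := newBatch()
	b.Put([]byte("k3"), []byte("c"))
	b.Put([]byte("k1"), []byte("a"))
	b.Put([]byte("k2"), []byte("b"))

	var order []string
	err := b.Commit(func(k, v []byte) error {
		order = append(order, string(k))
		return nil
	})
	fmt.Println(order, err)
}
```

Whatever order the application issued the writes in, the callback receives the keys sorted.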

ethdb.AbstractKV design:

  • InMemory, ReadOnly: NewLMDB().Flags(lmdb.ReadOnly).InMem().Open()

  • MultipleDatabases, Customization: NewLMDB().Path(path).WithBucketsConfig(config).Open()

  • 1 Transaction object can be used only within 1 goroutine.

  • Only 1 write transaction can be active at a time (others will wait).

  • Unlimited read transactions can be active concurrently (not blocked by the write transaction).

  • Methods db.Update and db.View can be used to open and close a short transaction.

  • Methods Begin/Commit/Rollback are for long transactions.

  • It's safe to call .Rollback() after .Commit(); multiple rollbacks are also safe. Common transaction pattern:

tx, err := db.Begin(true, ethdb.RW)
if err != nil {
    return err
}
defer tx.Rollback() // important to avoid transactions leak at panic or early return

// ... code which uses database in transaction
 
err = tx.Commit()
if err != nil {
    return err
}

  • No internal copies/allocations. This means: 1. The app must copy keys/values before putting them into the database. 2. Data read from the db is valid only during the current transaction - copy it if you plan to use it after the transaction's Commit/Rollback.

  • Methods .Bucket() and .Cursor() can't return nil and can't return an error.

  • Bucket and Cursor are interfaces, which means different classes can satisfy them: for example, the LmdbCursor and LmdbDupSortCursor classes do. If you are not familiar with the "DupSort" concept, please read indices.md first.

  • If Cursor returns err!=nil then key SHOULD be != nil (can be []byte{} for example). Then the traversal code looks like:

for k, v, err := c.First(); k != nil; k, v, err = c.Next() {
    if err != nil {
        return err
    }
    // logic
}

  • Move cursor: cursor.Seek(key)

ethdb.Database design:

  • Allows passing multiple implementations
  • Allows traversing tables with db.Walk

ethdb.TxDb design:

  • holds 1 long-running transaction and 1 cursor per table internally
  • method Begin DOESN'T create a new TxDb object. This means the object can be passed into other objects by pointer, and high-level app code can start/commit transactions when it needs to without re-creating all the objects that hold a TxDb pointer.
  • This is why the txDb.CommitAndBegin() method works: it creates a new transaction object internally, while the pointer to TxDb stays valid.
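
The pointer-stability point can be sketched as follows. The types here (`tx`, `txDb`, `stage`) are hypothetical and carry no real database logic; the sketch only assumes that the wrapper swaps its inner transaction in place, so objects holding the wrapper's pointer never need to be rebuilt.

```go
package main

import "fmt"

// tx is a hypothetical transaction handle, identified by a sequence number.
type tx struct{ id int }

// txDb is an illustration-only wrapper that owns the current transaction.
type txDb struct {
	cur    *tx
	nextID int
}

// Begin starts a new inner transaction; the txDb object itself is reused.
func (d *txDb) Begin() {
	d.nextID++
	d.cur = &tx{id: d.nextID}
}

// CommitAndBegin would commit d.cur in a real implementation; this sketch
// only swaps in a fresh inner transaction.
func (d *txDb) CommitAndBegin() {
	d.Begin()
}

// stage is a higher-level object that keeps a *txDb for its whole lifetime.
type stage struct{ db *txDb }

// currentTx reports which inner transaction the stage would use right now.
func (s stage) currentTx() int { return s.db.cur.id }

func main() {
	db := &txDb{}
	db.Begin()
	s := stage{db: db} // created once, never re-created

	before := s.currentTx()
	db.CommitAndBegin()
	after := s.currentTx()
	fmt.Println(before, after)
}
```

The stage sees the new transaction after CommitAndBegin without being touched, because it holds the wrapper rather than the transaction.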

How to dump/load table

Install all database tools: make db-tools. Tools with the prefix mdb_ are for lmdb, lmdbgo_ are for lmdb written in Go, and mdbx_ are for mdbx.

./build/bin/mdbx_dump -a <datadir>/tg/chaindata | lz4 > dump.lz4
lz4 -d < dump.lz4 | ./build/bin/mdbx_load -an <datadir>/tg/chaindata

How to get table checksum

./build/bin/mdbx_dump -s table_name <datadir>/tg/chaindata | tail -n +4 | sha256sum # tail here is for excluding header 

Header example:
VERSION=3
geometry=l268435456,c268435456,u25769803776,s268435456,g268435456
mapsize=756375552
maxreaders=120
format=bytevalue
database=TBL0001
type=btree
db_pagesize=4096
duplicates=1
dupsort=1
HEADER=END