Changed httpcfg.HttpCfg to be distributed as a pointer.
Added new flags:
rpc.slow.log - false by default; enables logging of slow RPC requests
rpc.slow.log.threshold - 100 by default; specifies the slow threshold in
milliseconds
Updated the RPC handler to log slow requests:
- added a map of request id -> {method, timestamp}
- every request's details are put into the map above
- request details are deleted from the map above when the request finishes
- added a periodic check over the map's elements: if the time difference
exceeds the given threshold, the request id and method are printed
- the app will print slow requests in the following cases:
1. As soon as a request takes longer than the given threshold
2. Every 20 seconds while the request is still in progress
3. After the request has finished, if it took longer than the given threshold
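A minimal sketch of this bookkeeping, under assumed names (`slowLog`, `begin`, `end`, `sweep`) rather than the actual handler code:
```go
package rpcslow

import (
	"log"
	"sync"
	"time"
)

type slowRequest struct {
	method string
	start  time.Time
}

// slowLog tracks in-flight requests so they can be reported while still
// running (sweep, expected to be called every 20 seconds) and once finished.
type slowLog struct {
	mu        sync.Mutex
	inflight  map[uint64]slowRequest // request id -> {method, timestamp}
	threshold time.Duration          // rpc.slow.log.threshold, 100ms by default
}

func (s *slowLog) begin(id uint64, method string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.inflight[id] = slowRequest{method: method, start: time.Now()}
}

func (s *slowLog) end(id uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if r, ok := s.inflight[id]; ok && time.Since(r.start) > s.threshold {
		log.Printf("slow RPC finished: id=%d method=%s took=%s", id, r.method, time.Since(r.start))
	}
	delete(s.inflight, id)
}

func (s *slowLog) sweep() {
	s.mu.Lock()
	defer s.mu.Unlock()
	for id, r := range s.inflight {
		if time.Since(r.start) > s.threshold {
			log.Printf("slow RPC in progress: id=%d method=%s running=%s", id, r.method, time.Since(r.start))
		}
	}
}
```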
---------
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
This PR adds support to store the transaction dependency (generated by
the block producer) in the block header for bor. This transaction
dependency will then be used by the parallel processor
([Block-STM](https://github.com/ledgerwatch/erigon/pull/7812/)).
I have created another
[PR](https://github.com/ledgerwatch/erigon-lib/pull/1064) in the
erigon-lib repo which adds the `IsParallelUniverse()` function.
# Background
Erigon currently uses a combination of Victoria Metrics and Prometheus
client for providing metrics.
We want to rationalize this and use only the Prometheus client library,
but we want to maintain the simplified Victoria Metrics methods for
constructing metrics.
This task is currently partly complete and needs to be finished to a
stage where we can remove the Victoria Metrics module from the Erigon
code base.
# Summary of changes
- Adds the missing `NewCounter`, `NewSummary`, `NewHistogram` and
`GetOrCreateHistogram` functions to `erigon-lib/metrics`, similar to the
interface the VictoriaMetrics lib provides
- Minor tidy-up for consistency inside `erigon-lib/metrics/set.go` around
return types (panic vs err consistency for funcs inside the file), error
messages and comments
- Replaces all remaining usages of `github.com/VictoriaMetrics/metrics`
with `github.com/ledgerwatch/erigon-lib/metrics` - a seamless change (import
changes only) since the interfaces match
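A hedged sketch of what such a VictoriaMetrics-style convenience layer over the Prometheus client can look like; the registry variable and panic-on-duplicate behaviour are assumptions for illustration, not the actual `erigon-lib/metrics` code:
```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

var defaultRegistry = prometheus.NewRegistry()

// NewCounter registers a counter under the given name and panics on error,
// mirroring the simple VictoriaMetrics constructor style.
func NewCounter(name string) prometheus.Counter {
	c := prometheus.NewCounter(prometheus.CounterOpts{Name: name})
	if err := defaultRegistry.Register(c); err != nil {
		panic(err)
	}
	return c
}

// GetOrCreateHistogram returns the histogram registered under name,
// creating and registering it on first use.
func GetOrCreateHistogram(name string) prometheus.Histogram {
	h := prometheus.NewHistogram(prometheus.HistogramOpts{Name: name})
	if err := defaultRegistry.Register(h); err != nil {
		if already, ok := err.(prometheus.AlreadyRegisteredError); ok {
			return already.ExistingCollector.(prometheus.Histogram)
		}
		panic(err)
	}
	return h
}
```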
This fixes an issue where mumbai testnet nodes struggle to find peers.
Before this fix, test peer counts were typically around 20 in total across
eth66, eth67 and eth68, and some new nodes could struggle to find even a
single peer after days of operation.
These are the numbers after 12 hours of running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.
The root cause of this issue is the following:
- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are consequently all busy and return 'too many
peers' for long periods
These issues cause a significant number of discovery timeouts, and some of
the queries never receive a response.
This causes the discovery read loop to enter a channel deadlock, which
means that no responses are processed and no timeouts are fired. This causes
the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.
This check-in fixes the situation with the following changes:
- Remove the deadlock by running the timer in a separate go-routine so
it can run independently of the main request processing.
- Allow the discovery process matcher to match on port if no id match
can be established on the initial ping. This allows subsequent node
validation to proceed, and if the node proves to be valid via the
remainder of the look-up and handshake process it is used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination, continue to be ignored.
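An illustrative sketch of the first change under stated assumptions (the names and channels here are not the actual `p2p/discover` code): the timeout timer runs in its own goroutine and signals expirations over a channel, so a stalled response path can no longer prevent timeouts from firing:
```go
package discovery

import "time"

// startTimeoutTicker periodically signals the main loop to expire pending
// replies; because it runs independently, a blocked read loop cannot stop
// timeout processing once the loop drains `expired` again.
func startTimeoutTicker(done <-chan struct{}, expired chan<- time.Time) {
	go func() {
		ticker := time.NewTicker(500 * time.Millisecond)
		defer ticker.Stop()
		for {
			select {
			case <-done:
				return
			case t := <-ticker.C:
				select {
				case expired <- t: // main loop checks pending replies against their deadlines
				case <-done:
					return
				}
			}
		}
	}()
}
```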
Reason:
- produce and seed snapshots earlier on the chain tip, reducing dependency on
"good peers with history" in the p2p network.
Some networks don't have many archive peers, and ConsensusLayer clients
are not good (not incentivised) at serving history.
- avoid having too many files:
more files (shards) means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
fewer files means small files will be removed after a merge (no peers for
these files).
ToDo:
[x] Recent 500K - merge up to 100K
[x] Older than 500K - merge up to 500K
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)
In the next PR:
[ ] Old versions of Erigon must be able to download recent hashes. To achieve
this, on first start erigon will download the preverified hashes .toml from
s3; if it's newer than what we have built-in, use it.
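A purely illustrative sketch of the merge thresholds above; the constants and helper name are assumptions, not the actual snapshot-merging code:
```go
package snapshots

// mergeLimit picks how large a merged range may grow, depending on how far
// the range's start is from the chain tip.
func mergeLimit(tipBlock, fromBlock uint64) uint64 {
	const (
		recentWindow = 500_000 // "recent" = within 500k blocks of the tip
		smallStep    = 100_000 // recent ranges are merged up to 100k blocks
		bigStep      = 500_000 // older ranges are merged up to 500k blocks
	)
	if tipBlock-fromBlock <= recentWindow {
		return smallStep
	}
	return bigStep
}
```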
Newly introduced `t.logGaps` was being set to `nil` and still accessed
within `clearFailedLogs`. This PR changes the ordering, moving the nil
setting to `CaptureTxEnd`.
This introduces an _experimental_ RPC daemon run by the embedded Silkworm
library. The same notes as in PR #8353 apply here plus the following ones:
- activated if `http` command-line option is enabled and `silkworm.path`
option is present, nothing more is required (i.e. currently, both block
execution and RPC daemon run by Silkworm when specifying
`silkworm.path`, just to keep things as simple as possible)
- only Execution API endpoints are implemented by Silkworm RPCDaemon,
whilst Engine API endpoints are still served by Erigon RPCDaemon
- some features are still missing, in particular:
- state change notification handling
- custom JSON RPC settings (i.e. Erigon RPC settings are not passed to
Silkworm yet)
If HeaderDownload.VerifyHeader always returns false, memory usage
grows at a fast pace because Link objects (containing headers) are not
deallocated even after link queue pruning.
Add a check against the inbound headers before publishing a newly mined
block after the wait delay.
If the node received a block while it was processing transactions, or
waiting for its publish slot, do a final check that another node hasn't
already published a block.
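A minimal sketch of that final check, assuming an illustrative inbound-header channel and helper name (not the actual mining code):
```go
package mine

import "github.com/ledgerwatch/erigon/core/types"

// publishIfStillBest drains the most recent inbound header (if any) after the
// wait delay and skips the broadcast when another node has already published
// a block at or above our mined height.
func publishIfStillBest(mined *types.Block, inbound <-chan *types.Header, broadcast func(*types.Block)) {
	select {
	case h := <-inbound:
		if h != nil && h.Number.Uint64() >= mined.NumberU64() {
			return // another node got there first
		}
	default: // nothing arrived while we were executing txs or waiting for our slot
	}
	broadcast(mined)
}
```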
This introduces _experimental_ block execution run by the embedded Silkworm
API library:
- new command-line option `silkworm.path` to enable the feature by
specifying the path to the Silkworm library
- the Silkworm API shared library is dynamically loaded on demand
- currently requires building the Silkworm library on the target machine
- available only on Linux at the moment: macOS has an issue with [stack
size](https://github.com/golang/go/issues/28024) and Windows would
require [TDM-GCC-64](https://jmeubank.github.io/tdm-gcc/); both need
dedicated effort for an assessment
When a new feature (like the upcoming `Aalborg` hard fork) for
Polygon hasn't kicked in yet (in heimdall), the heimdall endpoints will
now return a 503 (Service Unavailable) status code. This PR makes sure
that erigon handles that code separately and doesn't keep retrying to
fetch info. It also acts as a notifier of the HF in erigon.
Similar reference PR in bor:
https://github.com/maticnetwork/bor/pull/1023
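A hedged sketch of this handling using only net/http; the function and error value are assumptions for illustration, not Erigon's actual heimdall client:
```go
package heimdall

import (
	"context"
	"errors"
	"io"
	"net/http"
	"time"
)

// errServiceUnavailable signals that the queried feature has not kicked in on
// heimdall yet, so callers should stop retrying and may notify about the HF.
var errServiceUnavailable = errors.New("heimdall: 503 service unavailable")

func fetchWithRetry(ctx context.Context, client *http.Client, url string) ([]byte, error) {
	for {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return nil, err
		}
		resp, err := client.Do(req)
		if err != nil {
			return nil, err
		}
		body, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		switch {
		case resp.StatusCode == http.StatusServiceUnavailable:
			// Feature not active yet: return a distinct error instead of retrying.
			return nil, errServiceUnavailable
		case resp.StatusCode != http.StatusOK:
			// Transient failure: wait and retry.
			select {
			case <-ctx.Done():
				return nil, ctx.Err()
			case <-time.After(5 * time.Second):
				continue
			}
		case readErr != nil:
			return nil, readErr
		}
		return body, nil
	}
}
```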
Whitelisting calculation of the roothash should not depend on the
bor api running. This will not always be the case, for example when
erigon is configured with a separate rpc daemon.
To fix this, the calculation has been moved to Bor.
Additionally, the redundant Bor API code has been removed, as it is not
called by any code and its functionality looks to have migrated to the
turbo/jsonrpc package.
Fixes label discrepancies in the collector for summaries etc. which have
a template that includes a quantile.
Initial native Prometheus client implementation of metrics - which is
currently turned off except for local testing and interface exports.
This builds upon #3453 and #3524, which previously implemented gas
optimizations for the access lists generated by Erigon's implementation
of the `eth_createAccessList` RPC call.
Erigon currently optimizes inclusion of the recipient address based on
how many storage keys are accessed, but does not perform the same
optimization for sender address and precompiled contract addresses.
These changes make the same optimization available for all of these
cases.
Additionally, this handles the cases of block producer address and
created smart contract addresses. If these cases were omitted on purpose
since they heavily rely on state, it may still make sense to offer them
to users but disable them by default.
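For context, a rough sketch of the gas arithmetic behind this kind of optimization, using the EIP-2929/2930 constants; the function is illustrative and the actual Erigon heuristic may differ in detail:
```go
package accesslist

// EIP-2930 / EIP-2929 gas constants.
const (
	accessListAddressCost = 2400 // per address included in the access list
	accessListStorageCost = 1900 // per storage key included in the access list
	coldSloadCost         = 2100 // first (cold) SLOAD of a slot
	warmStorageReadCost   = 100  // subsequent (warm) accesses
)

// includeWarmAddress reports whether listing an address that is already warm
// (sender, recipient, precompile) together with numKeys of its storage keys
// actually reduces total gas: each listed key saves 2100-100 gas but costs
// 1900, so the 2400 address cost only pays off for more than 24 keys.
func includeWarmAddress(numKeys uint64) bool {
	saved := numKeys * (coldSloadCost - warmStorageReadCost)
	spent := uint64(accessListAddressCost) + numKeys*accessListStorageCost
	return saved > spent
}
```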
Closes #8078
This change is primarily intended to support go 1.21, but as a
side-effect requires updating libp2p, which in turn triggers an update
of golang.org/x/exp which creates quite a bit of (simple) churn in the
slice sorting.
This change introduces a new `cmp.Compare` function which can be used to
return an integer satisfying the compare interface for slice sorting.
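For illustration, the same pattern shown with the Go 1.21 standard-library `cmp` and `slices` packages (the erigon-lib helper is expected to behave the same way):
```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

func main() {
	type link struct{ blockHeight uint64 }
	links := []link{{30}, {10}, {20}}
	// slices.SortFunc wants a comparator returning a negative, zero or
	// positive int; cmp.Compare provides exactly that for ordered types.
	slices.SortFunc(links, func(a, b link) int {
		return cmp.Compare(a.blockHeight, b.blockHeight)
	})
	fmt.Println(links) // [{10} {20} {30}]
}
```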
In order to continue to support mplex for libp2p, the change references
github.com/libp2p/go-libp2p-mplex instead. Please see the PR at
https://github.com/libp2p/go-libp2p/pull/2498 for the official upstream
comment that indicates official support for mplex being moved to this
location.
Co-authored-by: Jason Yellick <jason@enya.ai>
Apparently `log.New()` doesn't discard messages. Before this change I
saw the following in the console log:
```
[INFO] [09-27|15:25:36.225] [NewPayload] Handling new payload height=18227379 hash=0xa82630b9acf814930f18b45cad533a635076eaaf8e0ae596914d906db02ea3aa
[INFO] [09-27|15:25:36.624] [5/7 Execution] Completed on block=18227379
[INFO] [09-27|15:25:37.245] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
[INFO] [09-27|15:25:37.854] RPC Daemon notified of new headers from=18227378 to=18227379 hash=0xa82630b9acf814930f18b45cad533a635076eaaf8e0ae596914d906db02ea3aa header sending=9.615µs log sending=336ns
[INFO] [09-27|15:25:37.854] head updated hash=0xa82630b9acf814930f18b45cad533a635076eaaf8e0ae596914d906db02ea3aa number=18227379
```
And after this change:
```
[INFO] [09-27|20:31:36.929] [NewPayload] Handling new payload height=18228897 hash=0x276fbffe9dc95fa613815238c870de89de340784b4d3d7a7bd5bf1cea6e54a97
[INFO] [09-27|20:31:37.713] [updateForkchoice] Fork choice update: flushing in-memory state (built by previous newPayload)
[INFO] [09-27|20:31:38.172] RPC Daemon notified of new headers from=18228896 to=18228897 hash=0x276fbffe9dc95fa613815238c870de89de340784b4d3d7a7bd5bf1cea6e54a97 header sending=17.073µs log sending=272ns
[INFO] [09-27|20:31:38.172] head updated hash=0x276fbffe9dc95fa613815238c870de89de340784b4d3d7a7bd5bf1cea6e54a97 number=18228897
```
This is a follow-up to PR #7464.
Add code to the headers state to break processing if a bor milestone
rewind is detected.
The rewind processing happens in the bor/heimdall stage - this change
just avoids unnecessary header loading
if a milestone fork is likely to be detected
---------
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
Some peer-review changes from the last related PR.
Addition of a flag for BlobSlots - the maximum number of blobs allowed per
account in the txpool.
Use BlobFee from the block to validate txs in the pool.
See also https://github.com/ledgerwatch/erigon-lib/pull/1125
This is the second part of the bor milestone release. It contains the
following changes:
* Initialize services
* This is a change from the initial pull request: I have moved all of the
initialization to the bor engine. To facilitate this I have just passed
in the heimdall client interface, rather than the whole engine
* Stage processing
* This is also a change from the original PR: the code is contained in
the bor heimdall stage rather than in headers. The effect should be the
same, but this needs testing
---------
Co-authored-by: Mark Holt <mark@disributed.vision>
Co-authored-by: Anshal Shukla <shukla.anshal85@gmail.com>
First posted [here](https://github.com/ledgerwatch/erigon/pull/8195)
with bugs, which are fixed in this PR.
When we call the debug_traceXXX API provided by Erigon, the withLog
option in tracerConfig is very helpful.
However, currently the tracer cannot guarantee that the order of the logs it
outputs is consistent with the event logs returned in the transaction
receipt. This may make it difficult to use the output of these APIs: if
you need to access event logs in order, you currently have to use the
event logs returned in the transaction receipt.
Here is an example to illustrate why the order of callTracer logs and
receipt event logs can be inconsistent:
in a call trace tree, if a call emits multiple logs and this call has
multiple child calls which also emit logs, the logs of the child calls
may appear between the logs of the parent call in the receipt.
I added an index field to each log to identify its log index in the tracer
result, and it helps a lot.
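A minimal sketch of what such a log entry looks like; the type and field names are illustrative, not the exact Erigon tracer types:
```go
package tracer

import "github.com/ledgerwatch/erigon-lib/common"

// callLog is a log entry inside the callTracer result that also carries its
// position within the transaction receipt, so consumers can reconstruct the
// receipt ordering from the call trace tree.
type callLog struct {
	Address common.Address `json:"address"`
	Topics  []common.Hash  `json:"topics"`
	Data    []byte         `json:"data"`
	// Index is the log's index in the transaction receipt.
	Index uint64 `json:"index"`
}
```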
For example, erigon on devnet8 marked a block as bad due to
"mdbx_cursor_open: cannot allocate memory":
```
[INFO] [09-12|04:57:36.041] [NewPayload] Handling new payload height=171035 hash=0x321dea00c4853ee354bebaf8aef3e63fbe06c4508271c0db4c92b0f087aedc3b
171034
[WARN] [09-12|04:57:36.069] Could not validate block err="[3/7 BlockHashes] table: Header, mdbx_cursor_open: cannot allocate memory, stack: [kv_mdbx.go:1057 kv_mdbx.
go:1069 kv_mdbx.go:1077 memory_mutation.go:473 memory_mutation.go:502 etl.go:123 etl.go:96 block_writer.go:40 stage_blockhashes.go:49 default_stages.go:457 sync.go:425 sync.go:258 s
tageloop.go:414 backend.go:476 fork_validator.go:250 fork_validator.go:156 ethereum_execution.go:151 execution_client.go:51 chain_reader.go:252 engine_server.go:741 engine_server.go
:235 engine_server.go:600 value.go:586 value.go:370 service.go:224 handler.go:494 handler.go:444 handler.go:392 handler.go:223 handler.go:316 asm_amd64.s:1598]"
[WARN] [09-12|04:57:36.069] ethereumExecutionModule.ValidateChain: chain is invalid hash=0x321dea00c4853ee354bebaf8aef3e63fbe06c4508271c0db4c92b0f087aedc3b
```
With this PR blocks are marked as bad only on genuine protocol errors.
Fix `CreateConsensusEngineBareBones` for bor and aura. The current code
passes a `**chain.BorConfig` or `**chain.AuraConfig` to
`CreateConsensusEngine` and causes a panic.
Co-authored-by: Maohua Zhu <zhumaohua@cobo.com>