We've got a report from a user that erigon fails to run with this error:
```
/opt/erigon/releases/latest/bin/erigon: error while loading shared libraries: libsilkworm_capi.so: cannot open shared object file: No such file or directory
```
On this system erigon is executed using a service-account user which has
no permission to access the build user's $HOME where the
libsilkworm_capi.so resides (inside $GOPATH/pkg/mod).
This adds a support to silkworm-go to look for the library relative to
the executable path on Linux:
d4ec8a8bce
and a new `make DIST=<path> install` command which copies both the
library and erigon binary to any destination.
"whitelisting" mechanism (list of files - stored in DB) - which
protecting us from downloading new files after upgrade/downgrade was
broken. And seems it became over-complicated with time.
I replacing it by 1 persistent flag inside downloader:
"prohibit_new_downloads.lock"
Erigon will turn downloader into this mode after
downloading/verification of first snapshots.
```
//Corner cases:
// - Erigon generated file X with hash H1. User upgraded Erigon. New version has preverified file X with hash H2. Must ignore H2 (don't send to Downloader)
// - Erigon "download once": means restart/upgrade/downgrade must not download files (and will be fast)
// - After "download once" - Erigon will produce and seed new files
```
------
`downloader --seedbox` is never "prohibit new downloads"
Small fix to the script for the scenario where more than one matching
library can be returned.
For example, the command `/sbin/ldconfig -p | grep libstdc++ | awk '{
print $NF }'` can result in
```
/lib/x86_64-linux-gnu/libstdc++.so.6
/lib32/libstdc++.so.6
```
which then fails the check `if [[ ! -L "$link_path" ]]`
### Context
**Websocket port flag**
Hive tests for RPC suite depend on the (geth) default 8546 port. So,
opening one more listener for this additional port if `ws.port` was
specified. This flag isn't used in Erigon, as it shares port with http
listener. Normally, one may not specify and it offers no other benefit.
Silkworm built on Ubuntu 22 depends on glibc 2.34. In order to run on an
older OS, Silkworm needs to be built and linked with an older glibc, but
to build on an older OS we need a compatible compiler. Silkworm requires
gcc 11+ that is not available on Ubuntu 20 or Debian 11.
To simplify the deployment disable Silkworm support on versions before
Ubuntu 22, Debian 12, and glibc prior to 2.34. The check for Ubuntu and
Debian is explicit, because some Ubuntu 16 installations report glibc
2.35 with ldd, but `go build` still uses an older system one and fails.
What does this PR do:
* Optional Backfilling and Caplin Archive Node
* Create antiquary for historical states
* Fixed gaps of chain gap related to the Head of the chain and anchor of
the chain.
* Added basic reader object to Read the Historical state
adds a two indexes to the validators cache
creates beaconhttp package with many utilities for beacon http endpoint
(future support for ssz is baked in)
started on some validator endpoints
Changed distribution of httpcfg.HttpCfg to be pointer.
Added new flags:
rpc.slow.log - which is false by default, this flag need to enable
logging slow RPC requests
rpc.slow.log.threshold - which is 100 by default, this flag specify slow
threshold in milliseconds
Updated rpc handler to log slow requests:
- added map[request id] {method, timestamp}
- put every request details to map above
- delete request details from map above
- added time interval check for elements in map and if time difference
is more than given threshold print request id and the method
- app will print slow requests in next cases:
1. As soon as request take more than given threshold
2. Every 20 seconds if request still in process
3. After request finished and it took more than give threshold
---------
Co-authored-by: alex.sharov <AskAlexSharov@gmail.com>
- changed communication tunnel to web socket in order to connect to
remote nodes
- changed diagnostics.url flag to diagnostics.addr as now user need to
enter only address and support command will connect to it through
websocket
- changed flag debug.urls to debug.addrs in order to have ability to
change connection type between erigon and support to websocket and don't
change user API
- added auto trying to connect to connect to ws if connection with was
failed
Based on https://github.com/maticnetwork/bor/pull/871 in bor, this PR
handles import of same difficulty chains (tie breaker conditions) based
on their height and hash.
This PR also modifies an existing test to check different types of
side-chain import and how the canonical is decided.
This fixes an issue where the mumbai testnet node struggle to find
peers. Before this fix in general test peer numbers are typically around
20 in total between eth66, eth67 and eth68. For new peers some can
struggle to find even a single peer after days of operation.
These are the numbers after 12 hours or running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.
The root cause of this issue is the following:
- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are all consequently busy and return 'too many
peers' for long periods
These issues case a significant number of discovery timeouts, some of
the queries will never receive a response.
This causes the discovery read loop to enter a channel deadlock - which
means that no responses are processed, nor timeouts fired. This causes
the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.
This check in fixes this situation with the following changes:
- Remove the deadlock by running the timer in a separate go-routine so
it can run independently of the main request processing.
- Allow the discovery process matcher to match on port if no id match
can be established on initial ping. This allows subsequent node
validation to proceed and if the node proves to be valid via the
remainder of the look-up and handshake process it us used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination continue to be ignored.
-
At `turbo/jsonrpc/bor_snapshot.go:239` creates read only transaction and
acquire semaphore but does not rollback or commit transaction and
unrelease semaphore lock. Over time, this will result in the locking all
of semaphore resources. Any other resources can't acquire semaphore.
I added defer function to rollback transaction to release semaphore.
Reason:
- produce and seed snapshots earlier on chain tip. reduce depnedency on
"good peers with history" at p2p-network.
Some networks have no much archive peers, also ConsensusLayer clients
are not-good(not-incentivised) at serving history.
- avoiding having too much files:
more files(shards) - means "more metadata", "more lookups for
non-indexed queries", "more dictionaries", "more bittorrent
connections", ...
less files - means small files will be removed after merge (no peers for
this files).
ToDo:
[x] Recent 500K - merge up to 100K
[x] Older than 500K - merge up to 500K
[x] Start seeding 100k files
[x] Stop seeding 100k files after merge (right before delete)
In next PR:
[] Old version of Erigon must be able download recent hashes. To achieve
it - at first start erigon will download preverified hashes .toml from
s3 - if it's newer that what we have (build-in) - use it.
new flag examples.
--https.enabled
--https.addr="0.0.0.0"
--https.port=443
--https.url="unix:///file.wow"
--https.cert="keyfile.cert"
--https.key="certfile.cert"
also adds support for h2c to the http handler - http2 protocol without tls.
Historically we had several times when:
- erigon downloaded new version of .seg file
- or didn't finish download and start indexing
this was a "quick-fix protection" against this cases
but now we have other protections for this cases
let's try to remove this one - because it's not compatible with "copy
datadir" and "restore datadir from backup" scenarios
This introduces _experimental_ RPC daemon run by embedded Silkworm
library. Same notes as in PR #8353 apply here plus the following ones:
- activated if `http` command-line option is enabled and `silkworm.path`
option is present, nothing more is required (i.e. currently, both block
execution and RPC daemon run by Silkworm when specifying
`silkworm.path`, just to keep things as simple as possible)
- only Execution API endpoints are implemented by Silkworm RPCDaemon,
whilst Engine API endpoints are still served by Erigon RPCDaemon
- some features are still missing, in particular:
- state change notification handling
- custom JSON RPC settings (i.e. Erigon RPC settings are not passed to
Silkworm yet)
If HeaderDownload.VerifyHeader always returns false, the memory usage
grows at a fast pace
due to Link objects (containing headers) not deallocated even after the
link queue pruning.
Fixes and issue with Polygon validators where locally mined blocks are
broadcast with invalid header hashes because the NewBlock message
constructor was removing the ReceiptHash which contributed to the header
hash.
The results in the bor header validation code not being able to
correctly identify the signer of the header - so header validation
fails.
This also likely fixes part of the bogon-block issue which was
identified by the polygon team.
This fixes 2 related issues:
* Now that the bor consensus engine is required for queries it can't be
created based on the pretense of a db directory, but must be based on
chain config read from the db. Using the DB presence causes Bor to get
instantiated for non bor chains which breaks.
* At the moment eth_calls on a remote daemon don't check Bor headers
prior to calling the EVM code as it was just using a fake ETHash
instance - which performs ETH header validation only.
The current version is mostly working but needs adapting to perform lazy
initialization of the engine.