Mdbx now takes a logger, but this had not been propagated to all callers,
meaning some of them ended up with an invalid logger.
This fixes the log propagation.
It also fixes a start-up issue for http.enabled and txpool.disable
created by a previous merge.
This fixes an issue where mumbai testnet nodes struggle to find
peers. Before this fix, test peer numbers were typically around
20 in total across eth66, eth67 and eth68. Some new nodes could
struggle to find even a single peer after days of operation.
These are the numbers after 12 hours of running on a node which
previously could not find any peers: eth66=13, eth67=76, eth68=91.
The root cause of this issue is the following:
- A significant number of mumbai peers around the boot node return
network ids which are different from those currently available in the
DHT
- The available nodes are all consequently busy and return 'too many
peers' for long periods
These issues cause a significant number of discovery timeouts, and some of
the queries never receive a response.
This causes the discovery read loop to enter a channel deadlock, which
means that no responses are processed and no timeouts are fired. This causes
the discovery process in the node to stop. From then on it just
re-requests handshakes from a relatively small number of peers.
This check-in fixes this situation with the following changes:
- Remove the deadlock by running the timer in a separate goroutine so
it can run independently of the main request processing.
- Allow the discovery process matcher to match on port if no id match
can be established on initial ping. This allows subsequent node
validation to proceed, and if the node proves to be valid via the
remainder of the look-up and handshake process it is used as a valid
peer.
- Completely unsolicited responses, i.e. those which come from a
completely unknown ip:port combination continue to be ignored.
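
The first change can be illustrated with a minimal sketch. It is not the actual discovery code: `pendingSet`, `runTimeouts` and `demo` are hypothetical names, and the real read loop is more involved. The point shown is only that timeout expiry lives in its own goroutine, so it keeps firing even if the reply-processing path is blocked.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// pendingSet tracks outstanding requests and their deadlines (hypothetical type).
type pendingSet struct {
	mu        sync.Mutex
	deadlines map[string]time.Time
}

func newPendingSet() *pendingSet {
	return &pendingSet{deadlines: make(map[string]time.Time)}
}

func (p *pendingSet) add(id string, timeout time.Duration) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.deadlines[id] = time.Now().Add(timeout)
}

// resolve removes a request when its reply arrives and reports whether it was pending.
func (p *pendingSet) resolve(id string) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	_, ok := p.deadlines[id]
	delete(p.deadlines, id)
	return ok
}

// expire removes and returns every request whose deadline has passed.
func (p *pendingSet) expire(now time.Time) []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	var expired []string
	for id, d := range p.deadlines {
		if !now.Before(d) {
			expired = append(expired, id)
			delete(p.deadlines, id)
		}
	}
	return expired
}

// runTimeouts is meant to run in its own goroutine, so that expiry never
// waits on the loop that processes replies.
func runTimeouts(p *pendingSet, tick time.Duration, onTimeout func(string), quit <-chan struct{}) {
	ticker := time.NewTicker(tick)
	defer ticker.Stop()
	for {
		select {
		case now := <-ticker.C:
			for _, id := range p.expire(now) {
				onTimeout(id)
			}
		case <-quit:
			return
		}
	}
}

// demo registers one request that is already due and one that gets a reply.
func demo() (timedOut string, replyWasPending bool) {
	p := newPendingSet()
	p.add("ping-a", 0)         // deadline is now, so the timer goroutine expires it
	p.add("ping-b", time.Hour) // reply arrives long before the deadline
	fired := make(chan string, 1)
	quit := make(chan struct{})
	defer close(quit)
	go runTimeouts(p, time.Millisecond, func(id string) { fired <- id }, quit)
	replyWasPending = p.resolve("ping-b")
	select {
	case timedOut = <-fired:
	case <-time.After(2 * time.Second): // safety net for the demo only
	}
	return timedOut, replyWasPending
}

func main() {
	id, ok := demo()
	fmt.Println("timed out:", id, "reply matched:", ok)
}
```

Guarding the shared state with a mutex, rather than passing everything through one channel loop, is one way to let the timer make progress independently; the actual change may be structured differently.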
* call getEnode before NodeStarted to make sure it is ready for RPC
calls
* fix connection error detection on macOS
* use a non-default p2p port to avoid conflicts
* disable bor milestones on local heimdall
* generate node keys for static peers config
Problem:
"Started P2P networking" log message contains port zero on startup,
e.g.: 127.0.0.1:0 because of the outdated localnodeAddrCache.
Solution:
Call updateLocalNodeStaticAddrCache after updating the port.
Improve p2p error handling to propagate errors
from the origin up the call chain to the Server peer removal code
using a new PeerError type containing a DiscReason and a more detailed
description.
The origin can be tracked down using PeerErrorCode (code) and DiscReason
(reason),
which look like this in the log:
> [TRACE] [08-28|16:33:40.205] Removing p2p peer peercount=0
url=enode://d399f4b...@1.2.3.4:30303 duration=6.901ms
err="PeerError(code=remote disconnect reason, reason=too many peers,
err=<nil>, message=Peer.run got a remote DiscReason)"
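
A rough sketch of what such an error type could look like follows. The type names PeerError, PeerErrorCode and DiscReason come from the description above; the constant names, field names and the `Unwrap` method are assumptions made for illustration, and the format string is reconstructed from the log line rather than copied from the code.

```go
package main

import "fmt"

// PeerErrorCode identifies where the error originated (constant names are hypothetical).
type PeerErrorCode int

const (
	PeerErrorRemoteDisconnect PeerErrorCode = iota
	PeerErrorLocalFailure
)

func (c PeerErrorCode) String() string {
	switch c {
	case PeerErrorRemoteDisconnect:
		return "remote disconnect reason"
	case PeerErrorLocalFailure:
		return "local failure"
	}
	return fmt.Sprintf("unknown code %d", int(c))
}

// DiscReason is the disconnect reason sent to or received from the peer.
type DiscReason int

const (
	DiscTooManyPeers DiscReason = iota
	DiscUselessPeer
)

func (r DiscReason) String() string {
	switch r {
	case DiscTooManyPeers:
		return "too many peers"
	case DiscUselessPeer:
		return "useless peer"
	}
	return fmt.Sprintf("unknown reason %d", int(r))
}

// PeerError carries the origin code, the disconnect reason, an optional
// wrapped error and a free-form description up the call chain.
type PeerError struct {
	Code    PeerErrorCode
	Reason  DiscReason
	Err     error
	Message string
}

func (e *PeerError) Error() string {
	return fmt.Sprintf("PeerError(code=%v, reason=%v, err=%v, message=%s)",
		e.Code, e.Reason, e.Err, e.Message)
}

// Unwrap lets errors.Is and errors.As see the wrapped error.
func (e *PeerError) Unwrap() error { return e.Err }

func main() {
	err := &PeerError{
		Code:    PeerErrorRemoteDisconnect,
		Reason:  DiscTooManyPeers,
		Message: "Peer.run got a remote DiscReason",
	}
	fmt.Println(err)
}
```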
The peer ID in sentry.proto is an H512 (64-byte) value, and
MarshalPubkey creates it from a public key.
There's no need to cut the first byte, because MarshalPubkey already
does it.
Doing so results in a 63-byte value that is incompatible with the silkworm
sentry.
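
The length arithmetic can be shown with a small sketch. It works on raw bytes rather than a real key: `marshalPubkey` here is a hypothetical stand-in that takes the 65-byte uncompressed encoding (0x04 prefix plus X and Y), whereas the real MarshalPubkey takes a public key object.

```go
package main

import (
	"errors"
	"fmt"
)

// marshalPubkey drops the 0x04 prefix of a 65-byte uncompressed public key,
// leaving the 64-byte X||Y value used as the peer ID (hypothetical helper).
func marshalPubkey(uncompressed []byte) ([]byte, error) {
	if len(uncompressed) != 65 || uncompressed[0] != 0x04 {
		return nil, errors.New("expected 65-byte uncompressed key with 0x04 prefix")
	}
	id := make([]byte, 64)
	copy(id, uncompressed[1:])
	return id, nil
}

func main() {
	key := make([]byte, 65) // dummy key material, for length illustration only
	key[0] = 0x04

	id, err := marshalPubkey(key)
	if err != nil {
		panic(err)
	}
	fmt.Println("peer ID length:", len(id))              // 64 bytes, fits H512
	fmt.Println("after cutting again:", len(id[1:]))     // 63 bytes, the bug described above
}
```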
The log line here was the culprit for the race. It made sense to just
capture this on localnode creation instead and hold onto it for when the
server is started.
Ran the test with `-race` and `-count=5000` to double check, and all looks
good.
Regarding https://github.com/ledgerwatch/erigon/issues/6260
Added flag `--p2p.allowed-ports=<porta>,<portb>` to restrict which ports
to use for sentries for different protocol versions.
The default for this flag is `30303, 30304` (the first port is inherited from
the `--port` flag default).
If `--port` is changed and its new value is not present in the allowed
port list, the provided port is allowed in addition to the list provided via
`--p2p.allowed-ports`.
Port picking is straightforward: we create the sentry gRPC server for each
protocol on the first allowed port that is not already taken.
If there are no allowed ports left, erigon exits with a hint.
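
The picking rule above can be sketched as follows. All function names are hypothetical, and two details are assumptions: that the `--port` value is placed first in the merged list, and that "taken" is checked by trying to open a TCP listener (the real code may instead track ports already assigned to other sentries).

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

var errNoPorts = errors.New("no allowed ports left, consider extending --p2p.allowed-ports")

// portFree reports whether a TCP listener can currently be opened on the port.
func portFree(port uint) bool {
	l, err := net.Listen("tcp", fmt.Sprintf(":%d", port))
	if err != nil {
		return false
	}
	l.Close()
	return true
}

// allowedPorts adds the --port value to the allowed list if it is missing.
func allowedPorts(port uint, list []uint) []uint {
	for _, p := range list {
		if p == port {
			return list
		}
	}
	return append([]uint{port}, list...)
}

// pickPort returns the first allowed port for which free reports true.
func pickPort(allowed []uint, free func(uint) bool) (uint, error) {
	for _, p := range allowed {
		if free(p) {
			return p, nil
		}
	}
	return 0, errNoPorts
}

func main() {
	// A fake "taken" set keeps the demo independent of the local network state;
	// pass portFree instead to probe real ports.
	taken := map[uint]bool{30303: true}
	free := func(p uint) bool { return !taken[p] }

	allowed := allowedPorts(30303, []uint{30303, 30304})
	port, err := pickPort(allowed, free)
	fmt.Println(port, err)
}
```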
* Add eth/67
* Listen to eth/66 on a separate port
* Fix compilation error
* Fix cfg66.ListenAddr
* Update erigon ports in README
* Expose port 30304 in docker
* P2pProtocolVersionFlag instead of second sentry
* Remove "66 by default" from usage
* Small comment
* exchange RLPx Hello even when maxpeers limit is reached
* bump MaxPendingPeers to increase the default handshake queue
(and the likelihood of Hello exchange)
* use semaphore instead of a chan struct{}
* move MaxPendingPeers default value to DefaultConfig.P2P
* log Error if Accept fails
* replace quit channel with context
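
As an illustration of the last item only, here is a generic sketch of a loop that stops on context cancellation instead of a dedicated quit channel. `runLoop` and `demo` are hypothetical and do not mirror the Server code; connections are represented by plain strings.

```go
package main

import (
	"context"
	"fmt"
)

// runLoop handles incoming connections until ctx is cancelled.
// ctx.Done() takes the place of a hand-rolled quit channel.
func runLoop(ctx context.Context, conns <-chan string) ([]string, error) {
	var handled []string
	for {
		select {
		case c := <-conns:
			handled = append(handled, c)
		case <-ctx.Done():
			return handled, ctx.Err()
		}
	}
}

// demo feeds two connections into the loop and then cancels the context.
func demo() ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	type result struct {
		handled []string
		err     error
	}
	conns := make(chan string) // unbuffered: a send returns only once the loop has received
	done := make(chan result)
	go func() {
		h, err := runLoop(ctx, conns)
		done <- result{h, err}
	}()

	conns <- "peer-1"
	conns <- "peer-2"
	cancel()
	r := <-done
	return r.handled, r.err
}

func main() {
	handled, err := demo()
	fmt.Println(handled, err)
}
```

A context also composes with deadlines and with cancellation passed down from a caller, which a bare quit channel does not provide on its own.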
* Switch peerId from 256 to 512 bit (as in stable)
* go mod tidy
* Fix some tests
* Fixed
* Fixes
* Fix tests
* Update to erigon-lib main
Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local>
Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
Most places that used this method were cutting off the 1st byte.
Refactor this idea to a common place.
* better naming: MarshalPubkey matches existing UnmarshalPubkey
* "Std" suffix for the ANSI standard encoding without cut off
* docs
If the --nat extip:1.2.3.4 option is specified, the port mapping logic
(AddMapping/DeleteMapping) does nothing.
This optimization avoids running a goroutine that does nothing.
* use "log" for struct fields
* use "logger" for function parameters and local vars
This is a compromise between:
1) using logger := log.New() to avoid aliasing (log := log.New())
2) and keeping it short when logging e.g.: srv.log.Info(...)
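
The convention can be shown with a short sketch. It uses the Go standard library `log` package as a stand-in for the project's own log package, so the constructor signature differs from the `log.New()` mentioned above; `Server`, `NewServer` and `demo` are illustrative names.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
)

// Server keeps its logger in a field named "log", so call sites stay short:
// srv.log.Printf(...).
type Server struct {
	log *log.Logger
}

// NewServer takes the logger as a parameter named "logger"; a parameter
// named "log" would shadow the log package inside the function.
func NewServer(logger *log.Logger) *Server {
	return &Server{log: logger}
}

func (srv *Server) Start() {
	srv.log.Printf("Started P2P networking")
}

// demo wires a logger into a server and returns what was logged.
func demo() string {
	var buf bytes.Buffer
	// Local variable is "logger", not "log", to avoid aliasing the package.
	logger := log.New(&buf, "p2p: ", 0)
	srv := NewServer(logger)
	srv.Start()
	return buf.String()
}

func main() {
	fmt.Print(demo())
}
```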
* implemented crash reporting for all goroutine panics that aren't handled explicitly
* changed node defaults back to originals after testing
* implemented panic handling for all goroutines that don't explicitly handle them, outputting the stack trace to a file in crashreports
* handling panics on all goroutines gracefully
* updated missing call
* error assignment
* implemented suggestions
* path.Join added
* implemented Evgeny's suggestions
* changed path.Join to filepath.Join for cross-platform
* added err check
* updated RecoverStackTrace to LogPanic
* updated closures
* removed call of common.Go to some goroutines
* updated scope capture
* removed testing files
* reverted back to original method, I feel like it's less intrusive
* update filename for clarity
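
The approach described in this list can be sketched roughly as below. The name LogPanic and the crashreports directory come from the items above; the signature, the file naming and the report layout used here are assumptions, and `logPanic` is written lowercase to mark it as a stand-in rather than the project's function.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/debug"
	"sync"
)

// logPanic must be deferred directly by the goroutine it protects. If the
// goroutine panics, it writes the panic value and stack trace to a file in dir.
func logPanic(dir, name string) {
	r := recover()
	if r == nil {
		return
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		fmt.Fprintln(os.Stderr, "crash report:", err)
		return
	}
	report := fmt.Sprintf("panic: %v\n\n%s", r, debug.Stack())
	// filepath.Join (not path.Join) keeps the path valid on every platform.
	path := filepath.Join(dir, name+".txt")
	if err := os.WriteFile(path, []byte(report), 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "crash report:", err)
	}
}

// demo panics inside a goroutine and returns the crash report that was written.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "crashreports")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		defer logPanic(dir, "worker") // runs first, recovers the panic
		panic("boom")
	}()
	wg.Wait()

	data, err := os.ReadFile(filepath.Join(dir, "worker.txt"))
	return string(data), err
}

func main() {
	report, err := demo()
	if err != nil {
		fmt.Println("no report:", err)
		return
	}
	fmt.Println(report[:11])
}
```

Because `recover` only works when called directly by a deferred function, each goroutine needs its own `defer`; a shared wrapper that starts goroutines is one way to avoid repeating it.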
This PR enables running the new discv5 protocol in both LES client
and server mode. In client mode it mixes discv5 and dnsdisc iterators
(if both are enabled) and filters incoming ENRs for "les" tag and fork ID.
The old p2p/discv5 package and all references to it are removed.
Co-authored-by: Felix Lange <fjl@twurst.com>
# Conflicts:
# cmd/bootnode/main.go
# cmd/faucet/faucet.go
# cmd/utils/flags.go
# les/client.go
# les/commons.go
# les/enr_entry.go
# les/server.go
# les/serverpool.go
# les/serverpool_test.go
# mobile/discover.go
# mobile/params.go
# p2p/discv5/database.go
# p2p/discv5/metrics.go
# p2p/discv5/net.go
# p2p/discv5/net_test.go
# p2p/discv5/node.go
# p2p/discv5/node_test.go
# p2p/discv5/sim_test.go
# p2p/discv5/table.go
# p2p/discv5/table_test.go
# p2p/discv5/ticket.go
# p2p/discv5/topic.go
# p2p/discv5/topic_test.go
# p2p/discv5/udp.go
# p2p/server.go