lighthouse-pulse/validator_client/src
Michael Sproul 3052db29fe Implement el_offline and use it in the VC (#4295)
## Issue Addressed

Closes https://github.com/sigp/lighthouse/issues/4291, part of #3613.

## Proposed Changes

- Implement the `el_offline` field on `/eth/v1/node/syncing`. We set `el_offline=true` if:
  - The EL's internal status is `Offline` or `AuthFailed`, _or_
  - The most recent call to `newPayload` resulted in an error (more on this in a moment).

- Use the `el_offline` field in the VC to mark nodes with offline ELs as _unsynced_. These nodes will still be used, but only after synced nodes.
- Overhaul the usage of `RequireSynced` so that `::No` is used almost everywhere. The `--allow-unsynced` flag was broken and had the opposite of its intended effect, so it has been deprecated.
- Add tests for the EL being offline on the upcheck call, and being offline due to the newPayload check.
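
As a rough illustration of the VC-side behaviour described above, the sketch below classifies beacon nodes from a syncing response and orders them so that synced nodes are tried first, while unsynced ones stay available as a fallback. The `SyncingData` and `Candidate` types and the `order_candidates` function are hypothetical names for this example, not Lighthouse's actual types, and treating `is_syncing` as "unsynced" is a simplifying assumption.

```rust
/// Hypothetical, simplified view of the data returned by `/eth/v1/node/syncing`.
#[derive(Debug, Clone, Copy)]
struct SyncingData {
    is_syncing: bool,
    el_offline: bool,
}

/// Hypothetical beacon node candidate as seen by the validator client.
#[derive(Debug, Clone)]
struct Candidate {
    name: &'static str,
    syncing: SyncingData,
}

impl Candidate {
    /// Assumption for this sketch: a node counts as synced only if it is not
    /// syncing and its EL is not reported offline.
    fn is_synced(&self) -> bool {
        !self.syncing.is_syncing && !self.syncing.el_offline
    }
}

/// Order candidates so synced nodes come first. Unsynced nodes are kept at
/// the back rather than dropped. `sort_by_key` is stable, so the configured
/// order is preserved within each group.
fn order_candidates(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    // `false < true`, so synced nodes (key = false) sort to the front.
    candidates.sort_by_key(|c| !c.is_synced());
    candidates
}

fn main() {
    let ordered = order_candidates(vec![
        Candidate { name: "bn-a", syncing: SyncingData { is_syncing: false, el_offline: true } },
        Candidate { name: "bn-b", syncing: SyncingData { is_syncing: false, el_offline: false } },
        Candidate { name: "bn-c", syncing: SyncingData { is_syncing: true, el_offline: false } },
    ]);
    for c in &ordered {
        println!("{} synced={}", c.name, c.is_synced());
    }
}
```

Sorting rather than filtering reflects the point that nodes with an offline EL are still used, only later.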


## Why track `newPayload` errors?

Tracking the EL's online/offline status is too coarse-grained to be useful in practice, because:

- If the EL is timing out on some calls, it's unlikely to time out on the `upcheck` call, which is _just_ `eth_syncing`. Every failed call is followed by an upcheck [here](693886b941/beacon_node/execution_layer/src/engines.rs (L372-L380)), which masks the failure and keeps the status _online_.
- The `newPayload` call is the most likely to time out. It's the call in which ELs tend to do most of their work (often 1-2 seconds), with `forkchoiceUpdated` usually returning much faster (<50ms).
- If `newPayload` is failing consistently (e.g. timing out) then this is a good indication that either the node's EL is in trouble, or the network as a whole is. In the first case validator clients _should_ prefer other BNs if they have one available. In the second case, all of their BNs will likely report `el_offline` and they'll just have to proceed with trying to use them.
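
The masking effect can be sketched as follows. This is a minimal model under stated assumptions, not the code in `engines.rs`: `EngineState`, `ElStatus`, and their methods are hypothetical names, and the only behaviour taken from the description above is that `el_offline` is true when the state is `Offline` or `AuthFailed`, or when the most recent `newPayload` call errored.

```rust
/// Hypothetical, simplified engine status. `Offline` and `AuthFailed` are the
/// states named in the description; `Online` stands in for everything else.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EngineState {
    Online,
    Offline,
    AuthFailed,
}

/// Hypothetical tracker combining the coarse engine state with the outcome of
/// the most recent `newPayload` call.
#[derive(Debug)]
struct ElStatus {
    state: EngineState,
    last_new_payload_errored: bool,
}

impl ElStatus {
    fn new() -> Self {
        Self { state: EngineState::Online, last_new_payload_errored: false }
    }

    /// Record the result of a `newPayload` call; a later success clears the flag.
    fn on_new_payload(&mut self, result: Result<(), &'static str>) {
        self.last_new_payload_errored = result.is_err();
    }

    /// Record the state observed by an upcheck (e.g. after a failed call).
    fn on_upcheck(&mut self, observed: EngineState) {
        self.state = observed;
    }

    /// The rule from the description: offline state OR a failed last `newPayload`.
    fn el_offline(&self) -> bool {
        matches!(self.state, EngineState::Offline | EngineState::AuthFailed)
            || self.last_new_payload_errored
    }
}

fn main() {
    let mut el = ElStatus::new();

    // `newPayload` times out, but the cheap upcheck that follows succeeds.
    el.on_new_payload(Err("timeout"));
    el.on_upcheck(EngineState::Online);
    println!("state={:?} el_offline={}", el.state, el.el_offline());

    // A later successful `newPayload` clears the flag.
    el.on_new_payload(Ok(()));
    println!("state={:?} el_offline={}", el.state, el.el_offline());

    // An authentication failure is reported as offline regardless of the flag.
    el.on_upcheck(EngineState::AuthFailed);
    println!("state={:?} el_offline={}", el.state, el.el_offline());
}
```

In the first step the state alone would report the EL as healthy; only the separate `newPayload` flag surfaces the problem.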

## Additional Changes

- Add the utility method `ForkName::latest`, which is convenient for writing tests and is likely useful elsewhere too.
- Delete some stale comments from when we used to support multiple execution nodes.
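
To show why a `latest` helper is handy in tests, here is a small sketch using a stand-in enum. The enum below, its variant list, and the `list_all` helper are assumptions made for illustration and may not match the real `ForkName` definition or the way `ForkName::latest` is actually implemented.

```rust
/// Hypothetical stand-in for a fork-name enum, listed oldest to newest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ForkName {
    Base,
    Altair,
    Merge,
    Capella,
}

impl ForkName {
    /// All forks in activation order (an assumption of this sketch).
    fn list_all() -> Vec<ForkName> {
        vec![ForkName::Base, ForkName::Altair, ForkName::Merge, ForkName::Capella]
    }

    /// The most recent fork, so tests need not hard-code a variant that
    /// goes stale whenever a new fork is added.
    fn latest() -> ForkName {
        *ForkName::list_all()
            .last()
            .expect("at least one fork is defined")
    }
}

fn main() {
    println!("latest fork in this sketch: {:?}", ForkName::latest());
}
```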
2023-05-17 05:51:56 +00:00

| Name | Last commit | Date |
| --- | --- | --- |
| duties_service | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| http_api | Split common crates out into their own repos (#3890) | 2023-04-28 01:15:40 +00:00 |
| http_metrics | Add new validator API for voluntary exit (#4119) | 2023-04-03 03:02:56 +00:00 |
| signing_method | Split common crates out into their own repos (#3890) | 2023-04-28 01:15:40 +00:00 |
| attestation_service.rs | Validator registration request failures do not cause us to mark BNs offline (#3488) | 2022-08-29 11:35:59 +00:00 |
| beacon_node_fallback.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| block_service.rs | Separate BN for block proposals (#4182) | 2023-04-26 01:12:36 +00:00 |
| check_synced.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| cli.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| config.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| doppelganger_service.rs | Clippy lints for rust 1.66 (#3810) | 2022-12-16 04:04:00 +00:00 |
| duties_service.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| graffiti_file.rs | Rust 1.54.0 lints (#2483) | 2021-07-30 01:11:47 +00:00 |
| initialized_validators.rs | Optimise update_validators by decrypting key cache only when necessary (#4126) | 2023-03-29 02:56:39 +00:00 |
| key_cache.rs | Clippy lints for rust 1.66 (#3810) | 2022-12-16 04:04:00 +00:00 |
| latency.rs | Add VC metric for primary BN latency (#4051) | 2023-03-06 04:08:49 +00:00 |
| lib.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| notifier.rs | Add new VC metrics for beacon node availability (#3193) | 2022-05-26 02:05:16 +00:00 |
| preparation_service.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| signing_method.rs | Add new validator API for voluntary exit (#4119) | 2023-04-03 03:02:56 +00:00 |
| sync_committee_service.rs | Implement el_offline and use it in the VC (#4295) | 2023-05-17 05:51:56 +00:00 |
| validator_store.rs | Add new validator API for voluntary exit (#4119) | 2023-04-03 03:02:56 +00:00 |