lighthouse-pulse/beacon_node
Paul Hauner 8609cced0e Reset payload statuses when resuming fork choice (#3498)
## Issue Addressed

NA

## Proposed Changes

This PR is motivated by a recent consensus failure in Geth where it returned `INVALID` for a `VALID` block. Without this PR, the only way to recover is by re-syncing Lighthouse. Whilst ELs "shouldn't have consensus failures", in reality it's something that we can expect from time to time due to the complex nature of Ethereum. Being able to recover easily will help the network recover and help EL devs troubleshoot.

The risk introduced with this PR is that genuinely `INVALID` payloads get a "second chance" at being imported. I believe the DoS risk here is negligible since LH needs to be restarted in order to re-process the payload. Furthermore, there's no reason to think that a well-performing EL will accept a truly invalid payload the second time around.

## Additional Info

This implementation has the following intricacies:

1. Instead of just resetting *invalid* payloads to optimistic, we'll also reset *valid* payloads. This is an artifact of our existing implementation.
1. We will only reset payload statuses when we detect an invalid payload present in `proto_array`.
    - This helps save us from forgetting that all our blocks are valid in the "best case scenario" where there are no invalid blocks.
1. If we fail to revert the payload statuses we'll log a `CRIT` and just continue with a `proto_array` that *does not* have reverted payload statuses.
    - The code to revert statuses needs to deal with balances and proposer-boost, so it's a failure point. This is a defensive measure to avoid introducing new show-stopping bugs to LH.
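
The three points above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: `ExecutionStatus`, `ProtoArray`, `reset_statuses`, `resume_fork_choice` and the `simulate_failure` flag are all hypothetical stand-ins, and balances and proposer-boost handling are omitted entirely.

```rust
/// Hypothetical, simplified per-block payload status.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ExecutionStatus {
    Valid,
    Invalid,
    Optimistic,
}

/// Hypothetical stand-in for `proto_array`: just a list of payload statuses.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ProtoArray {
    pub statuses: Vec<ExecutionStatus>,
}

impl ProtoArray {
    /// True if at least one block is marked invalid.
    pub fn contains_invalid_payloads(&self) -> bool {
        self.statuses.iter().any(|s| *s == ExecutionStatus::Invalid)
    }
}

/// Builds a copy in which both valid and invalid payloads become optimistic.
/// The real reset is a failure point (balances, proposer-boost); here the
/// failure is only simulated through the `simulate_failure` flag.
fn reset_statuses(original: &ProtoArray, simulate_failure: bool) -> Result<ProtoArray, String> {
    if simulate_failure {
        return Err("simulated reset failure".to_string());
    }
    let statuses = original
        .statuses
        .iter()
        .map(|s| match s {
            ExecutionStatus::Valid | ExecutionStatus::Invalid => ExecutionStatus::Optimistic,
            ExecutionStatus::Optimistic => ExecutionStatus::Optimistic,
        })
        .collect();
    Ok(ProtoArray { statuses })
}

/// Called when resuming fork choice from a persisted `ProtoArray`.
pub fn resume_fork_choice(persisted: ProtoArray, simulate_failure: bool) -> ProtoArray {
    // Point 2: with no invalid payloads present, keep everything as it is.
    if !persisted.contains_invalid_payloads() {
        return persisted;
    }
    match reset_statuses(&persisted, simulate_failure) {
        // Point 1: valid and invalid payloads have all been reset to optimistic.
        Ok(reset) => reset,
        // Point 3: log loudly and continue with the un-reset statuses.
        Err(e) => {
            eprintln!("CRIT: failed to reset payload statuses: {}", e);
            persisted
        }
    }
}
```

In this sketch the reset works on a copy, so a failed reset leaves the persisted statuses untouched, which mirrors the defensive fallback described in point 3.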
2022-08-29 14:34:41 +00:00
| Name | Last commit | Date |
| --- | --- | --- |
| beacon_chain | Reset payload statuses when resuming fork choice (#3498) | 2022-08-29 14:34:41 +00:00 |
| builder_client | Builder Specs v0.2.0 (#3134) | 2022-07-30 00:22:37 +00:00 |
| client | Log if no execution endpoint is configured (#3467) | 2022-08-15 01:31:02 +00:00 |
| eth1 | Tidy eth1/deposit contract logging (#3397) | 2022-08-01 07:20:43 +00:00 |
| execution_layer | Pause sync when EE is offline (#3428) | 2022-08-24 23:34:56 +00:00 |
| genesis | Unify execution layer endpoints (#3214) | 2022-06-29 09:07:09 +00:00 |
| http_api | Refactor op pool for speed and correctness (#3312) | 2022-08-29 09:10:26 +00:00 |
| http_metrics | Support IPv6 in BN and VC HTTP APIs (#3104) | 2022-03-24 00:04:49 +00:00 |
| lighthouse_network | Return ResourceUnavailable if we are unable to reconstruct execution payloads (#3365) | 2022-07-27 03:20:00 +00:00 |
| network | Refactor op pool for speed and correctness (#3312) | 2022-08-29 09:10:26 +00:00 |
| operation_pool | Refactor op pool for speed and correctness (#3312) | 2022-08-29 09:10:26 +00:00 |
| src | Reset payload statuses when resuming fork choice (#3498) | 2022-08-29 14:34:41 +00:00 |
| store | Refactor op pool for speed and correctness (#3312) | 2022-08-29 09:10:26 +00:00 |
| tests | Altair consensus changes and refactors (#2279) | 2021-07-09 06:15:32 +00:00 |
| timer | Use async code when interacting with EL (#3244) | 2022-07-03 05:36:50 +00:00 |
| Cargo.toml | v3.0.0 (#3464) | 2022-08-22 03:43:08 +00:00 |