## Issue Addressed
Resolves#2602
## Proposed Changes
*Note: For a review it might help to look at the individual commits.*
### `boot_node`
Add support for the flags `dump-config` and `immediate-shutdown`. For `immediate-shutdown` the actual behavior could be described as `dump-config-and-exit`.
Both flags are handled in `boot_node::main`, which appears to be the simplest approach.
### `boot_node` regression tests
Added in `lighthouse/tests/boot_node.rs`.
### `CommandLineTestExec`
Factors out boilerplate related to CLI tests. It's used in the regression tests for `boot_node`, `beacon_node` and `validator_client`.
## Open TODO
Add tests for `boot_node` flags `enable-enr-auto-update` and `disable-packet-filter`. They end up in [`Discv5Config`](9ed2cba6bc/boot_node/src/config.rs (L29)), which doesn't support serde (de)serialization.
I haven't found a workaround - guidance would be appreciated.
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/2112
Closes https://github.com/sigp/lighthouse/issues/1861
## Proposed Changes
Collect attestations by validator index in the slasher, and use the magic of reference counting to automatically discard redundant attestations. This results in us storing only 1-2% of the attestations observed when subscribed to all subnets, which carries over to a 50-100x reduction in data stored 🎉
## Additional Info
There's some nuance to the configuration of the `slot-offset`. It has a profound effect on the effictiveness of de-duplication, see the docs added to the book for an explanation: 5442e695e5/book/src/slasher.md (slot-offset)
## Issue Addressed
The computation of metrics in the network service can be expensive. This disables the computation unless the cli flag `metrics` is set.
## Additional Info
Metrics in other parts of the network are still updated, since most are simple metrics and checking if metrics are enabled each time each metric is updated doesn't seem like a gain.
## Issue Addressed
Mitigates #1096
## Proposed Changes
Add a flag to the beacon node called `--disable-lock-timeouts` which allows opting out of lock timeouts.
The lock timeouts serve a dual purpose:
1. They prevent any single operation from hogging the lock for too long. When a timeout occurs it logs a nasty error which indicates that there's suboptimal lock use occurring, which we can then act on.
2. They allow deadlock detection. We're fairly sure there are no deadlocks left in Lighthouse anymore but the timeout locks offer a safeguard against that.
However, timeouts on locks are not without downsides:
They allow for the possibility of livelock, particularly on slower hardware. If lock timeouts keep failing spuriously the node can be prevented from making any progress, even if it would be able to make progress slowly without the timeout. One particularly concerning scenario which could occur would be if a DoS attack succeeded in slowing block signature verification times across the network, and all Lighthouse nodes got livelocked because they timed out repeatedly. This could also occur on just a subset of nodes (e.g. dual core VPSs or Raspberri Pis).
By making the behaviour runtime configurable this PR allows us to choose the behaviour we want depending on circumstance. I suspect that long term we could make the timeout-free approach the default (#2381 moves in this direction) and just enable the timeouts on our testnet nodes for debugging purposes. This PR conservatively leaves the default as-is so we can gain some more experience before switching the default.
## Description
The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crates purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration.
As of this writing it currently houses the following high-level lighthouse-specific logic:
- Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings
- Integration and handling of ENRs with respect to libp2p and eth2
- Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers.
- Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc.
- Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api.
- Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analyssed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks.
- Lighthouse specific types for managing gossipsub topics, sync status and ENR fields
- Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring
- Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p.
Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. This PR renames this crate to `lighthouse_network`
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
Resolves#2262
## Proposed Changes
Add a new CLI flag `--beacon-nodes-tls-certs` which allows the user to specify a path to a certificate file (or a list of files, separated by commas). The VC will then use these certificates (in addition to the existing certificates in the OS trust store) when connecting to a beacon node over HTTPS.
## Additional Info
This only supports certificates in PEM format.
Currently, the beacon node has no ability to serve the HTTP API over TLS.
Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate.
## Proposed Changes
- Add three new CLI flags to the beacon node
- `--http-enable-tls`: enables TLS
- `--http-tls-cert`: to specify the path to the certificate file
- `--http-tls-key`: to specify the path to the key file
- Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag
- Update tests and docs
- Use a custom branch for `warp` to ensure proper error handling
## Additional Info
Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc).
I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls).
Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable.
Currently, the private key file must not be password-protected in order to be read into Lighthouse.
## Issue Addressed
Closes#1891Closes#1784
## Proposed Changes
Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint.
## Additional Info
- [x] Return unavailable status for out-of-range blocks requested by peers (#2561)
- [x] Implement sync daemon for fetching historical blocks (#2561)
- [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module)
- [x] Consistency check for initial block + state
- [x] Fetch the initial state and block from a beacon node HTTP endpoint
- [x] Don't crash fetching beacon states by slot from the API
- [x] Background service for state reconstruction, triggered by CLI flag or API call.
Considered out of scope for this PR:
- Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification)
Co-authored-by: Diva M <divma@protonmail.com>
Sigma Prime is transitioning our mainnet bootnodes and this PR represents the transition of our bootnodes.
After a few releases, old boot-nodes will be deprecated.
## Issue Addressed
Resolves#2033
## Proposed Changes
Adds a flag to enable shutting down beacon node right after sync is completed.
## Additional Info
Will need modification after weak subjectivity sync is enabled to change definition of a fully synced node.
## Issue Addressed
Resolves#2069
## Proposed Changes
- Adds a `--doppelganger-detection` flag
- Adds a `lighthouse/seen_validators` endpoint, which will make it so the lighthouse VC is not interopable with other client beacon nodes if the `--doppelganger-detection` flag is used, but hopefully this will become standardized. Relevant Eth2 API repo issue: https://github.com/ethereum/eth2.0-APIs/issues/64
- If the `--doppelganger-detection` flag is used, the VC will wait until the beacon node is synced, and then wait an additional 2 epochs. The reason for this is to make sure the beacon node is able to subscribe to the subnets our validators should be attesting on. I think an alternative would be to have the beacon node subscribe to all subnets for 2+ epochs on startup by default.
## Additional Info
I'd like to add tests and would appreciate feedback.
TODO: handle validators started via the API, potentially make this default behavior
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
#2454
## Proposed Changes
Adds the `--http-address` flag to allow the user to use custom HTTP addresses. This can be helpful for certain Docker setups.
Since using custom HTTP addresses is unsafe due to the server being unencrypted, `--unencrypted-http-transport` was also added as a safety flag and must be used in tandem with `--http-address`. This is to ensure the user is aware of the risks associated with using non-local HTTP addresses.
## Issue Addressed
Resolves#2313
## Proposed Changes
Provide `BeaconNodeHttpClient` with a dedicated `Timeouts` struct.
This will allow granular adjustment of the timeout duration for different calls made from the VC to the BN. These can either be a constant value, or as a ratio of the slot duration.
Improve timeout performance by using these adjusted timeout duration's only whenever a fallback endpoint is available.
Add a CLI flag called `use-long-timeouts` to revert to the old behavior.
## Additional Info
Additionally set the default `BeaconNodeHttpClient` timeouts to the be the slot duration of the network, rather than a constant 12 seconds. This will allow it to adjust to different network specifications.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
Resolves#2322
## Proposed Changes
Adds a `modify` command to `lighthouse account validator` with subcommands to enable and disable specific or all pubkeys.
## Issue Addressed
NA
## Proposed Changes
Modify the configuration of [GNU malloc](https://www.gnu.org/software/libc/manual/html_node/The-GNU-Allocator.html) to reduce memory footprint.
- Set `M_ARENA_MAX` to 4.
- This reduces memory fragmentation at the cost of contention between threads.
- Set `M_MMAP_THRESHOLD` to 2mb
- This means that any allocation >= 2mb is allocated via an anonymous mmap, instead of on the heap/arena. This reduces memory fragmentation since we don't need to keep growing the heap to find big contiguous slabs of free memory.
- ~~Run `malloc_trim` every 60 seconds.~~
- ~~This shaves unused memory from the top of the heap, preventing the heap from constantly growing.~~
- Removed, see: https://github.com/sigp/lighthouse/pull/2299#issuecomment-825322646
*Note: this only provides memory savings on the Linux (glibc) platform.*
## Additional Info
I'm going to close#2288 in favor of this for the following reasons:
- I've managed to get the memory footprint *smaller* here than with jemalloc.
- This PR seems to be less of a dramatic change than bringing in the jemalloc dep.
- The changes in this PR are strictly runtime changes, so we can create CLI flags which disable them completely. Since this change is wide-reaching and complex, it's nice to have an easy "escape hatch" if there are undesired consequences.
## TODO
- [x] Allow configuration via CLI flags
- [x] Test on Mac
- [x] Test on RasPi.
- [x] Determine if GNU malloc is present?
- I'm not quite sure how to detect for glibc.. This issue suggests we can't really: https://github.com/rust-lang/rust/issues/33244
- [x] Make a clear argument regarding the affect of this on CPU utilization.
- [x] Test with higher `M_ARENA_MAX` values.
- [x] Test with longer trim intervals
- [x] Add some stats about memory savings
- [x] Remove `malloc_trim` calls & code
## Issue Addressed
Windows incompatibility.
## Proposed Changes
On windows, lighthouse needs to default to STDIN as tty doesn't exist. Also Windows uses ACLs for file permissions. So to mirror chmod 600, we will remove every entry in a file's ACL and add only a single SID that is an alias for the file owner.
Beyond that, there were several changes made to different unit tests because windows has slightly different error messages as well as frustrating nuances around killing a process :/
## Additional Info
Tested on my Windows VM and it appears to work, also compiled & tested on Linux with these changes. Permissions look correct on both platforms now. Just waiting for my validator to activate on Prater so I can test running full validator client on windows.
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
N/A
## Proposed Changes
Add unit tests for the various CLI flags associated with the beacon node and validator client. These changes require the addition of two new flags: `dump-config` and `immediate-shutdown`.
## Additional Info
Both `dump-config` and `immediate-shutdown` are marked as hidden since they should only be used in testing and other advanced use cases.
**Note:** This requires changing `main.rs` so that the flags can adjust the program behavior as necessary.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
Closes#2052
## Proposed Changes
- Refactor the attester/proposer duties endpoints in the BN
- Performance improvements
- Fixes some potential inconsistencies with the dependent root fields.
- Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead.
- Move the code for the proposer/attester duties endpoints into separate files, for readability.
- Refactor the `DutiesService` in the VC
- Required to reduce the delay on broadcasting new blocks.
- Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API.
- Separate block/attestation duty tasks so that they don't block each other when one is slow.
- In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object whilst `PublicKeyBytes` is just a byte-array, it's much faster to clone/hash `PublicKeyBytes` and this change has had a significant impact on runtimes.
- Unfortunately this has created lots of dust changes.
- In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire and the conversion from `PublicKey` -> `PublickeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont).
- Add the `state_processing::state_advance` mod which dedups a lot of the "apply `n` skip slots to the state" code.
- This also fixes a bug with some functions which were failing to include a state root as per [this comment](072695284f/consensus/state_processing/src/state_advance.rs (L69-L74)). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root.
- Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base.
~~This PR *reduces* the size of the codebase 🎉~~ It *used* to reduce the size of the code base before I added more comments.
## Observations on Prymont
- Proposer duties times down from peaks of 450ms to consistent <1ms.
- Current epoch attester duties times down from >1s peaks to a consistent 20-30ms.
- Block production down from +600ms to 100-200ms.
## Additional Info
- ~~Blocked on #2241~~
- ~~Blocked on #2234~~
## TODO
- [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now.
- [x] Address `per_slot_processing` roots.
- [x] Investigate slow next epoch times. Not getting added to cache on block processing?
- [x] Consider [this](072695284f/beacon_node/store/src/hot_cold_store.rs (L811-L812)) in the scenario of replacing the state roots
Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
#2224
## Proposed Changes
Add a `--password-file` option to the `lighthouse account validator import` command. The flag requires `--reuse-password` and will copy the password over to the `validator_definitions.yml` file. I used #2070 as a guide for validating the password as UTF-8 and stripping newlines.
## Additional Info
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
Resolves#1944
## Proposed Changes
Adds a "graffiti" key to the `validator_definitions.yml`. Setting the key will override anything passed through the validator `--graffiti` flag.
Returns an error if the value for the graffiti key is > 32 bytes instead of silently truncating.
## Issue Addressed
NA
## Proposed Changes
- Implements a HTTP API for the validator client.
- Creates EIP-2335 keystores with an empty `description` field, instead of a missing `description` field. Adds option to set name.
- Be more graceful with setups without any validators (yet)
- Remove an error log when there are no validators.
- Create the `validator` dir if it doesn't exist.
- Allow building a `ValidatorDir` without a withdrawal keystore (required for the API method where we only post a voting keystore).
- Add optional `description` field to `validator_definitions.yml`
## TODO
- [x] Signature header, as per https://github.com/sigp/lighthouse/issues/1269#issuecomment-649879855
- [x] Return validator descriptions
- [x] Return deposit data
- [x] Respect the mnemonic offset
- [x] Check that mnemonic can derive returned keys
- [x] Be strict about non-localhost
- [x] Allow graceful start without any validators (+ create validator dir)
- [x] Docs final pass
- [x] Swap to EIP-2335 description field.
- [x] Fix Zerioze TODO in VC api types.
- [x] Zeroize secp256k1 key
## Endpoints
- [x] `GET /lighthouse/version`
- [x] `GET /lighthouse/health`
- [x] `GET /lighthouse/validators`
- [x] `POST /lighthouse/validators/hd`
- [x] `POST /lighthouse/validators/keystore`
- [x] `PATCH /lighthouse/validators/:validator_pubkey`
- [ ] ~~`POST /lighthouse/validators/:validator_pubkey/exit/:epoch`~~ Future works
## Additional Info
TBC
## Issue Addressed
Implements support for importing and exporting the slashing protection DB interchange format described here:
https://hackmd.io/@sproul/Bk0Y0qdGD
Also closes#1584
## Proposed Changes
* [x] Support for serializing and deserializing the format
* [x] Support for importing and exporting Lighthouse's database
* [x] CLI commands to invoke import and export
* [x] Export to minimal format (required when a minimal format has been previously imported)
* [x] Tests for export to minimal (utilising mixed importing and attestation signing?)
* [x] Tests for import/export of complete format, and import of minimal format
* [x] ~~Prevent attestations with sources less than our max source (Danny's suggestion). Required for the fake attestation that we put in for the minimal format to block attestations from source 0.~~
* [x] Add the concept of a "low watermark" for compatibility with the minimal format
Bonus!
* [x] A fix to a potentially nasty bug involving validators getting re-registered each time the validator client ran! Thankfully, the ordering of keys meant that the validator IDs used for attestations and blocks remained stable -- otherwise we could have had some slashings on our hands! 😱
* [x] Tests to confirm that this bug is indeed vanquished
Closes#1487Closes#1427
Directory restructure in accordance with #1487. Also has temporary migration code to move the old directories into new structure.
Also extracts all default directory names and utility functions into a `directory` crate to avoid repetitio.
~Since `validator_definition.yaml` stores absolute paths, users will have to manually change the keystore paths or delete the file to get the validators picked up by the vc.~. `validator_definition.yaml` is migrated as well from the default directories.
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
#1437
## Proposed Changes
- Make the `--wallet-password` flag optional and creates an interactive prompt if not provided.
- Make the `--wallet-name` flag optional and creates an interactive prompt if not provided.
- Add a minimum password requirement of a 12 character length.
- Update the `--stdin-passwords` flag to `--stdin-inputs` because we have non-password user inputs
## Additional Info
## Issue Addressed
N/A
## Proposed Changes
- Fix a wrong command in the validator generation example.
- Replace occurrences of 'passphrase' with 'password'. This is mostly because I felt that there was a lot of mixing of the two phrases in the documentation and the actual commands which is a bit confusing. Picked 'password' everywhere because it felt more appropriate but I don't mind changing it to 'passphrase' as long it's consistent everywhere.
## Issue Addressed
NA
## Proposed Changes
- Refactor the `bls` crate to support multiple BLS "backends" (e.g., milagro, blst, etc).
- Removes some duplicate, unused code in `common/rest_types/src/validator.rs`.
- Removes the old "upgrade legacy keypairs" functionality (these were unencrypted keys that haven't been supported for a few testnets, no one should be using them anymore).
## Additional Info
Most of the files changed are just inconsequential changes to function names.
## TODO
- [x] Optimization levels
- [x] Infinity point: https://github.com/supranational/blst/issues/11
- [x] Ensure milagro *and* blst are tested via CI
- [x] What to do with unsafe code?
- [x] Test infinity point in signature sets
* Update milagro_bls to new release
Signed-off-by: Kirk Baird <baird.k@outlook.com>
* Tidy up fake cryptos
Signed-off-by: Kirk Baird <baird.k@outlook.com>
* move SecretHash to bls and put plaintext back
Signed-off-by: Kirk Baird <baird.k@outlook.com>