## Issue Addressed
NA
## Proposed Changes
Please list or describe the changes introduced by this PR.
## Additional Info
- Pending testing on our infra. **Please do not merge**
## Issue Addressed
Upcoming spec change https://github.com/ethereum/consensus-specs/pull/2878
## Proposed Changes
1. Run fork choice at the start of every slot, and wait for this run to complete before proposing a block.
2. As an optimisation, also run fork choice 3/4 of the way through the slot (at 9s), _dequeueing attestations for the next slot_.
3. Remove the fork choice run from the state advance timer that occurred before advancing the state.
## Additional Info
### Block Proposal Accuracy
This change makes us more likely to propose on top of the correct head in the presence of re-orgs with proposer boost in play. The main scenario that this change is designed to address is described in the linked spec issue.
### Attestation Accuracy
This change _also_ makes us more likely to attest to the correct head. Currently in the case of a skipped slot at `slot` we only run fork choice 9s into `slot - 1`. This means the attestations from `slot - 1` aren't taken into consideration, and any boost applied to the block from `slot - 1` is not removed (it should be). In the language of the linked spec issue, this means we are liable to attest to C, even when the majority voting weight has already caused a re-org to B.
### Why remove the call before the state advance?
If we've run fork choice at the start of the slot then it has already dequeued all the attestations from the previous slot, which are the only ones eligible to influence the head in the current slot. Running fork choice again is unnecessary (unless we run it for the next slot and try to pre-empt a re-org, but I don't currently think this is a great idea).
### Performance
Based on Prater testing this adds about 5-25ms of runtime to block proposal times, which are 500-1000ms on average (and spike to 5s+ sometimes due to state handling issues 😢 ). I believe this is a small enough penalty to enable it by default, with the option to disable it via the new flag `--fork-choice-before-proposal-timeout 0`. Upcoming work on block packing and state representation will also reduce block production times in general, while removing the spikes.
### Implementation
Fork choice gets invoked at the start of the slot via the `per_slot_task` function called from the slot timer. It then uses a condition variable to signal to block production that fork choice has been updated. This is a bit funky, but it seems to work. One downside of the timer-based approach is that it doesn't happen automatically in most of the tests. The test added by this PR has to trigger the run manually.
# Description
Since the `TaskExecutor` currently requires a `Weak<Runtime>`, it's impossible to use it in an async test where the `Runtime` is created outside our scope. Whilst we *could* create a new `Runtime` instance inside the async test, dropping that `Runtime` would cause a panic (you can't drop a `Runtime` in an async context).
To address this issue, this PR creates the `enum Handle`, which supports either:
- A `Weak<Runtime>` (for use in our production code)
- A `Handle` to a runtime (for use in testing)
In theory, there should be no change to the behaviour of our production code (beyond some slightly different descriptions in HTTP 500 errors), or even our tests. If there is no change, you might ask *"why bother?"*. There are two PRs (#3070 and #3175) that are waiting on these fixes to introduce some new tests. Since we've added the EL to the `BeaconChain` (for the merge), we are now doing more async stuff in tests.
I've also added a `RuntimeExecutor` to the `BeaconChainTestHarness`. Whilst that's not immediately useful, it will become useful in the near future with all the new async testing.
## Issue Addressed
Addresses sync stalls on v2.2.0 (i.e. https://github.com/sigp/lighthouse/issues/3147).
## Additional Info
I've avoided doing a full `cargo update` because I noticed there's a new patch version of libp2p and thought it could do with some more testing.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Proposed Changes
Cut release v2.2.0 including proposer boost.
## Additional Info
I also updated the clippy lints for the imminent release of Rust 1.60, although LH v2.2.0 will continue to compile using Rust 1.58 (our MSRV).
## Proposed Changes
I did some gardening 🌳 in our dependency tree:
- Remove duplicate versions of `warp` (git vs patch)
- Remove duplicate versions of lots of small deps: `cpufeatures`, `ethabi`, `ethereum-types`, `bitvec`, `nix`, `libsecp256k1`.
- Update MDBX (should resolve#3028). I tested and Lighthouse compiles on Windows 11 now.
- Restore `psutil` back to upstream
- Make some progress updating everything to rand 0.8. There are a few crates stuck on 0.7.
Hopefully this puts us on a better footing for future `cargo audit` issues, and improves compile times slightly.
## Additional Info
Some crates are held back by issues with `zeroize`. libp2p-noise depends on [`chacha20poly1305`](https://crates.io/crates/chacha20poly1305) which depends on zeroize < v1.5, and we can only have one version of zeroize because it's post 1.0 (see https://github.com/rust-lang/cargo/issues/6584). The latest version of `zeroize` is v1.5.4, which is used by the new versions of many other crates (e.g. `num-bigint-dig`). Once a new version of chacha20poly1305 is released we can update libp2p-noise and upgrade everything to the latest `zeroize` version.
I've also opened a PR to `blst` related to zeroize: https://github.com/supranational/blst/pull/111
## Proposed Changes
Increase the default `--slots-per-restore-point` to 8192 for a 4x reduction in freezer DB disk usage.
Existing nodes that use the previous default of 2048 will be left unchanged. Newly synced nodes (with or without checkpoint sync) will use the new 8192 default.
Long-term we could do away with the freezer DB entirely for validator-only nodes, but this change is much simpler and grants us some extra space in the short term. We can also roll it out gradually across our nodes by purging databases one by one, while keeping the Ansible config the same.
## Additional Info
We ignore a change from 2048 to 8192 if the user hasn't set the 8192 explicitly. We fire a debug log in the case where we do ignore:
```
DEBG Ignoring slots-per-restore-point config in favour of on-disk value, on_disk: 2048, config: 8192
```
## Proposed Changes
Add a `lighthouse db` command with three initial subcommands:
- `lighthouse db version`: print the database schema version.
- `lighthouse db migrate --to N`: manually upgrade (or downgrade!) the database to a different version.
- `lighthouse db inspect --column C`: log the key and size in bytes of every value in a given `DBColumn`.
This PR lays the groundwork for other changes, namely:
- Mark's fast-deposit sync (https://github.com/sigp/lighthouse/pull/2915), for which I think we should implement a database downgrade (from v9 to v8).
- My `tree-states` work, which already implements a downgrade (v10 to v8).
- Standalone purge commands like `lighthouse db purge-dht` per https://github.com/sigp/lighthouse/issues/2824.
## Additional Info
I updated the `strum` crate to 0.24.0, which necessitated some changes in the network code to remove calls to deprecated methods.
Thanks to @winksaville for the motivation, and implementation work that I used as a source of inspiration (https://github.com/sigp/lighthouse/pull/2685).
## Issue Addressed
#3103
## Proposed Changes
Parse `http-address` and `metrics-address` as `IpAddr` for both the beacon node and validator client to support IPv6 addresses.
Also adjusts parsing of CORS origins to allow for IPv6 addresses.
## Usage
You can now set `http-address` and/or `metrics-address` flags to IPv6 addresses.
For example, the following:
`lighthouse bn --http --http-address :: --metrics --metrics-address ::1`
will expose the beacon node HTTP server on `[::]` (equivalent of `0.0.0.0` in IPv4) and the metrics HTTP server on `localhost` (the equivalent of `127.0.0.1` in IPv4)
The beacon node API can then be accessed by:
`curl "http://[server-ipv6-address]:5052/eth/v1/some_endpoint"`
And the metrics server api can be accessed by:
`curl "http://localhost:5054/metrics"` or by `curl "http://[::1]:5054/metrics"`
## Additional Info
On most Linux distributions the `v6only` flag is set to `false` by default (see the section for the `IPV6_V6ONLY` flag in https://www.man7.org/linux/man-pages/man7/ipv6.7.html) which means IPv4 connections will continue to function on a IPv6 address (providing it is appropriately mapped). This means that even if the Lighthouse API is running on `::` it is also possible to accept IPv4 connections.
However on Windows, this is not the case. The `v6only` flag is set to `true` so binding to `::` will only allow IPv6 connections.
## Proposed Changes
Set a minimum supported Rust version (MSRV) in the `Cargo.toml` for the Lighthouse binary so that attempts to compile it with an outdated compiler fail immediately with a clear error.
To ensure that the codebase builds with the MSRV I've also added a Github actions job that runs `cargo check` using the MSRV extracted from `Cargo.toml`. This will force us to keep it up to date.
I opted to use `cargo check` rather than Clippy because Clippy frequently introduces new lints that we adopt, so our MSRV for Clippy is usually the most recent Rust version, while the MSRV for building Lighthouse is older.
## Issue Addressed
NA
## Proposed Changes
- Bump version to `v2.1.4`
- Run `cargo update`
## Additional Info
I think this release should be published around the 15th of March.
Presently `blocked` for testing on our infrastructure.
## Issue Addressed
Resolves#3015
## Proposed Changes
Add JWT token based authentication to engine api requests. The jwt secret key is read from the provided file and is used to sign tokens that are used for authenticated communication with the EL node.
- [x] Interop with geth (synced `merge-devnet-4` with the `merge-kiln-v2` branch on geth)
- [x] Interop with other EL clients (nethermind on `merge-devnet-4`)
- [x] ~Implement `zeroize` for jwt secrets~
- [x] Add auth server tests with `mock_execution_layer`
- [x] Get auth working with the `execution_engine_integration` tests
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Description
This PR adds a single, trivial commit (f5d2b27d78349d5a675a2615eba42cc9ae708094) atop #2986 to resolve a tests compile error. The original author (@ethDreamer) is AFK so I'm getting this one merged ☺️
Please see #2986 for more information about the other, significant changes in this PR.
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
## Proposed Changes
Lots of lint updates related to `flat_map`, `unwrap_or_else` and string patterns. I did a little more creative refactoring in the op pool, but otherwise followed Clippy's suggestions.
## Additional Info
We need this PR to unblock CI.
## Issue Addressed
Closes#2990
## Proposed Changes
Add a check to see if the `--validators-dir` CLI flag is set and if so store validator logs into it.
Ensure that if the log directory cannot be created, emit a `WARN` and disable file logging rather than panicking.
## Additional Info
Panics associated with logfiles can still occur in these scenarios:
1. The `$datadir/validators/logs` directory already exists with the wrong permissions (or was changed after creation).
1. The logfile already exists with the wrong permissions (or was changed after creation).
> These panics are cosmetic only since only the logfile thread panics. Following the panics, LH will continue to function as normal.
I believe this is due to the use of [`slog::Fuse`](https://docs.rs/slog/latest/slog/struct.Fuse.html) when initializing the logger.
I'm not sure if there a better way of handling logfile errors?
I think ideally, rather than panicking, we would emit a `WARN` to the stdout logger with the panic reason, then exit the logfile thread gracefully.
## Issue Addressed
#3020
## Proposed Changes
- Alias the `validators-dir` arg to `validator-dir` in the `validator_client` subcommand.
- Alias the `validator-dir` arg to `validators-dir` in the `account_manager validator` subcommand.
- Add test for the validator_client alias.
## Issue Addressed
NA
## Proposed Changes
This PR extends #3018 to address my review comments there and add automated integration tests with Geth (and other implementations, in the future).
I've also de-duplicated the "unused port" logic by creating an `common/unused_port` crate.
## Additional Info
I'm not sure if we want to merge this PR, or update #3018 and merge that. I don't mind, I'm primarily opening this PR to make sure CI works.
Co-authored-by: Mark Mackey <mark@sigmaprime.io>
## Issue Addressed
#2883
## Proposed Changes
* Added `suggested-fee-recipient` & `suggested-fee-recipient-file` flags to validator client (similar to graffiti / graffiti-file implementation).
* Added proposer preparation service to VC, which sends the fee-recipient of all known validators to the BN via [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api once per slot
* Added [/eth/v1/validator/prepare_beacon_proposer](https://github.com/ethereum/beacon-APIs/pull/178) api endpoint and preparation data caching
* Added cleanup routine to remove cached proposer preparations when not updated for 2 epochs
## Additional Info
Changed the Implementation following the discussion in #2883.
Co-authored-by: pk910 <philipp@pk910.de>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
Co-authored-by: Philipp K <philipp@pk910.de>
## Proposed Changes
Add a new hardcoded spec for the Gnosis Beacon Chain.
Ideally, official Lighthouse executables will be able to connect to the gnosis beacon chain from now on, using `--network gnosis` CLI option.
## Issue Addressed
NA
## Proposed Changes
- Bump Lighthouse version to v2.1.1
- Update `thread_local` from v1.1.3 to v1.1.4 to address https://rustsec.org/advisories/RUSTSEC-2022-0006
## Additional Info
- ~~Blocked on #2950~~
- ~~Blocked on #2952~~
## Proposed Changes
Change the canonical fork name for the merge to Bellatrix. Keep other merge naming the same to avoid churn.
I've also fixed and enabled the `fork` and `transition` tests for Bellatrix, and the v1.1.7 fork choice tests.
Additionally, the `BellatrixPreset` has been added with tests. It gets served via the `/config/spec` API endpoint along with the other presets.
## Issue Addressed
The fee-recipient argument of the beacon node does not allow a value to be specified:
> $ lighthouse beacon_node --merge --fee-recipient "0x332E43696A505EF45b9319973785F837ce5267b9"
> error: Found argument '0x332E43696A505EF45b9319973785F837ce5267b9' which wasn't expected, or isn't valid in this context
>
> USAGE:
> lighthouse beacon_node --fee-recipient --merge
>
> For more information try --help
## Proposed Changes
Allow specifying a value for the fee-recipient argument in beacon_node/src/cli.rs
## Additional Info
I've added .takes_value(true) and successfully proposed a block in the kintsugi testnet with my own fee-recipient address instead of the hardcoded default. I think that was just missed as the argument does not make sense without a value :)
Co-authored-by: pk910 <philipp@pk910.de>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Resolves#2854
## Proposed Changes
If validator was imported first without entering password and then imported again with valid password update the password in validator_definitions.yml
## Additional Info
There can be other cases for updating existing validator during import. They are not covered here.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
Closes#2286Closes#2538Closes#2342
## Proposed Changes
Part II of major slasher optimisations after #2767
These changes will be backwards-incompatible due to the move to MDBX (and the schema change) 😱
* [x] Shrink attester keys from 16 bytes to 7 bytes.
* [x] Shrink attester records from 64 bytes to 6 bytes.
* [x] Separate `DiskConfig` from regular `Config`.
* [x] Add configuration for the LRU cache size.
* [x] Add a "migration" that deletes any legacy LMDB database.
## Issue Addressed
Resolves: https://github.com/sigp/lighthouse/issues/2741
Includes: https://github.com/sigp/lighthouse/pull/2853 so that we can get ssz static tests passing here on v1.1.6. If we want to merge that first, we can make this diff slightly smaller
## Proposed Changes
- Changes the `justified_epoch` and `finalized_epoch` in the `ProtoArrayNode` each to an `Option<Checkpoint>`. The `Option` is necessary only for the migration, so not ideal. But does allow us to add a default logic to `None` on these fields during the database migration.
- Adds a database migration from a legacy fork choice struct to the new one, search for all necessary block roots in fork choice by iterating through blocks in the db.
- updates related to https://github.com/ethereum/consensus-specs/pull/2727
- We will have to update the persisted forkchoice to make sure the justified checkpoint stored is correct according to the updated fork choice logic. This boils down to setting the forkchoice store's justified checkpoint to the justified checkpoint of the block that advanced the finalized checkpoint to the current one.
- AFAICT there's no migration steps necessary for the update to allow applying attestations from prior blocks, but would appreciate confirmation on that
- I updated the consensus spec tests to v1.1.6 here, but they will fail until we also implement the proposer score boost updates. I confirmed that the previously failing scenario `new_finalized_slot_is_justified_checkpoint_ancestor` will now pass after the boost updates, but haven't confirmed _all_ tests will pass because I just quickly stubbed out the proposer boost test scenario formatting.
- This PR now also includes proposer boosting https://github.com/ethereum/consensus-specs/pull/2730
## Additional Info
I realized checking justified and finalized roots in fork choice makes it more likely that we trigger this bug: https://github.com/ethereum/consensus-specs/pull/2727
It's possible the combination of justified checkpoint and finalized checkpoint in the forkchoice store is different from in any block in fork choice. So when trying to startup our store's justified checkpoint seems invalid to the rest of fork choice (but it should be valid). When this happens we get an `InvalidBestNode` error and fail to start up. So I'm including that bugfix in this branch.
Todo:
- [x] Fix fork choice tests
- [x] Self review
- [x] Add fix for https://github.com/ethereum/consensus-specs/pull/2727
- [x] Rebase onto Kintusgi
- [x] Fix `num_active_validators` calculation as @michaelsproul pointed out
- [x] Clean up db migrations
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
Closes#1996
## Proposed Changes
Run a second `Logger` via `sloggers` which logs to a file in the background with:
- separate `debug-level` for background and terminal logging
- the ability to limit log size
- rotation through a customizable number of log files
- an option to compress old log files (`.gz` format)
Add the following new CLI flags:
- `--logfile-debug-level`: The debug level of the log files
- `--logfile-max-size`: The maximum size of each log file
- `--logfile-max-number`: The number of old log files to store
- `--logfile-compress`: Whether to compress old log files
By default background logging uses the `debug` log level and saves logfiles to:
- Beacon Node: `$HOME/.lighthouse/$network/beacon/logs/beacon.log`
- Validator Client: `$HOME/.lighthouse/$network/validators/logs/validator.log`
Or, when using the `--datadir` flag:
`$datadir/beacon/logs/beacon.log` and `$datadir/validators/logs/validator.log`
Once rotated, old logs are stored like so: `beacon.log.1`, `beacon.log.2` etc.
> Note: `beacon.log.1` is always newer than `beacon.log.2`.
## Additional Info
Currently the default value of `--logfile-max-size` is 200 (MB) and `--logfile-max-number` is 5.
This means that the maximum storage space that the logs will take up by default is 1.2GB.
(200MB x 5 from old log files + <200MB the current logfile being written to)
Happy to adjust these default values to whatever people think is appropriate.
It's also worth noting that when logging to a file, we lose our custom `slog` formatting. This means the logfile logs look like this:
```
Oct 27 16:02:50.305 INFO Lighthouse started, version: Lighthouse/v2.0.1-8edd9d4+, module: lighthouse:413
Oct 27 16:02:50.305 INFO Configured for network, name: prater, module: lighthouse:414
```
## Issue Addressed
Resolves#2602
## Proposed Changes
*Note: For a review it might help to look at the individual commits.*
### `boot_node`
Add support for the flags `dump-config` and `immediate-shutdown`. For `immediate-shutdown` the actual behavior could be described as `dump-config-and-exit`.
Both flags are handled in `boot_node::main`, which appears to be the simplest approach.
### `boot_node` regression tests
Added in `lighthouse/tests/boot_node.rs`.
### `CommandLineTestExec`
Factors out boilerplate related to CLI tests. It's used in the regression tests for `boot_node`, `beacon_node` and `validator_client`.
## Open TODO
Add tests for `boot_node` flags `enable-enr-auto-update` and `disable-packet-filter`. They end up in [`Discv5Config`](9ed2cba6bc/boot_node/src/config.rs (L29)), which doesn't support serde (de)serialization.
I haven't found a workaround - guidance would be appreciated.
## Issue Addressed
Closes https://github.com/sigp/lighthouse/issues/2112
Closes https://github.com/sigp/lighthouse/issues/1861
## Proposed Changes
Collect attestations by validator index in the slasher, and use the magic of reference counting to automatically discard redundant attestations. This results in us storing only 1-2% of the attestations observed when subscribed to all subnets, which carries over to a 50-100x reduction in data stored 🎉
## Additional Info
There's some nuance to the configuration of the `slot-offset`. It has a profound effect on the effictiveness of de-duplication, see the docs added to the book for an explanation: 5442e695e5/book/src/slasher.md (slot-offset)
## Issue Addressed
The computation of metrics in the network service can be expensive. This disables the computation unless the cli flag `metrics` is set.
## Additional Info
Metrics in other parts of the network are still updated, since most are simple metrics and checking if metrics are enabled each time each metric is updated doesn't seem like a gain.
## Issue Addressed
Mitigates #1096
## Proposed Changes
Add a flag to the beacon node called `--disable-lock-timeouts` which allows opting out of lock timeouts.
The lock timeouts serve a dual purpose:
1. They prevent any single operation from hogging the lock for too long. When a timeout occurs it logs a nasty error which indicates that there's suboptimal lock use occurring, which we can then act on.
2. They allow deadlock detection. We're fairly sure there are no deadlocks left in Lighthouse anymore but the timeout locks offer a safeguard against that.
However, timeouts on locks are not without downsides:
They allow for the possibility of livelock, particularly on slower hardware. If lock timeouts keep failing spuriously the node can be prevented from making any progress, even if it would be able to make progress slowly without the timeout. One particularly concerning scenario which could occur would be if a DoS attack succeeded in slowing block signature verification times across the network, and all Lighthouse nodes got livelocked because they timed out repeatedly. This could also occur on just a subset of nodes (e.g. dual core VPSs or Raspberri Pis).
By making the behaviour runtime configurable this PR allows us to choose the behaviour we want depending on circumstance. I suspect that long term we could make the timeout-free approach the default (#2381 moves in this direction) and just enable the timeouts on our testnet nodes for debugging purposes. This PR conservatively leaves the default as-is so we can gain some more experience before switching the default.
## Description
The `eth2_libp2p` crate was originally named and designed to incorporate a simple libp2p integration into lighthouse. Since its origins the crates purpose has expanded dramatically. It now houses a lot more sophistication that is specific to lighthouse and no longer just a libp2p integration.
As of this writing it currently houses the following high-level lighthouse-specific logic:
- Lighthouse's implementation of the eth2 RPC protocol and specific encodings/decodings
- Integration and handling of ENRs with respect to libp2p and eth2
- Lighthouse's discovery logic, its integration with discv5 and logic about searching and handling peers.
- Lighthouse's peer manager - This is a large module handling various aspects of Lighthouse's network, such as peer scoring, handling pings and metadata, connection maintenance and recording, etc.
- Lighthouse's peer database - This is a collection of information stored for each individual peer which is specific to lighthouse. We store connection state, sync state, last seen ips and scores etc. The data stored for each peer is designed for various elements of the lighthouse code base such as syncing and the http api.
- Gossipsub scoring - This stores a collection of gossipsub 1.1 scoring mechanisms that are continuously analyssed and updated based on the ethereum 2 networks and how Lighthouse performs on these networks.
- Lighthouse specific types for managing gossipsub topics, sync status and ENR fields
- Lighthouse's network HTTP API metrics - A collection of metrics for lighthouse network monitoring
- Lighthouse's custom configuration of all networking protocols, RPC, gossipsub, discovery, identify and libp2p.
Therefore it makes sense to rename the crate to be more akin to its current purposes, simply that it manages the majority of Lighthouse's network stack. This PR renames this crate to `lighthouse_network`
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
NA
## Proposed Changes
- Update versions to `v2.0.1` in anticipation for a release early next week.
- Add `--ignore` to `cargo audit`. See #2727.
## Additional Info
NA
## Issue Addressed
Resolves#2262
## Proposed Changes
Add a new CLI flag `--beacon-nodes-tls-certs` which allows the user to specify a path to a certificate file (or a list of files, separated by commas). The VC will then use these certificates (in addition to the existing certificates in the OS trust store) when connecting to a beacon node over HTTPS.
## Additional Info
This only supports certificates in PEM format.
Currently, the beacon node has no ability to serve the HTTP API over TLS.
Adding this functionality would be helpful for certain use cases, such as when you need a validator client to connect to a backup beacon node which is outside your local network, and the use of an SSH tunnel or reverse proxy would be inappropriate.
## Proposed Changes
- Add three new CLI flags to the beacon node
- `--http-enable-tls`: enables TLS
- `--http-tls-cert`: to specify the path to the certificate file
- `--http-tls-key`: to specify the path to the key file
- Update the HTTP API to optionally use `warp`'s [`TlsServer`](https://docs.rs/warp/0.3.1/warp/struct.TlsServer.html) depending on the presence of the `--http-enable-tls` flag
- Update tests and docs
- Use a custom branch for `warp` to ensure proper error handling
## Additional Info
Serving the API over TLS should currently be considered experimental. The reason for this is that it uses code from an [unmerged PR](https://github.com/seanmonstar/warp/pull/717). This commit provides the `try_bind_with_graceful_shutdown` method to `warp`, which is helpful for controlling error flow when the TLS configuration is invalid (cert/key files don't exist, incorrect permissions, etc).
I've implemented the same code in my [branch here](https://github.com/macladson/warp/tree/tls).
Once the code has been reviewed and merged upstream into `warp`, we can remove the dependency on my branch and the feature can be considered more stable.
Currently, the private key file must not be password-protected in order to be read into Lighthouse.
## Proposed Changes
Cut the first release candidate for v2.0.0, in preparation for testing and release this week
## Additional Info
Builds on #2632, which should either be merged first or in the same batch
## Issue Addressed
Closes#1891Closes#1784
## Proposed Changes
Implement checkpoint sync for Lighthouse, enabling it to start from a weak subjectivity checkpoint.
## Additional Info
- [x] Return unavailable status for out-of-range blocks requested by peers (#2561)
- [x] Implement sync daemon for fetching historical blocks (#2561)
- [x] Verify chain hashes (either in `historical_blocks.rs` or the calling module)
- [x] Consistency check for initial block + state
- [x] Fetch the initial state and block from a beacon node HTTP endpoint
- [x] Don't crash fetching beacon states by slot from the API
- [x] Background service for state reconstruction, triggered by CLI flag or API call.
Considered out of scope for this PR:
- Drop the requirement to provide the `--checkpoint-block` (this would require some pretty heavy refactoring of block verification)
Co-authored-by: Diva M <divma@protonmail.com>
Sigma Prime is transitioning our mainnet bootnodes and this PR represents the transition of our bootnodes.
After a few releases, old boot-nodes will be deprecated.
## Proposed Changes
* Modify the `TaskExecutor` so that it spawns a "monitor" future for each future spawned by `spawn` or `spawn_blocking`. This monitor future joins the handle of the child future and shuts down the executor if it detects a panic.
* Enable backtraces by default by setting the environment variable `RUST_BACKTRACE`.
* Spawn the `ProductionBeaconNode` on the `TaskExecutor` so that if a panic occurs during start-up it will take down the whole process. Previously we were using a raw Tokio `spawn`, but I can't see any reason not to use the executor (perhaps someone else can).
## Additional Info
I considered using [`std::panic::set_hook`](https://doc.rust-lang.org/std/panic/fn.set_hook.html) to instantiate a custom panic handler, however this doesn't allow us to send a shutdown signal because `Fn` functions can't move variables (i.e. the shutdown sender) out of their environment. This also prevents it from receiving a `Logger`. Hence I decided to leave the panic handler untouched, but with backtraces turned on by default.
I did a run through the code base with all the raw Tokio spawn functions disallowed by Clippy, and found only two instances where we bypass the `TaskExecutor`: the HTTP API and `InitializedValidators` in the VC. In both places we use `spawn_blocking` and handle the return value, so I figured that was OK for now.
In terms of performance I think the overhead should be minimal. The monitor tasks will just get parked by the executor until their child resolves.
I've checked that this covers Discv5, as the `TaskExecutor` gets injected into Discv5 here: f9bba92db3/beacon_node/src/lib.rs (L125-L126)
## Proposed Changes
This PR deletes all `remote_signer` code from Lighthouse, for the following reasons:
* The `remote_signer` code is unused, and we have no plans to use it now that we're moving to supporting the Web3Signer APIs: #2522
* It represents a significant maintenance burden. The HTTP API tests have been prone to platform-specific failures, and breakages due to dependency upgrades, e.g. #2400.
Although the code is deleted it remains in the Git history should we ever want to recover it. For ease of reference:
- The last commit containing remote signer code: 5a3bcd2904
- The last Lighthouse version: v1.5.1
## Proposed Changes
Remove the SIGPIPE handler added in #2486.
We saw some of the testnet nodes running under `systemd` being stopped due to `journald` restarts. The systemd docs state:
> If systemd-journald.service is stopped, the stream connections associated with all services are terminated. Further writes to those streams by the service will result in EPIPE errors. In order to react gracefully in this case it is recommended that programs logging to standard output/error ignore such errors. If the SIGPIPE UNIX signal handler is not blocked or turned off, such write attempts will also result in such process signals being generated, see signal(7).
From https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html
## Additional Info
It turned out that the issue described in #2114 was due to `tee`'s behaviour rather than Lighthouse's, so the SIGPIPE handler isn't required for any current use case. An alternative to disabling it all together would be to exit with a non-zero code so that systemd knows to restart the process, but it seems more desirable to tolerate journald glitches than to restart frequently.
## Issue Addressed
Resolves#2438Resolves#2437
## Proposed Changes
Changes the permissions for validator client http server api token file and secret key to 600 from 644. Also changes the permission for logfiles generated using the `--logfile` cli option to 600.
Logs the path to the api token instead of the actual api token. Updates docs to reflect the change.
## Issue Addressed
Related to: #2259
Made an attempt at all the necessary updates here to publish the crates to crates.io. I incremented the minor versions on all the crates that have been previously published. We still might run into some issues as we try to publish because I'm not able to test this out but I think it's a good starting point.
## Proposed Changes
- Add description and license to `ssz_types` and `serde_util`
- rename `serde_util` to `eth2_serde_util`
- increment minor versions
- remove path dependencies
- remove patch dependencies
## Additional Info
Crates published:
- [x] `tree_hash` -- need to publish `tree_hash_derive` and `eth2_hashing` first
- [x] `eth2_ssz_types` -- need to publish `eth2_serde_util` first
- [x] `tree_hash_derive`
- [x] `eth2_ssz`
- [x] `eth2_ssz_derive`
- [x] `eth2_serde_util`
- [x] `eth2_hashing`
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
NA
## Proposed Changes
This PR adds some more fancy macro magic to make it easier to add a new built-in (aka "baked-in") testnet config to the `lighthouse` binary.
Previously, a user needed to modify several files and repeat themselves several times. Now, they only need to add a single definition in the `eth2_config` crate. No repetition 🎉
## Issue Addressed
Resolves#2033
## Proposed Changes
Adds a flag to enable shutting down beacon node right after sync is completed.
## Additional Info
Will need modification after weak subjectivity sync is enabled to change definition of a fully synced node.
## Issue Addressed
Resolves#2114
Swapped out the ctrlc crate for tokio signals to hook register handlers for SIGPIPE and SIGHUP along with SIGTERM and SIGINT.
## Proposed Changes
- Swap out the ctrlc crate for tokio signals for unix signal handing
- Register signals for SIGPIPE and SHIGUP that trigger the same shutdown procedure as SIGTERM and SIGINT
## Additional Info
I tested these changes against the examples in the original issue and noticed some interesting behavior on my machine. When running `lighthouse bn --network pyrmont |& tee -a pyrmont_bn.log` or `lighthouse bn --network pyrmont 2>&1 | tee -a pyrmont_bn.log` none of the above signals are sent to the lighthouse program in a way I was able to observe.
The only time it seems that the signal gets sent to the lighthouse program is if there is no redirection of stderr to stdout. I'm not as familiar with the details of how unix signals work in linux with a redirect like that so I'm not sure if this is a bug in the program or expected behavior.
Signals are correctly received without the redirection and if the above signals are sent directly to the program with something like `kill`.
## Issue Addressed
NA
## Proposed Changes
- Version bump
- Increase queue sizes for aggregated attestations and re-queued attestations.
## Additional Info
NA
## Proposed Changes
* Consolidate Tokio versions: everything now uses the latest v1.10.0, no more `tokio-compat`.
* Many semver-compatible changes via `cargo update`. Notably this upgrades from the yanked v0.8.0 version of crossbeam-deque which is present in v1.5.0-rc.0
* Many semver incompatible upgrades via `cargo upgrades` and `cargo upgrade --workspace pkg_name`. Notable ommissions:
- Prometheus, to be handled separately: https://github.com/sigp/lighthouse/issues/2485
- `rand`, `rand_xorshift`: the libsecp256k1 package requires 0.7.x, so we'll stick with that for now
- `ethereum-types` is pinned at 0.11.0 because that's what `web3` is using and it seems nice to have just a single version
## Additional Info
We still have two versions of `libp2p-core` due to `discv5` depending on the v0.29.0 release rather than `master`. AFAIK it should be OK to release in this state (cc @AgeManning )
## Issue Addressed
NA
## Proposed Changes
- Bump to `v1.5.0-rc.0`.
- Increase attestation reprocessing queue size (I saw this filling up on Prater).
- Reduce error log for full attn reprocessing queue to warn.
## TODO
- [x] Manual testing
- [x] Resolve https://github.com/sigp/lighthouse/pull/2493
- [x] Include https://github.com/sigp/lighthouse/pull/2501
## Issue Addressed
Resolves#2069
## Proposed Changes
- Adds a `--doppelganger-detection` flag
- Adds a `lighthouse/seen_validators` endpoint, which will make it so the lighthouse VC is not interopable with other client beacon nodes if the `--doppelganger-detection` flag is used, but hopefully this will become standardized. Relevant Eth2 API repo issue: https://github.com/ethereum/eth2.0-APIs/issues/64
- If the `--doppelganger-detection` flag is used, the VC will wait until the beacon node is synced, and then wait an additional 2 epochs. The reason for this is to make sure the beacon node is able to subscribe to the subnets our validators should be attesting on. I think an alternative would be to have the beacon node subscribe to all subnets for 2+ epochs on startup by default.
## Additional Info
I'd like to add tests and would appreciate feedback.
TODO: handle validators started via the API, potentially make this default behavior
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
N/A
## Proposed Changes
- Removing a bunch of unnecessary references
- Updated `Error::VariantError` to `Error::Variant`
- There were additional enum variant lints that I ignored, because I thought our variant names were fine
- removed `MonitoredValidator`'s `pubkey` field, because I couldn't find it used anywhere. It looks like we just use the string version of the pubkey (the `id` field) if there is no index
## Additional Info
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
#2454
## Proposed Changes
Adds the `--http-address` flag to allow the user to use custom HTTP addresses. This can be helpful for certain Docker setups.
Since using custom HTTP addresses is unsafe due to the server being unencrypted, `--unencrypted-http-transport` was also added as a safety flag and must be used in tandem with `--http-address`. This is to ensure the user is aware of the risks associated with using non-local HTTP addresses.
## Proposed Changes
Modify the SHA256 implementation in `eth2_hashing` so that it switches between `ring` and `sha2` to take advantage of [x86_64 SHA extensions](https://en.wikipedia.org/wiki/Intel_SHA_extensions). The extensions are available on modern Intel and AMD CPUs, and seem to provide a considerable speed-up: on my Ryzen 5950X it dropped state tree hashing times by about 30% from 35ms to 25ms (on Prater).
## Additional Info
The extensions became available in the `sha2` crate [last year](https://www.reddit.com/r/rust/comments/hf2vcx/ann_rustcryptos_sha1_and_sha2_now_support/), and are not available in Ring, which uses a [pure Rust implementation of sha2](https://github.com/briansmith/ring/blob/main/src/digest/sha2.rs). Ring is faster on CPUs that lack the extensions so I've implemented a runtime switch to use `sha2` only when the extensions are available. The runtime switching seems to impose a miniscule penalty (see the benchmarks linked below).
## Issue Addressed
Resolves#2313
## Proposed Changes
Provide `BeaconNodeHttpClient` with a dedicated `Timeouts` struct.
This will allow granular adjustment of the timeout duration for different calls made from the VC to the BN. These can either be a constant value, or as a ratio of the slot duration.
Improve timeout performance by using these adjusted timeout duration's only whenever a fallback endpoint is available.
Add a CLI flag called `use-long-timeouts` to revert to the old behavior.
## Additional Info
Additionally set the default `BeaconNodeHttpClient` timeouts to the be the slot duration of the network, rather than a constant 12 seconds. This will allow it to adjust to different network specifications.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Proposed Changes
Implement the consensus changes necessary for the upcoming Altair hard fork.
## Additional Info
This is quite a heavy refactor, with pivotal types like the `BeaconState` and `BeaconBlock` changing from structs to enums. This ripples through the whole codebase with field accesses changing to methods, e.g. `state.slot` => `state.slot()`.
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
#2225
## Proposed Changes
Exposes the version given from the `lighthouse_version` crate to the Prometheus metrics server.
## Additional Info
- This metric appears in both the Beacon Node and Validator Client metrics servers.
- This is the simplest solution. It might be better to include the version and commit hash as separate labels rather than combined, however this would be more involved. Happy to do it that way if this is too cumbersome to use.
- The metric appears as:
```
# HELP lighthouse_info The build of Lighthouse running on the server
# TYPE lighthouse_info gauge
lighthouse_info{version="Lighthouse/v1.4.0-379664a+"} 1
```
## Issue Addressed
Closes#1661
## Proposed Changes
Add a dummy package called `target_check` which gets compiled early in the build and fails if the target is 32-bit
## Additional Info
You can test the efficacy of this check with:
```
cross build --release --manifest-path lighthouse/Cargo.toml --target i686-unknown-linux-gnu
```
In which case this compilation error is shown:
```
error: Lighthouse requires a 64-bit CPU and operating system
--> common/target_check/src/lib.rs:8:1
|
8 | / assert_cfg!(
9 | | target_pointer_width = "64",
10 | | "Lighthouse requires a 64-bit CPU and operating system",
11 | | );
| |__^
```
## Issue Addressed
`make lint` failing on rust 1.53.0.
## Proposed Changes
1.53.0 updates
## Additional Info
I haven't figure out why yet, we were now hitting the recursion limit in a few crates. So I had to add `#![recursion_limit = "256"]` in a few places
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Resolves#2322
## Proposed Changes
Adds a `modify` command to `lighthouse account validator` with subcommands to enable and disable specific or all pubkeys.
## Issue Addressed
NA
## Proposed Changes
Only run `configure_memory_alllocator` for the BN process.
I noticed that VC memory usage increases significantly with the new malloc tuning parameters. This was also raised by a user on [r/ethstaker](https://www.reddit.com/r/ethstaker/comments/nr8998/lighthouse_prerelease_v140rc0/h0fnt9l?utm_source=share&utm_medium=web2x&context=3).
There wasn't any issue with memory usage by the VC before we implemented #2296, so I think we were a bit overzealous when we allowed these changes to affect it. This PR allows things that weren't broken to remain unfixed.
## Additional Info
NA
## Issue Addressed
NA
## Proposed Changes
Bump versions.
## Additional Info
This is not exactly the v1.4.0 release described in [Lighthouse Update #36](https://lighthouse.sigmaprime.io/update-36.html).
Whilst it contains:
- Beta Windows support
- A reduction in Eth1 queries
- A reduction in memory footprint
It does not contain:
- Altair
- Doppelganger Protection
- The remote signer
We have decided to release some features early. This is primarily due to the desire to allow users to benefit from the memory saving improvements as soon as possible.
## TODO
- [x] Wait for #2340, #2356 and #2376 to merge and then rebase on `unstable`.
- [x] Ensure discovery issues are fixed (see #2388)
- [x] Ensure https://github.com/sigp/lighthouse/pull/2382 is merged/removed.
- [x] Ensure https://github.com/sigp/lighthouse/pull/2383 is merged/removed.
- [x] Ensure https://github.com/sigp/lighthouse/pull/2384 is merged/removed.
- [ ] Double-check eth1 cache is carried between boots
## Issue Addressed
NA
## Proposed Changes
Modify the configuration of [GNU malloc](https://www.gnu.org/software/libc/manual/html_node/The-GNU-Allocator.html) to reduce memory footprint.
- Set `M_ARENA_MAX` to 4.
- This reduces memory fragmentation at the cost of contention between threads.
- Set `M_MMAP_THRESHOLD` to 2mb
- This means that any allocation >= 2mb is allocated via an anonymous mmap, instead of on the heap/arena. This reduces memory fragmentation since we don't need to keep growing the heap to find big contiguous slabs of free memory.
- ~~Run `malloc_trim` every 60 seconds.~~
- ~~This shaves unused memory from the top of the heap, preventing the heap from constantly growing.~~
- Removed, see: https://github.com/sigp/lighthouse/pull/2299#issuecomment-825322646
*Note: this only provides memory savings on the Linux (glibc) platform.*
## Additional Info
I'm going to close#2288 in favor of this for the following reasons:
- I've managed to get the memory footprint *smaller* here than with jemalloc.
- This PR seems to be less of a dramatic change than bringing in the jemalloc dep.
- The changes in this PR are strictly runtime changes, so we can create CLI flags which disable them completely. Since this change is wide-reaching and complex, it's nice to have an easy "escape hatch" if there are undesired consequences.
## TODO
- [x] Allow configuration via CLI flags
- [x] Test on Mac
- [x] Test on RasPi.
- [x] Determine if GNU malloc is present?
- I'm not quite sure how to detect for glibc.. This issue suggests we can't really: https://github.com/rust-lang/rust/issues/33244
- [x] Make a clear argument regarding the affect of this on CPU utilization.
- [x] Test with higher `M_ARENA_MAX` values.
- [x] Test with longer trim intervals
- [x] Add some stats about memory savings
- [x] Remove `malloc_trim` calls & code
## Issue Addressed
Windows incompatibility.
## Proposed Changes
On windows, lighthouse needs to default to STDIN as tty doesn't exist. Also Windows uses ACLs for file permissions. So to mirror chmod 600, we will remove every entry in a file's ACL and add only a single SID that is an alias for the file owner.
Beyond that, there were several changes made to different unit tests because windows has slightly different error messages as well as frustrating nuances around killing a process :/
## Additional Info
Tested on my Windows VM and it appears to work, also compiled & tested on Linux with these changes. Permissions look correct on both platforms now. Just waiting for my validator to activate on Prater so I can test running full validator client on windows.
Co-authored-by: ethDreamer <37123614+ethDreamer@users.noreply.github.com>
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
N/A
## Proposed Changes
Add unit tests for the various CLI flags associated with the beacon node and validator client. These changes require the addition of two new flags: `dump-config` and `immediate-shutdown`.
## Additional Info
Both `dump-config` and `immediate-shutdown` are marked as hidden since they should only be used in testing and other advanced use cases.
**Note:** This requires changing `main.rs` so that the flags can adjust the program behavior as necessary.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
NA
## Proposed Changes
Bump versions.
## Additional Info
This is a minor release (not patch) due to the very slight change introduced by #2291.
## Issue Addressed
Closes#2052
## Proposed Changes
- Refactor the attester/proposer duties endpoints in the BN
- Performance improvements
- Fixes some potential inconsistencies with the dependent root fields.
- Removes `http_api::beacon_proposer_cache` and just uses the one on the `BeaconChain` instead.
- Move the code for the proposer/attester duties endpoints into separate files, for readability.
- Refactor the `DutiesService` in the VC
- Required to reduce the delay on broadcasting new blocks.
- Gets rid of the `ValidatorDuty` shim struct that came about when we adopted the standard API.
- Separate block/attestation duty tasks so that they don't block each other when one is slow.
- In the VC, use `PublicKeyBytes` to represent validators instead of `PublicKey`. `PublicKey` is a legit crypto object whilst `PublicKeyBytes` is just a byte-array, it's much faster to clone/hash `PublicKeyBytes` and this change has had a significant impact on runtimes.
- Unfortunately this has created lots of dust changes.
- In the BN, store `PublicKeyBytes` in the `beacon_proposer_cache` and allow access to them. The HTTP API always sends `PublicKeyBytes` over the wire and the conversion from `PublicKey` -> `PublickeyBytes` is non-trivial, especially when queries have 100s/1000s of validators (like Pyrmont).
- Add the `state_processing::state_advance` mod which dedups a lot of the "apply `n` skip slots to the state" code.
- This also fixes a bug with some functions which were failing to include a state root as per [this comment](072695284f/consensus/state_processing/src/state_advance.rs (L69-L74)). I couldn't find any instance of this bug that resulted in anything more severe than keying a shuffling cache by the wrong block root.
- Swap the VC block service to use `mpsc` from `tokio` instead of `futures`. This is consistent with the rest of the code base.
~~This PR *reduces* the size of the codebase 🎉~~ It *used* to reduce the size of the code base before I added more comments.
## Observations on Prymont
- Proposer duties times down from peaks of 450ms to consistent <1ms.
- Current epoch attester duties times down from >1s peaks to a consistent 20-30ms.
- Block production down from +600ms to 100-200ms.
## Additional Info
- ~~Blocked on #2241~~
- ~~Blocked on #2234~~
## TODO
- [x] ~~Refactor this into some smaller PRs?~~ Leaving this as-is for now.
- [x] Address `per_slot_processing` roots.
- [x] Investigate slow next epoch times. Not getting added to cache on block processing?
- [x] Consider [this](072695284f/beacon_node/store/src/hot_cold_store.rs (L811-L812)) in the scenario of replacing the state roots
Co-authored-by: pawan <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
#2224
## Proposed Changes
Add a `--password-file` option to the `lighthouse account validator import` command. The flag requires `--reuse-password` and will copy the password over to the `validator_definitions.yml` file. I used #2070 as a guide for validating the password as UTF-8 and stripping newlines.
## Additional Info
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
Resolves#1788
## Proposed Changes
As per #1788, expose the time at which the process started via the `process_start_time_seconds` Prometheus metric. This will help users track uptime.
## Additional Info
NA
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
Resolves#1944
## Proposed Changes
Adds a "graffiti" key to the `validator_definitions.yml`. Setting the key will override anything passed through the validator `--graffiti` flag.
Returns an error if the value for the graffiti key is > 32 bytes instead of silently truncating.
## Issue Addressed
resolves#2129resolves#2099
addresses some of #1712
unblocks #2076
unblocks #2153
## Proposed Changes
- Updates all the dependencies mentioned in #2129, except for web3. They haven't merged their tokio 1.0 update because they are waiting on some dependencies of their own. Since we only use web3 in tests, I think updating it in a separate issue is fine. If they are able to merge soon though, I can update in this PR.
- Updates `tokio_util` to 0.6.2 and `bytes` to 1.0.1.
- We haven't made a discv5 release since merging tokio 1.0 updates so I'm using a commit rather than release atm. **Edit:** I think we should merge an update of `tokio_util` to 0.6.2 into discv5 before this release because it has panic fixes in `DelayQueue` --> PR in discv5: https://github.com/sigp/discv5/pull/58
## Additional Info
tokio 1.0 changes that required some changes in lighthouse:
- `interval.next().await.is_some()` -> `interval.tick().await`
- `sleep` future is now `!Unpin` -> https://github.com/tokio-rs/tokio/issues/3028
- `try_recv` has been temporarily removed from `mpsc` -> https://github.com/tokio-rs/tokio/issues/3350
- stream features have moved to `tokio-stream` and `broadcast::Receiver::into_stream()` has been temporarily removed -> `https://github.com/tokio-rs/tokio/issues/2870
- I've copied over the `BroadcastStream` wrapper from this PR, but can update to use `tokio-stream` once it's merged https://github.com/tokio-rs/tokio/pull/3384
Co-authored-by: realbigsean <seananderson33@gmail.com>
## Issue Addressed
NA
## Proposed Changes
Copied from #2083, changes the config milliseconds_per_slot to seconds_per_slot to avoid errors when slot duration is not a multiple of a second. To avoid deserializing old serialized data (with milliseconds instead of seconds) the Serialize and Deserialize derive got removed from the Spec struct (isn't currently used anyway).
This PR replaces #2083 for the purpose of fixing a merge conflict without requiring the input of @blacktemplar.
## Additional Info
NA
Co-authored-by: blacktemplar <blacktemplar@a1.net>
## Proposed Changes
Run cargo update and bump version in prep for v1.0.4 release
## Additional Info
Planning to merge this commit to `unstable`, test on Pyrmont and canary nodes, then push to `stable`.
## Issue Addressed
Closes#1264
## Proposed Changes
* Milagro BLS: tweak the feature flags so that Milagro doesn't get compiled if we're using BLST. Profiling showed that it was consuming about 1 minute of CPU time out of 60 minutes of CPU time (real time ~15 mins). A 1.6% saving.
* Reduce monomorphization: compiling for 3 different `EthSpec` types causes a heck of a lot of generic functions to be instantiated (monomorphized). Removing 2 of 3 cuts the LLVM+linking step from around 250 seconds to 180 seconds, a saving of 70 seconds (real time!). This applies only to `make` and not the CI build, because we test with the minimal spec on CI.
* Update `web3` crate to v0.13. This is perhaps the most controversial change, because it requires axing some deposit contract tools from `lcli`. I suspect these tools weren't used much anyway, and could be maintained separately, but I'm also happy to revert this change. However, it does save us a lot of compile time. With #1839, we now have 3 versions of Tokio (and all of Tokio's deps). This change brings us down to 2 versions, but 1 should be achievable once web3 (and reqwest) move to Tokio 0.3.
* Remove `lcli` from the Docker image. It's a dev tool and can be built from the repo if required.
## Issue Addressed
#1992 and #1987, and also to be considered a continuation of #1751
## Proposed Changes
many changed files but most are renaming to align the code with the semantics of `--network`
- remove the `--network` default value (in clap) and instead set it after checking the `network` and `testnet-dir` flags
- move `eth2_testnet_config` crate to `eth2_network_config`
- move `Eth2TestnetConfig` to `Eth2NetworkConfig`
- move `DEFAULT_HARDCODED_TESTNET` to `DEFAULT_HARDCODED_NETWORK`
- `beacon_node`s `get_eth2_testnet_config` loads the `DEFAULT_HARDCODED_NETWORK` if there is no network nor testnet provided
- `boot_node`s config loads the config same as the `beacon_node`, it was using the configuration only for preconfigured networks (That code is ~1year old so I asume it was not intended)
- removed a one year old comment stating we should try to emulate `https://github.com/eth2-clients/eth2-testnets/tree/master/nimbus/testnet1` it looks outdated (?)
- remove `lighthouse`s `load_testnet_config` in favor of `get_eth2_network_config` to centralize that logic (It had differences)
- some spelling
## Additional Info
Both the command of #1992 and the scripts of #1987 seem to work fine, same as `bn` and `vc`
## Issue Addressed
NA
## Proposed Changes
Updates out of date dependencies.
## Additional Info
See also https://github.com/sigp/lighthouse/issues/1712 for a list of dependencies that are still out of date and the resasons.
Update lighthouse to version `v1.0.2`.
There are two major updates in this version:
- Updates to the task executor to tokio 0.3 and all sub-dependencies relying on core execution, including libp2p
- Update BLST
## Description
This PR updates Lighthouse to tokio 0.3. It includes a number of dependency updates and some structural changes as to how we create and spawn tasks.
This also brings with it a number of various improvements:
- Discv5 update
- Libp2p update
- Fix for recompilation issues
- Improved UPnP port mapping handling
- Futures dependency update
- Log downgrade to traces for rejecting peers when we've reached our max
Co-authored-by: blacktemplar <blacktemplar@a1.net>
## Issue Addressed
NA
## Proposed Changes
- Replace Zinken with Pyrmont (Zinken has been sun-setted).
- Ensure Mainnet is build in the build script.
## Additional Info
NA
## Issue Addressed
NA
## Proposed Changes
Adds support for the [Toledo](https://github.com/protolambda/toledo) dev-net.
```bash
lighthouse --testnet toledo bn --http
```
This is for development only, we do not recommend users to join this testnet.
## Additional Info
- ~~Blocked on #1862~~
The remote signer relies on the `types` and `crypto/bls` crates from Lighthouse. Moreover, a number of tests of the remote signer consumption of LH leverages this very signer, making any important update a potential dependency nightmare.
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
Closes#1842
## Proposed Changes
Due to the lies told to us by VPS providers about what CPU features they support, we are forced to check for the availability of CPU features like ADX by just _running code and seeing if it crashes_. The prominent warning should hopefully help users who have truly incompatible CPUs work out what is going on, while not burdening users of cheap VPSs.
## Issue Addressed
Closes#1504Closes#1505
Replaces #1703Closes#1707
## Proposed Changes
* Update BLST and Milagro to versions compatible with BLSv4 spec
* Update Lighthouse to spec v1.0.0-rc.0, and update EF test vectors
* Use the v1.0.0 constants for `MainnetEthSpec`.
* Rename `InteropEthSpec` -> `V012LegacyEthSpec`
* Change all constants to suit the mainnet `v0.12.3` specification (i.e., Medalla).
* Deprecate the `--spec` flag for the `lighthouse` binary
* This value is now obtained from the `config_name` field of the `YamlConfig`.
* Built in testnet YAML files have been updated.
* Ignore the `--spec` value, if supplied, log a warning that it will be deprecated
* `lcli` still has the spec flag, that's fine because it's dev tooling.
* Remove the `E: EthSpec` from `YamlConfig`
* This means we need to deser the genesis `BeaconState` on-demand, but this is fine.
* Swap the old "minimal", "mainnet" strings over to the new `EthSpecId` enum.
* Always require a `CONFIG_NAME` field in `YamlConfig` (it used to have a default).
## Additional Info
Lots of breaking changes, do not merge! ~~We will likely need a Lighthouse v0.4.0 branch, and possibly a long-term v0.3.0 branch to keep Medalla alive~~.
Co-authored-by: Kirk Baird <baird.k@outlook.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
N/A
## Proposed Changes
Minor doc fixes. Adds a section on custom data directories.
Co-authored-by: Michael Sproul <micsproul@gmail.com>
## Issue Addressed
- Resolves#1722
## Proposed Changes
This extends @danielschonfeld's work in #1739 with:
- Use an empty boot node list
- Remove the genesis state
## Additional Info
NA
Co-authored-by: Daniel Schonfeld <daniel@schonfeld.org>
## Issue Addressed
- Resolves#1706
## Proposed Changes
Updates dependencies across the workspace. Any crate that was not able to be brought to the latest version is listed in #1712.
## Additional Info
NA
* Initial rebase
* Remove old code
* Correct release tests
* Rebase commit
* Remove eth2-testnet dep on eth2libp2p
* Remove crates lost in rebase
* Remove unused dep
This commit was modified by Paul H whilst rebasing master onto
v0.3.0-staging
Adding UPnP support will help grow the DHT by allowing NAT traversal for peers with UPnP supported routers.
Using IGD library: https://docs.rs/igd/0.10.0/igd/
Adding the the libp2p tcp port and discovery udp port. If this fails it simply logs the attempt and moves on
Co-authored-by: Age Manning <Age@AgeManning.com>
## Issue Addressed
NA
## Proposed Changes
- Implements a HTTP API for the validator client.
- Creates EIP-2335 keystores with an empty `description` field, instead of a missing `description` field. Adds option to set name.
- Be more graceful with setups without any validators (yet)
- Remove an error log when there are no validators.
- Create the `validator` dir if it doesn't exist.
- Allow building a `ValidatorDir` without a withdrawal keystore (required for the API method where we only post a voting keystore).
- Add optional `description` field to `validator_definitions.yml`
## TODO
- [x] Signature header, as per https://github.com/sigp/lighthouse/issues/1269#issuecomment-649879855
- [x] Return validator descriptions
- [x] Return deposit data
- [x] Respect the mnemonic offset
- [x] Check that mnemonic can derive returned keys
- [x] Be strict about non-localhost
- [x] Allow graceful start without any validators (+ create validator dir)
- [x] Docs final pass
- [x] Swap to EIP-2335 description field.
- [x] Fix Zerioze TODO in VC api types.
- [x] Zeroize secp256k1 key
## Endpoints
- [x] `GET /lighthouse/version`
- [x] `GET /lighthouse/health`
- [x] `GET /lighthouse/validators`
- [x] `POST /lighthouse/validators/hd`
- [x] `POST /lighthouse/validators/keystore`
- [x] `PATCH /lighthouse/validators/:validator_pubkey`
- [ ] ~~`POST /lighthouse/validators/:validator_pubkey/exit/:epoch`~~ Future works
## Additional Info
TBC
## Issue Addressed
Implements support for importing and exporting the slashing protection DB interchange format described here:
https://hackmd.io/@sproul/Bk0Y0qdGD
Also closes#1584
## Proposed Changes
* [x] Support for serializing and deserializing the format
* [x] Support for importing and exporting Lighthouse's database
* [x] CLI commands to invoke import and export
* [x] Export to minimal format (required when a minimal format has been previously imported)
* [x] Tests for export to minimal (utilising mixed importing and attestation signing?)
* [x] Tests for import/export of complete format, and import of minimal format
* [x] ~~Prevent attestations with sources less than our max source (Danny's suggestion). Required for the fake attestation that we put in for the minimal format to block attestations from source 0.~~
* [x] Add the concept of a "low watermark" for compatibility with the minimal format
Bonus!
* [x] A fix to a potentially nasty bug involving validators getting re-registered each time the validator client ran! Thankfully, the ordering of keys meant that the validator IDs used for attestations and blocks remained stable -- otherwise we could have had some slashings on our hands! 😱
* [x] Tests to confirm that this bug is indeed vanquished
Closes#1487Closes#1427
Directory restructure in accordance with #1487. Also has temporary migration code to move the old directories into new structure.
Also extracts all default directory names and utility functions into a `directory` crate to avoid repetitio.
~Since `validator_definition.yaml` stores absolute paths, users will have to manually change the keystore paths or delete the file to get the validators picked up by the vc.~. `validator_definition.yaml` is migrated as well from the default directories.
Co-authored-by: realbigsean <seananderson33@gmail.com>
Co-authored-by: Paul Hauner <paul@paulhauner.com>
## Issue Addressed
- Resolves#1313
## Proposed Changes
Changes the way we start the validator client and beacon node to ensure that we cleanly drop the validator keystores (which therefore ensures we cleanup their lockfiles).
Previously we were holding the validator keystores in a tokio task that was being forcefully killed (i.e., without `Drop`). Now, we hold them in a task that can gracefully handle a shutdown.
Also, switches the `--strict-lockfiles` flag to `--delete-lockfiles`. This means two things:
1. We are now strict on lockfiles by default (before we weren't).
1. There's a simple way for people delete the lockfiles if they experience a crash.
## Additional Info
I've only given the option to ignore *and* delete lockfiles, not just ignore them. I can't see a strong need for ignore-only but could easily add it, if the need arises.
I've flagged this as `api-breaking` since users that have lockfiles lingering around will be required to supply `--delete-lockfiles` next time they run.
## Issue Addressed
Resolves#1651
## Description
This supercedes #1658. Great work was done by @pawanjay176, I just needed to make a change whilst he is away.
See #1658 for a description, prior reviews and approval by @michaelsproul.
## Additional info
Ignores a rustsec advisory. This is tracked in #1669.
Co-authored-by: pawan <pawandhananjay@gmail.com>
## Proposed Changes
Update test vectors for v0.12.3, and introduced configurable `proportional_slashing_multiplier`.
Also makes `YamlConfig` a bit safer by making every field access in `apply_to_chain_spec` explicit, and removing the `#[serde(default)]` attribute, which would instantiate missing fields to type defaults! Risky!
## Issue Addressed
#1437
## Proposed Changes
- Make the `--wallet-password` flag optional and creates an interactive prompt if not provided.
- Make the `--wallet-name` flag optional and creates an interactive prompt if not provided.
- Add a minimum password requirement of a 12 character length.
- Update the `--stdin-passwords` flag to `--stdin-inputs` because we have non-password user inputs
## Additional Info
This reverts commit 4fca306397.
Something in the BLST update is causing SIGILLs on aarch64 non-portable builds. While we debug the issue, I think it's best if we just revert the update.
## Issue Addressed
Closes#1504
Closes https://github.com/sigp/lighthouse/issues/1505
## Proposed Changes
* Update `blst` to the latest version, which is more portable and includes finer-grained compilation controls (see below).
* Detect the case where a binary has been explicitly compiled with ADX support but it's missing at runtime, and report a nicer error than `SIGILL`.
## Known Issues
* None. The previous issue with `make build-aarch64` (https://github.com/supranational/blst/issues/27), has been resolved.
## Additional Info
I think we should tweak our release process and our Docker builds so that we provide two options:
Binaries:
* `lighthouse`: compiled with `modern`/`force-adx`, for CPUs 2013 and newer
* `lighthouse-portable`: compiled with `portable` for older CPUs
Docker images:
* `sigp/lighthouse:latest`: multi-arch image with `modern` x86_64 and vanilla aarch64 binary
* `sigp/lighthouse:latest-portable`: multi-arch image with `portable` builds for x86_64 and aarch64
And relevant Docker images for the releases (as per https://github.com/sigp/lighthouse/pull/1574#issuecomment-687766141), tagged `v0.x.y` and `v0.x.y-portable`
## Issue Addressed
N/A
## Proposed Changes
- Fix a wrong command in the validator generation example.
- Replace occurrences of 'passphrase' with 'password'. This is mostly because I felt that there was a lot of mixing of the two phrases in the documentation and the actual commands which is a bit confusing. Picked 'password' everywhere because it felt more appropriate but I don't mind changing it to 'passphrase' as long it's consistent everywhere.
## Issue Addressed
#1464
## Proposed Changes
Increase the slog-async log channel size from the default of 128 to 2048 to reduce the number of dropped logs.
## Additional Info
## Issue Addressed
Fixes#1552
## Proposed Changes
Earlier, we were always loading the hardcoded default testnet config which is a mainnet spec. So running lighthouse with `--spec` option anything other than mainnet gave errors because we tried loading a mainnet genesis spec with `minimal`/`interop` flags.
This PR fixes the order of loading such that we load the hardcoded default spec only if neither `--testnet` and `--testnet-dir` flags are present.
## Issue Addressed
NA
## Proposed Changes
Shift practically all HTTP endpoint handlers to the blocking executor (some very light tasks are left on the core executor).
## Additional Info
This PR covers the `rest_api` which will soon be refactored to suit the standard API. As such, I've cut a few corners and left some existing issues open in this patch. What I have done here should leave the API in state that is not necessary *exactly* the same, but good enough for us to run validators with. Specifically, the number of blocking workers that can be spawned is unbounded and I have not implemented a queue; this will need to be fixed when we implement the standard API.
## Issue Addressed
#1494
## Proposed Changes
- Give the TaskExecutor the sender side of a channel that a task can clone to request shutting down
- The receiver side of this channel is in environment and now we block until ctrl+c or an internal shutdown signal is received
- The swarm now informs when it has reached 0 listeners
- The network receives this message and requests the shutdown
## Discovery v5 update
In this update we remove the openssl dependency in favour of rust-crypto.
The update also removes a series of unnecessary async functions which may improve some of the issues we have been experiencing.
## Issue Addressed
N/A
## Proposed Changes
Earlier, to log to a file, the only options were to redirect stdout/stderr to a file or use json logging.
Redirecting to stdout/stderr works well but causes issues with mistakenly overwriting the file instead of appending which has resulted in loss of precious logs on multiple occasions for me.
Json logging creates a timestamped backup of the file if it already exists, but the json format itself is hugely annoying.
This PR modifies the `--logfile` option to log as it does in the terminal to a logfile.
## Issue Addressed
NA
## Proposed Changes
Adds support for using the [`cross`](https://github.com/rust-embedded/cross) project to produce cross-compiled binaries using Docker images.
Provides quite clean and simple cross-compiles cause all the complexity is hidden in Dockerfiles. It does require you to be in the `docker` group though.
## Details
- Adds shortcut commands to `Makefile`
- Ensures `reqwest` and `discv5` use vendored openssl libs (i.e., static not shared).
- Switches to a [commit](284f705964) of blst that has a renamed C function to avoid a collision with openssl (upstream issue: https://github.com/supranational/blst/issues/21).
- Updates `ring` to the latest satisfiable version, since an earlier version was causing issues with `cross`.
- Off-topic, but adds extra message about Windows support as suggested by Discord user.
## Additional Info
- ~~Blocked on #1495~~
- There are no tests in CI for this yet for a few reasons:
- I'm hesitant to add more long-running tasks.
- Short-term bitrot should be avoided since we'll use it each release.
- In the long term I think it would be good to automate binary creation on a release.
- I observed the binaries increase in size from 50mb to 52mb after these changes.
## Issue Addressed
#1483
## Proposed Changes
Upgrades the log to a critical if a listener fails. We are able to listen on many interfaces so a single instance is not critical. We should however gracefully shutdown the client if we have no listeners, although the client can still function solely on outgoing connections.
For now a critical is raised and I leave #1494 for more sophisticated handling of this.
This also updates discv5 to handle errors of binding to a UDP socket such that lighthouse is now able to handle them.
## Issue Addressed
NA
## Proposed Changes
- Moves the git-based versioning we were doing into the `lighthouse_version` crate in `common`.
- Removes the `beacon_node/version` crate, replacing it with `lighthouse_version`.
- Bumps the version to `v0.2.0`.
## Additional Info
There are now two types of version string:
1. `const VERSION: &str = Lighthouse/v0.2.0-1419501f2+`
1. `version_with_platform() = Lighthouse/v0.2.0-1419501f2+/x86_64-linux`
(1) is handy cause it's a `const` and shorter. (2) has platform info so it's more useful. Note that the plus-sign (`+`) indicates the the git commit is dirty (it used to be `(modified)` but I had to shorten it to fit into graffiti).
These version strings are now included on:
- `lighthouse --version`
- `lcli --version`
- `curl localhost:5052/node/version`
- p2p messages when we communicate our version
You can update the version by changing this constant (version is not related to a `Cargo.toml`):
b9ad7102d5/common/lighthouse_version/src/lib.rs (L4-L15)
## Proposed Changes
In the continuing war against unportable binaries I figured we should have an option to enable building the Lighthouse binary itself with Milagro. This PR adds a `milagro` feature that can be used with `cargo install --path lighthouse --features milagro --force --locked`. The BLS library in-use will also show up under `lighthouse --version` like this:
```
Lighthouse 0.1.2-7d8acc20a(modified)
BLS Library: milagro
```
Future work: add other cool stuff like the compiler version and CPU target to `--version`.
## Issue Addressed
Closes#1395
## Proposed Changes
* Add a feature to `lighthouse` and `lcli` called `portable` which enables the `portable` feature on our fork of BLST. This feature turns off the `-march=native` C compiler flag that produces binaries highly targeted to the host CPU's instruction set.
* Tweak the `Makefile` so that when the `PORTABLE` environment variable is set to `true`, it compiles with this feature.
* Temporarily enable `PORTABLE=true` in the Docker build so that the image on Docker Hub is portable. Eventually I think we should enable `PORTABLE=true` _only on Docker Hub_, so that users building locally can take advantage of the tasty compiler magic. This seems to be possible by setting a Docker Hub environment variable: https://docs.docker.com/docker-hub/builds/#environment-variables-for-builds
## Additional Info
Tested by compiling on a very new CPU (Intel Core i7-8550U) and copying the binary to a very old CPU (Intel Core i3 530). Before the portability fix, this produced the SIGILL crash described in #1395, and after the fix, it worked smoothly.
I'm in the process of testing the Docker build and running some benches to confirm that the performance penalty isn't too severe.
## Issue Addressed
NA
## Proposed Changes
Allows for multiple "hardcoded" testnets.
## Additional Info
This PR is incomplete.
## TODO
- [x] Add flag to CLI, integrate with rest of Lighthouse.
Co-authored-by: Pawan Dhananjay <pawandhananjay@gmail.com>
Co-authored-by: Michael Sproul <michael@sigmaprime.io>
## Issue Addressed
NA
## Proposed Changes
- Refactor the `bls` crate to support multiple BLS "backends" (e.g., milagro, blst, etc).
- Removes some duplicate, unused code in `common/rest_types/src/validator.rs`.
- Removes the old "upgrade legacy keypairs" functionality (these were unencrypted keys that haven't been supported for a few testnets, no one should be using them anymore).
## Additional Info
Most of the files changed are just inconsequential changes to function names.
## TODO
- [x] Optimization levels
- [x] Infinity point: https://github.com/supranational/blst/issues/11
- [x] Ensure milagro *and* blst are tested via CI
- [x] What to do with unsafe code?
- [x] Test infinity point in signature sets
## Issue Addressed
https://github.com/sigp/lighthouse/issues/1177
## Proposed Changes
Add a command line option (`--http-allow-origin`) and a config item for configuring the `Access-Control-Allow-Origin` response header. This should unblock making XMLHttpRequests.
## Issue Addressed
NA
## Proposed Changes
Exits *earlier* when system is 32-bit. Previously we were trying (and failing) to parse the genesis SSZ first. This made for a bad user experience.
## Additional Info
NA
* Add latest commit info to git version
* Testing docker build
* Use fallback; modify format
* Revert "Testing docker build"
This reverts commit 197140d6973e22a6b891ebc0f1798d60c87b182a.
* Modify fallback to have crate version
* Add logging on shutdown
* Replace tokio::spawn with handle.spawn
* Upgrade tokio
* Add a task executor
* Beacon chain tasks use task executor
* Validator client tasks use task executor
* Rename runtime_handle to executor
* Add duration histograms; minor fixes
* Cleanup
* Fix logs
* Fix tests
* Remove random file
* Get enr dependency instead of libp2p
* Address some review comments
* Libp2p takes a TaskExecutor
* Ugly fix libp2p tests
* Move TaskExecutor to own file
* Upgrade Dockerfile rust version
* Minor fixes
* Revert "Ugly fix libp2p tests"
This reverts commit 58d4bb690f52de28d893943b7504d2d0c6621429.
* Pretty fix libp2p tests
* Add spawn_without_exit; change Counter to Gauge
* Tidy
* Move log from RuntimeContext to TaskExecutor
* Fix errors
* Replace histogram with int_gauge for async tasks
* Fix todo
* Fix memory leak in test by exiting all spawned tasks at the end
* Update state processing for v0.12
* Fix EF test runners for v0.12
* Fix some tests
* Fix broken attestation verification test
* More test fixes
* Fix typo found in review