Paul Hauner b60304b19f Use BeaconProcessor for API requests (#4462)
## Issue Addressed

NA

## Proposed Changes

Rather than spawning new tasks on the tokio executor to process each HTTP API request, send the tasks to the `BeaconProcessor`. This achieves:

1. Places a bound on how many concurrent requests are being served (i.e., how many we are actually trying to compute at one time).
1. Places a bound on how many requests can be awaiting a response at one time (i.e., starts dropping requests when we have too many queued).
1. Allows the BN to prioritise HTTP requests with respect to messages coming from the P2P network (i.e., prioritise importing gossip blocks rather than serving API requests).

Presently there are two levels of priorities:

- `Priority::P0`
    - The beacon processor will prioritise these above everything other than importing new blocks.
    - Roughly all validator-sensitive endpoints.
- `Priority::P1`
    - The beacon processor will prioritise practically all other P2P messages over these, except for historical backfill work.
    - Everything that's not `Priority::P0`.
    
The `--http-enable-beacon-processor false` flag can be supplied to revert to the old behaviour of spawning new `tokio` tasks for each request:

```
        --http-enable-beacon-processor <BOOLEAN>
            The beacon processor is a scheduler which provides quality-of-service and DoS protection. When set to
            "true", HTTP API requests will queued and scheduled alongside other tasks. When set to "false", HTTP API
            responses will be executed immediately. [default: true]
```
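
For example (assuming the standard `lighthouse bn` invocation), reverting to the old spawn-per-request behaviour looks like:

```
lighthouse bn --http --http-enable-beacon-processor false
```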
    
## New CLI Flags

I added some other new CLI flags:

```
        --beacon-processor-aggregate-batch-size <INTEGER>
            Specifies the number of gossip aggregate attestations in a signature verification batch. Higher values may
            reduce CPU usage in a healthy network while lower values may increase CPU usage in an unhealthy or hostile
            network. [default: 64]
        --beacon-processor-attestation-batch-size <INTEGER>
            Specifies the number of gossip attestations in a signature verification batch. Higher values may reduce CPU
            usage in a healthy network whilst lower values may increase CPU usage in an unhealthy or hostile network.
            [default: 64]
        --beacon-processor-max-workers <INTEGER>
            Specifies the maximum concurrent tasks for the task scheduler. Increasing this value may increase resource
            consumption. Reducing the value may result in decreased resource usage and diminished performance. The
            default value is the number of logical CPU cores on the host.
        --beacon-processor-reprocess-queue-len <INTEGER>
            Specifies the length of the queue for messages requiring delayed processing. Higher values may prevent
            messages from being dropped while lower values may help protect the node from becoming overwhelmed.
            [default: 12288]
```
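
For example, on constrained hardware one might combine these flags as follows (the values are purely illustrative, not recommendations):

```
lighthouse bn \
    --http \
    --beacon-processor-max-workers 2 \
    --beacon-processor-aggregate-batch-size 32 \
    --beacon-processor-attestation-batch-size 32 \
    --beacon-processor-reprocess-queue-len 8192
```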


I needed to add the max-workers flag since the "simulator" flavor tests started failing with HTTP timeouts on the test assertions. I believe they were failing because the GitHub runners only have 2 cores, so there just weren't enough workers available to process our requests in time. I added the other flags since they seem fun to fiddle with.

## Additional Info

I bumped the timeouts on the "simulator" flavor test from 4s to 8s. The prioritisation of consensus messages seems to be causing slower responses; I guess this is what we signed up for 🤷

The `validator/register` endpoint has some special handling because the relays have a bad habit of timing out on these calls. It seems like a waste of a `BeaconProcessor` worker to just wait for the builder API HTTP response, so we spawn a new `tokio` task to wait for a builder response.

I've added an optimisation for the `GET beacon/states/{state_id}/validators/{validator_id}` endpoint in [efbabe3](efbabe3252). That's the endpoint the VC uses to resolve pubkeys to validator indices, and it's the endpoint that was causing us grief. Perhaps I should move that into a new PR, not sure.
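
For context, this is the standard beacon node API call the VC issues to resolve a pubkey to an index; an illustrative query against a local BN (the pubkey is just an example value):

```
curl "http://localhost:5052/eth/v1/beacon/states/head/validators/0xa1d1ad0714035353258038e964ae9675dc0252ee22cea896825c01458e1807bfad2f9969338798548d9858a571f7425c"
```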

beacon.watch

beacon.watch is pre-MVP, still under active development, and subject to change.

beacon.watch is an Ethereum Beacon Chain monitoring platform whose goal is to provide fast access to data which is:

  1. Not already stored natively in the Beacon Chain
  2. Too specialized for Block Explorers
  3. Too sensitive for public Block Explorers

Requirements

cargo install diesel_cli --no-default-features --features postgres

Setup

  1. Set up the database:
     cd postgres_docker_compose
     docker-compose up
  2. Ensure the tests pass:
     cargo test --release
  3. Drop the database (if it already exists) and run the required migrations:
     diesel database reset --database-url postgres://postgres:postgres@localhost/dev
  4. Ensure a synced Lighthouse beacon node with historical states is available at localhost:5052. The smaller the value of --slots-per-restore-point, the faster beacon.watch will be able to sync to the beacon node (a sketch of a suitable beacon node invocation is shown after this list).
  5. Run the updater daemon:
     cargo run --release -- run-updater
  6. Start the HTTP API server:
     cargo run --release -- serve
  7. Ensure connectivity:
     curl "http://localhost:5059/v1/slots/highest"

Functionality on macOS has not been tested. Windows is not supported.

Configuration

beacon.watch can be configured through a config file. The available options can be seen in config.yaml.default.

You can specify a config file during runtime:

cargo run -- run-updater --config path/to/config.yaml
cargo run -- serve --config path/to/config.yaml

You can specify only the parts of the config file which you need changed. Missing values will remain as their defaults.

For example, if you wish to run with default settings but only alter log_level, your config file would be:

# config.yaml
log_level: "info"

Available Endpoints

As beacon.watch continues to develop, more endpoints will be added.

In these examples, any data containing information from blockprint has either been redacted or fabricated.

/v1/slots/{slot}

curl "http://localhost:5059/v1/slots/4635296"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "skipped": false,
  "beacon_block": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
}

/v1/slots?start_slot={}&end_slot={}

curl "http://localhost:5059/v1/slots?start_slot=4635296&end_slot=4635297"
[
  {
    "slot": "4635297",
    "root": "0x04ad2e963811207e344bebeba5b1217805bcc3a9e2ed9fcf2205d491778c6182",
    "skipped": false,
    "beacon_block": "0x04ad2e963811207e344bebeba5b1217805bcc3a9e2ed9fcf2205d491778c6182"
  },
  {
    "slot": "4635296",
    "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
    "skipped": false,
    "beacon_block": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
  }
]

/v1/slots/lowest

curl "http://localhost:5059/v1/slots/lowest"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "skipped": false,
  "beacon_block": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
}

/v1/slots/highest

curl "http://localhost:5059/v1/slots/highest"
{
  "slot": "4635358",
  "root": "0xe9eff13560688f1bf15cf07b60c84963d4d04a4a885ed0eb19ceb8450011894b",
  "skipped": false,
  "beacon_block": "0xe9eff13560688f1bf15cf07b60c84963d4d04a4a885ed0eb19ceb8450011894b"
}

/v1/slots/{slot}/block

curl "http://localhost:5059/v1/slots/4635296/block"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "parent_root": "0x7c4860b420a23de9d126da71f9043b3744af98c847efd9e1440f2da8fbf7f31b"
}

/v1/blocks/{block_id}

curl "http://localhost:5059/v1/blocks/4635296"
# OR
curl "http://localhost:5059/v1/blocks/0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "parent_root": "0x7c4860b420a23de9d126da71f9043b3744af98c847efd9e1440f2da8fbf7f31b"
}

/v1/blocks?start_slot={}&end_slot={}

curl "http://localhost:5059/v1/blocks?start_slot=4635296&end_slot=4635297"
[
  {
    "slot": "4635297",
    "root": "0x04ad2e963811207e344bebeba5b1217805bcc3a9e2ed9fcf2205d491778c6182",
    "parent_root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
  },
  {
    "slot": "4635296",
    "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
    "parent_root": "0x7c4860b420a23de9d126da71f9043b3744af98c847efd9e1440f2da8fbf7f31b"
  }
]

/v1/blocks/{block_id}/previous

curl "http://localhost:5059/v1/blocks/4635297/previous"
# OR
curl "http://localhost:5059/v1/blocks/0x04ad2e963811207e344bebeba5b1217805bcc3a9e2ed9fcf2205d491778c6182/previous"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "parent_root": "0x7c4860b420a23de9d126da71f9043b3744af98c847efd9e1440f2da8fbf7f31b"
}

/v1/blocks/{block_id}/next

curl "http://localhost:5059/v1/blocks/4635296/next"
# OR
curl "http://localhost:5059/v1/blocks/0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62/next"
{
  "slot": "4635297",
  "root": "0x04ad2e963811207e344bebeba5b1217805bcc3a9e2ed9fcf2205d491778c6182",
  "parent_root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62"
}

/v1/blocks/lowest

curl "http://localhost:5059/v1/blocks/lowest"
{
  "slot": "4635296",
  "root": "0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62",
  "parent_root": "0x7c4860b420a23de9d126da71f9043b3744af98c847efd9e1440f2da8fbf7f31b"
}

/v1/blocks/highest

curl "http://localhost:5059/v1/blocks/highest"
{
  "slot": "4635358",
  "root": "0xe9eff13560688f1bf15cf07b60c84963d4d04a4a885ed0eb19ceb8450011894b",
  "parent_root": "0xb66e05418bb5b1d4a965c994e1f0e5b5f0d7b780e0df12f3f6321510654fa1d2"
}

/v1/blocks/{block_id}/proposer

curl "http://localhost:5059/v1/blocks/4635296/proposer"
# OR
curl "http://localhost:5059/v1/blocks/0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62/proposer"

{
  "slot": "4635296",
  "proposer_index": 223126,
  "graffiti": ""
}

/v1/blocks/{block_id}/rewards

curl "http://localhost:5059/v1/blocks/4635296/reward"
# OR
curl "http://localhost:5059/v1/blocks/0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62/reward"

{
  "slot": "4635296",
  "total": 25380059,
  "attestation_reward": 24351867,
  "sync_committee_reward": 1028192
}

/v1/blocks/{block_id}/packing

curl "http://localhost:5059/v1/blocks/4635296/packing"
# OR
curl "http://localhost:5059/v1/blocks/0xf7063a9d6c663682e59bd0b41d29ce80c3ff0b089049ff8676d6f9ee79622c62/packing"

{
  "slot": "4635296",
  "available": 16152,
  "included": 13101,
  "prior_skip_slots": 0
}

/v1/validators/{validator}

curl "http://localhost:5059/v1/validators/1"
# OR
curl "http://localhost:5059/v1/validators/0xa1d1ad0714035353258038e964ae9675dc0252ee22cea896825c01458e1807bfad2f9969338798548d9858a571f7425c"
{
  "index": 1,
  "public_key": "0xa1d1ad0714035353258038e964ae9675dc0252ee22cea896825c01458e1807bfad2f9969338798548d9858a571f7425c",
  "status": "active_ongoing",
  "client": null,
  "activation_epoch": 0,
  "exit_epoch": null
}

/v1/validators/{validator}/attestation/{epoch}

curl "http://localhost:5059/v1/validators/1/attestation/144853"
# OR
curl "http://localhost:5059/v1/validators/0xa1d1ad0714035353258038e964ae9675dc0252ee22cea896825c01458e1807bfad2f9969338798548d9858a571f7425c/attestation/144853"
{
  "index": 1,
  "epoch": "144853",
  "source": true,
  "head": true,
  "target": true
}

/v1/validators/missed/{vote}/{epoch}

curl "http://localhost:5059/v1/validators/missed/head/144853"
[
  63,
  67,
  98,
  ...
]
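
For instance, to count how many validators missed the head vote in that epoch (assuming jq is installed):

curl -s "http://localhost:5059/v1/validators/missed/head/144853" | jq 'length'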

/v1/validators/missed/{vote}/{epoch}/graffiti

curl "http://localhost:5059/v1/validators/missed/head/144853/graffiti"
{
  "Mr F was here": 3,
  "Lighthouse/v3.1.0-aa022f4": 5,
  ...
}

/v1/clients/missed/{vote}/{epoch}

curl "http://localhost:5059/v1/clients/missed/source/144853"
{
  "Lighthouse": 100,
  "Lodestar": 100,
  "Nimbus": 100,
  "Prysm": 100,
  "Teku": 100,
  "Unknown": 100
}

/v1/clients/missed/{vote}/{epoch}/percentages

Note that this endpoint expresses the following:

What percentage of each client implementation missed this vote?

curl "http://localhost:5059/v1/clients/missed/target/144853/percentages"
{
  "Lighthouse": 0.51234567890,
  "Lodestar": 0.51234567890,
  "Nimbus": 0.51234567890,
  "Prysm": 0.09876543210,
  "Teku": 0.09876543210,
  "Unknown": 0.05647382910
}

/v1/clients/missed/{vote}/{epoch}/percentages/relative

Note that this endpoint expresses the following:

For the validators which did miss this vote, what percentage of them were from each client implementation?

You can check these values against the output of /v1/clients/percentages to see any discrepancies.

curl "http://localhost:5059/v1/clients/missed/target/144853/percentages/relative"
{
  "Lighthouse": 11.11111111111111,
  "Lodestar": 11.11111111111111,
  "Nimbus": 11.11111111111111,
  "Prysm": 16.66666666666667,
  "Teku": 16.66666666666667,
  "Unknown": 33.33333333333333
}
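
One way to make that comparison (assuming jq is installed; the temporary file names are arbitrary):

curl -s "http://localhost:5059/v1/clients/missed/target/144853/percentages/relative" > missed.json
curl -s "http://localhost:5059/v1/clients/percentages" > overall.json
# Print each client's share of the missed votes next to its overall share of the validator set.
jq -n --slurpfile m missed.json --slurpfile o overall.json \
  '$m[0] | to_entries[] | {client: .key, relative_missed: .value, overall: $o[0][.key]}'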

/v1/clients

curl "http://localhost:5059/v1/clients"
{
  "Lighthouse": 5000,
  "Lodestar": 5000,
  "Nimbus": 5000,
  "Prysm": 5000,
  "Teku": 5000,
  "Unknown": 5000
}

/v1/clients/percentages

curl "http://localhost:5059/v1/clients/percentages"
{
  "Lighthouse": 16.66666666666667,
  "Lodestar": 16.66666666666667,
  "Nimbus": 16.66666666666667,
  "Prysm": 16.66666666666667,
  "Teku": 16.66666666666667,
  "Unknown": 16.66666666666667
}

Future work

  • New tables

    • skip_slots?
  • More API endpoints

    • /v1/proposers?start_epoch={}&end_epoch={} and similar
    • /v1/validators/{status}/count
  • Concurrently backfill and forward fill, so the forward fill is not bottlenecked by large backfills.

  • Better/prettier (async?) logging.

  • Connect to a range of beacon_nodes to sync different components concurrently. Generally, processing certain API queries such as block_packing and attestation_performance takes the longest to sync.

Architecture

Connection Pooling:

  • 1 Pool for Updater (read and write)
  • 1 Pool for HTTP Server (should be read-only, although we're not sure if we can enforce this)
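
A sketch of one way to approximate the read-only constraint at the database level, assuming the Postgres instance from the docker-compose setup; the role name and password are hypothetical and no such role exists in the repo today:

# Hypothetical: a read-only Postgres role that the HTTP server's pool could connect as.
psql "postgres://postgres:postgres@localhost/dev" <<'SQL'
CREATE ROLE watch_readonly LOGIN PASSWORD 'readonly';
GRANT CONNECT ON DATABASE dev TO watch_readonly;
GRANT USAGE ON SCHEMA public TO watch_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO watch_readonly;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO watch_readonly;
SQL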