go-pulse/metrics/registry.go
Martin Holst Swende 8b6cf128af
metrics: refactor metrics (#28035)
This change includes a lot of things, listed below. 

### Split up interfaces, write vs read

The interfaces have been split up into one write-interface and one read-interface, with `Snapshot` being the gateway from write to read. This simplifies the semantics _a lot_. 

Example of splitting up an interface into one readonly 'snapshot' part, and one updatable writeonly part: 

```golang
type MeterSnapshot interface {
	Count() int64
	Rate1() float64
	Rate5() float64
	Rate15() float64
	RateMean() float64
}

// Meters count events to produce exponentially-weighted moving average rates
// at one-, five-, and fifteen-minutes and a mean rate.
type Meter interface {
	Mark(int64)
	Snapshot() MeterSnapshot
	Stop()
}
```

### A note about concurrency

This PR makes the concurrency model clearer. We have actual meters and snapshot of meters. The `meter` is the thing which can be accessed from the registry, and updates can be made to it. 

- For all `meters`, (`Gauge`, `Timer` etc), it is assumed that they are accessed by different threads, making updates. Therefore, all `meters` update-methods (`Inc`, `Add`, `Update`, `Clear` etc) need to be concurrency-safe. 
- All `meters` have a `Snapshot()` method. This method is _usually_ called from one thread, a backend-exporter. But it's fully possible to have several exporters simultaneously: therefore this method should also be concurrency-safe. 

TLDR: `meter`s are accessible via registry, all their methods must be concurrency-safe. 

For all `Snapshot`s, it is assumed that an individual exporter-thread has obtained a `meter` from the registry, and called the `Snapshot` method to obtain a readonly snapshot. This snapshot is _not_ guaranteed to be concurrency-safe. There's no need for a snapshot to be concurrency-safe, since exporters should not share snapshots. 

Note, though: that by happenstance a lot of the snapshots _are_ concurrency-safe, being unmutable minimal representations of a value. Only the more complex ones are _not_ threadsafe, those that lazily calculate things like `Variance()`, `Mean()`.

Example of how a background exporter typically works, obtaining the snapshot and sequentially accessing the non-threadsafe methods in it: 
```golang
		ms := metric.Snapshot()
                ...
		fields := map[string]interface{}{
			"count":    ms.Count(),
			"max":      ms.Max(),
			"mean":     ms.Mean(),
			"min":      ms.Min(),
			"stddev":   ms.StdDev(),
			"variance": ms.Variance(),
```

TLDR: `snapshots` are not guaranteed to be concurrency-safe (but often are).

### Sample changes

I also changed the `Sample` type: previously, it iterated the samples fully every time `Mean()`,`Sum()`, `Min()` or `Max()` was invoked. Since we now have readonly base data, we can just iterate it once, in the constructor, and set all four values at once. 

The same thing has been done for runtimehistogram. 

### ResettingTimer API

Back when ResettingTImer was implemented, as part of https://github.com/ethereum/go-ethereum/pull/15910, Anton implemented a `Percentiles` on the new type. However, the method did not conform to the other existing types which also had a `Percentiles`. 

1. The existing ones, on input, took `0.5` to mean `50%`. Anton used `50` to mean `50%`. 
2. The existing ones returned `float64` outputs, thus interpolating between values. A value-set of `0, 10`, at `50%` would return `5`, whereas Anton's would return either `0` or `10`. 

This PR removes the 'new' version, and uses only the 'legacy' percentiles, also for the ResettingTimer type. 

The resetting timer snapshot was also defined so that it would expose the internal values. This has been removed, and getters for `Max, Min, Mean` have been added instead. 

### Unexport types

A lot of types were exported, but do not need to be. This PR unexports quite a lot of them.
2023-09-13 13:13:47 -04:00

373 lines
9.9 KiB
Go

package metrics
import (
"fmt"
"reflect"
"sort"
"strings"
"sync"
)
// DuplicateMetric is the error returned by Registry.Register when a metric
// already exists. If you mean to Register that metric you must first
// Unregister the existing metric.
type DuplicateMetric string
func (err DuplicateMetric) Error() string {
return fmt.Sprintf("duplicate metric: %s", string(err))
}
// A Registry holds references to a set of metrics by name and can iterate
// over them, calling callback functions provided by the user.
//
// This is an interface so as to encourage other structs to implement
// the Registry API as appropriate.
type Registry interface {
// Call the given function for each registered metric.
Each(func(string, interface{}))
// Get the metric by the given name or nil if none is registered.
Get(string) interface{}
// GetAll metrics in the Registry.
GetAll() map[string]map[string]interface{}
// Gets an existing metric or registers the given one.
// The interface can be the metric to register if not found in registry,
// or a function returning the metric for lazy instantiation.
GetOrRegister(string, interface{}) interface{}
// Register the given metric under the given name.
Register(string, interface{}) error
// Run all registered healthchecks.
RunHealthchecks()
// Unregister the metric with the given name.
Unregister(string)
}
type orderedRegistry struct {
StandardRegistry
}
// Call the given function for each registered metric.
func (r *orderedRegistry) Each(f func(string, interface{})) {
var names []string
reg := r.registered()
for name := range reg {
names = append(names, name)
}
sort.Strings(names)
for _, name := range names {
f(name, reg[name])
}
}
// NewRegistry creates a new registry.
func NewRegistry() Registry {
return new(StandardRegistry)
}
// NewOrderedRegistry creates a new ordered registry (for testing).
func NewOrderedRegistry() Registry {
return new(orderedRegistry)
}
// The standard implementation of a Registry uses sync.map
// of names to metrics.
type StandardRegistry struct {
metrics sync.Map
}
// Call the given function for each registered metric.
func (r *StandardRegistry) Each(f func(string, interface{})) {
for name, i := range r.registered() {
f(name, i)
}
}
// Get the metric by the given name or nil if none is registered.
func (r *StandardRegistry) Get(name string) interface{} {
item, _ := r.metrics.Load(name)
return item
}
// Gets an existing metric or creates and registers a new one. Threadsafe
// alternative to calling Get and Register on failure.
// The interface can be the metric to register if not found in registry,
// or a function returning the metric for lazy instantiation.
func (r *StandardRegistry) GetOrRegister(name string, i interface{}) interface{} {
// fast path
cached, ok := r.metrics.Load(name)
if ok {
return cached
}
if v := reflect.ValueOf(i); v.Kind() == reflect.Func {
i = v.Call(nil)[0].Interface()
}
item, _, ok := r.loadOrRegister(name, i)
if !ok {
return i
}
return item
}
// Register the given metric under the given name. Returns a DuplicateMetric
// if a metric by the given name is already registered.
func (r *StandardRegistry) Register(name string, i interface{}) error {
// fast path
_, ok := r.metrics.Load(name)
if ok {
return DuplicateMetric(name)
}
if v := reflect.ValueOf(i); v.Kind() == reflect.Func {
i = v.Call(nil)[0].Interface()
}
_, loaded, _ := r.loadOrRegister(name, i)
if loaded {
return DuplicateMetric(name)
}
return nil
}
// Run all registered healthchecks.
func (r *StandardRegistry) RunHealthchecks() {
r.metrics.Range(func(key, value any) bool {
if h, ok := value.(Healthcheck); ok {
h.Check()
}
return true
})
}
// GetAll metrics in the Registry
func (r *StandardRegistry) GetAll() map[string]map[string]interface{} {
data := make(map[string]map[string]interface{})
r.Each(func(name string, i interface{}) {
values := make(map[string]interface{})
switch metric := i.(type) {
case Counter:
values["count"] = metric.Snapshot().Count()
case CounterFloat64:
values["count"] = metric.Snapshot().Count()
case Gauge:
values["value"] = metric.Snapshot().Value()
case GaugeFloat64:
values["value"] = metric.Snapshot().Value()
case Healthcheck:
values["error"] = nil
metric.Check()
if err := metric.Error(); nil != err {
values["error"] = metric.Error().Error()
}
case Histogram:
h := metric.Snapshot()
ps := h.Percentiles([]float64{0.5, 0.75, 0.95, 0.99, 0.999})
values["count"] = h.Count()
values["min"] = h.Min()
values["max"] = h.Max()
values["mean"] = h.Mean()
values["stddev"] = h.StdDev()
values["median"] = ps[0]
values["75%"] = ps[1]
values["95%"] = ps[2]
values["99%"] = ps[3]
values["99.9%"] = ps[4]
case Meter:
m := metric.Snapshot()
values["count"] = m.Count()
values["1m.rate"] = m.Rate1()
values["5m.rate"] = m.Rate5()
values["15m.rate"] = m.Rate15()
values["mean.rate"] = m.RateMean()
case Timer:
t := metric.Snapshot()
ps := t.Percentiles([]float64{0.5, 0.75, 0.95, 0.99, 0.999})
values["count"] = t.Count()
values["min"] = t.Min()
values["max"] = t.Max()
values["mean"] = t.Mean()
values["stddev"] = t.StdDev()
values["median"] = ps[0]
values["75%"] = ps[1]
values["95%"] = ps[2]
values["99%"] = ps[3]
values["99.9%"] = ps[4]
values["1m.rate"] = t.Rate1()
values["5m.rate"] = t.Rate5()
values["15m.rate"] = t.Rate15()
values["mean.rate"] = t.RateMean()
}
data[name] = values
})
return data
}
// Unregister the metric with the given name.
func (r *StandardRegistry) Unregister(name string) {
r.stop(name)
r.metrics.LoadAndDelete(name)
}
func (r *StandardRegistry) loadOrRegister(name string, i interface{}) (interface{}, bool, bool) {
switch i.(type) {
case Counter, CounterFloat64, Gauge, GaugeFloat64, GaugeInfo, Healthcheck, Histogram, Meter, Timer, ResettingTimer:
default:
return nil, false, false
}
item, loaded := r.metrics.LoadOrStore(name, i)
return item, loaded, true
}
func (r *StandardRegistry) registered() map[string]interface{} {
metrics := make(map[string]interface{})
r.metrics.Range(func(key, value any) bool {
metrics[key.(string)] = value
return true
})
return metrics
}
func (r *StandardRegistry) stop(name string) {
if i, ok := r.metrics.Load(name); ok {
if s, ok := i.(Stoppable); ok {
s.Stop()
}
}
}
// Stoppable defines the metrics which has to be stopped.
type Stoppable interface {
Stop()
}
type PrefixedRegistry struct {
underlying Registry
prefix string
}
func NewPrefixedRegistry(prefix string) Registry {
return &PrefixedRegistry{
underlying: NewRegistry(),
prefix: prefix,
}
}
func NewPrefixedChildRegistry(parent Registry, prefix string) Registry {
return &PrefixedRegistry{
underlying: parent,
prefix: prefix,
}
}
// Call the given function for each registered metric.
func (r *PrefixedRegistry) Each(fn func(string, interface{})) {
wrappedFn := func(prefix string) func(string, interface{}) {
return func(name string, iface interface{}) {
if strings.HasPrefix(name, prefix) {
fn(name, iface)
} else {
return
}
}
}
baseRegistry, prefix := findPrefix(r, "")
baseRegistry.Each(wrappedFn(prefix))
}
func findPrefix(registry Registry, prefix string) (Registry, string) {
switch r := registry.(type) {
case *PrefixedRegistry:
return findPrefix(r.underlying, r.prefix+prefix)
case *StandardRegistry:
return r, prefix
}
return nil, ""
}
// Get the metric by the given name or nil if none is registered.
func (r *PrefixedRegistry) Get(name string) interface{} {
realName := r.prefix + name
return r.underlying.Get(realName)
}
// Gets an existing metric or registers the given one.
// The interface can be the metric to register if not found in registry,
// or a function returning the metric for lazy instantiation.
func (r *PrefixedRegistry) GetOrRegister(name string, metric interface{}) interface{} {
realName := r.prefix + name
return r.underlying.GetOrRegister(realName, metric)
}
// Register the given metric under the given name. The name will be prefixed.
func (r *PrefixedRegistry) Register(name string, metric interface{}) error {
realName := r.prefix + name
return r.underlying.Register(realName, metric)
}
// Run all registered healthchecks.
func (r *PrefixedRegistry) RunHealthchecks() {
r.underlying.RunHealthchecks()
}
// GetAll metrics in the Registry
func (r *PrefixedRegistry) GetAll() map[string]map[string]interface{} {
return r.underlying.GetAll()
}
// Unregister the metric with the given name. The name will be prefixed.
func (r *PrefixedRegistry) Unregister(name string) {
realName := r.prefix + name
r.underlying.Unregister(realName)
}
var (
DefaultRegistry = NewRegistry()
EphemeralRegistry = NewRegistry()
AccountingRegistry = NewRegistry() // registry used in swarm
)
// Call the given function for each registered metric.
func Each(f func(string, interface{})) {
DefaultRegistry.Each(f)
}
// Get the metric by the given name or nil if none is registered.
func Get(name string) interface{} {
return DefaultRegistry.Get(name)
}
// Gets an existing metric or creates and registers a new one. Threadsafe
// alternative to calling Get and Register on failure.
func GetOrRegister(name string, i interface{}) interface{} {
return DefaultRegistry.GetOrRegister(name, i)
}
// Register the given metric under the given name. Returns a DuplicateMetric
// if a metric by the given name is already registered.
func Register(name string, i interface{}) error {
return DefaultRegistry.Register(name, i)
}
// Register the given metric under the given name. Panics if a metric by the
// given name is already registered.
func MustRegister(name string, i interface{}) {
if err := Register(name, i); err != nil {
panic(err)
}
}
// Run all registered healthchecks.
func RunHealthchecks() {
DefaultRegistry.RunHealthchecks()
}
// Unregister the metric with the given name.
func Unregister(name string) {
DefaultRegistry.Unregister(name)
}