mirror of
https://github.com/grafana/grafana.git
synced 2025-09-14 05:32:52 +08:00
update tracing guidelines for tracing (#109299)
Update contributor guidelines for tracing
@@ -156,9 +156,120 @@ A distributed trace is data that tracks an application request as it flows throu

### Usage

Grafana uses [OpenTelemetry](https://opentelemetry.io/) for distributed tracing. For more information, refer to [the OpenTelemetry documentation](https://opentelemetry.io/docs/instrumentation/go/manual/).

1. Import the otel package

```Go
import "go.opentelemetry.io/otel"
```

2. Initialize a package-level tracer

Initializing the tracer at package level means we don't have to change the provider signature of every service across Grafana, which matters most in testing scenarios. `otel.Tracer` gets the global tracer configured in `pkg/infra/tracing` and handles falling back to a no-op tracer when OpenTelemetry is disabled.

Using the package path as the tracer name lets us determine where a trace is coming from in the event of a naming collision.

```Go
var tracer = otel.Tracer("github.com/grafana/grafana/pkg/services/dashboards")
```

For enterprise code, use the package path from the enterprise repository:

```Go
var tracer = otel.Tracer("github.com/grafana/grafana-enterprise/src/pkg/extensions/accesscontrol")
```

3. Create a span

Spans are connected through the `context.Context`, named `ctx` by convention. Name your span `<package>.<function>`. In some cases, a package contains a folder whose name is commonly used throughout the codebase. In that case, use the form `<package>.<folder>.<function>`.

Match the casing of each part exactly, because it's a common convention to have an internal and an external function with the same name.

Always end your spans.

```Go
ctx, span := tracer.Start(ctx, "dashboards.store.GetDashboardByUID")
defer span.End()
```

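The naming rule can be made concrete with a small helper. This is a minimal sketch: `spanName` is a hypothetical function written for illustration, not something Grafana or OpenTelemetry provides. It joins the parts with dots and leaves casing untouched, so an unexported `isDashboardStarredByUser` and an exported `IsDashboardStarredByUser` produce distinct span names.

```go
package main

import (
	"fmt"
	"strings"
)

// spanName builds "<package>.<function>" or "<package>.<folder>.<function>".
// Hypothetical helper for illustration only; casing is preserved as given.
func spanName(pkg string, parts ...string) string {
	return strings.Join(append([]string{pkg}, parts...), ".")
}

func main() {
	fmt.Println(spanName("dashboards", "store", "GetDashboardByUID")) // dashboards.store.GetDashboardByUID
	fmt.Println(spanName("api", "isDashboardStarredByUser"))          // api.isDashboardStarredByUser
}
```

In practice a string literal passed to `tracer.Start` is enough. The helper only spells out the structure of the name.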
4. Ensure spans are connected

Grafana's `api` package passes a `*contextmodel.ReqContext` between API functions, with the `context.Context` stored as a field on the request.

Given `func (hs *HTTPServer) isDashboardStarredByUser(c *contextmodel.ReqContext)`, you can fetch the context with `c.Req.Context()`.

If you create a span, you need to ensure the new context is set on the `ReqContext`:

```Go
ctx, span := tracer.Start(c.Req.Context(), "api.isDashboardStarredByUser")
defer span.End()
c.Req = c.Req.WithContext(ctx)
```

If the function does not pass the `*contextmodel.ReqContext` to another function, do not set `ctx` on it.

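The reason for writing the context back can be shown with the standard library alone. The following is a toy model, not the OpenTelemetry API: `spanKey`, `startSpan`, and `activeSpan` are hypothetical stand-ins for how a tracer stores the active span in a context, and the request path is made up. It shows that a callee reading `req.Context()` only sees the new span after `req.WithContext(ctx)` has been assigned back.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// spanKey is a stand-in for the key a tracer uses to store the active span.
type spanKey struct{}

// startSpan mimics tracer.Start: it returns a child context carrying the span name.
func startSpan(ctx context.Context, name string) context.Context {
	return context.WithValue(ctx, spanKey{}, name)
}

// activeSpan reports which span a callee would pick up as its parent.
func activeSpan(r *http.Request) string {
	if name, ok := r.Context().Value(spanKey{}).(string); ok {
		return name
	}
	return "(none)"
}

// demo returns the span visible on the request before and after WithContext.
func demo() (before, after string) {
	req := httptest.NewRequest(http.MethodGet, "/api/example", nil)
	ctx := startSpan(req.Context(), "api.isDashboardStarredByUser")

	before = activeSpan(req) // the request still carries the old context
	req = req.WithContext(ctx)
	after = activeSpan(req) // the request now carries the new span
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before) // (none)
	fmt.Println(after)  // api.isDashboardStarredByUser
}
```

Without the assignment, any span a callee starts from the request's context would attach to the old parent rather than to the span created in this function.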
5. Spans with no children

For CPU-bound work, you often want a span but won't pass the resulting context down to a child. In this case, discard the returned context with the blank identifier, as you would any other unused variable. The span still gives us appropriate timing.

```Go
_, span := tracer.Start(c.Req.Context(), "api.isDashboardStarredByUser")
defer span.End()
```

6. Deciding when to create a span

Only instrument functions that perform IO-bound work, such as database calls, or quantifiable CPU-bound work. Don't instrument simple service-layer wrappers that only make a call to the database: they create more data without giving us more insight.

For example, the following function performs business logic but doesn't do a meaningful amount of work. In this case, ensure `ctx` is passed to the child function and perform the tracing in `pd.store.FindByAccessToken`.

```Go
func (pd *PublicDashboardServiceImpl) FindByAccessToken(ctx context.Context, accessToken string) (*PublicDashboard, error) {
	ctx, span := tracer.Start(ctx, "publicdashboards.FindByAccessToken")
	defer span.End()

	pubdash, err := pd.store.FindByAccessToken(ctx, accessToken)
	if err != nil {
		return nil, ErrInternalServerError.Errorf("FindByAccessToken: failed to find a public dashboard: %w", err)
	}
	if pubdash == nil {
		return nil, ErrPublicDashboardNotFound.Errorf("FindByAccessToken: Public dashboard not found accessToken: %s", accessToken)
	}
	return pubdash, nil
}
```

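One way to read this guidance is that the service-layer wrapper starts no span of its own and only forwards `ctx`. The sketch below illustrates that reading with the standard library. Everything in it is an assumption made for illustration: `Store`, `Service`, `ErrNotFound`, `parentKey`, `fakeStore`, and `lookup` are pared-down or hypothetical stand-ins, not Grafana's real types.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// PublicDashboard, Store, and Service are simplified stand-ins for the real types.
type PublicDashboard struct{ AccessToken string }

type Store interface {
	FindByAccessToken(ctx context.Context, accessToken string) (*PublicDashboard, error)
}

type Service struct{ store Store }

var ErrNotFound = errors.New("public dashboard not found")

// FindByAccessToken starts no span. It forwards ctx unchanged, so a span
// started inside the store becomes a child of the caller's span.
func (s *Service) FindByAccessToken(ctx context.Context, accessToken string) (*PublicDashboard, error) {
	pubdash, err := s.store.FindByAccessToken(ctx, accessToken)
	if err != nil {
		return nil, fmt.Errorf("FindByAccessToken: failed to find a public dashboard: %w", err)
	}
	if pubdash == nil {
		return nil, ErrNotFound
	}
	return pubdash, nil
}

// parentKey marks the caller's span in the context (a stand-in for a tracer's span key).
type parentKey struct{}

// fakeStore records whether the caller's context reached it. A real store
// would call tracer.Start(ctx, ...) at this point.
type fakeStore struct{ sawParent bool }

func (f *fakeStore) FindByAccessToken(ctx context.Context, accessToken string) (*PublicDashboard, error) {
	f.sawParent = ctx.Value(parentKey{}) != nil
	if accessToken == "" {
		return nil, nil
	}
	return &PublicDashboard{AccessToken: accessToken}, nil
}

// lookup calls the service with a context that carries a parent marker and
// reports whether a dashboard was found and whether the store saw the marker.
func lookup(accessToken string) (found, sawParent bool) {
	store := &fakeStore{}
	svc := &Service{store: store}
	ctx := context.WithValue(context.Background(), parentKey{}, "api.handler")
	_, err := svc.FindByAccessToken(ctx, accessToken)
	return err == nil, store.sawParent
}

func main() {
	found, sawParent := lookup("abc123")
	fmt.Println(found, sawParent) // true true
}
```

The trace then contains one span for the database call instead of two nested spans with nearly identical durations.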
7. Adding attributes for investigation

It's often helpful to understand the context of a span. To help with debugging, we can add attributes to a span that tell us more about the state of the function. This is especially helpful for CPU-bound functions. In the following example, we perform a CPU-bound operation and add an attribute to the span that records the number of permissions passed into the function.

Import the otel `attribute` and `trace` packages.

```Go
import (
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)
```

Add your attributes when creating the span.

```Go
func GroupScopesByActionContext(ctx context.Context, permissions []Permission) map[string][]string {
	_, span := tracer.Start(ctx, "accesscontrol.GroupScopesByActionContext", trace.WithAttributes(
		attribute.Int("permissions_count", len(permissions)),
	))
	defer span.End()

	m := make(map[string][]string)
	for i := range permissions {
		m[permissions[i].Action] = append(m[permissions[i].Action], permissions[i].Scope)
	}
	return m
}
```

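To see what `permissions_count` would record, it can help to run the grouping loop on a small input. The sketch below drops the span so that it runs with the standard library only. The `Permission` type is reduced to the two fields the loop reads, and the action and scope strings are made-up sample values.

```go
package main

import "fmt"

// Permission holds only the two fields the grouping loop reads.
type Permission struct {
	Action string
	Scope  string
}

// groupScopesByAction is the same loop as in the instrumented function,
// without the span.
func groupScopesByAction(permissions []Permission) map[string][]string {
	m := make(map[string][]string)
	for i := range permissions {
		m[permissions[i].Action] = append(m[permissions[i].Action], permissions[i].Scope)
	}
	return m
}

func main() {
	perms := []Permission{
		{Action: "dashboards:read", Scope: "dashboards:uid:a"},
		{Action: "dashboards:read", Scope: "dashboards:uid:b"},
		{Action: "dashboards:write", Scope: "dashboards:uid:a"},
	}
	grouped := groupScopesByAction(perms)

	fmt.Println(len(perms))                  // 3, the value permissions_count would hold
	fmt.Println(grouped["dashboards:read"])  // [dashboards:uid:a dashboards:uid:b]
	fmt.Println(grouped["dashboards:write"]) // [dashboards:uid:a]
}
```

Because the loop is linear in the number of permissions, the attribute lets you compare a span's duration against its input size when investigating a slow trace.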
Complete example

```go
import (
@@ -169,23 +280,12 @@ import (
	"go.opentelemetry.io/otel/trace"
)

var tracer = otel.Tracer("github.com/grafana/grafana/pkg/services/myservice")

func (s *MyService) Hello(ctx context.Context, name string) (string, error) {
	// Use the package-level tracer and name the span <package>.<function>.
	ctx, span := tracer.Start(ctx, "myservice.Hello", trace.WithAttributes(
		attribute.String("my_attribute", "val"),
	))
	defer span.End()

	// Add some event to show Events usage
@@ -215,7 +315,7 @@ func (s *MyService) Hello(ctx context.Context, name string) (string, error) {

```

### Naming conventions for spans

Span names should follow the [guidelines from OpenTelemetry](https://opentelemetry.io/docs/reference/specification/trace/api/#span).