Build a Full‑Stack Observability Platform with Grafana LGTM, Go, and OpenTelemetry
This guide walks you through building a complete observability stack: a Go web service exports metrics, traces, and logs; the OpenTelemetry Collector collects them; Grafana Mimir, Loki, and Tempo store them; and a unified Grafana dashboard visualizes everything.
Overview
Observability in cloud‑native systems involves collecting logs, metrics, traces, profiling data, and events, and managing their lifecycle from exposure to visualization.
Sample Go Application
The demo exposes two HTTP endpoints, /v1/books and /v1/books/1. Each request checks Redis first, falls back to MySQL on a miss, and records structured logs, trace spans, and latency. When latency exceeds 200 ms, an exemplar containing the request TraceID is attached to the Prometheus histogram, enabling cross-telemetry correlation.
Repository
git clone https://github.com/grafanafans/prometheus-exemplar.git
cd prometheus-exemplar
Running the Stack
Start all components with Docker Compose:
docker-compose up -d
Compose launches the following components:
A single‑node Grafana Mimir for metric storage.
A single‑node Loki for log storage.
A single‑node Tempo for trace storage.
An Nginx gateway that proxies queries to Mimir, Loki, and Tempo.
The demo Go service together with MySQL and Redis, reachable at http://localhost:8080.
Grafana with pre‑configured data sources and a unified dashboard at http://localhost:3000.
Generating Load
Use wrk to issue traffic against the endpoints:
wrk http://localhost:8080/v1/books
wrk http://localhost:8080/v1/books/1
Metrics Export (Prometheus Go SDK)
The service creates a histogram and records an exemplar when latency > 200 ms. The exemplar carries the TraceID, enabling correlation across telemetry types.
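To make the exemplar concrete, the sketch below shows which bucket a slow observation lands in and roughly how that sample looks in OpenMetrics text form, where the exemplar follows a `#` after the bucket count. It is a simplified illustration using only the standard library: the `method`, `path`, and `code` labels are omitted, and `bucketFor` and `exemplarLine` are hypothetical helpers, not part of the Prometheus SDK.

```go
package main

import (
	"fmt"
	"sort"
)

// buckets copies the upper bounds used by the histogram in this section.
var buckets = []float64{0.05, 0.1, 0.25, 0.5, 1, 2}

// bucketFor returns the "le" label of the smallest bucket that contains v,
// or "+Inf" when v exceeds every configured bound.
func bucketFor(v float64) string {
	i := sort.SearchFloat64s(buckets, v) // first index with buckets[i] >= v
	if i == len(buckets) {
		return "+Inf"
	}
	return fmt.Sprintf("%g", buckets[i])
}

// exemplarLine renders one bucket sample followed by its exemplar:
// the exemplar labels in braces, then the observed value.
func exemplarLine(count uint64, v float64, traceID string) string {
	return fmt.Sprintf(
		`http_durations_histogram_seconds_bucket{le=%q} %d # {traceID=%q} %g`,
		bucketFor(v), count, traceID, v,
	)
}

func main() {
	fmt.Println(exemplarLine(1, 0.21, "abc123"))
	// prints: http_durations_histogram_seconds_bucket{le="0.25"} 1 # {traceID="abc123"} 0.21
}
```

One practical caveat: client_golang exposes exemplars only in the OpenMetrics format, which is switched on through `promhttp.HandlerOpts{EnableOpenMetrics: true}`. Whether the demo's metrics handler is configured this way is not shown here.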
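func Metrics(metricPath string, urlMapping func(string) string) gin.HandlerFunc {
	httpDurationsHistogram := prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "http_durations_histogram_seconds",
		Help:    "HTTP latency distribution",
		Buckets: []float64{0.05, 0.1, 0.25, 0.5, 1, 2},
	}, []string{"method", "path", "code"})
	prometheus.MustRegister(httpDurationsHistogram)

	return func(c *gin.Context) {
		// Collect method, url, status, elapsed. The original elides this step;
		// the lines below are one plausible way to do it with gin (needs "time" and "strconv").
		start := time.Now()
		c.Next()
		method := c.Request.Method
		url := urlMapping(c.Request.URL.Path)
		status := strconv.Itoa(c.Writer.Status())
		elapsed := time.Since(start).Seconds()

		observer := httpDurationsHistogram.WithLabelValues(method, url, status)
		if elapsed > 0.2 {
			// Slow request: ObserveWithExemplar already counts the observation,
			// so Observe must not be called as well or the sample is counted twice.
			observer.(prometheus.ExemplarObserver).ObserveWithExemplar(
				elapsed,
				prometheus.Labels{"traceID": c.GetHeader(api.XRequestID)},
			)
			return
		}
		observer.Observe(elapsed)
	}
}
Trace Export (OTLP HTTP)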
Instrumentation uses the OpenTelemetry Go SDK. Example for a MySQL service:
func (s *MysqlBookService) Show(id string, ctx context.Context) (item *Book, err error) {
	// otel.Tracer() takes no arguments here, so it is presumably the project's own
	// wrapper; the upstream go.opentelemetry.io/otel.Tracer requires a tracer name.
	_, span := otel.Tracer().Start(ctx, "MysqlBookService.Show")
	span.SetAttributes(attribute.String("id", id))
	defer span.End()
	// Simulate variable latency between 0 and 249 ms.
	time.Sleep(time.Duration(rand.Intn(250)) * time.Millisecond)
	err = db.Where(Book{Id: id}).Find(&item).Error
	return
}
The tracer provider sends spans to Tempo via OTLP HTTP:
func SetTracerProvider(name, environment, endpoint string) error {
	serviceName = name
	client := otlptracehttp.NewClient(
		otlptracehttp.WithEndpoint(endpoint),
		otlptracehttp.WithInsecure(),
	)
	exp, err := otlptrace.New(context.Background(), client)
	if err != nil {
		return err
	}
	tp := tracesdk.NewTracerProvider(
		tracesdk.WithBatcher(exp),
		tracesdk.WithResource(resource.NewWithAttributes(
			semconv.SchemaURL,
			semconv.ServiceNameKey.String(serviceName),
			attribute.String("environment", environment),
		)),
	)
	otel.SetTracerProvider(tp)
	return nil
}
Structured Logging
Logs are emitted with go.uber.org/zap to /var/log/app.log. Each request logs the TraceID for later correlation.
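cfg := zap.NewProductionConfig()
// Write to stderr and to the file that the Collector tails.
cfg.OutputPaths = []string{"stderr", "/var/log/app.log"}
logger, err := cfg.Build()
if err != nil {
	panic(err) // for example, the log file cannot be opened
}
// Attach the request's trace ID to every entry written through this logger.
logger = logger.With(zap.String("traceID", ctx.GetHeader(api.XRequestID)))
OpenTelemetry Collector Configuration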
Metrics and Traces Pipeline
receivers:
  otlp:
    protocols:
      grpc:
      http:
  prometheus:
    config:
      scrape_configs:
        - job_name: 'app'
          scrape_interval: 10s
          static_configs:
            - targets: ['app:8080']
exporters:
  otlp:
    endpoint: tempo:4317
    tls:
      insecure: true
  prometheusremotewrite:
    endpoint: http://mimir:8080/api/v1/push
    tls:
      insecure: true
    headers:
      X-Scope-OrgID: demo
processors:
  batch:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp]
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [prometheusremotewrite]
Log Pipeline (Collector Contrib)
receivers:
  filelog:
    include: [/var/log/app.log]
exporters:
  loki:
    endpoint: http://loki:3100/loki/api/v1/push
    tenant_id: demo
    labels:
      attributes:
        log.file.name: "filename"
processors:
  batch:
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [loki]
Visualization
Grafana starts with data sources pointing to Mimir, Loki, and Tempo. The bundled dashboard displays latency histograms, trace lists, and log entries. Correlation is performed by querying the exemplar’s TraceID.
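As an illustration of that correlation step, the sketch below builds a LogQL query that finds the log lines for one trace ID taken from an exemplar. It assumes the `filename` stream label produced by the Loki exporter mapping shown earlier and assumes its value is the file's base name (`app.log`); `logQuery` is a hypothetical helper, not a Grafana or Loki API.

```go
package main

import (
	"fmt"
	"strconv"
)

// logQuery builds a LogQL query that selects one log stream by its "filename"
// label and keeps only the lines that contain the given trace ID.
func logQuery(filename, traceID string) string {
	// strconv.Quote escapes embedded quotes so the query stays well formed.
	return fmt.Sprintf(`{filename=%s} |= %s`, strconv.Quote(filename), strconv.Quote(traceID))
}

func main() {
	fmt.Println(logQuery("app.log", "abc123"))
	// prints: {filename="app.log"} |= "abc123"
}
```

A plain line filter is used rather than a JSON field match, so the query does not depend on how the exporter wraps each log line.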
Architecture diagram:
Grafana dashboard screenshot:
Conclusion
The example demonstrates a complete observability pipeline: a Go service emits metrics (with exemplars), traces, and structured logs; the OpenTelemetry Collector scrapes or receives each type; Grafana Mimir, Tempo, and Loki store the data; and Grafana provides a unified view where the TraceID links metrics, logs, and traces.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
