Description
Hi All,
I followed the steps outlined in developing.md to build the kernel-collector and reducer binaries. After running both components, I see the following errors in the reducer's console logs.
Error:
2025-02-20 10:51:41.436602+05:30 debug [p:1647882 t:1647882] setting up breakpad...
2025-02-20 10:51:41.436616+05:30 debug [p:1647882 t:1647882] setting up breakpad...
2025-02-20 10:51:41.436769+05:30 info [p:1647882 t:1647882] Starting OpenTelemetry eBPF Reducer version 0.10.2 (release)
2025-02-20 10:51:41.437030+05:30 info [p:1647882 t:1647882] Disabling metric tcp.rtt.num_measurements
2025-02-20 10:51:41.437031+05:30 info [p:1647882 t:1647882] Disabling metric ebpf_net.all
2025-02-20 10:51:41.437032+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.collector_health
2025-02-20 10:51:41.437033+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.bpf_log
2025-02-20 10:51:41.437034+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.bytes_failed
2025-02-20 10:51:41.437035+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.bytes_sent
2025-02-20 10:51:41.437035+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.metrics_failed
2025-02-20 10:51:41.437036+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.metrics_sent
2025-02-20 10:51:41.437037+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.requests_failed
2025-02-20 10:51:41.437038+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.requests_sent
2025-02-20 10:51:41.437038+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.otlp_grpc.unknown_response_tags
2025-02-20 10:51:41.437039+05:30 info [p:1647882 t:1647882] Enabling metric ebpf_net.up
--------METADATA COMPLETE---------
Agent info:
Version: 0.10.2
OS: Linux (unknown)
Kernel: 5.15.151-0515151-generic
CPU Cores: 6
Hostname: Team1
Collector: kernel
Kernel Headers Source: unknown
2025-02-20 10:51:45.110366+05:30 info [p:1647882 t:1647889] agent version: 0.10.2
Entrypoint Error:
Role: (unknown)
AZ: (unknown)
Id: Team1
Instance: (unknown)
Agent: 7139553459298391204
Overrides:
namespace:
cluster:
service:
host:
zone:
IPs:
Metadata Report Complete.
2025-02-20 10:51:53.512430+05:30 debug [p:1647882 t:1647890] (612) error processing data from unknown at '(unknown)': Unknown error -2
2025-02-20 10:52:01.080022+05:30 debug [p:1647882 t:1647889] --------PROCESS STEADY STATE---------
2025-02-20 10:52:01.603038+05:30 debug [p:1647882 t:1647889] --------SOCKET STEADY STATE---------
2025-02-20 10:52:08.519373+05:30 debug [p:1647882 t:1647890] (612) error processing data from unknown at '(unknown)': Unknown error -2
2025-02-19 22:58:18.110453+05:30 error [p:1048775 t:1048783] Logging core failed to publish internal metrics writer stats
2025-02-19 22:58:23.073523+05:30 debug [p:1048775 t:1048783] (612) error processing data from unknown at '(unknown)': Unknown error -2
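The repeated message is "(612) error processing data from unknown at '(unknown)': Unknown error -2". Assuming (and this is just a guess on my part, not confirmed from the code) that the -2 is a negated errno, it would correspond to ENOENT; a quick way to decode it:

  python3 -c 'import os; print(os.strerror(2))'   # prints "No such file or directory" if -2 is -ENOENT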
Prometheus config file:
- job_name: 'opentelemetry-ebpf-reducer'
  static_configs:
    - targets: ['Host_IP:7010']
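To confirm the reducer is actually serving Prometheus-format metrics at that target (replace Host_IP with the real host address; /metrics is Prometheus' default scrape path), the endpoint can be queried directly:

  curl -s http://Host_IP:7010/metrics | head -n 20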
OTel-Collector Config file:
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: "opentelemetry-ebpf-reducer"
          scrape_interval: 10s
          static_configs:
            - targets: ["Host_IP:7010"]

exporters:
  debug:
    verbosity: detailed # Replaces deprecated logging exporter
  prometheus:
    endpoint: "0.0.0.0:9090" # Expose metrics to Prometheus
  otlp:
    endpoint: "Host_IP:4317" # Optional: forward to an OTLP backend
    tls:
      insecure: true # Disable TLS for local testing

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      exporters: [prometheus, debug]
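One thing to note: the prometheus exporter binds 0.0.0.0:9090 inside the container, and the docker run command below publishes no ports, so that endpoint is not reachable from the host. With the port published (see the sketch after the reproduction steps, which maps it to host port 9091), it could be checked with:

  curl -s http://localhost:9091/metrics | head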
Steps to Reproduce:
- Follow the steps in developing.md to build the respective components.
- After building the binaries, exit the container and run the following commands from the build/collector/kernel and build/reducer directories:
  sudo ./kernel-collector --log-console --debug
  sudo ./reducer --log-console --prom=Host_IP:7010 --debug
  sudo ./prometheus --config.file=./prometheus.yml
  docker run -d --name otel-collector -v $(pwd)/otel-collector-config.yaml:/etc/otel-collector-config.yml otel/opentelemetry-collector --config /etc/otel-collector-config.yml
- The reducer listens for incoming connections from the kernel-collector at 127.0.0.1:8000 and exposes metrics in Prometheus format at Host_IP:7010. The kernel-collector, the Prometheus scrape, and the otel-collector are all running without errors. (A note on the docker run command follows this list.)
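A note on the docker run command above: it uses the default bridge network and publishes no ports, so the collector's prometheus exporter (0.0.0.0:9090 in the config) cannot be scraped from the host, and Host_IP must be routable from inside the container for the scrape and the OTLP export to work. A variant that publishes the exporter port (a sketch; host port 9091 is chosen here to avoid colliding with the Prometheus server's default 9090):

  docker run -d --name otel-collector \
    -p 9091:9090 \
    -v $(pwd)/otel-collector-config.yaml:/etc/otel-collector-config.yml \
    otel/opentelemetry-collector --config /etc/otel-collector-config.yml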
Kernel-collector logs:
sudo ./kernel-collector --log-console --debug
2025-02-19 15:53:29.352371+05:30 debug [p:1014053 t:1014053] setting up breakpad...
2025-02-19 15:53:29.352398+05:30 debug [p:1014053 t:1014053] setting up breakpad...
2025-02-19 15:53:29.352546+05:30 info [p:1014053 t:1014053] Starting Kernel Collector version 0.10.2 (release)
2025-02-19 15:53:29.352551+05:30 info [p:1014053 t:1014053] Kernel Collector agent ID is FAID6BZZXOHEGIB5ZHD58NNX7A45LP8FMDUJ
2025-02-19 15:53:29.352554+05:30 info [p:1014053 t:1014053] Running on:
sysname: Linux
nodename: Team1
release: 5.15.151-0515151-generic
version: #202403061549 SMP Wed Mar 6 17:15:47 UTC 2024
machine: x86_64
2025-02-19 15:53:29.352581+05:30 info [p:1014053 t:1014053] HTTP Metrics: Enabled
2025-02-19 15:53:29.352583+05:30 info [p:1014053 t:1014053] Socket stats interval in seconds: 10
2025-02-19 15:53:29.352583+05:30 info [p:1014053 t:1014053] Userland TCP: Disabled
2025-02-19 15:53:30.358085+05:30 debug [p:1014053 t:1014053] Unable to fetch AWS metadata: no metadata returned by AWS
2025-02-19 15:53:30.612887+05:30 debug [p:1014053 t:1014053] Unable to fetch GCP metadata: error while fetching Google Cloud Platform instance metadata: Could not resolve host: metadata.google.internal
2025-02-19 15:53:30.612925+05:30 debug [p:1014053 t:1014053] Unable to fetch Nomad metadata - environment variables not found
2025-02-19 15:53:30.612948+05:30 info [p:1014053 t:1014053] Kernel Collector version 0.10.2 (release) started on host Ads-Team1
2025-02-19 15:53:30.660533+05:30 debug [p:1014053 t:1014053] intake record file: ``
2025-02-19 15:53:30.660646+05:30 debug [p:1014053 t:1014053] starting event loop...
2025-02-19 15:53:37.511782+05:30 info [p:1014053 t:1014053] connecting to 127.0.0.1:8000 (binary)...
2025-02-19 15:53:37.511804+05:30 debug [p:1014053 t:1014053] TCPChannel::connect: Connecting to intake @ 127.0.0.1:8000
2025-02-19 15:53:46.099736+05:30 info [p:1014053 t:1014053] eBPF program successfully compiled
2025-02-19 15:53:46.251889+05:30 info [p:1014053 t:1014053] Kernel symbols list loaded
2025-02-19 15:53:46.251910+05:30 debug [p:1014053 t:1014053] Kernel function not found: kill_css
2025-02-19 15:53:47.716582+05:30 debug [p:1014053 t:1014053] Kernel function not found: ctnetlink_dump_tuples
2025-02-19 15:53:49.099051+05:30 debug [p:1014053 t:1014053] Adding TCP processor probes
2025-02-19 15:53:49.411677+05:30 info [p:1014053 t:1014053] Agent connected successfully. Telemetry is flowing!
2025-02-19 15:53:49.411726+05:30 debug [p:1014053 t:1014053] KernelCollectorRestarter: startup completed
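The collector reports the intake connection as established; as a cross-check, the TCP session to the reducer on port 8000 can be confirmed from the host:

  sudo ss -tnp | grep ':8000'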
Docker logs:
docker logs -f otel-collector
2025-02-20T05:46:41.689Z info [email protected]/service.go:186 Setting up own telemetry...
2025-02-20T05:46:41.689Z info builders/builders.go:26 Development component. May change in the future. {"kind": "exporter", "data_type": "metrics", "name": "debug"}
2025-02-20T05:46:41.692Z info [email protected]/service.go:252 Starting otelcol... {"Version": "0.119.0", "NumCPU": 6}
2025-02-20T05:46:41.692Z info extensions/extensions.go:39 Starting extensions...
2025-02-20T05:46:41.693Z info [email protected]/metrics_receiver.go:118 Starting discovery manager{"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2025-02-20T05:46:41.696Z info targetallocator/manager.go:184 Scrape job added {"kind": "receiver", "name": "prometheus", "data_type": "metrics", "jobName": "opentelemetry-ebpf-reducer"}
2025-02-20T05:46:41.703Z info [email protected]/service.go:275 Everything is ready. Begin running and processing data.
2025-02-20T05:46:41.706Z info [email protected]/metrics_receiver.go:187 Starting scrape manager {"kind": "receiver", "name": "prometheus", "data_type": "metrics"}
2025-02-20T05:46:53.640Z info Metrics {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 5, "data points": 5}
2025-02-20T05:46:53.640Z info ResourceMetrics #0
Resource SchemaURL:
Resource attributes:
-> service.name: Str(opentelemetry-ebpf-reducer)
-> net.host.name: Str(Host_IP)
-> server.address: Str(Host_IP)
-> service.instance.id: Str(Host_IP:7010)
-> net.host.port: Str(7010)
-> http.scheme: Str(http)
-> server.port: Str(7010)
-> url.scheme: Str(http)
ScopeMetrics #0
ScopeMetrics SchemaURL:
InstrumentationScope github.com/open-telemetry/opentelemetry-collector-contrib/receiver/prometheusreceiver 0.119.0
Metric #0
Descriptor:
-> Name: up
-> Description: The scraping was successful
-> Unit:
-> DataType: Gauge
NumberDataPoints #0
StartTimestamp: 1970-01-01 00:00:00 +0000 UTC
Timestamp: 2025-02-20 05:46:53.638 +0000 UTC
Value: 1.000000
Example series shown in the Prometheus graph UI:
http_active_sockets{az_equal="true", dest_availability_zone="(unknown)", dest_environment="(unknown)", dest_namespace_name="(unknown)", dest_process_name="process-exporte", dest_resolution_type="PROCESS", dest_workload_name="process_exporter", instance="Host_IP:7010", job="opentelemetry-ebpf-reducer", sf_product="network-explorer", source_availability_zone="(unknown)", source_environment="(no agent)", source_resolution_type="IP", source_workload_name="(unknown)"}
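For reference, the same series can also be pulled from the Prometheus HTTP API (assuming the Prometheus server is running on its default port 9090):

  curl -s 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=http_active_sockets{job="opentelemetry-ebpf-reducer"}'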