ARM64 Amazon Linux 2023 (Kernel 6.12): BPF program failures and PostgreSQL TLS tracing not working #2318

@eibrahimarisoy

Description

On ARM64 nodes running Amazon Linux 2023 with kernel 6.12, two Pixie source connectors fail to initialize and PostgreSQL TLS tracing captures no data. This affects:

  1. PostgreSQL TLS tracing - Unable to trace pgsql queries from Go applications
  2. perf_profiler - CPU profiling/flamegraphs unavailable
  3. proc_exit_tracer - Process exit tracking unavailable

Environment

Component       Version/Details
Pixie Version   v0.14.15+Distribution.623e988.202501242347.1.RELEASE.jenkins
Kernel          6.12.64-87.122.amzn2023.aarch64
Architecture    ARM64 (aarch64)
OS              Amazon Linux 2023
Kubernetes      EKS
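
To check whether another node matches the environment above, the kernel release string and machine type can be compared programmatically. This is a minimal sketch; the helper names `kernel_major_minor` and `matches_report` are hypothetical and only encode the combination listed in this report (aarch64 with a 6.12 kernel).

```python
import re


def kernel_major_minor(release: str) -> tuple[int, int]:
    """Parse a release such as '6.12.64-87.122.amzn2023.aarch64' into (6, 12)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match[1]), int(match[2])


def matches_report(release: str, machine: str) -> bool:
    """True for the combination described in this report: aarch64 on kernel 6.12."""
    return machine == "aarch64" and kernel_major_minor(release) == (6, 12)
```

On a node, `matches_report(platform.release(), platform.machine())` from the standard `platform` module supplies the two inputs.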

Issue 1: PostgreSQL TLS Tracing Not Working

Description

Pixie cannot trace PostgreSQL queries from Go applications connecting to AWS RDS over TLS. The pgsql_events table returns empty results, and connections are classified as kProtocolUnknown instead of kProtocolPGSQL.

Application Details

  • Go application built with CGO_ENABLED=0 (pure Go, uses Go's native TLS)
  • Binary contains debug symbols (with debug_info, not stripped)
  • crypto/tls symbols are present in binary (verified via go tool nm)
  • Connection to AWS RDS PostgreSQL uses TLS/SSL
  • Go Version: 1.24.5

What I've Tried

✅ Removed -ldflags="-w -s" to preserve debug symbols
✅ Verified the binary has debug info: file reports "with debug_info, not stripped"
✅ Verified crypto/tls symbols exist via go tool nm
✅ Confirmed Pixie sees the pods and network traffic
✅ Confirmed RDS traffic flows (visible in px/net_flow_graph)
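
The symbol check in the list above can be scripted against the text output of go tool nm. The sketch below is illustrative only: the two symbol names are an assumption about which crypto/tls functions a Go TLS uprobe would need, and `missing_tls_symbols` is a hypothetical helper, not part of Pixie.

```python
# Assumption: these are the crypto/tls entry points a Go TLS uprobe would
# attach to. They are not taken from this report or from the Pixie source.
ASSUMED_TLS_SYMBOLS = (
    "crypto/tls.(*Conn).Read",
    "crypto/tls.(*Conn).Write",
)


def missing_tls_symbols(nm_output: str, wanted=ASSUMED_TLS_SYMBOLS) -> list[str]:
    """Return the wanted symbols that are absent from `go tool nm` output."""
    present = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Lines look like "<addr> <type> <name>"; undefined symbols omit <addr>.
        if len(fields) >= 2:
            present.add(fields[-1])
    return [symbol for symbol in wanted if symbol not in present]
```

An empty result means every assumed symbol is present, which is consistent with the check reported above.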

Evidence

1. PEM deploys uprobes but captures no PostgreSQL traffic:

I20260203 08:17:30.772593 uprobe_manager.cc:1014] Number of uprobes deployed = 114

2. ConnTracker shows zero PostgreSQL connections:

ConnTracker statistics: kProtocolPGSQL=0 kProtocolUnknown=55 kProtocolHTTP=31

3. Network traffic to RDS is visible but not decoded:

$ px run px/net_flow_graph -- -namespace default
# Shows traffic to ip-10-90-122-48.eu-central-1.compute.internal (RDS endpoint)
# But px/pgsql_data returns empty

4. pgsql_data query returns empty:

$ px run px/pgsql_data -- -start_time '-5m'
Table ID: pgsql_data
  TIME   SOURCE  DESTINATION  REMOTE PORT  REQ  RESP  LATENCY
(empty)
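
For comparing PEM logs across nodes, the ConnTracker statistics line from item 2 can be tallied per protocol. This is a rough sketch: `parse_conn_stats` is a hypothetical helper, and the only input format it assumes is the `kProtocolX=N` pairs shown in the log line above.

```python
import re


def parse_conn_stats(line: str) -> dict[str, int]:
    """Extract kProtocol* counters from a ConnTracker statistics line."""
    pairs = re.findall(r"(kProtocol\w+)=(\d+)", line)
    return {name: int(count) for name, count in pairs}


line = "ConnTracker statistics: kProtocolPGSQL=0 kProtocolUnknown=55 kProtocolHTTP=31"
stats = parse_conn_stats(line)
total = sum(stats.values())
# Share of tracked connections that were not classified as any protocol.
unknown_share = stats["kProtocolUnknown"] / total
print(f"pgsql={stats['kProtocolPGSQL']} unknown={stats['kProtocolUnknown']} of {total}")
```

With the line from this report, 55 of 86 tracked connections are unclassified and none are recognized as PostgreSQL.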

Expected Behavior

According to Pixie documentation:

"Pixie supports tracing of traffic encrypted with the following libraries: Go TLS – standard and boringcrypto. Requires a build with debug information."

PostgreSQL queries should be captured and decoded from Go applications using Go's native TLS library.


Issue 2: perf_profiler and proc_exit_tracer Initialization Failures

Description

The perf_profiler (CPU Profiling) and proc_exit_tracer (Process Exit Tracking) source connectors fail to initialize. BPF programs fail to load with "Invalid argument" errors.

Error Logs

perf_profiler failure:

I20260203 08:17:07.648699 source_connector.cc:35] Initializing source connector: perf_profiler
I20260203 08:17:07.648828 bcc_wrapper.cc:166] Initializing BPF program ...

In file included from src/stirling/source_connectors/perf_profiler/bcc_bpf/profiler.c:26:
./src/stirling/bpf_tools/bcc_bpf/task_struct_utils.h:41:10: warning: call to undeclared function 'div_u64'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
  return div_u64(x, 1000000000ULL / USER_HZ);
         ^

bpf: Failed to load program: Invalid argument
jump out of range from insn 37 to 120
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

W20260203 08:17:08.963372 stirling.cc:416] Source Connector (registry name=perf_profiler) not instantiated, error: Internal : Failed to load sample_call_stack: -22

proc_exit_tracer failure:

I20260203 08:17:08.963450 source_connector.cc:35] Initializing source connector: proc_exit_tracer
I20260203 08:17:08.963554 bcc_wrapper.cc:166] Initializing BPF program ...

In file included from src/stirling/source_connectors/proc_exit/bcc_bpf/proc_exit_trace.c:24:
./src/stirling/bpf_tools/bcc_bpf/task_struct_utils.h:41:10: warning: call to undeclared function 'div_u64'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
  return div_u64(x, 1000000000ULL / USER_HZ);
         ^

bpf: Failed to load program: Invalid argument
jump out of range from insn 6 to 74
processed 0 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

W20260203 08:17:10.286451 stirling.cc:416] Source Connector (registry name=proc_exit_tracer) not instantiated, error: Internal : Failed to load tracepoint__sched__sched_process_exit: -22

Additional BPF compilation errors:

/lib/modules/6.12.64-87.122.amzn2023.aarch64/build/include/linux/kasan-checks.h:24:9: error: use of undeclared identifier 'true'
/lib/modules/6.12.64-87.122.amzn2023.aarch64/build/include/asm-generic/bitops/le.h:21:9: error: use of undeclared identifier 'uintptr_t'
fatal error: too many errors emitted, stopping now [-ferror-limit=]

I20260203 08:27:38.779588 stirling.cc:547] Internal : Could not compile bpftrace script, Clang parse failed: use of undeclared identifier 'true'
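
The two "not instantiated" warnings above can be extracted from a PEM log to list which connectors failed, which BPF program failed to load, and with which errno. The sketch below assumes only the warning format shown in this report; `failed_connectors` is a hypothetical helper name.

```python
import errno
import re

# Matches the stirling.cc warning format quoted in this report.
FAILURE = re.compile(
    r"registry name=(?P<name>\w+)\) not instantiated, "
    r"error: .*?Failed to load (?P<prog>\w+): -(?P<code>\d+)"
)


def failed_connectors(log_text: str) -> list[tuple[str, str, str]]:
    """Return (connector, bpf_program, errno_name) for each load failure."""
    failures = []
    for match in FAILURE.finditer(log_text):
        code = int(match["code"])
        # Translate the numeric code (e.g. 22) to its symbolic name (EINVAL).
        failures.append((match["name"], match["prog"], errno.errorcode.get(code, str(code))))
    return failures
```

Both failures in this report carry -22, which is EINVAL and matches the "Invalid argument" text printed by the loader.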

Impact

Without these source connectors:

  • ❌ No CPU profiling / flamegraphs for performance analysis
  • ❌ No process exit tracking for debugging crashes and terminations
  • ❌ No PostgreSQL query observability for Go applications using TLS

Root Cause Analysis

The errors suggest kernel 6.12 compatibility issues on ARM64:

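  1. BPF load failures - the kernel rejects both programs with "jump out of range from insn X to Y" and reports "processed 0 insns", so the rejection appears to happen before the verifier walks any instruction
  2. Missing kernel header compatibility for ARM64 kernel 6.12
  3. Undeclared functions/identifiers in BPF compilation context:
    • div_u64 is called in task_struct_utils.h without a visible declaration (the kernel declares it in linux/math64.h)
    • true and uintptr_t are undeclared when Clang parses the kernel headers kasan-checks.h and asm-generic/bitops/le.h
  4. Possibly stricter BPF verifier checks on newer kernels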
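
The verifier lines quoted in the error logs can be reduced to the three numbers that matter for this analysis. This is a small sketch with a hypothetical helper, `verifier_summary`; it only reads the text and does not confirm any of the hypotheses listed above.

```python
import re


def verifier_summary(text: str) -> dict:
    """Pull the jump source/target and processed-instruction count from loader output."""
    jump = re.search(r"jump out of range from insn (\d+) to (\d+)", text)
    processed = re.search(r"processed (\d+) insns", text)
    return {
        "jump_from": int(jump[1]) if jump else None,
        "jump_to": int(jump[2]) if jump else None,
        "processed_insns": int(processed[1]) if processed else None,
    }
```

A `processed_insns` value of 0 alongside a reported jump, as in both failures here, is the pattern described in item 1.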

Related Issues


Suggested Fix

Based on similar issues (#2041, #2036), the following changes may be needed:

  1. Update BCC/libbpf for kernel 6.12 compatibility
  2. Add kernel 6.12 headers support for ARM64
  3. Make div_u64 visible in task_struct_utils.h (declare it or include the header that provides it)
  4. Address the undeclared true and uintptr_t identifiers in kernel header includes
  5. Adjust the BPF programs so they load on newer kernels (the "jump out of range" failures)
