Skip to content

program/loader: properly log verifier errors #504

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 24, 2022

Conversation

willfindlay
Copy link
Contributor

Recent changes to cilium/ebpf have changed the way verifier logs are handled such that they are truncated to a single line by default and can be expanded with a "%+v" formatting directive. An unfortunate consequence of this was that our current implementation would always truncate verifier log dumps to a single line because the fmt.Errorf wrapper was implicitly using a "%s" formatting directive to print the root cause.

To rectify this issue, use the errors.As() helper to cast the verifier error down to an ebpf.VerifierError and use the entire verifier error in the call to slimVerifierError. This enables us to pretty print the first and last 30 lines as before.

While fixing this bug, I realized it would probably be a reasonable idea to support higher verbosity levels that would dump the entire verifier logs. While this wouldn't be useful in many cases, one could imagine specific use cases coming up where just the truncated dump may not be sufficient to debug an issue.

Finally, let's take advantage of the new default verifier log behaviour of cilium/ebpf to append a one line description of the problem to program load error messages. Example output without the --verbose flag looks like the following:

time="2022-10-24T17:45:51Z" level=fatal msg="Failed to start tetragon" error="failed prog bpf/objs/bpf_execve_event_v53.o kern_version 332544 LoadKprobeProgram: opening collection 'bpf/objs/bpf_execve_event_v53.o' failed: program event_execve: load program: permission denied: invalid access to map value, value_size=1872 off=327960 size=4: R2 min value is outside of the allowed memory range (520 line(s) omitted)"

Signed-off-by: William Findlay [email protected]

@willfindlay willfindlay requested a review from a team as a code owner October 24, 2022 17:54
@willfindlay willfindlay requested a review from tpapagian October 24, 2022 17:54
@olsajiri
Copy link
Contributor

nice, one nit - there are some places that still set it to 5

pkg/sensors/exec/exec_test.go:  option.Config.Verbosity = 5
pkg/sensors/tracing/selectors_test.go:  option.Config.Verbosity = 5

because of the generic name I thought it's used elsewhere for verbose output,
but it seems to be used just the verifier log

@willfindlay
Copy link
Contributor Author

Nice let's just get rid of those instances then @olsajiri. Any objections?

Recent changes to cilium/ebpf have changed the way verifier logs are handled such that
they are truncated to a single line by default and can be expanded with a `"%+v"`
formatting directive. An unfortunate consequence of this was that our current implementation
would always truncate verifier log dumps to a single line because the fmt.Errorf wrapper was
implicitly using a `"%s"` formatting directive to print the root cause.

To rectify this issue, use the `errors.As()` helper to cast the verifier error down to an
ebpf.VerifierError and use the entire verifier error in the call to `slimVerifierError`.
This enables us to pretty print the first and last 30 lines as before.

While fixing this bug, I realized it would probably be a reasonable idea to support higher
verbosity levels that would dump the entire verifier logs. While this wouldn't be useful
in many cases, one could imagine specific use cases coming up where just the truncated
dump may not be sufficient to debug an issue.

Finally, let's take advantage of the new default verifier log behaviour of cilium/ebpf to
append a one line description of the problem to program load error messages. Example output without
the --verbose flag looks like the following:

    time="2022-10-24T17:45:51Z" level=fatal msg="Failed to start tetragon" error="failed prog bpf/objs/bpf_execve_event_v53.o kern_version 332544 LoadKprobeProgram: opening collection 'bpf/objs/bpf_execve_event_v53.o' failed: program event_execve: load program: permission denied: invalid access to map value, value_size=1872 off=327960 size=4: R2 min value is outside of the allowed memory range (520 line(s) omitted)"

Signed-off-by: William Findlay <[email protected]>
@willfindlay willfindlay force-pushed the pr/willfindlay/verifier-log-fixes branch from 5df5262 to 6d1d312 Compare October 24, 2022 19:09
@willfindlay willfindlay merged commit 0ee605a into main Oct 24, 2022
@willfindlay willfindlay deleted the pr/willfindlay/verifier-log-fixes branch October 24, 2022 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants