Skip to content

cgroups: add basic cgroups tracking and make it part of the testing framework #471

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 27 commits into from
Nov 15, 2022

Conversation

tixxdz
Copy link
Member

@tixxdz tixxdz commented Oct 10, 2022

This PR introduce basic Cgroups tracking. It handles first task of issue: #477

The main functionality is cgroup_mkdir and cgroup_rmdir where we register our bpf programs and expect events to be sent to user space. Note that these programs are not loaded into main Tetragon, we are babysitting them with tests for now, so no new functionality is introduced.

  • The patches prefixed with bpf are bpf related parts.
  • The following patches are user space handlers for the cgroup events.
  • Then we have helpers to make tests easy, followed with tests including bpf unit tests.

@tixxdz tixxdz requested a review from a team as a code owner October 10, 2022 13:24
@tixxdz tixxdz requested a review from tpapagian October 10, 2022 13:24
@tixxdz tixxdz marked this pull request as draft October 10, 2022 13:24
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch 2 times, most recently from 3ec407d to d9d68fb Compare October 12, 2022 10:49
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch 2 times, most recently from 6a34e45 to fc4c717 Compare October 13, 2022 14:43
@tixxdz tixxdz marked this pull request as ready for review October 13, 2022 17:23
@tixxdz tixxdz requested a review from jrfastab October 13, 2022 17:24
@tixxdz tixxdz changed the title cgroups bpf cgroups: add basic cgroups tracking and make it part of to the testing framework Oct 13, 2022
@tixxdz tixxdz changed the title cgroups: add basic cgroups tracking and make it part of to the testing framework cgroups: add basic cgroups tracking and make it part of the testing framework Oct 13, 2022
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch 2 times, most recently from 36ae391 to 7cd47fb Compare October 14, 2022 10:46
@jrfastab
Copy link
Contributor

overall looks fine. I think you went a bit too far breaking this into patches. Putting all the BPF progs in one patch would have been fine I don't see much value in splitting all those apart like that. But, only real critique would be to pull the string out of the tracking logic so we don't store strings and then use an id->name from user space. But, consider it a suggestion.

@tixxdz
Copy link
Member Author

tixxdz commented Oct 18, 2022

overall looks fine. I think you went a bit too far breaking this into patches. Putting all the BPF progs in one patch would have been fine I don't see much value in splitting all those apart like that. But, only real critique would be to pull the string out of the tracking logic so we don't store strings and then use an id->name from user space. But, consider it a suggestion.

Thank you for the review, for splitting things up yeh, we had already several iterations where mkdir changed depending on cgroup versions and it was improved a lot, that's why on its own (helped me bisect bugs during dev), will wait for @kkourt review then put them together in that case.

Update: so after splitting this I got a bit confused... after discussion with @kkourt we agreed to follow up with current design and we can revisit this later. Reference: #471 (comment)

In next patches we abstract how we look up for cgroup names and container IDs anyway, so having it in user space or bpf can be considered a detail...

@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch from 7cd47fb to 15c66fc Compare October 21, 2022 11:11
Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Most of the things look good to me.

It was not an easy PR to review though because many patches introduced something that was either used or tested many patches later. Also, I had to frequently check #225 to figure out why something exists.

I have some comments, but most of them are about improving small aspects of the code (e.g., adding comments).

Add MSG_OP_CGROUP and its sub operations. These operations will be
forwarded to the ring buffer for logging and debugging purpose, except
for MSG_OP_CGROUP_ATTACH_TASK that is used to discover cgroup config.

- MSG_OP_CGROUP_MKDIR: is when cgroup is created, we send event for debugging.
- MSG_OP_CGROUP_RMDIR: is when cgroup is removed from fs, used for debugging.
- MSG_OP_CGROUP_RELEASE: is when cgroup is released, used for debugging.
- MSG_OP_CGROUP_ATTACH_TASK: is when a task is migrated to a cgroup, used to
  migrate Tetragon so it discover current cgroup environment.

Signed-off-by: Djalal Harouni <[email protected]>
This is a preparation patch that adds related cgroup structures and bpf
maps:

- cgroup_state enum: to track cgroups states.
  cgroup is being created (NEW), then when a cgroup is started
  it usually means a container process is running now (RUNNING),
  or maybe cgroup info was obtained from proc (RUNNING_PROC)

- cgroup_tracking_value: is the necessary data that is used to track
  cgroups information. Key is the Cgroup ID.

- msg_cgroup_event: is the cgroup event that is sent from bpf to user space about
  cgroup operations. Right now it will contain and handle only cgroup_attach_task
  tracepoint event that is used to detect Tetragon cgroup configurations.

- tg_cgrps_tracking_map is the map where we track cgroups IDs that are under
  or before a cgroup level that is Tetragon cgroup level. This usually include
  containers cgroup levels. In other words it will track cgroups that are between
  cgroup level 1 (first) -> Tetragon cgroup level.

- tg_cgrps_tracking_heap: is a heap used to construct dynamically
  cgroup_tracking_value structures.

- tg_cgrps_msg_heap: is a heap used to construct dynamically
  msg_cgroup_event structures.

Signed-off-by: Djalal Harouni <[email protected]>
tixxdz added 16 commits October 31, 2022 10:58
Add cgroup message definition that will be sent from bpf side.

Signed-off-by: Djalal Harouni <[email protected]>
Add Cgroup operations and message definitions that are sent from BPF side.

The events are not exported to JSON, right now we only use them for
debugging and logging.

In future the different Cgroup states will allow to identify exactly
the state of containers if new, pre-running mode, running
(exec container entry point), etc.

While we are at it share the CgroupNameFromCStr() helper to convert
cgroup names from BPF side to Golang strings.

Signed-off-by: Djalal Harouni <[email protected]>
Add and register cgroup events handler.

Signed-off-by: Djalal Harouni <[email protected]>
Add cgrouptrackmap package that reads tracked cgroups from the
`tg_cgrps_tracking_map`.

The tracked cgroups information is pushed from bpf.

Signed-off-by: Djalal Harouni <[email protected]>
Add ReadTgRuntimeConf() to read Tetragon runtime configuration from
`tg_conf_map` bpf map.

Signed-off-by: Djalal Harouni <[email protected]>
Allow to pass custom BPF maps root directory to UpdateRuntimeConf()
this way we can use it in testing, where tests have their own location
for pinning bpf programs and maps.

Signed-off-by: Djalal Harouni <[email protected]>
This is needed by bpf unit tests to work on default cgroupfs.

Signed-off-by: Djalal Harouni <[email protected]>
Add simple Cgroup Operation and state tests that will help to assert
their operation code, description and the different states.

Signed-off-by: Djalal Harouni <[email protected]>
This test checks all Cgroup structs of processapi pkg against
the alignchecker.

Signed-off-by: Djalal Harouni <[email protected]>
…cker

This asserts that CgrpTrackingValue struct is aligned with
cgroup_tracking_value.

Signed-off-by: Djalal Harouni <[email protected]>
Assert that we return the corresponding copied data.

Signed-off-by: Djalal Harouni <[email protected]>
This allows to load and test bpf cgroups programs.

Later we will move these bpf cgroups programs to be part of the
base sensor; until then let's babysit them here.

Signed-off-by: Djalal Harouni <[email protected]>
Test if runtime configuration is properly saved inside the
`tg_conf_map` BPF map.

Signed-off-by: Djalal Harouni <[email protected]>
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch from 15c66fc to 0e46ec8 Compare October 31, 2022 13:26
Add TestCgroupNoEvents() to ensure that we do not get BPF cgroups
events.

Signed-off-by: Djalal Harouni <[email protected]>
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch from 0e46ec8 to 99d20d3 Compare October 31, 2022 14:11
This test ensure that we get bpf events when cgroup mkdir and rmdir
happen, also it includes extended logic to track if the cgroup data
is correct inside bpf map and if it was removed from the same cgroup
tracking map after cgroup_rmmdir.

Signed-off-by: Djalal Harouni <[email protected]>
@tixxdz tixxdz force-pushed the pr/tixxdz/cgroup-bpf branch from 99d20d3 to 84fc957 Compare October 31, 2022 14:14
@tixxdz
Copy link
Member Author

tixxdz commented Oct 31, 2022

Thanks!

Most of the things look good to me.

It was not an easy PR to review though because many patches introduced something that was either used or tested many patches later. Also, I had to frequently check #225 to figure out why something exists.

I have some comments, but most of them are about improving small aspects of the code (e.g., adding comments).

Thanks @kkourt for the review, all comments handled!

Copy link
Contributor

@kkourt kkourt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! 🚢 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants