fix: randomize lxc container names #651

Draft
wants to merge 1 commit into
base: main

mikemccracken
Contributor

What type of PR is this?

bug

Which issue does this PR fix:

The problem occurs when concurrent stacker runs on the same system build containers with the same name, each inside a mount namespace, and the path given as the roots dir is the same but the volume actually mounted there is different.
In that case both stackers are able to acquire the file lock at $rootdir/.lock, and both go ahead and start containers named $name. The two containers then race to set up the lxc control socket, which is named after the container name and the rootfs path, both of which are the same here.

What does this PR do / Why do we need it:

The fix is to add some randomness to the lxc container name, which ensures that the socket won't clash. This should not affect other uses of the image name, which will still use the un-randomized name.
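
To illustrate the idea, here is a minimal sketch in Go, not the code from this PR. The names `uniqueContainerName` and `socketKey` are hypothetical, and `socketKey` is only a stand-in for "something derived from the rootfs path and the container name"; it does not reproduce lxc's real socket naming. The sketch assumes a short random hex suffix is enough to make a clash unlikely.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueContainerName returns base with a random 8-character hex suffix.
// Hypothetical helper: the suffix length and separator are assumptions.
func uniqueContainerName(base string) (string, error) {
	suffix := make([]byte, 4)
	if _, err := rand.Read(suffix); err != nil {
		return "", fmt.Errorf("generating random suffix: %w", err)
	}
	return base + "-" + hex.EncodeToString(suffix), nil
}

// socketKey is a stand-in for a socket name derived from the rootfs path
// and the container name. It is NOT lxc's actual naming scheme.
func socketKey(rootsDir, name string) string {
	return rootsDir + "/" + name
}

func main() {
	// Two stacker runs see the same path and the same image name,
	// so the un-randomized keys are identical and the sockets clash.
	fmt.Println(socketKey("/roots", "build") == socketKey("/roots", "build"))

	// With a random suffix per run, the keys differ (barring an
	// unlikely collision), while "build" stays the image name.
	a, err := uniqueContainerName("build")
	if err != nil {
		panic(err)
	}
	b, err := uniqueContainerName("build")
	if err != nil {
		panic(err)
	}
	fmt.Println(socketKey("/roots", a), socketKey("/roots", b))
}
```

Only the name handed to lxc carries the suffix in this sketch; anything keyed on the image name would keep using the plain `base` value.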

If an issue # is not available please add repro steps and logs showing the issue:

#645

Testing done on this change:

manual only, CI pending

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@mikemccracken mikemccracken marked this pull request as draft November 7, 2024 04:59
@mikemccracken
Contributor Author

Draft because it might need work if it breaks CI. The test suite is not reliable on my system right now, so I am just using the GitHub runners.

In situations where concurrent stacker runs are happening on a system,
and they are building containers with the same name,
and they are being done inside a mount namespace,
and the path name given as the roots dir is the same,
but the actual mounted volume is different,

then both stackers will be able to acquire the file lock at
$rootdir/.lock, and will go ahead and start containers named $name,
which will then race to set up the lxc control socket, which is named
after the container name and the rootfs path, which are both the same
here.

The fix is to add some randomness to the lxc container name, which
ensures that the socket won't clash. This should not affect other uses
of the image name, which will still use the un-randomized name.

Signed-off-by: Michael McCracken <mikmccra@cisco.com>
@mikemccracken mikemccracken force-pushed the 2024.11.01/main/uniquify-container-names branch from 669c12a to 74db868 Compare November 7, 2024 19:11
@mikemccracken
Contributor Author

The CI failures are real: there is now a case where a build creates a new layer when it shouldn't. empty-layer.bats checks for this, and that's what's failing.

However, it isn't obvious to me what exactly is going wrong. I spent half a day looking at it before Thanksgiving without reaching a conclusion. Will revisit later.

Maybe the Right Way is to have lxc include the mount ns id in its socket naming code?
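
For reference, a rough sketch of where such an id could come from. On Linux, `/proc/self/ns/mnt` is a symlink whose target looks like `mnt:[4026531840]`, and the number identifies the mount namespace. The sketch is in Go for illustration only (lxc itself is C), and the helper names `parseMntNsLink` and `currentMntNsID` are hypothetical. How lxc would fold the id into its socket name is not shown.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseMntNsLink extracts the namespace id from a link target
// of the form "mnt:[4026531840]".
func parseMntNsLink(target string) (uint64, error) {
	if !strings.HasPrefix(target, "mnt:[") || !strings.HasSuffix(target, "]") {
		return 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	id := strings.TrimSuffix(strings.TrimPrefix(target, "mnt:["), "]")
	return strconv.ParseUint(id, 10, 64)
}

// currentMntNsID reads the mount namespace id of the calling process.
// Linux only: it relies on the /proc/self/ns/mnt symlink.
func currentMntNsID() (uint64, error) {
	target, err := os.Readlink("/proc/self/ns/mnt")
	if err != nil {
		return 0, err
	}
	return parseMntNsLink(target)
}

func main() {
	id, err := currentMntNsID()
	if err != nil {
		fmt.Println("could not read mount namespace id:", err)
		return
	}
	// Two processes in different mount namespaces would print different ids,
	// even when the roots dir path and container name are identical.
	fmt.Println("mount namespace id:", id)
}
```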
