-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow mutating queue name in StatefulSet Webhook. #3520
base: main
Are you sure you want to change the base?
Allow mutating queue name in StatefulSet Webhook. #3520
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
/hold IIRC for Jobs we start such a Job (but please double-check and confirm). I synced with @mbobrovskyi that this is to align the behavior for Deployment, but another option is to simply reject such Deployments if they are not supported anyway. I think it deserves e2e test. |
I'd also like to understand how it interacts with the namespaceSelector on the pod integration. In the Pod webhook Default method, if the namespaceSelector doesn't match then we never get to the code that consults We don't support namespaceSelectors to modify What is the intended semantics for a StatefulSet or Deployment that is deployed in a namespace that doesn't match the |
Yes, this is WAI. The intention was to have a mechanism to exclude pods (like static pods or DeamonSet pods) in kube-system and kueue-system. We made the mechanism more generic (to exclude arbitrary namespaces).
Right, we don't do it for all other integrations. However, I think Deployments and StatefulSets need to be the other cases, first Deployments are used in kube-system and kueue-system so we better don't touch them. Second, the support is based on the PodGroup integration and so we inherit the lookup into
IIUC this means basically "for Deployments and StatefulSets in the kube-system or kueue-system". I think we should not manage them - no workload should be created. Since Deployments and StatefulSets are based on PodGroup integration this should happen "for free". Let me know if this matches your expectations and understanding. cc @mwielgus |
It honestly feels a bit like our implementation is leaking through to the API. In particular, treating StatefulSets one way and Jobs another wrt manageJobsWithoutQueueName. I think it could be less surprising / easier to explain if the boolean |
Yeah, I see the point - so that it is not clear why StatefulSet or Deployment pods are controlled by
You mean "replaced"? Or something like "restricted" - so that we only manage workloads matching the namespaceSelector?
I would be in favor of that. The original intention of podOptions.namespaceSelector was to exclude "kube-system" and "kueue-system" from pods. Back then we didn't foresee the need to exclude managing for Jobs or other supported CRDs. However, as we now support Deployments it makes also sense to exclude "kube-system" and "kueue-system". Luckily this is for free by using Pod integration, but as you say it means leaking implementation details. Let me also cc @mwielgus and @tenzen-y for their opinions, but +1 from me to decouple The remaining question from me: do we support both places, or we validate only one is set? We could consider supporting both places for v1beta1 and depracate the one in podOptions, but it would be good to have a KEP for that. Are you interested in driving this? |
restricted is a better word :). Yes, I'd propose that we do a uniform filtering by namespaceSelector for all integrations when manageJobsWithoutQueueName is true. I'll give people some time to comment, but if there is interest in exploring this I'd be happy to kick off a KEP and drive it. |
f9aa14c
to
752d4dc
Compare
43a4d26
to
3d997ec
Compare
3d997ec
to
534b8aa
Compare
/reopen |
a88469d
to
c0cd292
Compare
c0cd292
to
6bd2e16
Compare
b643b05
to
73132a5
Compare
151aa53
to
f146909
Compare
f146909
to
c457cb2
Compare
c457cb2
to
27b57c7
Compare
27b57c7
to
1489600
Compare
/unhold |
/cc @mimowo |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR seems to be changing much more than needed. Let me hold until the scope and purpose of the PR is clarified.
/hold
@@ -39,7 +39,7 @@ const ( | |||
func init() { | |||
utilruntime.Must(jobframework.RegisterIntegration(FrameworkName, jobframework.IntegrationCallbacks{ | |||
SetupIndexes: SetupIndexes, | |||
NewReconciler: NewReconciler, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this changed? Please don't if not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes, a Pod is updated, but the StatefulSet isn't aware of it, causing the reconcile process to not work as expected (e.g., the finalizer isn't removed). Thats why it is better to reconcile Pod instead of StatefulSet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this case - can you elaborate? For example in the core k8s Job we also have finalizers on Job pods, but the reconciler is at the Job level, reference. I would like to first document such problematic scenarios in some form of tests and change the implementation in a dedicated PR (if needed).
ss.Spec.Template.Labels[constants.QueueLabel] = queueName | ||
ss.Spec.Template.Labels[pod.GroupNameLabel] = GetWorkloadName(ss.Name) | ||
groupName, err := GetWorkloadName(obj.(*appsv1.StatefulSet)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change does not seem releated to changing queue-name. Please revert if not necessary.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Allow mutating queue name in StatefulSet Webhook.
Which issue(s) this PR fixes:
Fixes #3279
Special notes for your reviewer:
Does this PR introduce a user-facing change?