-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support for kueue to work with VAP on OCP 4.16+ #1480
base: main
Are you sure you want to change the base?
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What is this about ?
Is this actually needed? we use unstructured objects so the current version should work too |
i should revert the uplift for api part, at least was using &admissionregistrationv1.ValidatingAdmissionPolicyBinding which only contain these 2 from 0.30.4 |
). | ||
WatchesGVK( | ||
gvk.ValidatingAdmissionPolicyBinding, | ||
reconciler.Dynamic(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
pkg/cluster/cluster_config.go
Outdated
@@ -95,6 +95,16 @@ func GetDomain(ctx context.Context, c client.Client) (string, error) { | |||
return domain, err | |||
} | |||
|
|||
func GetOCPVersion(ctx context.Context, c client.Client) (semver.Version, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can probably be computed at startup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did some change, moved this call a bit up to the main
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1480 +/- ##
==========================================
- Coverage 19.88% 19.78% -0.11%
==========================================
Files 160 160
Lines 10818 10874 +56
==========================================
Hits 2151 2151
- Misses 8440 8496 +56
Partials 227 227 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks brilliant Wen, thank you!!
Just making sure. The ValidatingAdmissionPolicyBinding
resource should be configurable, i.e., by the Cluster Admin. This resource should have a similar behaviour to how ConfigMaps are handled. - Let me know if this makes sense
i was not aware of this requirement. |
@zdtsw For reference after our async discussion:
|
e4a4d57
to
ce77585
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm tested and works as expected. Thank you for the brilliant work!
|
||
// Add OCP 4.17+ specific manifests if Minor > 16 . | ||
func vapPredicate(context.Context, *odhtypes.ReconciliationRequest) bool { | ||
return cluster.GetClusterInfo().Version.Minor > 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make things a little but more clear, maybe we should have a constant for the minimum version and use SemVer.GTE
var V4_17 = semver.MustParse("4.17")
func vapPredicate(context.Context, *odhtypes.ReconciliationRequest) bool {
return cluster.GetClusterInfo().Version.GTE(V4_17)
}
c2830a2
to
1ea36b9
Compare
tests/e2e/kueue_test.go
Outdated
@@ -164,6 +172,36 @@ func (tc *KueueTestCtx) testOwnerReferences() error { | |||
return nil | |||
} | |||
|
|||
func (tc *KueueTestCtx) validateVAPReady() error { | |||
// if ocp is 4.17+ then VAP and VAPB should be created | |||
if cluster.GetClusterInfo().Version.GTE(semver.MustParse("4.17.0")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check shold probably me moved to the t.Run
block, like:
t.Run("Validate Kueue Dynamically create VAP", func(t *testing.T) {
if !cluster.GetClusterInfo().Version.GTE(semver.MustParse("4.17.0") {
t.Skip("Disabled as requires OpenShift >= 4.17")
return
}
err = kueueCtx.validateVAPReady()
require.NoError(t, err, "Kueue instance is not Ready")
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, doing rebase now, need local test before push the final change
tests/e2e/kueue_test.go
Outdated
vap := &unstructured.Unstructured{} | ||
vap.SetGroupVersionKind(gvk.ValidatingAdmissionPolicy) | ||
|
||
err := tc.testCtx.customClient.Get(tc.testCtx.ctx, keyvap, vap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be better to use gomega's Eventually
so that it won't fail in case the resource takes a little time to become available
tests/e2e/kueue_test.go
Outdated
vapb := &unstructured.Unstructured{} | ||
vapb.SetGroupVersionKind(gvk.ValidatingAdmissionPolicyBinding) | ||
|
||
err = tc.testCtx.customClient.Get(tc.testCtx.ctx, keyvapb, vapb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be better to use gomega's Eventually
so that it won't fail in case the resource takes a little time to become available
tests/e2e/kueue_test.go
Outdated
@@ -164,6 +172,36 @@ func (tc *KueueTestCtx) testOwnerReferences() error { | |||
return nil | |||
} | |||
|
|||
func (tc *KueueTestCtx) validateVAPReady() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also probably need a negative test so that it validated that no VAP resources are created on an unsupported openshift verions
@@ -41,6 +44,7 @@ import ( | |||
) | |||
|
|||
func (s *componentHandler) NewComponentReconciler(ctx context.Context, mgr ctrl.Manager) error { | |||
enableVAP = cluster.GetClusterInfo().Version.GTE(semver.MustParse("4.17.0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since enableVAP
is a package variable, it may be better to set it as part of the init()
function
7d74968
to
7f51b9e
Compare
tests/e2e/kueue_test.go
Outdated
g.Get(gvk.ValidatingAdmissionPolicy, types.NamespacedName{Name: "kueue-validating-admission-policy"}).Eventually().Should( | ||
jq.Match(`.metadata.ownerReferences == "%s"`, componentApi.KueueInstanceName), | ||
) | ||
vapb, err := g.Get(gvk.ValidatingAdmissionPolicyBinding, types.NamespacedName{Name: "kueue-validating-admission-policy-binding"}).Get() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this probably should use Eventually
as the other one
v, err := tc.GetClusterVersion() | ||
g.Expect(err).ToNot(HaveOccurred()) | ||
if v.GTE(semver.MustParse("4.17.0")) { | ||
g.Get(gvk.ValidatingAdmissionPolicy, types.NamespacedName{Name: "kueue-validating-admission-policy"}).Eventually().WithTimeout(1 * time.Second). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific reason to wait only 1 second ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i moved this test case after "ValidateUpdateDeploymentsResources" test.
i think no need to wait DefaultTimeout 2mins here, if component is ready but VAP/VAPB not found
- only add VAP specific manifests into list if latest OCP version in history is over 16 in min version update: explicit check 4.17.0 not just minor version - now each component can get ocp version without calculate - add test in kueue - make VAPB object user configable (as no ownerreference set) - user cannot update VAP object ( it get reconciled back to default value) Signed-off-by: Wen Zhou <wenzhou@redhat.com>
- fix order for kueue, sinde the deploy need to be called after we get all manifests - testcase to explictly check error matching no CRD of VAP/VAPB found for < 4.17 - cannot use cluster.GetClusterInfo since that is init/set in the main, tc wont have that value Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
/test opendatahub-operator-e2e |
/test opendatahub-operator-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested latest changes from this PR. This continues to work as expected, thanks!
Description
https://issues.redhat.com/browse/RHOAIENG-16133
How Has This Been Tested?
manual test on quay.io/wenzhou/opendatahub-operator-catalog:v2.17.1613315
spin up two clusters
create DSCI, DSC only enable kueue
kueue CR ready, 2 new resource VAP and VAPB created
change VAPB action from Deny to Warn, it keeps new value
disable kueue
kueue CR removed, VAP removed, VAPB remains
kueue CR ready, no API registered in cluster for VAP/VAPB CRD, operator pod not throw error
Screenshot or short clip
Merge criteria