-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AdminNetworkPolicy #9206
Add AdminNetworkPolicy #9206
Conversation
ab29eb0
to
a9626f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a first pass. Mostly looks good but I think there are a few details of the spec for ANP that differ from the other policy that we support, mainly to allow for future enhancements to ANP. We should try to follow those behaviours, even if they're no-ops right now.
I think we should include support for CIDR matches and named ports in the first release. They should both be trivial to do and, to be correct according to the spec, I think if you don't support them, the behaviour needs to change (you should drop allow rules/convert deny rules to deny all).
for _, r := range rule.To { | ||
// We need to add a copy of the peer so all the rules don't | ||
// point to the same location. | ||
peer := &adminpolicy.AdminNetworkPolicyEgressPeer{Namespaces: r.Namespaces} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Egress peers also have Networks
(list of CIDRs).
Do we need the copy step in recent golang? I think loop variables are now unique per loop so you may not need to copy anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also "Nodes". I suppose we could translate that to something that would match the right auto-HEPs (but if we're not going to support it, should we validate against it?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note https://github.com/kubernetes-sigs/network-policy-api/blob/v0.1.5/apis/v1alpha1/shared_types.go#L148C1-L150C46 for forwards compatibility, we should check that something is set in the rule and fail closed if we see nothing.
// Exactly one of the selector pointers must be set for a given peer. If a
// consumer observes none of its fields are set, they must assume an unknown
// option has been specified and fail closed.
@@ -1567,6 +1567,18 @@ func validateTier(structLevel validator.StructLevel) { | |||
} | |||
} | |||
|
|||
if tier.Name == names.AdminNetworkPolicyTierName { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to validate this on read? If Felix reads a bad value it'll drop the resource and that'll break policy. I feel like we only want to check this on write so that, if bad data gets into the datastore, we still do something sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't validate this on read. In the Tier implementation of Clientv3, we only call validator on create and update.
calico/libcalico-go/lib/clientv3/tier.go
Line 146 in 1e41064
func (r tiers) Get(ctx context.Context, name string, opts options.GetOptions) (*apiv3.Tier, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Felix/Typha both validate in their calc graphs, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. It seems in Typha we also validate in reads:
if err := validatorFunc(elem.Interface()); err != nil { |
Also in Felix:
calico/felix/calc/validation_filter.go
Line 67 in 4ad72b7
if err := validatorFunc(elem.Interface()); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally the check was like this: https://github.com/tigera/calico-private/blame/d7f58624426ef41c64b6981b6017cf2956b50a25/libcalico-go/lib/clientv3/tier.go#L51
However, we changed it to v3 validator since the return error code, Operation not supported
, is not technically correct, as the create/update operations are supported. It just feels the right place for these type of checks is in validator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this be an issue if create and update does not allow the incorrect value for AdminNeworkPolicy tier in first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also use DefaultTierFields
here: https://github.com/projectcalico/calico/blob/master/libcalico-go/lib/resources/tier.go#L21
to make sure values are correct. This function is called by k8s client: https://github.com/projectcalico/calico/blob/master/libcalico-go/lib/backend/k8s/resources/tiers.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, we create adminneworkpolicy tier ourselves at the beginning of the day here:
calico/libcalico-go/lib/clientv3/client.go
Line 421 in 018190a
func (c client) ensureAdminNetworkPolicyTierExists(ctx context.Context) error { |
for _, r := range anp.Spec.Ingress { | ||
rules, err := c.k8sANPIngressRuleToCalico(r) | ||
if err != nil { | ||
log.WithError(err).Warn("dropping k8s rule that couldn't be converted.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the intent in the NPEP is to "fail closed" when we don't understand a rule in order to deal with forwards compatibility more cleanly. I take that to mean:
- If we can't understand an Allow rule, we should log a warning and drop it.
- If we can't understand a Deny rule, we should log a warning and convert that to a "Deny all" rule.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense for a fail closed system. For allow & pass action it's clear that we should drop the rule.
For deny action, I think we should only add the Deny all
for the direction of failed rule, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only add the Deny all for the direction of failed rule, right?
Yes, exactly, just that one rule gets converted to deny all (or at least, that's my interpretation).
protocol = c.k8sProtocolToCalico(&port.PortRange.Protocol) | ||
return | ||
} | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NamedPort is "Extended" but we support named ports in Felix so I think that's best to do now.
} | ||
} | ||
|
||
// If there no peers, or no ports, represent that as nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the spec requires there to be at least one peer (although that means it's hard to do basic things like allow all/deny all!). I filed an issue upstream for that: kubernetes-sigs/network-policy-api#248
e2e/cmd/adminpolicy/e2e_test.go
Outdated
@@ -0,0 +1,72 @@ | |||
// Copyright (c) 2024 Tigera, Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this file from here: https://github.com/kubernetes-sigs/network-policy-api/blob/main/conformance/conformance_test.go
I believe I should use their copy right clause, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, always keep external copyright notices intact. We can also add ours as and when we make changes (but best to make changes in a second file where possible to make it easier to unpick what's ours and what's theirs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved as an increment but, as discussed, you're planning to:
- Solve the end of tier ACTION PROPERLY.
- Add tiers to Dikastes.
Description
Add support for the new k8s
AdminNetworkPolicy
API. This PR only adds the core functionalities.Related issues/PRs
Tier support: #9085
Default action in tiers: #9232
Operator: tigera/operator#3501
AdminNetworkPolicy egress networks: #9276
AdminNetworkPolicy named port: #9254
BaselineAdminNetworkPolicy: #9464
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one
docs-*
label.docs-pr-required
: This change requires a change to the documentation that has not been completed yet.docs-completed
: This change has all necessary documentation completed.docs-not-required
: This change has no user-facing impact and requires no docs.Every PR needs one
release-note-*
label.release-note-required
: This PR has user-facing changes. Most PRs should have this label.release-note-not-required
: This PR has no user-facing changes.Other optional labels:
cherry-pick-candidate
: This PR should be cherry-picked to an earlier release. For bug fixes only.needs-operator-pr
: This PR is related to install and requires a corresponding change to the operator.