[WIP] [DNM]Test udn ns label merge #2414

Open
wants to merge 48 commits into
base: master
from

Conversation

trozet
Contributor

@trozet trozet commented Jan 9, 2025

📑 Description

Fixes #

Additional Information for reviewers

✅ Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

npinaeva and others added 30 commits December 18, 2024 20:23
Handle host-network pods as default network.
Don't return per-pod errors on startup.
Remove nadController from UDNHostIsolationManager as we don't use it
anymore to find pod's UDN based on NADs that exist in the namespace.

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…face

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
This code isn't being used anymore. We don't expect users
to upgrade directly from code which contained the legacy LRPs,
therefore it's safe to remove.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
L2 UDN: EgressIP hosted by primary interface (`breth0`)
If EncapIP is configured, it means it is different from the node's
primary address. Do not update EncapIP when node's primary address
changes.

Signed-off-by: Yun Zhou <yunz@nvidia.com>
Assign network ID from network manager running in cluster manager. The
network ID is included in NetInfo and annotated on the NAD along with
the network name. Network managers running in zone & node controllers
will read the network ID from the annotation to set it on NetInfo.

On startup, network manager running in cluster manager will read the
network IDs annotated on the nodes to cover for the upgrade scenario.
Network IDs will still be annotated on the nodes because this PR does
not transition all the code to use the network ID from the NetInfo
instead of the node annotation. That will have to be done progressively.

This has several benefits, among them:
- NetworkID is available sooner overall since we don't have to wait for
  all the nodes to be annotated
- No need to unmarshal the node annotation to get the network IDs, they
  are available in NetInfo
- No need to unmarshal the NAD to get the network name, it can be accessed
  directly from the annotation.

If a network is replaced with a different one with the same name, the
network ID is reused. The respective network controller will not start
until the previous one is stopped and cleaned up, so this shouldn't be a
problem.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
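
A minimal sketch of the read path described in the commit message above, where zone and node controllers take the network name and ID from annotations instead of unmarshalling the NAD or node annotations. The annotation keys and the helper name are hypothetical and chosen only for illustration; the commit message does not state the real ones.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical annotation keys, for illustration only; the real keys are not
// stated in the commit message.
const (
	networkNameAnnotation = "example.org/network-name"
	networkIDAnnotation   = "example.org/network-id"
)

// networkFromAnnotations reads the network name and ID that the cluster
// manager is assumed to have written on the NAD. It never touches the NAD
// spec or any node annotation.
func networkFromAnnotations(annotations map[string]string) (string, int, error) {
	name, ok := annotations[networkNameAnnotation]
	if !ok {
		return "", 0, fmt.Errorf("missing annotation %q", networkNameAnnotation)
	}
	raw, ok := annotations[networkIDAnnotation]
	if !ok {
		return "", 0, fmt.Errorf("missing annotation %q", networkIDAnnotation)
	}
	id, err := strconv.Atoi(raw)
	if err != nil {
		return "", 0, fmt.Errorf("invalid network ID %q: %w", raw, err)
	}
	return name, id, nil
}

func main() {
	name, id, err := networkFromAnnotations(map[string]string{
		networkNameAnnotation: "blue",
		networkIDAnnotation:   "3",
	})
	fmt.Println(name, id, err)
}
```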
Instead of treating as managed those VRFs that follow the mp<id>-udn-vrf
naming template, use the table number: VRFs associated with a table
within our reserved block of table numbers are managed by us. The block
right now is anything higher than RoutingTableIDStart (1000). This
allows managing VRFs with any name, which is desirable if the name is
going to be exposed through BGP.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
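
The ownership rule in the commit message above can be sketched as a single comparison on the table number. RoutingTableIDStart and its value of 1000 come from the message; the function name isManagedVRF is an assumption for illustration.

```go
package main

import "fmt"

// RoutingTableIDStart marks the start of the reserved block of table numbers
// (1000 per the commit message).
const RoutingTableIDStart = 1000

// isManagedVRF reports whether a VRF is considered managed, based only on the
// routing table it is associated with. The VRF name plays no role.
// The function name is hypothetical.
func isManagedVRF(table uint32) bool {
	return table > RoutingTableIDStart
}

func main() {
	for _, table := range []uint32{254, 1000, 1001} {
		fmt.Printf("table %d managed: %t\n", table, isManagedVRF(table))
	}
}
```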
Anticipating that these VRF names are going to be exposed through BGP,
we should use friendlier names for our VRFs. The most natural name to
use is the network name. Consequently, giving a cluster UDN a name below 15
characters that matches an already existing VRF not managed by ovn-k
will fail. This is considered an admin problem and not an ovn-k problem
for now.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Was causing deadlocks in unit tests

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…heir subcontrollers

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Assuming that there are three types of controllers (network agnostic,
network aware and network specific), we were already notifying
network specific controllers of network changes. But network aware
controllers, for which we have a single instance capable of
managing multiple networks, had no code path to be informed of network
changes.

This commit adds a code path for that and makes the RouteAdvertisements
controller aware of network changes.

Changed ClusterManager to be the controller manager for cluster manager
instead of secondaryNetworkClusterManager. It just makes more sense that
way since ClusterManager is the top level manager.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
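
One way to picture the new code path for network aware controllers is a fan-out from the manager to every registered single-instance controller. This is only a sketch under assumptions: the interface, the manager type, and the method names are all hypothetical and not taken from the PR.

```go
package main

import "fmt"

// NetworkAwareController is a hypothetical interface for a single controller
// instance that manages multiple networks and wants to hear about changes.
type NetworkAwareController interface {
	ReconcileNetwork(networkName string)
}

// manager is a hypothetical stand-in for the top-level controller manager.
type manager struct {
	aware []NetworkAwareController
}

// notify fans a network change out to every registered network aware controller.
func (m *manager) notify(networkName string) {
	for _, c := range m.aware {
		c.ReconcileNetwork(networkName)
	}
}

// routeAdvertisements is a toy controller that records the networks it was
// asked to reconcile.
type routeAdvertisements struct {
	seen []string
}

func (r *routeAdvertisements) ReconcileNetwork(networkName string) {
	r.seen = append(r.seen, networkName)
}

func main() {
	ra := &routeAdvertisements{}
	m := &manager{aware: []NetworkAwareController{ra}}
	m.notify("blue")
	m.notify("red")
	fmt.Println(ra.seen)
}
```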
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…twork exist test

Signed-off-by: Or Mergi <ormergi@redhat.com>
CUDN cleanup is inconsistent: we see some flaky tests due to CUDN
"already exist" errors, implying objects are not actually deleted.

Wait for the CUDN object to be gone when deleted.

Signed-off-by: Or Mergi <ormergi@redhat.com>
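
The "wait until gone" step can be sketched as a plain polling loop. The exists callback is a hypothetical stand-in; in a real e2e test it would presumably wrap a client Get and treat a NotFound error as "gone". The actual helper used by the tests is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitForGone polls exists until it reports false, returns an error, or the
// timeout expires. exists is a hypothetical callback supplied by the caller.
func waitForGone(exists func() (bool, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		found, err := exists()
		if err != nil {
			return err
		}
		if !found {
			return nil
		}
		if time.Now().After(deadline) {
			return errors.New("timed out waiting for object to be deleted")
		}
		time.Sleep(interval)
	}
}

func main() {
	remaining := 3 // pretend the object lingers for a few polls
	err := waitForGone(func() (bool, error) {
		remaining--
		return remaining > 0, nil
	}, 10*time.Millisecond, time.Second)
	fmt.Println("deleted:", err == nil)
}
```

Creating the next object only after this returns nil avoids racing with a deletion that is still in progress.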
CUDN is a cluster-scoped object; when tests run in parallel,
random names avoid conflicts with other tests.

Use random metadata.name for CUDN objects.

The "isolates overlapping CIDRs" tests create objects based on the
'red' and 'blue' variables, including CUDN objects.
Change the tests' CUDN creation to use random names and update the given
'networkAttachmentConfigParams' with the newly generated name.
Update the 'red' & 'blue' variables with the generated name, carried by
'networkAttachmentConfigParams' (netConfig.name).

The pod2Egress tests assert on the CUDN object name given by 'userDefinedNetworkName'.
In practice the tests' netConfigParam.name is userDefinedNetworkName.
Change the assertion to check the given netConfigParam.

Signed-off-by: Or Mergi <ormergi@redhat.com>
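
A minimal sketch of the random-name approach, assuming a short random hex suffix is enough to keep parallel tests from colliding. The randomName helper and the netConfig struct are hypothetical stand-ins; the real tests carry the name in 'networkAttachmentConfigParams'.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomName appends a random 8-character hex suffix to prefix so that
// cluster-scoped objects created by parallel tests get distinct names.
func randomName(prefix string) (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

// netConfig is a hypothetical stand-in for the test's network config params.
type netConfig struct {
	name string
}

func main() {
	red := netConfig{name: "red"}
	generated, err := randomName(red.name)
	if err != nil {
		panic(err)
	}
	// Carry the generated name in the config so later assertions use it.
	red.name = generated
	fmt.Println(red.name)
}
```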
Signed-off-by: nithyar <nithyar@nvidia.com>
Signed-off-by: nithyar <nithyar@nvidia.com>
Reconcile RouteAdvertisements in cluster manager
Add missing enum validation for RouteAdvertisements
The NetPol test checks the assigned pod IP only against the IPv4 subnet,
which would fail on an IPv6-only cluster. This commit fixes it by
checking against all valid CIDRs.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
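
The fix described above amounts to testing the pod IP against every configured CIDR rather than only the IPv4 one. The sketch below uses the standard library's net package; the function name and the example subnets are assumptions for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// ipInAnyCIDR reports whether ip falls inside at least one of the given
// CIDRs, regardless of IP family.
func ipInAnyCIDR(ip string, cidrs []string) (bool, error) {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return false, fmt.Errorf("invalid IP %q", ip)
	}
	for _, cidr := range cidrs {
		_, subnet, err := net.ParseCIDR(cidr)
		if err != nil {
			return false, err
		}
		if subnet.Contains(parsed) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Example subnets only; a real test would read them from the network config.
	cidrs := []string{"10.128.0.0/14", "fd00::/64"}
	for _, ip := range []string{"10.128.0.7", "fd00::5", "192.168.1.1"} {
		ok, err := ipInAnyCIDR(ip, cidrs)
		fmt.Println(ip, ok, err)
	}
}
```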
As of today, the NetworkReady condition indicates that a NAD has been created,
and not necessarily that the underlying network is ready to work with,
because that requires other internal components to act (e.g. set OVS ports,
OVN flows, etc.).

Rename the NetworkReady condition type to NetworkCreated so it better
describes what it indicates.

This change enables introducing an alternative "NetworkReady" condition that
provides an actual indication that a UDN network is ready, and that other
internal components acted successfully.

Signed-off-by: Or Mergi <ormergi@redhat.com>
The variable ginkgo_focus is misspelled as gingko_focus.

As the latter var is not used anywhere else in this repo
and is used to concatenate the var ginkgo_focus in the
next line to ginkgoargs it seems to be a typo.

Fixes: #4942

Signed-off-by: Felix Schumacher <felix.schumacher@internetallee.de>
jcaamano and others added 17 commits January 8, 2025 11:49
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
Import learnt BGP routes into OVN
UDN: Rename NetworkReady condition to NetworkCreated
When multinetwork policies support is disabled
WatchNamespaces should run to completion for
primary UDNs. This was not happening because a primary
UDN is also a secondary network.

Signed-off-by: Patryk Diak <pdiak@redhat.com>
k8s.ovn.org/user-defined-network is now required to be labeled on a
namespace at namespace creation time in order to use a primary UDN. The
following conditions are true:

1. If namespace is missing the label, and a pod is created, it attaches
   to default network.
2. If the namespace is missing the label, and a primary UDN or CUDN is
   created that matches that namespace, the UDN/CUDN will report error
   status and the NAD will not be generated.
3. If the namespace is missing the label, and a primary UDN/CUDN exists,
   a pod in the namespace will be created and attached to default
   network.
4. If the namespace has the label, and a primary UDN/CUDN does not exist,
   a pod in the namespace will fail creation until the UDN/CUDN is
   created.

Also includes some fixes to unit tests that were brought to light by
this PR. For example, the layer 2 multi-network tests were adding
invalid annotations for node-subnets, etc.

Signed-off-by: Tim Rozet <trozet@redhat.com>
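
The four conditions listed in the commit message above can be read as a small decision table. The sketch below encodes them; the type and function names are hypothetical, and the case where the label is present and the primary UDN exists (pod attaches to the primary UDN) is inferred from the stated purpose of the label rather than listed explicitly.

```go
package main

import "fmt"

type outcome string

const (
	attachDefault outcome = "pod attaches to the default network"
	attachPrimary outcome = "pod attaches to the primary UDN"
	podFails      outcome = "pod creation fails until the UDN/CUDN is created"
)

// podOutcome maps the namespace label and primary UDN/CUDN existence to what
// happens to a pod created in that namespace.
func podOutcome(nsHasLabel, primaryUDNExists bool) outcome {
	switch {
	case !nsHasLabel:
		return attachDefault // conditions 1 and 3
	case !primaryUDNExists:
		return podFails // condition 4
	default:
		return attachPrimary // inferred: label present and UDN exists
	}
}

// nadGenerated covers condition 2: a primary UDN/CUDN matching a namespace
// without the label reports an error status and no NAD is generated.
func nadGenerated(nsHasLabel bool) bool {
	return nsHasLabel
}

func main() {
	for _, label := range []bool{false, true} {
		for _, udn := range []bool{false, true} {
			fmt.Printf("label=%t udn=%t: %s\n", label, udn, podOutcome(label, udn))
		}
	}
	fmt.Println("NAD generated without label:", nadGenerated(false))
}
```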
Signed-off-by: Patryk Diak <pdiak@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
Was using ipv6 on ipv4 cluster.

Signed-off-by: Tim Rozet <trozet@redhat.com>
EgressIP was depending on getActiveNetworkFromNamespace to work, or
would fail to remove egressIP status.

Signed-off-by: Tim Rozet <trozet@redhat.com>
Signed-off-by: Tim Rozet <trozet@redhat.com>
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label Jan 9, 2025
@openshift-ci openshift-ci bot requested review from abhat and JacobTanenbaum January 9, 2025 16:30
Contributor

openshift-ci bot commented Jan 9, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: trozet

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Jan 9, 2025
Contributor

openshift-ci bot commented Jan 9, 2025

@trozet: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-upgrade-local-gateway | bcf31ed | link | true | /test e2e-aws-ovn-upgrade-local-gateway |
| ci/prow/e2e-metal-ipi-ovn-dualstack-techpreview | bcf31ed | link | false | /test e2e-metal-ipi-ovn-dualstack-techpreview |
| ci/prow/e2e-metal-ipi-ovn-techpreview | bcf31ed | link | false | /test e2e-metal-ipi-ovn-techpreview |
| ci/prow/security | bcf31ed | link | false | /test security |
| ci/prow/e2e-aws-ovn-single-node-techpreview | bcf31ed | link | false | /test e2e-aws-ovn-single-node-techpreview |
| ci/prow/e2e-vsphere-ovn-techpreview | bcf31ed | link | false | /test e2e-vsphere-ovn-techpreview |
| ci/prow/e2e-azure-ovn | bcf31ed | link | false | /test e2e-azure-ovn |
| ci/prow/e2e-aws-ovn-techpreview | bcf31ed | link | false | /test e2e-aws-ovn-techpreview |
| ci/prow/e2e-aws-ovn-hypershift-conformance-techpreview | bcf31ed | link | false | /test e2e-aws-ovn-hypershift-conformance-techpreview |
| ci/prow/e2e-openstack-ovn | bcf31ed | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview | bcf31ed | link | false | /test e2e-metal-ipi-ovn-dualstack-local-gateway-techpreview |
| ci/prow/e2e-metal-ipi-ovn-ipv6-techpreview | bcf31ed | link | false | /test e2e-metal-ipi-ovn-ipv6-techpreview |
| ci/prow/openshift-e2e-gcp-ovn-techpreview-upgrade | bcf31ed | link | false | /test openshift-e2e-gcp-ovn-techpreview-upgrade |
| ci/prow/4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade | bcf31ed | link | true | /test 4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-gcp-ovn-techpreview | bcf31ed | link | true | /test e2e-gcp-ovn-techpreview |
| ci/prow/e2e-azure-ovn-upgrade | bcf31ed | link | true | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-azure-ovn-techpreview | bcf31ed | link | false | /test e2e-azure-ovn-techpreview |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 14 job(s) of type blocking for the nightly release of OCP 4.19

  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.19-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.19-fips-payload-scan
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-ipv6
  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance
  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance-serial
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fc874390-d1df-11ef-9be9-3f1c2f489e8c-0

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-azure-ovn-runc-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/291f2b10-d1e1-11ef-80eb-5a9fe669093f-0

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-dualstack-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/39c95f80-d1e1-11ef-8e37-6d74b5ecc3a4-0

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-vsphere-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/4639e1e0-d1e1-11ef-95fc-02f63217ccf6-0

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5118a0b0-d1e1-11ef-924e-a5d12f299bf9-0

Contributor

openshift-ci bot commented Jan 13, 2025

@tssurya: This PR was included in a payload test run from openshift/origin#29417
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-ipv6-techpreview

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5bdf91c0-d1e1-11ef-9ade-b67ef733a655-0

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.