SDN-4930: Downstream Merge [01-08-2025] #2412

Open
wants to merge 37 commits into
base: master
from

Conversation

jluhrsen
Contributor

@jluhrsen jluhrsen commented Jan 9, 2025

📑 Description

Fixes #

Additional Information for reviewers

✅ Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

npinaeva and others added 30 commits December 18, 2024 20:23
Handle host-network pods as default network.
Don't return per-pod errors on startup.
Remove nadController from UDNHostIsolationManager as we don't use it
anymore to find pod's UDN based on NADs that exist in the namespace.

Signed-off-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…face

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
This code isn't being used anymore. We don't expect users
to upgrade directly from code which contained the legacy LRPs,
therefore it's safe to remove.

Signed-off-by: Martin Kennelly <mkennell@redhat.com>
Signed-off-by: Martin Kennelly <mkennell@redhat.com>
L2 UDN: EgressIP hosted by primary interface (`breth0`)
If EncapIP is configured, it means it is different from the node's
primary address. Do not update EncapIP when node's primary address
changes.

Signed-off-by: Yun Zhou <yunz@nvidia.com>
Assign network ID from network manager running in cluster manager. The
network ID is included in NetInfo and annotated on the NAD along with
the network name. Network managers running in zone & node controllers
will read the network ID from the annotation to set it on NetInfo.

On startup, network manager running in cluster manager will read the
network IDs annotated on the nodes to cover for the upgrade scenario.
Network IDs will still be annotated on the nodes because this PR does
not transition all the code to use the network ID from the NetInfo
instead of the node annotation. That will have to be done progressively.

This has several benefits, among them:
- NetworkID is available sooner overall since we don't have to wait for
  all the nodes to be annotated
- No need to unmarshal the node annotation to get the network IDs, they
  are available in NetInfo
- No need to unmarshal the NAD to get the network name, it can be
  accessed directly from the annotation.

If a network is replaced with a different one with the same name, the
network ID is reused. This shouldn't be a problem because the respective
network controller will not start until the previous one is stopped and
cleaned up.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
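
As a rough illustration of the last two benefits listed in the commit message above, reading the network name and ID from NAD annotations could look like the sketch below. The annotation keys and the function name are hypothetical placeholders, not the keys or helpers ovn-kubernetes actually uses.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical annotation keys, chosen only for this sketch.
const (
	networkNameAnnotation = "example.org/network-name"
	networkIDAnnotation   = "example.org/network-id"
)

// networkFromAnnotations reads the network name and ID directly from the
// NAD annotations, without unmarshalling the NAD config or a node annotation.
func networkFromAnnotations(annotations map[string]string) (string, int, error) {
	name, ok := annotations[networkNameAnnotation]
	if !ok {
		return "", 0, fmt.Errorf("missing annotation %q", networkNameAnnotation)
	}
	rawID, ok := annotations[networkIDAnnotation]
	if !ok {
		return "", 0, fmt.Errorf("missing annotation %q", networkIDAnnotation)
	}
	id, err := strconv.Atoi(rawID)
	if err != nil {
		return "", 0, fmt.Errorf("invalid network ID %q: %w", rawID, err)
	}
	return name, id, nil
}

func main() {
	name, id, err := networkFromAnnotations(map[string]string{
		networkNameAnnotation: "blue",
		networkIDAnnotation:   "3",
	})
	fmt.Println(name, id, err)
}
```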
Instead of considering managed VRFs those that follow the mp<id>-udn-vrf
naming template, use the table number: VRFs associated with a table
within our reserved block of table numbers are managed by us. The block
right now is anything higher than RoutingTableIDStart (1000). This
allows managing VRFs with any name, which is desirable if the name is
going to be exposed through BGP.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
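
The table-based ownership check described in this commit message can be sketched as follows. This is a minimal sketch, assuming the check is a plain comparison against the start of the reserved block; the function name `isManagedVRF` is made up for illustration.

```go
package main

import "fmt"

// routingTableIDStart mirrors the RoutingTableIDStart value (1000) quoted in
// the commit message; the lowercase name is local to this sketch.
const routingTableIDStart = 1000

// isManagedVRF reports whether a VRF counts as managed, based only on the
// routing table it is associated with and not on its device name.
func isManagedVRF(table uint32) bool {
	return table > routingTableIDStart
}

func main() {
	for _, table := range []uint32{254, 1000, 1007} {
		fmt.Printf("table %d managed: %t\n", table, isManagedVRF(table))
	}
}
```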
Anticipating that these VRF names are going to be exposed through BGP,
we should use friendlier names for our VRFs. The most natural name to
use is the network name. As a consequence, giving a cluster UDN a name
below 15 characters that matches an already existing VRF not managed by
ovn-k will fail. This is considered an admin problem and not an ovn-k
problem for now.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
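
One way to picture the naming rule in this commit message is the sketch below. It assumes that a network name is used as the VRF name only when it fits in a Linux interface name (at most 15 bytes), and that longer names fall back to the older mp<id>-udn-vrf template. Both the fallback and the function name `vrfNameFor` are assumptions of this sketch, not something the commit message states.

```go
package main

import "fmt"

// maxIfaceNameLen is the longest device name Linux accepts (IFNAMSIZ is 16,
// including the trailing NUL).
const maxIfaceNameLen = 15

// vrfNameFor returns the network name when it fits in a Linux interface name
// and otherwise falls back to the mp<id>-udn-vrf template (assumed fallback).
func vrfNameFor(networkName string, networkID int) string {
	if len(networkName) <= maxIfaceNameLen {
		return networkName
	}
	return fmt.Sprintf("mp%d-udn-vrf", networkID)
}

func main() {
	fmt.Println(vrfNameFor("blue", 3))
	fmt.Println(vrfNameFor("a-very-long-network-name", 3))
}
```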
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Was causing deadlocks in unit tests

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…heir subcontrollers

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Assuming that there are three types of controllers (network agnostic,
network aware and network specific), we were already notifying
network specific controllers of network changes. But network aware
controllers, controllers for which we have a single instance capable of
managing multiple networks, had no code path to be informed of network
changes.

This commit adds a code path for that and makes the RouteAdvertisements
controller aware of network changes.

Changed ClusterManager to be the controller manager for cluster manager
instead of secondaryNetworkClusterManager. It just makes more sense that
way since ClusterManager is the top level manager.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
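
The notification path for network aware controllers described in this commit message might be pictured like this. Every type and method name below is hypothetical and chosen for illustration; the real interfaces in the code base may be shaped differently.

```go
package main

import "fmt"

// NetworkAwareController is a hypothetical interface for a controller that
// has a single instance handling many networks.
type NetworkAwareController interface {
	ReconcileNetwork(name string)
}

// controllerManager is a hypothetical top-level manager that fans network
// changes out to every registered network aware controller.
type controllerManager struct {
	aware []NetworkAwareController
}

// Register adds a network aware controller to the notification list.
func (m *controllerManager) Register(c NetworkAwareController) {
	m.aware = append(m.aware, c)
}

// NotifyNetworkChange tells each registered controller that a network changed.
func (m *controllerManager) NotifyNetworkChange(name string) {
	for _, c := range m.aware {
		c.ReconcileNetwork(name)
	}
}

// recorder stands in for a controller such as RouteAdvertisements and only
// remembers which networks it was asked to reconcile.
type recorder struct {
	seen []string
}

func (r *recorder) ReconcileNetwork(name string) {
	r.seen = append(r.seen, name)
}

func main() {
	m := &controllerManager{}
	r := &recorder{}
	m.Register(r)
	m.NotifyNetworkChange("blue")
	m.NotifyNetworkChange("red")
	fmt.Println(r.seen)
}
```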
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
…twork exist test

Signed-off-by: Or Mergi <ormergi@redhat.com>
CUDN cleanup is inconsistent: we see some flaky tests due to CUDN
"already exist" errors, implying objects are not actually deleted.

Wait for the CUDN object to be gone when deleted.

Signed-off-by: Or Mergi <ormergi@redhat.com>
CUDN is a cluster-scoped object. When tests run in parallel,
having random names avoids conflicts with other tests.

Use random metadata.name for CUDN objects.

The "isolates overlapping CIDRs" tests create objects based on the
'red' and 'blue' variables, including CUDN objects.
Change the tests' CUDN creation to use random names and update the given
'networkAttachmentConfigParams' with the newly generated name.
Update the 'red' & 'blue' variables with the generated name, carried by
'networkAttachmentConfigParams' (netConfig.name).

The pod2Egress tests assert on the CUDN object name given by 'userDefinedNetworkName'.
In practice the tests' netConfigParam.name is userDefinedNetworkName.
Change the assertion to check the given netConfigParam.

Signed-off-by: Or Mergi <ormergi@redhat.com>
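
A minimal sketch of generating a random name for a cluster-scoped object, so that parallel tests do not collide, is shown below. The `randomName` helper and the suffix length are assumptions of this sketch; the tests may rely on a different helper.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// randomName appends an 8-character random hex suffix to prefix.
func randomName(prefix string) (string, error) {
	buf := make([]byte, 4)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return prefix + "-" + hex.EncodeToString(buf), nil
}

func main() {
	red, err := randomName("red")
	if err != nil {
		panic(err)
	}
	fmt.Println(red)
}
```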
Signed-off-by: nithyar <nithyar@nvidia.com>
Signed-off-by: nithyar <nithyar@nvidia.com>
Reconcile RouteAdvertisements in cluster manager
Add missing enum validation for RouteAdvertisements
The NetPol test checks the assigned pod IP only against the IPv4 subnet,
which would fail on an IPv6-only cluster. This commit fixes it by
checking against all valid CIDRs.

Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
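
Checking a pod IP against every configured CIDR, rather than only the IPv4 one, can be sketched with the standard `net` package. The function name `ipInAnyCIDR` and the example subnets are illustrative and not taken from the test.

```go
package main

import (
	"fmt"
	"net"
)

// ipInAnyCIDR reports whether ip belongs to at least one of the given CIDRs,
// regardless of address family.
func ipInAnyCIDR(ip string, cidrs []string) (bool, error) {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return false, fmt.Errorf("invalid IP %q", ip)
	}
	for _, cidr := range cidrs {
		_, subnet, err := net.ParseCIDR(cidr)
		if err != nil {
			return false, err
		}
		if subnet.Contains(parsed) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	cidrs := []string{"10.244.0.0/16", "fd00:10:244::/48"}
	ok, err := ipInAnyCIDR("fd00:10:244::5", cidrs)
	fmt.Println(ok, err)
}
```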
The variable ginkgo_focus is misspelled as gingko_focus.

As the latter var is not used anywhere else in this repo
and is used to concatenate the var ginkgo_focus in the
next line to ginkgoargs it seems to be a typo.

Fixes: #4942

Signed-off-by: Felix Schumacher <felix.schumacher@internetallee.de>
This commit adds a new controller to import BGP learnt routes into OVN.

The controller runs in ovnkube-controller so it only supports IC
architecture where ovnkube-controller has kernel access on each node.

Networks should register to this controller to have routes imported for
them. Routes are imported into the network's gateway router. Multipath
routes are supported.

The controller subscribes for netlink route events. When a route is
updated, the corresponding network is queued to be sync'ed. A network is
also sync'ed when registered to the controller.

Synchronizations are delayed by a small amount of time to prevent a
series of consecutive route updates from synchronizing the same network
twice. Synchronizations apply the difference between current and desired
state.

The controller subscribes to netlink link events to learn the routing
table associated to a network vrf. The network is inferred from the vrf
device name. When learning the routing table, the corresponding network
is queued to be sync'ed.

Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
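
The "apply the difference between current and desired state" step in this commit message can be sketched as a set difference over routes. The `route` type and `diffRoutes` function are hypothetical simplifications; the controller's actual data model is not shown on this page. Multipath routes appear here as several entries sharing a prefix.

```go
package main

import "fmt"

// route is a simplified, hypothetical route representation.
type route struct {
	Prefix  string
	NextHop string
}

// diffRoutes returns the routes to add (desired but not current) and the
// routes to remove (current but not desired).
func diffRoutes(current, desired []route) (add, remove []route) {
	have := make(map[route]bool, len(current))
	for _, r := range current {
		have[r] = true
	}
	want := make(map[route]bool, len(desired))
	for _, r := range desired {
		want[r] = true
		if !have[r] {
			add = append(add, r)
		}
	}
	for _, r := range current {
		if !want[r] {
			remove = append(remove, r)
		}
	}
	return add, remove
}

func main() {
	current := []route{
		{"10.0.0.0/24", "192.168.1.1"},
		{"10.0.1.0/24", "192.168.1.1"},
	}
	desired := []route{
		{"10.0.0.0/24", "192.168.1.1"},
		{"10.0.0.0/24", "192.168.1.2"}, // second path of a multipath route
	}
	add, remove := diffRoutes(current, desired)
	fmt.Println("add:", add)
	fmt.Println("remove:", remove)
}
```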
jcaamano and others added 6 commits January 8, 2025 11:49
Signed-off-by: Jaime Caamaño Ruiz <jcaamano@redhat.com>
Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
Signed-off-by: Periyasamy Palanisamy <pepalani@redhat.com>
Import learnt BGP routes into OVN
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 9, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Jan 9, 2025

@jluhrsen: This pull request references SDN-4930 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target either version "4.19." or "openshift-4.19.", but it targets "openshift-4.18" instead.

In response to this:

📑 Description

Fixes #

Additional Information for reviewers

✅ Checks

  • My code requires changes to the documentation
  • if so, I have updated the documentation as required
  • My code requires tests
  • if so, I have added and/or updated the tests as required
  • All the tests have passed in the CI

How to verify it

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
Contributor

openshift-ci bot commented Jan 9, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jluhrsen
Once this PR has been reviewed and has the lgtm label, please assign trozet for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jluhrsen jluhrsen force-pushed the d/s-merge-01-08-2025 branch from 1759248 to 65a3e28 Compare January 9, 2025 00:51
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2025
@jluhrsen
Contributor Author

jluhrsen commented Jan 9, 2025

/test e2e-metal-ipi-ovn-ipv6-techpreview
/test e2e-aws-ovn-hypershift-conformance-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-metal-ipi-ovn-dualstack-techpreview
/test e2e-vsphere-ovn-techpreview
/test e2e-aws-ovn-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal-ipi-ovn-techpreview
/test openshift-e2e-gcp-ovn-techpreview-upgrade
/payload 4.19 ci blocking
/payload 4.19 nightly blocking

Contributor

openshift-ci bot commented Jan 9, 2025

@jluhrsen: trigger 4 job(s) of type blocking for the ci release of OCP 4.19

  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-hypershift-release-4.19-periodics-e2e-aws-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/59d94350-ce24-11ef-9fe7-2791c9a6bd2e-0

trigger 14 job(s) of type blocking for the nightly release of OCP 4.19

  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.19-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.19-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.19-fips-payload-scan
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-ipv6
  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance
  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance-serial
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/59d94350-ce24-11ef-9fe7-2791c9a6bd2e-1

@jluhrsen
Contributor Author

jluhrsen commented Jan 9, 2025

/test e2e-azure-ovn-techpreview

@jluhrsen
Contributor Author

jluhrsen commented Jan 9, 2025

/test e2e-aws-ovn-serial
/test e2e-azure-ovn-upgrade
/test e2e-metal-ipi-ovn-dualstack-techpreview

@jluhrsen
Contributor Author

jluhrsen commented Jan 9, 2025

/payload-aggregate periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance 10
/payload-aggregate periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance-serial 10
/payload-aggregate periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn 10

Contributor

openshift-ci bot commented Jan 9, 2025

@jluhrsen: trigger 3 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance
  • periodic-ci-openshift-microshift-release-4.19-periodics-e2e-aws-ovn-ocp-conformance-serial
  • periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/12975570-cea8-11ef-820f-aade8edf1c59-0

Contributor

openshift-ci bot commented Jan 9, 2025

@jluhrsen: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/security | 65a3e28 | link | false | /test security
ci/prow/e2e-openstack-ovn | 65a3e28 | link | false | /test e2e-openstack-ovn
ci/prow/e2e-azure-ovn-upgrade | 65a3e28 | link | true | /test e2e-azure-ovn-upgrade
ci/prow/e2e-metal-ipi-ovn-ipv6-techpreview | 65a3e28 | link | false | /test e2e-metal-ipi-ovn-ipv6-techpreview

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@jluhrsen
Contributor Author

/test e2e-azure-ovn-upgrade
/test 2e-openstack-ovn
/test 4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade
/test 4.19-upgrade-from-stable-4.18-e2e-gcp-ovn-rt-upgrade
/test 4.19-upgrade-from-stable-4.18-images

@jluhrsen
Contributor Author

/payload-job periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn 1

Contributor

openshift-ci bot commented Jan 13, 2025

@jluhrsen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-nightly-4.19-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/417d8390-d1e7-11ef-8bd8-742fa6af9ebf-0

@jluhrsen
Contributor Author

/payload-job periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade 10

Contributor

openshift-ci bot commented Jan 13, 2025

@jluhrsen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/78df37c0-d1e7-11ef-945e-d8e140fc0948-0

@jluhrsen
Contributor Author

/payload 4.19 ci blocking

Contributor

openshift-ci bot commented Jan 13, 2025

@jluhrsen: trigger 4 job(s) of type blocking for the ci release of OCP 4.19

  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-upgrade-from-stable-4.18-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.19-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-hypershift-release-4.19-periodics-e2e-aws-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a3a513d0-d1e7-11ef-97fe-20124f3e1665-0

Labels
jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.