provisioned nodes can't reach rancher-vcluster on VIP #8

qrkourier · 2023-11-26T18:22:59Z

rancher-vcluster shares Harvester's VIP, and the Harvester guest VMs that Rancher provisions as RKE2 nodes cannot reach the Rancher Server to apply the initial plan. They can ping the VIP, which is what the configured Rancher Server URL resolves to in DNS on the guest node, and they can reach the Harvester UI with a cURL GET to the Harvester node IP.

The HTTP request with Host: rancher.hella header, from the guest node to the Rancher Server URL, results in a SYN that is never acknowledged. The SYN flows from the guest node's veth to the Harvester node's mgmt-br, where it is mangled to an array of destination IPs that have routes to the Harvester node's calico and flannel interfaces.

A cURL GET on the Harvester node where the guest node is running is able to fetch the Harvester UI web server without a Host header, and the Rancher Server UI with the Host: rancher.hella header, proving the destination is correct. The expected response is the HTTP 302 redirect to the dashboard location.

rancher@nuc2:~> curl -ks https://rancher.hella|sha256sum
3509bf97089da3314f168d5811fd5a5015bc185c50e24f4855dab26bf7df8f8b  -

rancher@nuc2:~> curl -ks https://10.52.1.36 -H 'host: rancher.hella'|sha256sum 
3509bf97089da3314f168d5811fd5a5015bc185c50e24f4855dab26bf7df8f8b  -

It's interesting that the guest VM can reach the web server running on some, but not all, of the three Harvester node IPs, and which ones cannot be reached changes depending on where the VIP is currently bound and on which Harvester node the RKE2 node guest is scheduled.

If the VIP is bound by node3, then the guest running on node2 is able to reach node3's primary interface, and GET / gets an HTTP 302 to /dashboard/. The same request to the node1 or node2 IP times out. A request to the VIP, with or without the Host header, also times out.

When the VIP is bound on node2, the same node as the guest, the previously successful GET / to node3 times out, and the request to node1 begins to succeed!
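To make the pattern above easier to reproduce after each VIP move, here is a minimal Python sketch that could be run from the guest VM to record which targets complete a TCP handshake on port 443. The function names `tcp_reachable` and `reachability_matrix` are my own for illustration, not part of Harvester or Rancher, and the target labels and addresses are whatever you pass in.

```python
import socket


def tcp_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers both an unanswered SYN (timeout) and an active refusal.
        return False


def reachability_matrix(targets: dict, port: int = 443, timeout: float = 3.0) -> dict:
    """Probe each labelled address and map the label to True/False."""
    return {name: tcp_reachable(addr, port, timeout) for name, addr in targets.items()}
```

Calling `reachability_matrix` with a dict of labels (for example the VIP and each of the three Harvester node IPs) before and after the VIP moves gives a small table to compare. It only tests the TCP handshake, so it does not distinguish a dropped SYN from a dropped SYN-ACK; a packet capture is still needed for that.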

I believe it's commonplace for VIPs like Harvester's to be assigned to the mgmt-br interface with subnet mask /32, despite the primary interface address having /24. Noting this in case Harvester's VIP should actually be using /24.
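For reference, this is a small sketch of what the prefix length changes: which peers the kernel would treat as on-link for an address. It uses Python's standard `ipaddress` module, the helper name `on_link` is hypothetical, and the addresses are documentation-range placeholders rather than the ones in this cluster. It does not show that the /32 is the cause here, only what the two masks mean.

```python
import ipaddress


def on_link(interface_cidr: str, peer: str) -> bool:
    """Return True if peer falls inside the network implied by interface_cidr."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(peer) in iface.network


# Placeholder addresses (TEST-NET-1), not taken from the issue.
print(on_link("192.0.2.10/24", "192.0.2.20"))  # peer is covered by the /24
print(on_link("192.0.2.10/32", "192.0.2.20"))  # a /32 covers only itself
```

With a /32 the address contributes no connected route for its neighbours, so routing decisions for replies sourced from the VIP fall back to whatever other routes exist on the node.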


qrkourier · 2023-11-26T19:41:47Z

More context, in case it's relevant.

The metal Harvester nodes have an untagged mgmt network with PVID 30 on the switch. The VM network tags the guest's packets for VLAN 40. There's no issue forwarding between the two VLANs, as determined by ICMP echo replies between the Harvester node and the guest VM. Metal Harvester nodes and guest VMs lease IPs via DHCP running on the router, which has a pool for each subnet.

TL;DR The SYN initiated by the Rancher-provisioned guest VM never reaches the router's VLAN interface on its way to the Rancher Server provided by rancher-vcluster. It's being dropped somewhere in the Harvester networking.

Slack thread about the same
