icmpv6 jitter increase after upgrade #326

Open
tiagonux opened this issue Apr 30, 2024 · 4 comments

Comments

@tiagonux

tiagonux commented Apr 30, 2024

Hi all,

Bringing over the discussion from this mailing-list thread, which reports an issue with ICMPv6 packets: https://www.mail-archive.com/ovs-discuss@openvswitch.org/msg09948.html

While testing the upgrade path from OVN 22.03.1/OVS 2.17.2 to OVN 23.03.1/OVS 3.1.3 on Ubuntu 22.04 (kernels 5.15 and 6.5), we are seeing strange behavior for ICMPv6 traffic.
Before the upgrade, a simple north-south or east-west ping between IPv6 hosts showed low jitter, as below:

64 bytes from 2001:db8:2:a::301: icmp_seq=2712 ttl=62 time=0.676 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2713 ttl=62 time=0.829 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2714 ttl=62 time=0.568 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2715 ttl=62 time=0.700 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2716 ttl=62 time=0.768 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2717 ttl=62 time=0.599 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2718 ttl=62 time=0.656 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2719 ttl=62 time=0.689 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2720 ttl=62 time=0.724 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2721 ttl=62 time=0.419 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2722 ttl=62 time=0.732 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2723 ttl=62 time=0.717 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2724 ttl=62 time=0.755 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2725 ttl=62 time=0.765 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2726 ttl=62 time=0.535 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2727 ttl=62 time=0.865 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2728 ttl=62 time=0.692 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2729 ttl=62 time=0.597 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2730 ttl=62 time=0.661 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2731 ttl=62 time=0.558 ms

After the upgrade, the same ping shows much higher jitter:

64 bytes from 2001:db8:2:a::301: icmp_seq=37 ttl=253 time=2.14 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=38 ttl=253 time=50.2 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=39 ttl=253 time=57.0 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=40 ttl=253 time=61.5 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=41 ttl=253 time=2.16 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=42 ttl=253 time=1.68 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=43 ttl=253 time=1.63 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=44 ttl=253 time=3.32 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=45 ttl=253 time=1.87 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=46 ttl=253 time=39.6 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=47 ttl=253 time=2.87 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=48 ttl=253 time=60.0 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=49 ttl=253 time=1.79 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=50 ttl=253 time=2.06 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=51 ttl=253 time=2.45 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=52 ttl=253 time=2.10 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=53 ttl=253 time=4.39 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=54 ttl=253 time=2.91 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=55 ttl=253 time=1.79 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=56 ttl=253 time=1.80 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=57 ttl=253 time=2.26 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=58 ttl=253 time=55.1 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=59 ttl=253 time=57.2 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=60 ttl=253 time=3.34 ms
--- 2001:db8:2:a::301 ping statistics ---
60 packets transmitted, 60 received, 0% packet loss, time 59120ms
rtt min/avg/max/mdev = 0.531/16.329/61.464/23.395 ms
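
To put a number on the difference, a small Python sketch like the following could parse ping output and summarise the RTT spread. The function names are illustrative, the sample lines are copied from the two outputs above, and using the population standard deviation for `mdev` is an assumption about how iputils ping computes that field.

```python
import re
import statistics

# Matches lines such as "... icmp_seq=38 ttl=253 time=50.2 ms".
RTT_RE = re.compile(r"icmp_seq=\d+ .*?time=([\d.]+) ms")


def parse_rtts(ping_output):
    """Return the RTTs (in ms) found in ping output, in order of appearance."""
    return [float(m.group(1)) for m in RTT_RE.finditer(ping_output)]


def jitter_summary(rtts):
    """Summarise RTT spread; mean_abs_delta is the mean of |RTT[i+1] - RTT[i]|."""
    deltas = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    return {
        "min": min(rtts),
        "avg": statistics.fmean(rtts),
        "max": max(rtts),
        "mdev": statistics.pstdev(rtts),  # assumption: mirrors ping's mdev
        "mean_abs_delta": statistics.fmean(deltas) if deltas else 0.0,
    }


# First four replies from each capture above.
BEFORE = """\
64 bytes from 2001:db8:2:a::301: icmp_seq=2712 ttl=62 time=0.676 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2713 ttl=62 time=0.829 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2714 ttl=62 time=0.568 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=2715 ttl=62 time=0.700 ms
"""

AFTER = """\
64 bytes from 2001:db8:2:a::301: icmp_seq=37 ttl=253 time=2.14 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=38 ttl=253 time=50.2 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=39 ttl=253 time=57.0 ms
64 bytes from 2001:db8:2:a::301: icmp_seq=40 ttl=253 time=61.5 ms
"""

for label, text in (("before", BEFORE), ("after", AFTER)):
    print(label, jitter_summary(parse_rtts(text)))
```

Feeding the full ping logs to `parse_rtts` instead of the four-line samples would give figures comparable to the `rtt min/avg/max/mdev` line.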

ICMPv4 is not affected: its jitter is the same before and after the upgrade. I also ran TCP/UDP (v4/v6) throughput tests before and after the upgrade and the numbers are similar, so the problem seems specific to ICMPv6 traffic.

Checking the datapath, I can see the flow for in_port(1706), where the VM is connected, being removed and installed again:

ovs-dpctl dump-flows | grep 2001:db8:2:a::301
recirc_id(0x1ad3d),tunnel(tun_id=0x1c,src=10.26.73.135,dst=10.26.72.4,tos=0x20,geneve({}{}),flags(-df+csum+key)),in_port(137),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:9b:b3:c6,dst=fa:16:3e:d7:c9:46),eth_type(0x86dd),ipv6(src=2000::/ffc0::,dst=2001:db8:2:a::301,proto=58,hlimit=62,frag=no),
packets:7, bytes:826, used:0.674s, actions:1706
recirc_id(0),tunnel(tun_id=0x1c,src=10.26.73.135,dst=10.26.72.4,tos=0x20,geneve({class=0x102,type=0x80,len=4,0x60008/0x7fffffff}),flags(-df+csum+key)),in_port(137),eth(src=fa:16:3e:9b:b3:c6,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x86dd),ipv6(src=2000::/ffc0::,dst=2001:db8:2:a::301,proto=58,hlimit=62,frag=no),
packets:7, bytes:826, used:0.709s, actions:ct(zone=7185),recirc(0x1ad3d)

ovs-dpctl dump-flows | grep 2001:db8:2:a::301
recirc_id(0x1ad3d),tunnel(tun_id=0x1c,src=10.26.73.135,dst=10.26.72.4,tos=0x20,geneve({}{}),flags(-df+csum+key)),in_port(137),ct_state(-new+est-rel-rpl-inv+trk),ct_mark(0/0x1),eth(src=fa:16:3e:9b:b3:c6,dst=fa:16:3e:d7:c9:46),eth_type(0x86dd),ipv6(src=2000::/ffc0::,dst=2001:db8:2:a::301,proto=58,hlimit=62,frag=no),
packets:11, bytes:1298, used:0.190s, actions:1706
recirc_id(0),in_port(1706),eth(src=fa:16:3e:d7:c9:46,dst=fa:16:3e:9b:b3:c6),eth_type(0x86dd),ipv6(src=2001:db8:2:a::301,dst=2001:db8:1:2::10,proto=58,hlimit=255,frag=no),icmpv6(type=128/0xfc),
packets:0, bytes:0, used:never, actions:ct(zone=7185),recirc(0x1b13c)
recirc_id(0),tunnel(tun_id=0x1c,src=10.26.73.135,dst=10.26.72.4,tos=0x20,geneve({class=0x102,type=0x80,len=4,0x60008/0x7fffffff}),flags(-df+csum+key)),in_port(137),eth(src=fa:16:3e:9b:b3:c6,dst=00:00:00:00:00:00/01:00:00:00:00:00),eth_type(0x86dd),ipv6(src=2000::/ffc0::,dst=2001:db8:2:a::301,proto=58,hlimit=62,frag=no),
packets:11, bytes:1298, used:0.237s, actions:ct(zone=7185),recirc(0x1ad3d)

(Note: no OVS HW Offloading)

So it seems a datapath flow goes missing, the traffic goes to userspace, and the flow is installed on the datapath again. That may explain the higher jitter.
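
One way to check this systematically is to diff two `ovs-dpctl dump-flows` snapshots by match, ignoring the counters. The following is only a sketch: the helper names are hypothetical, the sample flows are trimmed versions of the dumps above (tunnel and Ethernet fields removed for brevity), and the parser assumes the stats always appear as `packets:..., bytes:..., used:..., actions:...`, either on the same line as the match or on the next one.

```python
import re

# Stats and actions that end every flow in the dump.
STATS_RE = re.compile(r"packets:(\d+), bytes:(\d+), used:([^,]+), actions:(.+)$")


def parse_flows(dump):
    """Map each flow's match string to (packets, used) from a dump-flows text."""
    flows = {}
    pending = ""
    for line in dump.splitlines():
        pending += line.strip()
        found = STATS_RE.search(pending)
        if found:
            match = pending[:found.start()].rstrip(", ")
            flows[match] = (int(found.group(1)), found.group(3))
            pending = ""
    return flows


def diff_snapshots(old, new):
    """Return matches that appeared in or disappeared from the newer snapshot."""
    old_m, new_m = set(parse_flows(old)), set(parse_flows(new))
    return {"added": new_m - old_m, "removed": old_m - new_m}


# Trimmed sample data, not verbatim dumps.
SNAP_1 = """\
recirc_id(0x1ad3d),in_port(137),eth_type(0x86dd),ipv6(dst=2001:db8:2:a::301,proto=58),
packets:7, bytes:826, used:0.674s, actions:1706
recirc_id(0),in_port(137),eth_type(0x86dd),ipv6(dst=2001:db8:2:a::301,proto=58),
packets:7, bytes:826, used:0.709s, actions:ct(zone=7185),recirc(0x1ad3d)
"""

SNAP_2 = """\
recirc_id(0x1ad3d),in_port(137),eth_type(0x86dd),ipv6(dst=2001:db8:2:a::301,proto=58),
packets:11, bytes:1298, used:0.190s, actions:1706
recirc_id(0),in_port(1706),eth_type(0x86dd),ipv6(src=2001:db8:2:a::301,proto=58),icmpv6(type=128/0xfc),
packets:0, bytes:0, used:never, actions:ct(zone=7185),recirc(0x1b13c)
recirc_id(0),in_port(137),eth_type(0x86dd),ipv6(dst=2001:db8:2:a::301,proto=58),
packets:11, bytes:1298, used:0.237s, actions:ct(zone=7185),recirc(0x1ad3d)
"""

changes = diff_snapshots(SNAP_1, SNAP_2)
never_hit = [m for m, (pkts, used) in parse_flows(SNAP_2).items() if used == "never"]
print("added:", changes["added"])
print("removed:", changes["removed"])
print("never hit:", never_hit)
```

On the trimmed samples, the only change is the in_port(1706) flow, which is also the only one reported as never hit.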

After debugging to find when this behavior was introduced, we identified this commit [0] as the offending one. We backported only this commit to OVS 2.17.2 and the issue was reproduced there.

The flow below is an example: it is repeatedly installed on and removed from the datapath, and always shows 0 packets matched:

recirc_id(0),in_port(1706),eth(src=fa:16:3e:d7:c9:46,dst=fa:16:3e:9b:b3:c6),eth_type(0x86dd),ipv6(src=2001:db8:2:a::301,dst=2001:db8:1:2::10,proto=58,hlimit=255,frag=no),icmpv6(type=128/0xfc),
packets:0, bytes:0, used:never, actions:ct(zone=7185),recirc(0xae46f4)
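
The `icmpv6(type=128/0xfc)` field in this flow is written as value/mask, so the flow is not specific to Echo Request. As a quick check, the sketch below lists which ICMPv6 type values satisfy such a match. The helper name and the small name table are added here for illustration only.

```python
def types_matched(value, mask):
    """ICMPv6 type values (0-255) for which t & mask == value & mask."""
    return [t for t in range(256) if (t & mask) == (value & mask)]


# Names for the types relevant here (RFC 4443 and RFC 2710).
NAMES = {
    128: "Echo Request",
    129: "Echo Reply",
    130: "Multicast Listener Query",
    131: "Multicast Listener Report",
}

for t in types_matched(128, 0xFC):
    print(t, NAMES.get(t, "unknown"))
```

The mask 0xfc ignores the two low bits, so the match covers types 128 to 131: Echo Request, Echo Reply and two MLD message types.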

Since the commit changed the behavior of the classifier, it may have introduced an issue for ICMPv6 packets.

[0]
openvswitch/ovs@132fa24

Regards,

Tiago Pires

@tiagonux
Author

tiagonux commented Aug 6, 2024

Hey @igsilya

I created this reproducer [0], which you can use to reproduce the issue.
Could you take a look?

[0] https://pastebin.com/Qw0wGyJ3

Regards,

Tiago Pires

@tiagonux
Author

tiagonux commented Aug 7, 2024

Hi,

When run with the "run" option, this reproducer creates 90 LRs, LSs, and NAT rules for namespaces.
While it is running, v4 and v6 pings run in parallel and their output is saved to individual files under /tmp/.
For example, when using OVS 3.x I can see the v6 ping working, but its flow on the datapath is flapping.
The v4 ping's flow, by contrast, is installed on the datapath and stays there until the ping finishes, which shows the issue only affects ICMPv6 packets.
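
To quantify the flapping while the reproducer runs, a polling loop along these lines could count how often a flow disappears from the datapath. This is a rough sketch and not part of the reproducer: `poll_presence` and `count_flaps` are hypothetical helpers, the fake dumps are made-up and shortened, and the real `dump_flows` needs root and a running OVS, so the demo injects canned output instead.

```python
import subprocess
import time


def dump_flows():
    """Return current datapath flows via ovs-dpctl (needs root and a running OVS)."""
    result = subprocess.run(["ovs-dpctl", "dump-flows"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def poll_presence(needles, polls, interval=1.0, dump_fn=dump_flows):
    """Poll the dump and record whether some flow line contains all needles."""
    presence = []
    for _ in range(polls):
        dump = dump_fn()
        presence.append(any(all(n in line for n in needles)
                            for line in dump.splitlines()))
        time.sleep(interval)
    return presence


def count_flaps(presence):
    """Count present -> absent transitions in the polled sequence."""
    return sum(1 for prev, cur in zip(presence, presence[1:]) if prev and not cur)


# Demo with canned, shortened dumps instead of a live datapath.
V6_FLOW = ("recirc_id(0),in_port(1706),eth_type(0x86dd),icmpv6(type=128/0xfc), "
           "packets:0, bytes:0, used:never, actions:ct(zone=7185),recirc(0x1b13c)")
fake_dumps = iter([V6_FLOW, "", V6_FLOW, ""])

seen = poll_presence(["in_port(1706)", "icmpv6("], polls=4, interval=0,
                     dump_fn=lambda: next(fake_dumps))
print(seen, "flaps:", count_flaps(seen))
```

Against a live system, calling `poll_presence` without `dump_fn` would use `ovs-dpctl dump-flows`; running it once with needles for the v6 flow and once for the v4 flow would make the comparison between the two explicit.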

Tiago Pires

@igsilya
Member

igsilya commented Aug 7, 2024

Thanks @tiagonux ! I'll try to check it later this week.
One question: Are you running it on Ubuntu? If so, which version?

@tiagonux
Author

tiagonux commented Aug 7, 2024

Hey @igsilya,

I'm running on Ubuntu 22.04 (jammy).

Thanks

Tiago Pires
