-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ip full ip/v3/p9 #1
base: main
Are you sure you want to change the base?
Conversation
@@ -742,6 +862,10 @@ consider_port_binding_for_vif(const struct sbrec_port_binding *pb, | |||
child = local_binding_create(pb->logical_port, lbinding->iface, | |||
pb, binding_type); | |||
local_binding_add_child(lbinding, child); | |||
if (b_ctx_out->tracked_dp_bindings) { | |||
tracked_binding_datapath_lport_add( | |||
b_ctx_out->tracked_dp_bindings, pb, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not calling "local_datapath_update_lport_count()" here to increase the port count in local datapath?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
@@ -1154,9 +1182,13 @@ static bool | |||
runtime_data_ovs_interface_handler(struct engine_node *node, void *data) | |||
{ | |||
struct ed_type_runtime_data *rt_data = data; | |||
struct ed_type_runtime_tracked_data *tracked_data = node->tracked_data; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call
en_runtime_clear_tracked_data(node->tracked_data);
here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need. engine will take care of clearing the tracked data.
struct binding_ctx_in b_ctx_in; | ||
struct binding_ctx_out b_ctx_out; | ||
init_binding_ctx(node, rt_data, &b_ctx_in, &b_ctx_out); | ||
tracked_data->tracked = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call
en_runtime_clear_tracked_data(node->tracked_data);
here as well
@@ -1189,6 +1221,10 @@ runtime_data_sb_port_binding_handler(struct engine_node *node, void *data) | |||
return false; | |||
} | |||
|
|||
struct ed_type_runtime_tracked_data *tracked_data = node->tracked_data; | |||
tracked_data->tracked = true; | |||
b_ctx_out.tracked_dp_bindings = &tracked_data->tracked_dp_bindings; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why u missed
b_ctx_out.local_lports_changed = &tracked_data->local_lports_changed;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because it is meant to track ovs interfaces with iface-id set.
This is required to update the condition monitoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, in the below sequence ..
1)ovn-nbctl ls-add sw0
2)ovs-vsctl add-port br-int p1 --
set Interface p1 external_ids:iface-id=sw0-port1
3)ovn-nbctl lsp-add sw0 sw0-port1
We step 3 will add the local binding and claim the port.
Step 2 will run runtime_data_ovs_interface_handler, but local binding is not created as port doesn't exist in SBDB
Step 3 will result in runtime_data_sb_port_binding_handler, which adds local binding. So in this case, we need to update b_ctx_out.local_lports_changed as this flag is used in flow_output_runtime_data_handler
Am I missing anything here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
local_binding is created in step 2.
0d7b66c
to
53f31c0
Compare
…and binding_ctx_out. No functional changes are introduced in this patch. Signed-off-by: Numan Siddique <numans@ovn.org>
… state. This patch adds a new data structure - 'struct local_binding' which represents a probable local port binding. A new object of this structure is creared for each interface present in the integration bridge (br-int) with the external_ids:iface-id set. This struct has 2 main fields - 'struct ovsrec_interface *' and 'struct sbrec_port_binding *'. These fields are updated during port claim and release. A shash of the local bindings is maintained with the 'iface-id' as the hash key in the runtime data of the incremental processing engine. This patch helps in the upcoming patch to add incremental processing support for OVS interface and SB port binding changes. Signed-off-by: Numan Siddique <numans@ovn.org>
53f31c0
to
3bd6c0f
Compare
…data. This patch handles SB port binding changes and OVS interface changes incrementally in the runtime_data stage which otherwise would have resulted in calls to binding_run(). Prior to this patch, port binding changes were handled differently. The changes were only evaluated to see if they need full recompute or they can be ignored. With this patch, the runtime data is updated with the changes (both SB port binding and OVS interface) and ports are claimed/released in the handlers itself, avoiding the need to trigger recompute of runtime data stage. Signed-off-by: Numan Siddique <numans@ovn.org>
This patch adds partial support of incremental processing of datapath binding. If a datapath is deleted, then a full recompute is triggered if that datapath is present in the 'local_datapaths' hmap of runtime data. Signed-off-by: Numan Siddique <numans@ovn.org>
… actions have changed. If ofctrl_check_and_add_flow(F') is called where flow F' has match-actions (M, A2) and if there already exists a flow F with match-actions (M, A1) in the desired flow table, then this function doesn't update the existing flow F with new actions actions A2. This patch is required for the upcoming patch in this series which adds incremental processing for OVS interface in the flow output stage. Since we will be not be clearing the flow output data if these changes are handled incrementally, some of the existing flows will be updated with new actions. One such example would be flows in physical table OFTABLE_LOCAL_OUTPUT (table 33). And this patch is required to update such flows. Otherwise we will have incomplete actions installed. Signed-off-by: Numan Siddique <numans@ovn.org>
…put stage. This patch handles ct zone changes and OVS interface changes incrementally in the flow output stage. Any changes to ct zone can be handled by running physical_run() instead of running flow_output_run(). And any changes to OVS interfaces can be either handled incrementally (for OVS interfaces representing VIFs) or just running physical_run() (for tunnel and other types of interfaces). To better handle this, a new engine node 'physical_flow_changes' is added which handles changes to ct zone and OVS interfaces. Signed-off-by: Numan Siddique <numans@ovn.org>
With this patch, an engine node which is an input to another engine node can provide the tracked data. The parent of this engine node can handle this tracked data incrementally. At the end of the engine_run(), the tracked data of the nodes are cleared. Signed-off-by: Numan Siddique <numans@ovn.org>
In order to handle runtime data changes incrementally, the flow outut runtime data handle should know the changed runtime data. Runtime data now tracks the changed data for any OVS interface and SB port binding changes. The tracked data contains a hmap of tracked datapaths (which changed during runtime data processing. The flow outout runtime_data handler in this patch doesn't do much with the tracked data. It returns false if there is tracked data available so that flow_output run is called. If no tracked data is available then there is no need for flow computation and the handler returns true. Next patch in the series processes the tracked data incrementally. Co-Authored-by: Venkata Anil <anilvenkata@redhat.com> Signed-off-by: Venkata Anil <anilvenkata@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>
This patch processes the logical flows of tracked datapaths and tracked logical ports. To handle the tracked logical port changes, reference of logical flows to port bindings is maintained. Co-Authored-by: Numan Siddique <numans@ovn.org> Signed-off-by: Venkata Anil <anilvenkata@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>
3bd6c0f
to
dbd65b4
Compare
The 'nexthop' that passed to ic_route_hash() is not fully initialized in get_nexthop_from_lport_addresses(). 'nexthop' has type of 'struct v46_ip' which contains a union to share space for ipv4 and ipv6 address. If only ipv4 initialized where is a plenty of uninitialized space that goes to hash_bytes(nexthop, sizeof *nexthop, basis). Impact: there are two places where this function is called. 1. In add_to_routes_ad(), the nexthop is initialized in parse_route() before calling get_nexthop_from_lport_addresses(), luckily. 2. In add_network_to_routes_ad(), we are unlucky. When a directly connected network of a router is found to be advertised, if the route already existed in the global IC-SB, it may not be found due to the hash difference, and results in the existing route being deleted and the same one recreated, unnecessarily. This patch fixes the problem by initializing the struct to zero before setting the fields. From Ilya's report: > Report from MemorySanitizer: > > ==3074629==WARNING: MemorySanitizer: use-of-uninitialized-value > #0 0x67177e in mhash_add__ ovs/./lib/hash.h:66:9 > #1 0x671668 in mhash_add ovs/./lib/hash.h:78:12 > #2 0x6701e9 in hash_bytes ovs/lib/hash.c:38:16 > ovn-org#3 0x524b4a in add_network_to_routes_ad ic/ovn-ic.c:1095:5 > ovn-org#4 0x51eea3 in route_run ic/ovn-ic.c:1424:21 > ovn-org#5 0x51887b in main ic/ovn-ic.c:1674:17 > ovn-org#6 0x7fd4ce7871a2 in __libc_start_main > ovn-org#7 0x49c90d in _start (ic/ovn-ic+0x49c90d) > > Uninitialized value was created by an allocation of 'nexthop' in the > stack frame of function 'add_network_to_routes_ad' > #0 0x5245f0 in add_network_to_routes_ad ic/ovn-ic.c:1069 Reported-by: Ilya Maximets <i.maximets@ovn.org> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2020-October/376160.html Fixes: 57b347c ("ovn-ic: Route advertisement.") Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Han Zhou <hzhou@ovn.org>
'child_port_list' is an array of pointers that should be freed too. Direct leak of 30 byte(s) in 6 object(s) allocated from: #0 0x501fff in malloc (/tests/ovstest+0x501fff) #1 0x6227e6 in xmalloc /lib/util.c:138:15 #2 0x6228b8 in xmemdup0 /lib/util.c:168:15 ovn-org#3 0x8183d6 in parse_fwd_group_action /lib/actions.c:3374:30 ovn-org#4 0x814b6e in parse_action /lib/actions.c:3610:9 ovn-org#5 0x8139ef in parse_actions /lib/actions.c:3637:14 ovn-org#6 0x8136a3 in ovnacts_parse /lib/actions.c:3672:9 ovn-org#7 0x813c80 in ovnacts_parse_string /lib/actions.c:3699:5 ovn-org#8 0x53a979 in test_parse_actions /tests/test-ovn.c:1372:21 ovn-org#9 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17 ovn-org#10 0x537c75 in test_ovn_main /tests/test-ovn.c:1630:5 ovn-org#11 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17 ovn-org#12 0x537359 in main /tests/ovstest.c:133:9 ovn-org#13 0x7f06978f21a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) CC: Manoj Sharma <manoj.sharma@nutanix.com> Fixes: edb2400 ("Forwarding group to load balance l2 traffic with liveness detection") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
'dsts' should be freed in case of any error. Direct leak of 4 byte(s) in 1 object(s) allocated from: #0 0x502378 in realloc (/tests/ovstest+0x502378) #1 0x622826 in xrealloc /lib/util.c:149:9 #2 0x8194f4 in parse_select_action /lib/actions.c:1185:20 ovn-org#3 0x814f49 in parse_set_action /lib/actions.c:3499:13 ovn-org#4 0x814341 in parse_action /lib/actions.c:3554:9 ovn-org#5 0x8139ef in parse_actions /lib/actions.c:3643:14 ovn-org#6 0x8136a3 in ovnacts_parse /lib/actions.c:3678:9 ovn-org#7 0x813c80 in ovnacts_parse_string /lib/actions.c:3705:5 ovn-org#8 0x53a4e8 in test_parse_actions /tests/test-ovn.c:1321:17 ovn-org#9 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17 ovn-org#10 0x537c75 in test_ovn_main /tests/test-ovn.c:1630:5 ovn-org#11 0x54e7a8 in ovs_cmdl_run_command__ /lib/command-line.c:247:17 ovn-org#12 0x537359 in main /tests/ovstest.c:133:9 ovn-org#13 0x7f9ce05ba1a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) CC: Han Zhou <hzhou@ovn.org> Fixes: 85b3544 ("ovn-controller: A new action "select".") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
'parse_ofp_meter_mod_str' allocates space for meter.bands that should be freed. Direct leak of 448 byte(s) in 7 object(s) allocated from: #0 0x52100f in malloc (/controller/ovn-controller+0x52100f) #1 0x7523a6 in xmalloc /lib/util.c:138:15 #2 0x6fd079 in ofpbuf_init /lib/ofpbuf.c:123:26 ovn-org#3 0x6cba27 in parse_ofp_meter_mod_str /lib/ofp-meter.c:779:5 ovn-org#4 0x5705b8 in add_meter_string /controller/ofctrl.c:1674:19 ovn-org#5 0x56f736 in ofctrl_put /controller/ofctrl.c:2105:13 ovn-org#6 0x59aebb in main /controller/ovn-controller.c:2627:25 ovn-org#7 0x7f07873251a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) CC: Guoshuai Li <ligs@dtdream.com> Fixes: c25094b ("ovn: OVN Support QoS meter") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
'smap_clear()' doesn't free allocated memory, but 'smap_clone()' re-initializes hash map clearing internal pointers and leaking this memory. 'smap_destroy()' should be used instead. Also, all the records and array of datapaths should be freed on destruction of a cache entry. Direct leak of 16 byte(s) in 2 object(s) allocated from: #0 0x5211c7 in calloc (/controller/ovn-controller+0x5211c7) #1 0x752364 in xcalloc /lib/util.c:121:31 #2 0x576e76 in sync_dns_cache /controller/pinctrl.c:2517:25 ovn-org#3 0x5758fb in pinctrl_run /controller/pinctrl.c:3158:5 ovn-org#4 0x59b06c in main /controller/ovn-controller.c:2642:25 ovn-org#5 0x7fb570fc11a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) Indirect leak of 26 byte(s) in 2 object(s) allocated from: #0 0x52100f in malloc (/controller/ovn-controller+0x52100f) #1 0x7523d6 in xmalloc /lib/util.c:138:15 #2 0x7524a8 in xmemdup0 /lib/util.c:168:15 ovn-org#3 0x73d8fc in smap_clone /lib/smap.c:314:45 ovn-org#4 0x576e2f in sync_dns_cache /controller/pinctrl.c:2513:13 ovn-org#5 0x5758fb in pinctrl_run /controller/pinctrl.c:3158:5 ovn-org#6 0x59b06c in main /controller/ovn-controller.c:2642:25 ovn-org#7 0x7fb570fc11a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) Fixes: 6b72068 ("ovn-controller: Add a new thread in pinctrl module to handle packet-ins.") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
shash contains pointers to the data that should be freed. Direct leak of 32 byte(s) in 2 object(s) allocated from: #0 0x52100f in malloc (/controller/ovn-controller+0x52100f) #1 0x752436 in xmalloc /lib/util.c:138:15 #2 0x5a2f0b in add_pending_ct_zone_entry /controller/ovn-controller.c:548:45 ovn-org#3 0x5a2d1d in update_ct_zones /controller/ovn-controller.c:668:9 ovn-org#4 0x59d8c6 in en_ct_zones_run /controller/ovn-controller.c:1495:5 ovn-org#5 0x5dade4 in engine_run /lib/inc-proc-eng.c:377:9 ovn-org#6 0x59adf4 in main /controller/ovn-controller.c ovn-org#7 0x7f0799ef41a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) CC: xu rong <xu.rong@zte.com.cn> Fixes: 252e164 ("ovn-controller: pending_ct_zones should be destroyed") Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
38a5351
to
5be9e1e
Compare
When a port binding of type "l3gateway" is claimed its remote peer port_binding is also stored in local_datapath.peer_ports[].remote. If the remote peer port_binding is deleted first (i.e., before the local "l3gateway" one) then we need to remove the complete local_datapath.peer_ports[] entry in order to avoid ending up using dangling pointers to already freed port bindings. Also, properly reset local_datapath->has_local_l3gateway in remove_pb_from_local_datapath(). Ilya reported this issue found by AddressSanitizer during his testing: ==1816017==ERROR: AddressSanitizer: heap-use-after-free on address 0x6140000cb170 at pc 0x0000005ab574 bp 0x7fff68925a30 sp 0x7fff68925a28 READ of size 8 at 0x6140000cb170 thread T0 #0 0x5ab573 in put_replace_chassis_mac_flows git/ovn/controller/physical.c:550:9 #1 0x5a65eb in consider_port_binding git/ovn/controller/physical.c:1168:13 #2 0x5a8764 in physical_run git/ovn/controller/physical.c:1607:9 ovn-org#3 0x5a0064 in flow_output_physical_flow_changes_handler git/ovn/controller/ovn-controller.c:2127:9 ovn-org#4 0x5db423 in engine_compute git/ovn/lib/inc-proc-eng.c:306:18 ovn-org#5 0x5dae1f in engine_run_node git/ovn/lib/inc-proc-eng.c:352:14 ovn-org#6 0x5dac74 in engine_run git/ovn/lib/inc-proc-eng.c:377:9 ovn-org#7 0x59ad64 in main git/ovn/controller/ovn-controller.c ovn-org#8 0x7f39fa6491a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) ovn-org#9 0x480b2d in _start (git/ovn/controller/ovn-controller+0x480b2d) 0x6140000cb170 is located 304 bytes inside of 408-byte region [0x6140000cb040,0x6140000cb1d8) freed by thread T0 here: #0 0x520d07 in free (git/ovn/controller/ovn-controller+0x520d07) #1 0x712de7 in ovsdb_idl_db_track_clear git/ovs/lib/ovsdb-idl.c:1984:21 #2 0x59b5cd in main git/ovn/controller/ovn-controller.c:2762:9 ovn-org#3 0x7f39fa6491a2 in __libc_start_main (/lib64/libc.so.6+0x271a2) Reported-by: Ilya Maximets <i.maximets@ovn.org> Fixes: 354bdba ("ovn-controller: I-P for SB port binding and OVS interface in runtime_data.") Tested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org>
176890d
to
031af9a
Compare
397915b
to
22bacab
Compare
1a799cf
to
9a0f307
Compare
In some scenarios, op->list is not attached anywhere, which makes attempts to detach it trigger ubsan failure. ovn/ovs/include/openvswitch/list.h:252:17: runtime error: member access within null pointer of type 'struct ovs_list' #0 0x?? in ovs_list_remove ovn/ovs/include/openvswitch/list.h:252:17 #1 0x?? in ovn_port_allocate_key ovn/northd/northd.c:4021:13 #2 0x?? in ls_port_init ovn/northd/northd.c:4321:10 ovn-org#3 0x?? in ls_port_create ovn/northd/northd.c:4342:10 ovn-org#4 0x?? in ls_handle_lsp_changes ovn/northd/northd.c:4511:18 ovn-org#5 0x?? in northd_handle_ls_changes ovn/northd/northd.c:4655:14 ovn-org#6 0x?? in northd_nb_logical_switch_handler ovn/northd/en-northd.c:150: This patch makes northd use op->list only as a temporary means for build_ports logic to track ports that are persisted in both, nb, or sb only. Now build_ports will always detach ops once done. Now that op->list is never left attached to a list, we can remove ovs_list_remove calls for it elsewhere, including where op was never attached in the first place. Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com> Signed-off-by: Numan Siddique <numans@ovn.org> Acked-by: Mark Michelson <mmichels@redhat.com>
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
Issue: Upon updating the network_name option of localnet port from one physical bridge to another, ovn-controller fails to cleanup the peer localnet patch port from the old bridge and ends up creating a duplicate peer localnet patch port which fails in the following ovsdb transaction: ``` "Transaction causes multiple rows in \"Interface\" table to have identical values (patch-localnet_a7d47490-8a90-4c4d-8266-2915ad07c185-to-br-int) ``` Current workflow: 1. Keep a set of all existing localnet ports on br-int. Let us call this set: existing_ports. 2. For each localnet port in SB: 2.1 Create a patch port on br-int. (if it already exists on br-int, do not create but remove the entry from exisitng_ports set). 2.2 Create a peer patch port on br-phys-x. (if it already exists on the bridge, do not create but remove the entry from exisitng_ports set). 3. Finally, cleanup all the ports and its peers from exisiting_ports. With the above workflow, upon network_name change of localnet LSP, since ports on br-int does not change, only peer port needs to be move from br-phys-x to br-phys-y, the localnet port is removed from exisiting_ports in #2.1 and its peer never becomes eligible for cleanup. Fix: Include both patch port on br-int and its peer port in the exisiting_ports set as part of #1. This make sures that the peer port is only removed from existing_ports only if it is already present on the correct bridge. (#2.1/#2.2) Otherwise, during the cleanup in ovn-org#3 it will be considered. Fixes: b600316 ("Don't delete patch ports that don't belong to br-int") Signed-off-by: Priyankar Jain <priyankar.jain@nutanix.com> Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 2609cd9)
d95180c
to
0d7e867
Compare
e4222db
to
cf55c6c
Compare
Commit 8d13579 creates a chassisredirect port for a logical switch port (of type 'router') in certain scenarios and 'op->nbsp' can be NULL. The following crash is reported by Sanitizer. ==84927==ERROR: AddressSanitizer: SEGV on unknown address ==84927==The signal is caused by a READ memory access. ==84927==Hint: address points to the zero page. #0 0x57ab3854f04a in hmap_first_with_hash ovn/ovs/./include/openvswitch/hmap.h:360:38 #1 0x57ab3854fedf in smap_find__ ovn/ovs/lib/smap.c:421:5 #2 0x57ab3854f7b8 in smap_get_node ovn/ovs/lib/smap.c:217:12 ovn-org#3 0x57ab3854f74f in smap_get_def ovn/ovs/lib/smap.c:208:30 ovn-org#4 0x57ab3854f726 in smap_get ovn/ovs/lib/smap.c:200:12 ovn-org#5 0x57ab3854f862 in smap_get_int ovn/ovs/lib/smap.c:240:25 ovn-org#6 0x57ab383222eb in ovn_port_assign_requested_tnl_id ovn/northd/northd.c:4372:27 ovn-org#7 0x57ab383072fa in build_ports ovn/northd/northd.c:4454:9 ovn-org#8 0x57ab38301457 in ovnnb_db_run ovn/northd/northd.c:18023:5 ovn-org#9 0x57ab3841d99e in en_northd_run ovn/northd/en-northd.c:137:5 ovn-org#10 0x57ab384599b2 in engine_recompute ovn/lib/inc-proc-eng.c:411:5 ovn-org#11 0x57ab38459d6e in engine_run_node ovn/lib/inc-proc-eng.c:473:9 ovn-org#12 0x57ab38459ec3 in engine_run ovn/lib/inc-proc-eng.c:524:9 ovn-org#13 0x57ab38430c5d in inc_proc_northd_run ovn/northd/inc-proc-northd.c:420:5 ovn-org#14 0x57ab3841bb2f in main ovn/northd/ovn-northd.c:970:32 This patch fixes this issue. It also corrects some typo introduced by the commit - changes the reference from "chassisresident" to "chassisredirect". Fixes: 8d13579 ("Add support for centralize routing for distributed gw ports.") Reported-by: Felix Huettner <felix.huettner@mail.schwarz> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416264.html Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org>
Commit 8d13579 creates a chassisredirect port for a logical switch port (of type 'router') in certain scenarios and 'op->nbsp' can be NULL. The following crash is reported by Sanitizer. ==84927==ERROR: AddressSanitizer: SEGV on unknown address ==84927==The signal is caused by a READ memory access. ==84927==Hint: address points to the zero page. #0 0x57ab3854f04a in hmap_first_with_hash ovn/ovs/./include/openvswitch/hmap.h:360:38 #1 0x57ab3854fedf in smap_find__ ovn/ovs/lib/smap.c:421:5 #2 0x57ab3854f7b8 in smap_get_node ovn/ovs/lib/smap.c:217:12 ovn-org#3 0x57ab3854f74f in smap_get_def ovn/ovs/lib/smap.c:208:30 ovn-org#4 0x57ab3854f726 in smap_get ovn/ovs/lib/smap.c:200:12 ovn-org#5 0x57ab3854f862 in smap_get_int ovn/ovs/lib/smap.c:240:25 ovn-org#6 0x57ab383222eb in ovn_port_assign_requested_tnl_id ovn/northd/northd.c:4372:27 ovn-org#7 0x57ab383072fa in build_ports ovn/northd/northd.c:4454:9 ovn-org#8 0x57ab38301457 in ovnnb_db_run ovn/northd/northd.c:18023:5 ovn-org#9 0x57ab3841d99e in en_northd_run ovn/northd/en-northd.c:137:5 ovn-org#10 0x57ab384599b2 in engine_recompute ovn/lib/inc-proc-eng.c:411:5 ovn-org#11 0x57ab38459d6e in engine_run_node ovn/lib/inc-proc-eng.c:473:9 ovn-org#12 0x57ab38459ec3 in engine_run ovn/lib/inc-proc-eng.c:524:9 ovn-org#13 0x57ab38430c5d in inc_proc_northd_run ovn/northd/inc-proc-northd.c:420:5 ovn-org#14 0x57ab3841bb2f in main ovn/northd/ovn-northd.c:970:32 This patch fixes this issue. It also corrects some typo introduced by the commit - changes the reference from "chassisresident" to "chassisredirect". Fixes: 8d13579 ("Add support for centralize routing for distributed gw ports.") Reported-by: Felix Huettner <felix.huettner@mail.schwarz> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416264.html Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Numan Siddique <numans@ovn.org> (cherry picked from commit 9fbda4a)
No description provided.