-
Notifications
You must be signed in to change notification settings - Fork 47
Enable TLS-e by default #754
Enable TLS-e by default #754
Conversation
I'll be super happy when this lands, but we're not quite there yet. Specifically we're waiting on some work to be completed on the control plane side. The control plane is not yet enabled for internal TLS by default. |
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/0488afe1d9834cf98c2fe8a38188c724 ✔️ openstack-k8s-operators-content-provider SUCCESS in 1h 39m 09s |
oh ok so we need a patch to enable internal tls on the contolplane first and then use depend on form this to that. to test them together before we can proceed with this |
ok so this fails copying the cert the same way it was failing for me locally last week i tried enabling internal tls after the contolane and daata plane was intillly deployed and it never seamed to create the tls keys for the the compute node |
there does appar to be a cert for the compute with a .ci-rdo.local domain and another her its not clear to me why the generic role for copying the cert failed but as the pod output shows it did not copy the libvirt service certs correctly and we can confirm that by looking at |
ah there is a bug in the generic edpm_install_certs role @vakwetu the combined secte has compute-0.ci-rdo.local-tls.key but its trying to copy so the generic role is copying using the hostname but the comnined secrert is using the fqdn this is likely because the hostname filed is the fqdn but the node name is the hostname not the fqdn. this can either be fixed in the ansible code or in the dataplane operator go code but we need to deciend on one or the other. |
@SeanMooney nice catch. The generic secret is using the node hostName as in here: https://github.com/openstack-k8s-operators/dataplane-operator/blob/main/pkg/deployment/cert.go#L136 , whereas the install-certs role is copying using the inventory_hostname - https://github.com/openstack-k8s-operators/edpm-ansible/blob/f98ed5f09467620a00e2936c27a80120b634a56c/roles/edpm_install_certs/tasks/copy_certs_and_keys.yaml#L49 I'm not sure, but is inventory_hostname being populated here :
I have no particular preference for either method. However, as I'll be at a conference for the next few days - please go ahead and choose one accordingly. I'll review as I can. |
thanks this is exactly what i was expcting one is while i generally recommend having unique hostnames not just unique fqdns it might be better to update the role to use ansible_hostname instead of inventory_hostname inventory_hostname in this case is edpm-compute-0
is translated into the following in the inventory
so if we use host_vars[inventory_hostname][canonical_hostname] we will match the key in the cert secrete i might just use host_vars[inventory_hostname][canonical_hostname] |
i believe openstack-k8s-operators/edpm-ansible#596 will fix this |
check-rdo lets see if the depends on resolves the name missmatch |
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/0205b3f0fa9546cc89a7e2d170b73999 ❌ openstack-k8s-operators-content-provider FAILURE in 9m 42s |
check-rdo content-provider failed |
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/a4ffc2ccb483470990dd2e88d9d95ca9 ❌ openstack-k8s-operators-content-provider FAILURE in 7m 06s |
we are getting connection resets in the content provider Get "https://proxy.golang.org/github.com/openstack-k8s-operators/lib-common/modules/common/@v/v0.3.1-0.20240229121803-169ced56d56e.mod\": read tcp 38.102.83.153:51364->142.251.33.177:443: read: connection reset by peer", "make: *** [Makefile:412: gowork] Error 1" ill try this again later and see if i can run this locally in the mean time |
check-rdo lets see if the package download issues have resolved themselves |
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/be302e29aea64931bed5e245b255b4fb ❌ openstack-k8s-operators-content-provider FAILURE in 9m 24s |
ok so my fix as written wont quite work given
the dataplae operator calulats the host names for edpm-comptue-0 as
and canonical host name is calculated as edpm-compute-0.ctlplane.example.com the key in the libvirt secrete however is so canonical_hostname will not always match that key. this is what the relevnet part of the inventory looks like
so really to fix this we need a change to the go code. we need to normalise the keys of the secrete to either rthe inventory_hostname or the canonical name |
the Depends-On: openstack-k8s-operators/edpm-ansible#596 should now be flipped and that should depend on this patch but ill wait for the ci jobs to complete before doing that. --- later--- the job default change cant happen until the ansible code change an dhte ansible code change requires the go code change. in anycase we like will want to merge both together if we decide to proceed with these patches. we could merge the ansible change first since no job is testing it yet but it will technically break deployment that done have the contolpalne FQDN set in the hostname field of the nodes install_yamls does not use the fqdn but the ci_framework does in the jobs so this would only be a minor local issues while we with for both to merge. |
ok cool so this work for tempest i just need to fix the kuttl tests. |
Build failed (check pipeline). Post https://review.rdoproject.org/zuul/buildset/4cf9340b223b4335afabf15da2241ba5 ✔️ openstack-k8s-operators-content-provider SUCCESS in 2h 16m 59s |
This is orthogonal to what we've had in earlier OSP versions where it was an opt-in. Have we discussed and agreed on the general and upgrade implications for customers and normal non-production deployments? |
yes the aggree strage is that this will not be change as part of adoption. for greenfiled the intent is that all new greenfeild deployments should be tls-e but you can use the same option to opt our in greenfiled as is used for adoption. i do not belive there is any intent to force tls-e only we did discuss that as an option and decide the impact to customer was not warranted and default to on was sufficent. |
TLS-e is intended to be on by default with an explict opt-out when not required. This change has already been done for the contolplane and this commit does the same for the dataplaneNodeSet This change updates the dataplaneNodeset TLSEnabled default to true the envtests are modifed to account for this change and the dataplane-create-test kuttl test is exented to assert the default behavior. exisiting test covergeage fo tls supprot is unmodifed. The base example cr is also updated to make it clear that tls is the default however that change is not strictly requried. With this change the tempest jobs should now default to useing TLS unless they explictly opt out. Related: OSPRH-2382 Signed-off-by: Fabricio Aguiar <fabricio.aguiar@gmail.com>
Not sure how this PR affects tempest |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: fao89, SeanMooney The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
4059a52
into
openstack-k8s-operators:main
/lgtm |
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
It's required with tls enabled, [1] already enabled tls by default so need to include the install-certs service before the services requiring the certs. The issue should be visible once dataplane-operator is updated[2] [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the install-certs service before the services requiring the certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the install-certs service before the services requiring the certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. Also update nova-custom-ceph dataplane service to include missing tls params. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the missing certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the install-certs service before the services requiring the certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. Also update nova-custom-ceph dataplane service to include missing tls params. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the missing certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
It's required with tls enabled, [1] already enabled tls by default so need to include the install-certs service before the services requiring the certs. The issue should be visible once dataplane-operator is updated[2] or tlsEnabled flag is explicitly set in the nodeset. Also update nova-custom-ceph dataplane service to include missing tls params. [1] openstack-k8s-operators/dataplane-operator#754 [2] openstack-k8s-operators/openstack-operator#732
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
As there is no longer an external dependency, TLS-e can now be enabled by default (with explicit opt-out e.g for adoption). Depends-On: openstack-k8s-operators/dataplane-operator#754 Related: OSPRH-2038
TLS-e is intended to be on by default with an explict opt-out
when not required. This change has already been done for the
contolplane and this commit does the same for the dataplaneNodeSet
This change updates the dataplaneNodeset TLSEnabled default to true
the envtests are modifed to account for this change and
the dataplane-create-test kuttl test is exented to assert the default
behavior.
exisiting test covergeage fo tls supprot is unmodifed.
The base example cr is also updated to make it clear that tls is the
default however that change is not strictly requried.
With this change the tempest jobs should now default to useing
TLS unless they explictly opt out.
Related: OSPRH-2382