From 384cae283f17e6566f179cb165f896735bc7b238 Mon Sep 17 00:00:00 2001
From: Jian Wang
Date: Fri, 31 May 2024 10:38:56 +0200
Subject: [PATCH] Add security related best practices

Signed-off-by: Jian Wang
---
 .../harvester_security_best_practice.md | 164 ++++++++++++++++++
 1 file changed, 164 insertions(+)
 create mode 100644 kb/2024-05-31/harvester_security_best_practice.md

diff --git a/kb/2024-05-31/harvester_security_best_practice.md b/kb/2024-05-31/harvester_security_best_practice.md
new file mode 100644
index 00000000..bf808bd6
--- /dev/null
+++ b/kb/2024-05-31/harvester_security_best_practice.md
@@ -0,0 +1,164 @@
+---
+title: Best Practices for Harvester Security
+description: A set of best practices for Harvester security.
+slug: harvester_security_best_practices
+authors:
+  - name: Jian Wang
+    title: Staff Software Engineer
+    url: https://github.com/w13915984028
+    image_url: https://github.com/w13915984028.png
+tags: [harvester, security]
+hide_table_of_contents: false
+---
+
+# User Provided Credentials on Harvester
+
+When [installing a Harvester cluster](https://docs.harvesterhci.io/v1.2/install/index#installation-steps), you provide:
+
+- A `Cluster token` on the initial node. All subsequent nodes use this `Cluster token` to join the cluster.
+
+- A `Password` for the default Linux user `rancher` on each node.
+
+- `SSH keys` (optional) on each node.
+
+- An `HTTP proxy` (optional) on each node.
+
+## Cluster Token
+
+### Change the Cluster Token on a Non-initial Node
+
+When a node fails to join the cluster due to a cluster token error, follow the link below to modify the token on that node.
+
+https://docs.harvesterhci.io/v1.2/troubleshooting/index/#modifying-cluster-token-on-agent-nodes
+
+### Change the `Cluster token` (`RKE2 Token Rotation`)
+
+This is not supported on Harvester.
+
+Harvester embeds RKE2. According to the RKE2 [token rotation](https://docs.rke2.io/security/token) document, RKE2 supports rotating the `cluster token` as of the 2023-11 releases (v1.28.3+rke2r2, v1.27.7+rke2r2, v1.26.10+rke2r2, v1.25.15+rke2r2), using a command such as `rke2 token rotate --token original --new-token new`.
+
+However, when tested on Harvester `v1.3.0` with RKE2 version `RKE2_VERSION="v1.27.10+rke2r1"`, the function does not work as expected, because Harvester is based on `RKE2 r1`, not `r2`.
+
+1. Rotate the token on the initial node.
+
+```
+/opt/rke2/bin $ ./rke2 token rotate --token rancher --new-token rancher1
+
+WARNING: Recommended to keep a record of the old token. If restoring from a snapshot, you must use the token associated with that snapshot.
+WARN[0000] Cluster CA certificate is not trusted by the host CA bundle, but the token does not include a CA hash. Use the full token from the server's node-token file to enable Cluster CA validation.
+Token rotated, restart rke2 nodes with new token
+```
+
+2. Reboot the initial node. RKE2 fails to start.
+
+RKE2 log:
+
+```
+...
+May 29 15:45:11 harv41 rke2[3293]: time="2024-05-29T15:45:11Z" level=info msg="etcd temporary data store connection OK"
+May 29 15:45:11 harv41 rke2[3293]: time="2024-05-29T15:45:11Z" level=info msg="Reconciling bootstrap data between datastore and disk"
+May 29 15:45:11 harv41 rke2[3293]: time="2024-05-29T15:45:11Z" level=fatal msg="Failed to reconcile with temporary etcd: bootstrap data already found and encrypted with different token"
+May 29 15:45:11 harv41 systemd[1]: rke2-server.service: Main process exited, code=exited, status=1/FAILURE
+...
+```
+
+:::warning
+
+Until Harvester officially supports this feature, do not try it on your cluster, even though the embedded `rke2` binary has the `token rotate` option.
+
+:::
+
+## Change the Password of the Default User `rancher`
+
+The process is node-specific: if your cluster uses the same password for all nodes, you need to change it on each node.
+
+https://docs.harvesterhci.io/v1.2/install/update-harvester-configuration/#password-of-user-rancher
+
+## Change the SSH Keys
+
+https://docs.harvesterhci.io/v1.2/install/update-harvester-configuration#ssh-keys-of-user-rancher
+
+## Change the HTTP Proxy
+
+After a Harvester cluster is installed, you can change the HTTP proxy from the Harvester UI.
+
+https://docs.harvesterhci.io/v1.2/advanced/index#http-proxy
+
+Alternatively, you can use `kubectl` or the REST API at path `//harvesterhci.io.setting/http-proxy` to update it.
+
+```
+$ kubectl get settings.harvesterhci.io http-proxy -oyaml
+
+apiVersion: harvesterhci.io/v1beta1
+default: '{}'
+kind: Setting
+metadata:
+  creationTimestamp: "2024-05-13T20:44:20Z"
+  generation: 1
+  name: http-proxy
+  resourceVersion: "5914"
+  uid: 282506bb-f1dd-4247-bf0e-93640698c1f5
+status: {}
+```
+
+Harvester has a webhook that validates this setting to make sure the internal IPs and CIDRs are included in `noProxy`.
+
+:::note
+
+Changing it directly in the files under the `/oem` path on the host is not recommended, for the following reasons:
+1. You need to configure it on each node manually.
+1. The local file is not propagated to new nodes automatically.
+1. Without the webhook, some misconfigurations may not be detected in time, e.g. [Node IP should be in noProxy](https://github.com/harvester/harvester/pull/5824).
+1. Harvester may change the file name or content structure in the future.
+
+:::
+
+# More
+
+## `auto-rotate-rke2-certs`
+
+Harvester is built on top of `Kubernetes`, `RKE2`, and `Rancher`. `RKE2` generates a set of `*.crt` and `*.key` files that the `Kubernetes` components need to work. By default, the `*.crt` files expire after one year.
+
+```
+$ ls /var/lib/rancher/rke2/server/tls/ -alth
+
+total 156K
+-rw-r--r-- 1 root root  570 May 27 08:45 server-ca.nochain.crt
+-rw------- 1 root root 1.7K May 27 08:45 service.current.key
+-rw-r--r-- 1 root root  574 May 27 08:45 client-ca.nochain.crt
+drwxr-xr-x 2 root root 4.0K May 13 20:45 kube-controller-manager
+drwxr-xr-x 2 root root 4.0K May 13 20:45 kube-scheduler
+drwx------ 6 root root 4.0K May 13 20:45 .
+drwx------ 8 root root 4.0K May 13 20:45 ..
+-rw-r--r-- 1 root root 3.9K May 13 20:40 dynamic-cert.json
+drwx------ 2 root root 4.0K May 13 20:39 temporary-certs
+-rw------- 1 root root 1.7K May 13 20:39 service.key
+-rw-r--r-- 1 root root 1.2K May 13 20:39 client-auth-proxy.crt
+-rw------- 1 root root  227 May 13 20:39 client-auth-proxy.key
+-rw-r--r-- 1 root root 1.2K May 13 20:39 client-rke2-cloud-controller.crt
+...
+-rw-r--r-- 1 root root 1.2K May 13 20:39 client-admin.crt
+-rw------- 1 root root  227 May 13 20:39 client-admin.key
+...
+
+
+$ openssl x509 -enddate -noout -in /var/lib/rancher/rke2/server/tls/client-admin.crt
+
+notAfter=May 13 20:39:42 2025 GMT
+```
+
+You may encounter expired certificates when upgrading or rebooting a node after a cluster has been running for more than one year. The [workaround](https://github.com/harvester/harvester/issues/3863#issuecomment-1539681311) is to delete the related files and restart the pod.
+
+Harvester v1.3.0 adds a new setting, `auto-rotate-rke2-certs`. When you enable this setting with a suitable rotation time, the cluster rotates the files automatically.
+
+The Harvester documentation will be updated by [PR 573](https://github.com/harvester/docs/pull/573).
+
+:::note
+
+It is strongly recommended to configure this setting on your cluster.
+
+:::
+
+## Renew Harvester Cloud Credentials
+
+Refer to the knowledge base article [Renew Harvester Cloud Credentials](https://harvesterhci.io/kb/renew_harvester_cloud_credentials).