-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VC-37264: Disable compression to give us time to work on a fix #628
Conversation
57af677
to
b2b753b
Compare
pkg/agent/config.go
Outdated
if !flags.DisableCompression && res.AuthMode != VenafiCloudKeypair && res.AuthMode != VenafiCloudVenafiConnection { | ||
errs = multierror.Append(errs, fmt.Errorf("--disable-compression=false can only be used with the %s and %s modes", VenafiCloudKeypair, VenafiCloudVenafiConnection)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I have no way of knowing if the flag has been passed or not, I propose that we only show this warning when --disable-compression=false
is passed. It feels a bit off, and another option would have been to remove this validation, but I didn't want to change the code further than I already had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've replaced this with a log line that will show whenever the user uses --disable-compression
.
b2b753b
to
9368c19
Compare
We decided to entirely remove the logic behind --disable-compression because the validation logic was looking super weird; we don't think folks should be using --disable-compression=false anyways. If we want to re-introduce a similar flag later on, we can create a new, for example --no-compression.
9368c19
to
20d9b6f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I test it by modifying the E2E test in #629 and rebasing it on top of your branch.
It takes 3-4 minutes for the certificate to appear in the API.
$ make test-e2e-gke
...
{
"ts": 1732294473390.6462,
"caller": "agent/run.go:396",
"msg": "Posting data",
"v": 1,
"logger": "Run.gatherAndOutputData.postData",
"baseURL": ""
}
{
"ts": 1732294473397.594,
"caller": "cache/reflector.go:368",
"msg": "Caches populated for *v1alpha1.VenafiConnection from k8s.io/client-go@v0.31.1/tools/cache/reflector.go:243",
"v": 2
}
{"ts":1732294473950.9778,"caller":"agent/run.go:409","msg":"Data sent successfully","v":0,"logger":"Run.gatherAndOutputData.postData"}
...+.........+........+...+.......+...+.....+.......+..+.+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*....+..+...+............+....+.........+......+..+...+....+.....+.......+.....+....+..+...+...............+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*.........+.+.....+.......+...........+.+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.......+..........+..+...+............+.......+.......................+.+...........+.+...+..+...+.+...............+..............+.......+.....+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*....+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++*....+..........+........+...+....+......+.....+....+..+...............+...+..........+..+..........+.....+......+.............+..+....+........+.+..+...+..........+.........+......+......+......+..+........................................+...+...+......+.....+...+.+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
-----
secret/venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c created
{"count":0,"certificates":[]}
{"count":0,"certificates":[]}
{"count":0,"certificates":[]}
{"count":0,"certificates":[]}
{"count":0,"certificates":[]}
{"count":0,"certificates":[]}
{
"count": 1,
"certificates": [
{
"id": "dfc5f820-a8f2-11ef-a949-796f7926985d",
"companyId": "9a0cab61-2b00-11ee-ba09-0733b0fe5adc",
"managedCertificateId": "e08354b0-a8f2-11ef-82d5-bb97de46047d",
"fingerprint": "C8F617E52743F3B1BB762BEE8A43C8B9C3A68364",
"certificateName": "venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c",
"issuerCertificateIds": [],
"certificateStatus": "ACTIVE",
"modificationDate": "2024-11-22T16:57:37.390+00:00",
"validityStart": "2024-11-22T16:54:38.000+00:00",
"validityEnd": "2025-11-22T16:54:38.000+00:00",
"selfSigned": true,
"signatureAlgorithm": "SHA256_WITH_RSA_ENCRYPTION",
"signatureHashAlgorithm": "SHA256",
"encryptionType": "RSA",
"keyStrength": 2048,
"subjectKeyIdentifierHash": "76A333DE3A2A66700F7AEAB0B55A1422905D567C",
"authorityKeyIdentifierHash": "76A333DE3A2A66700F7AEAB0B55A1422905D567C",
"serialNumber": "0AEEAB504DD6E7576539BFE5C3BEFF453EC65CD4",
"subjectDN": "cn=venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c",
"subjectCN": [
"venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c"
],
"subjectAlternativeNamesByType": {
"otherName": [],
"rfc822Name": [],
"dNSName": [],
"x400Address": [],
"directoryName": [],
"ediPartyName": [],
"uniformResourceIdentifier": [],
"iPAddress": [],
"registeredID": []
},
"issuerDN": "cn=venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c",
"issuerCN": [
"venafi-kubernetes-agent-e2e.d56671da-dbb2-4411-be66-d846d145f79c"
],
"ocspNoCheck": false,
"versionType": "CURRENT",
"totalInstanceCount": 1,
"totalActiveInstanceCount": 0,
"instances": [],
"ownership": {}
}
]
}
It looks like you've basically reverted the original PR while keeping the new CLI flag (for backwards compatibility).
In that case, you could do a sanity check by comparing the diff of the original PR with this diff and make sure that (almost) everything has indeed been returned to its original state.
I have only tested the venaficonnection mode and I have only tested wit the EU Venafi Cloud tenant.
@@ -302,7 +302,7 @@ func InitAgentCmdFlags(c *cobra.Command, cfg *AgentCmdFlags) { | |||
&cfg.DisableCompression, | |||
"disable-compression", | |||
false, | |||
"Disables GZIP compression when uploading the data. Useful for debugging purposes or when an intermediate proxy doesn't like compressed data.", | |||
"Deprecated. No longer has an effect.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional. Consider using the pflag.MarkDeprecated
method instead, which will hide the flag from the --help and therefore from the documented CLI help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, that would be even better.
I've tried this:
c.PersistentFlags().MarkDeprecated("disable-compression", "no longer has an effect")
But it uses logrus' logger, so it doesn't show as JSON (first line of the below snippet):
Flag --disable-compression has been deprecated, no longer has an effect
I1122 18:09:24.792410 25556 run.go:59] "Starting" logger="Run" version="development" commit=""
I1122 18:09:24.793470 25556 config.go:404] "Using the Venafi Cloud Key Pair Service Account auth mode since --client-id and --private-key-path were specified." logger="Run"
I1122 18:09:24.793480 25556 config.go:540] "Using period from config" logger="Run" period="5m0s"
I1122 18:09:24.793486 25556 config.go:592] "The flag --disable-compression has been deprecated an no longer has any effect." logger="Run"
Is that OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use MarkHidden instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to use MarkDeprecated
as you initially suggested. It seemed more appropriate and I was able to remove all traces of the DisableCompression
variable, so even better.
I don't think this "plain text" line is a big deal; no one will ever see it anyways, and I can fix it later if I can find a way to set up pflag's logger.
pkg/agent/config.go
Outdated
} | ||
res.DisableCompression = flags.DisableCompression | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ go run ./ agent --one-shot --api-token should-not-be-required --install-namespace venafi --output-path /dev/null --agent-config-file examples/cert-manager-agent.yaml --disable-compression
I1122 17:06:03.444467 284739 run.go:59] "Starting" logger="Run" version="development" commit=""
I1122 17:06:03.444747 284739 config.go:403] "Using the Jetstack Secure API Token auth mode since --api-token was specified." logger="Run"
I1122 17:06:03.444766 284739 config.go:421] "Using deprecated Endpoint configuration. User Server instead." logger="Run"
I1122 17:06:03.444773 284739 config.go:591] "The flag --disable-compression has been deprecated an no longer has any effect." logger="Run"
I1122 17:06:03.444805 284739 run.go:117] "Healthz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/healthz"
I1122 17:06:03.444816 284739 run.go:121] "Readyz endpoints enabled" logger="Run.APIServer" addr=":8081" path="/readyz"
E1122 17:06:03.445471 284739 run.go:269] "Error messages will not show in the pod's events because the POD_NAME environment variable is empty" logger="Run"
I1122 17:06:03.753982 284739 run.go:322] "Data saved to local file" logger="Run.gatherAndOutputData" outputPath="/dev/null"
One small downside of using MarkDeprecated is that pflag doesn't use our logr.Logger, so it is printed as pain text rather than the format specified in --logging-format.
I'll sanity check tha PR against the original PR on Monday morning before proceeding with the release. |
Ref: VC-37264
Why?
On Nov 21, 2024, the Production Engineering team found that the test pipeline was failing due to venctl 1.15.3. They rolled back to venctl 1.15.2, which made the pipeline work again.
After investigating, we found that the Agent's compression feature broke the Agent. The Agent sends its upload request using
Content-Encoding: gzip
, but the server doesn't process this header and doesn't decompress the body. It copies the contents of the request's body to a bucket without checking its validity. When the backend asynchronously tries to process the contents of the upload, it fails with the error:Solution
After some consideration, we decided to remove entirely the compression logic that was added in #594. The
--disable-compression
flag will no longer do anything, and a log line will be printed when using the flag:Previously, we wanted to simply switch the default value of the
--disable-compression
flag fromtrue
tofalse
; we found the new logic clunky due to the fact that users can set--disable-compression=false
, so we voted against it. We had also considered adding the flag--disable-compression
to the Helm chart but realized that it would impact users that don't use the Venafi Connection mode nor the Key Pair Service Account mode.Manual test
For this test, I've used the tenant https://ven-tlspk.venafi.cloud/. To access the API key, use the user
system.admin@tlspk.qa.venafi.io
and the password is visible in the page Production Accounts (private to Venafi). Then go to the settings and find the API key, and set it in the env varAPIKEY
.The tenant https://ven-tlspk.venafi.cloud/ doesn't have the right tier to pull images, so I use an API key from the tenant https://glow-in-the-dark.venafi.cloud/, that's why I set the env var
APIKEY_GLOW_IN_THE_DARK
. Ask Atanas to get access to the tenant https://glow-in-the-dark.venafi.cloud/.Then:
Then, create three files:
get.sh
,create.sh
, andrm.sh
:Now, run the end-to-end test:
Once it completes, run:
Then, check that the certificate is present in Venafi Control Plane:
You should get something like this:
What was happening before?
Before this change, with v1.3.0, Venafi Kubernetes Agent used to not report secrets at all. It was successfully pushing, but the backend silently failed to process it.