Proposal: Add content-encoding support to spec for posting blobs #235
Comments
Content encoding supports zstd; it's in the registry: https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding
The main issue with changing to content encoding is handling clients that can't understand the format, as decompressing and, e.g., recompressing with gzip is expensive on the server side, assuming that the server is storing blobs with zstd. Also, the client can't verify the blobs until they are uncompressed.
@justincormack Yeah, but I think supporting Content-Encoding is a nice "SHOULD" to have in the registry spec. I understand that decompressing / recompressing gzip is expensive. You can keep the content compressed ahead of time server side if you want. On the other hand, streaming decompression plus hashing isn't that much overhead relative to the CPU available to home users. CPU has gotten way cheaper, but even 10Mbps of throughput still isn't common for lots of folks.
I am a little worried about giving the registry more visibility into blobs. Right now, they are completely opaque to the registry until you push a descriptor that references them, so registries generally don't try to do anything cute with blobs, which gives flexibility to clients. My concern also applies to #101. I don't feel strongly enough about this for it to sway my opinion on either issue, but I wanted to bring it up as a discussion point. If we give registries more information, it becomes possible to act on that information, even if we have spec language that says something like:
@jonjohnsonjr I believe the content-encoding could still be opaque if the registry did not want to do "transcoding" -- it would just be limited to serving content with the same content encoding (or rejecting it and only allowing the identity encoding). On the other hand, from an efficiency perspective, uploading uncompressed data has some major benefits.
Why? This seems like it might limit the ability to cross-repo mount blobs.
I disagree with this point. There are OCI-specific media types defined, but that does not exclude other media types from being used as long as they conform with RFC 6838. I think a lot of folks (including myself) have made some incorrect assumptions about media types in such a way that the image spec is internally inconsistent, e.g. with the zstd stuff. The OCI image spec is not the place for specifying these things, though documenting them there is reasonable. There is a standard process for registering media type suffixes that was not followed for the zstd layer proposals. I poked enough people to correct that by getting +zstd added to the structured suffix registry. There's no reason that other encodings couldn't be used (aside from clients not understanding them) if they are similarly added via the IANA process.
How would this work for pushing and pulling? I'm curious about how this would affect the current blob upload flow, digest calculations, and client behavior around descriptors. Because a client is responsible for compression currently, it's trivial to produce an appropriate descriptor by counting and hashing the bytes as they're uploaded. How can a client do this if the registry has the freedom to change the encoding? Doesn't this largely break anything that depends on the properties of the CAS?
I think this is important, as it opens us up to zip bombs. If we have a trusted client, we can trust the descriptor they produced. Currently, that means we can download exactly the number of bytes we expect from the registry and hash them to ensure that the content matches what was uploaded before attempting to decompress anything. If we have to decompress the blobs to verify the contents, a malicious registry server could effectively DoS the client.
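As a concrete illustration of that "trust the descriptor" flow, here is a minimal, hypothetical Go sketch (the `Descriptor` type and all names are illustrative, not taken from any particular client): read at most the descriptor's size, verify the digest, and only then hand the bytes to a decompressor.

```go
// Hypothetical sketch: verify a blob against its descriptor before decompressing.
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
)

type Descriptor struct {
	MediaType string
	Digest    string // "sha256:<hex>"
	Size      int64
}

// verifyBlob reads at most desc.Size bytes from r and checks their sha256
// digest against desc.Digest before anything is decompressed.
func verifyBlob(r io.Reader, desc Descriptor) ([]byte, error) {
	h := sha256.New()
	// LimitReader caps how many bytes we will ever read from an untrusted registry.
	buf, err := io.ReadAll(io.TeeReader(io.LimitReader(r, desc.Size), h))
	if err != nil {
		return nil, err
	}
	if int64(len(buf)) != desc.Size {
		return nil, fmt.Errorf("short blob: got %d bytes, want %d", len(buf), desc.Size)
	}
	if got := "sha256:" + hex.EncodeToString(h.Sum(nil)); got != desc.Digest {
		return nil, fmt.Errorf("digest mismatch: got %s, want %s", got, desc.Digest)
	}
	return buf, nil // only now is it safe to decompress
}

func main() {
	blob := []byte("pretend this is a compressed layer")
	sum := sha256.Sum256(blob)
	desc := Descriptor{
		MediaType: "application/vnd.oci.image.layer.v1.tar+gzip",
		Digest:    "sha256:" + hex.EncodeToString(sum[:]),
		Size:      int64(len(blob)),
	}
	if _, err := verifyBlob(bytes.NewReader(blob), desc); err != nil {
		log.Fatal(err)
	}
	fmt.Println("blob verified before decompression")
}
```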
I think I know what you're getting at with this, and if I'm right, I agree with you. We'd just need to clarify what registries and clients SHOULD support. To achieve this, clients should upload the uncompressed contents, but use compression during transport, as negotiated via the `Accept-Encoding`/`Content-Encoding` headers. This would mean that the resulting descriptors that point to these blobs (e.g. in a manifest's `layers`) would reference the uncompressed content's digest and size.

As you described, this has a ton of benefits registry side, as they can store, dedupe, and manage content however is most efficient for them. Clients are still free to do what they are doing today, if this would be inefficient for them. This would end up burning some CPU in certain cases, but we can specify that registries and clients SHOULD support both gzip and zstd compression during transport for broadest compatibility. If a client or registry doesn't want to store the uncompressed content or re-compress it themselves, they can always tee the compressed content to disk while decompressing the blob to verify the uncompressed content's digest and size.
My concern here goes away (a bit) if the descriptor has a size for the uncompressed contents. We can at least know an upper limit to how much should be read, as you would expect compression to almost always result in smaller blobs than uncompressed content (arguably, if a registry is using an encoding that increases the size of the uncompressed content, that's probably a bug, because it should be smart enough to know that compressing the thing actually made it bigger).

This all seems fine and good to me -- even better than what we're doing today. I don't think we need any changes to the image-spec for this at all. My only concerns are around registry and client compatibility. I expect that if you started doing this today, registries wouldn't even think to compress things or try to negotiate the Content-Encoding, so we'd end up just shipping around uncompressed tarballs for a while until everything got fixed. I'd also expect some clients to just assume layers are always gzipped, in spite of the spec, so it would take a while to squash all the bugs around that.
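A minimal sketch of that last "tee" idea, assuming a gzip-encoded transport stream (the function and file names are made up; zstd would just swap in a different decoder): the compressed bytes are written to disk as-is while the uncompressed bytes are hashed and counted for comparison against the descriptor.

```go
// Hypothetical sketch: keep the compressed bytes while verifying the uncompressed content.
package main

import (
	"compress/gzip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// storeAndVerify writes the compressed stream to path while hashing the
// decompressed bytes, returning the uncompressed digest and size.
func storeAndVerify(compressed io.Reader, path string) (string, int64, error) {
	f, err := os.Create(path)
	if err != nil {
		return "", 0, err
	}
	defer f.Close()

	// Everything the gzip reader consumes is also written to disk.
	zr, err := gzip.NewReader(io.TeeReader(compressed, f))
	if err != nil {
		return "", 0, err
	}
	defer zr.Close()

	h := sha256.New()
	n, err := io.Copy(h, zr)
	if err != nil {
		return "", 0, err
	}
	return "sha256:" + hex.EncodeToString(h.Sum(nil)), n, nil
}

func main() {
	// Usage sketch: pretend os.Stdin is the compressed blob coming from the registry.
	digest, size, err := storeAndVerify(os.Stdin, "layer.tar.gz")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("uncompressed digest=%s size=%d\n", digest, size)
}
```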
@jonjohnsonjr You've got it!
I would rather not have to go through the process of adding each encoding that IANA adds. IMHO, it's better that we do it "once" -- and that "once" is relying on the HTTP spec to define the base content encodings. If we want to extend it (custom zstd dictionaries or xz), that's our prerogative. The way that this would roughly work is that people can push / pull blobs and the registry can store the unencoded format, or a trivially compressed format. Periodically (let's say once per day), "hot" blobs would be highly compressed with a purpose-built dictionary.
The digest would be computed over the contents of the unencoded file. As the client is pulling the file, it can compute the hash in a streaming manner. My reasoning for the client being responsible for choosing the encoding at pull / push time is that:
The blobs would be addressed by their unencoded values. Compression can happen in the background as described above. In regards to ZIP bombs, all of the popularly supported encodings (xz, br, zstd, deflate) have hardening against this, either added by browser / HTTP client vendors after years of hard work (deflate), or built into the encoding itself (zstd).
I'm not sure that I understand your point here, so apologies if I'm just repeating what you've said (again). I want to lean heavily on existing HTTP standards for this. Per RFC 7231, acceptable content-codings are defined by IANA in the HTTP Content Coding Registry. I am not interested in us maintaining a parallel set of content-codings, but would prefer that anything novel here goes through the standard RFC process. This might be a great place to discuss and experiment with such things, which eventually get turned into an RFC and then standardized through IANA, but defining them in an OCI spec is somewhat of a layering violation. I agree that the IANA process is pretty heavy, but for something as fundamental as this, I think it's appropriate.

In general, I'm heavily in favor of anything where we just rely on existing HTTP semantics and clarify that registries are expected to support them. You can get what we have today with the identity encoding. We don't even need to really specify much about which content-codings a registry or client should support, since that's already part of the content-negotiation process in the RFC, right?

I guess to summarize my opinion: yes, this is a good idea. We should be doing this, and nothing in the spec forbids it, but it might make sense to call this out as expected behavior. I am interested in seeing client and registry implementations that take advantage of this. I am also interested to know if any registries support this today, and how many clients would be broken by this.
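For instance, a pull under this model could lean entirely on standard HTTP negotiation. A hypothetical exchange (repository name, digest, and q-values are illustrative; the digest addresses the uncompressed bytes) might look like:

```http
GET /v2/library/example/blobs/sha256:2b8f... HTTP/1.1
Host: registry.example.com
Accept-Encoding: zstd, gzip;q=0.8, identity;q=0.5

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Encoding: zstd

<zstd-compressed payload; the client decodes it and verifies the uncompressed
 stream against sha256:2b8f... and the descriptor's size>
```

A registry that ignores Accept-Encoding entirely would simply respond with the identity-encoded (uncompressed) bytes, which still works, just less efficiently.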
@jonjohnsonjr Sorry, my comment was somewhat flippant. More so, support for content-encoding comes "for free" once IANA approves it and my HTTP libraries, load balancers, and backend storage get support. Although it's not meant to be a hop-by-hop header, proxies can do encoding on behalf of the workload.
Not at all, I've conflated `Content-Encoding` and `Transfer-Encoding` a bit here.
I missed this originally -- you want a parallel version of this that corresponds to `Accept-Encoding`.
I now understand your original point here. The
This is interesting to me -- I'm not sure why `Transfer-Encoding` hasn't picked up the newer compression standards. Do you have more context for this? I haven't read through all of the background, but I want to suggest we just start using Transfer-Encoding and, in parallel, see if we can get this "fixed" to include the new compression standards. I guess either of these would make sense, depending on the registry's implementation, and would be orthogonal to the `Content-Encoding` question. @felixhandte any background?
The content coding registry (as of this point in time -- https://www.iana.org/assignments/http-parameters/http-parameters.xhtml#content-coding):
I believe that in HTTP/2 and HTTP/1.1, Content-Encoding (CE) has superseded Transfer-Encoding (TE), as you can basically do everything with CEs that you can with TEs:
There's also the RE:
My understanding matches, I think, what is being discussed here. Outside of that, and without knowing much about the context of this topic, I'll offer generic advice. The fundamental question is sort of touched on above. Some pros and cons:
That matches my understanding and makes me feel better about conflating `Content-Encoding` and `Transfer-Encoding`.
Yep! This is exactly what's being proposed, so I feel like we are heading in the right direction.
This was also my main concern, but I think we are somewhat in the clear here because we know the exact size and hash of the expected output from decompression, based on the content descriptor (which is most of the context you're probably missing, though I don't expect you to dig into this at all 😄). Thanks again!
Given that decompression can happen at "wire speed" these days, is it really that much more overhead than doing the traditional validation itself? Maybe the above proposal could be split into two:
The idea to improve compression options is goodness. We need the means to evolve as new performant options become available and as network topologies evolve.
Context: There should be some generic libraries to support new compression formats that each artifact tooling could support. I just don't think we want to make registries "smart" to have to engage in this processing and figure it out for all artifact types. The registry supports blobs. How a client compresses those blobs is up to each artifact type's tooling. This simplicity and separation of concerns has allowed us to scale to all types of scenarios. I'm all for better compression and shared libraries to enable evolving forms of compression. Even asking a registry if a multi-arch tag has different versions. I'd just suggest this is an artifact-specific decision, and the image-spec can and should evolve to support any range of formats it believes are necessary.
This is just a natural part of HTTP. Registries speak HTTP, so it's reasonable to expect they could take advantage of this. It's also negotiated with the client, so a "dumb" registry that doesn't want to support any of this works just fine.
I don't think there were any issues captured above other than some optional sugar for clients. At least for me, I came to the conclusion that this is an excellent proposal that is pretty trivial to implement both registry side and client side.
Registries wouldn't need to do anything for clients that started trying to take advantage of this. The benefits really arise when both clients and registries start doing this, but I would expect most registries could support a no-op version of this today without any changes. The client would just be uploading uncompressed blobs in that case. To expand on this a bit, for media types ending in +gzip or +zstd, registries should keep doing what they're doing today. The blobs are opaque and they don't care about them. For media types that are not compressed, the registry and client can perform this content-encoding negotiation to determine the best format to use when uploading blobs. On the backend, the registry can do the same content-encoding negotiation when the client downloads blobs. Again, for registries that don't do any content-encoding negotiation, this would still work, but they'd lose out on possible compression savings.
The manifest would not change on the server. The content-encoding negotiation happens as part of the HTTP protocol. How clients interpret the media types within the descriptor does not change.
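To make that concrete, here's a hypothetical pair of layer descriptors (digests and sizes are made up). Today, compression is baked into the media type and the digest covers the compressed bytes:

```json
{
  "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
  "digest": "sha256:1f2e3d4c5b6a79887766554433221100ffeeddccbbaa99887766554433221100",
  "size": 31457280
}
```

Under this proposal, the descriptor would instead address the uncompressed content, and any compression would be negotiated per transfer via Content-Encoding:

```json
{
  "mediaType": "application/vnd.oci.image.layer.v1.tar",
  "digest": "sha256:00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff",
  "size": 104857600
}
```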
See my comment: #235 (comment). You could drop the compression suffix from the media type entirely. This unshackles us from gzip in a backward-compatible way, so we could actually start rolling out zstd support, for which we currently have no transition story.
I think you're really misunderstanding the proposal. This wouldn't require registries to do anything extra unless it would be beneficial to them. It gives registries the flexibility to trade off CPU (this can often be offloaded to hardware) to save storage.
You are not expected to support an infinite number of compression formats. The entire point of the negotiation is to find a subset that both client and registry support. Worst case, you can just store and return the uncompressed bytes directly. Best case, the client supports the compression format that is most convenient for you.
I think this is a really naive way to think about software. Not all software is written in the same language, so I don't understand what you're proposing here? These generic libraries should already exist for basically anything that's part of the IANA standards (which this proposal would lean on) because it's just compression. We don't need anything specific to OCI for registries to do compression.
Again, we discussed this at length. The registry can continue to be dumb, but this proposal uses a standard mechanism from HTTP to progressively enhance the experience of content transport. Right now, most manifests are being produced with the compression hard-coded as part of the media type. That's fine, and will always work, but compression should be an implementation detail of distribution, as it's not inherent to the nature of the content.
This seems like an incomplete thought and I don't understand how architecture would relate to compression.
This doesn't change that at all, it just allows for compression to be an implementation detail of transport instead of a quality of the artifact. If we can get broad support for this, we could give guidance that artifacts should generally reference content by its uncompressed hash and use content-encoding negotiation for efficiency at transport time, if needed. Of course, everything that currently works today would still work.
Sitting down to read this issue with a single cup of coffee was not enough 😑 But I am a huge fan of checksumming only the uncompressed blob, and allowing ourselves the flexibility of choosing whatever compression (or level thereof, or flavor of implementation). When we discussed this ages ago, the concerns were the 3-5x increase in storage for registry maintainers (who honestly could compress, chunk, or deflate in many ways) and breaking existing/old clients. With that in mind, it is worth a quick PoC to show this new media type served up and the behavior expected for clients that may never update, along with a recommendation/methodology for registry maintainers to effectively serve up existing known tarballs whose checksum is bound to the golang gzip to older clients, while allowing new clients to fetch with, say, zstd.
😆 Yeah, this would definitely benefit from some more concrete examples.
We have two clients (push vs pull) and a registry as variables here. For pushing an image, I think we can have a graceful fallback: if the client pushing doesn't support it, then we have the situation we have today with gzip. The risk of uncompressed transfers is if the client pushing and the registry both support this, but the client pulling does not. I'm thinking of three options:
I can see implementations starting with option 1 and then switching to option 2 after enough time has passed and clients pushing are switched to use the new method for all pushes.
Stumbled across this in an unrelated thread: golang/go#30829 (comment) @cben had some thoughts on how this would interact with range requests. I'm not sure I understand the problem, but I think it's relevant :)
I think that this post is probably best split into two sections, which:
But I'm waiting for my other distribution spec proposal to go through before proposing anything scary.
If a manifest refers to uncompressed blobs, and the client downloads with Content-Encoding negotiation to select the best supported compression, it can still rely on the manifest's layer size to constrain the decompression process so that it can't exceed the expected output size, so a gzip bomb would be easily blocked.
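A minimal sketch of that bound, assuming a gzip-encoded stream and a size taken from the layer descriptor (the helper name and the 100 MiB figure in the usage are made up):

```go
// Hypothetical sketch: cap decompressed output at the descriptor's size to block bombs.
package main

import (
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

// boundedDecompress copies at most expectedSize decompressed bytes to dst.
// If the stream produces even one extra byte, it is rejected.
func boundedDecompress(dst io.Writer, compressed io.Reader, expectedSize int64) error {
	zr, err := gzip.NewReader(compressed)
	if err != nil {
		return err
	}
	defer zr.Close()

	// Read one byte past the expected size so overruns are detectable.
	n, err := io.Copy(dst, io.LimitReader(zr, expectedSize+1))
	if err != nil {
		return err
	}
	if n > expectedSize {
		return fmt.Errorf("decompressed output exceeds expected size %d: possible decompression bomb", expectedSize)
	}
	return nil
}

func main() {
	// Usage sketch: decompress stdin to stdout, trusting a made-up expected size of 100 MiB.
	if err := boundedDecompress(os.Stdout, os.Stdin, 100<<20); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```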
How do we envision this change rolling out through the ecosystem? I'm assuming the fallback is storing and/or transmitting uncompressed blobs. Will this only be used for new use cases (e.g. artifacts), or do we plan to initially roll this out for container images? Will build systems avoid creating images with uncompressed layers by default until after deployed runtimes and registries have been upgraded to support compression on transport?
I haven't found any path, including hack-ish ones, to avoid waiting for runtimes and registries to be upgraded :'(
As we want to future-proof the protocol, I suggest we add Content-Encoding support. This would make it possible for content to be pushed with individual chunks compressed in whatever format best fits the use case, as opposed to whatever the media type dictates. It enables us to move to better formats in the future. I specifically think that this should be supported on the upload step.
Flow
In the step where the user creates a new upload session (`POST /v2/<name>/blobs/uploads/`), the server responds with a `Distribution-Accept-Encoding` header. Its value uses the standard mechanism of listing the accepted encodings and supports the quality values described in the HTTP spec (https://tools.ietf.org/html/rfc7231#section-5.3.1). For example:
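A sketch of what the exchange might look like (the header value, session id, and digest are illustrative; under this proposal the digest is computed over the uncompressed content):

```http
POST /v2/library/example/blobs/uploads/ HTTP/1.1
Host: registry.example.com

HTTP/1.1 202 Accepted
Location: /v2/library/example/blobs/uploads/4a5b6c7d
Distribution-Accept-Encoding: zstd;q=1.0, gzip;q=0.8, identity;q=0.1

PUT /v2/library/example/blobs/uploads/4a5b6c7d?digest=sha256:00112233445566778899aabbccddeeff00112233445566778899aabbccddeeff HTTP/1.1
Host: registry.example.com
Content-Type: application/octet-stream
Content-Encoding: zstd

<zstd-compressed bytes of the uncompressed blob addressed by the digest above>
```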
The other aspect that I think we should require is that the user use the same content-encoding for all blobs in a multi-blob push.
FAQ
Question: Why not return a body like:
Answer: This endpoint does not return a payload today, and it's unknown whether returning a payload would break any clients. IMHO, we should move the location / session info into a JSON body, but that's a bigger question
Question: Why not use transfer encoding?
Answer: It doesn't seem to have support for the new compression standards.
Question: Why don't we just add media types with all possible encoding standards?
Answer: Those are largely dictated by the image spec. The current behaviour of coupling the distribution and image specs tightly makes it difficult to adopt new things in distribution. Since distribution came out, zstd, br, and others have become much better than gzip in many ways. Lifting and shifting today would require a massive change or on-the-fly conversion, which isn't always possible, especially with cryptographically verified payloads.
Question: Why wouldn't we just zstd all the things?
Answer: It's not always the smallest:
In the age of slow, work-from-home internet, size matters a lot to people. Scaling up endpoint compute is much easier in many cases.