dist-spec 1.0.0-rc OPTIONAL marked items #194

Open
rchincha opened this issue Oct 9, 2020 · 17 comments

@rchincha
Contributor

rchincha commented Oct 9, 2020

The Requirements section of the spec states the following:

Requirements

Registries conforming to this specification MUST handle all APIs required by the following workflow categories:

Pull - Clients are able to pull from the registry
Push (OPTIONAL) - Clients are able to push to the registry
Content Discovery (OPTIONAL) - Clients are able to list or otherwise query the content stored in the registry
Content Management (OPTIONAL) - Clients are able to control the full life-cycle of the content stored in the registry

I would argue that without the OPTIONAL items, we don't really have a registry.

@jdolitsky
Member

I think this came out of some discussion. I see what you're saying. Would you feel differently if "Push" were not considered optional?

@rchincha
Contributor Author

rchincha commented Oct 9, 2020

Let's flip the question around from a client pov. Doesn't it make sense for a client to assume that the registry it is talking to is Pull-conformant and that the other areas are not? That is, registry-agnostic for pulls but registry-aware for everything else?

@jdolitsky
Member

I suppose it depends on the client. If I'm running a container in a Kubernetes environment, I only really need to be assured that the images can be downloaded in a common way. How they got there is not important (in that scenario)
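
To illustrate how small the pull-only surface is, here is a minimal sketch of the requests such a client would issue. It assumes the `/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>` paths from the distribution spec; the function names, the registry host, the repository name, and the digest values are hypothetical, and a real client would read the blob digests out of the manifest rather than take them as arguments.

```python
def manifest_url(registry: str, name: str, reference: str) -> str:
    """URL for fetching a manifest by tag or by digest."""
    return f"https://{registry}/v2/{name}/manifests/{reference}"


def blob_url(registry: str, name: str, digest: str) -> str:
    """URL for fetching a single blob (config or layer) by digest."""
    return f"https://{registry}/v2/{name}/blobs/{digest}"


def pull_plan(registry, name, reference, blob_digests):
    """List the GET requests a pull-only client issues, in order.

    blob_digests is passed in for illustration; in practice it comes
    from the manifest returned by the first request.
    """
    plan = [("GET", manifest_url(registry, name, reference))]
    plan += [("GET", blob_url(registry, name, d)) for d in blob_digests]
    return plan


if __name__ == "__main__":
    # Hypothetical host, repository, and digest.
    for method, url in pull_plan(
        "registry.example.com", "library/app", "latest", ["sha256:aaa"]
    ):
        print(method, url)
```

Every step is a plain GET, which is why how the content got into the registry does not matter to this kind of client.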

@jdolitsky jdolitsky added this to the v1.0.0-rc2 milestone Oct 14, 2020
@jdolitsky
Member

There does not seem to be consensus on this, so I'm going to pull it out of the rc2 milestone. If anybody wants, they can open a PR to modify this language a bit. cc @mikebrow @SteveLasker

@jdolitsky jdolitsky removed this from the v1.0.0-rc2 milestone Nov 2, 2020
@vbatts
Member

vbatts commented Jan 28, 2021

I get your point @rchincha, though this may just be a matter of clarifying the sentence about what qualifies as a registry.
The fact is that 99% of registry usage is pull only.
This gets into the definitions of use-cases that we considered outlining.

@SteveLasker
Contributor

Can I ask why we're concerned here?
While it's true that a plane spends most of its time flying, it's pretty important it can land and take off.

While it's true a Kubernetes cluster only needs to pull, the content must get into the registry somehow.
If a registry doesn't want to provide standard oci-distribution spec compliance on push, that's their choice. But, nobody is saying a registry must be compliant. It just doesn't get the certification if it's not.

For instance, ACR is a registry for our users. We fully support push, pull, discover, manage. All oci-distribution spec-compliant CLI experiences should work, and we treat any failure as a bug we would resolve.

On the other hand, for the purposes of security, MCR does not support push. We have a back-end loading process where teams request their content to be moved from their team ACR into MCR.
We have no qualms if MCR shows up in a compliance certification as failing push and manage. But, it should pass on pull and discover.

I thought the main point of breaking out the distinct operations was to provide a categorical breakdown of what features are supported/not supported in a registry. As a comparison, a ski mountain isn't simply red or green. If any trails are open, the mountain is green. The details are which trails are open.

I'd just ask which registry operators/products feel they need a fully green Registry Compliance certification if they don't support Push, Pull, Discover, and Manage?

If we can't articulate a good use-case, then what value does it have to say a registry is OCI-Compliant if it can only do one distinct thing?

@jonjohnsonjr
Contributor

> If we can't articulate a good use-case, then what value does it have to say a registry is OCI-Compliant if it can only do one distinct thing?

I guess mostly because this GitHub repository is called distribution-spec. It is certainly possible to distribute images via the distribution protocol without building or pushing them in the "normal" way (we do this all the time).

I agree that it's useful to define push and discovery things, which is why I didn't fight too hard to remove the tag listing API (even though I don't think it belongs), but I'd take issue with someone claiming that a registry is non-compliant just because they can't push to it.

@SteveLasker
Contributor

I guess this really goes back to the requirements conversation. What do we consider the purpose of the distribution-spec to be? How can you distribute something if you can't first load it? Implementations can have lots of extra hooks, we all do. But why can't we agree a functioning implementation of the distribution spec, at a minimum, supports push & pull?
What implementations are we worried about not conforming? I've tossed out MCR as an example that won't support push, but that's ok. We're not looking for it to get a badge of some sort.

If we're really going to define it that narrowly, is the distribution-spec really that interesting?

@jonjohnsonjr
Contributor

jonjohnsonjr commented Feb 2, 2021

> How can you distribute something if you can't first load it?

What does load mean?

> But why can't we agree a functioning implementation of the distribution spec, at a minimum, supports push & pull?

Because distributing artifacts doesn't require the ability to push them.

> What implementations are we worried about not conforming? I've tossed out MCR as an example that won't support push, but that's ok. We're not looking for it to get a badge of some sort.

For example, nixery.dev. It's impossible for anyone to push to this, not just by policy, but because the images are just projections of nix derivations into a container image. I think it is valuable to be able to claim that a registry like this implements the distribution spec even if it's impossible to push to it.

> If we're really going to define it that narrowly, is the distribution-spec really that interesting?

Absolutely. In fact, I think a lot of the value of the distribution spec is in removing things from the original registry spec, not in adding new things to it. The reason OCI even exists is so that we could run containers in a standard way, so a lot of this work is focused around that. Being able to distribute container images in a standard way is necessary to complete the picture here. Managing artifacts in a registry is related, and useful, and something I believe is worth standardizing (as per #22 and #222), but I don't think they're required for a lot of use cases, especially the use cases for which OCI exists.

@rchincha
Contributor Author

rchincha commented Feb 3, 2021

If the "distribution spec" is intended to:

  1. be an on-the-wire (since remote access) protocol spec for distribution of container images
  2. be a precise spec (ambiguity has a history of leading to bad things and security holes)
  3. achieve consensus

then, doesn't it make sense to drop all of OPTIONAL language from the spec in its current form?

(OR)

On the other hand, for a microservices environment where building and pushing container images is part of the normal CI/CD pipeline, wouldn't it be more interesting for the "distribution spec" to:

  1. be an on-the-wire protocol spec for the entire container image lifecycle (push, pull and the whole shebang)
  2. be a precise spec which means all of the OPTIONAL language be made MANDATORY
  3. be "nice" to various registries so that they can claim compliance to parts of the spec likely starting with "pull" and we leave room/time to build up full compliance
  4. achieve consensus

Are we really saying that in a CI/CD pipeline, I will need N different "push" clients to N different registries?? In that case I am pretty sure that those N clients will also allow "pull" in their own way, so why bother with the spec at all!

@jonjohnsonjr
Contributor

What I actually care about is that there is some distinction for a registry that implements pull, even if it doesn't implement anything else. Especially for things like distributing base images, we want read-only registries that we can claim are compliant with the spec, IMO.

Given that we aren't hung up on trademark issues the way Kubernetes is (maybe we are, but I'm not aware), there isn't really an interesting legal distinction between being compliant with the spec or not, so does it really matter? Maybe someone has plans for this kind of thing?

I'm somewhat worried that if we make everything mandatory, people will use this spec as a weird political tool. "You don't support this thing we just merged into distribution-spec so your registry is non-compliant" is something I'd like to avoid happening to me, personally.

For example, implementing single-request monolithic blob uploads is really annoying, but it's part of the spec. Does that mean my registry is non-compliant even though I only know of one client on earth that uses that API? Hopefully not.

If we did make everything mandatory, the two largest registries will suddenly be non-compliant, and the spec becomes basically just wishful thinking.

> be "nice" to various registries so that they can claim compliance to parts of the spec likely starting with "pull" and we leave room/time to build up full compliance

I like this, and I feel like the conformance tests serve this purpose, so that feels sufficient to me.

> Are we really saying that in a CI/CD pipeline, I will need N different "push" clients to N different registries?? In that case I am pretty sure that those N clients will also allow "pull" in their own way, so why bother with the spec at all!

I would guess that most registries that implement the "pull" part will also implement the "push" part. We're not coming up with a new thing that we hope people will implement (though that will come), this is mostly just documenting the state of the world. Your concern here confuses me a bit, given that we have existence proofs to the contrary (see: existing registry implementations).

On the call today, we talked about having a KEP-like process for additions to the API, which is very similar to what @vbatts proposed in #74. I think that lost some steam, but if we're going to partition the spec into mandatory and optional bits, having some formal thing like this might be more interesting than a binary conformance stamp. The conformance tests are already partitioned like this, so maybe we should reify those groupings into the API somehow.

@jdolitsky
Member

> maybe we should reify those groupings into the API somehow

The new spec is intended to be split up in this way as well

@SteveLasker
Contributor

> I'm somewhat worried that if we make everything mandatory

The conformance tests break this down into 4 categories today (push, pull, content discovery, content management)
https://github.com/opencontainers/oci-conformance/tree/master/distribution-spec
The question we're having is: what is the minimum bar?

We're just suggesting Push & Pull is that minimum bar.
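
As a rough sketch of what is being debated, the four categories can be treated as a set and the "minimum bar" as a subset test. The category names follow the conformance breakdown above; `category_report`, `meets_bar`, the two bar constants, and the example registry are hypothetical names for illustration, not part of the conformance suite.

```python
CATEGORIES = ("pull", "push", "content_discovery", "content_management")

# The two candidate minimum bars discussed in this thread.
PULL_ONLY_BAR = frozenset({"pull"})
PUSH_PULL_BAR = frozenset({"pull", "push"})


def category_report(supported):
    """Per-category breakdown: which workflow categories a registry passes."""
    unknown = set(supported) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return {category: category in supported for category in CATEGORIES}


def meets_bar(supported, minimum_bar):
    """True when every category in the minimum bar is supported."""
    return set(minimum_bar) <= set(supported)


if __name__ == "__main__":
    # Hypothetical read-only registry that passes pull and discovery only.
    read_only = {"pull", "content_discovery"}
    print(category_report(read_only))
    print("pull-only bar:", meets_bar(read_only, PULL_ONLY_BAR))
    print("push+pull bar:", meets_bar(read_only, PUSH_PULL_BAR))
```

The per-category report is the same under either position; only the choice of `minimum_bar` changes whether a read-only registry counts as meeting it.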

@jonjohnsonjr
Contributor

I think pull is the minimum bar.

@SteveLasker
Contributor

Understood. This is where we can get some consensus and a vote on why we need to declare that an implementation that only supports pull is a viable starting point. They're still implementations, just below the minimum.

@rchincha
Contributor Author

rchincha commented Feb 4, 2021

Firstly, my apologies if I am nitpicking here.

The legal and political problems are very real, but I will ignore those for the following comments.

> For example, implementing single-request monolithic blob uploads is really annoying, but it's part of the spec. Does that mean my registry is non-compliant even though I only know of one client on earth that uses that API? Hopefully not.
>
> If we did make everything mandatory, the two largest registries will suddenly be non-compliant, and the spec becomes basically just wishful thinking.

Full disclaimer: I was not involved with the design of either the dist-spec or the original docker distribution spec.

There are 3 types of uploads: 1) monolithic single push, 2) streaming, and 3) chunked.
If I were to guess, 2) is probably the default: push blobs of reasonable size and let TCP do its thing. But if either my registry or my client is running on a memory-constrained device like a Raspberry Pi, or over a bad dial-up network, then I would probably choose 3). And if I am on an extremely fast local network or pushing tiny chunks, then 1-RTT matters, so I would pick 1). The dist-spec in its current form allows for all three scenarios, but I agree that this is a bit much and we should probably pick what most registries and clients use today.
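
To make the three modes concrete, here is a rough sketch of the request sequence for each. It assumes the upload endpoints as I understand them from the distribution spec (`POST /v2/<name>/blobs/uploads/`, then `PATCH`/`PUT` against the returned upload location), and it assumes that "streaming" above means a POST followed by one PUT carrying the whole blob. `upload_plan`, the mode names, and the `<location>` placeholder are made up for illustration; nothing here talks to a real registry.

```python
def upload_plan(name, digest, size, mode, chunk_size=None):
    """Return the (method, path, headers) steps for uploading one blob.

    "<location>" stands for the upload URL returned by the registry in the
    Location header of the previous response. A real client must follow the
    latest Location and merge "digest" into any query string it already has.
    """
    start = f"/v2/{name}/blobs/uploads/"
    if mode == "monolithic":
        # 1) One round trip: the digest and the whole blob go in the POST.
        return [("POST", f"{start}?digest={digest}", {"Content-Length": str(size)})]

    open_session = ("POST", start, {"Content-Length": "0"})
    if mode == "streaming":
        # 2) Assumed meaning: open a session, then send the blob in one PUT.
        return [
            open_session,
            ("PUT", f"<location>?digest={digest}", {"Content-Length": str(size)}),
        ]
    if mode == "chunked":
        # 3) Open a session, PATCH each byte range, then close with a PUT.
        if not chunk_size or chunk_size <= 0:
            raise ValueError("chunked mode needs a positive chunk_size")
        steps = [open_session]
        for offset in range(0, size, chunk_size):
            end = min(offset + chunk_size, size) - 1  # inclusive end offset
            steps.append(("PATCH", "<location>", {"Content-Range": f"{offset}-{end}"}))
        steps.append(("PUT", f"<location>?digest={digest}", {}))
        return steps
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # Hypothetical repository name and digest.
    for mode in ("monolithic", "streaming", "chunked"):
        print(mode, upload_plan("app", "sha256:abc", 10, mode, chunk_size=4))
```

The step counts show the trade-off described above: one request for monolithic, two for streaming, and a number that grows with blob size for chunked.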

> Are we really saying that in a CI/CD pipeline, I will need N different "push" clients to N different registries?? In that case I am pretty sure that those N clients will also allow "pull" in their own way, so why bother with the spec at all!
>
> I would guess that most registries that implement the "pull" part will also implement the "push" part. We're not coming up with a new thing that we hope people will implement (though that will come), this is mostly just documenting the state of the world. Your concern here confuses me a bit, given that we have existence proofs to the contrary (see: existing registry implementations).

Most registries implement the original docker distribution or some small variation thereof. However, details matter, and conformance tests show that they are not quite the same [1]. For a client (pull and push) that I want to run in my CI/CD pipeline, why can't I get a registry with some standard front-facing interface without worrying about what its underlying implementation is?

Unfortunately, this being a two-legged problem, we have to break the tie somewhere - clients implementing a registry standard, or a registry implementing a client standard.

References:
[1] https://github.com/opencontainers/oci-conformance/tree/master/distribution-spec

@sudo-bmitch
Contributor

I feel like the conformance tests separating these, allowing implementations to indicate if they conform to the full spec or only a subset, is a good middle ground between the various requirements. While it would be good for implementations to provide the full API, a significant portion of requests are only to pull data and there are registries that support loading data outside of the OCI APIs. If it wasn't for the headers, it would have been possible to create a pull only registry with a static site generator.
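
As a sketch of the static-site idea, the pull paths could in principle be laid out as plain files keyed by tag and digest. This assumes the `/v2/<name>/manifests/<reference>` and `/v2/<name>/blobs/<digest>` paths from the spec and sha256 digests; `static_layout` and the example inputs are hypothetical, and which headers get in the way is my guess, since the comment above does not name them.

```python
import hashlib

# Assumption: response headers of this kind are what a plain static host
# cannot easily set per file; the comment above does not list them.
MANIFEST_HEADERS = ("Content-Type", "Docker-Content-Digest")


def sha256_digest(data: bytes) -> str:
    """Content digest in the "sha256:<hex>" form used for blobs and manifests."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def static_layout(name: str, tag: str, manifest: bytes, blobs):
    """Map URL paths to file contents for one tagged image.

    The manifest is written twice so it can be fetched by tag or by digest.
    """
    files = {
        f"/v2/{name}/manifests/{tag}": manifest,
        f"/v2/{name}/manifests/{sha256_digest(manifest)}": manifest,
    }
    for blob in blobs:
        files[f"/v2/{name}/blobs/{sha256_digest(blob)}"] = blob
    return files


if __name__ == "__main__":
    # Hypothetical manifest and layer bytes.
    layout = static_layout("app", "latest", b'{"schemaVersion": 2}', [b"layer"])
    for path in sorted(layout):
        print(path, len(layout[path]), "bytes")
```

The file layout alone covers the URLs; it does nothing about the response headers, which is the gap pointed out above.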

Given the age of this issue and the release of the spec, is this issue something we should close now @rchincha?
