[docs]: retention policies could use more explanation #2867

Open
mikemccracken opened this issue Jan 8, 2025 · 3 comments
Labels
feature New feature or request

Comments

@mikemccracken
Contributor

Is your feature request related to a problem? Please describe.

It is not very clear from the documentation how the retention policy rules work with each other. there are some more details that help in the PR discussion for #1866 and it would be nice to have some of that detail in the docs.

For example, as far as I can tell, the mostRecentlyPushedCount/mostRecentlyPulledCount rules apply separately to each repository that matches the "repositories" path. So if you combine "**" with mostRecentlyPulledCount: 2, you keep 2 tags from every repo, but this isn't clear from the description.

also it is not clear what the difference in behavior (if any) is between these two examples:

    "keepTags": [{                               
        "mostRecentlyPulledCount": 10
      },
     {                               
        "mostRecentlyPushedCount": 10
      }]

and

    "keepTags": [{                               
        "mostRecentlyPulledCount": 10,
        "mostRecentlyPulledCount": 10 
     }]

the example at https://zotregistry.dev/v2.1.0/articles/retention/?h=rete#complete-configuration-file-example has policies where there is only one dict in the keepTags array and where there are two, and it doesn't seem to explain what the difference is.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@mikemccracken mikemccracken added the feature New feature or request label Jan 8, 2025
@andaaron
Contributor

andaaron commented Jan 9, 2025

I'm not 100% sure, below is my understanding, tagging @eusebiu-constantin-petu-dbk, who worked on this, for confirmation.

It is not very clear from the documentation how the retention policy rules work with each other. there are some more details that help in the PR discussion for #1866 and it would be nice to have some of that detail in the docs.

The retention rules under the same policy/keeptags list act as a logical OR.

For example, as far as I can tell, the mostRecentlyPushedCount/mostRecentlyPulledCount rules apply separately to each repository that matches the "repositories" path. So if you combine "**" with mostRecentlyPulledCount: 2, you keep 2 tags from every repo, but this isn't clear from the description.

That is correct. Each repo is processed individually.
All image retention and garbage collection processing is done per repo, not per group of repos.
I think it got overlooked in the documentation because we didn't think retaining a specified count of tags across multiple repos makes sense in a real-world scenario.
Different repos represent different content, images with different purposes. Why would someone let zot choose to retain tags from one repo to the detriment of another?
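To illustrate the per-repo behaviour described above, here is a toy Python model. It is not zot code: the repository names, tags, and integer timestamps are made up, and `keep_most_recently_pulled` is a hypothetical helper that only mimics the "count applies to each matching repo separately" rule.

```python
def keep_most_recently_pulled(repos, count):
    """Return the `count` most recently pulled tags of each repo.

    `repos` maps a repo name to {tag: last_pulled_timestamp}.
    The count is applied to every repo on its own, never across repos.
    """
    return {
        repo: sorted(tags, key=tags.get, reverse=True)[:count]
        for repo, tags in repos.items()
    }


# Hypothetical repos that would all match a "**" repositories pattern.
repos = {
    "infra/nginx": {"1.25": 30, "1.26": 10, "1.27": 20},
    "apps/api": {"v1": 5, "v2": 50, "v3": 40, "v4": 1},
}

kept = keep_most_recently_pulled(repos, 2)
print(kept)
```

With a count of 2, each repo keeps its own two most recently pulled tags, so four tags survive in total rather than two across the whole registry.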

also it is not clear what the difference in behavior (if any) is between these two examples:

    "keepTags": [{                               
        "mostRecentlyPulledCount": 10
      },
     {                               
        "mostRecentlyPushedCount": 10
      }]

and

    "keepTags": [{                               
        "mostRecentlyPulledCount": 10,
        "mostRecentlyPulledCount": 10 
     }]

I believe in your second example you meant actually

     "keepTags": [{                               
         "mostRecentlyPulledCount": 10,
         "mostRecentlyPushedCount": 10 
      }]

Theoretically there shouldn't be any difference. You would want them in separate rules if they matched different "patterns".
The retention rules under the same policy/keeptags list act as a logical OR, but inside a specific keepTags entry there is a logical AND between the tags matched by "patterns" and the ones matched by mostRecentlyPushedCount OR mostRecentlyPulledCount.

For example, suppose you want to keep the 3 most recently pulled tags matching v1.*, the 2 most recently pushed tags matching v2.*, and the 5 most recently pulled plus the 5 most recently pushed tags regardless of pattern (v1.*, v2.*, v3.*, latest, and so on):

    "keepTags": [{
        "patterns": ["v1.*"],
        "mostRecentlyPulledCount": 3
      },
      {
        "patterns": ["v2.*"],
        "mostRecentlyPushedCount": 2
      },
      {
        "mostRecentlyPulledCount": 5,
        "mostRecentlyPushedCount": 5
      }]

If a tag matches any of the rules under "keepTags", it is retained. Because the same tag may match several list entries, fewer than 15 tags may be retained even if there are enough tags in the repo.
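The OR/AND logic and the "fewer than 15" remark can be checked with a small Python sketch. This is a toy model of my understanding, not zot's implementation. Assumptions: the top-N counts are taken among the tags that already match the rule's patterns, patterns are matched with Python's unanchored `re.search` (how zot anchors patterns is not stated in this thread), the pattern key is spelled "patterns" as in the prose, and the 20 tags with integer timestamps are invented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    name: str
    pushed: int  # toy timestamp, larger means pushed more recently
    pulled: int  # toy timestamp, larger means pulled more recently


def top_n(tags, key, n):
    """Names of the n tags with the largest value of `key`."""
    return {t.name for t in sorted(tags, key=key, reverse=True)[:n]}


def apply_rule(tags, rule):
    """Tags kept by a single keepTags entry (patterns AND (pushed OR pulled))."""
    patterns = rule.get("patterns")
    # A missing "patterns" key means every tag is a candidate.
    candidates = [
        t for t in tags
        if patterns is None or any(re.search(p, t.name) for p in patterns)
    ]
    pushed_n = rule.get("mostRecentlyPushedCount")
    pulled_n = rule.get("mostRecentlyPulledCount")
    if pushed_n is None and pulled_n is None:
        return {t.name for t in candidates}  # pattern-only rule
    kept = set()
    if pushed_n is not None:
        kept |= top_n(candidates, lambda t: t.pushed, pushed_n)
    if pulled_n is not None:
        kept |= top_n(candidates, lambda t: t.pulled, pulled_n)
    return kept


def retained(tags, keep_tags):
    """Union (logical OR) of the tags kept by every keepTags entry."""
    kept = set()
    for rule in keep_tags:
        kept |= apply_rule(tags, rule)
    return kept


names = (
    [f"v1.{i}" for i in range(6)]
    + [f"v2.{i}" for i in range(6)]
    + [f"v3.{i}" for i in range(7)]
    + ["latest"]
)
# Push order follows the list; pull order is deliberately reversed.
tags = [Tag(n, pushed=i, pulled=len(names) - 1 - i) for i, n in enumerate(names)]

keep_tags = [
    {"patterns": ["v1.*"], "mostRecentlyPulledCount": 3},
    {"patterns": ["v2.*"], "mostRecentlyPushedCount": 2},
    {"mostRecentlyPulledCount": 5, "mostRecentlyPushedCount": 5},
]

kept = retained(tags, keep_tags)
print(len(kept), sorted(kept))
```

In this made-up repo the rules could keep up to 15 tags, but the three v1.* tags kept by the first entry are also among the five most recently pulled tags of the last entry, so only 12 distinct tags are retained.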

@eusebiu-constantin-petu-dbk, can you please confirm this specific logic?

the example at https://zotregistry.dev/v2.1.0/articles/retention/?h=rete#complete-configuration-file-example has policies where there is only one dict in the keepTags array and where there are two, and it doesn't seem to explain what the difference is.

As in my example above, the example in the docs uses different patterns. If "patterns" is missing, all tags are considered to match that keepTags rule.

You might also want to check https://zotregistry.dev/v2.1.0/articles/retention/?h=rete#configuration-notes if you haven't already.

@mikemccracken
Contributor Author

Thanks @andaaron! That was very useful. You were right about my typo, I didn't mean to repeat mostRecentlyPulledCount.

The retention rules under the same policy/keeptags list act as a logical OR, but inside a specific keepTags entry there is a logical AND between the tags matched by "patterns" and the ones matched by mostRecentlyPushedCount OR mostRecentlyPulledCount.

this and your example cleared it up for me - multiple keepTags entries are useful if you want different rules for different tag patterns.

@eusebiu-constantin-petu-dbk
Collaborator

eusebiu-constantin-petu-dbk commented Jan 15, 2025

All of these points are correct. Thanks @andaaron !
