-
Notifications
You must be signed in to change notification settings - Fork 105
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(retention): added image retention policies #1866
feat(retention): added image retention policies #1866
Conversation
Also call this "feat(retention)" |
7938a52
to
318aeff
Compare
371c6cd
to
04143f1
Compare
sync policies (are per-repo, regex, semver, etc) are a good fit here IMO. Thoughts? |
c8506fb
to
0806c32
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are interested in both tags which would be retained and tags which would be removed, in case of dry run.
Can we show both? Are we showing both?
8b45eae
to
e3f7742
Compare
Codecov Report
@@ Coverage Diff @@
## main #1866 +/- ##
==========================================
+ Coverage 89.87% 89.98% +0.11%
==========================================
Files 158 164 +6
Lines 26978 27542 +564
==========================================
+ Hits 24246 24785 +539
- Misses 2012 2031 +19
- Partials 720 726 +6
... and 1 file with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
0f3519c
to
97fb79c
Compare
examples/config-gc.json
Outdated
"deleteUntagged": true, | ||
"tagsRetention": [{ | ||
"names": ["v2.*", "*-prod"], | ||
"retainAlways": true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retains all tags, is true by default, and can not be used together with the other rules.
examples/config-gc.json
Outdated
"deleteReferrers": true, | ||
"deleteUntagged": true, | ||
"tagsRetention": [{ | ||
"mostRecentlyPushedCount": 10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tag is retained if any of these 4 rules is met.
because names are missing, then it will be applied for all tags in matched repos.
this will be applied for all repos that do not match the above policies (can be used like a default policy)
examples/config-gc.json
Outdated
"deleteReferrers": false, | ||
"deleteUntagged": true, | ||
"tagsRetention": [{ | ||
"names": ["v2.*", "*-prod"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any tag that doesn't match any tagRetention[].names will be removed!
if names rule is empty, then the rules below are applied for all tags.
examples/config-gc.json
Outdated
"policies": [ | ||
{ | ||
"repoNames": ["infra/*", "prod/*"], | ||
"deleteReferrers": false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whether or not to delete manifests with missing subjects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if you push an referrer but don't push the subject, then this policy kicks in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if they are older than retention.delay, yes
examples/config-gc.json
Outdated
{ | ||
"repoNames": ["infra/*", "prod/*"], | ||
"deleteReferrers": false, | ||
"deleteUntagged": true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whether or not delete manifests without tags.
examples/config-gc.json
Outdated
"dryRun": false, | ||
"policies": [ | ||
{ | ||
"repoNames": ["infra/*", "prod/*"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the policy applied on a repo is the first one in the list that matches
same for tagsRetention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not longest prefix match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, I raised that last week, can't remember why we didn't want that...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it would be very tricky to get the configuration right as a user if the entries in the policy list can be in any order.
You would need to look at potentially hundreds such entries in any order to figure out which one would apply for the repo you are interested in.
If the first one matches you don't lose anything functionality-wise.
You can place your longest prefix match first, and your shorter prefix match afterwards and get the same result you would have gotten if we matched always on longest prefix.
7c82037
to
592d362
Compare
d305898
to
92a7815
Compare
cc: @mbshields this needs documentation as well |
Good interface, nice options for any case - great job, thanks! I will build and test it for sure, I need this! I have one suggestion, I need it and I think it will be nice option too! I don't know how to name it exactly, let it be "dynamic tags": Imagine tags like environment-{1,2,..}-* and I want to keep 2 latest for each environment (1, 2, etc), so I must write rules for each environment, but it's a dynamic thing, and there are many of them, I suggest to simplify it somehow. So many groups of tags and single rule for each group, something like regex capture group (maybe named) I think the best solution cause our group value must be in any matched tag and we should extract&compare it for each tag, and it's also very "flat":
.., so for each matched tag we should compare $1 and interpret tags with same group as unique list/rule and do the things. Maybe you imagine better interface, but I really think it's cool feature, and it will be very unique feature! |
Update after first testing:
For ex. here for tags like: main (meta-tag, always points to the last build of this branch), main.1, main.2, branch-foo (meta), branch-foo.13 I want:
So the main point that this rules will not preserve Also about above feature
Everything like above, but also keep meta-tag for any environment and 1 latest build for each, and also, as for In general, everything looks working, I use build of fork with retention for 2 days in my main zot installation, also seems that all works as I described in 3, and so as I said - it's great! |
And also, I see there is a no difference between |
ba6b3de
to
f748ba2
Compare
Hello, and thanks for trying it!
^here, I don't understand why you need rules per each environment, you want different rules for each one? Let me think about the rest of the points, thanks for raising them. |
@peusebiu pls rebase |
The same, the same to keep last N, but N for each, 1 latest for env-A, 1 latest for env-B, not 1 for any. It seems to me that the best implementation that I have reached is described here.
PS what is subject?) |
The OCI distribution spec is working towards allowing uploading of arbitrary content and add "soft links" to each other. |
a074690
to
f3ae020
Compare
feat(metaDB): add more image statistics info Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
f3ae020
to
782170b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@lictw Thanks for all the feedback and testing. |
@rchincha yep, nice job! Please ping me in future PRs to watch. |
@peusebiu Hi! I'm back. I'm faced with a situation that I can't seem to solve with the current implementation. But a task seems so basic that I can't believe in it, so I decided to ask if I understand everything correctly. I want to keep 3 latest tags, but only for some time, something like:
but there is OR logic, so above will keep latest 3 tags for forever. It's a DEV builds, I want to delete them fully after some time, but I also what to have only some latest builds if there is many, so it's the reason. Also, I also have the latest tag, so I want to always keep the latest and 3 latest pushed tags, and delete all of them after some time. And as side question about latest pulled, is it tracking for a specific tag or for whole image? If an image has 2 tags, but was pulled only by one, the second will be deleted? |
@peusebiu up (@rchincha maybe?) |
To answer your last paragraph, only the tags are deleted, the actual image/manifest is kept/deleted afterwards based on the separate setting you used for I'm not sure there is a possibility to do an AND (intersection) between mostRecentlyPushedCount and pushedWithin. They were designed to independently apply for the same regex pattern identifying the tag strings. And I believe this constraint was because the |
@andaaron you are right with your points. Unfortunately what you want is not possible with current implementation :(. the best you can do is:
if you want to always keep 'latest' and 3 more recent tags ^
If you want to always keep latest + all recently pushed tags. ^
If you want to always keep latest + most recently pushed tags (within 100h) and as a default case keep most recent 4 tags(latest may be included) if the tags are older than 100h. ^ |
@peusebiu why 4? Just to clarify that I understand how it equally works. Latest already done after first rule and should be excluded, is not it? So 3 from of the remaining. After all, does my case is actual? Do you have any thoughts on feature improvements? |
feat(metaDB): add more image statistics info
What type of PR is this?
Which issue does this PR fix:
What does this PR do / Why do we need it:
If an issue # is not available please add repro steps and logs showing the issue:
Testing done on this change:
Automation added to e2e:
Will this break upgrades or downgrades?
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.