Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

redis driver for blob cache information and metadb #2865

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

andaaron
Copy link
Contributor

@andaaron andaaron commented Jan 7, 2025

Rebased #2412 and continued work on missing method implementation for cache driver, and implemented MetaDB methods.

Still need to:

  • run linter
  • make sure all existing tests pass
  • add more tests to cover redis
  • account for separate redis configuration for different substores
  • see if we can do anything for image signatures (right now testing with local storage, but we should have an alternative).
  • use an effective locking system:
    • need to check what support they offer in this library for concurrent goroutines
    • even if they have support for concurrent goroutines in the library we need a solution to work with multiple zot instances using the same database instance

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This was referenced Jan 7, 2025
@elee1766
Copy link
Contributor

elee1766 commented Jan 7, 2025

@andaaron

this is awesome work. thank you.

i wish i could finish this but company needs put me elsewhere. will be super excited to use this feature when it is ready.

I wanted to note a few things, since i see some notes in your PR.

  1. for locks, i have had good success in https://github.com/go-redsync/redsync in the past, using it with redis-v9 driver. however, they only provide a somewhat-safe mutex, and you still need to implement either a leadership election on top of the mutex primative, or force the hot-paths to use mutex. for my needs, this has been okay, since mostly I am using mutexes to save on duplication of work.

  2. this is a slight tangent, but the more modern redis library is https://github.com/redis/rueidis
    which has https://github.com/redis/rueidis/tree/main/rueidislock. i have been using it for some more recent work, as it claims function to better in a redis cluster, however, the locking package is much less tested and used vs redsync/v4. I wouldn't use it for this project personally, but the option does exist, so i thought i would say

  3. redis cluster, maybe you know, has very interesting behavior + latency during failover, reshuffle, etc, or at least in my experience. i'm not sure if this is in scope for the project, but especially if a lock is going to be in the hot path, it may be important to test that zot does not explode when there are 10-15 second timeouts on random queries to redis

  4. fwiw, i dont think that redis is a great choice for a cache driver. the max value size is 512mb, which is much smaller than the max size for an oci blob (i think its a few gb?), there are many layers with size above 512mb. You could potentially shard/multipart the blob into multiple keys, by creating some container format around blob parts to indicate where the next "part" of the blob is, (now you get into some weird stuff with eviction, so you probably want this to be a lua script), but i dont know, it seems a lot more effort than its worth.

@rchincha
Copy link
Contributor

rchincha commented Jan 7, 2025

@andaaron thx for taking this up

@andaaron
Copy link
Contributor Author

andaaron commented Jan 8, 2025

  1. fwiw, i dont think that redis is a great choice for a cache driver. the max value size is 512mb, which is much smaller than the max size for an oci blob (i think its a few gb?), there are many layers with size above 512mb. You could potentially shard/multipart the blob into multiple keys, by creating some container format around blob parts to indicate where the next "part" of the blob is, (now you get into some weird stuff with eviction, so you probably want this to be a lua script), but i dont know, it seems a lot more effort than its worth.

With regards to this specific point, we have a misunderstanding. The role of the cache driver is to store information on what blob was deduped and and where are the duplicates. The role of metadb is to store information taken from manifests, as well as other metadata about manifests, repos, and users.

We wouldn't be using it to store and serve the actual image blobs, that is a separate storage implementation, where we support local folders and S3.

@andaaron andaaron force-pushed the redis2 branch 5 times, most recently from 2651e26 to b82e496 Compare January 9, 2025 16:14
Copy link

codecov bot commented Jan 9, 2025

Codecov Report

Attention: Patch coverage is 68.66430% with 617 lines in your changes missing coverage. Please review.

Project coverage is 90.49%. Comparing base (fba695a) to head (c04f85d).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pkg/meta/redisdb/redis.go 64.25% 457 Missing and 135 partials ⚠️
pkg/storage/cache/redis.go 90.71% 18 Missing and 4 partials ⚠️
pkg/meta/meta.go 92.30% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2865      +/-   ##
==========================================
- Coverage   91.90%   90.49%   -1.41%     
==========================================
  Files         170      172       +2     
  Lines       30291    32246    +1955     
==========================================
+ Hits        27838    29181    +1343     
- Misses       1825     2300     +475     
- Partials      628      765     +137     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@andaaron andaaron force-pushed the redis2 branch 19 times, most recently from 4f0eb30 to 68f3926 Compare January 14, 2025 13:53
@JakeCooper
Copy link

JakeCooper commented Jan 15, 2025

Just commenting here; we'd love to move to Zot at Railway.com for our Cloud+Baremetal clusters. This, we believe, would be the last thing blocking us

Would love to help get it across the line in any way we can

@andaaron andaaron force-pushed the redis2 branch 11 times, most recently from 16cf7bf to aabb483 Compare January 17, 2025 09:20
elee1766 and others added 9 commits January 17, 2025 10:18
Currently, we have dynamoDB as the remote shared cache but ideal only
for the cloud use case.
For on-prem use case, add support for redis.

Signed-off-by: Ramkumar Chinchani <rchincha@cisco.com>
Signed-off-by: Petu Eusebiu <peusebiu@cisco.com>
Signed-off-by: Alexei Dodon <adodon@cisco.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
- add missing method GetAllBlobs
- add redis cache tests, with and without mocking

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
…s DB

Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Signed-off-by: Andrei Aaron <aaaron@luxoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants