Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDH | add image pull API #608

Merged
merged 7 commits into from
Sep 5, 2024

Conversation

Xynnn007
Copy link
Member

@Xynnn007 Xynnn007 commented Jul 9, 2024

This commit adds a new API pull_image() for Confidential DataHub. This new api will converge all operations related to image pull to one API, hinting that image signature validation, auth file retrievement, security policy enforcement will all be done inside CDH.

Related configurations is included in CDH's config.image item.

Although the basic workflow passes due to the unit test, let's keep this PR as draft because some details still needs to promote.

  • Configuration file for image pull of CDH
  • More test cases
  • Determine details of interacting with kata-agent

Related to kata-containers/kata-containers#9266

@fitzthum
Copy link
Member

fitzthum commented Jul 9, 2024

Do you think we should do this before doing encryped/signed images in Kata? Also, are we going to run into problems here if we don't have init-data supported? I saw you mentioned some problems with provisioning the config file on your encrypted image PR.

@Xynnn007
Copy link
Member Author

@fitzthum Before initdata is implemented, the config for CDH is hard to inject so we might need to wait for that.

So logically we might still merge to main with what we are doing, and then initdata, and then this PR and move things to CDH in kata.

@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from 4e1d4d3 to e021335 Compare July 15, 2024 08:41
@Xynnn007
Copy link
Member Author

Ok, for image decryption, it does not work because some dead lock happens inside ocicrypt-rs. let me try to refactor some code inside ocicrypt-rs to make it work

@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from 4f9c7ca to e23effe Compare August 15, 2024 03:48
@@ -37,7 +37,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
.include("src/utils")
.rust_protobuf()
.customize(ttrpc_codegen::Customize {
async_all: true,
async_all: false,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arronwy , I changed the ttrpc client to sync here. Please take a look if it is good to you

@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from 1017c52 to 1b8e10c Compare August 15, 2024 04:29
@Xynnn007 Xynnn007 marked this pull request as ready for review August 15, 2024 05:08
@Xynnn007
Copy link
Member Author

Xynnn007 commented Aug 15, 2024

Hi @BbolroC , the CI for SE cannot pass https://github.com/confidential-containers/guest-components/actions/runs/10398806362/job/28796702259?pr=608#step:6:479, but the x86-64 succeeded. Could you please help to take a look? Thanks!

Sorry to ping, now it passes!

Hi @fitzthum @stevenhorsman I think this is now ready for review.

Copy link
Member

@stevenhorsman stevenhorsman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments/questions from me, but overall it's looking good.

confidential-data-hub/Makefile Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/tests/image-pull.rs Outdated Show resolved Hide resolved
.github/workflows/cdh_basic.yml Show resolved Hide resolved
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from c808e1d to 64255ea Compare August 15, 2024 14:49
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch from 64255ea to 3de4344 Compare August 15, 2024 14:59
Copy link
Member

@fitzthum fitzthum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. A couple initial comments.

confidential-data-hub/hub/src/hub.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/tests/image-pull.rs Show resolved Hide resolved
confidential-data-hub/hub/tests/image-pull.rs Outdated Show resolved Hide resolved
image-rs/src/pull.rs Outdated Show resolved Hide resolved
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from 17710a4 to c9fbddd Compare August 16, 2024 06:55
@ChengyuZhu6
Copy link
Member

cc @bpradipt

@BbolroC
Copy link
Member

BbolroC commented Aug 16, 2024

I think an error below for s390x looks legit:

error: single-character string constant used as pattern
  --> confidential-data-hub/secret/src/secret/mod.rs:46:45
   |
46 |         let sections: Vec<_> = secret.split(".").collect();
   |                                             ^^^ help: try using a `char` instead: `'.'`
   |

Thanks!

@Xynnn007
Copy link
Member Author

I think an error below for s390x looks legit:

error: single-character string constant used as pattern
  --> confidential-data-hub/secret/src/secret/mod.rs:46:45
   |
46 |         let sections: Vec<_> = secret.split(".").collect();
   |                                             ^^^ help: try using a `char` instead: `'.'`
   |

Thanks!

this error is not related to this PR. Probably we could fix it in another pr then

@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 2 times, most recently from 316ac26 to 3fd0233 Compare August 19, 2024 03:49
@Xynnn007
Copy link
Member Author

@BbolroC Fixed in #680 and get the pr rebased. and now it works

Copy link
Member

@fitzthum fitzthum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more comments. These are mainly about the config file. I think we can take this opportunity to clarify a lot of our parameter names.

We should also make sure we have really clear documentation of all the parameters either in the source or in the sample config.

confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
image-rs/src/image.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Show resolved Hide resolved
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch 4 times, most recently from 6cd213a to 0c02ec5 Compare August 24, 2024 13:56
@Xynnn007
Copy link
Member Author

@fitzthum I changed the config names, ptal

Copy link
Member

@fitzthum fitzthum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. I appreciate the changes to the config. I think combining the flags to enable the features with the URIs was a good idea.

I have a few more comments. Let's just try to settle on the names for the parameters. We can tweak the descriptions in a follow-up.

confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/config.rs Outdated Show resolved Hide resolved
confidential-data-hub/hub/src/hub.rs Show resolved Hide resolved
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch from 0c02ec5 to a83b1f2 Compare August 28, 2024 02:10
Copy link
Member

@ChengyuZhu6 ChengyuZhu6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall code looks good to me. LGTM ! thanks @Xynnn007

@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch from a83b1f2 to 3190f33 Compare August 29, 2024 14:22
@Xynnn007
Copy link
Member Author

Just did a rebase to resolve conflicts.

Copy link
Member

@fitzthum fitzthum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one possibly suggestion but generally I think this is good.

I can live with no_proxy for now I guess and we can clean up the docs a little bit later.


if let Some(number) = config.image.max_concurrent_layer_downloads_per_image {
image_client.config.max_concurrent_download = number;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it would make sense to move this logic into the initializer for the image_client (and pass config.image into that) rather than putting all this image-specific stuff in this function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some barriers because the parameters is different from image-rs's. If you are worrying about the exposion of the args of iamge_client, we can add a Builder logic to ImageClient. If you are worrying about the duplication of settings, we could do a refactoring upon image-rs to have a ClientBuilder logic to make it better.

wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the fundamental issue is duplication. We have configs for image-rs but we also have the image config for the CDH. Ideally those might be the same thing, which would simplify this code. That would take some messing around, though, so we can skip it for now.

This commit adds a new API `pull_image()` for Confidential DataHub. This
new api will converge all operations related to image pull to one API,
hinting that image signature validation, auth file retrievement,
security policy enforcement will all be done inside CDH.

Related configurations is included in CDH's config.image item.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
actions-rs/toolchain@v1 is now out of data. Also, we need extra
permission to run CDH tests.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
The original ttrpc keyprovider client is marked `async`, but we use it
as sync version by fork a new thread and create a new tokio runtime.
This commit changes the client into sync version, this will match the
original usage.

Why we need this commit is the original fork-thread-and-tokio-runtime
way would create a dead lock for CDH's image-pull API when an encrypted
image is pulled. When we bring in a tokio::spawn_blocking in image-rs,
the tokio runtime would have possibility to switch context thus a dead
lock can be avoided.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
meta_store stores the metadata about the pulled images. Using Mutex
would block the concurrent performance when there are few write cases to
meta_store, like pulling a lot of pulled images.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
meta store json file records the current state of pulled images and the
layers. Before this commit, the json file is not ever written back to
local filesystem.

For legacy CoCo, everytime a new pull request comes, a new ImageClient
would be specified a new workdir, thus there is no re-use of the
meta-store and works fine.

With this commit, when a ImageClient with same workdir restarts, it
still can share the same metadata of pulled images.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
Before this commit, different Overlayfs mounts of a same ImageClient
would mark its mounts as 1, 2, 3... Thus there will be 0,1,2... under
$work_dir/overlayfs.

If a ImageClient with same work dir launches, new mounts will fail as
the original index cannot be correctly read from the meta store json
file.

To make it easy, we changed the way to distinguish different mounts, by
calculating a hash from (overlay layer paths, mount paths, time). As the
three variables should not be the same, thus we could avoid collisions
of overlayfs working dir.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
/run is a tmpfs workdir that would live inside guest memory for CoCo,
while `/var/lib` is on disk. This would help CDH to avoid setting
another default image workdir.

As image-rs is able to set any workdir from its config, this patch will
not influence the flexibility.

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
@Xynnn007 Xynnn007 force-pushed the feat-cdh-image-pull branch from 3190f33 to 8480ae9 Compare September 4, 2024 09:09
@Xynnn007 Xynnn007 requested a review from a team as a code owner September 4, 2024 09:09
@Xynnn007 Xynnn007 merged commit 493c703 into confidential-containers:main Sep 5, 2024
15 checks passed
@Xynnn007 Xynnn007 deleted the feat-cdh-image-pull branch September 5, 2024 01:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

7 participants