From 8b5778b166db2437be37ceae8cbb8e11195462ee Mon Sep 17 00:00:00 2001 From: Rand McKinney Date: Mon, 18 Nov 2024 15:56:58 -0800 Subject: [PATCH 1/7] Move release notes to separate file and xref Rust supported formats --- README.md | 91 ++++++++----------------------------------- docs/release-notes.md | 52 +++++++++++++++++++++++++ 2 files changed, 68 insertions(+), 75 deletions(-) create mode 100644 docs/release-notes.md diff --git a/README.md b/README.md index 4dce63e..9519d2b 100644 --- a/README.md +++ b/README.md @@ -1,13 +1,18 @@ # C2PA Python -This repository implements Python bindings for the Content Authenticity Initiative (CAI) library. +This package implements Python bindings for the Content Authenticity Initiative (CAI) SDK. +It enables you to read and validate C2PA manifest data from and add signed manifests to media files in the [supported formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md). -This library enables you to read and validate C2PA data in supported media files and add signed manifests to supported media files. - -**NOTE**: Starting with version 0.5.0, this package has a completely different API from version 0.4.0. See [Release notes](#release-notes) for more information. +**NOTE**: Starting with version 0.5.0, this package has a completely different API from version 0.4.0. See [Release notes](docs/release-notes.md) for more information. **WARNING**: This is an prerelease version of this library. There may be bugs and unimplemented features, and the API is subject to change. +
+ +For information on what's in the current release, see the [Release notes](docs/release-notes.md). + +
+ ## Installation Install from PyPI by entering this command: @@ -36,8 +41,14 @@ If you tried unsuccessfully to install this package before the [0.40 release](ht pip install --upgrade --force-reinstall c2pa-python ``` +## Supported formats + +The Python library [supports the same media file formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md) as the Rust library. + ## Usage +This package works with media files in the [supported formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md). + ### Import Import the API as follows: @@ -100,7 +111,7 @@ def sign_ps256(data: bytes, key_path: str) -> bytes: **Read and validate C2PA data from an asset file** -Use the `Reader` to read C2PA data from the specified asset file (see [supported file formats](#supported-file-formats)). +Use the `Reader` to read C2PA data from the specified asset file. This examines the specified media file for C2PA data and generates a report of any data it finds. If there are validation errors, the report includes a `validation_status` field. @@ -270,29 +281,6 @@ except Exception as err: print(err) ``` -## Supported file formats - - | Extensions | MIME type | - | ------------- | --------------------------------------------------- | - | `avi` | `video/msvideo`, `video/avi`, `application-msvideo` | - | `avif` | `image/avif` | - | `c2pa` | `application/x-c2pa-manifest-store` | - | `dng` | `image/x-adobe-dng` | - | `gif` | `image/gif` | - | `heic` | `image/heic` | - | `heif` | `image/heif` | - | `jpg`, `jpeg` | `image/jpeg` | - | `m4a` | `audio/mp4` | - | `mp3` | `audio/mpeg` | - | `mp4` | `video/mp4`, `application/mp4` | - | `mov` | `video/quicktime` | - | `png` | `image/png` | - | `svg` | `image/svg+xml` | - | `tif`,`tiff` | `image/tiff` | - | `wav` | `audio/x-wav` | - | `webp` | `image/webp` | - - ## Development It is best to [set up a virtual environment](https://virtualenv.pypa.io/en/latest/installation.html) for development and testing. 
@@ -377,53 +365,6 @@ python3 tests/training.py deactivate ``` -## Release notes - -### Version 0.5.2 - -New features: - -- Allow EC signatures in DER format from signers and verify signature format during validation. -- Fix bug in signing audio and video files in ISO Base Media File Format (BMFF). -- Add the ability to verify PDF files (but not to sign them). -- Increase speed of `sign_file` by 2x or more, when using file paths (uses native Rust file I/O). -- Fixes for RIFF and GIF formats. - -### Version 0.5.0 - -New features in this release: - -- Rewrites the API to be stream-based using a Builder / Reader model. -- The functions now support throwing `c2pa.Error` values, caught with `try`/`except`. -- Instead of `c2pa.read_file` you now call `c2pa_api.Reader.from_file` and `reader.json`. -- Read thumbnails and other resources use `reader.resource_to_stream` or `reader.resource.to_file`. -- Instead of `c2pa.sign_file` use `c2pa_api.Builder.from_json` and `builder.sign` or `builder.sign_file`. -- Add thumbnails or other resources with `builder.add_resource` or `builder.add_resource_file`. -- Add Ingredients with `builder.add_ingredient` or `builder.add_ingredient_file`. -- You can archive a `Builder` using `builder.to_archive` and reconstruct it with `builder.from_archive`. -- Signers can be constructed with `c2pa_api.create_signer`. -- The signer now requires a signing function to keep private keys private. -- Example signing functions are provided in c2pa_api.py - -### Version 0.4.0 - -This release: - -- Changes the name of the import from `c2pa-python` to `c2pa`. -- Supports pre-built platform wheels for macOS, Windows and [manylinux](https://github.com/pypa/manylinux). - -### Version 0.3.0 - -This release includes some breaking changes to align with future APIs: - -- `C2paSignerInfo` moves the `alg` to the first parameter from the 3rd. -- `c2pa.verify_from_file_json` is now `c2pa.read_file`. 
-- `c2pa.ingredient_from_file_json` is now `c2pa.read_ingredient_file`. -- `c2pa.add_manifest_to_file_json` is now `c2pa.sign_file`. -- There are many more specific errors types now, and Error messages always start with the name of the error i.e (str(err.value).startswith("ManifestNotFound")). -- The ingredient thumbnail identifier may be jumbf uri reference if a valid thumb already exists in the active manifest. -- Extracted file paths for read_file now use a folder structure and different naming conventions. - ## License This package is distributed under the terms of both the [MIT license](https://github.com/contentauth/c2pa-python/blob/main/LICENSE-MIT) and the [Apache License (Version 2.0)](https://github.com/contentauth/c2pa-python/blob/main/LICENSE-APACHE). diff --git a/docs/release-notes.md b/docs/release-notes.md new file mode 100644 index 0000000..56ff1b0 --- /dev/null +++ b/docs/release-notes.md @@ -0,0 +1,52 @@ +# Release notes + +## Version 0.6.0 + + + +See [Release tag 0.6.0](https://github.com/contentauth/c2pa-python/releases/tag/v0.6.0). + +## Version 0.5.2 + +New features: + +- Allow EC signatures in DER format from signers and verify signature format during validation. +- Fix bug in signing audio and video files in ISO Base Media File Format (BMFF). +- Add the ability to verify PDF files (but not to sign them). +- Increase speed of `sign_file` by 2x or more, when using file paths (uses native Rust file I/O). +- Fixes for RIFF and GIF formats. + +## Version 0.5.0 + +New features in this release: + +- Rewrites the API to be stream-based using a Builder / Reader model. +- The functions now support throwing `c2pa.Error` values, caught with `try`/`except`. +- Instead of `c2pa.read_file` you now call `c2pa_api.Reader.from_file` and `reader.json`. +- Read thumbnails and other resources use `reader.resource_to_stream` or `reader.resource.to_file`. +- Instead of `c2pa.sign_file` use `c2pa_api.Builder.from_json` and `builder.sign` or `builder.sign_file`. 
+- Add thumbnails or other resources with `builder.add_resource` or `builder.add_resource_file`. +- Add Ingredients with `builder.add_ingredient` or `builder.add_ingredient_file`. +- You can archive a `Builder` using `builder.to_archive` and reconstruct it with `builder.from_archive`. +- Signers can be constructed with `c2pa_api.create_signer`. +- The signer now requires a signing function to keep private keys private. +- Example signing functions are provided in c2pa_api.py + +## Version 0.4.0 + +This release: + +- Changes the name of the import from `c2pa-python` to `c2pa`. +- Supports pre-built platform wheels for macOS, Windows and [manylinux](https://github.com/pypa/manylinux). + +## Version 0.3.0 + +This release includes some breaking changes to align with future APIs: + +- `C2paSignerInfo` moves the `alg` to the first parameter from the 3rd. +- `c2pa.verify_from_file_json` is now `c2pa.read_file`. +- `c2pa.ingredient_from_file_json` is now `c2pa.read_ingredient_file`. +- `c2pa.add_manifest_to_file_json` is now `c2pa.sign_file`. +- There are many more specific errors types now, and Error messages always start with the name of the error i.e (str(err.value).startswith("ManifestNotFound")). +- The ingredient thumbnail identifier may be jumbf uri reference if a valid thumb already exists in the active manifest. +- Extracted file paths for read_file now use a folder structure and different naming conventions. From 59aa84397a0e16f92b1a7da7833b21ef644aa47e Mon Sep 17 00:00:00 2001 From: Rand McKinney Date: Tue, 19 Nov 2024 11:31:32 -0800 Subject: [PATCH 2/7] Fix links, etc --- CONTRIBUTING.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 7b28791..d5fa001 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -6,7 +6,7 @@ Before you start, we ask that you understand the following guidelines. ## Code of conduct -This project adheres to the Adobe [code of conduct](../CODE_OF_CONDUCT.md). 
By participating, +This project adheres to the Adobe [code of conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to [Grp-opensourceoffice@adobe.com](mailto:Grp-opensourceoffice@adobe.com). From 20550d30772d5c9ae9891e7701ea766113e9b198 Mon Sep 17 00:00:00 2001 From: Rand McKinney Date: Tue, 19 Nov 2024 15:00:07 -0800 Subject: [PATCH 3/7] Replace outdated info about PR/commit messages --- CONTRIBUTING.md | 38 ++++++++++++++++++++++++++++++-------- 1 file changed, 30 insertions(+), 8 deletions(-) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index d5fa001..231055f 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -18,12 +18,10 @@ consensus around project direction and issue solutions within issue threads ### Current areas of work -The Adobe CAI team has been using this crate as the foundation of Adobe's Content Authenticity Initiative-related products and services since late 2020. As we shift toward making this crate available for open usage, we're aware that there is quite a bit of work to do to create what we'd feel comfortable calling a 1.0 release. We've decided to err on the side of releasing earlier so that people can experiment with it and give us feedback. - -We expect to do work on a number of areas in the next few months while we remain in prerelease (0.x) versions. Some broad categories of work (and thus things you might expect to change) are: +Some broad categories of work (and thus things you might expect to change) are: * We'll be reviewing and refining our APIs for ease of use and comprehension. We'd appreciate feedback on areas that you find confusing or unnecessarily difficult. -* We'll also be reviewing our APIs for compliance with Rust community best practices. There are some areas (for example, use of public fields and how we take ownership vs references) where we know some work is required. +* We'll also be reviewing our APIs for compliance with best practices. 
* Our documentation is incomplete. We'll be working on refining the documentation. * Our testing infrastructure is incomplete. We'll be working on improving test coverage, memory efficiency, and performance benchmarks. @@ -61,11 +59,35 @@ This will give us an opportunity to discuss API design and avoid duplicate effor ### Pull request titles -The build process automatically adds a pull request (PR) to the [CHANGELOG](CHANGELOG.md) unless the title of the PR begins with `(IGNORE)`. Start PR titles with `(IGNORE)` for minor documentation updates and other trivial fixes that you want to specifically exclude from the CHANGELOG. +Titles of pull requests that target a long-lived branch such as _main_ or a release-specific branch should follow [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/#specification). This means that the first word of the pull request title should be one of the following: + + * `build` + * `chore` + * `ci` + * `docs` + * `feat` + * `fix` + * `perf` + * `refactor` + * `revert` + * `style` + * `test` + +Optionally, but preferred, a scope can be added in parentheses after the type. The scope should be the name of the module or component that the commit affects. For example, `feat(api): Introduce a new API to validate 1.0 claims`. + +If more detail is warranted, add a blank line and then continue with sentences (these sentences should be punctuated as such) and paragraphs as needed to provide that detail. There is no need to word-wrap this message. + +For example: + +```text +feat(api): Introduce a new API to validate 1.0 claims + +Repurpose existing v2 API for 0.8 compatibility (read: no validation) mode. +``` + +The conventional commit message requirement does not apply to individual commits within a pull request, provided that those commits will be squashed when the PR is merged and the resulting squash commit does follow the conventional commit requirement. 
This may require the person merging the PR to verify the commit message syntax when performing the squash merge. -Additionally, the build process takes specific actions if the title of a PR begins with certain special strings: -- `(MINOR)`: Increments the minor version, per [semantic versioning](https://semver.org/) convention. **IMPORTANT:** This flag should be used for any API change that breaks compatibility with previous releases while this crate is in prerelease (version 0.x) status. -- `(MAJOR)`: Increments the major version number, per [semantic versioning](https://semver.org/) convention. +TIP: For single-commit PRs, ensure the commit message conforms to the conventional commit requirement, since by default that will also be the title of the PR. ## From contributor to committer From 92d6e589542f3eae820a9e06e655468caeeaa098 Mon Sep 17 00:00:00 2001 From: Rand McKinney Date: Wed, 20 Nov 2024 11:12:21 -0800 Subject: [PATCH 4/7] Clarify signing function use --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9519d2b..c896c71 100644 --- a/README.md +++ b/README.md @@ -83,7 +83,7 @@ manifest_json = json.dumps({ ### Signing function -The `sign_ps256` function is [defined in the library](https://github.com/contentauth/c2pa-python/blob/main/c2pa/c2pa_api/c2pa_api.py#L209) and is reproduced here to show how signing is performed. +The `sign_ps256` function is [defined in the library](https://github.com/contentauth/c2pa-python/blob/main/c2pa/c2pa_api/c2pa_api.py#L209) is used in both file-based and stream-based methods and is reproduced here to show how signing is performed. 
```py # Example of using Python crypto to sign data using openssl with Ps256 From 5df715a536712b9fb6ea3eeb46876331ea300e94 Mon Sep 17 00:00:00 2001 From: Rand McKinney Date: Thu, 21 Nov 2024 13:39:16 -0800 Subject: [PATCH 5/7] Move project contributions and usage info to separate files in docs dir --- README.md | 327 +--------------------------------- docs/project-contributions.md | 87 +++++++++ docs/usage.md | 235 ++++++++++++++++++++++++ 3 files changed, 327 insertions(+), 322 deletions(-) create mode 100644 docs/project-contributions.md create mode 100644 docs/usage.md diff --git a/README.md b/README.md index c896c71..3b24dc2 100644 --- a/README.md +++ b/README.md @@ -9,7 +9,10 @@ It enables you to read and validate C2PA manifest data from and add signed manif
-For information on what's in the current release, see the [Release notes](docs/release-notes.md). +Additional documentation: +- [Using the Python library](docs/usage.md) +- [Release notes](docs/release-notes.md) +- [Contributing to the project](docs/project-contributions.md)
@@ -21,7 +24,7 @@ Install from PyPI by entering this command: pip install -U c2pa-python ``` -This is a platform wheel built with Rust that works on Windows, macOS, and most Linux distributions (using [manylinux](https://github.com/pypa/manylinux)). If you need to run on another platform, see [Development](#development) for information on how to build from source. +This is a platform wheel built with Rust that works on Windows, macOS, and most Linux distributions (using [manylinux](https://github.com/pypa/manylinux)). If you need to run on another platform, see [Project contributions - Development](docs/project-contributions.md#development) for information on how to build from source. ### Updating @@ -45,326 +48,6 @@ pip install --upgrade --force-reinstall c2pa-python The Python library [supports the same media file formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md) as the Rust library. -## Usage - -This package works with media files in the [supported formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md). - -### Import - -Import the API as follows: - -```py -from c2pa import * -``` - -### Define manifest JSON - -The Python library works with both file-based and stream-based operations. 
-In both cases, the manifest JSON string defines the C2PA manifest to add to an asset; for example: - -```py -manifest_json = json.dumps({ - "claim_generator": "python_test/0.1", - "assertions": [ - { - "label": "c2pa.training-mining", - "data": { - "entries": { - "c2pa.ai_generative_training": { "use": "notAllowed" }, - "c2pa.ai_inference": { "use": "notAllowed" }, - "c2pa.ai_training": { "use": "notAllowed" }, - "c2pa.data_mining": { "use": "notAllowed" } - } - } - } - ] - }) -``` - -### Signing function - -The `sign_ps256` function is [defined in the library](https://github.com/contentauth/c2pa-python/blob/main/c2pa/c2pa_api/c2pa_api.py#L209) is used in both file-based and stream-based methods and is reproduced here to show how signing is performed. - -```py -# Example of using Python crypto to sign data using openssl with Ps256 -from cryptography.hazmat.primitives import hashes, serialization -from cryptography.hazmat.primitives.asymmetric import padding - -def sign_ps256(data: bytes, key_path: str) -> bytes: - with open(key_path, "rb") as key_file: - private_key = serialization.load_pem_private_key( - key_file.read(), - password=None, - ) - signature = private_key.sign( - data, - padding.PSS( - mgf=padding.MGF1(hashes.SHA256()), - salt_length=padding.PSS.MAX_LENGTH - ), - hashes.SHA256() - ) - return signature -``` - -### File-based operation - -**Read and validate C2PA data from an asset file** - -Use the `Reader` to read C2PA data from the specified asset file. - -This examines the specified media file for C2PA data and generates a report of any data it finds. If there are validation errors, the report includes a `validation_status` field. - -An asset file may contain many manifests in a manifest store. The most recent manifest is identified by the value of the `active_manifest` field in the manifests map. 
The manifests may contain binary resources such as thumbnails which can be retrieved with `resource_to_stream` or `resource_to_file` using the associated `identifier` field values and a `uri`. - -NOTE: For a comprehensive reference to the JSON manifest structure, see the [Manifest store reference](https://opensource.contentauthenticity.org/docs/manifest/manifest-ref). - -```py -try: - # Create a reader from a file path - reader = c2pa.Reader.from_file("path/to/media_file.jpg") - - # Print the JSON for a manifest. - print("manifest store:", reader.json()) - - # Get the active manifest. - manifest = reader.get_active_manifest() - if manifest != None: - - # get the uri to the manifest's thumbnail and write it to a file - uri = manifest["thumbnail"]["identifier"] - reader.resource_to_file(uri, "thumbnail_v2.jpg") - -except Exception as err: - print(err) -``` - -**Add a signed manifest to an asset file** - -**WARNING**: This example accesses the private key and security certificate directly from the local file system. This is fine during development, but doing so in production may be insecure. Instead use a Key Management Service (KMS) or a hardware security module (HSM) to access the certificate and key; for example as show in the [C2PA Python Example](https://github.com/contentauth/c2pa-python-example). 
- -Use a `Builder` to add a manifest to an asset: - -```py -try: - # Define a function to sign the claim bytes - # In this case we are using a pre-defined sign_ps256 method, passing in our private cert - # Normally this cert would be kept safe in some other location - def private_sign(data: bytes) -> bytes: - return sign_ps256(data, "tests/fixtures/ps256.pem") - - # read our public certs into memory - certs = open(data_dir + "ps256.pub", "rb").read() - - # Create a signer from the private signer, certs and a time stamp service url - signer = create_signer(private_sign, SigningAlg.PS256, certs, "http://timestamp.digicert.com") - - # Create a builder add a thumbnail resource and an ingredient file. - builder = Builder(manifest_json) - - # The uri provided here "thumbnail" must match an identifier in the manifest definition. - builder.add_resource_file("thumbnail", "tests/fixtures/A_thumbnail.jpg") - - # Define an ingredient, in this case a parent ingredient named A.jpg, with a thumbnail - ingredient_json = { - "title": "A.jpg", - "relationship": "parentOf", # "parentOf", "componentOf" or "inputTo" - "thumbnail": { - "identifier": "thumbnail", - "format": "image/jpeg" - } - } - - # Add the ingredient to the builder loading information from a source file. - builder.add_ingredient_file(ingredient_json, "tests/fixtures/A.jpg") - - # At this point we could archive or unarchive our Builder to continue later. - # In this example we use a bytearray for the archive stream. - # all ingredients and resources will be saved in the archive - archive = io.BytesIO(bytearray()) - builder.to_archive(archive) - archive.seek() - builder = builder.from_archive(archive) - - # Sign and add our manifest to a source file, writing it to an output file. - # This returns the binary manifest data that could be uploaded to cloud storage. 
- c2pa_data = builder.sign_file(signer, "tests/fixtures/A.jpg", "target/out.jpg") - -except Exception as err: - print(err) -``` - -### Stream-based operation - -Instead of working with files, you can read, validate, and add a signed manifest to streamed data. This example code does the same thing as the file-based example. - -**Read and validate C2PA data from a stream** - -```py -try: - # It's also possible to create a reader from a format and stream - # Note that these two readers are functionally equivalent - stream = open("path/to/media_file.jpg", "rb") - reader = c2pa.Reader("image/jpeg", stream) - - # Print the JSON for a manifest. - print("manifest store:", reader.json()) - - # Get the active manifest. - manifest = reader.get_active_manifest() - if manifest != None: - - # get the uri to the manifest's thumbnail and write it to a file - uri = manifest["thumbnail"]["identifier"] - reader.resource_to_file(uri, "thumbnail_v2.jpg") - -except Exception as err: - print(err) -``` - -**Add a signed manifest to a stream** - -**WARNING**: This example accesses the private key and security certificate directly from the local file system. This is fine during development, but doing so in production may be insecure. Instead use a Key Management Service (KMS) or a hardware security module (HSM) to access the certificate and key; for example as show in the [C2PA Python Example](https://github.com/contentauth/c2pa-python-example). 
- -Use a `Builder` to add a manifest to an asset: - -```py -try: - # Define a function to sign the claim bytes - # In this case we are using a pre-defined sign_ps256 method, passing in our private cert - # Normally this cert would be kept safe in some other location - def private_sign(data: bytes) -> bytes: - return sign_ps256(data, "tests/fixtures/ps256.pem") - - # read our public certs into memory - certs = open(data_dir + "ps256.pub", "rb").read() - - # Create a signer from the private signer, certs and a time stamp service url - signer = create_signer(private_sign, SigningAlg.PS256, certs, "http://timestamp.digicert.com") - - # Create a builder add a thumbnail resource and an ingredient file. - builder = Builder(manifest_json) - - # Add the resource from a stream - a_thumbnail_jpg_stream = open("tests/fixtures/A_thumbnail.jpg", "rb") - builder.add_resource("image/jpeg", a_thumbnail_jpg_stream) - - # Define an ingredient, in this case a parent ingredient named A.jpg, with a thumbnail - ingredient_json = { - "title": "A.jpg", - "relationship": "parentOf", # "parentOf", "componentOf" or "inputTo" - "thumbnail": { - "identifier": "thumbnail", - "format": "image/jpeg" - } - } - - # Add the ingredient from a stream - a_jpg_stream = open("tests/fixtures/A.jpg", "rb") - builder.add_ingredient("image/jpeg", a_jpg_stream) - - # At this point we could archive or unarchive our Builder to continue later. - # In this example we use a bytearray for the archive stream. - # all ingredients and resources will be saved in the archive - archive = io.BytesIO(bytearray()) - builder.to_archive(archive) - archive.seek() - builder = builder.from_archive(archive) - - # Sign the builder with a stream and output it to a stream - # This returns the binary manifest data that could be uploaded to cloud storage. 
- input_stream = open("tests/fixtures/A.jpg", "rb") - output_stream = open("target/out.jpg", "wb") - c2pa_data = builder.sign(signer, "image/jpeg", input_stream, output_stream) - -except Exception as err: - print(err) - ``` - -## Development - -It is best to [set up a virtual environment](https://virtualenv.pypa.io/en/latest/installation.html) for development and testing. - -To build from source on Linux, install `curl` and `rustup` then set up Python. - -First update `apt` then (if needed) install `curl`: - -```bash -apt update -apt install curl -``` - -Install Rust: - -```bash -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -source "$HOME/.cargo/env" -``` - -Install Python, `pip`, and `venv`: - -```bash -apt install python3 -apt install pip -apt install python3.11-venv -python3 -m venv .venv -``` - -Build the wheel for your platform (from the root of the repository): - -```bash -source .venv/bin/activate -pip install -r requirements.txt -python3 -m pip install build -pip install -U pytest - -python3 -m build --wheel -``` - -Note: To peek at the Python code (uniffi generated and non-generated), run `maturin develop` and look in the c2pa folder. - -### ManyLinux build - -Build using [manylinux](https://github.com/pypa/manylinux) by using a Docker image as follows: - -```bash -docker run -it quay.io/pypa/manylinux_2_28_aarch64 bash -curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -source "$HOME/.cargo/env" -export PATH=/opt/python/cp312-cp312/bin:$PATH -pip install maturin -pip install venv -pip install build -pip install -U pytest - -cd home -git clone https://github.com/contentauth/c2pa-python.git -cd c2pa-python -python3 -m build --wheel -auditwheel repair target/wheels/c2pa_python-0.4.0-py3-none-linux_aarch64.whl -``` - -### Testing - -We use [PyTest](https://docs.pytest.org/) for testing. - -Run tests by following these steps: - -1. Activate the virtual environment: `source .venv/bin/activate` -2. 
(optional) Install dependencies: `pip install -r requirements.txt` -3. Setup the virtual environment with local changes: `maturin develop` -4. Run the tests: `pytest` -5. Deactivate the virtual environment: `deactivate` - -For example: - -```bash -source .venv/bin/activate -maturin develop -python3 tests/training.py -deactivate -``` - ## License This package is distributed under the terms of both the [MIT license](https://github.com/contentauth/c2pa-python/blob/main/LICENSE-MIT) and the [Apache License (Version 2.0)](https://github.com/contentauth/c2pa-python/blob/main/LICENSE-APACHE). diff --git a/docs/project-contributions.md b/docs/project-contributions.md new file mode 100644 index 0000000..94df240 --- /dev/null +++ b/docs/project-contributions.md @@ -0,0 +1,87 @@ +# Contributing to the project + +The information in this page is primarily for those who wish to contribute to the c2pa-python library project itself, rather than those who simply wish to use it in an application. For general contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md). + +## Development + +It is best to [set up a virtual environment](https://virtualenv.pypa.io/en/latest/installation.html) for development and testing. + +To build from source on Linux, install `curl` and `rustup` then set up Python. 
+ +First update `apt` then (if needed) install `curl`: + +```bash +apt update +apt install curl +``` + +Install Rust: + +```bash +curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +source "$HOME/.cargo/env" +``` + +Install Python, `pip`, and `venv`: + +```bash +apt install python3 +apt install pip +apt install python3.11-venv +python3 -m venv .venv +``` + +Build the wheel for your platform (from the root of the repository): + +```bash +source .venv/bin/activate +pip install -r requirements.txt +python3 -m pip install build +pip install -U pytest + +python3 -m build --wheel +``` + +Note: To peek at the Python code (uniffi generated and non-generated), run `maturin develop` and look in the c2pa folder. + +## ManyLinux build + +Build with [manylinux](https://github.com/pypa/manylinux) using a Docker image as follows: + +```bash +docker run -it quay.io/pypa/manylinux_2_28_aarch64 bash +curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh +source "$HOME/.cargo/env" +export PATH=/opt/python/cp312-cp312/bin:$PATH +pip install maturin +pip install venv +pip install build +pip install -U pytest + +cd home +git clone https://github.com/contentauth/c2pa-python.git +cd c2pa-python +python3 -m build --wheel +auditwheel repair target/wheels/c2pa_python-0.4.0-py3-none-linux_aarch64.whl +``` + +## Testing + +We use [PyTest](https://docs.pytest.org/) for testing. + +Run tests by following these steps: + +1. Activate the virtual environment: `source .venv/bin/activate` +2. (optional) Install dependencies: `pip install -r requirements.txt` +3. Set up the virtual environment with local changes: `maturin develop` +4. Run the tests: `pytest` +5. 
Deactivate the virtual environment: `deactivate` + +For example: + +```bash +source .venv/bin/activate +maturin develop +python3 tests/training.py +deactivate +``` diff --git a/docs/usage.md b/docs/usage.md new file mode 100644 index 0000000..7e7aa8e --- /dev/null +++ b/docs/usage.md @@ -0,0 +1,235 @@ +# Using the Python library + +This package works with media files in the [supported formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md). + +## Import + +Import the API as follows: + +```py +from c2pa import * +``` + +## Define manifest JSON + +The Python library works with both file-based and stream-based operations. +In both cases, the manifest JSON string defines the C2PA manifest to add to an asset; for example: + +```py +manifest_json = json.dumps({ + "claim_generator": "python_test/0.1", + "assertions": [ + { + "label": "c2pa.training-mining", + "data": { + "entries": { + "c2pa.ai_generative_training": { "use": "notAllowed" }, + "c2pa.ai_inference": { "use": "notAllowed" }, + "c2pa.ai_training": { "use": "notAllowed" }, + "c2pa.data_mining": { "use": "notAllowed" } + } + } + } + ] + }) +``` + +## Signing function + +The `sign_ps256` function, [defined in the library](https://github.com/contentauth/c2pa-python/blob/main/c2pa/c2pa_api/c2pa_api.py#L209), is used in both file-based and stream-based methods. It is reproduced here to show how signing is performed. 
+
+```py
+# Example of using Python crypto to sign data using openssl with Ps256
+from cryptography.hazmat.primitives import hashes, serialization
+from cryptography.hazmat.primitives.asymmetric import padding
+
+def sign_ps256(data: bytes, key_path: str) -> bytes:
+    with open(key_path, "rb") as key_file:
+        private_key = serialization.load_pem_private_key(
+            key_file.read(),
+            password=None,
+        )
+    signature = private_key.sign(
+        data,
+        padding.PSS(
+            mgf=padding.MGF1(hashes.SHA256()),
+            salt_length=padding.PSS.MAX_LENGTH
+        ),
+        hashes.SHA256()
+    )
+    return signature
+```
+
+## File-based operation
+
+### Read and validate C2PA data
+
+Use the `Reader` to read C2PA data from the specified asset file.
+
+This examines the specified media file for C2PA data and generates a report of any data it finds. If there are validation errors, the report includes a `validation_status` field.
+
+An asset file may contain many manifests in a manifest store. The most recent manifest is identified by the value of the `active_manifest` field in the manifests map. The manifests may contain binary resources such as thumbnails which can be retrieved with `resource_to_stream` or `resource_to_file` using the associated `identifier` field values and a `uri`.
+
+NOTE: For a comprehensive reference to the JSON manifest structure, see the [Manifest store reference](https://opensource.contentauthenticity.org/docs/manifest/manifest-ref).
+
+```py
+try:
+    # Create a reader from a file path
+    reader = Reader.from_file("path/to/media_file.jpg")
+
+    # Print the JSON for the manifest store.
+    print("manifest store:", reader.json())
+
+    # Get the active manifest.
+    manifest = reader.get_active_manifest()
+    if manifest is not None:
+
+        # get the uri to the manifest's thumbnail and write it to a file
+        uri = manifest["thumbnail"]["identifier"]
+        reader.resource_to_file(uri, "thumbnail_v2.jpg")
+
+except Exception as err:
+    print(err)
+```
+
+### Add a signed manifest
+
+**WARNING**: This example accesses the private key and security certificate directly from the local file system. This is fine during development, but doing so in production may be insecure. Instead use a Key Management Service (KMS) or a hardware security module (HSM) to access the certificate and key; for example, as shown in the [C2PA Python Example](https://github.com/contentauth/c2pa-python-example).
+
+Use a `Builder` to add a manifest to an asset:
+
+```py
+try:
+    # Define a function to sign the claim bytes
+    # In this case we are using a pre-defined sign_ps256 method, passing in the path to our private key
+    # Normally this key would be kept safe in some other location
+    def private_sign(data: bytes) -> bytes:
+        return sign_ps256(data, "tests/fixtures/ps256.pem")
+
+    # read our public certs into memory
+    certs = open(data_dir + "ps256.pub", "rb").read()  # data_dir is assumed to be defined, e.g. "tests/fixtures/"
+
+    # Create a signer from the private signer, certs and a time stamp service url
+    signer = create_signer(private_sign, SigningAlg.PS256, certs, "http://timestamp.digicert.com")
+
+    # Create a builder, then add a thumbnail resource and an ingredient file.
+    builder = Builder(manifest_json)
+
+    # The uri provided here "thumbnail" must match an identifier in the manifest definition.
+    builder.add_resource_file("thumbnail", "tests/fixtures/A_thumbnail.jpg")
+
+    # Define an ingredient, in this case a parent ingredient named A.jpg, with a thumbnail
+    ingredient_json = {
+        "title": "A.jpg",
+        "relationship": "parentOf", # "parentOf", "componentOf" or "inputTo"
+        "thumbnail": {
+            "identifier": "thumbnail",
+            "format": "image/jpeg"
+        }
+    }
+
+    # Add the ingredient to the builder loading information from a source file.
+    builder.add_ingredient_file(ingredient_json, "tests/fixtures/A.jpg")
+
+    # At this point we could archive or unarchive our Builder to continue later.
+    # In this example we use an in-memory byte stream for the archive.
+    # all ingredients and resources will be saved in the archive
+    archive = io.BytesIO(bytearray())  # requires "import io"
+    builder.to_archive(archive)
+    archive.seek(0)  # rewind to the start before reading the archive back
+    builder = builder.from_archive(archive)
+
+    # Sign and add our manifest to a source file, writing it to an output file.
+    # This returns the binary manifest data that could be uploaded to cloud storage.
+    c2pa_data = builder.sign_file(signer, "tests/fixtures/A.jpg", "target/out.jpg")
+
+except Exception as err:
+    print(err)
+```
+
+## Stream-based operation
+
+Instead of working with files, you can read, validate, and add a signed manifest to streamed data. This example code does the same thing as the file-based example.
+
+### Read and validate C2PA data
+
+```py
+try:
+    # It's also possible to create a reader from a format and stream
+    # This reader is functionally equivalent to one created from a file path
+    stream = open("path/to/media_file.jpg", "rb")
+    reader = Reader("image/jpeg", stream)
+
+    # Print the JSON for the manifest store.
+    print("manifest store:", reader.json())
+
+    # Get the active manifest.
+    manifest = reader.get_active_manifest()
+    if manifest is not None:
+
+        # get the uri to the manifest's thumbnail and write it to a file
+        uri = manifest["thumbnail"]["identifier"]
+        reader.resource_to_file(uri, "thumbnail_v2.jpg")
+
+except Exception as err:
+    print(err)
+```
+
+### Add a signed manifest to a stream
+
+**WARNING**: This example accesses the private key and security certificate directly from the local file system. This is fine during development, but doing so in production may be insecure. Instead use a Key Management Service (KMS) or a hardware security module (HSM) to access the certificate and key; for example, as shown in the [C2PA Python Example](https://github.com/contentauth/c2pa-python-example).
+
+Use a `Builder` to add a manifest to an asset:
+
+```py
+try:
+    # Define a function to sign the claim bytes
+    # In this case we are using a pre-defined sign_ps256 method, passing in the path to our private key
+    # Normally this key would be kept safe in some other location
+    def private_sign(data: bytes) -> bytes:
+        return sign_ps256(data, "tests/fixtures/ps256.pem")
+
+    # read our public certs into memory
+    certs = open(data_dir + "ps256.pub", "rb").read()  # data_dir is assumed to be defined, e.g. "tests/fixtures/"
+
+    # Create a signer from the private signer, certs and a time stamp service url
+    signer = create_signer(private_sign, SigningAlg.PS256, certs, "http://timestamp.digicert.com")
+
+    # Create a builder, then add a thumbnail resource and an ingredient file.
+    builder = Builder(manifest_json)
+
+    # Add the resource from a stream
+    a_thumbnail_jpg_stream = open("tests/fixtures/A_thumbnail.jpg", "rb")
+    builder.add_resource("image/jpeg", a_thumbnail_jpg_stream)
+
+    # Define an ingredient, in this case a parent ingredient named A.jpg, with a thumbnail
+    ingredient_json = {
+        "title": "A.jpg",
+        "relationship": "parentOf", # "parentOf", "componentOf" or "inputTo"
+        "thumbnail": {
+            "identifier": "thumbnail",
+            "format": "image/jpeg"
+        }
+    }
+
+    # Add the ingredient from a stream
+    a_jpg_stream = open("tests/fixtures/A.jpg", "rb")
+    builder.add_ingredient("image/jpeg", a_jpg_stream)
+
+    # At this point we could archive or unarchive our Builder to continue later.
+    # In this example we use an in-memory byte stream for the archive.
+    # all ingredients and resources will be saved in the archive
+    archive = io.BytesIO(bytearray())  # requires "import io"
+    builder.to_archive(archive)
+    archive.seek(0)  # rewind to the start before reading the archive back
+    builder = builder.from_archive(archive)
+
+    # Sign the builder with a stream and output it to a stream
+    # This returns the binary manifest data that could be uploaded to cloud storage.
+    input_stream = open("tests/fixtures/A.jpg", "rb")
+    output_stream = open("target/out.jpg", "wb")
+    c2pa_data = builder.sign(signer, "image/jpeg", input_stream, output_stream)
+
+except Exception as err:
+    print(err)
+```

From 7ff05a6c029ee7e01b929353af17fde65be7a336 Mon Sep 17 00:00:00 2001
From: Rand McKinney
Date: Mon, 25 Nov 2024 18:27:51 -0800
Subject: [PATCH 6/7] Change intro wording for consistency

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 3b24dc2..5945638 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# C2PA Python
+# C2PA Python library
 
-This package implements Python bindings for the Content Authenticity Initiative (CAI) SDK.
-It enables you to read and validate C2PA manifest data from and add signed manifests to media files in the [supported formats](https://github.com/contentauth/c2pa-rs/blob/main/docs/supported-formats.md).
+The [c2pa-python](https://github.com/contentauth/c2pa-python) repository implements Python bindings for the Content Authenticity Initiative (CAI) SDK.
+It enables you to read and validate C2PA manifest data from and add signed manifests to media files in supported formats.
 
 **NOTE**: Starting with version 0.5.0, this package has a completely different API from version 0.4.0. See [Release notes](docs/release-notes.md) for more information.
 
From c8afc12af6dc0730090abcf81f05c86188b8662d Mon Sep 17 00:00:00 2001
From: Rand McKinney
Date: Tue, 26 Nov 2024 15:34:45 -0800
Subject: [PATCH 7/7] Updates to testing from #70

---
 docs/project-contributions.md | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/docs/project-contributions.md b/docs/project-contributions.md
index 94df240..d36d2ff 100644
--- a/docs/project-contributions.md
+++ b/docs/project-contributions.md
@@ -67,7 +67,7 @@ auditwheel repair target/wheels/c2pa_python-0.4.0-py3-none-linux_aarch64.whl
 
 ## Testing
 
-We use [PyTest](https://docs.pytest.org/) for testing.
+We use [PyTest](https://docs.pytest.org/) and [unittest](https://docs.python.org/3/library/unittest.html) for testing.
 
 Run tests by following these steps:
 
@@ -85,3 +85,21 @@ maturin develop
 python3 tests/training.py
 deactivate
 ```
+
+### Testing during bindings development
+
+While developing bindings locally, we use [unittest](https://docs.python.org/3/library/unittest.html), since [PyTest](https://docs.pytest.org/) can get confused by virtual environment re-deployments (especially if you bump the version number).
+
+To run tests while developing bindings, enter this command:
+
+```sh
+make test
+```
+
+To rebuild and test, enter these commands:
+
+```sh
+make build-python
+make test
+```
+
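
Reviewer note on the `unittest` workflow added in patch 7/7: the Makefile is not part of this series, so what `make test` actually runs is not shown here. As a minimal sketch, assuming it ends up running `unittest` test cases, a test of that style could look like the following. It only checks the manifest definition from `docs/usage.md` and does not call the c2pa bindings; `build_manifest_json` and `ManifestDefinitionTest` are hypothetical names used for illustration, not part of the library.

```python
import json
import unittest

# Manifest definition copied from docs/usage.md in this patch series.
MANIFEST = {
    "claim_generator": "python_test/0.1",
    "assertions": [
        {
            "label": "c2pa.training-mining",
            "data": {
                "entries": {
                    "c2pa.ai_generative_training": {"use": "notAllowed"},
                    "c2pa.ai_inference": {"use": "notAllowed"},
                    "c2pa.ai_training": {"use": "notAllowed"},
                    "c2pa.data_mining": {"use": "notAllowed"},
                }
            },
        }
    ],
}


def build_manifest_json() -> str:
    """Hypothetical helper: serialize the manifest definition to a JSON string."""
    return json.dumps(MANIFEST)


class ManifestDefinitionTest(unittest.TestCase):
    """Hypothetical test case; it does not exercise the c2pa bindings."""

    def test_round_trip(self):
        # The JSON string must parse back to the same definition.
        self.assertEqual(json.loads(build_manifest_json()), MANIFEST)

    def test_training_entries_not_allowed(self):
        # Every training-mining entry in the example is marked "notAllowed".
        parsed = json.loads(build_manifest_json())
        entries = parsed["assertions"][0]["data"]["entries"]
        self.assertEqual(len(entries), 4)
        for name, entry in entries.items():
            self.assertEqual(entry["use"], "notAllowed", name)
```

Saved under the `tests` folder (for example as a hypothetical `tests/test_manifest_definition.py`), a file like this can be run directly with `python3 -m unittest tests/test_manifest_definition.py`, which avoids depending on PyTest's view of the virtual environment.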