feat: Provide an action to redeliver failed webhook deliveries (#40)

* WIP checkin public interface
* change public interface to use an action
* simplify public interface
* move logic to the workload
* fix charmcraft.yaml
* use paas-charm
* fix wrong script name
* lint
* revert tox.ini changes
* change since argument to take seconds
* revert interface to redeliver
* WIP checkin
* checkin high-level impl
* use private function to abstract github api interaction away
* impl private function to redeliver
* update perms
* add support for org webhooks
* add exception handling
* add arg parsing
* add script to coverage check
* add first version of charm action
* only forward webhooks with action type queued
* only redeliver webhooks with workflow_job event
* use constants from outer module
* checkin auth details parsing
* add -id suffix to secret parms
* support str|int for app-id
* refactor a bit
* fix wrong webhook_origin for repo
* WIP checkin integration tests
* WIP checkin integration tests
* more integration tests
* fix workflow file and cancel jobs
* add non-secrets args to extra-arguments
* lint
* lint
* split modules in integration test
* use pytest_asyncio fixture
* rename unit test file
* bump juju version
* use client id
* restructure script
* use env vars for secrets in integration test
* lint
* lint
* ignore cve
* fix test_app
* pass over auth via env
* fix/add integration tests
* remove sys exit from private fcts
* lint and refactor integration tests
* ignore bandit warnings
* recover coverage and fix none_fields check
* provide better action failed message on argument parsing error
* refmt
* use operator-workflows from main
* move arg validation to workload
* remove action from required non-null fields
* add information for integration test arguments.
* pass pydantic validation error msg to github app auth details
* add note about juju 3.3
* add tox file for charm code
* re-enable self-hosted runners
* only set env vars if present
* _WebhookDeliveryAttempt -> _WebhookDeliveryAttempts
* lint
* inline function
* use edge runners
* use edge runners in integration tests
* move inline comment
* remove nesting
* lint
* ignore cve
* try rockcraft with latest/stable
* Revert "try rockcraft with latest/stable" This reverts commit 5b139d2.
* pin paas-charm 1.1.0
* update docs
* fix lint
canonical · Jan 6, 2025 · 6b58f01
1 parent d90c04c
Showing 23 changed files with 1,474 additions and 35 deletions.
diff --git a/.github/workflows/integration_test.yaml b/.github/workflows/integration_test.yaml
@@ -3,14 +3,20 @@ name: Integration tests
 on:
   pull_request:
 
+
 jobs:
   integration-tests:
+
     uses: canonical/operator-workflows/.github/workflows/integration_test.yaml@main
     secrets: inherit
+
     with:
-      juju-channel: 3.1/stable
+      juju-channel: 3.6/stable
       channel: 1.28-strict/stable
       trivy-image-config: "trivy.yaml"
-      self-hosted-runner: false
+      self-hosted-runner: true
+      self-hosted-runner-label: 'edge'
       rockcraft-channel: latest/edge
       charmcraft-channel: latest/edge
+      modules: '["test_app", "test_webhook_redelivery"]'
+      extra-arguments: --webhook-test-repository cbartz-org/gh-runner-test
diff --git a/.github/workflows/test.yaml b/.github/workflows/test.yaml
@@ -8,4 +8,5 @@ jobs:
     uses: canonical/operator-workflows/.github/workflows/test.yaml@main
     secrets: inherit
     with:
-      self-hosted-runner: false
+      self-hosted-runner: true
+      self-hosted-runner-label: 'edge'
diff --git a/.github/workflows/webhook_redelivery_test.yaml b/.github/workflows/webhook_redelivery_test.yaml
@@ -0,0 +1,15 @@
+name: Webhook Redelivery Test
+# This workflow is triggered by the integration test that covers webhook redelivery.
+# It does not need to be picked up by a runner; we only need to ensure that a webhook is triggered.
+
+on:
+  workflow_dispatch:
+
+
+jobs:
+  dispatch-job:
+    runs-on: ["self-hosted", "invalid-flavor"] # The job is not supposed to take a runner, therefore we use an invalid-flavor
+    steps:
+     - name: Hello world
+       run: |
+          echo "Hello, world"
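The workflow above only exists so that a test can trigger it and thereby cause GitHub to emit a `workflow_job` webhook. As a rough illustration of what "triggering" means, the sketch below builds (but does not send) a request against GitHub's REST endpoint for workflow dispatch events. The helper name `build_dispatch_request`, the `main` ref, and the placeholder token are assumptions for illustration; the actual integration test in this commit may well use a client library instead.

```python
import json
import urllib.request


def build_dispatch_request(
    repo: str, workflow_file: str, ref: str, token: str
) -> urllib.request.Request:
    """Build a workflow_dispatch request for the GitHub REST API (hypothetical helper).

    The request is only constructed here; pass it to urllib.request.urlopen to send it.
    """
    url = (
        f"https://api.github.com/repos/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    # The dispatch endpoint expects the git ref to run the workflow on.
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Assumed values: the repository comes from the integration test workflow above,
# the ref and token are placeholders.
request = build_dispatch_request(
    "cbartz-org/gh-runner-test", "webhook_redelivery_test.yaml", "main", "<your-token>"
)
print(request.get_method(), request.full_url)
```

Because the job requests the `invalid-flavor` label, no runner is expected to pick it up; the dispatch is only useful for the webhook it produces.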
diff --git a/.trivyignore b/.trivyignore
@@ -2,3 +2,5 @@
 CVE-2024-34156
 # Vulnerability in golang.org/x/crypto introduced by statsd-exporter. We have to wait until it is fixed upstream: https://github.com/prometheus/statsd_exporter/blob/master/go.mod
 CVE-2024-45337
+# Vulnerability in golang.org/x/net introduced by statsd-exporter. We have to wait until it is fixed upstream: https://github.com/prometheus/statsd_exporter/blob/master/go.mod
+CVE-2024-45338
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -9,6 +9,14 @@ tox devenv -e integration
 source venv/bin/activate
 ```
 
+## Repository structure
+
+The repository contains the charm code in the `charm` directory and the code for the workload
+in the root directory. The charm directory has been built using the
+[`paas-charm`](https://juju.is/docs/sdk/12-factor-app-charm) approach and then modified to support
+the specific actions of this charm.
+
+
 ## Generating src docs for every commit
 
 Run the following command:
@@ -24,13 +32,24 @@ This project uses `tox` for managing test environments. There are some pre-confi
 that can be used for linting and formatting code when you're preparing contributions to the charm:
 
 ```shell
-tox run -e format        # update your code according to linting rules
+tox run -e fmt           # update your code according to linting rules
 tox run -e lint          # code style
 tox run -e unit          # unit tests
 tox run -e integration   # integration tests
-tox                      # runs 'format', 'lint', and 'unit' environments
+tox                      # runs 'fmt', 'lint', and 'unit' environments
 ```
 
+The integration tests require additional parameters, which are defined in the `tests/conftest.py` file.
+Some of them have environment variable counterparts (see `tests/integration/conftest.py`).
+Setting these instead of passing arguments is more secure for sensitive data.
+
+There is also a `tox` root in the `charm` directory, which can be used to lint and format the charm code:
+
+```shell
+cd charm
+tox run -e fmt        # update your code according to linting rules
+tox run -e lint       # code style
+```
 
 ## Development server
 

diff --git a/README.md b/README.md
@@ -52,6 +52,16 @@ Change the webhook secret used for webhook validation:
 juju config github-runner-webhook-router webhook-secret=<your-secret>
 ```
 
+In an error scenario, you may want to redeliver failed webhook deliveries. You can use
+the `redeliver-failed-webhooks` action to do so. The following example redelivers
+deliveries that failed within the last minute for a webhook with ID `516986490`:
+
+```shell
+juju add-secret github-token token=<your-token> # the token needs webhook write permissions
+# output is: secret:ctik2gfmp25c7648t7j0
+juju run github-runner-webhook-router/0 redeliver-failed-webhooks github-path=canonical/github-runner-webhook-router webhook-id=516986490 since=60 github-token-secret-id=ctik2gfmp25c7648t7j0
+```
+
 ### Integrations
 
 The charm requires an integration with MongoDB (either the [machine](https://charmhub.io/mongodb)

diff --git a/charm/charmcraft.yaml b/charm/charmcraft.yaml
@@ -30,6 +30,61 @@ requires:
     limit: 1
 
 
+actions:
+  redeliver-failed-webhooks:
+    description: >-
+      Redeliver webhook deliveries that failed within a given time period. This action queries the
+      GitHub API for failed deliveries and triggers redelivery. Note that the number of
+      webhook deliveries that will be redelivered can be quite large and the requests are counted
+      against the rate limit of the GitHub API. The action returns the number of webhooks
+      that were redelivered.
+      Note that this action requires juju user secrets, which have been available since juju 3.3.
+    params:
+      since:
+        description: "The number of seconds to look back for failed deliveries."
+        type: integer
+      github-path:
+        description: >-
+          The path of the organisation or repository where the webhooks are registered. Should
+          be in the format of <organisation> or <organisation>/<repository>.
+        type: string
+      webhook-id:
+        description: "The id of the webhook to redeliver."
+        type: integer
+      github-app-client-id:
+        description: >-
+          The client ID of the GitHub App to use for communication with GitHub.
+          If provided, the other github-app-* params must also be provided.
+          The GitHub App needs to have write permission for Webhooks.
+          Either this or the github-token-secret-id must be provided.
+        type: string
+      github-app-installation-id:
+        description: >-
+          The app installation id of the GitHub App to use for communication with GitHub.
+          If provided, the other github-app-* params must also be provided.
+          The GitHub App needs to have write permission for Webhooks.
+          Either this or the github-token-secret-id must be provided.
+        type: integer
+      github-app-private-key-secret-id:
+        description: >-
+          The juju user secret id of the private key to use for communication with GitHub. The
+          key has to be provided in a field named 'private-key' in the secret.
+          If provided, the other github-app-* params must also be provided.
+          The GitHub App needs to have write permission for Webhooks.
+          Either this or the github-token-secret-id must be provided.
+        type: string
+      github-token-secret-id:
+        description: >-
+          The juju user secret id of the token to use for communication with GitHub. The
+          token has to be provided in a field named 'token' in the secret.
+          This can be a PAT with write admin:repo_hook or a fine-grained token with write permission for Webhooks.
+          Either this or the GitHub App configuration must be provided.
+        type: string
+    required:
+      - since
+      - github-path
+      - webhook-id
+
 config:
   options:
     default-flavour:
@@ -52,6 +107,7 @@ config:
             If a job matches multiple flavours, the first flavour matching defined in this configuration will be used.
             Note that labels are treated case-insensitive.
          required: true
+
     log-level:
       type: string
       description: "The log level to use for the application logs. Use any of: CRITICAL, ERROR, WARNING, INFO, DEBUG, NOTSET"
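The parameter descriptions above imply two rules: the `github-app-*` parameters must be given together, and either a token or the GitHub App configuration must be present. According to the commit message, the real validation lives in the workload script; the sketch below only illustrates the rules. The function name `validate_auth_params` is hypothetical, and rejecting the case where both methods are supplied is an assumption that the descriptions do not state explicitly.

```python
from typing import Optional


def validate_auth_params(
    token_secret_id: Optional[str],
    app_client_id: Optional[str],
    app_installation_id: Optional[int],
    app_private_key_secret_id: Optional[str],
) -> str:
    """Return "token" or "app", depending on which auth method is fully specified.

    Raises:
        ValueError: if the GitHub App parameters are incomplete, or if neither
            or both auth methods are given (the latter is an assumption).
    """
    app_params = (app_client_id, app_installation_id, app_private_key_secret_id)
    any_app = any(param is not None for param in app_params)
    all_app = all(param is not None for param in app_params)

    if any_app and not all_app:
        raise ValueError("All github-app-* parameters must be provided together.")
    if token_secret_id is not None and all_app:
        raise ValueError("Provide either a token or GitHub App parameters, not both.")
    if token_secret_id is None and not all_app:
        raise ValueError("Either a token or GitHub App parameters must be provided.")
    return "app" if all_app else "token"


print(validate_auth_params("ctik2gfmp25c7648t7j0", None, None, None))
```

Keeping `since`, `github-path`, and `webhook-id` in `required` while leaving the auth parameters optional means Juju itself cannot enforce the either/or rule, which is why a check like this has to happen in code.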

diff --git a/charm/requirements.txt b/charm/requirements.txt
@@ -1 +1 @@
-paas-app-charmer~=1.4.0
+paas-charm~=1.1.0
diff --git a/charm/src/charm.py b/charm/src/charm.py
@@ -3,18 +3,39 @@
 #  See LICENSE file for licensing details.
 
 """Flask Charm entrypoint."""
-
+import json
 import logging
 import typing
 
 import ops
 
-import paas_app_charmer.flask
+# we don't have the types for paas_charm.flask
+import paas_charm.flask  # type: ignore
+from ops import ActionEvent
+from ops.pebble import ExecError
 
 logger = logging.getLogger(__name__)
 
 
-class FlaskCharm(paas_app_charmer.flask.Charm):
+SCRIPT_ARG_PARSE_ERROR_EXIT_CODE = 1
+SINCE_PARAM_NAME = "since"
+GITHUB_PATH_PARAM_NAME = "github-path"
+WEBHOOK_ID_PARAM_NAME = "webhook-id"
+GITHUB_TOKEN_SECRET_ID_PARAM_NAME = "github-token-secret-id"
+GITHUB_APP_CLIENT_ID_PARAM_NAME = "github-app-client-id"
+GITHUB_APP_INSTALLATION_ID_PARAM_NAME = "github-app-installation-id"
+GITHUB_APP_PRIVATE_KEY_SECRET_ID_PARAM_NAME = "github-app-private-key-secret-id"
+GITHUB_TOKEN_ENV_NAME = "GITHUB_TOKEN"
+GITHUB_APP_CLIENT_ID_ENV_NAME = "GITHUB_APP_CLIENT_ID"
+GITHUB_APP_INSTALLATION_ID_ENV_NAME = "GITHUB_APP_INSTALLATION_ID"
+GITHUB_APP_PRIVATE_KEY_ENV_NAME = "GITHUB_APP_PRIVATE_KEY"
+
+
+class _ActionParamsInvalidError(Exception):
+    """Raised when the action parameters are invalid."""
+
+
+class FlaskCharm(paas_charm.flask.Charm):
     """Flask Charm service."""
 
     def __init__(self, *args: typing.Any) -> None:
@@ -24,6 +45,110 @@ def __init__(self, *args: typing.Any) -> None:
             args: passthrough to CharmBase.
         """
         super().__init__(*args)
+        self.framework.observe(
+            self.on.redeliver_failed_webhooks_action, self._on_redeliver_failed_webhooks_action
+        )
+
+    def _on_redeliver_failed_webhooks_action(self, event: ActionEvent) -> None:
+        """Redeliver failed webhooks since a given time."""
+        logger.info("Redelivering failed webhooks.")
+        container: ops.Container = self.unit.get_container("flask-app")
+        since_seconds = event.params[SINCE_PARAM_NAME]
+        github_path = event.params[GITHUB_PATH_PARAM_NAME]
+        webhook_id = event.params[WEBHOOK_ID_PARAM_NAME]
+
+        try:
+            auth_env = self._get_github_auth_env(event)
+        except _ActionParamsInvalidError as exc:
+            event.fail(f"Invalid action parameters passed: {exc}")
+            return
+        try:
+            stdout, _ = container.exec(
+                [
+                    "/usr/bin/python3",
+                    "/flask/app/webhook_redelivery.py",
+                    "--since",
+                    str(since_seconds),
+                    "--github-path",
+                    github_path,
+                    "--webhook-id",
+                    str(webhook_id),
+                ],
+                environment=auth_env,
+            ).wait_output()
+            logger.info("Got %s", stdout)
+            # only consider the last line as result
+            result = json.loads(stdout.rstrip().split("\n")[-1])
+            event.set_results(result)
+        except ExecError as exc:
+            logger.warning("Webhook redelivery failed, script reported: %s", exc.stderr)
+            if exc.exit_code == SCRIPT_ARG_PARSE_ERROR_EXIT_CODE:
+                event.fail(f"Argument parsing failed. {exc.stderr}")
+                return
+            event.fail("Webhooks redelivery failed. Look at the juju logs for more information.")
+
+    def _get_github_auth_env(self, event: ActionEvent) -> dict[str, str]:
+        """Get the GitHub auth environment variables from the action parameters.
+
+        Args:
+            event: The action event.
+
+        Returns:
+            The GitHub auth environment variables used by the script in the workload.
+        """
+        github_token_secret_id = event.params.get(GITHUB_TOKEN_SECRET_ID_PARAM_NAME)
+        github_app_client_id = event.params.get(GITHUB_APP_CLIENT_ID_PARAM_NAME)
+        github_app_installation_id = event.params.get(GITHUB_APP_INSTALLATION_ID_PARAM_NAME)
+        github_app_private_key_secret_id = event.params.get(
+            GITHUB_APP_PRIVATE_KEY_SECRET_ID_PARAM_NAME
+        )
+
+        github_token = (
+            self._get_secret_value(github_token_secret_id, "token")
+            if github_token_secret_id
+            else None
+        )
+        github_app_private_key = (
+            self._get_secret_value(github_app_private_key_secret_id, "private-key")
+            if github_app_private_key_secret_id
+            else None
+        )
+
+        env_vars = {
+            GITHUB_TOKEN_ENV_NAME: github_token,
+            GITHUB_APP_CLIENT_ID_ENV_NAME: github_app_client_id,
+            GITHUB_APP_INSTALLATION_ID_ENV_NAME: (
+                str(github_app_installation_id) if github_app_installation_id else None
+            ),
+            GITHUB_APP_PRIVATE_KEY_ENV_NAME: github_app_private_key,
+        }
+        return {k: v for k, v in env_vars.items() if v}
+
+    def _get_secret_value(self, secret_id: str, key: str) -> str:
+        """Get the value of a secret.
+
+        Args:
+            secret_id: The secret id.
+            key: The key of the secret value to extract.
+
+        Returns:
+            The secret value.
+
+        Raises:
+            _ActionParamsInvalidError: If the secret does not exist
+                or the key is not in the secret.
+        """
+        try:
+            secret = self.model.get_secret(id=secret_id)
+        except ops.model.ModelError as exc:
+            raise _ActionParamsInvalidError(f"Could not access/find secret {secret_id}") from exc
+        secret_data = secret.get_content()
+        try:
+            return secret_data[key]
+        except KeyError as exc:
+            raise _ActionParamsInvalidError(
+                f"Secret {secret_id} does not contain a field called '{key}'."
+            ) from exc
 
 
 if __name__ == "__main__":
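Two small conventions in the handler above are easy to miss: only the last line of the script's stdout is treated as the JSON result, and environment variables with empty values are dropped before calling `container.exec`. The following standalone sketch reproduces both steps outside the charm. The helper names and the `redelivered` key in the sample output are hypothetical; the diff does not show the script's actual output format.

```python
import json
from typing import Optional


def parse_script_result(stdout: str) -> dict:
    """Parse the last line of the script output as JSON, ignoring earlier log lines."""
    last_line = stdout.rstrip().split("\n")[-1]
    return json.loads(last_line)


def drop_unset(env_vars: dict[str, Optional[str]]) -> dict[str, str]:
    """Keep only the environment variables that have a non-empty value."""
    return {name: value for name, value in env_vars.items() if value}


# Assumed sample output: log lines followed by a single JSON result line.
sample_stdout = 'INFO fetching deliveries\n{"redelivered": 3}\n'
print(parse_script_result(sample_stdout))  # {'redelivered': 3}

print(drop_unset({"GITHUB_TOKEN": "abc", "GITHUB_APP_CLIENT_ID": None}))
# {'GITHUB_TOKEN': 'abc'}
```

Note that `json.loads` raises `json.JSONDecodeError` if the last line is not valid JSON; the handler in the diff catches only `ExecError`, so a malformed result line would surface as an unhandled exception in the charm.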