Skip to content

Commit

Permalink
Merge pull request #35616 from github/repo-sync
Browse files Browse the repository at this point in the history
Repo sync
  • Loading branch information
docs-bot authored Dec 10, 2024
2 parents fc00ea2 + 3e83ae4 commit 14a8f7f
Show file tree
Hide file tree
Showing 43 changed files with 159 additions and 2,871 deletions.
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Original file line number Diff line number Diff line change
Expand Up @@ -110,6 +110,10 @@ For more information, see:
* "[AUTOTITLE](/organizations/managing-organization-settings/configuring-the-retention-period-for-github-actions-artifacts-and-logs-in-your-organization)"
* "[AUTOTITLE](/admin/policies/enforcing-policies-for-your-enterprise/enforcing-policies-for-github-actions-in-your-enterprise#enforcing-a-policy-for-artifact-and-log-retention-in-your-enterprise)"

## Workflow run history retention policy

The workflow runs in a repository's workflow run history are retained for 400 days. After 400 days, workflow runs are archived. 10 days after archival, they are permanently deleted. The retention period for workflow runs cannot be modified. For more information, see "[AUTOTITLE](/actions/monitoring-and-troubleshooting-workflows/monitoring-workflows/viewing-workflow-run-history)."

## Disabling or limiting {% data variables.product.prodname_actions %} for your repository or organization

{% data reusables.actions.disabling-github-actions %}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,7 @@ To configure the OIDC identity provider in GCP, you will need to perform the fol

Additional guidance for configuring the identity provider:

* For security hardening, make sure you've reviewed "[AUTOTITLE](/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#configuring-the-oidc-trust-with-the-cloud)." For an example, see "[AUTOTITLE](/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#configuring-the-subject-in-your-cloud-provider)."
* For security hardening, make sure you've reviewed "[Configuring the OIDC trust with the cloud](/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#configuring-the-oidc-trust-with-the-cloud)." For an example, see "[Configuring the subject in your cloud provider](/actions/deployment/security-hardening-your-deployments/about-security-hardening-with-openid-connect#configuring-the-subject-in-your-cloud-provider)."
* For the service account to be available for configuration, it needs to be assigned to the `roles/iam.workloadIdentityUser` role. For more information, see [the GCP documentation](https://cloud.google.com/iam/docs/workload-identity-federation?_ga=2.114275588.-285296507.1634918453#conditions).
* The Issuer URL to use: {% ifversion ghes %}`https://HOSTNAME/_services/token`{% else %}`https://token.actions.githubusercontent.com`{% endif %}

Expand Down
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: Removing sensitive data from a repository
intro: 'If you commit sensitive data into a Git repository, you can remove it from the history.'
intro: 'Sensitive data can be removed from the history of a repository _if_ you can carefully coordinate with everyone who has cloned it and you are willing to manage the side effects.'
redirect_from:
- /remove-sensitive-data
- /removing-sensitive-data
Expand All @@ -20,15 +20,33 @@ shortTitle: Remove sensitive data

## About removing sensitive data from a repository

When altering your repository's history using tools like `git filter-repo`, it's crucial to understand the implications, especially regarding open pull requests and sensitive data.
When altering your repository's history using tools like `git filter-repo`, it's crucial to understand the implications. Rewriting history requires careful coordination with collaborators to successfully execute, and has a number of side effects that must be managed.

The `git filter-repo` tool rewrites your repository's history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.
It is important to note that if the sensitive data you need to remove is a secret (e.g. password/token/credential), as is often the case, then as a first step you need to revoke and/or rotate that secret. Once the secret is revoked or rotated, it can no longer be used for access, and that may be sufficient to solve your problem. Going through the extra steps to rewrite the history and remove the secret may not be warranted.

You can remove the file from the latest commit with `git rm`. For information on removing a file that was added with the latest commit, see "[AUTOTITLE](/repositories/working-with-files/managing-large-files/about-large-files-on-github#removing-files-from-a-repositorys-history)."
## Side effects of rewriting history

There are numerous side effects to rewriting history; these include:

* High risk of recontamination: It is unfortunately easy to re-push the sensitive data to the repository and make a bigger mess. If a fellow developer has a clone from before your rewrite, and after your rewrite simply runs `git pull` followed by `git push`, the sensitive data will return. They need to either discard their clone and re-clone, or carefully walk through multiple steps to clean up their clone first.
* Risk of losing other developers' work: If other developers continue updating branches which contain the sensitive data while you are trying to clean up, you will be forced to either redo the cleanup, or to discard their work.
* Changed commit hashes: Rewriting history will change the hashes of the commits that introduced the sensitive data _and_ all commits that came after. Any tooling or automation that depends on commit hashes not changing will be broken or have problems.
* Branch protection challenges: If you have any branch protections that prevent force pushes, those protections will have to be turned off (at least temporarily) for the sensitive data to be removed.
* Broken diff view for closed pull requests: Removing the sensitive data will require removing the internal references used for displaying the diff view in pull requests, so you will no longer be able to see these diffs. This is true not only for the PR that introduced the sensitive data, but any PR that builds on a version of history after the sensitive data PR was merged (even if those later PRs didn't add or modify any file with sensitive data).
* Poor interaction with open pull requests: Changed commit SHAs will result in a different PR diff, and comments on the old PR diff may become invalidated and lost, which may cause confusion for authors and reviewers. We recommend merging or closing all open pull requests before removing files from your repository.
* Lost signatures on commits and tags: Signatures for commits or tags depend on commit hashes; since commit hashes are modified by history rewrites, signatures would no longer be valid and many history rewriting tools (including `git filter-repo`) will simply remove the signatures. In fact, `git filter-repo` will remove commit signatures and tag signatures for commits that pre-date the sensitive data removal as well. (Technically one can workaround this with the `--refs` option to `git filter-repo` if needed, but then you will need to be careful to ensure you specify all refs that have sensitive data in their history and that include the commits that introduced the sensitive data in your range).
* Leading others directly to the sensitive data: Git was designed with cryptographic checks built into commit identifiers so that nefarious individuals could not break into a server and modify history without being noticed. That's helpful from a security perspective, but from a sensitive data perspective it means that expunging sensitive data is a very involved process of coordination; it further means that when you do modify history, clueful users with an existing clone will notice the history divergence and can use it to quickly and easily find the sensitive data still in their clone that you removed from the central repository.

## About sensitive data exposure

This article tells you how to make commits with sensitive data unreachable from any branches or tags in your repository on {% data variables.location.product_location %}. However, those commits may still be accessible elsewhere:
Removing sensitive data from a repository involves four high-level steps:

* Rewrite the repository locally, using git-filter-repo
* Update the repository on GitHub, using your locally rewritten history
* Coordinate with colleagues to clean up other clones that exist
* Prevent repeats and avoid future sensitive data spills

If you only rewrite your history and force push it, the commits with sensitive data may still be accessible elsewhere:

* In any clones or forks of your repository
* Directly via their SHA-1 hashes in cached views on {% data variables.product.product_name %}
Expand All @@ -42,8 +60,6 @@ You cannot remove sensitive data from other users' clones of your repository, bu
{% endif %}

Once you have pushed a commit to {% data variables.product.product_name %}, you should consider any sensitive data in the commit compromised. If you have committed a password, you should change it. If you have committed a key, generate a new one.

If the commit that introduced the sensitive data exists in any forks, it will continue to be accessible there. You will need to coordinate with the owners of the forks, asking them to remove the sensitive data or delete the fork entirely. {% ifversion fpt or ghec %}{% data variables.product.company_short %} cannot provide contact information for these owners. {% endif %}

Consider these limitations and challenges in your decision to rewrite your repository's history.
Expand Down Expand Up @@ -103,24 +119,14 @@ To illustrate how `git filter-repo` works, we'll show you how to remove your fil

> [!IMPORTANT] If the file with sensitive data used to exist at any other paths (because it was moved or renamed), you must run this command on those paths, as well.

1. Add your file with sensitive data to `.gitignore` to ensure that you don't accidentally commit it again.
```shell
$ echo "YOUR-FILE-WITH-SENSITIVE-DATA" >> .gitignore
$ git add .gitignore
$ git commit -m "Add YOUR-FILE-WITH-SENSITIVE-DATA to .gitignore"
> [main 051452f] Add YOUR-FILE-WITH-SENSITIVE-DATA to .gitignore
> 1 files changed, 1 insertions(+), 0 deletions(-)
```
1. Double-check that you've removed everything you wanted to from your repository's history, and that all of your branches are checked out.
1. Double-check that you've removed everything you wanted to from your repository's history.
1. The `git filter-repo` tool will automatically remove your configured remotes. Use the `git remote set-url` command to restore your remotes, replacing `OWNER` and `REPO` with your repository details. For more information, see "[AUTOTITLE](/get-started/getting-started-with-git/managing-remote-repositories#adding-a-remote-repository)."

```shell
git remote add origin https://github.com/OWNER/REPOSITORY.git
```

1. Once you're happy with the state of your repository, and you have set the appropriate remote, force-push your local changes to overwrite your repository on {% data variables.location.product_location %}, as well as all the branches you've pushed up. A force push is required to remove sensitive data from your commit history.
1. Once you're happy with the state of your repository, and you have set the appropriate remote, force-push your local changes to overwrite your repository on {% data variables.location.product_location %}. A force push is required to remove sensitive data from your commit history.
```shell
$ git push origin --force --all
Expand Down Expand Up @@ -199,7 +205,7 @@ If references are found in any forks, the results will look similar, but will st
ghe-nwo NWO
```

The same procedure using `git filter-repo` can be used to remove the sensitive data from the repository's forks. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
The sensitive data can be removed from a repository's forks by going to a clone of one, fetching from the cleaned up repository, then rebasing all branches and tags that contain the sensitive data on top of the relevant branch or tag from the cleaned up repository. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
Once you have removed the commit's references, re-run the commands to double-check.

Expand All @@ -217,8 +223,11 @@ Once garbage collection has successfully removed the commit, you'll want to brow

Preventing contributors from making accidental commits can help you prevent sensitive information from being exposed. For more information see "[AUTOTITLE](/code-security/getting-started/best-practices-for-preventing-data-leaks-in-your-organization)."

There are a few simple tricks to avoid committing things you don't want committed:
There are a few things you can do to avoid committing or pushing things that should not be shared:

* If the sensitive data is likely to be found in a file that should not be tracked by git, add that filename to `.gitignore` (and make sure to commit and push that change to `.gitignore` so other developers are protected).
* Avoid hardcoding secrets in code. Use environment variables, or secret management services like Azure Key Vault, AWS Secrets Manager, or HashiCorp Vault to manage and inject secrets at runtime.
* Create a pre-commit hook to check for sensitive data before it is committed or pushed anywhere, or use a well-known tool in a pre-commit hook like git-secrets or gitleaks. (Make sure to ask each collaborator to set up the pre-commit hook you have chosen.)
* Use a visual program like [{% data variables.product.prodname_desktop %}](https://desktop.github.com/) or [gitk](https://git-scm.com/docs/gitk) to commit changes. Visual programs generally make it easier to see exactly which files will be added, deleted, and modified with each commit.
* Avoid the catch-all commands `git add .` and `git commit -a` on the command line—use `git add filename` and `git rm filename` to individually stage files, instead.
* Use `git add --interactive` to individually review and stage changes within each file.
Expand All @@ -229,4 +238,4 @@ There are a few simple tricks to avoid committing things you don't want committe
* [`git filter-repo` man page](https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html)
* [Pro Git: Git Tools - Rewriting History](https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History)
* "[AUTOTITLE](/code-security/secret-scanning/introduction/about-secret-scanning)"
* "[AUTOTITLE](/code-security/secret-scanning/introduction/about-secret-scanning)"
Original file line number Diff line number Diff line change
Expand Up @@ -24,13 +24,12 @@ shortTitle: Add a GPG key

To sign commits associated with your account on {% data variables.product.product_name %}, you can add a public GPG key to your personal account. Before you add a key, you should check for existing keys. If you don't find any existing keys, you can generate and copy a new key. For more information, see "[AUTOTITLE](/authentication/managing-commit-signature-verification/checking-for-existing-gpg-keys)" and "[AUTOTITLE](/authentication/managing-commit-signature-verification/generating-a-new-gpg-key)."

You can add multiple public keys to your account on {% data variables.product.product_name %}. Commits signed by any of the corresponding private keys will show as verified. If you remove a public key, any commits signed by the corresponding private key will no longer show as verified.
You can add multiple public keys to your account on {% data variables.product.product_name %}. Commits signed by any of the corresponding private keys will show as verified. {% ifversion persistent-commit-verification %}Once a commit has been verified, any commits signed by the corresponding private key will continue to show as verified, even if the public key is removed.{% else %}If you remove a public key, any commits signed by the corresponding private key will no longer show as verified.{% endif %}

{% ifversion upload-expired-or-revoked-gpg-key %}
To verify as many of your commits as possible, you can add expired and revoked keys. If the key meets all other verification requirements, commits that were previously signed by any of the corresponding private keys will show as verified and indicate that their signing key is expired or revoked.
![Screenshot of a list of commits. One commit is marked with a "Verified" label. Next to the label, a dropdown explains that the commit was signed and shows a timestamp of when it was signed.](/assets/images/help/settings/verified-persistent-commit.png)

![Screenshot of a list of commits. One commit is marked with a "Verified" label. Below the label, a dropdown explains that the commit was signed, but the key has now expired.](/assets/images/help/settings/gpg-verified-with-expired-key.png)
{% endif %}
{% ifversion upload-expired-or-revoked-gpg-key %}
To verify as many of your commits as possible, you can add expired and revoked keys. If the key meets all other verification requirements, commits that were previously signed by any of the corresponding private keys will show as verified and indicate that their signing key is expired or revoked.{% endif %}

{% data reusables.gpg.supported-gpg-key-algorithms %}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ You can view the usage of your {% data variables.enterprise.enterprise_or_org %}

>[!NOTE] The usage graph is configured to represent the start of the month to the end of the month, not your specific billing period.
1. To request a CSV usage report, select **Get usage report** in the upper-right corner of the page.
1. To request a CSV usage report, select **Get usage report** in the upper-right corner of the page. You can choose a pre-selected option or use the Custom range option to specify a date range of up to 31 days.

## Viewing license usage

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -91,8 +91,8 @@ version 2.1.0 (this is the default version of SARIF used by CodeQL).
By default, the CLI will wait for GitHub to process the SARIF file for a
maximum of 2 minutes, returning a non-zero exit code if there were any
errors during processing of the analysis results. You can customize how
long the CLI will wait with `--wait-for-processing-timeout`, or
disable the feature with `--no-wait-for-processing`.
long the CLI will wait with `--wait-for-processing-timeout`, or disable
the feature with `--no-wait-for-processing`.

#### `--wait-for-processing-timeout=<waitForProcessingTimeout>`

Expand Down
Loading

0 comments on commit 14a8f7f

Please sign in to comment.