Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing originalUriBaseIds property from the SARIF specification #35043

Merged
merged 5 commits into from
Nov 7, 2024

Conversation

marcandre-larochelle
Copy link
Contributor

@marcandre-larochelle marcandre-larochelle commented Oct 22, 2024

The SARIF format specified in the documentation is missing a field allowing resolution of the references for SARIF consumers (which is invalid):

SARIF Producer file scheme instructions (see Step 6 of when the URI is split)

SARIF producers SHALL create "file" scheme URIs by means of the following procedure or any
procedure with the same result:

1. In the case of a direct producer, preserve the file system’s casing, even if the file system is caseinsensitive. In the case of a converter (which might not know the file system’s casing), preserve
the casing specified in the analysis tool’s native output file.

2. Remove "." path segments.

3. Remove empty path segments.

4. If the path contains ".." path segments, then in the case of a direct producer, resolve the path
to a canonical absolute path, using an appropriate algorithm for the operating system on which
the tool ran.
   NOTE 1: This is necessary because, for example, the path /d1/../f naively converted
   to a URI is file:///d1/../f, which resolves to file:///f according to the URI
   standard [RFC3986]. But if /d1 is a symbolic link to the directory d2/d3, then the correct
   URI is file:///d2/f.
   
   NOTE 2: ".." path segments are dangerous because the semantics of the file system on
   which the SARIF log file was produced might not match the semantics of the file system
   on which it is consumed. For example, the presence of a symbolic link in the path might
   redirect the consumer to an unpredictable location.

5. Create a URI from the resulting path.

6. Optionally, divide the resulting URI into a base URI and a relative URI (preserving case in both
parts), and create an entry for the base URI in theRun.originalUriBaseIds (§3.14.14).
   NOTE 3: URI and path manipulation are complex topics. Many operating systems,
languages, and frameworks provide methods to perform these operations, which is
preferable to having every SARIF producer reimplement them. For example, in C#, the
operation can be performed as follows

And related to the originalUriBaseIds the property:

A run object MAY contain a property named originalUriBaseIds whose value is an object (§3.6)
each of whose property names designates a URI base id (§3.4.4) and each of whose property values is
an artifactLocation object (§3.4) that specifies (in the manner described below) the absolute URI
[RFC3986] of that URI base id on the machine where the SARIF producer ran.

If the artifactLocation object’s uri property (§3.4.3) is a relative reference, its uriBaseId property
(§3.4.4) SHALL be present. Otherwise (that is, if uri is an absolute URI, or if it is absent), uriBaseId
SHALL be absent.

Why:

As per the SARIF specification:

This property allows SARIF consumers to resolve any relative references which appear in any
artifactLocation objects elsewhere in the run, as long as the consumer runs either on the same
machine as the producer, or on a machine with an identical file system layout. This is useful for individual
developers who wish to run analysis tools and examine the results in a viewer. It is also useful for teams
which share a convention for their file system layout.

A SARIF consumer SHALL use the following procedure to resolve a URI base id from the information in
originalUriBaseIds:

NOTE 3: This procedure is part of an overall URI base id resolution procedure described
in §3.4.4.
NOTE 4: In this procedure, we refer to the resolved URI value by the variable name
resolvedUri.

1. Set resolvedUri to an empty string.

2. Fetch the artifactLocation object whose property name within originalUriBaseIds is
the value of uriBaseId. If there is no such property, the resolution procedure fails.

3. Prepend artifactLocation.uri to resolvedUri.

4. If artifactLocation.uri is an absolute URI, resolvedUri is the final resolved URI, and the
procedure succeeds.

Otherwise:

5. If uriBaseId is absent, the resolution procedure fails.
    NOTE 3: This would not occur in a valid SARIF file, but the file might not be valid.
  
6. If the value of uriBaseId has already been encountered during this resolution procedure (that
is, if there is a loop in the sequence of URI base ids), the resolution procedure fails.
    NOTE 4: This would not occur in a valid SARIF file, but the file might not be valid.
  
7. Otherwise (that is, if uriBaseId is present and its value has not previously been encountered
during this resolution), return to Step 2.

Multiple vendors are following GitHub's documentation here and omitting the originalUriBaseIds during their implementation, making it impossible to perform the above mentioned relative references resolution for other SARIF consumers. This is to correct this missing property.

What's being changed (if available, include any code snippets, screenshots, or gifs):

Added the property originalUriBaseIds with the example data from the SARIF specification.

Check off the following:

  • I have reviewed my changes in staging, available via the View deployment link in this PR's timeline (this link will be available after opening the PR).

    • For content changes, you will also see an automatically generated comment with links directly to pages you've modified. The comment won't appear if your PR only edits files in the data directory.
  • For content changes, I have completed the self-review checklist.

Fix invalid SARIF specification, missing originalUriBaseIds from the format
Copy link

welcome bot commented Oct 22, 2024

Thanks for opening this pull request! A GitHub docs team member should be by to give feedback soon. In the meantime, please check out the contributing guidelines.

@github-actions github-actions bot added the triage Do not begin working on this issue until triaged by the team label Oct 22, 2024
Copy link
Contributor

github-actions bot commented Oct 22, 2024

Automatically generated comment ℹ️

This comment is automatically generated and will be overwritten every time changes are committed to this branch.

The table contains an overview of files in the content directory that have been changed in this pull request. It's provided to make it easy to review your changes on the staging site. Please note that changes to the data directory will not show up in this table.


Content directory changes

You may find it useful to copy this table into the pull request summary. There you can edit it to share links to important articles or changes and to give a high-level overview of how the changes in your pull request support the overall goals of the pull request.

Source Preview Production What Changed
code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning.md fpt
ghec
ghes@ 3.14 3.13 3.12 3.11 3.10
fpt
ghec
ghes@ 3.14 3.13 3.12 3.11 3.10

fpt: Free, Pro, Team
ghec: GitHub Enterprise Cloud
ghes: GitHub Enterprise Server

@nguyenalex836 nguyenalex836 added content This issue or pull request belongs to the Docs Content team waiting for review Issue/PR is waiting for a writer's review code security Content related to code security and removed triage Do not begin working on this issue until triaged by the team labels Oct 22, 2024
@nguyenalex836
Copy link
Contributor

@marcandre-larochelle-bell Thanks so much for opening a PR! I'll get this triaged for review ✨

@felicitymay felicitymay added needs SME This proposal needs review from a subject matter expert and removed waiting for review Issue/PR is waiting for a writer's review labels Oct 23, 2024
Copy link
Contributor

Thanks for opening a pull request! We've triaged this issue for technical review by a subject matter expert 👀

@Kamran21202

This comment was marked as spam.

@marcandre-larochelle
Copy link
Contributor Author

@nguyenalex836 I think a bot / fake account is trying to sneak-in some participation on this MR (check the empty comment above)

@nguyenalex836
Copy link
Contributor

@marcandre-larochelle-bell They've been blocked 💛

@nguyenalex836
Copy link
Contributor

@marcandre-larochelle-bell Thank you for your patience while our SME team reviewed! They shared the following upon review

I had a quick read through the PR description and changes.

  • I understand the submitter is asking from the perspective of another SARIF consumer (GitHub code scanning is one such consumer), and asking us to require the originalUriBaseIds property in our guidance for SARIF producers.
  • I don't think that is a reasonable requirement for us to enforce. Code scanning does not require that property, and does not satisfy the conditions for a consumer that would need the property: "as long as the consumer runs either on the same machine as the producer, or on a machine with an identical file system layout."
    • The first statement is always false, as code scanning processes SARIF uploads within GitHub systems, never on the same machine that produced the analysis.
    • The second statement is likely to be false, as the GitHub machine running the code scanning service is not likely to match the filesystem layout of the machine that produced the analysis (though it is possible if both are Linux and follow a very simple common convention).
  • I would be open to adding this as optional guidance for SARIF producers. We should make clear the fact that code scanning does not (currently) use or require this property.

Per our SMEs input, let us know if you'd be willing to update your PR to add this as optional guidance for SARIF producers 💛

@nguyenalex836 nguyenalex836 added SME reviewed An SME has reviewed this issue/PR more-information-needed More information is needed to complete review and removed needs SME This proposal needs review from a subject matter expert labels Oct 29, 2024
@marcandre-larochelle
Copy link
Contributor Author

@nguyenalex836 I'm ok with switching it to an optional guidance for SARIF producers, how should I go about it? Should I keep the example inside the SARIF format and add a kind of "note" at the bottom?

@nguyenalex836
Copy link
Contributor

@marcandre-larochelle-bell Thank you! Adding a note sounds like a like a good option (here's a link to our style guide, for reference). Just in case, I'll loop in one of our technical writers @subatoi for a second opinion 💛

@subatoi
Copy link
Contributor

subatoi commented Oct 30, 2024

Hi @marcandre-larochelle-bell, would you be open to creating a short example under https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#sarif-output-file-examples?

This would be similar to https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#example-with-minimum-required-properties, containing the relevant fields you're proposing to add. It would also need some small explanatory text beforehand, indicating that it's optional.

@marcandre-larochelle
Copy link
Contributor Author

Hi @subatoi, yes definitely, should I duplicate the https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#example-with-minimum-required-properties and add the relevant field (the one in the current PR) with a explanatory text or I can add an explanatory text to the existing https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#example-with-minimum-required-properties (but with the added field)?

@subatoi
Copy link
Contributor

subatoi commented Oct 31, 2024

Hi 👋

Let's go with this option 😄

duplicate the https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#example-with-minimum-required-properties and add the relevant field (the one in the current PR) with a explanatory text

I can help with an appropriate title once we have the section set up.

Thank you!

@marcandre-larochelle
Copy link
Contributor Author

Hi,

I attempted a draft of the new section with the field and note as per the comments.

Thanks in advance!

@subatoi
Copy link
Contributor

subatoi commented Oct 31, 2024

Thank you for your help ✨ ! I'll be able to review this next week, and I will circle back to you then. We are in any case in a deploy freeze right now.

@github-actions github-actions bot added the stale There is no recent activity on this issue or pull request label Nov 7, 2024
Copy link
Contributor

github-actions bot commented Nov 7, 2024

This PR has been automatically closed because there has been no response to to our request for more information from the original author. Please reach out if you have the information we requested, or open a new issue to describing your changes. Then we can begin the review process.

@github-actions github-actions bot closed this Nov 7, 2024
@marcandre-larochelle
Copy link
Contributor Author

It was mistakenly closed, I am awaiting review

@marcandre-larochelle
Copy link
Contributor Author

@subatoi can you re-open it? Thanks!

@subatoi subatoi reopened this Nov 7, 2024
@github-actions github-actions bot added triage Do not begin working on this issue until triaged by the team and removed stale There is no recent activity on this issue or pull request labels Nov 7, 2024
Copy link
Contributor

@subatoi subatoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your patience, @marcandre-larochelle-bell! I'm happy with the wording, but I'm not sure about the link you've used being characterised as Microsoft documentation, and I wonder if the best thing to do would just be to delete it instead, so I've left a suggestion accordingly.

Once we resolve that point, I'll be happy to get this merged (cc @nguyenalex836 if I'm not around and you want to go ahead)

…ning/sarif-support-for-code-scanning.md

Co-authored-by: Ben Ahmady <32935794+subatoi@users.noreply.github.com>
@marcandre-larochelle
Copy link
Contributor Author

@subatoi I'm ok with your suggestion, committed it! Thanks

Copy link
Contributor

@subatoi subatoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, I'll get this merged now. Thank you again very much for your help!

@subatoi subatoi added this pull request to the merge queue Nov 7, 2024
Merged via the queue into github:main with commit aec2b02 Nov 7, 2024
45 checks passed
Copy link
Contributor

github-actions bot commented Nov 7, 2024

Thanks very much for contributing! Your pull request has been merged 🎉 You should see your changes appear on the site in approximately 24 hours. If you're looking for your next contribution, check out our help wanted issues

@nguyenalex836 nguyenalex836 removed triage Do not begin working on this issue until triaged by the team more-information-needed More information is needed to complete review labels Nov 7, 2024
@Ksrouri

This comment was marked as spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code security Content related to code security content This issue or pull request belongs to the Docs Content team SME reviewed An SME has reviewed this issue/PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants