Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Add "effective" severity for rules in SARIF v2 logs #67365

Closed
MattKotsenas opened this issue Mar 19, 2023 · 9 comments · Fixed by #68892
Closed

Proposal: Add "effective" severity for rules in SARIF v2 logs #67365

MattKotsenas opened this issue Mar 19, 2023 · 9 comments · Fixed by #68892
Assignees
Labels
Area-Analyzers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request

Comments

@MattKotsenas
Copy link
Member

Background and Motivation

My team recently discovered that a number of analyzer rules had been accidently disabled in our repo, letting rule violations into the codebase. To prevent a change from accidently disabling rules in the future, I'd like to write unit tests that "baseline" the rules used by a project.

#64277 added a rules section to the SARIF v2 log. With that information I now know what rules were applied to my compilation. However, the log doesn't specify the "effective" severity of the rule, only the default severity provided by the analyzer itself. This means that a developer could, for example, demote a rule from an Error to an Info without any change to the SARIF log.

Understanding the effective severity for a rule is the missing piece of information we need to baseline our rules and prevent configuration regressions.

Proposed API

The proposed API is to add an effectiveConfiguration element to the rule, with an array of the effective severities of the rules.

"rules": [
  {
    "id": "CA1000",
    "shortDescription": {
      "text": "Do not declare static members on generic types"
    },
    "fullDescription": {
      "text": "When a static member of a generic type is called, the type argument must be specified for the type. When a generic instance member that does not support inference is called, the type argument must be specified for the member. In these two cases, the syntax for specifying the type argument is different and easily confused."
    },
    "defaultConfiguration": {
      "level": "note"
      },
+    "effectiveConfiguration": {
+      "levels": ["note", "error"]
+    },
    "helpUri": "https://docs.microsoft.com/dotnet/fundamentals/code-analysis/quality-rules/ca1000",
    "properties": {
      "category": "Design",
      "isEverSuppressed": "true",
      "suppressionKinds": [
        "external"
      ],
      "tags": [
        "PortedFromFxCop",
        "Telemetry",
        "EnabledRuleInAggressiveMode"
      ]
    }
  }
]

The effective configuration can contain multiple values because .editorconfig files allow for setting rules at the file level. In most cases, this array will only have a single element.

Alternative Designs

Alternate designs to consider would be instead of providing the effective configuration as an array, we could simplify the majority case and only specify the project default. We could also flag such cases that rules are being specified per file. However, since per-file configuration is a valid scenario, it seems unnecessary to push people away from such a configuration. Additionally, providing the severities as an array provides a good tradeoff of information and complexity. All information about the effective severity is confined to a single property, and if people want to catch / prevent such file-level configuration, they can assert that the array length is always 1.

The entry can also be moved into the properties node if desired to keep the effective severity closer to the suppression info.

Risks

Only risk I'm aware of is a small increase in the size of the log file.

@MattKotsenas MattKotsenas added Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request labels Mar 19, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added Area-Analyzers untriaged Issues and PRs which have not yet been triaged by a lead labels Mar 19, 2023
@jaredpar
Copy link
Member

IIRC, not all rules show up in the SARIF file today. Rules that are not run due to being disabled are not represented. Will this proposal include those now but list their effective severity as disabled?

@mavasani
Copy link
Contributor

IIRC, not all rules show up in the SARIF file today. Rules that are not run due to being disabled are not represented. Will this proposal include those now but list their effective severity as disabled?

@jaredpar With #64277 we do report all rules in the compilation, regardless of whether they are disabled for part or whole of the compilation.

@MattKotsenas
Copy link
Member Author

I believe this addresses #60550 as well

@MattKotsenas
Copy link
Member Author

MattKotsenas commented Mar 28, 2023

If it's helpful, here's an example set of diagnostics, a set of configuration via .editorconfig, .globalconfig, and a ruleset file, and what I would expect the effective configuration in the SARIF log to be:

Diagnostics in this hypothetical project:

new []
{
    new DiagnosticDescriptor
    {
        Id = "CA1111",
        Title = "CA1111 Title",
        Description = "CA1111 Description",
        HelpLinkUri = "https://help/for/ca1111",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Error,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag1", "tag2" }
    },
    new DiagnosticDescriptor
    {
        Id = "CA2222",
        Title = "CA2222 Title",
        Description = "CA2222 Description",
        HelpLinkUri = "https://help/for/CA2222",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Warning,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag3", "tag4" }
    },
    new DiagnosticDescriptor
    {
        Id = "CA3333",
        Title = "CA3333 Title",
        Description = "CA3333 Description",
        HelpLinkUri = "https://help/for/CA3333",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Info,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag5", "tag6" }
    },
    new DiagnosticDescriptor
    {
        Id = "CA4444",
        Title = "CA4444 Title",
        Description = "CA4444 Description",
        HelpLinkUri = "https://help/for/CA4444",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Error,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag7", "tag8" }
    },
    new DiagnosticDescriptor
    {
        Id = "CA5555",
        Title = "CA5555 Title",
        Description = "CA5555 Description",
        HelpLinkUri = "https://help/for/CA5555",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Warning,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag9", "tag10" }
    },
    new DiagnosticDescriptor
    {
        Id = "CA6666",
        Title = "CA6666 Title",
        Description = "CA6666 Description",
        HelpLinkUri = "https://help/for/CA6666",
        MessageFormat = "{0}",
        Category = "Design",
        DefaultSeverity = DiagnosticSeverity.Warning,
        IsEnabledByDefault = true,
        CustomTags = new [] { "tag11", "tag12" }
    },
}

.editorconfig:

[*.cs]
dotnet_diagnostic.CA1111.severity = warn

[third_party/**.cs]
dotnet_diagnostic.CA2222.severity = silent

.globalconfig:

is_global = true
dotnet_diagnostic.CA3333.severity = silent

ruleset:

<RuleSet Name="Example rules" Description="Example rules" ToolsVersion="15.0">
   <Rules AnalyzerId="Microsoft.Analyzers.ManagedCodeAnalysis" RuleNamespace="Microsoft.Rules.Managed">
     <Rule Id="CA4444" Action="Warning" />
     <Rule Id="CA5555" Action="Error" />
   </Rules>
</RuleSet>

Expected output:

"rules": [
  {
    "id": "CA1111",
    // ...
    "defaultConfiguration": {
      "level": "error"
      },
    "effectiveConfiguration": {
      "levels": ["warning"]
    },
  },
  {
    "id": "CA2222",
    // ...
    "defaultConfiguration": {
      "level": "warning"
      },
    "effectiveConfiguration": {
      "levels": ["warning", "note"]
    },
  },
  {
    "id": "CA3333",
    // ...
    "defaultConfiguration": {
      "level": "note"
      },
    "effectiveConfiguration": {
      "levels": ["note"]
    },
  },
  {
    "id": "CA4444",
    // ...
    "defaultConfiguration": {
      "level": "error"
      },
    "effectiveConfiguration": {
      "levels": ["warning"]
    },
  },
  {
    "id": "CA5555",
    // ...
    "defaultConfiguration": {
      "level": "warning"
      },
    "effectiveConfiguration": {
      "levels": ["error"]
    },
  },
  {
    "id": "CA6666",
    // ...
    "defaultConfiguration": {
      "level": "warning"
      },
    "effectiveConfiguration": {
      "levels": ["warning"]
    },
  },
]

@mavasani / @jaredpar do these match your expectations? Are there any other scenarios worth exploring that I'm not considering?

If not, what are the next steps here, or how should I think about the timeline here? Thanks!

@mavasani
Copy link
Contributor

@mavasani / @jaredpar do these match your expectations? Are there any other scenarios worth exploring that I'm not considering?

@MattKotsenas

  1. defaultConfiguration: Note that we already emit defaultConfiguration values in the sarif file - though we only do so when the default configuration is not a warning. Basically lack of a default defaultConfiguration property indicates defaultConfiguration = warning, I am not sure why it was chosen to be this way, but it seems reasonable to me to always emit a defaultConfiguration property even if it is warning by default.

  2. effectiveConfiguration: Your proposed JSON array for this property seems reasonable to me. The only question would be whether it the logs become too verbose when majority of rules are always at their default configuration or configured to only a single different effective severity. I am personally not too concerned about this added verbosity, I like your suggested semantics of always emitting effectiveConfiguration property as that provides added clarity.

@jaredpar @michaelcfanning - can you please confirm if the above enhancements to compiler errorlog format seem good to you?

@MattKotsenas
Copy link
Member Author

Friendly ping to @jaredpar and @michaelcfanning. What's the next step here to make progress? Thanks!

@jaredpar
Copy link
Member

jaredpar commented Jun 6, 2023

For my side I'm tentatively okay with the changes. They seem reasonable to me.

The compiler is merely a consumer of the spec though, not an owner. Hard for us to say if this is okay / inline with the spec. Need @michaelcfanning, or someone else in a position to help us say if this change is legal, before we can move forward.

@mavasani
Copy link
Contributor

@MattKotsenas @jaredpar I looked at the SARIF v2 spec http://json.schemastore.org/sarif-2.1.0 and can confirm that we cannot add an arbitrary effectiveConfiguration element as proposed in #67365 (comment).

  • defaultConfiguration is a defined json property name in the spec, with the following syntax:
        "defaultConfiguration": {
          "description": "Default reporting configuration information.",
          "$ref": "#/definitions/reportingConfiguration"
        },
        ...
    "reportingConfiguration": {
      "description": "Information about a rule or notification that can be configured at runtime.",
      "type": "object",
      "additionalProperties": false,
      "properties": {
        "enabled": {
          "description": "Specifies whether the report may be produced during the scan.",
          "type": "boolean",
          "default": true
        },
        "level": {
          "description": "Specifies the failure level for the report.",
          "default": "warning",
          "enum": ["none", "note", "warning", "error"]
        },
        "rank": {
          "description": "Specifies the relative priority of the report. Used for analysis output only.",
          "type": "number",
          "default": -1,
          "minimum": -1,
          "maximum": 100
        },
        "parameters": {
          "description": "Contains configuration information specific to a report.",
          "$ref": "#/definitions/propertyBag"
        },
        "properties": {
          "description": "Key/value pairs that provide additional information about the reporting configuration.",
          "$ref": "#/definitions/propertyBag"
        }
      }
    },
  • Instead, we can add a custom property, say effectiveConfigurationLevels, to the custom properties bag associated with each rule as below:
{
  "id": "CA1001",
  "defaultConfiguration": {
    "level": "note"
  },
  "properties": {
    "effectiveConfigurationLevels": [ "error", "warning" ]
  }
}

@michaelcfanning Can you please confirm if this looks fine to you?

@mavasani mavasani self-assigned this Jul 5, 2023
mavasani added a commit to mavasani/roslyn that referenced this issue Jul 6, 2023
Closes dotnet#67365

Add a new custom property array named `effectiveConfigurationLevels` to the `rule` section in sarif v2 error log output. If the severity of any analyzer diagnostic ID was configured via options (command line options, editorconfig, globalconfig, rulesets, etc.) for a part or whole of the compilation, we emit this array of effective severities.
mavasani added a commit to mavasani/roslyn that referenced this issue Jul 18, 2023
Closes dotnet#67365

Add a new custom property array named `effectiveConfigurationLevels` to the `rule` section in sarif v2 error log output. If the severity of any analyzer diagnostic ID was configured via options (command line options, editorconfig, globalconfig, rulesets, etc.) for a part or whole of the compilation, we emit this array of effective severities.
@mavasani
Copy link
Contributor

mavasani commented Jul 19, 2023

Based on further offline design discussions, this effective severity override information should go inside the ruleConfigurationOverrides array within an invocations section in the SARIF file. https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/sarif-v2.1.0-os.html#_Toc34317854 contains the existing SARIF format for specifying such end user configuration information for each compiler invocation.

For example, if we have rule1 emitted as the first rule in the rules section and it is configured to error and none severities for different part of the project, and rule2 emitted as the second rule in the rules section and it is configured to warning severity throughout the project, then we will generate the following SARIF data, where the invocations section is the new data being emitted now.

"runs": [
{
  "results": [
  ],
  "properties": {
    "analyzerExecutionTime": "x.xxx"
  },
  "tool": {
    "driver": {
      ...
      "rules": [
        {
          "id": "rule1",
          ...
        },
        {
          "id": "rule2",
          ...
        }
      ]
    }
  },
  "invocations": [
    {
      "ruleConfigurationOverrides": [
        {
          "descriptor": {
            "index": 0
          },
          "configuration": {
            "level": "error"
          }
        },
        {
          "descriptor": {
            "index": 0
          },
          "configuration": {
            "level": "none"
          }
        },
        {
          "descriptor": {
            "index": 1
          },
          "configuration": {
            "level": "warning"
          }
        }
      ]
    }
  ],
  "columnKind": "utf16CodeUnits"
}
]

@arunchndr arunchndr removed the untriaged Issues and PRs which have not yet been triaged by a lead label Aug 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Analyzers Concept-API This issue involves adding, removing, clarification, or modification of an API. Feature Request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants