Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update codeql.yml to exclude YAML front-matter and Liquid code #6548

Open
1 of 5 tasks
Tracked by #5005 ...
gaylem opened this issue Mar 29, 2024 · 38 comments
Open
1 of 5 tasks
Tracked by #5005 ...

Update codeql.yml to exclude YAML front-matter and Liquid code #6548

gaylem opened this issue Mar 29, 2024 · 38 comments
Assignees
Milestone

Comments

@gaylem
Copy link
Member

gaylem commented Mar 29, 2024

Overview

Many of our Javascript and HTML code files cannot be scanned by CodeQL as-is because they contain non-JS Liquid code {% ... %} or YAML front matter --- ... ---, which cause syntax errors. We need to try and resolve these errors without removing all non-JS code.

Details

The error message "Could not process some files due to syntax errors" indicates that these "syntax errors" may prevent CodeQL from scanning the files below (see issue #5234 for details).

  • hamburger-nav.js: YAML front-matter with a title
  • toolkit.js: 1 line of Liquid, empty YAML front-matter
  • wins.js : 2 lines (Liquid), empty YAML front-matter
  • project.js : 2 lines (Liquid), empty YAML front-matter
  • about.js: for loop (Liquid), empty YAML front-matter
  • current-project.js: 2 lines + for loop (Liquid), empty YAML front-matter
  • Separately, we have observed problems with CodeQL scanning of HTML with embedded liquid statements - see ER: CodeQL did not raise alerts on each instance of "Potentially unsafe external link" #6485
Screenshot: CodeQL error message

CodeQL error message 1

Simply deleting the Liquid lines would break the site (and CodeQL raised those errors accordingly in testing), so an alternative, holistic solution is required.

Action Items

  • Review the Possible Solutions content under Resources
  • Implement a solution that will exclude YAML front-matter and Liquid code from CodeQL scans on .js and .html files.
  • Thoroughly test your changes and ensure the codeql.yml file runs as expected. If it does not run as expected, detail your testing in a comment.

Testing

  • Test your solution by running the .codeql-scan-job.yml workflow.
  • You'll have to figure out a way to confirm that CodeQL was able to scan the files listed above without scanning the YAML front-matter or Liquid code and without throwing the error.

Resources/Instructions

Possible Solutions

Here are two possible solutions (in order of preference) to this problem. Please use your best judgment, these are only recommendations.

Option 1

This approach is preferred because it is

Define a new CodeQL query file that excludes Liquid and YAML patterns within JavaScript files.

Create a file named exclude-patterns.ql

import javascript

from File file
where (file.getExtension() = "js" or file.getExtension() = "html")
  and not file.getCode().matches(".*\\{%.*%\\}.*") // Exclude Liquid code
  and not file.getCode().matches(".*---.*")        // Exclude YAML front matter
select file

Then modify codeql-scan-job.yml file to use the new query file for analysis. Update the queries section in the Initialize CodeQL step to include the new query file:

# On codeql-scan-job.yml file:

- name: Initialize CodeQL
     uses: github/codeql-action/init@v3
     with:
       languages: ${{ matrix.language }}
       queries: path/to/exclude-patterns.ql, security-and-quality

Option 2

Exclude liquid code and YAML front matter patterns from the CodeQL analysis within `codeql-scan-job.yml`

    # On codeql-scan-job.yml file:
    
        - name: Perform CodeQL Analysis
          uses: github/codeql-action/analyze@v3
          with:
            languages: javascript
            queries: security-and-quality
            # Exclude Liquid and YAML code within JavaScript files
            exclude: |
              path: "**/*.{js,html}"
              patterns:
                - pattern: |
                    // Start of Liquid code
                    {% if variable %}
                    // Liquid code here
                    {% endif %}
                    // End of Liquid code
                - pattern: |
                    // Start of YAML front matter
                    ---
                    # YAML front matter here
                    ---
                    // End of YAML front matter

@gaylem gaylem added Feature Missing This label means that the issue needs to be linked to a precise feature label. size: missing Draft Issue is still in the process of being created role missing Complexity: Missing labels Mar 29, 2024
@gaylem gaylem changed the title Update codeql.yml to run CodeQL scan after Jekyll build [DRAFT] Update codeql.yml to run CodeQL scan after Jekyll build Mar 29, 2024

This comment has been minimized.

@gaylem

This comment was marked as outdated.

@gaylem gaylem added Complexity: Large size: 5pt Can be done in 19-30 hours Feature: Code Alerts ready for dev lead Issues that tech leads or merge team members need to follow up on role: back end/devOps Tasks for back-end developers and removed Complexity: Missing size: missing Feature Missing This label means that the issue needs to be linked to a precise feature label. role missing labels Mar 29, 2024
@gaylem gaylem changed the title [DRAFT] Update codeql.yml to run CodeQL scan after Jekyll build Update codeql.yml to run CodeQL scan after Jekyll build Mar 29, 2024
@roslynwythe

This comment was marked as outdated.

@roslynwythe roslynwythe added ready for product and removed ready for dev lead Issues that tech leads or merge team members need to follow up on labels Mar 31, 2024
@gaylem

This comment was marked as outdated.

@ExperimentsInHonesty

This comment was marked as outdated.

@ExperimentsInHonesty ExperimentsInHonesty added ready for dev lead Issues that tech leads or merge team members need to follow up on and removed ready for product labels Apr 4, 2024
@gaylem gaylem changed the title Update codeql.yml to run CodeQL scan after Jekyll build Update codeql.yml to exclude YAML front-matter and Liquid code Apr 5, 2024
@gaylem

This comment was marked as outdated.

@roslynwythe

This comment was marked as outdated.

@luisitocanlas
Copy link
Member

Progress update: Below are the solutions that I've tried and the errors that kept flagging. I would have to say that I did not find a solution and would be moving this issue back to the Prioritized Backlog.

Solution # 1:

Option 1 under the possible solutions that is slightly modified.
import javascript

from File file
where (file.getExtension() = "js" or file.getExtension() = "html")
  and not file.getCode().matches(".*\\{%-?\\s*[a-zA-Z]+.*%\\}.*") // Exclude Liquid code
  and not file.getCode().matches("(?s).*---.*---.*") // Exclude YAML front matter
select file

Solution # 2:

Using predicates.
/**
 * @name Exclude YAML and Liquid Front Matter
 * @description Excludes YAML front matter and Liquid template sections from the analysis
 * @kind problem
 * @problem.severity warning
 */

 import javascript

 /** Predicate to identify YAML front matter lines */
 predicate isYamlFrontMatterLine(File f, int line) {
   exists (
     int start, int end |
     start = f.getLine(1).getLineNumber() and
     (end = f.getLine(2).getLineNumber() or end = f.getLine(3).getLineNumber()) and
     line >= start and
     line <= end and
     f.getLine(start).getText().matches("---") and
     f.getLine(end).getText().matches("---")
   )
 }
 
 /** Predicate to identify Liquid template sections */
 predicate isLiquidTemplateLine(File f, int line) {
   exists (
     string content |
     f.getLine(line).getText() = content and
     (
       content.matches("{%.*%}") or
       content.matches("{{.*}}")
     )
   )
 }
 
 /** Class to represent code excluding YAML front matter and Liquid templates */
 class CodeExcludingFrontMatter extends Expr {
   CodeExcludingFrontMatter() {
     this.getFile().getExtension() = "js" and
     not isYamlFrontMatterLine(this.getFile(), this.getLocation().getStartLine()) and
     not isLiquidTemplateLine(this.getFile(), this.getLocation().getStartLine())
   }
 }
 
 from CodeExcludingFrontMatter c
 select c, "Code excluding YAML front matter and Liquid templates"

Error 1: When the first solution was implemented, this error keeps popping up. When the second solution was implemented this error didn't pop up.

Inside Perform CodeQL Analysis Log

image


Error 2: For both solutions, the alerts are persistent.

Under Security Tab inside Code scanning alerts

image

@luisitocanlas luisitocanlas removed their assignment Aug 3, 2024
@ExperimentsInHonesty ExperimentsInHonesty added the ready for dev lead Issues that tech leads or merge team members need to follow up on label Aug 20, 2024
@roslynwythe
Copy link
Member

roslynwythe commented Aug 21, 2024

@ExperimentsInHonesty @t-will-gillis Here is a summary of our options:

  • re-prioritize the issue as-is. I still think it makes sense and another developer may have success with it
  • we can experiment with moving liquid statements from JS into HTML
  • we can deploy another code security/quality linter such as ESLint which features the ability to exclude particular lines of "code"
  • we can try to modify the workflow such that CodeQL analyzes code after the Jekyll build. This article discusses such a pattern https://some-natalie.dev/blog/codeql-container-builds/ but it involves using a container job, and I'm not sure if we are comfortable with those.

@roslynwythe roslynwythe added ready for product and removed ready for dev lead Issues that tech leads or merge team members need to follow up on ready for product labels Aug 21, 2024
@HackforLABot

This comment has been minimized.

@duojet2ez
Copy link
Member

I look forward to tackling this issue! It seems to be a challenge but also presents a good learning opportunity.

This week 8/26 I have time Mon - Thurs and then will be away labor day weekend and back the following week. My eta on this is not known at this point. I plan on reviewing the previous work on this/looking at comments.. after that I may have a better estimate.

@t-will-gillis
Copy link
Member

Hey @duojet2ez We talked about this issue in our Monday meeting... If you look at @roslynwythe 's note above, we would like to focus most of your efforts on the last one:

we can try to modify the workflow such that CodeQL analyzes code after the Jekyll build. This article discusses such a pattern https://some-natalie.dev/blog/codeql-container-builds/ but it involves using a container job, and I'm not sure if we are comfortable with those.

@duojet2ez
Copy link
Member

gotcha!

@HackforLABot HackforLABot added the To Update ! No update has been provided label Sep 6, 2024
@HackforLABot

This comment has been minimized.

@duojet2ez
Copy link
Member

  1. Progress: Did some research to understand site architecture (jekyll and how liquid works). Read about codeQL.
  2. Blockers: Last week I didn't have the time I was hoping to work on this so didn't do much
  3. Availability: 5 - 8 hours this week
  4. ETA: Not sure because in order to solve this I need to understand codeQL and docker and the build process.. kind of complex and will take time. But I enjoy learning this stuff on a deeper level

@duojet2ez duojet2ez removed the To Update ! No update has been provided label Sep 7, 2024
@duojet2ez
Copy link
Member

I'll be at the meeting tomorrow to discuss this issue. I would like to edit the codeql-scan-job.yml file and test but not entirely sure how to do this

@duojet2ez
Copy link
Member

Progress: This week I was able to successfully reproduce the issue with my fork by starting a codeql scan. I have something to test against. I am considering modifying the codeql.yml file or codeql-scan-job.yml and then running a test again to see if that solves the issue. I also started watching a docker youtube tutorial

Blockers: I don't know enough about containers, specifically docker. To remedy this I am going through a docker youtube tutorial. I am hoping at the end of this I'll have some answers.

Availability: roughly 5 hours a week

Eta: no idea there's a lot to learn

@HackforLABot

This comment has been minimized.

@HackforLABot HackforLABot added the To Update ! No update has been provided label Oct 4, 2024
@HackforLABot HackforLABot added 2 weeks inactive An issue that has not been updated by an assignee for two weeks and removed To Update ! No update has been provided labels Oct 11, 2024
@HackforLABot
Copy link
Contributor

@duojet2ez

Please add update using the below template (even if you have a pull request). Afterwards, remove the '2 weeks inactive' label and add the 'Status: Updated' label.

  1. Progress: "What is the current status of your project? What have you completed and what is left to do?"
  2. Blockers: "Difficulties or errors encountered."
  3. Availability: "How much time will you have this week to work on this issue?"
  4. ETA: "When do you expect this issue to be completed?"
  5. Pictures (optional): "Add any pictures of the visual changes made to the site so far."

If you need help, be sure to either: 1) place your issue in the Questions/In Review column of the Project Board and ask for help at your next meeting, 2) put a "Status: Help Wanted" label on your issue and pull request, or 3) put up a request for assistance on the #hfla-site channel. Please note that including your questions in the issue comments- along with screenshots, if applicable- will help us to help you. Here and here are examples of well-formed questions.

You are receiving this comment because your last comment was before Tuesday, October 8, 2024 at 12:05 AM PST.

@duojet2ez
Copy link
Member

Progress: Not making a lot of progress with this. Didn't have much time this week to do anything!

Blockers: I don't know enough about containers, specifically docker. To remedy this I am going through a docker youtube tutorial. I am hoping at the end of this I'll have some answers.

Availability: roughly 5 hours a week

Eta: no idea there's a lot to learn

@duojet2ez duojet2ez removed the 2 weeks inactive An issue that has not been updated by an assignee for two weeks label Oct 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment