C++: Total number of baseline files limit #17743

Open
artem-smotrakov opened this issue Oct 11, 2024 · 5 comments
Labels
awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. question Further information is requested

Comments

@artem-smotrakov
Contributor

Hey friends, I have quite a large C++ database:

codeql database print-baseline -- ${CODEQL_DATABASE_DIR}
Counted a baseline of 27711380 lines of code for cpp.

Before running scans, I normally run some simple diagnostic queries to make sure the database looks fine. The queries look for things like:

  • Files
  • FunctionCalls
  • IfStmts

When I run these queries on this large database, I get the following output:

codeql database analyze ${CODEQL_DATABASE_DIR} --format=sarif-latest --output=calls.sarif ${CODEQL_QUERIES}/qlpacks/cpp-queries/diagnostics/FunctionCalls.ql
Running queries.
[1/1 comp 7.8s] Compiled [...]/qlpacks/cpp-queries/diagnostics/FunctionCalls.ql.
Files.ql: [1/1 eval 36s] Results written to cpp-queries/diagnostics/FunctionCalls.bqrs.
Shutting down query evaluator.
Interpreting results.
Will not interpret file coverage baseline information, since the total number of baseline files is 153738, which is greater than the limit of 50000.

The exit code is 0 but calls.sarif is empty.

When I run queries from the standard C++ pack, I get the same message.

What does this limit mean? Is there any way to increase it? Unfortunately, I didn't find anything in the docs or in this repo, though I may be missing something. Thanks!

@artem-smotrakov artem-smotrakov added the question Further information is requested label Oct 11, 2024
@redsun82
Contributor

👋 @artem-smotrakov, sorry for the late reply!

That limit exists to keep the SARIF file from growing so large that it hits the SARIF file size limit when the Tool Status Page is populated with information about how many files were analyzed. It is currently hard-coded and cannot be configured.
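For anyone who wants to detect this condition in a script, here is a minimal sketch that extracts the file count and the limit from the warning. The `baseline_limit_info` helper is hypothetical, and its pattern is based only on the single message quoted in this thread, so other CLI versions may word it differently.

```shell
#!/bin/sh
# Sketch: pull the baseline file count and the limit out of the warning text.
# ASSUMPTION: the wording matches the message quoted earlier in this thread.

# Reads CLI output on stdin.
# Prints "<count> <limit>" if the warning is present, nothing otherwise.
baseline_limit_info() {
  sed -n 's/.*total number of baseline files is \([0-9][0-9]*\), which is greater than the limit of \([0-9][0-9]*\).*/\1 \2/p'
}

# Hypothetical usage, where analyze.log holds the captured CLI output:
# baseline_limit_info < analyze.log
```

For the output quoted above, this should print `153738 50000`.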

That said, I'm not entirely sure this limit (and the warning) should really cause a custom query like yours to return no results. Could you:

  • share your FunctionCalls.ql, so we can experiment with it a bit?
  • maybe try another output format like CSV to see if the issue is specifically related to the SARIF format?

@artem-smotrakov
Contributor Author

artem-smotrakov commented Oct 14, 2024

Hi @redsun82 ! Thanks for your reply!

share your FunctionCalls.ql, so we can experiment with it a bit?

Yeah, sure, it's quite simple

import cpp

from FunctionCall call, Function func
where func = call.getTarget()
select call, "Call " + func + "(" + func.getParameterString() + ")"

maybe try another output format like CSV to see if the issue is specifically related to the SARIF format?

I get the same message for CSV if I use --format=csv --output=calls.csv. The calls.csv file is empty.
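Since both interpreted formats come out empty, one way to separate query evaluation from result interpretation might be to look at the raw BQRS file directly. The sketch below uses `codeql bqrs info` and `codeql bqrs decode`. The helper names and the `calls-raw.csv` output name are hypothetical, and the location of the BQRS file (a `results` directory inside the database) is an assumption based on the relative path in the log above.

```shell
#!/bin/sh
# Sketch: inspect the raw query results, bypassing result interpretation.
# ASSUMPTION: the relative path printed in the log ("Results written to ...")
# is resolved against a results/ directory inside the database.

# $1 = database directory, $2 = relative BQRS path taken from the log line
bqrs_path() {
  printf '%s/results/%s\n' "${1%/}" "$2"
}

inspect_bqrs() {
  bqrs=$(bqrs_path "$1" "$2")
  if [ ! -f "$bqrs" ]; then
    echo "no BQRS file at $bqrs" >&2
    return 1
  fi
  # Show the result sets and their row counts, then dump the rows as CSV.
  codeql bqrs info "$bqrs" &&
    codeql bqrs decode --format=csv --output=calls-raw.csv "$bqrs"
}

# Example call, using the path from the log above:
# inspect_bqrs "${CODEQL_DATABASE_DIR}" cpp-queries/diagnostics/FunctionCalls.bqrs
```

If the BQRS file contains rows while calls.sarif and calls.csv stay empty, that would suggest the problem is in the interpretation step and not in the query itself.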

It is currently hard-coded and cannot be configured.

Would it be possible to make it configurable in one of the next releases? 🤔

@rvermeulen
Contributor

Hi @artem-smotrakov,

The baseline information should not influence the result of the query.
Could you run https://github.com/github/codeql/blob/main/cpp/ql/src/Diagnostics/ExtractionWarnings.ql to determine whether other issues are influencing the results of the query?

@rvermeulen rvermeulen added the awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. label Oct 14, 2024
@artem-smotrakov
Contributor Author

Hi @rvermeulen! I'm attaching the results of ExtractionWarnings.ql. It reports errors in several files, but the codebase has far more C++ files than that. Also, I got the same limit warning when I ran the query.

extractrion_warnings.sarif.txt

@rvermeulen
Contributor

rvermeulen commented Oct 17, 2024

Hi @artem-smotrakov,

Let me forward this to our C/C++ team.
In the meantime, could you share which CodeQL CLI version you are using (run codeql version --format=json) and the build-tracer.log file, which you can find in the database directory under logs? Before sharing, please make sure any sensitive information (such as the unpack location) is redacted.
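A possible way to gather both items and mask a path prefix before sharing is sketched below. The `redact_prefix` helper, the output file names, and the exact location of build-tracer.log inside the database are assumptions, so adjust them to your setup and review the redacted file by hand before attaching it.

```shell
#!/bin/sh
# Sketch: mask one literal path prefix in a log before sharing it.
# ASSUMPTION: POSIX-style paths without backslashes (awk -v interprets escapes).

# $1 = literal prefix to hide; reads the log on stdin, writes the result to stdout.
redact_prefix() {
  awk -v p="$1" '
    p == "" { print; next }            # nothing to redact
    {
      rest = $0; out = ""
      # Replace every literal occurrence of p, left to right.
      while ((i = index(rest, p)) > 0) {
        out = out substr(rest, 1, i - 1) "<REDACTED>"
        rest = substr(rest, i + length(p))
      }
      print out rest
    }'
}

# Hypothetical usage (file names and log location are assumptions):
# codeql version --format=json > codeql-version.json
# redact_prefix "$HOME" < "${CODEQL_DATABASE_DIR}/log/build-tracer.log" \
#   > build-tracer.redacted.log
```

The replacement is a plain substring match, not a regular expression, so characters such as `.` or `*` in the prefix are treated literally.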
