feat(noSecrets): refine the entropy computation to avoid some false positives #4118
I'll defer to the other maintainers for their opinions on adding a dependency for this. For the proposed options: We generally try to avoid adding options for the sake of adding options. We need an appropriate amount of demand (from users) and justification to determine the correct granularity and scope of those options.
Personally, I think a dedicated tool is probably going to have better heuristics than we will, at least in the short term. Plus, we don't take commit history into account like some of those tools do (and we shouldn't; that's not what Biome is for). I think this rule will at least be good for picking out the most egregious cases, with the added benefit that adding the rule to an existing config is easier than adding a new tool to the toolchain. We could also consider pointing users to a dedicated tool in the diagnostic message and rule documentation. Gitleaks seems to be the most mature.
To add to what @dyc3 said, in my opinion Biome likely isn't the right tool to fully replace full-repo secret scanning. Biome does not support all the languages that those tools do (and need to, in order to be a proper security solution), we have stricter perf constraints, and so on. Having this rule in Biome does help development velocity: there is no need to wait for CI, and the majority of users who wouldn't think about secrets get a warning for common cases. We should definitely document full solutions in case users need them. As for integrations listed:
No. Turbopack had a whole Zig helper library to interop between Rust and Go while they were porting the project to Rust. Definitely not worth the complexity.
I agree! There are some features which would bring us to feature parity with eslint no-secrets. I've updated the description with a legend to reflect that.
Definitely going to do this!
I can port some code then 🤠 Or maybe ask their team to make a crate/API for reusability.
Yikes.
We usually add options only when required or requested, and when there are valid use cases to cover. I'm not sure there's enough value to add yet another dependency for this rule. Sure, it's an important rule, but as the others said, we don't cover all languages. Our documentation could actually propose alternatives to the users.
I totally agree with the others. The rule should find the most obvious secret leaks and avoid false positives, because users might get annoyed and turn off the rule (which defeats its purpose). We could add a disclaimer in the rule description saying that, and point to relevant tools (such as the ones you cited) for advanced secret leak detection.
In that case, (which I agree with), we should just do two tasks for now:
However, I do suggest 3) adding the option to control entropy, as it's not only easy to add but also useful for us to learn what people are usually comfortable with, so we can improve the default option. What do you think?
This looks fair enough to me.
@Conaclos - Noob question:
How can I get f32 or f64 options (i.e. any floating point option) in Options?
@Conaclos - I'm getting the following error:
    error[E0277]: the trait bound `f64: std::cmp::Eq` is not satisfied
       --> crates/biome_js_analyze/src/lint/nursery/no_secrets.rs:135:5
        |
    130 | #[derive(Clone, Debug, Default, Deserialize, Deserializable, Eq, PartialEq, Serialize)]
        |                                                              -- in this derive macro expansion
    ...
    135 |     entropy_threshold: f64, // @TODO: Doesn't work currently.
        |     ^^^^^^^^^^^^^^^^^^^^^^ the trait `std::cmp::Eq` is not implemented for `f64`
        |
        = help: the following other types implement trait `std::cmp::Eq`:
                  i128
                  i16
                  i32
                  i64
                  i8
                  isize
                  u128
                  u16
                and 4 others
    note: required by a bound in `AssertParamIsEq`
Reading the docs:
It seems like f64 isn't supported in Options. Am I missing something?
For now, I see two possible approaches:
Personally, I would choose (2).
Yeah, I thought about (2) too; it'll allow us to have an abstraction for later on when/if we change the underlying entropy function. I'll go for that.
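The two approaches themselves are not preserved in this thread. As a minimal sketch of one common way around the `Eq` error above, assuming the threshold is stored as an integer number of tenths and only converted to `f64` when it is read: `NoSecretsOptions`, `DEFAULT_THRESHOLD_TENTHS`, and the default value are hypothetical names and values chosen for illustration, not taken from the PR.

```rust
/// Hypothetical options struct (illustration only, not the PR's code).
/// The threshold is an integer, so `Eq` can still be derived.
#[derive(Clone, Debug, Default, Eq, PartialEq)]
pub struct NoSecretsOptions {
    /// Threshold in tenths, e.g. 45 means 4.5. `None` falls back to the default.
    pub entropy_threshold: Option<u16>,
}

/// Assumed default, for illustration only.
const DEFAULT_THRESHOLD_TENTHS: u16 = 41;

impl NoSecretsOptions {
    /// Converts the stored integer into the float used by the entropy check.
    pub fn threshold(&self) -> f64 {
        f64::from(self.entropy_threshold.unwrap_or(DEFAULT_THRESHOLD_TENTHS)) / 10.0
    }
}
```

Keeping the float behind an accessor like `threshold()` also gives the kind of abstraction mentioned above: the stored representation can stay stable even if the entropy function changes later.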
…-fast, comment some non-working tests
CodSpeed Performance Report
Merging #4118 will degrade performances by 8.01%
Comparing
Summary
Benchmarks breakdown
Just a quick review.
// TODO: Remove these false positives, they unfortunately hurt the user experience.
// const NAMESPACE_CLASSNAME = 'Validation.JSONSchemaValidationUtilsImplFactory';
// const BASE64_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=";
// const webpackFriendlyConsole = require('./config/webpack/webpackFriendlyConsole');
We could easily exclude strings inside require() and import() calls, but I'm not blocking this PR for it.
Would that be by looking at the sibling / previous node in the AST?
The parent, but yeah.
We can do that in V3 alongside implementing JavaScript comments, since we haven't traversed the AST in this rule yet. What do you say?
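As a rough sketch of the parent-based exclusion discussed in this thread: the `Node` enum below is a toy stand-in, not Biome's real AST types or API, and `candidate_strings` is a hypothetical helper. It shows only the idea of skipping a string literal whose parent is a `require()` or `import()` call.

```rust
/// Toy syntax tree for illustration; NOT Biome's real AST types.
#[derive(Debug)]
pub enum Node {
    /// A call such as `require('x')` or `import('x')`.
    Call { callee: String, args: Vec<Node> },
    /// A string literal, with its unquoted content.
    StringLiteral(String),
    /// A declaration such as `const NAME = <init>`.
    VariableDeclaration { name: String, init: Box<Node> },
}

/// Collects the string literals that an entropy check would still inspect,
/// skipping literals that are direct arguments of `require`/`import` calls.
pub fn candidate_strings<'a>(node: &'a Node, out: &mut Vec<&'a str>) {
    match node {
        Node::StringLiteral(s) => out.push(s.as_str()),
        Node::VariableDeclaration { init, .. } => candidate_strings(init, out),
        Node::Call { callee, args } => {
            // The decision is made at the parent (the call), not at the literal.
            let is_module_call = matches!(callee.as_str(), "require" | "import");
            for arg in args {
                if is_module_call && matches!(arg, Node::StringLiteral(_)) {
                    continue; // module path, not a secret candidate
                }
                candidate_strings(arg, out);
            }
        }
    }
}
```

With this toy model, the `webpackFriendlyConsole` example from the TODO above yields no candidates, while the `NAMESPACE_CLASSNAME` string is still inspected and would need a different fix.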
For example, continuous mixed cases (lIkE tHiS) are more likely to contribute to a higher score than single cases.
Symbols also contribute highly to secrets.
Have you based this on existing works? This could be worth adding references.
Not really. I made some assumptions and did some prompt engineering (attached in the ChatGPT chat). I'm considering reading some papers on this topic to see if there's a better entropy function out in the wild. Do you have any recommendations for reading?
Unfortunately I have no expertise in that domain and I didn't take the time to read the literature. Have you tried the new version of the rule on some code bases?
I've added more test cases in valid.js. The new version was able to clear at least 1-2 more false positives (some still remain), so it is not a huge impact.
I will see if I can find time to read through the literature, however, I might defer it to one of my company's team members if it takes too long.
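To make the scoring idea in this thread concrete, here is a minimal sketch, assuming a plain Shannon entropy as the base with bonuses for case switches and symbols. This is not the PR's actual function: `adjusted_score`, `CASE_WEIGHT`, and `SYMBOL_WEIGHT` are hypothetical, and the weights are arbitrary placeholders.

```rust
use std::collections::HashMap;

/// Arbitrary illustrative weights, not tuned and not taken from the PR.
const CASE_WEIGHT: f64 = 1.0;
const SYMBOL_WEIGHT: f64 = 1.0;

/// Shannon entropy of the string, in bits per character.
pub fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    let n = total as f64;
    counts
        .values()
        .map(|&k| {
            let p = k as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Counts adjacent ASCII letter pairs whose case differs (e.g. "lI", "Ik").
pub fn case_switches(s: &str) -> usize {
    let chars: Vec<char> = s.chars().collect();
    chars
        .windows(2)
        .filter(|w| {
            w[0].is_ascii_alphabetic()
                && w[1].is_ascii_alphabetic()
                && w[0].is_ascii_uppercase() != w[1].is_ascii_uppercase()
        })
        .count()
}

/// Counts ASCII punctuation characters.
pub fn symbol_count(s: &str) -> usize {
    s.chars().filter(|c| c.is_ascii_punctuation()).count()
}

/// Hypothetical combined score: base entropy plus weighted per-character
/// ratios of case switches and symbols.
pub fn adjusted_score(s: &str) -> f64 {
    let len = s.chars().count();
    if len == 0 {
        return 0.0;
    }
    let len = len as f64;
    shannon_entropy(s)
        + CASE_WEIGHT * (case_switches(s) as f64 / len)
        + SYMBOL_WEIGHT * (symbol_count(s) as f64 / len)
}
```

Under these assumptions, a string like `lIkEtHiS` scores higher than `likethis`. A score of this shape would also rate the `BASE64_CHARS` alphabet highly, which is the kind of false positive noted in the TODO earlier.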
Tackles #4113 and #3861
Further improves #3823
Research
Since I originally created this Biome plugin, I've found various other secret scanning solutions:
I've been wondering: can we consume one of the above for our use case? Do we really need to add another to the ecosystem?
This is something I'd like to ask the authors here.
In the meantime...
Fixes
New features planned
Legend:
FP = Feature-Parity with eslint no-secrets (features listed here)
I would love to hear strategies on how to achieve this.