Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The future of TaintedString #24

Closed
dom96 opened this issue Nov 28, 2017 · 14 comments
Closed

The future of TaintedString #24

dom96 opened this issue Nov 28, 2017 · 14 comments

Comments

@dom96
Copy link
Contributor

dom96 commented Nov 28, 2017

Currently TaintedString is implemented somewhat inconsistently throughout the stdlib. The question is what should we do about that?

There are a number of options:

  • Remove TaintedString and tainted mode from the language
  • Go through the stdlib and use TaintedString consistently anywhere
    • This would be very disruptive and I personally don't think it's a good idea.
  • (Your other option here, please suggest it)

What does everyone think?

@andreaferretti
Copy link

Ideally 2, but if it is too much work I guess 1 is the only option.

Actually, I don't think this should be too much work. Pure string procs can stay as they are, and be borrowed when needed. The only thing to pay attention to is that the IO routines return TaintedString instead of string - there should not be many of those

@Araq
Copy link
Member

Araq commented Nov 28, 2017

I'm torn. It's a nice feature. It never found any real bugs.

@ghost
Copy link

ghost commented Nov 28, 2017

Usage of ReadIoEffect is as inconsistent in the stdlib as TaintedString. This may be problematic as their use often overlaps. For example, memfiles uses TaintedString but there's no sign of any effects tracking there.

For consistency, all stdlib I/O operations should have effects tracking tagged on their procs anyway. Perhaps the goal of a taint mode or any sort of extra-nagging about dangerous I/O by the compiler is best expressed as opt-in of existing effects tracking rather than a distinct type.

Anyway you slice it, I/O needs to be consistently tagged in the stdlib irrespective.

@Araq
Copy link
Member

Araq commented Nov 28, 2017

For example, memfiles uses TaintedString but there's no sign of any effects tracking there.

The pointer can be used for reading or writing, what effect should it be? Pointers don't have effects, procs do.

@andreaferretti
Copy link

andreaferretti commented Nov 29, 2017

Well, then procs such as memfiles.open should have IO effects

@Araq
Copy link
Member

Araq commented Nov 29, 2017

What I said about TaintedString applies to every tags effect:

It's a nice feature. It never found any real bugs.

@dom96
Copy link
Contributor Author

dom96 commented Nov 29, 2017

It's a nice feature. It never found any real bugs.

I would argue it's because we haven't properly adopted these effects. I've always been waiting for this feature to be finished before starting to use it.

@Araq
Copy link
Member

Araq commented Nov 29, 2017

What does it mean to be finished?

@saem
Copy link

saem commented Dec 24, 2020

Commenting here because I saw a semi-recently updated PR about deprecating it. I really like the idea of TaintedString, it would be a shame to see them go (depends). With Nim I've not been writing web apps or lots of untrusted input handing, nor have I had to go through heavy technical security audit. I do know taint tracking would help in those cases and have used various tools to achieve it, but haven't directly experienced it with Nim.

If I get to suggest an alternative option, we can already signal hints, warnings, and errors at compile time. I think taint tracking could be thought of as a specific case of tagging a value's type and then optionally making compile time assertions about it -- don't mix with other strings or taint them as well. Then this analysis/mode can be turned on and off rather than simply taint mode.

Moreover, I don't think it's simply strings, being able to tag types (compile time state) indicating yes I have or haven't done something (tainted string input usually being validation/sanitization) is likely some of the completeness of the effects system people are after. Whatever my program is it usually has a major phase(s) where some invariant must hold, webapps examples:

  • escape input
  • http/url encode output, etc...
  • ensure tenant id comes from a trusted source

A bunch of this could be done by distinct types but the fact that you don't want to necessarily enforce this everywhere or wrap and unwrap when going in and out of 3rd party code is a selling point. Maybe this is all sufficiently facilitated through existing facilities, however.

Writing this post gave me the idea of seeing if I can start tracking effects of using nim/nimsuggest/nimble etc in the vscode extension and perhaps be able to answer the completeness question, because though it feels incomplete, I can't precisely say why either.

@disruptek
Copy link

TaintedString is good. Can we vote on keeping it and close this issue for good?

@Araq
Copy link
Member

Araq commented Dec 27, 2020

There is quite some overlap with dedicated types for SQL/HTML/etc generation which provides about the same security benefits and is much easier to use and doesn't introduce yet another compiler switch. So IMO we should accept this RFC and remove TaintedString.

@mratsim
Copy link
Collaborator

mratsim commented Jan 1, 2021

I like Tainted String but they shouldn't be a compiler switch.

At the very least we should have a tutorial on how to write a parser with security/IO effect in mind. Right now Tainted String feels forced and contrary to Haskell, there is no dozens of explainers on why IO monads are great.

A more advanced tutorial would be a networking app with different stages of validation modeled in the type system via distinct types. For example in Nimbus (https://github.com/status-im/nimbus-eth2/blob/644c17f/docs/block_validation_flow.md) we have the following security stages for blocks from the blockchain:

  • Untrusted: raw block from the network
  • SigVerified: the cryptographic signature is correct but we didn't check the block logic
  • TransitionVerified: the block logic is correct but we didn't check the cryptographic signature
  • Trusted: we checked both logic and cryptographic signature.

@Araq
Copy link
Member

Araq commented Jan 15, 2021

I'm using my BDFL powers this time, TaintedString should go. Every string is untrusted, use type LettersOnly = distinct string etc. for trusted strings. We did this for SQL, it worked well enough and is not yet another compiler switch that in practice cannot be used anyway as it defaults to off and so practically no library supports it.

@timotheecour
Copy link
Member

closed via nim-lang/Nim#15423

sid-code added a commit to sid-code/nmoo that referenced this issue Aug 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants