-
Notifications
You must be signed in to change notification settings - Fork 8.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial support for paste filtering and bracketed paste mode #7508
Changes from 4 commits
9b4b08b
8cf4ccb
50f8e77
1fdf4de
a360eec
56d46ac
0399056
6dcbd57
2fdf153
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
// Copyright (c) Microsoft Corporation. | ||
// Licensed under the MIT license. | ||
|
||
#include "precomp.h" | ||
#include "inc/PasteConverter.h" | ||
|
||
using namespace Microsoft::Console::Utils; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't honestly know where to put this class. Right now it is heavily VT-related. But in the future maybe other features will be added, making it less VT-related. So perhaps There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should keep this VT-specific for now. It can be refactored later if need be. |
||
|
||
std::wstring PasteConverter::Convert(const std::wstring& wstr, PasteFlags flags) | ||
{ | ||
std::wstring converted{ wstr }; | ||
|
||
// Convert Windows-space \r\n line-endings to \r (carriage return) line-endings | ||
if (flags & PasteFlags::CarriageReturnNewline) | ||
{ | ||
// Some notes on this implementation: | ||
// | ||
// - std::regex can do this in a single line, but is somewhat | ||
// overkill for a simple search/replace operation (and its | ||
// performance guarantees aren't exactly stellar) | ||
// - The STL doesn't have a simple string search/replace method. | ||
// This fact is lamentable. | ||
// - This line-ending conversion is intentionally fairly | ||
// conservative, to avoid stripping out lone \n characters | ||
// where they could conceivably be intentional. | ||
std::wstring::size_type pos = 0; | ||
|
||
while ((pos = converted.find(L"\r\n", pos)) != std::wstring::npos) | ||
{ | ||
converted.replace(pos, 2, L"\r"); | ||
} | ||
} | ||
|
||
// Bracketed Paste Mode, invented by xterm and implemented in many popular terminal emulators. | ||
// See: http://www.xfree86.org/current/ctlseqs.html#Bracketed%20Paste%20Mode | ||
if (flags & PasteFlags::Bracketed) | ||
{ | ||
// For security reasons, control characters should be filtered. | ||
// Here ASCII control characters will be removed, except HT(0x09), LF(0x0a), CR(0x0d) and DEL(0x7F). | ||
converted.erase(std::remove_if(converted.begin(), converted.end(), [](wchar_t c) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I try to minimize the copy opertion so I choose |
||
if (c >= L'\x20' && c <= L'\x7f') | ||
{ | ||
// Printable ASCII + DEL. | ||
return false; | ||
} | ||
|
||
if (c > L'\x7f') | ||
{ | ||
// Not a control character for sure. | ||
return false; | ||
} | ||
|
||
return c >= L'\x00' && c <= L'\x08' || | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The best resource I can find is here https://security.stackexchange.com/a/52655. It suggest removing every control characters except:
Then there's some other opinions in https://bugzilla.gnome.org/show_bug.cgi?id=753197, regarding removing:
And here https://bugzilla.gnome.org/show_bug.cgi?id=794653, regarding removing:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since security is my thing even more than dev stuff, I would agree all of these should be disabled by default. I would also agree we should probably put it behind a setting for certain control characters to be allowed such as BS and DEL with nice warning basically saying "you're on your own if you get hacked". There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks @WSLUser. That’s a great idea, to give users more options controlling paste behavior. In fact many terminals offer these kind of options so I figure a dedicated class for pasting is a must. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there's some confusion here about what bracketed paste is actually meant to do. On most terminals I've tested, it doesn't do any additional filtering - it just adds the escape sequences at the beginning and end of the paste (that's the "bracketing"). While some terminals do filter out certain control characters, or convert them in some way, they do that all the time - not just when bracketed paste mode is enabled. So I think you're kind of mixing two different features here - paste filtering (which should have nothing to do with bracketed paste mode), and then the bracketed paste mode itself. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks now I realize that. It’s just we don’t have a setting for people to enable/disable filtering pasted content. So I think that can be added later with another flag SanitizeContent or something. Or should we just filter them all without letting people choose. In my imagination there will be more flags coming in the future and do various kinds of things There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I can understand wanting to add support for paste filtering (although I would have expected that in a PR named something like "Add paste filtering"). I also understand leaving the option to enable or disable it for a follow-up PR. What I don't understand is why we want to tie this to bracketed paste - it's a completely separate concept. If we want to make the paste filtering controllable via an escape sequence for some reason, then we should be inventing our own escape sequence for that - not repurposing an existing sequence. I don't know if I've just misunderstood the intention of this PR, but it make no sense to me. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I kind misunderstand the relation between filtering escape sequence and adding support for bracketed paste mode. I’ll try to make this better. |
||
c >= L'\x0b' && c <= L'\x0c' || | ||
c >= L'\x0e' && c <= L'\x1f'; | ||
})); | ||
|
||
converted.insert(0, L"\x1b[200~"); | ||
converted.append(L"\x1b[201~"); | ||
} | ||
|
||
return converted; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
/*++ | ||
Copyright (c) Microsoft Corporation | ||
Licensed under the MIT license. | ||
|
||
Module Name: | ||
- PasteConverter.h | ||
|
||
Abstract: | ||
- Pre-process the text pasted (presumably from the clipboard). | ||
|
||
--*/ | ||
|
||
#pragma once | ||
|
||
#include <string> | ||
|
||
namespace Microsoft::Console::Utils | ||
{ | ||
enum PasteFlags | ||
{ | ||
CarriageReturnNewline = 1, | ||
Bracketed = 2 | ||
}; | ||
|
||
DEFINE_ENUM_FLAG_OPERATORS(PasteFlags) | ||
|
||
class PasteConverter | ||
{ | ||
public: | ||
static std::wstring Convert(const std::wstring& wstr, PasteFlags flags); | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what to do with conhost. Those legacy code always scare me . Call we just leave it like this?