Merged
35 commits
9b44a68
add variant and docs with examples
ntBre Mar 6, 2025
626ce7d
add test_ok
ntBre Mar 6, 2025
8336a66
add failing test_err case
ntBre Mar 6, 2025
2dea97c
check for invalid components, pass tests
ntBre Mar 6, 2025
990d212
add f-string kind variants and update messages
ntBre Mar 6, 2025
3881490
gate whole check behind is_unsupported
ntBre Mar 6, 2025
4eafe99
clippy
ntBre Mar 6, 2025
532b3f9
add true positive quote and continuation cases
ntBre Mar 7, 2025
e8346d7
Merge branch 'main' into brent/syn-f-strings
ntBre Mar 7, 2025
fc97d27
set target_version for black_compatibility tests
ntBre Mar 7, 2025
2bac998
add failing false positives with comments
ntBre Mar 13, 2025
f57ee3a
pass string comment cases
ntBre Mar 13, 2025
890069c
add cases with nesting, move comment check out of loop
ntBre Mar 13, 2025
d488818
Merge branch 'main' into brent/syn-f-strings
ntBre Mar 13, 2025
d4e5498
move check_f_string_comments to helpers module
ntBre Mar 13, 2025
6db743c
avoid panic on invalid f-strings
ntBre Mar 13, 2025
eb3551d
add a test case with escapes outside of the expression
ntBre Mar 13, 2025
d8d3eb1
factor out Tokens::in_range_impl and Tokens::after_impl
ntBre Mar 14, 2025
a07b8fe
make TokenSource::tokens private and add TokenSource::in_range
ntBre Mar 14, 2025
79b88be
make check_fstring_comments a method and use in_range
ntBre Mar 14, 2025
84cc975
remove f-string stack now that range is restricted
ntBre Mar 14, 2025
d6ec8d6
use then_some and document check_fstring_comments
ntBre Mar 14, 2025
82222d5
factor out range
ntBre Mar 14, 2025
2ccd4e1
Merge branch 'main' into brent/syn-f-strings
ntBre Mar 14, 2025
9d393c7
add test for escaped quote outside expression portion
ntBre Mar 17, 2025
61cac6f
mark the whole triple quote when reused
ntBre Mar 17, 2025
6a673d0
switch to memchr searches
ntBre Mar 17, 2025
3a3c107
Merge branch 'main' into brent/syn-f-strings
ntBre Mar 17, 2025
ffb1935
use supports_pep_701
ntBre Mar 17, 2025
0b00a8a
revert Tokens changes and use handrolled approach
ntBre Mar 17, 2025
45e5716
use unwrap for character indices
ntBre Mar 17, 2025
fe305bc
loop over memchr_iter
ntBre Mar 17, 2025
c093a14
add test case with multiple escapes
ntBre Mar 17, 2025
779fa1a
index -> position
ntBre Mar 18, 2025
682ba38
get TextSize from slash
ntBre Mar 18, 2025
@@ -0,0 +1 @@
{"target_version": "3.12"}
@@ -0,0 +1,12 @@
# parse_options: {"target-version": "3.11"}
f'Magic wand: { bag['wand'] }' # nested quotes
f"{'\n'.join(a)}" # escape sequence
f'''A complex trick: {
bag['bag'] # comment
}'''
f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}" # arbitrary nesting
f"{f'''{"nested"} inner'''} outer" # nested (triple) quotes
f"test {a \
} more" # line continuation
f"""{f"""{x}"""}""" # mark the whole triple quote
f"{'\n'.join(['\t', '\v', '\r'])}" # multiple escape sequences, multiple errors
@@ -0,0 +1,7 @@
# parse_options: {"target-version": "3.11"}
f"outer {'# not a comment'}"
f'outer {x:{"# not a comment"} }'
f"""{f'''{f'{"# not a comment"}'}'''}"""
f"""{f'''# before expression {f'# aro{f"#{1+1}#"}und #'}'''} # after expression"""
f"escape outside of \t {expr}\n"
f"test\"abcd"
@@ -0,0 +1,10 @@
# parse_options: {"target-version": "3.12"}
f'Magic wand: { bag['wand'] }' # nested quotes
f"{'\n'.join(a)}" # escape sequence
f'''A complex trick: {
bag['bag'] # comment
}'''
f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}" # arbitrary nesting
f"{f'''{"nested"} inner'''} outer" # nested (triple) quotes
f"test {a \
} more" # line continuation
46 changes: 46 additions & 0 deletions crates/ruff_python_parser/src/error.rs
@@ -452,6 +452,14 @@ pub enum StarTupleKind {
Yield,
}

/// The type of PEP 701 f-string error for [`UnsupportedSyntaxErrorKind::Pep701FString`].
#[derive(Debug, PartialEq, Eq, Hash, Clone, Copy)]
pub enum FStringKind {
Backslash,
Comment,
NestedQuote,
}

#[derive(Debug, PartialEq, Eq, Hash, Clone, Copy)]
pub enum UnparenthesizedNamedExprKind {
SequenceIndex,
@@ -661,6 +669,34 @@ pub enum UnsupportedSyntaxErrorKind {
TypeAliasStatement,
TypeParamDefault,

/// Represents the use of a [PEP 701] f-string before Python 3.12.
///
/// ## Examples
///
/// As described in the PEP, each of these cases were invalid before Python 3.12:
///
/// ```python
/// # nested quotes
/// f'Magic wand: { bag['wand'] }'
///
/// # escape characters
/// f"{'\n'.join(a)}"
///
/// # comments
/// f'''A complex trick: {
/// bag['bag'] # recursive bags!
/// }'''
///
/// # arbitrary nesting
/// f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}"
/// ```
///
/// These restrictions were lifted in Python 3.12, meaning that all of these examples are now
/// valid.
///
/// [PEP 701]: https://peps.python.org/pep-0701/
Pep701FString(FStringKind),

/// Represents the use of a parenthesized `with` item before Python 3.9.
///
/// ## Examples
@@ -838,6 +874,15 @@ impl Display for UnsupportedSyntaxError {
UnsupportedSyntaxErrorKind::TypeParamDefault => {
"Cannot set default type for a type parameter"
}
UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::Backslash) => {
"Cannot use an escape sequence (backslash) in f-strings"
}
UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::Comment) => {
"Cannot use comments in f-strings"
}
UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::NestedQuote) => {
"Cannot reuse outer quote character in f-strings"
}
UnsupportedSyntaxErrorKind::ParenthesizedContextManager => {
"Cannot use parentheses within a `with` statement"
}
@@ -904,6 +949,7 @@ impl UnsupportedSyntaxErrorKind {
UnsupportedSyntaxErrorKind::TypeParameterList => Change::Added(PythonVersion::PY312),
UnsupportedSyntaxErrorKind::TypeAliasStatement => Change::Added(PythonVersion::PY312),
UnsupportedSyntaxErrorKind::TypeParamDefault => Change::Added(PythonVersion::PY313),
UnsupportedSyntaxErrorKind::Pep701FString(_) => Change::Added(PythonVersion::PY312),
UnsupportedSyntaxErrorKind::ParenthesizedContextManager => {
Change::Added(PythonVersion::PY39)
}
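
The three hunks above combine into one rule: each `FStringKind` has its own message, and any of them is reported only when the target version predates Python 3.12. A minimal, self-contained sketch of that rule follows. The `message`, `supports_pep_701`, and `report` functions and the `(major, minor)` tuple are illustrative stand-ins, not the crate's real `Display` impl or `PythonVersion` type, and the full error text produced by the crate may contain more than the kind-specific message shown here.

```rust
/// Mirrors the `FStringKind` enum added in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FStringKind {
    Backslash,
    Comment,
    NestedQuote,
}

/// Kind-specific message text, copied from the `Display` arms in the diff.
fn message(kind: FStringKind) -> &'static str {
    match kind {
        FStringKind::Backslash => "Cannot use an escape sequence (backslash) in f-strings",
        FStringKind::Comment => "Cannot use comments in f-strings",
        FStringKind::NestedQuote => "Cannot reuse outer quote character in f-strings",
    }
}

/// Assumption: the target version is modelled as a `(major, minor)` tuple.
/// Tuples compare lexicographically, so (3, 11) < (3, 12) < (3, 13).
fn supports_pep_701(target: (u8, u8)) -> bool {
    target >= (3, 12)
}

/// Returns the message to report, or `None` when the target already allows the syntax.
fn report(kind: FStringKind, target: (u8, u8)) -> Option<&'static str> {
    (!supports_pep_701(target)).then(|| message(kind))
}
```

With these definitions, `report(FStringKind::Comment, (3, 11))` yields a message, while the same call with `(3, 12)` yields `None`.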
84 changes: 81 additions & 3 deletions crates/ruff_python_parser/src/parser/expression.rs
@@ -11,13 +11,15 @@ use ruff_python_ast::{
};
use ruff_text_size::{Ranged, TextLen, TextRange, TextSize};

use crate::error::{StarTupleKind, UnparenthesizedNamedExprKind};
use crate::error::{FStringKind, StarTupleKind, UnparenthesizedNamedExprKind};
use crate::parser::progress::ParserProgress;
use crate::parser::{helpers, FunctionKind, Parser};
use crate::string::{parse_fstring_literal_element, parse_string_literal, StringType};
use crate::token::{TokenKind, TokenValue};
use crate::token_set::TokenSet;
use crate::{FStringErrorType, Mode, ParseErrorType, UnsupportedSyntaxErrorKind};
use crate::{
FStringErrorType, Mode, ParseErrorType, UnsupportedSyntaxError, UnsupportedSyntaxErrorKind,
};

use super::{FStringElementsKind, Parenthesized, RecoveryContextKind};

@@ -1393,13 +1395,89 @@ impl<'src> Parser<'src> {

self.expect(TokenKind::FStringEnd);

// test_ok pep701_f_string_py312
// # parse_options: {"target-version": "3.12"}
// f'Magic wand: { bag['wand'] }' # nested quotes
// f"{'\n'.join(a)}" # escape sequence
// f'''A complex trick: {
// bag['bag'] # comment
// }'''
// f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}" # arbitrary nesting
// f"{f'''{"nested"} inner'''} outer" # nested (triple) quotes
// f"test {a \
// } more" # line continuation

// test_ok pep701_f_string_py311
// # parse_options: {"target-version": "3.11"}
// f"outer {'# not a comment'}"
// f'outer {x:{"# not a comment"} }'
// f"""{f'''{f'{"# not a comment"}'}'''}"""
// f"""{f'''# before expression {f'# aro{f"#{1+1}#"}und #'}'''} # after expression"""
// f"escape outside of \t {expr}\n"
// f"test\"abcd"

// test_err pep701_f_string_py311
// # parse_options: {"target-version": "3.11"}
// f'Magic wand: { bag['wand'] }' # nested quotes
// f"{'\n'.join(a)}" # escape sequence
// f'''A complex trick: {
// bag['bag'] # comment
// }'''
// f"{f"{f"{f"{f"{f"{1+1}"}"}"}"}"}" # arbitrary nesting
// f"{f'''{"nested"} inner'''} outer" # nested (triple) quotes
// f"test {a \
// } more" # line continuation
// f"""{f"""{x}"""}""" # mark the whole triple quote
// f"{'\n'.join(['\t', '\v', '\r'])}" # multiple escape sequences, multiple errors

let range = self.node_range(start);

if !self.options.target_version.supports_pep_701() {
let quote_bytes = flags.quote_str().as_bytes();
let quote_len = flags.quote_len();
for expr in elements.expressions() {
for slash_position in memchr::memchr_iter(b'\\', self.source[expr.range].as_bytes())
{
let slash_position = TextSize::try_from(slash_position).unwrap();
self.add_unsupported_syntax_error(
UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::Backslash),
TextRange::at(expr.range.start() + slash_position, '\\'.text_len()),
);
}

if let Some(quote_position) =
memchr::memmem::find(self.source[expr.range].as_bytes(), quote_bytes)
{
let quote_position = TextSize::try_from(quote_position).unwrap();
self.add_unsupported_syntax_error(
UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::NestedQuote),
TextRange::at(expr.range.start() + quote_position, quote_len),
);
};
}

self.check_fstring_comments(range);
}

ast::FString {
elements,
range: self.node_range(start),
range,
flags: ast::FStringFlags::from(flags),
}
}

/// Check `range` for comment tokens and report an `UnsupportedSyntaxError` for each one found.
fn check_fstring_comments(&mut self, range: TextRange) {
self.unsupported_syntax_errors
.extend(self.tokens.in_range(range).iter().filter_map(|token| {
token.kind().is_comment().then_some(UnsupportedSyntaxError {
kind: UnsupportedSyntaxErrorKind::Pep701FString(FStringKind::Comment),
range: token.range(),
target_version: self.options.target_version,
})
}));
}

/// Parses a list of f-string elements.
///
/// # Panics
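
The version-gated loop in this hunk scans the source text of each replacement field twice: once for every backslash, and once for the first reuse of the outer quote. The sketch below reproduces that scanning logic using only the standard library. The PR itself uses `memchr::memchr_iter` and `memchr::memmem::find`, so `backslash_offsets` and `first_outer_quote` are hypothetical helpers, and plain `usize` offsets stand in for `TextSize`.

```rust
/// Byte offsets of every backslash in the source text of one replacement field.
/// Hypothetical helper: the PR uses `memchr::memchr_iter`; this uses only `std`.
fn backslash_offsets(expr_src: &str) -> Vec<usize> {
    expr_src
        .bytes()
        .enumerate()
        .filter(|&(_, byte)| byte == b'\\')
        .map(|(offset, _)| offset)
        .collect()
}

/// Byte offset of the first reuse of the outer quote (`'`, `"`, `'''`, or `"""`),
/// if any. Hypothetical helper standing in for `memchr::memmem::find`.
fn first_outer_quote(expr_src: &str, outer_quote: &str) -> Option<usize> {
    expr_src.find(outer_quote)
}
```

As in the diff, every backslash produces its own error, but only the first reused quote in a field is reported. Searching for the full quote string rather than a single character is what lets a reused triple quote be marked as a whole.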
6 changes: 6 additions & 0 deletions crates/ruff_python_parser/src/token.rs
@@ -418,6 +418,12 @@ impl TokenKind {
matches!(self, TokenKind::Comment | TokenKind::NonLogicalNewline)
}

/// Returns `true` if this is a comment token.
#[inline]
pub const fn is_comment(&self) -> bool {
matches!(self, TokenKind::Comment)
}

#[inline]
pub const fn is_arithmetic(self) -> bool {
matches!(
15 changes: 15 additions & 0 deletions crates/ruff_python_parser/src/token_source.rs
@@ -166,6 +166,21 @@ impl<'src> TokenSource<'src> {
self.tokens.truncate(tokens_position);
}

/// Returns a slice of [`Token`] that are within the given `range`.
pub(crate) fn in_range(&self, range: TextRange) -> &[Token] {
let start = self
.tokens
.iter()
.rposition(|tok| tok.start() == range.start());
let end = self.tokens.iter().rposition(|tok| tok.end() == range.end());

let (Some(start), Some(end)) = (start, end) else {
return &self.tokens;
};

&self.tokens[start..=end]
}
Comment on lines +169 to +182
Member

Do you plan on using this method elsewhere?

If not, we could inline the logic in `check_fstring_comments` and simplify it to avoid the iteration for the `end` variable since, I think, the parser is already at that position? So, something like what Micha suggested in #16543 (comment): just iterate over the tokens in reverse order until we reach the f-string start, and report an error for each `Comment` token found.

Contributor Author

I think we need/want a method of some kind because `TokenSource::tokens` is a private field. I could just add a `tokens` getter though, of course.

I also tried this without `end`, but cases like

f'Magic wand: { bag['wand'] }'     # nested quotes

reported new errors on the trailing comment. At the point we do this processing, we've bumped past the `FStringEnd` and any trivia tokens after it, so I think we do need to find the end point as well.

Hmm, maybe a `tokens` getter would be nicest. Then I could at least do all of the processing on a single iterator in `check_fstring_comments`.

Member

@dhruvmanila dhruvmanila Mar 18, 2025

Can we not use the f-string range directly? Or is there something else I'm missing? I don't think the comment is part of the f-string range.

Member

So, the `node_range` calculation avoids any trailing trivia tokens like the one you mentioned in the example above. It does this by keeping track of the end of the previous token, which excludes some tokens such as comments. Here, when you call `node_range`, it gives you a range that doesn't include the trailing comment. If it didn't, the f-string range would be incorrect here.

Member

@dhruvmanila dhruvmanila Mar 18, 2025

Oh, shoot, I think the `tokens` field should still include the trailing comment. Happy to go with whatever you think is best here.

Contributor Author

Yeah, I think that's a good summary. We have the exact f-string range but need to match that up with the actual `Token`s in the `tokens` field, which includes trailing comments.

I tried the `tokens` getter and moving the logic into `check_fstring_comments`, but I do aesthetically prefer how it looked with `self.tokens.in_range...`, even if the `in_range` method itself looks a little weird. So I might just leave it alone for now. Thanks for double checking!
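
The point settled in this thread is that the token buffer already holds the trailing comment when the check runs, so the search needs an end bound as well as a start bound. A simplified sketch of that behaviour follows. `Tok`, `Kind`, `comments_in`, and the offsets in `sample_tokens` are invented for illustration and do not come from the real lexer, and the `first <= last` guard is an addition that the PR's `in_range` does not have.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    FStringStart,
    Name,
    Comment,
    FStringEnd,
}

/// Simplified stand-in for the parser's `Token`, with plain `u32` offsets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Tok {
    kind: Kind,
    start: u32,
    end: u32,
}

/// Same idea as `TokenSource::in_range`: find the token that starts at `start`
/// and the token that ends at `end`, searching from the back of the buffer.
/// Falls back to the whole slice when either bound is not found.
fn in_range(tokens: &[Tok], start: u32, end: u32) -> &[Tok] {
    let first = tokens.iter().rposition(|tok| tok.start == start);
    let last = tokens.iter().rposition(|tok| tok.end == end);
    match (first, last) {
        // The `first <= last` guard is an extra safety check in this sketch only.
        (Some(first), Some(last)) if first <= last => &tokens[first..=last],
        _ => tokens,
    }
}

/// Ranges of comment tokens inside the given f-string range.
fn comments_in(tokens: &[Tok], start: u32, end: u32) -> Vec<(u32, u32)> {
    in_range(tokens, start, end)
        .iter()
        .filter(|tok| tok.kind == Kind::Comment)
        .map(|tok| (tok.start, tok.end))
        .collect()
}

/// Made-up token stream: an f-string spanning 0..16 that contains one comment,
/// followed by a trailing comment that is NOT part of the f-string.
fn sample_tokens() -> Vec<Tok> {
    vec![
        Tok { kind: Kind::FStringStart, start: 0, end: 2 },
        Tok { kind: Kind::Name, start: 3, end: 4 },
        Tok { kind: Kind::Comment, start: 5, end: 14 },
        Tok { kind: Kind::FStringEnd, start: 15, end: 16 },
        Tok { kind: Kind::Comment, start: 18, end: 33 },
    ]
}
```

Scanning backwards from the end of the buffer without an end bound would also flag the comment at 18..33, which is the false positive described in the thread. Bounding the slice by the f-string's own range leaves only the comment at 5..14.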


/// Consumes the token source, returning the collected tokens, comment ranges, and any errors
/// encountered during lexing. The token collection includes both the trivia and non-trivia
/// tokens.