Skip to content

Add syntax extensions for lowercasing/uppercasing strings (Fixes #16607) #16636

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

Manishearth
Copy link
Member

Testcase:

 #![feature(macro_rules)]
macro_rules! identify (
  ($i: ident) => {
    let $i = to_lower!(stringify!($i));
  }
)

fn main() {
    println!("{}", to_lower!("FooBar"))
    println!("{}", to_lower!(stringify!(FooBar)))
    println!("{}", to_upper!("FooBar"))
    println!("{}", to_upper!(stringify!(FooBar)))
    identify!(FooBar);
    println!("{}",FooBar)
}

@lilyball
Copy link
Contributor

#16607 calls them to_ascii_lower!() and to_ascii_upper!(). I'm also inclined to say that if uppercasing is useful, titlecasing is also useful (e.g. where the first letter is uppercased and the rest is left alone).

Some(e) => e,
None => return base::DummyResult::expr(sp)
};
match es.move_iter().next() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If multiple expressions are provided, this will happily process the first and ignore the rest. It should expressly test how many expressions it got.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a test for iter.next().is_some() be okay style-wise?

@SimonSapin
Copy link
Contributor

  • char::to_lowercase does some flavor of Unicode case folding. It includes some fun cases like mapping the Kelvin sign K to ASCII lower case k.
  • str::to_ascii_lower only maps ASCII letters and leaves everything else alone. Although it often looks wrong on arbitrary input, it can make sense when used on a closed set of strings that is entirely ASCII like CSS property names.

The PR currently uses char::to_lowercase, so the to_lower! macro name is consistent.

}
},
_ => ()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hard to read, with all the rightward drift. Ideally you could compress this into a single match, but I don't think you can match the Gc in ExprLit. We can at least get rid of one of the match levels, and remove the return from the inside of this big beast:

let s = match es.move_iter().next() {
    Some(codemap::Spanned { node: ast::ExprLit(lit), .. }) => match lit.node {
        ast::LitStr(ref s, _) => Some(s),
        _ => None
    },
    _ => None
};
match s {
    Some(s) => {
        let new_s = s.get().chars().map(transform).collect::<String>();
        base::MacExpr::new(cx.expr_str(sp, token::intern_and_get_ident(new_s.as_slice())))
    }
    None => {
        cx.span_err(sp, "expected a string literal");
        base::DummyResult::expr(sp)
    }
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also probably pull the span off of the first expression, instead of using the span for the whole macro.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the Gc was giving problems. I couldn't find any way of destructuring the @ (I was surprised that syntax still exists)

This seems like a better option, thanks :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this, there's another Gc on Expr, so I had to use the following (one extra match):

   let s = match iter.next() {
        Some(expr) => {
            match *expr {
                ast::Expr  { node: ast::ExprLit(lit), span: sp2 , ..} => {
                    sp = sp2;
                    match lit.node {
                        ast::LitStr(ref s, _) => Some(s),
                        _ => None
                     }
               },
               _ => None
           }
        },
        _ => None
    };

node doesn't live long enough so I'll have to use s.clone() if I don't want to nest the matches further. Should I? Seems a bit unnecessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It occurs to me the ref s isn't going to work, since it's a reference to a moved value. We can skip the move, though, and we can also produce a better error for providing the wrong argument type vs not providing the right number of arguments. Something like:

let mut it = es.iter();
let res = match it.next() {
    Some(&codemap::Spanned { node: ast::ExprLit(ref lit), .. }) => match lit.node {
        ast::LitStr(ref s, span) => Some((s, span)),
        _ => {
            cx.span_err(span, "expected a string literal");
            None
        }
    },
    _ => {
        cx.span_err(sp, "expected 1 argument, found 0");
        None
    }
};
match (res,it.count()) {
    (Some((s,span)),0) => {
        let new_s = s.get().chars().map(transform).collect::<String>();
        base::MacExpr::new(cx.expr_str(sp, token::intern_and_get_ident(new_s.as_slice())))
    }
    (_, rest) => {
        if rest > 0 {
            cx.span_err(sp, format!("expected 1 argument, found {}", rest+1).as_slice());
        }
        base::DummyResult::expr(sp)
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is there a codemap::Spanned there? Shouldn't it be an ast::Ext?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I was assuming that Expr was a Spanned<Expr_>. It actually isn't, which is surprising, and is apparently due to the need for a separate id field. Anyway, I'm guessing the following should work:

fn expand_cased(cx: &mut ExtCtxt, sp: Span, tts: &[ast::TokenTree], transform: |char| -> char)
                        -> Box<base::MacResult> {
    let es = match base::get_exprs_from_tts(cx, sp, tts) {
        Some(e) => e,
        None => return base::DummyResult::expr(sp)
    };

    let mut it = es.iter();
    let res = match it.next() {
        // FIXME (#13910): nested matches are necessary to get through Gc<>
        Some(expr) => match expr.node {
            ast::ExprLit(ref lit) => match lit.node {
                ast::LitStr(ref s, span) => Some((s, span)),
                _ => {
                    cx.span_err(span, "expected a string literal");
                    None
                }
            }
            _ => {
                cx.span_err(expr.span, "expected a string literal");
                None
            }
        },
        None => {
            cx.span_err(sp, "expected 1 argument, found 0");
            None
        }
    };
    match (res,it.count()) {
        (Some((s,span)),0) => {
            let new_s = s.get().chars().map(transform).collect::<String>();
            base::MacExpr::new(cx.expr_str(span, token::intern_and_get_ident(new_s.as_slice())))
        }
        (_, rest) => {
            if rest > 0 {
                cx.span_err(sp, format!("expected 1 argument, found {}", rest+1).as_slice());
            }
            base::DummyResult::expr(sp)
        }
    }
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks. Should the sp in the final MacExpr be sp or span? It might end up with the wrong span on expansion -- I'm not entirely clear on how ExtCtxt works to be sure of this, and I'd rather not wait for two full builds to see ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I meant to make that span. I'll edit my comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were more issues with span being taken from the wrong place (should
be lit.span, not the second value of ExprLit), but I fixed that.

Thanks for the help!

-Manish Goregaokar

On Thu, Aug 21, 2014 at 4:59 AM, Kevin Ballard notifications@github.com
wrote:

In src/libsyntax/ext/source_util.rs:

  • };
  • match es.move_iter().next() {
  •    Some(expr) => {
    
  •        match expr.node {
    
  •            ast::ExprLit(lit) => match lit.node {
    
  •                ast::LitStr(ref s, _) => {
    
  •                    return base::MacExpr::new(cx.expr_str(sp,
    
  •                        token::intern_and_get_ident(s.get().chars().map(transform).collect::<String>().as_slice())))
    
  •                },
    
  •                _ => ()
    
  •            },
    
  •            _ => ()
    
  •        }
    
  •    },
    
  •    _ => ()
    
  • }

Good catch, I meant to make that span. I'll edit my comment.


Reply to this email directly or view it on GitHub
https://github.com/rust-lang/rust/pull/16636/files#r16512839.

@lilyball
Copy link
Contributor

@SimonSapin Ok, fair enough.

@SimonSapin
Copy link
Contributor

From chatting on IRC with @Manishearth: either kind of lower casing works for Servo’s use case since they behave the same when the input is within ASCII, which we know because it’s compile time (no arbitrary input from web content).

@Manishearth
Copy link
Member Author

@SimonSapin I can add to_ascii_* variants if we want, since all I have to do is add variants of the transform closure.

With some changes (accept a closure that transforms on the whole string isntead of individual chars) I can add titlecasing (and snakecasing) support as well, if these would be useful.

@lilyball
Copy link
Contributor

@Manishearth I don't think there's a need for the ascii variants. I only brought it up because the original issue talked about to_ascii_lower!() but if the motivating use-case doesn't care, then it's better to have fewer macros.

@Manishearth
Copy link
Member Author

Updated with the style changes. I'll look into adding to_titlecase! tomorrow.

@alexcrichton
Copy link
Member

This is an addition to the prelude for all rust programs in existence, and should be considered with the gravity as such. Currently we require any modifications of the prelude (additions or removals) such as this to go through the RFC process.

This is also why we have created a plugin infrastructure for developing extensions such as this out of tree.

@Manishearth
Copy link
Member Author

Alright, I'll draft up an RFC when I get time.

-Manish Goregaokar

On Thu, Aug 21, 2014 at 9:51 PM, Alex Crichton notifications@github.com
wrote:

This is an addition to the prelude for all rust programs in existence, and
should be considered with the gravity as such. Currently we require any
modifications of the prelude (additions or removals) such as this to go
through the RFC process.

This is also why we have created a plugin infrastructure for developing
extensions such as this out of tree.


Reply to this email directly or view it on GitHub
#16636 (comment).

@SimonSapin
Copy link
Contributor

Alternatively, Servo could define these syntax extensions itself in its macros crate.

@Manishearth
Copy link
Member Author

That would work. I'll have a go.

-Manish Goregaokar

On Thu, Aug 21, 2014 at 9:57 PM, Simon Sapin notifications@github.com
wrote:

Alternatively, Servo could define these syntax extensions itself in its
macros crate.


Reply to this email directly or view it on GitHub
#16636 (comment).

@alexcrichton
Copy link
Member

It sounds like this may be able to live outside the compiler in servo, so I'm going to close this for now. Feel free to reopen if any surprises arise!

Swatinem added a commit to Swatinem/servo that referenced this pull request Feb 3, 2015
The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth
Swatinem added a commit to Swatinem/servo that referenced this pull request Feb 3, 2015
The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth
bors-servo pushed a commit to servo/servo that referenced this pull request Feb 3, 2015
The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth
bors-servo pushed a commit to servo/servo that referenced this pull request Feb 3, 2015
The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth
bors-servo pushed a commit to servo/servo that referenced this pull request Feb 4, 2015
The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth
@Manishearth Manishearth deleted the lower branch September 3, 2015 17:46
jrmuizel pushed a commit to jrmuizel/gecko-cinnabar that referenced this pull request Jun 12, 2017
…28 (from Swatinem:lowercasegetters); r=Manishearth

The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to @Manishearth

Source-Repo: https://github.com/servo/servo
Source-Revision: 902c16497c40684930819693a7e90f0862eb7f56
@mmstick
Copy link

mmstick commented Dec 13, 2018

What's the current status on getting lowercase / uppercase / kebab-case / title-case / etc. variants of strings in macros? I have a lot of use cases for this in various libraries where I often need to create an enum, and then derive both impl From<Enum> for &'static str and impl FromStr for Enum; or when interacting with a JSON API that uses kebab case.

@Manishearth
Copy link
Member Author

Manishearth commented Dec 13, 2018 via email

gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Sep 30, 2019
…28 (from Swatinem:lowercasegetters); r=Manishearth

The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to Manishearth

Source-Repo: https://github.com/servo/servo
Source-Revision: 902c16497c40684930819693a7e90f0862eb7f56

UltraBlame original commit: 1b9c7a69c746676dcf68b519877913b5b2186a56
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 1, 2019
…28 (from Swatinem:lowercasegetters); r=Manishearth

The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to Manishearth

Source-Repo: https://github.com/servo/servo
Source-Revision: 902c16497c40684930819693a7e90f0862eb7f56

UltraBlame original commit: 1b9c7a69c746676dcf68b519877913b5b2186a56
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 1, 2019
…28 (from Swatinem:lowercasegetters); r=Manishearth

The implementation was copied directly from
rust-lang/rust#16636
and updated for rust changes, so the credit goes to Manishearth

Source-Repo: https://github.com/servo/servo
Source-Revision: 902c16497c40684930819693a7e90f0862eb7f56

UltraBlame original commit: 1b9c7a69c746676dcf68b519877913b5b2186a56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants