Add to documentation of -a in perlrun #9
     Closed
      
        
      
    
                
This repo is a mirror only; see perlhack/Super Quick Patch Guide.

Thanks. On Fri, Jul 31, 2015 at 2:18 AM, Martin McGrath notifications@github.com
    
p5p pushed a commit that referenced this pull request on Sep 15, 2016

This macro follows Unicode Corrigendum #9 to allow non-character code points. These are still discouraged but not completely forbidden. Code that isn't intended to operate on arbitrary other code's text is best off using the original definition, but code that does, such as source code control, should switch to this definition if it wants to be Unicode-strict. Perl can't adopt C9 wholesale, as that might create security holes in existing applications that rely on Perl keeping non-chars out.
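The difference between the two definitions can be illustrated with a minimal sketch. The function names below are hypothetical and are not Perl's actual macros; the code only encodes the Unicode facts that surrogates are U+D800..U+DFFF and that non-characters are U+FDD0..U+FDEF plus the last two code points of every plane.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: surrogates occupy U+D800..U+DFFF. */
static bool is_surrogate(uint32_t cp)
{
    return cp >= 0xD800 && cp <= 0xDFFF;
}

/* Hypothetical helper: non-characters are U+FDD0..U+FDEF and any
 * code point up to U+10FFFF whose low 16 bits are FFFE or FFFF. */
static bool is_noncharacter(uint32_t cp)
{
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

/* Original (strict) definition: rejects surrogates, anything above
 * U+10FFFF, and non-characters. */
static bool is_strict_interchange(uint32_t cp)
{
    return cp <= 0x10FFFF && !is_surrogate(cp) && !is_noncharacter(cp);
}

/* Corrigendum #9 definition: same as strict, except that
 * non-characters are allowed. */
static bool is_c9_interchange(uint32_t cp)
{
    return cp <= 0x10FFFF && !is_surrogate(cp);
}
```

Under these assumptions, U+FDD0 fails the strict check but passes the C9 check, while U+D800 and 0x110000 fail both.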
    
    
p5p pushed a commit that referenced this pull request on Sep 17, 2016
    
    
      
  
    
      
    
  
    
p5p pushed a commit that referenced this pull request on Sep 18, 2016
    
    
      
  
    
      
    
  
This was referenced on Oct 18, 2019 by four items, each marked Closed.
  
    
demerphq added a commit that referenced this pull request on Sep 9, 2022
demerphq added a commit that referenced this pull request on Sep 12, 2022
demerphq added a commit that referenced this pull request on Oct 25, 2022
demerphq added a commit that referenced this pull request on Nov 5, 2022
demerphq added a commit that referenced this pull request on Nov 5, 2022
demerphq added a commit that referenced this pull request on Nov 5, 2022
demerphq added a commit that referenced this pull request on Nov 5, 2022
demerphq added a commit that referenced this pull request on Dec 31, 2022
    
    
  
    
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023

- floor(), abs_floor(), ceil() and abs_ceil() added
- roundoption integrated as fifth argument to format_number()
- see Perl#9
    
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023

- to export all constants use :constants or :all
- see Perl#9
    
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jan 30, 2023

- Documentation changes
- explain undef as argument to round() and format_number()
- see Perl#9
    
demerphq added a commit that referenced this pull request on Feb 8, 2023
demerphq added a commit that referenced this pull request on Feb 19, 2023
demerphq added a commit that referenced this pull request on Feb 20, 2023
    
    
  
    
khwilliamson pushed a commit to khwilliamson/perl5 that referenced this pull request on Jul 16, 2023

fix stack usage in vcmp method (cmp overload)
    
khwilliamson added a commit to khwilliamson/perl5 that referenced this pull request on Sep 28, 2024

I'm uncertain about this commit.

There are three separate DFA tables already in core. One accepts Perl extended UTF-8; one accepts only strict Unicode UTF-8; and the third accepts the modified Unicode UTF-8 that Unicode spells out in Corrigendum #9. Both Unicode varieties reject surrogate code points and anything above U+10FFFF. C9 accepts non-character code points; the strict variety rejects them.

Without this commit, the function always uses the most restrictive table for the DFA. Anything that table accepts is always valid. Anything it rejects is potentially problematic, so the function calls a non-inlined function that examines the input more slowly to determine whether it is acceptable and/or whether a warning needs to be raised.

This commit examines the input flags to determine which DFA to use. The benefit is that the slower routine can be avoided for many more code points. But the vast majority of calls to this function aren't for problematic code points, so the extra cost will very rarely be recouped. Translation from UTF-8 is critically important, and we want it to be as fast as possible. I would not even consider this commit if the extra cost weren't very small.

A complicating factor is that 2048 Korean Hangul syllable code points (approximately 20% of the total) are not handled by the strict table, so they must be handled by the slower function, though it deals with them at its very beginning. These code points are never problematic, so it is unfortunate that they have to go through the slower function. Still, this function will rarely be called with them. Only the strict table has this problem.

The commit works by having a table containing pointers to the three DFA tables. The function looks at the input flags: if none are present, it uses the loosest DFA; if any restrictions are present, it adds 1 to the index; and if the C9 restrictions are present, it adds an extra 1. The flags are cast to bools to get each addition.

If the bool casts didn't generate conditionals, the only cost would be two additions and an indirection, and I would say that cost is so tiny that this would be worth it. But I looked at godbolt, and casting to bool requires a comparison on both modern clang and gcc. That makes me unsure of the tradeoff.

Another option would be to use just two DFAs, loose and most strict. Then there would be a single conditional, and the Hanguls would still be handled by the DFA when no flags restrict things.
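The indexing scheme described in the commit message might look roughly like the sketch below. This is not the code from the commit: the flag names, the placeholder tables, and the assignment of tables to slots 1 and 2 are all assumptions made for illustration. Only the "two bool casts added together, then one indirection" shape comes from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real flag names and values differ. */
#define RESTRICT_OTHER 0x1u  /* some restriction other than the C9 ones */
#define RESTRICT_C9    0x2u  /* the C9 restrictions */
#define RESTRICT_ANY   (RESTRICT_OTHER | RESTRICT_C9)

/* Placeholder tables standing in for the three DFA tables in core. */
static const uint8_t dfa_loose[1]      = { 0 };
static const uint8_t dfa_restricted[1] = { 0 };
static const uint8_t dfa_c9[1]         = { 0 };

/* Table of pointers to the DFA tables, indexed 0..2.
 * The slot order is an assumption based on the commit message. */
static const uint8_t *const dfa_tables[3] = {
    dfa_loose, dfa_restricted, dfa_c9
};

/* Each cast to bool yields 0 or 1, so the index is the sum of two
 * additions; no explicit if/else appears in the source. */
static unsigned dfa_index(uint32_t flags)
{
    return (unsigned)(bool)(flags & RESTRICT_ANY)
         + (unsigned)(bool)(flags & RESTRICT_C9);
}

/* One indirection picks the table to drive the DFA with. */
static const uint8_t *select_dfa(uint32_t flags)
{
    return dfa_tables[dfa_index(flags)];
}
```

As the message notes, compilers may still emit a comparison for each bool cast, so the absence of an `if` in the source does not by itself mean the selection costs only two additions and an indirection.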
    
khwilliamson added a commit to khwilliamson/perl5 that referenced this pull request on Sep 29, 2024
    
    
      
  
    
      
    
  
    
    
khwilliamson added a commit to khwilliamson/perl5 that referenced this pull request on Oct 1, 2024
    
    
      
  
    
      
    
  
  
  
    
  
    
No description provided.