diff --git a/docs/core/tools/dotnet-install-script.md b/docs/core/tools/dotnet-install-script.md index 89c4f4ed69ffd..dd81e19ee4f24 100644 --- a/docs/core/tools/dotnet-install-script.md +++ b/docs/core/tools/dotnet-install-script.md @@ -2,7 +2,8 @@ title: dotnet-install scripts | .NET Core SDK description: Learn about the dotnet-install scripts to install the .NET Core CLI tools and the shared runtime. keywords: dotnet-install, dotnet-install scripts, .NET Core -author: mairaw +author: blackdwarf +ms.author: mairaw manager: wpickett ms.date: 10/12/2016 ms.topic: article @@ -42,7 +43,7 @@ By default, the script will add the install location to the $PATH for the curren Before running the script, please install all the required [dependencies](https://github.com/dotnet/core/blob/master/Documentation/prereqs.md). -You can install a specific version using the `--version` argument. The version needs to be specified as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) file found in the hierarchy above the folder where the script was invoked in that contains the `sdkVersion` property. If that is not present, it will use Latest. +You can install a specific version using the `--version` argument. The version needs to be specified as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) file found in the hierarchy above the folder where the script was invoked that contains the `version` property. If that is not present, it will use Latest. You can also use this script to get the SDK or shared runtime debug binaries with debug symbols by using the `--debug` argument. If you do not do this on first install and realize you do need debug symbols later on, you can re-run the script with this argument and the version of the bits you installed. @@ -56,7 +57,7 @@ Which channel (for example, `future`, `preview`, `production`) to install from. 
`-Version [VERSION]` -Which version of CLI to install; you need to specify the version as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) that contains the [sdkVersion](global-json.md#sdkversion) property; if that is not present, it will use Latest. +Which version of CLI to install; you need to specify the version as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) that contains the `version` property; if that is not present, it will use Latest. `-InstallDir [DIR]` @@ -83,7 +84,7 @@ Which channel (for example "future", "preview", "production") to install from. T `--version [VERSION]` -Which version of CLI to install; you need to specify the version as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) that contains the [sdkVersion](global-json.md#sdkversion) property; if that is not present, it will use Latest. +Which version of CLI to install; you need to specify the version as 3-part version (for example, 1.0.0-13232). If omitted, it will default to the first [global.json](global-json.md) that contains the `version` property; if that is not present, it will use Latest. 
`--install-dir [DIR]` diff --git a/docs/core/tools/index.md b/docs/core/tools/index.md index 1bbbe0fecc943..53f92970ba981 100644 --- a/docs/core/tools/index.md +++ b/docs/core/tools/index.md @@ -2,7 +2,8 @@ title: .NET Core Command-Line Interface (CLI) Tools description: An overview of what the Command-Line Interface (CLI) is and its main features keywords: CLI, CLI tools, .NET, .NET Core -author: mairaw +author: blackdwarf +ms.author: mairaw manager: wpickett ms.date: 10/06/2016 ms.topic: article @@ -12,11 +13,9 @@ ms.devlang: dotnet ms.assetid: b70e9ac0-c8be-49f7-9332-95ab93e0e7bc --- -# .NET Core Command-Line Interface Tools +# .NET Core command-line interface tools -By [Zlatko Knezevic](https://github.com/blackdwarf) and [Maira Wenzel](https://github.com/mairaw) - -The .NET Core Command-Line Interface (CLI) is a new foundational cross-platform toolchain for developing +The .NET Core command-line interface (CLI) is a new foundational cross-platform toolchain for developing .NET Core applications. It is "foundational" because it is the primary layer on which other, higher-level tools, such as Integrated Development Environments (IDEs), editors and build orchestrators can build on. @@ -90,7 +89,7 @@ specify a portable app DLL that `dotnet` would run similar to this: `dotnet /pat In the second case, the driver attempts to invoke the specified command. This starts the CLI command execution process. First, the driver determines the version of the tooling that you want. You can specify the version in the -[global.json](global-json.md) file using the [sdkVersion](global-json.md#sdkversion) property. If that is not available, the driver finds the latest version +[global.json](global-json.md) file using the `version` property. If that is not available, the driver finds the latest version of the tools that is installed on disk and uses that version. Once the version is determined, it executes the command. 
@@ -127,4 +126,4 @@ in the [.NET Core CLI extensibility model](extensibility.md) topic. This was a short overview of the most important features of the CLI. You can find out more by using the reference and conceptual topics on this site. There are also other resources you can use: * [dotnet/CLI](https://github.com/dotnet/cli/) GitHub repo -* [Getting Started instructions](https://aka.ms/dotnetcoregs/) +* [Getting started instructions](https://aka.ms/dotnetcoregs/) diff --git a/docs/core/tools/project-json.md b/docs/core/tools/project-json.md index f476bfc7bde7f..250b18a5bf28a 100644 --- a/docs/core/tools/project-json.md +++ b/docs/core/tools/project-json.md @@ -3,6 +3,7 @@ title: project.json reference description: project.json reference keywords: .NET, .NET Core, project.json author: aL3891 +ms.author: mairaw manager: wpickett ms.date: 09/30/2016 ms.topic: article @@ -363,10 +364,10 @@ Type: String Specifies the type of the dependency. It can be one of the following values: `default`, `build` or `platform`. The default value is `default`. `build` is known as a development dependency and is only used for build-time. It means that the package should not be published or added as a dependency to the output `.nupkg` file. -It has the same effect of setting [supressParent](#supressParent) to `all`. +It has the same effect of setting [supressParent](#supressparent) to `all`. `platform` references the shared SDK. For more information, see the section on "Deploying a framework-dependent deployment with third-party dependencies" on the -[.NET Core Application Deployment](../deploying/index.md) topic. +[.NET Core application deployment](../deploying/index.md) topic. 
For example: diff --git a/docs/csharp/csharp-7.md b/docs/csharp/csharp-7.md index da1f2f1673ce5..f7c29617ab08e 100644 --- a/docs/csharp/csharp-7.md +++ b/docs/csharp/csharp-7.md @@ -3,6 +3,7 @@ title: What's New in C# 7 | C# Guide description: Get an overview of the new features coming in the upcoming version 7 of the C# language. keywords: C#, .NET, .NET Core, Latest Features, What's New author: BillWagner +ms.author: wiwagn manager: wpickett ms.date: 10/03/2016 ms.topic: article @@ -20,15 +21,15 @@ C# 7 adds a number of new features to the C# language: * [Tuples](#tuples) * [Pattern Matching](#pattern-matching) * [`ref` locals and returns](#ref-locals-and-returns) -* [Local Functions](#local-expressions) -* [Expression Bodied Everything](#expression-bodied-everything) -* [`throw` Expressions](#throw-expressions) +* [Local Functions](#local-functions) +* [Expression Bodied Everything](#expression-bodied-everything-preview-5) +* [`throw` Expressions](#throw-expressions-preview-5) * [Generalized async return types](#generalized-async-return-types) * [Numeric literal syntax improvements](#numeric-literal-syntax-improvements) Two of the most interesting features don't make that list. The first is the shortened release cycle. C# 7 is following C# 6 much more quickly. -The second is that C# 7 has [features contributed by the community](#expression-bodied-everything), not +The second is that C# 7 has [features contributed by the community](#expression-bodied-everything-preview-5), not the C# compiler team. The language is truly open. 
The remainder of this topic provides an overview diff --git a/docs/fsharp/tutorials/getting-started/getting-started-visual-studio.md b/docs/fsharp/tutorials/getting-started/getting-started-visual-studio.md index 75ebf42c86abd..17917863655c7 100644 --- a/docs/fsharp/tutorials/getting-started/getting-started-visual-studio.md +++ b/docs/fsharp/tutorials/getting-started/getting-started-visual-studio.md @@ -3,7 +3,8 @@ title: Getting Started with F# in Visual Studio description: Learn how to use F# with Visual Studio. keywords: visual f#, f#, functional programming author: cartermp -manager: danielfe +ms.author: phcart +manager: wpickett ms.date: 09/08/2016 ms.topic: article ms.prod: visual-studio-dev14 @@ -11,37 +12,37 @@ ms.technology: devlang-fsharp ms.assetid: 8db75596-19a9-4eda-b20d-a12d517c8cc1 --- -# Getting Started with F# in Visual Studio +# Getting started with F# in Visual Studio F# and the Visual F# tooling are supported in the Visual Studio IDE. To begin, you should [download Visual Studio](https://www.visualstudio.com/downloads/download-visual-studio-vs), if you haven't already. This article uses the Visual Studio 2015 Community Edition, but you can use F# with the version of your choice. -## Installing the Visual F# Tools +## Installing the Visual F# tools Visual Studio will first initialize the installer. After it is intilized, select **Custom** as shown here: -![](media/getting-started-vs/vs2015-install-1.png) +![Select Custom install](./media/getting-started-vs/vs2015-install-1.png) Select the Visual F# Tools under Programming Languages here: -![](media/getting-started-vs/vs2015-install-2.png) +![Visual F#](./media/getting-started-vs/vs2015-install-2.png) Feel free to customize your installation further, and then continue with the installation. After a while, Visual Studio will complete installation and you can create an F# project! 
-## Creating a Console Application +## Creating a console application One of the most basic projects in Visual Studio is the Console Application. Here's how to do it. Once Visual Studio is open: 1. On the **File** menu, point to **New**, and then choose **Project**. -![](media/getting-started-vs/vs2015-install-3.png) +![File New Project](./media/getting-started-vs/vs2015-install-3.png) 2. In the New Project dialog, under **Templates**, you should see **Visual F#**. Choose this to show the F# templates. -![](media/getting-started-vs/vs2015-install-4.png) +![Visual F# templates](./media/getting-started-vs/vs2015-install-4.png) 3. Choose the **Okay** button to create the F# project! You should see something like this under **Solution Explorer**: -![](media/getting-started-vs/vs2015-install-5.png) +![F# Project in Solution Explorer](./media/getting-started-vs/vs2015-install-5.png) ## Writing your code @@ -61,7 +62,7 @@ Another function, `main`, is defined, which is decorated with the `EntryPoint` a It is in this function that we call the `square` function with an argument of `12`. The F# compiler then assigns the type of `square` to be `int -> int` (that is, a function which takes an `int` and produces an `int`). The call to `printfn` is a formatted printing function which uses a format string, similar to C-style programming languages, parameters which correspond to those specified in the format string, and then prints the result and a new line. -## Running Your Code +## Running your code You can run the code and see results by pressing **ctrl-f5**. This will run the program without debugging and allows you to see the results. Alternatively, you can choose the **Debug** top-level menu item in Visual Studio and choose **Start Without Debugging**. @@ -125,18 +126,18 @@ The pipe-forward operator, and more, are covered in later tutorials. This is only a glimpse into what you can do with F# Interactive. 
To learn more, check out [Interactive Programming with F#](../fsharp-interactive/index.md). -## Next Steps +## Next steps If you haven't already, check out the [Tour of F#](../../tour.md), which covers some of the core features of the F# language. It will give you an overview of some of the capabilities of F#, and provide ample code samples that you can copy into Visual Studio and run. There are also some great external resources you can use, showcased in the [F# Guide](../../index.md). -## See Also +## See also [Visual F#](../../index.md) [Tour of F#](../../tour.md) -[F# Language Reference](../../language-reference/index.md) +[F# language reference](../../language-reference/index.md) -[Type Inference](../../language-reference/type-inference.md) +[Type inference](../../language-reference/type-inference.md) -[Symbol and Operator Reference](../../language-reference/symbol-and-operator-reference/index.md) +[Symbol and operator reference](../../language-reference/symbol-and-operator-reference/index.md) diff --git a/docs/fsharp/tutorials/getting-started/getting-started-vscode.md b/docs/fsharp/tutorials/getting-started/getting-started-vscode.md index 5570914d4d631..2356fbe159cc6 100644 --- a/docs/fsharp/tutorials/getting-started/getting-started-vscode.md +++ b/docs/fsharp/tutorials/getting-started/getting-started-vscode.md @@ -3,6 +3,7 @@ title: Getting Started with F# in Visual Studio Code with Ionide description: Learn how to use F# with Visual Studio Code and the Ionide plugin suite. keywords: visual f#, f#, functional programming, .NET, Visual Studio Code, vscode, Ionide author: cartermp +ms.author: phcart manager: wpickett ms.date: 09/28/2016 ms.topic: article @@ -24,7 +25,7 @@ F# 4.0 or higher must be installed on your machine to use Ionide. If you're on Windows, you have two options for installing F#. -If you've already installed Visual Studio and don't have F#, you can [Install the Visual F# Tools](getting-started-visual-studio.md#installing-the-visual-f#-tools). 
This will install all the necessary components to write, compile, and execute F# code. +If you've already installed Visual Studio and don't have F#, you can [Install the Visual F# Tools](getting-started-visual-studio.md#installing-the-visual-f-tools). This will install all the necessary components to write, compile, and execute F# code. If you prefer not to install Visual Studio, use the following instructions: diff --git a/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-1.PNG b/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-1.png similarity index 100% rename from docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-1.PNG rename to docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-1.png diff --git a/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-2.PNG b/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-2.png similarity index 100% rename from docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-2.PNG rename to docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-2.png diff --git a/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-3.PNG b/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-3.png similarity index 100% rename from docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-3.PNG rename to docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-3.png diff --git a/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-4.PNG b/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-4.png similarity index 100% rename from docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-4.PNG rename to docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-4.png diff 
--git a/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-5.PNG b/docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-5.png similarity index 100% rename from docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-5.PNG rename to docs/fsharp/tutorials/getting-started/media/getting-started-vs/vs2015-install-5.png diff --git a/docs/standard/base-types/alternation.md b/docs/standard/base-types/alternation.md index ac306858cb823..9af655b68aa1f 100644 --- a/docs/standard/base-types/alternation.md +++ b/docs/standard/base-types/alternation.md @@ -3,6 +3,7 @@ title: Alternation constructs in regular expressions description: Alternation constructs in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -22,7 +23,7 @@ Alternation constructs modify a regular expression to enable either/or or condit * Conditional matching based on a valid captured group -## Pattern Matching with | +## Pattern matching with | You can use the vertical bar (|) character to match any one of a series of patterns, where the | character separates each pattern. @@ -149,7 +150,7 @@ Pattern | Description `(\d{2}-\d{7}|;\d{3}-\d{2}-\d{4})` | Match either of the following: two decimal digits followed by a hyphen followed by seven decimal digits; or three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits. `\d` | End the match at a word boundary. -## Conditional Matching with an Expression +## Conditional matching with an expression This language element attempts to match one of two patterns depending on whether it can match an initial pattern. Its syntax is: @@ -159,10 +160,10 @@ where *expression* is the initial pattern to match, *yes* is the pattern to matc **(?(?**=_expression_**)**_yes_**|**_no_**)** -where **(?**=_expression_**)** is a zero-width assertion construct. 
(For more information, see [Grouping Constructs in Regular Expressions](grouping.md).) Because the regular expression engine interprets *expression* as an anchor (a zero-width assertion), *expression* must either be a zero-width assertion (for more information, see [Anchors in Regular Expressions](anchors.md)) or a subexpression that is also contained in *yes*. Otherwise, the *yes* pattern cannot be matched. +where **(?**=_expression_**)** is a zero-width assertion construct. (For more information, see [Grouping constructs in regular expressions](grouping.md).) Because the regular expression engine interprets *expression* as an anchor (a zero-width assertion), *expression* must either be a zero-width assertion (for more information, see [Anchors in regular expressions](anchors.md)) or a subexpression that is also contained in *yes*. Otherwise, the *yes* pattern cannot be matched. > [!NOTE] -> If *expression* is a named or numbered capturing group, the alternation construct is interpreted as a capture test; for more information, see the next section, [Conditional Matching Based on a Valid Capture Group](#Conditional-Matching-Based-on-a-Valid-Capture-Group). In other words, the regular expression engine does not attempt to match the captured substring, but instead tests for the presence or absence of the group. +> If *expression* is a named or numbered capturing group, the alternation construct is interpreted as a capture test; for more information, see the next section, [Conditional matching based on a valid captured group](#conditional-matching-based-on-a-valid-captured-group). In other words, the regular expression engine does not attempt to match the captured substring, but instead tests for the presence or absence of the group. The following example is a variation of the example that appears in the previous section. It uses conditional matching to determine whether the first three characters after a word boundary are two digits followed by a hyphen. 
If they are, it attempts to match a U.S. Employer Identification Number (EIN). If not, it attempts to match a U.S. Social Security Number (SSN). @@ -217,7 +218,7 @@ Pattern | Description `\d{3}-\d{2}-\d{4}` | If the previous pattern does not match, match three decimal digits, a hyphen, two decimal digits, another hyphen, and four decimal digits. `\b` | Match a word boundary. -## Conditional Matching Based on a Valid Captured Group +## Conditional matching based on a valid captured group This language element attempts to match one of two patterns depending on whether it has matched a specified capturing group. Its syntax is: diff --git a/docs/standard/base-types/anchors.md b/docs/standard/base-types/anchors.md index 46b379239aa89..2e6d4396ce7e3 100644 --- a/docs/standard/base-types/anchors.md +++ b/docs/standard/base-types/anchors.md @@ -3,6 +3,7 @@ title: Anchors in regular expressions description: Anchors in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -29,7 +30,7 @@ Anchor | Description ## Start of String or Line: ^ -The **^** anchor specifies that the following pattern must begin at the first character position of the string. If you use **^** with the [RegexOptions.Multiline](xref:System.Text.RegularExpressions.RegexOptions.Multiline) option (see [Regular Expression Options](options.md)), the match must occur at the beginning of each line. +The **^** anchor specifies that the following pattern must begin at the first character position of the string. If you use **^** with the [RegexOptions.Multiline](xref:System.Text.RegularExpressions.RegexOptions.Multiline) option (see [Regular expression options](options.md)), the match must occur at the beginning of each line. The following example uses the **^** anchor in a regular expression that extracts information about the years during which some professional baseball teams existed. 
The example calls two overloads of the `Regex.Matches` method: @@ -775,7 +776,7 @@ Pattern | Description ## Word Boundary: \b -The **\b** anchor specifies that the match must occur on a boundary between a word character (the **\w** language element) and a non-word character (the **\W** language element). Word characters consist of alphanumeric characters and underscores; a non-word character is any character that is not alphanumeric or an underscore. (For more information, see [Character Classes in Regular Expressions](classes.md).) The match may also occur on a word boundary at the beginning or end of the string. +The **\b** anchor specifies that the match must occur on a boundary between a word character (the **\w** language element) and a non-word character (the **\W** language element). Word characters consist of alphanumeric characters and underscores; a non-word character is any character that is not alphanumeric or an underscore. (For more information, see [Character classes in regular expressions](classes.md).) The match may also occur on a word boundary at the beginning or end of the string. The **\b** anchor is frequently used to ensure that a subexpression matches an entire word instead of just the beginning or end of a word. The regular expression `\bare\w*\b` in the following example illustrates this usage. It matches any word that begins with the substring "are". The output from the example also illustrates that **\b** matches both the beginning and the end of the input string. 
diff --git a/docs/standard/base-types/backreference.md b/docs/standard/base-types/backreference.md index 911bf12355ed2..cb65ac9786cee 100644 --- a/docs/standard/base-types/backreference.md +++ b/docs/standard/base-types/backreference.md @@ -3,6 +3,7 @@ title: Backreference constructs in regular expressions description: Backreference constructs in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -17,9 +18,9 @@ ms.assetid: c453ed78-650f-4c3c-9ab4-9d89d250bf88 Backreferences provide a convenient way to identify a repeated character or substring within a string. For example, if the input string contains multiple occurrences of an arbitrary substring, you can match the first occurrence with a capturing group, and then use a backreference to match subsequent occurrences of the substring. > [!NOTE] -> A separate syntax is used to refer to named and numbered capturing groups in replacement strings. For more information, see [Substitutions in Regular Expressions](substitutions.md). +> A separate syntax is used to refer to named and numbered capturing groups in replacement strings. For more information, see [Substitutions in regular expressions](substitutions.md). -.NET defines separate language elements to refer to numbered and named capturing groups. For more information about capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md). +.NET defines separate language elements to refer to numbered and named capturing groups. For more information about capturing groups, see [Grouping constructs in regular expressions](grouping.md). 
## Numbered Backreferences diff --git a/docs/standard/base-types/backtracking.md b/docs/standard/base-types/backtracking.md index 7d1ced8e401ae..dbcec334e3c5d 100644 --- a/docs/standard/base-types/backtracking.md +++ b/docs/standard/base-types/backtracking.md @@ -3,6 +3,7 @@ title: Backtracking in regular expressions description: Backtracking in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -21,15 +22,15 @@ Backtracking occurs when a regular expression pattern contains optional [quantif This topic contains the following sections: -* [Linear Comparison Without Backtracking](#Linear-Comparison-Without-Backtracking) +* [Linear comparison without backtracking](#linear-comparison-without-backtracking) -* [Backtracking with Optional Quantifiers or Alternation Constructs](#Backtracking-with-Optional-Quantifiers-or-Alternation-Constructs) +* [Backtracking with optional quantifiers or alternation constructs](#backtracking-with-optional-quantifiers-or-alternation-constructs) -* [Backtracking with Nested Optional Quantifiers](#Backtracking-with-Nested-Optional-Quantifiers) +* [Backtracking with nested optional quantifiers](#backtracking-with-nested-optional-quantifiers) -* [Controlling Backtracking](#Controlling_Backtracking) +* [Controlling backtracking](#controlling-backtracking) -## Linear Comparison Without Backtracking +## Linear comparison without backtracking If a regular expression pattern has no optional quantifiers or alternation constructs, the regular expression engine executes in linear time. That is, after the regular expression engine matches the first language element in the pattern with text in the input string, it tries to match the next language element in the pattern with the next character or group of characters in the input string. This continues until the match either succeeds or fails. 
In either case, the regular expression engine advances by one character at a time in the input string. @@ -98,7 +99,7 @@ Operation | Position in pattern | Position in string | Result If a regular expression pattern includes no optional quantifiers or alternation constructs, the maximum number of comparisons required to match the regular expression pattern with the input string is roughly equivalent to the number of characters in the input string. In this case, the regular expression engine uses 19 comparisons to identify possible matches in this 13-character string. In other words, the regular expression engine runs in near-linear time if it contains no optional quantifiers or alternation constructs. -## Backtracking with Optional Quantifiers or Alternation Constructs +## Backtracking with optional quantifiers or alternation constructs When a regular expression includes optional quantifiers or alternation constructs, the evaluation of the input string is no longer linear. Pattern matching with an NFA engine is driven by the language elements in the regular expression and not by the characters to be matched in the input string. Therefore, the regular expression engine tries to fully match optional or alternative subexpressions. When it advances to the next language element in the subexpression and the match is unsuccessful, the regular expression engine can abandon a portion of its successful match and return to an earlier saved state in the interest of matching the regular expression as a whole with the input string. This process of returning to a previous saved state to find a match is known as backtracking. @@ -161,7 +162,7 @@ To do this, the regular expression engine uses backtracking as follows: When you use backtracking, matching the regular expression pattern with the input string, which is 55 characters long, requires 67 comparison operations. 
Interestingly, if the regular expression pattern included a lazy quantifier, `.*?(es),` matching the regular expression would require additional comparisons. In this case, instead of having to backtrack from the end of the string to the "r" in "expressions", the regular expression engine would have to backtrack all the way to the beginning of the string to match "Es" and would require 113 comparisons. Generally, if a regular expression pattern has a single alternation construct or a single optional quantifier, the number of comparison operations required to match the pattern is more than twice the number of characters in the input string. -## Backtracking with Nested Optional Quantifiers +## Backtracking with nested optional quantifiers The number of comparison operations required to match a regular expression pattern can increase exponentially if the pattern includes a large number of alternation constructs, if it includes nested alternation constructs, or, most commonly, if it includes nested optional quantifiers. For example, the regular expression pattern `^(a+)+$` is designed to match a complete string that contains one or more "a" characters. The example provides two input strings of identical length, but only the first string matches the pattern. The [System.Diagnostics.Stopwatch](xref:System.Diagnostics.Stopwatch) class is used to determine how long the match operation takes. @@ -227,16 +228,15 @@ As the output from the example shows, the regular expression engine took about t Comparison of the input string with the regular expression continues in this way until the regular expression engine has tried all possible combinations of matches, and then concludes that there is no match. Because of the nested quantifiers, this comparison is an O(2<sup>n</sup>) or an exponential operation, where n is the number of characters in the input string.
This means that in the worst case, an input string of 30 characters requires approximately 1,073,741,824 comparisons, and an input string of 40 characters requires approximately 1,099,511,627,776 comparisons. If you use strings of these or even greater lengths, regular expression methods can take an extremely long time to complete when they process input that does not match the regular expression pattern. -## Controlling Backtracking +## Controlling backtracking -Backtracking lets you create powerful, flexible regular expressions. However, as the previous section showed, these benefits may be coupled with unacceptably poor performance. To prevent excessive backtracking, you should define a time-out interval when you instantiate a [Regex](xref:System.Text.RegularExpressions.Regex) object or call a static regular expression matching method. This is discussed in the next section. In addition, .NET Core supports three regular expression language elements that limit or suppress backtracking and that support complex regular expressions with little or no performance penalty: [nonbacktracking subexpressions](#nonbacktracking-subexpressions), [lookbehind assertions](#lookbehind-assertions), and [lookahead assertions](#lookahead assertions). For more information about each language element, see [Grouping Constructs in Regular Expressions](grouping.md). +Backtracking lets you create powerful, flexible regular expressions. However, as the previous section showed, these benefits may be coupled with unacceptably poor performance. To prevent excessive backtracking, you should define a time-out interval when you instantiate a [Regex](xref:System.Text.RegularExpressions.Regex) object or call a static regular expression matching method. This is discussed in the next section. 
In addition, .NET Core supports three regular expression language elements that limit or suppress backtracking and that support complex regular expressions with little or no performance penalty: [nonbacktracking subexpressions](#nonbacktracking-subexpression), [lookbehind assertions](#lookbehind-assertions), and [lookahead assertions](#lookahead-assertions). For more information about each language element, see [Grouping constructs in regular expressions](grouping.md). -### Defining a Time-out Interval +### Defining a time-out interval You can set a time-out value that represents the longest interval the regular expression engine will search for a single match before it abandons the attempt and throws a [RegexMatchTimeoutException](xref:System.Text.RegularExpressions.RegexMatchTimeoutException) exception. You specify the time-out interval by supplying a [TimeSpan](xref:System.TimeSpan) value to the `Regex(String, RegexOptions, TimeSpan)` constructor for instance regular expressions. In addition, each static pattern matching method has an overload with a [TimeSpan](xref:System.TimeSpan) parameter that allows you to specify a time-out value. By default, the time-out interval is set to [Regex.InfiniteMatchTimeout](xref:System.Text.RegularExpressions.Regex.InfiniteMatchTimeout) and the regular expression engine does not time out. -> **Important** -> +> [!IMPORTANT] > We recommend that you always set a time-out interval if your regular expression relies on backtracking. A [RegexMatchTimeoutException](xref:System.Text.RegularExpressions.RegexMatchTimeoutException) exception indicates that the regular expression engine was unable to find a match within the specified time-out interval but does not indicate why the exception was thrown. The reason might be excessive backtracking, but it is also possible that the time-out interval was set too low given the system load at the time the exception was thrown.
When you handle the exception, you can choose to abandon further matches with the input string or increase the time-out interval and retry the matching operation. @@ -417,7 +417,7 @@ End Module ' Maximum timeout interval of 3 seconds exceeded. ``` -### Nonbacktracking Subexpression +### Nonbacktracking subexpression The **(?>** _subexpression_**)** language element suppresses backtracking in a subexpression. It is useful for preventing the performance problems associated with failed matches. @@ -494,7 +494,7 @@ End Module ' Match: False in 00:00:00.0001391 ``` -### Lookbehind Assertions +### Lookbehind assertions .NET includes two language elements, **(?<**=_subexpression_**)** and **(? [!WARNING] -> The following example uses a regular expression that is prone to excessive backtracking and that is likely to reject valid email addresses. You should not use it in an email validation routine. If you would like a regular expression that validates email addresses, see [How to: Verify that Strings Are in Valid Email Format](verify-format.md). +> The following example uses a regular expression that is prone to excessive backtracking and that is likely to reject valid email addresses. You should not use it in an email validation routine. If you would like a regular expression that validates email addresses, see [How to: Verify that strings are in valid email format](verify-format.md). For example, consider a very commonly used but extremely problematic regular expression for validating the alias of an email address. The regular expression `^[0-9A-Z]([-.\w]*[0-9A-Z])*$` is written to process what is considered to be a valid email address, which consists of an alphanumeric character, followed by zero or more characters that can be alphanumeric, periods, or hyphens. The regular expression must end with an alphanumeric character. 
However, as the following example shows, although this regular expression handles valid input easily, its performance is very inefficient when it is processing nearly valid input. @@ -204,11 +205,11 @@ Because this regular expression was developed solely by considering the format o To solve this problem, you can do the following: -* When developing a pattern, you should consider how backtracking might affect the performance of the regular expression engine, particularly if your regular expression is designed to process unconstrained input. For more information, see the [Take Charge of Backtracking](#Take-Charge-of-Backtracking) section. +* When developing a pattern, you should consider how backtracking might affect the performance of the regular expression engine, particularly if your regular expression is designed to process unconstrained input. For more information, see the [Take charge of backtracking](#take-charge-of-backtracking) section. * Thoroughly test your regular expression using invalid and near-valid input as well as valid input. To generate input for a particular regular expression randomly, you can use [Rex](http://research.microsoft.com/en-us/projects/rex/), which is a regular expression exploration tool from Microsoft Research. -## Handle Object Instantiation Appropriately +## Handle object instantiation appropriately At the heart of .NET’s regular expression object model is the [System.Text.RegularExpressions.Regex](xref:System.Text.RegularExpressions.Regex) class, which represents the regular expression engine. Often, the single greatest factor that affects regular expression performance is the way in which the [Regex](xref:System.Text.RegularExpressions.Regex) engine is used. Defining a regular expression involves tightly coupling the regular expression engine with a regular expression pattern. 
That coupling process, whether it involves instantiating a [Regex](xref:System.Text.RegularExpressions.Regex) object by passing its constructor a regular expression pattern or calling a static method by passing it the regular expression pattern along with the string to be analyzed, is by necessity an expensive one. @@ -223,7 +224,7 @@ You can couple the regular expression engine with a particular regular expressio > [!IMPORTANT] > The form of the method call (static, interpreted, compiled) affects performance if the same regular expression is used repeatedly in method calls, or if an application makes extensive use of regular expression objects. -### Static Regular Expressions +### Static regular expressions Static regular expression methods are recommended as an alternative to repeatedly instantiating a regular expression object with the same regular expression. Unlike regular expression patterns used by regular expression objects, either the operation codes or the compiled Microsoft intermediate language (MSIL) from patterns used in static method calls is cached internally by the regular expression engine. @@ -295,7 +296,7 @@ Pattern | Description `\s*` | Match zero or more white-space characters. `\d+` | Match one or more decimal digits. -### Interpreted vs. Compiled Regular Expressions +### Interpreted vs. compiled regular expressions Regular expression patterns that are not bound to the regular expression engine through the specification of the [RegexOptions.Compiled](xref:System.Text.RegularExpressions.RegexOptions.Compiled) option are interpreted. When a regular expression object is instantiated, the regular expression engine converts the regular expression to a set of operation codes. When an instance method is called, the operation codes are converted to MSIL and executed by the JIT compiler.
Similarly, when a static regular expression method is called and the regular expression cannot be found in the cache, the regular expression engine converts the regular expression to a set of operation codes and stores them in the cache. It then converts these operation codes to MSIL so that the JIT compiler can execute them. Interpreted regular expressions reduce startup time at the cost of slower execution time. Because of this, they are best used when the regular expression is used in a small number of method calls, or if the exact number of calls to regular expression methods is unknown but is expected to be small. As the number of method calls increases, the performance gain from reduced startup time is outstripped by the slower execution speed. @@ -517,12 +518,12 @@ Pattern | Description `\w+` | Match one or more word characters. `[.?:;!]` | Match a period, question mark, colon, semicolon, or exclamation point. -## Take Charge of Backtracking +## Take charge of backtracking Ordinarily, the regular expression engine uses linear progression to move through an input string and compare it to a regular expression pattern. However, when indeterminate quantifiers such as __*__, **+**, and **?** are used in a regular expression pattern, the regular expression engine may give up a portion of successful partial matches and return to a previously saved state in order to search for a successful match for the entire pattern. This process is known as backtracking. > [!NOTE] -> For more information on backtracking, see [Details of Regular Expression Behavior](regex-behavior.md) and [Backtracking in Regular Expressions](backtracking.md). +> For more information on backtracking, see [Details of regular expression behavior](regex-behavior.md) and [Backtracking in regular expressions](backtracking.md). Support for backtracking gives regular expressions power and flexibility. 
It also places the responsibility for controlling the operation of the regular expression engine in the hands of regular expression developers. Because developers are often not aware of this responsibility, their misuse of backtracking or reliance on excessive backtracking often plays the most significant role in degrading regular expression performance. In a worst-case scenario, execution time can double for each additional character in the input string. In fact, by using backtracking excessively, it is easy to create the programmatic equivalent of an endless loop if input nearly matches the regular expression pattern; the regular expression engine may take hours or even days to process a relatively short input string. @@ -600,7 +601,7 @@ End Module In many cases, backtracking is essential for matching a regular expression pattern to input text. However, excessive backtracking can severely degrade performance and create the impression that an application has stopped responding. In particular, this happens when quantifiers are nested and the text that matches the outer subexpression is a subset of the text that matches the inner subexpression. > [!WARNING] -> In addition to avoiding excessive backtracking, you should use the timeout feature to ensure that excessive backtracking does not severely degrade regular expression performance. For more information, see the [Use Timeout Values](#Use-Timeout-Values) section. +> In addition to avoiding excessive backtracking, you should use the timeout feature to ensure that excessive backtracking does not severely degrade regular expression performance. For more information, see the [Use time-out values](#use-time-out-values) section. For example, the regular expression pattern `^[0-9A-Z]([-.\w]*[0-9A-Z])*\$$` is intended to match a part number that consists of at least one alphanumeric character. 
Any additional characters can consist of an alphanumeric character, a hyphen, an underscore, or a period, though the last character must be alphanumeric. A dollar sign terminates the part number. In some cases, this regular expression pattern can exhibit extremely poor performance because quantifiers are nested, and because the subexpression `[0-9A-Z]` is a subset of the subexpression `[-.\w]*`. @@ -672,7 +673,7 @@ End Module ' Match not found. ``` -The regular expression language in .NET includes the following language elements that you can use to eliminate nested quantifiers. For more information, see [Grouping Constructs in Regular Expressions](grouping.md). +The regular expression language in .NET includes the following language elements that you can use to eliminate nested quantifiers. For more information, see [Grouping constructs in regular expressions](grouping.md). Language element | Description ---------------- | ----------- @@ -681,7 +682,7 @@ Language element | Description **(?<**=_subexpression_**)** | Zero-width positive lookbehind. Look behind the current position to determine whether *subexpression* matches the input string. **(?**_subexpression_**)**, which defines a named capturing group. Grouping constructs are essential for creating backreferences and for defining a subexpression to which a quantifier is applied. @@ -1085,11 +1086,11 @@ You can disable captures in one of the following ways: * Use the [RegexOptions.ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) option. It disables all unnamed or implicit captures in the regular expression pattern. When you use this option, only substrings that match named groups defined with the **(?<**_name_**>**_subexpression_**)** language element can be captured. 
The [ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) flag can be passed to the options parameter of a [Regex](xref:System.Text.RegularExpressions.Regex) class constructor or to the options parameter of a [Regex](xref:System.Text.RegularExpressions.Regex) static matching method. -* Use the **n** option in the **(?imnsx)** language element. This option disables all unnamed or implicit captures from the point in the regular expression pattern at which the element appears. Captures are disabled either until the end of the pattern or until the **(-n)** option enables unnamed or implicit captures. For more information, see [Miscellaneous Constructs in Regular Expressions](miscellaneous.md). +* Use the **n** option in the **(?imnsx)** language element. This option disables all unnamed or implicit captures from the point in the regular expression pattern at which the element appears. Captures are disabled either until the end of the pattern or until the **(-n)** option enables unnamed or implicit captures. For more information, see [Miscellaneous constructs in regular expressions](miscellaneous.md). * Use the **n** option in the **(?imnsx:**_subexpression_**)** language element. This option disables all unnamed or implicit captures in *subexpression*. Captures by any unnamed or implicit nested capturing groups are disabled as well. 
-## Related Topics +## Related topics Title | Description ----- | ----------- diff --git a/docs/standard/base-types/character-encoding.md b/docs/standard/base-types/character-encoding.md index bf1d4f2417f9f..411a0b656f47c 100644 --- a/docs/standard/base-types/character-encoding.md +++ b/docs/standard/base-types/character-encoding.md @@ -3,6 +3,7 @@ title: Character encoding in .NET description: Character encoding in .NET keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/26/2016 ms.topic: article @@ -29,25 +30,25 @@ Character encoding describes the rules by which an encoder and a decoder operate This topic consists of the following sections: -* [Encodings in .NET](#Encodings-in-.NET) +* [Encodings in .NET](#encodings-in-net) -* [Selecting an Encoding Class](#Selecting-an-Encoding-Class) +* [Selecting an encoding class](#selecting-an-encoding-class) -* [Using an Encoding Object](#Using-an-Encoding-Object) +* [Using an encoding object](#using-an-encoding-object) -* [Choosing a Fallback Strategy](#Choosing-a-Fallback-Strategy) +* [Choosing a fallback strategy](#choosing-a-fallback-strategy) -* [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) +* [Implementing a custom fallback strategy](#implementing-a-custom-fallback-strategy) ## Encodings in .NET All character encoding classes in .NET inherit from the [System.Text.Encoding](xref:System.Text.Encoding) class, which is an abstract class that defines the functionality common to all character encodings. To access the individual encoding objects implemented in .NET, do the following: -* Use the static properties of the [Encoding](xref:System.Text.Encoding) class, which return objects that represent the standard character encodings available in .NET (ASCII, UTF-7, UTF-8, UTF-16, and UTF-32). For example, the [Encoding.Unicode](xref:System.Text.Encoding.Unicode) property returns a [UnicodeEncoding](xref:System.Text.UnicodeEncoding) object. 
Each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode. (For more information, see the [Replacement Fallback](#Replacement-Fallback) section.) +* Use the static properties of the [Encoding](xref:System.Text.Encoding) class, which return objects that represent the standard character encodings available in .NET (ASCII, UTF-7, UTF-8, UTF-16, and UTF-32). For example, the [Encoding.Unicode](xref:System.Text.Encoding.Unicode) property returns a [UnicodeEncoding](xref:System.Text.UnicodeEncoding) object. Each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode. (For more information, see the [Replacement fallback](#replacement-fallback) section.) -* Call the encoding's class constructor. Objects for the ASCII, UTF-7, UTF-8, UTF-16, and UTF-32 encodings can be instantiated in this way. By default, each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode, but you can specify that an exception should be thrown instead. (For more information, see the [Replacement Fallback](#Replacement-Fallback) and [Exception Fallback](#Exception-Fallback) sections.) +* Call the encoding's class constructor. Objects for the ASCII, UTF-7, UTF-8, UTF-16, and UTF-32 encodings can be instantiated in this way. By default, each object uses replacement fallback to handle strings that it cannot encode and bytes that it cannot decode, but you can specify that an exception should be thrown instead. (For more information, see the [Replacement fallback](#replacement-fallback) and [Exception fallback](#exception-fallback) sections.) -* Call the [Encoding.Encoding(Int32)](xref:System.Text.Encoding.GetEncoding(System.Int32)) constructor and pass it an integer that represents the encoding. 
Standard encoding objects use replacement fallback, and code page and double-byte character set (DBCS) encoding objects use best-fit fallback to handle strings that they cannot encode and bytes that they cannot decode. (For more information, see the [Best-Fit Fallback](#Best-Fit-Fallback) section.) +* Call the [Encoding.Encoding(Int32)](xref:System.Text.Encoding.GetEncoding(System.Int32)) constructor and pass it an integer that represents the encoding. Standard encoding objects use replacement fallback, and code page and double-byte character set (DBCS) encoding objects use best-fit fallback to handle strings that they cannot encode and bytes that they cannot decode. (For more information, see the [Best-Fit fallback](#best-fit-fallback) section.) * Call the [Encoding.GetEncoding](xref:System.Text.Encoding.GetEncoding(System.Int32)) method, which returns any standard, code page, or DBCS encoding available in .NET. Overloads let you specify a fallback object for both the encoder and the decoder. @@ -71,7 +72,7 @@ These encodings enable you to work with Unicode characters as well as with encod > [!NOTE] > By default, .NET Core does not make available any code page encodings other than code page 28591 and the Unicode encodings, such as UTF-8 and UTF-16. However, you can add the code page encodings found in standard Windows apps that target the .NET Framework to your app. For complete information, see the [EncodingProvider](xref:System.Text.EncodingProvider) topic. -## Selecting an Encoding Class +## Selecting an Encoding class If you have the opportunity to choose the encoding to be used by your application, you should use a Unicode encoding, preferably either [UTF8Encoding](xref:System.Text.UTF8Encoding) or [UnicodeEncoding](xref:System.Text.UnicodeEncoding). (.NET also supports a third Unicode encoding, [UTF32Encoding](xref:System.Text.UTF32Encoding).) 
@@ -92,7 +93,7 @@ You should consider using [ASCIIEncoding](xref:System.Text.ASCIIEncoding) only f In a web application, characters sent to the client in response to a web request should reflect the encoding used on the client. In most cases, you should set the [HttpResponse.ContentEncoding](xref:System.Net.HttpResponseHeader.ContentEncoding) property to the value returned by the [HttpRequestHeader.ContentEncoding](xref:System.Net.HttpRequestHeader.ContentEncoding) property to display text in the encoding that the user expects. -## Using an Encoding Object +## Using an encoding object An encoder converts a string of characters (most commonly, Unicode characters) to its numeric (byte) equivalent. For example, you might use an ASCII encoder to convert Unicode characters to ASCII so that they can be displayed at the console. To perform the conversion, you call the [Encoding.GetBytes](xref:System.Text.Encoding.GetBytes(System.Char[])) method. If you want to determine how many bytes are needed to store the encoded characters before performing the encoding, you can call the [GetByteCount](xref:System.Text.Encoding.GetByteCount(System.Char[])) method. @@ -574,7 +575,7 @@ End Module ' original = decoded: True ``` -## Choosing a Fallback Strategy +## Choosing a fallback strategy When a method tries to encode or decode a character but no mapping exists, it must implement a fallback strategy that determines how the failed mapping should be handled. There are three types of fallback strategies: @@ -584,10 +585,10 @@ When a method tries to encode or decode a character but no mapping exists, it mu * Exception fallback -> [!Important] +> [!IMPORTANT] > The most common problems in encoding operations occur when a Unicode character cannot be mapped to a particular code page encoding. The most common problems in decoding operations occur when invalid byte sequences cannot be translated into valid Unicode characters. 
For these reasons, you should know which fallback strategy a particular encoding object uses. Whenever possible, you should specify the fallback strategy used by an encoding object when you instantiate the object. -### Best-Fit Fallback +### Best-fit fallback When a character does not have an exact match in the target encoding, the encoder can try to map it to a similar character. (Best-fit fallback is mostly an encoding rather than a decoding issue. There are very few code pages that contain characters that cannot be successfully mapped to Unicode.) Best-fit fallback is the default for code page and double-byte character set encodings that are retrieved by the [Encoding.GetEncoding(Int32)](xref:System.Text.Encoding.GetEncoding(System.Int32)) and [Encoding.GetEncoding(String)](xref:System.Text.Encoding.GetEncoding(System.String)) overloads. @@ -699,9 +700,9 @@ End Module Best-fit mapping is the default behavior for an [Encoding](xref:System.Text.Encoding) object that encodes Unicode data into code page data, and there are legacy applications that rely on this behavior. However, most new applications should avoid best-fit behavior for security reasons. For example, applications should not put a domain name through a best-fit encoding. > [!Note] -> You can also implement a custom best-fit fallback mapping for an encoding. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section. +> You can also implement a custom best-fit fallback mapping for an encoding. For more information, see the [Implementing a custom fallback strategy](#implementing-a-custom-fallback-strategy) section. 
-If best-fit fallback is the default for an encoding object, you can choose another fallback strategy when you retrieve an [Encoding](xref:System.Text.Encoding) object by calling the [Encoding.GetEncoding(Int32, EncoderFallback, DecoderFallback)](xref:System.Text.Encoding.GetEncoding(System.Int32,System.Text.EncoderFallback,System.Text.DecoderFallback)) or [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](xref:System.Text.Encoding.GetEncoding(System.String,System.Text.EncoderFallback,System.Text.DecoderFallback)) overload. The following section includes an example that replaces each character that cannot be mapped to code page 1252 with an asterisk (*). +If best-fit fallback is the default for an encoding object, you can choose another fallback strategy when you retrieve an [Encoding](xref:System.Text.Encoding) object by calling the [Encoding.GetEncoding(Int32, EncoderFallback, DecoderFallback)](xref:System.Text.Encoding.GetEncoding(System.Int32,System.Text.EncoderFallback,System.Text.DecoderFallback)) or [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](xref:System.Text.Encoding.GetEncoding(System.String,System.Text.EncoderFallback,System.Text.DecoderFallback)) overload. The following section includes an example that replaces each character that cannot be mapped to code page 1252 with an asterisk (\*). ```csharp using System; @@ -778,7 +779,7 @@ End Module ' 002A 0020 002A 0020 002A ``` -### Replacement Fallback +### Replacement fallback When a character does not have an exact match in the target scheme, but there is no appropriate character that it can be mapped to, the application can specify a replacement character or string. This is the default behavior for the Unicode decoder, which replaces any two-byte sequence that it cannot decode with REPLACEMENT_CHARACTER (U+FFFD). 
It is also the default behavior of the [ASCIIEncoding](xref:System.Text.ASCIIEncoding) class, which replaces each character that it cannot encode or decode with a question mark. The following example illustrates character replacement for the Unicode string from the previous example. As the output shows, each character that cannot be decoded into an ASCII byte value is replaced by 0x3F, which is the ASCII code for a question mark. @@ -876,7 +877,7 @@ End Module ' 003F 0020 003F 0020 003F ``` -.NET includes the [EncoderReplacementFallback](xref:System.Text.EncoderReplacementFallback) and [DecoderReplacementFallback](xref:System.Text.DecoderReplacementFallback) classes, which substitute a replacement string if a character does not map exactly in an encoding or decoding operation. By default, this replacement string is a question mark, but you can call a class constructor overload to choose a different string. Typically, the replacement string is a single character, although this is not a requirement. The following example changes the behavior of the code page 1252 encoder by instantiating an [EncoderReplacementFallback](xref:System.Text.EncoderReplacementFallback) object that uses an asterisk (*) as a replacement string. +.NET includes the [EncoderReplacementFallback](xref:System.Text.EncoderReplacementFallback) and [DecoderReplacementFallback](xref:System.Text.DecoderReplacementFallback) classes, which substitute a replacement string if a character does not map exactly in an encoding or decoding operation. By default, this replacement string is a question mark, but you can call a class constructor overload to choose a different string. Typically, the replacement string is a single character, although this is not a requirement. The following example changes the behavior of the code page 1252 encoder by instantiating an [EncoderReplacementFallback](xref:System.Text.EncoderReplacementFallback) object that uses an asterisk (\*) as a replacement string. 
```csharp using System; @@ -954,11 +955,11 @@ End Module ``` > [!NOTE] -> You can also implement a replacement class for an encoding. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section. +> You can also implement a replacement class for an encoding. For more information, see the [Implementing a custom fallback strategy](#implementing-a-custom-fallback-strategy) section. In addition to QUESTION MARK (U+003F), the Unicode REPLACEMENT CHARACTER (U+FFFD) is commonly used as a replacement string, particularly when decoding byte sequences that cannot be successfully translated into Unicode characters. However, you are free to choose any replacement string, and it can contain multiple characters. -### Exception Fallback +### Exception fallback Instead of providing a best-fit fallback or a replacement string, an encoder can throw an [EncoderFallbackException](xref:System.Text.EncoderFallbackException) if it is unable to encode a set of characters, and a decoder can throw a [DecoderFallbackException](xref:System.Text.DecoderFallbackException) if it is unable to decode a byte array. To throw an exception in encoding and decoding operations, you supply an [EncoderExceptionFallback](xref:System.Text.EncoderExceptionFallback) object and a [DecoderExceptionFallback](xref:System.Text.DecoderExceptionFallback) object, respectively, to the [Encoding.GetEncoding(String, EncoderFallback, DecoderFallback)](xref:System.Text.Encoding.GetEncoding(System.String,System.Text.EncoderFallback,System.Text.DecoderFallback)) method. The following example illustrates exception fallback with the ASCIIEncoding class. @@ -1104,7 +1105,7 @@ End Module ``` > [!NOTE] -> You can also implement a custom exception handler for an encoding operation. For more information, see the [Implementing a Custom Fallback Strategy](#Implementing-a-Custom-Fallback-Strategy) section.
+> You can also implement a custom exception handler for an encoding operation. For more information, see the [Implementing a custom fallback strategy](#implementing-a-custom-fallback-strategy) section. The [EncoderFallbackException](xref:System.Text.EncoderFallbackException) and [DecoderFallbackException](xref:System.Text.DecoderFallbackException) objects provide the following information about the condition that caused the exception: @@ -1114,7 +1115,7 @@ The [EncoderFallbackException](xref:System.Text.EncoderFallbackException) and [D Although the [EncoderFallbackException](xref:System.Text.EncoderFallbackException) and [DecoderFallbackException](xref:System.Text.DecoderFallbackException) objects provide adequate diagnostic information about the exception, they do not provide access to the encoding or decoding buffer. Therefore, they do not allow invalid data to be replaced or corrected within the encoding or decoding method. -## Implementing a Custom Fallback Strategy +## Implementing a custom fallback strategy In addition to the best-fit mapping that is implemented internally by code pages, .NET includes the following classes for implementing a fallback strategy: @@ -1162,9 +1163,9 @@ When you create a custom fallback solution for an encoder or decoder, you must i If the fallback implementation is a best-fit fallback or a replacement fallback, the classes derived from [EncoderFallbackBuffer](xref:System.Text.EncoderFallbackBuffer) and [DecoderFallbackBuffer](xref:System.Text.DecoderFallbackBuffer) also maintain two private instance fields: the exact number of characters in the buffer; and the index of the next character in the buffer to return. -### An EncoderFallback Example +### An EncoderFallback example -An earlier example used replacement fallback to replace Unicode characters that did not correspond to ASCII characters with an asterisk (*). 
The following example uses a custom best-fit fallback implementation instead to provide a better mapping of non-ASCII characters. +An earlier example used replacement fallback to replace Unicode characters that did not correspond to ASCII characters with an asterisk (\*). The following example uses a custom best-fit fallback implementation instead to provide a better mapping of non-ASCII characters. The following code defines a class named `CustomMapper` that is derived from [EncoderFallback](xref:System.Text.EncoderFallback) to handle the best-fit mapping of non-ASCII characters. Its `CreateFallbackBuffer` method returns a `CustomMapperFallbackBuffer` object, which provides the [EncoderFallbackBuffer](xref:System.Text.EncoderFallbackBuffer) implementation. The `CustomMapper` class uses a [Dictionary<TKey, TValue>](xref:System.Collections.Generic.Dictionary%602) object to store the mappings of unsupported Unicode characters (the key value) and their corresponding 8-bit characters (which are stored in two consecutive bytes in a 64-bit integer). To make this mapping available to the fallback buffer, the `CustomMapper` instance is passed as a parameter to the `CustomMapperFallbackBuffer` class constructor. Because the longest mapping is the string "INF" for the Unicode character U+221E, the `MaxCharCount` property returns 3. 
@@ -1483,7 +1484,7 @@ Module Module1 End Module ``` -## See Also +## See also [System.Text.Encoder](xref:System.Text.Encoder) diff --git a/docs/standard/base-types/classes.md b/docs/standard/base-types/classes.md index 9a5389ef98095..38d539c4f7e2e 100644 --- a/docs/standard/base-types/classes.md +++ b/docs/standard/base-types/classes.md @@ -3,6 +3,7 @@ title: Character classes in regular expressions description: Character classes in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -16,32 +17,32 @@ ms.assetid: c7a9305f-7144-4fe8-80e8-a727bf7d223f A character class defines a set of characters, any one of which can occur in an input string for a match to succeed. The regular expression language in .NET supports the following character classes: -* Positive character groups. A character in the input string must match one of a specified set of characters. For more information, see [Positive Character Group](#Positive-Character-Group:-[-]). +* Positive character groups. A character in the input string must match one of a specified set of characters. For more information, see [Positive character group](#positive-character-group--). -* Negative character groups. A character in the input string must not match one of a specified set of characters. For more information, see [Negative Character Group](#Negative-Character-Group:-[^]). +* Negative character groups. A character in the input string must not match one of a specified set of characters. For more information, see [Negative character group](#negative-character-group-). -* Any character. The . (dot or period) character in a regular expression is a wildcard character that matches any character except **\n**. For more information, see [Any Character](#Any-Character:.). +* Any character. The . (dot or period) character in a regular expression is a wildcard character that matches any character except **\n**. 
For more information, see [Any character](#any-character-). -* A general Unicode category or named block. A character in the input string must be a member of a particular Unicode category or must fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Unicode Category or Unicode Block](#Unicode-Category-or-Unicode-Block:\p{}). +* A general Unicode category or named block. A character in the input string must be a member of a particular Unicode category or must fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Unicode category or Unicode block](#unicode-category-or-unicode-block-p). -* A negative general Unicode category or named block. A character in the input string must not be a member of a particular Unicode category or must not fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Negative Unicode Category or Unicode Block](#Negative-Unicode-Category-or-Unicode-Block:\P{}). +* A negative general Unicode category or named block. A character in the input string must not be a member of a particular Unicode category or must not fall within a contiguous range of Unicode characters for a match to succeed. For more information, see [Negative Unicode category or Unicode block](#negative-unicode-category-or-unicode-block-p). -* A word character. A character in the input string can belong to any of the Unicode categories that are appropriate for characters in words. For more information, see [Word Character](#Word-Character:\w). +* A word character. A character in the input string can belong to any of the Unicode categories that are appropriate for characters in words. For more information, see [Word character](#word-character-w). -* A non-word character. A character in the input string can belong to any Unicode category that is not a word character. For more information, see [Non-Word Character](#Non-Word-Character:\W). 
+* A non-word character. A character in the input string can belong to any Unicode category that is not a word character. For more information, see [Non-word character](#non-word-character-w). -* A white-space character. A character in the input string can be any Unicode separator character, as well as any one of a number of control characters. For more information, see [White-Space Character](#White-Space-Character:\s). +* A white-space character. A character in the input string can be any Unicode separator character, as well as any one of a number of control characters. For more information, see [White-space character](#white-space-character-s). -* A non-white-space character. A character in the input string can be any character that is not a white-space character. For more information, see [Non-White-Space Character](#Non-White-Space-Character:\S). +* A non-white-space character. A character in the input string can be any character that is not a white-space character. For more information, see [Non-white-space character](#non-white-space-character-s). -* A decimal digit. A character in the input string can be any of a number of characters classified as Unicode decimal digits. For more information, see [Decimal Digit Character](#Decimal-Digit-Character:\d). +* A decimal digit. A character in the input string can be any of a number of characters classified as Unicode decimal digits. For more information, see [Decimal digit character](#decimal-digit-character-d). -* A non-decimal digit. A character in the input string can be anything other than a Unicode decimal digit. For more information, see [Non-Digit Character](#Non-Digit_Character:\D). +* A non-decimal digit. A character in the input string can be anything other than a Unicode decimal digit. For more information, see [Non-digit character](#non-digit-character-d). 
-.NET supports character class subtraction expressions, which enables you to define a set of characters as the result of excluding one character class from another character class. For more information, see [Character Class Subtraction](#Character-Class-Subtraction:-[base_group---[excluded_group]]). +.NET supports character class subtraction expressions, which enables you to define a set of characters as the result of excluding one character class from another character class. For more information, see [Character class subtraction](#character-class-subtraction). -## Positive Character Group: [ ] +## Positive character group: [ ] A positive character group specifies a list of characters, any one of which may appear in an input string for a match to occur. This list of characters may be specified individually, as a range, or both. @@ -163,7 +164,7 @@ Pattern | Description `\w*` | Match zero or more word characters. `\b` | Match a word boundary. -## Negative Character Group: [^] +## Negative character group: [^] A negative character group specifies a list of characters that must not appear in an input string for a match to occur. The list of characters may be specified individually, as a range, or both. @@ -248,13 +249,13 @@ Pattern | Description `\w+` | Match one or more word characters. `\b` | End at a word boundary. -## Any Character: . +## Any character: . The period character (.) matches any character except **\n** (the newline character, **\u000A**), with the following two qualifications: -* If a regular expression pattern is modified by the `RegexOptions.Singleline` option, or if the portion of the pattern that contains the . character class is modified by the **s** option, . matches any character. For more information, see [Regular Expression Options](options.md). 
+* If a regular expression pattern is modified by the [RegexOptions.Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) option, or if the portion of the pattern that contains the . character class is modified by the **s** option, . matches any character. For more information, see [Regular expression options](options.md). - The following example illustrates the different behavior of the . character class by default and with the `RegexOptions.Singleline` option. The regular expression `^.+` starts at the beginning of the string and matches every character. By default, the match ends at the end of the first line; the regular expression pattern matches the carriage return character, **\r** or **\u000D**, but it does not match **\n**. Because the `RegexOptions.Singleline` option interprets the entire input string as a single line, it matches every character in the input string, including **\n**. + The following example illustrates the different behavior of the . character class by default and with the [RegexOptions.Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) option. The regular expression `^.+` starts at the beginning of the string and matches every character. By default, the match ends at the end of the first line; the regular expression pattern matches the carriage return character, **\r** or **\u000D**, but it does not match **\n**. Because the [RegexOptions.Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) option interprets the entire input string as a single line, it matches every character in the input string, including **\n**. ```csharp using System; @@ -305,7 +306,7 @@ The period character (.) matches any character except **\n** (the newline charac > [!NOTE] > Because it matches any character except **\n**, the . character class also matches **\r** (the carriage return character, **\u000D**). 
-* In a positive or negative character group, a period is treated as a literal period character, and not as a character class. For more information, see [Positive Character Group](#Positive-Character-Group:-[-]) or [Negative Character Group](#Negative-Character-Group:-[^]) earlier in this topic. The following example provides an illustration by defining a regular expression that includes the period character (**.**) both as a character class and as a member of a positive character group. The regular expression `\b.*[.?!;:](\s|\z)` begins at a word boundary, matches any character until it encounters one of four punctuation marks, including a period, and then matches either a white-space character or the end of the string. +* In a positive or negative character group, a period is treated as a literal period character, and not as a character class. For more information, see [Positive character group](#positive-character-group--) or [Negative character group](#negative-character-group-) earlier in this topic. The following example provides an illustration by defining a regular expression that includes the period character (**.**) both as a character class and as a member of a positive character group. The regular expression `\b.*[.?!;:](\s|\z)` begins at a word boundary, matches any character until it encounters one of four punctuation marks, including a period, and then matches either a white-space character or the end of the string. ```csharp using System; @@ -342,9 +343,9 @@ The period character (.) matches any character except **\n** (the newline charac ``` > [!NOTE] - > Because it matches any character, the . language element is often used with a lazy quantifier if a regular expression pattern attempts to match any character multiple times. For more information, see [Quantifiers in Regular Expressions](quantifiers.md). + > Because it matches any character, the . 
language element is often used with a lazy quantifier if a regular expression pattern attempts to match any character multiple times. For more information, see [Quantifiers in regular expressions](quantifiers.md). -## Unicode Category or Unicode Block: \p{} +## Unicode category or Unicode block: \p{} The Unicode standard assigns each character a general category. For example, a particular character can be an uppercase letter (represented by the **Lu** category), a decimal digit (the **Nd** category), a math symbol (the **Sm** category), or a paragraph separator (the **Zl** category). Specific character sets in the Unicode standard also occupy a specific range or block of consecutive code points. For example, the basic Latin character set is found from **\u0000** through **\u007F**, while the Arabic character set is found from **\u0600** through **\u06FF**. @@ -352,7 +353,7 @@ The regular expression construct **\p{**_name_**}** -matches any character that belongs to a Unicode general category or named block, where name is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode General Categories](#Supported-Unicode-General-Categories) section later in this topic. For a list of named blocks, see the [Supported Named Blocks](#Supported-Named-Blocks) section later in this topic. +matches any character that belongs to a Unicode general category or named block, where name is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode general categories](#supported-unicode-general-categories) section later in this topic. For a list of named blocks, see the [Supported named blocks](#supported-named-blocks) section later in this topic. The following example uses the **\p{**_name_**}** construct to match both a Unicode general category (in this case, the **Pd**, or Punctuation,Dash category) and a named block (the **IsGreek** and **IsBasicLatin** named blocks). 
@@ -399,7 +400,7 @@ Pattern | Description `(\s)?` | Match zero or one white-space character. `(\p{IsBasicLatin}+(\s)?)+` | Match the pattern of one or more basic Latin characters followed by zero or one white-space characters one or more times. -## Negative Unicode Category or Unicode Block: \P{} +## Negative Unicode category or Unicode block: \P{} The Unicode standard assigns each character a general category. For example, a particular character can be an uppercase letter (represented by the **Lu** category), a decimal digit (the **Nd** category), a math symbol (the **Sm** category), or a paragraph separator (the **Zl** category). Specific character sets in the Unicode standard also occupy a specific range or block of consecutive code points. For example, the basic Latin character set is found from **\u0000** through **\u007F**, while the Arabic character set is found from **\u0600** through **\u06FF**. @@ -407,7 +408,7 @@ The regular expression construct **\P{**_name_**}** -matches any character that belongs to a Unicode general category or named block, where name is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode General Categories](#Supported-Unicode-General-Categories) section later in this topic. For a list of named blocks, see the [Supported Named Blocks](#Supported-Named-Blocks) section later in this topic. +matches any character that belongs to a Unicode general category or named block, where name is the category abbreviation or named block name. For a list of category abbreviations, see the [Supported Unicode general categories](#supported-unicode-general-categories) section later in this topic. For a list of named blocks, see the [Supported named blocks](#supported-named-blocks) section later in this topic. The following example uses the **\P{**_name_**}** construct to remove any currency symbols (in this case, the **Sc**, or Symbol, Currency category) from numeric strings. 
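A minimal sketch of that idea follows. The sample values are assumptions chosen for illustration, and the pattern is the `(\P{Sc})+` expression that this topic discusses.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // One or more characters that are NOT in the Symbol, Currency (Sc) category.
        string pattern = @"(\P{Sc})+";

        // Hypothetical sample input.
        string[] values = { "$164,091.78", "£1,073,142.68", "73¢", "€120" };
        foreach (string value in values)
            Console.WriteLine(Regex.Match(value, pattern).Value);
    }
}
// The example displays the following output:
//       164,091.78
//       1,073,142.68
//       73
//       120
```

Because the match can only contain non-currency characters, taking the first match is enough here; a string with a currency symbol in the middle would need `Regex.Replace` or a loop over all matches instead.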
@@ -455,7 +456,7 @@ End Module The regular expression pattern `(\P{Sc})+` matches one or more characters that are not currency symbols; it effectively strips any currency symbol from the result string. -## Word Character: \w +## Word character: \w **\w** matches any word character. A word character is a member of any of the Unicode categories listed in the following table. @@ -470,10 +471,10 @@ Mn | Mark, Nonspacing Nd | Number, Decimal Digit Pc | Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), u+005F. -If ECMAScript-compliant behavior is specified, **\w** is equivalent to `[a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\w** is equivalent to `[a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). > [!NOTE] -> Because it matches any word character, the \w language element is often used with a lazy quantifier if a regular expression pattern attempts to match any word character multiple times, followed by a specific word character. For more information, see [Quantifiers in Regular Expressions](quantifiers.md). +> Because it matches any word character, the \w language element is often used with a lazy quantifier if a regular expression pattern attempts to match any word character multiple times, followed by a specific word character. For more information, see [Quantifiers in regular expressions](quantifiers.md). The following example uses the **\w** language element to match duplicate characters in a word. The example defines a regular expression pattern, **(\w)\1**, which can be interpreted as follows. @@ -545,7 +546,7 @@ End Module ' 'nn' found in 'stunned' at position 3. 
``` -## Non-Word Character: \W +## Non-word character: \W **\W** matches any non-word character. The **\W** language element is equivalent to the following character class: @@ -566,10 +567,10 @@ Mn | Mark, Nonspacing Nd | Number, Decimal Digit Pc | Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), u+005F. -If ECMAScript-compliant behavior is specified, **\W** is equivalent to `[^a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\W** is equivalent to `[^a-zA-Z_0-9]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). > [!NOTE] -> Because it matches any word character, the \w language element is often used with a lazy quantifier if a regular expression pattern attempts to match any word character multiple times, followed by a specific word character. For more information, see [Quantifiers in Regular Expressions](quantifiers.md). +> Because it matches any word character, the \w language element is often used with a lazy quantifier if a regular expression pattern attempts to match any word character multiple times, followed by a specific word character. For more information, see [Quantifiers in regular expressions](quantifiers.md). The following example illustrates the **\W** character class. It defines a regular expression pattern, `\b(\w+)(\W){1,2}`, that matches a word followed by one or two non-word characters, such as white space or punctuation. The regular expression is interpreted as shown in the following table. 
@@ -674,7 +675,7 @@ End Module Because the `Group` object for the second capturing group contains only a single captured non-word character, the example retrieves all captured non-word characters from the `CaptureCollection` object that is returned by the `Group.Captures` property. -## White-Space Character: \s +## White-space character: \s **\s** matches any white-space character. It is equivalent to the escape sequences and Unicode categories listed in the following table. @@ -689,7 +690,7 @@ Category | Description **\p{Z}** | Matches any separator character. -If ECMAScript-compliant behavior is specified, **\s** is equivalent to `[ \f\n\r\t\v]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\s** is equivalent to `[ \f\n\r\t\v]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). The following example illustrates the \s character class. It defines a regular expression pattern, `\b\w+(e)?s(\s|$)`, that matches a word ending in either "s" or "es" followed by either a white-space character or the end of the input string. The regular expression is interpreted as shown in the following table. @@ -741,11 +742,11 @@ End Module ' leaves ``` -## Non-White-Space Character: \S +## Non-white-space character: \S **\S** matches any non-white-space character. It is equivalent to the `[^\f\n\r\t\v\x85\p{Z}]` regular expression pattern, or the opposite of the regular expression pattern that is equivalent to **\s**, which matches white-space characters. For more information, see the previous section, "White-Space Character: \s". -If ECMAScript-compliant behavior is specified, **\S** is equivalent to `[^ \f\n\r\t\v]`. 
For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\S** is equivalent to `[^ \f\n\r\t\v]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). The following example illustrates the **\S** language element. The regular expression pattern \b(\S+)\s? matches strings that are delimited by white-space characters. The second element in the match's GroupCollection object contains the matched string. The regular expression can be interpreted as shown in the following table. @@ -837,11 +838,11 @@ End Module ' paragraph. ``` -## Decimal Digit Character: \d +## Decimal digit character: \d **\d** matches any decimal digit. It is equivalent to the `\\p{Nd}` regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets. -If ECMAScript-compliant behavior is specified, **\d** is equivalent to `[0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\d** is equivalent to `[0-9]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). The following example illustrates the **\d** language element. It tests whether an input string represents a valid telephone number in the United States and Canada. The regular expression pattern `^(\(?\d{3}\)?[\s-])?\d{3}-\d{4}$` is defined as shown in the following table. @@ -917,11 +918,11 @@ End Module ' 01 999-9999: match failed ``` -## Non-Digit Character: \D +## Non-digit character: \D **\D** matches any non-digit character. 
It is equivalent to the `\P{Nd}` regular expression pattern. -If ECMAScript-compliant behavior is specified, **\D** is equivalent to `[^0-9]`. For information on ECMAScript regular expressions, see the "ECMAScript Matching Behavior" section in [Regular Expression Options](options.md). +If ECMAScript-compliant behavior is specified, **\D** is equivalent to `[^0-9]`. For information on ECMAScript regular expressions, see the [ECMAScript matching behavior](options.md#ecmascript-matching-behavior) section in [Regular expression options](options.md). The following example illustrates the **\D** language element. It tests whether a string such as a part number consists of the appropriate combination of decimal and non-decimal characters. The regular expression pattern `^\D\d{1,5}\D*$` is defined as shown in the following table. @@ -980,7 +981,7 @@ End Module ' The example displays the following output: ``` -## Supported Unicode General Categories +## Supported Unicode general categories Unicode defines the general categories listed in the following table. For more information, see the "UCD File Format" and "General Category Values" subtopics at the [Unicode Character Database](http://www.unicode.org/reports/tr44/). @@ -1136,7 +1137,8 @@ FE70 - FEFF | **IsArabicPresentationForms-B** FF00 - FFEF | **IsHalfwidthandFullwidthForms** FFF0 - FFFF | **IsSpecials** -## Character Class Subtraction: [base_group - [excluded_group]] + +## Character class subtraction: [base_group - [excluded_group]] A character class defines a set of characters. Character class subtraction yields a set of characters that is the result of excluding the characters in one character class from another character class. 
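As a quick sketch of the syntax in the heading above, the pattern below subtracts the vowels from the lowercase ASCII letters. The input word is an arbitrary assumption.

```csharp
using System;
using System.Text.RegularExpressions;

public class Example
{
    public static void Main()
    {
        // Base group: a-z. Excluded group: the vowels.
        string pattern = @"[a-z-[aeiou]]";

        // Print every character of the input that remains in the set.
        foreach (Match match in Regex.Matches("subtraction", pattern))
            Console.Write(match.Value);
        Console.WriteLine();
    }
}
// The example displays the following output:
//       sbtrctn
```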
@@ -1205,7 +1207,6 @@ End Module ' 335599901 ``` -## See Also - -[Regular expression options](options.md) +## See also +[Regular expression options](options.md) \ No newline at end of file diff --git a/docs/standard/base-types/common-type-system.md b/docs/standard/base-types/common-type-system.md index b5b6aa326480e..7f0895ecbf835 100644 --- a/docs/standard/base-types/common-type-system.md +++ b/docs/standard/base-types/common-type-system.md @@ -3,6 +3,7 @@ title: Common type system in depth description: Common type system in depth keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/20/2016 ms.topic: article @@ -16,13 +17,13 @@ ms.assetid: b5482a1d-7bdc-40fe-aa45-10df930ceb5b This topic contains the following sections that explore the common type system in depth: -* [Types in .NET](#Types-in-.NET) +* [Types in .NET](#types-in-net) -* [Type Definitions](#Type-Definitions) +* [Type definitions](#type-definitions) -* [Type Members](#Type-Members) +* [Type members](#type-members) -* [Characteristics of Type Members](#Characteristics-of-Type-Members) +* [Characteristics of type members](#characteristics-of-type-members) ## Types in .NET @@ -35,15 +36,15 @@ Reference types are data types whose objects are represented by a reference (sim The common type system in .NET supports the following five categories of types: -* [Classes](#Classes) +* [Classes](#classes) -* [Structures](#Structures) +* [Structures](#structures) -* [Enumerations](#Enumerations) +* [Enumerations](#enumerations) -* [Interfaces](#Interfaces) +* [Interfaces](#interfaces) -* [Delegates](#Delegates) +* [Delegates](#delegates) ### Classes @@ -60,7 +61,7 @@ inherits | Indicates that instances of the class can be used anywhere the base c exported or not exported | Indicates whether a class is visible outside the assembly in which it is defined. This characteristic applies only to top-level classes and not to nested classes. 
> [!NOTE] -> A class can also be nested in a parent class or structure. Nested classes also have member characteristics. For more information, see [Nested Types](#Nested-Types). +> A class can also be nested in a parent class or structure. Nested classes also have member characteristics. For more information, see [Nested types](#nested-types). Class members that have no implementation are abstract members. A class that has one or more abstract members is itself abstract; new instances of it cannot be created. Some languages that target the runtime let you mark a class as abstract even if none of its members are abstract. You can use an abstract class when you want to encapsulate a basic set of functionality that derived classes can inherit or override when appropriate. Classes that are not abstract are referred to as concrete classes. @@ -275,7 +276,7 @@ For delegates that represent multiple methods, .NET provides methods of the [Del > [!NOTE] > It is not necessary to use these methods for event-handler delegates in C# or Visual Basic, because these languages provide syntax for adding and removing event handlers. -## Type Definitions +## Type definitions A type definition includes the following: @@ -297,7 +298,7 @@ Attributes provide additional user-defined metadata. Most commonly, they are use Attributes are themselves classes that inherit from [System.Attribute](xref:System.Attribute). Languages that support the use of attributes each have their own syntax for applying attributes to a language element. Attributes can be applied to almost any language element; the specific elements to which an attribute can be applied are defined by the [AttributeUsageAttribute](xref:System.AttributeUsageAttribute) that is applied to that attribute class. -### Type Accessibility +### Type accessibility All types have a modifier that governs their accessibility from other types. The following table describes the type accessibilities supported by the runtime. 
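The distinction between abstract and concrete classes can be sketched in C# as follows. The `Shape` and `Square` types are hypothetical names used only for illustration.

```csharp
using System;

// Abstract class: Area has no implementation, so Shape cannot be instantiated.
public abstract class Shape
{
    public abstract double Area();

    // Shared functionality that derived classes inherit.
    public override string ToString()
    {
        return GetType().Name + " with area " + Area();
    }
}

// Concrete class: supplies an implementation for every abstract member.
public class Square : Shape
{
    private readonly double side;

    public Square(double side)
    {
        this.side = side;
    }

    public override double Area()
    {
        return side * side;
    }
}

public class Example
{
    public static void Main()
    {
        Shape shape = new Square(2);
        Console.WriteLine(shape);
        // Shape s = new Shape();   // Would not compile: Shape is abstract.
    }
}
// The example displays the following output:
//       Square with area 4
```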
@@ -320,7 +321,7 @@ The accessibility domain of a nested member `M` declared in a type `T`within a p * If the declared accessibility of `M` is `private`, the accessibility domain of `M` is the program text of `T`. -### Type Names +### Type names The common type system imposes only two restrictions on names: @@ -332,27 +333,27 @@ However, most languages impose additional restrictions on type names. All compar Although a type might reference types from other modules and assemblies, a type must be fully defined within one .NET module. (Depending on compiler support, however, it can be divided into multiple source code files.) Type names need be unique only within a namespace. To fully identify a type, the type name must be qualified by the namespace that contains the implementation of the type. -### Base Types and Interfaces +### Base types and interfaces A type can inherit values and behaviors from another type. The common type system does not allow types to inherit from more than one base type. A type can implement any number of interfaces. To implement an interface, a type must implement all the virtual members of that interface. A virtual method can be implemented by a derived type and can be invoked either statically or dynamically. -## Type Members +## Type members The runtime enables you to define members of your type, which specifies the behavior and state of a type. 
Type members include the following: -* [Fields](#Fields) +* [Fields](#fields) -* [Properties](#Properties) +* [Properties](#properties) -* [Methods](#Methods) +* [Methods](#methods) -* [Constructors](#Constructors) +* [Constructors](#constructors) -* [Events](#Events) +* [Events](#events) -* [Nested Types](#Nested-Types) +* [Nested types](#nested-types) ### Fields @@ -483,13 +484,13 @@ If the source code for a structure defines constructors, they must be parameteri An event defines an incident that can be responded to, and defines methods for subscribing to, unsubscribing from, and raising the event. Events are often used to inform other types of state changes. -### Nested Types +### Nested types A nested type is a type that is a member of some other type. Nested types should be tightly coupled to their containing type and must not be useful as a general-purpose type. Nested types are useful when the declaring type uses and creates instances of the nested type, and use of the nested type is not exposed in public members. Nested types are confusing to some developers and should not be publicly visible unless there is a compelling reason for visibility. In a well-designed library, developers should rarely have to use nested types to instantiate objects or declare variables. -## Characteristics of Type Members +## Characteristics of type members The common type system allows type members to have a variety of characteristics; however, languages are not required to support all these characteristics. The following table describes member characteristics. @@ -517,7 +518,7 @@ Each type member has a unique signature. Method signatures consist of the method > [!NOTE] > The return type is not considered part of a method's signature. That is, methods cannot be overloaded if they differ only by return type. 
-### Inheriting, Overriding, and Hiding Members +### Inheriting, overriding, and hiding members A derived type inherits all members of its base type; that is, these members are defined on, and available to, the derived type. The behavior or qualities of inherited members can be modified in two ways: @@ -525,12 +526,6 @@ A derived type inherits all members of its base type; that is, these members are * A derived type can override an inherited virtual method. The overriding method provides a new definition of the method that will be invoked based on the type of the value at run time rather than the type of the variable known at compile time. A method can override a virtual method only if the virtual method is not marked as `final` and the new method is at least as accessible as the virtual method. -## See Also - -[Type Conversion in the .NET Framework](type-conversion.md) - - - - - +## See also +[Type conversion in the .NET Framework](type-conversion.md) \ No newline at end of file diff --git a/docs/standard/base-types/composite-format.md b/docs/standard/base-types/composite-format.md index a782698767439..5432d0a1066e9 100644 --- a/docs/standard/base-types/composite-format.md +++ b/docs/standard/base-types/composite-format.md @@ -3,6 +3,7 @@ title: Composite formatting description: Composite formatting keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/25/2016 ms.topic: article @@ -171,7 +172,7 @@ The following table lists types or categories of types in the .NET Framework cla Type or type category | See --------------------- | --- -Date and time types ([DateTime](xref:System.DateTime), [DateTimeOffset](xref:System.DateTimeOffset)) | [Standard Date and Time Format Strings](standard-datetime.md), [Custom Date and Time Format Strings](custom-datetime.md) +Date and time types ([DateTime](xref:System.DateTime), [DateTimeOffset](xref:System.DateTimeOffset)) | [Standard Date and Time Format Strings](standard-datetime.md), [Custom 
date and time format strings](custom-datetime.md) Enumeration types (all types derived from [System.Enum](xref:System.Enum)) | [Enumeration Format Strings](enumeration-format.md) Numeric types ([BigInteger](xref:System.Numerics.BigInteger), [Byte](xref:System.Byte), [Decimal](xref:System.Decimal), [Double](xref:System.Double), [Int16](xref:System.Int16), [Int32](xref:System.Int32), [Int64](xref:System.Int64), [SByte](xref:System.SByte), [Single](xref:System.Single), [UInt16](xref:System.UInt16), [UInt32](xref:System.UInt32), [UInt64](xref:System.UInt64)) | [Standard Numeric Format Strings](standard-numeric.md), [Custom Numeric Format Strings](custom-numeric.md) [Guid](xref:System.Guid) | [Guid.ToString(String)](xref:System.Guid.ToString(System.String)) diff --git a/docs/standard/base-types/custom-datetime.md b/docs/standard/base-types/custom-datetime.md index 0af066211d502..09d01b9545eb2 100644 --- a/docs/standard/base-types/custom-datetime.md +++ b/docs/standard/base-types/custom-datetime.md @@ -3,6 +3,7 @@ title: Custom date and time format strings description: Custom date and time format strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/25/2016 ms.topic: article @@ -166,7 +167,7 @@ The following sections provide additional information about each custom date and The "d" custom format specifier represents the day of the month as a number from 1 through 31. A single-digit day is formatted without a leading zero. -If the "d" format specifier is used without other custom format specifiers, it is interpreted as the "d" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "d" format specifier is used without other custom format specifiers, it is interpreted as the "d" standard date and time format specifier. 
For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "d" custom format specifier in several format strings. @@ -282,7 +283,7 @@ Console.WriteLine(date1.ToString("dddd dd MMMM", _ The "f" custom format specifier represents the most significant digit of the seconds fraction; that is, it represents the tenths of a second in a date and time value. -If the "f" format specifier is used without other format specifiers, it is interpreted as the "f" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "f" format specifier is used without other format specifiers, it is interpreted as the "f" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. When you use "f" format specifiers as part of a format string supplied to the [DateTime.ParseExact](xref:System.DateTime.ParseExact(System.String,System.String,System.IFormatProvider)), [DateTime.TryParseExact](xref:System.DateTime.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.DateTimeStyles,System.DateTime@)), [DateTimeOffset.ParseExact](xref:System.DateTimeOffset.ParseExact(System.String,System.String,System.IFormatProvider)), or [DateTimeOffset.TryParseExact](xref:System.DateTimeOffset.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.DateTimeStyles,System.DateTimeOffset@) ) method, the number of "f" format specifiers indicates the number of most significant digits of the seconds fraction that must be present to successfully parse the string. 
@@ -436,7 +437,7 @@ Although it is possible to display the ten millionths of a second component of a The "F" custom format specifier represents the most significant digit of the seconds fraction; that is, it represents the tenths of a second in a date and time value. Nothing is displayed if the digit is zero. -If the "F" format specifier is used without other format specifiers, it is interpreted as the "F" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "F" format specifier is used without other format specifiers, it is interpreted as the "F" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The number of "F" format specifiers used with the [DateTime.ParseExact](xref:System.DateTime.ParseExact(System.String,System.String,System.IFormatProvider)), [DateTime.TryParseExact](xref:System.DateTime.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.DateTimeStyles,System.DateTime@)), [DateTimeOffset.ParseExact](xref:System.DateTimeOffset.ParseExact(System.String,System.String,System.IFormatProvider)), or [DateTimeOffset.TryParseExact](xref:System.DateTimeOffset.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.DateTimeStyles,System.DateTimeOffset@) ) method indicates the maximum number of most significant digits of the seconds fraction that can be present to successfully parse the string. @@ -590,7 +591,7 @@ Although it is possible to display the ten millionths of a second component of a The "g" or "gg" custom format specifiers (plus any number of additional "g" specifiers) represents the period or era, such as A.D. 
The formatting operation ignores this specifier if the date to be formatted does not have an associated period or era string. -If the "g" format specifier is used without other custom format specifiers, it is interpreted as the "g" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "g" format specifier is used without other custom format specifiers, it is interpreted as the "g" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "g" custom format specifier in a custom format string. @@ -620,7 +621,7 @@ Console.WriteLine(date1.ToString("MM/dd/yyyy g", _ The "h" custom format specifier represents the hour as a number from 1 through 12; that is, the hour is represented by a 12-hour clock that counts the whole hours since midnight or noon. A particular hour after midnight is indistinguishable from the same hour after noon. The hour is not rounded, and a single-digit hour is formatted without a leading zero. For example, given a time of 5:43 in the morning or afternoon, this custom format specifier displays "5". -If the "h" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "h" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). 
For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "h" custom format specifier in a custom format string. @@ -706,7 +707,7 @@ Console.WriteLine(date1.ToString("hh:mm:ss.ff tt", _ The "H" custom format specifier represents the hour as a number from 0 through 23; that is, the hour is represented by a zero-based 24-hour clock that counts the hours since midnight. A single-digit hour is formatted without a leading zero. -If the "H" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "H" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "H" custom format specifier in a custom format string. @@ -756,7 +757,7 @@ The "K" custom format specifier represents the time zone information of a date a For [DateTimeOffset](xref:System.DateTimeOffset) values, the "K" format specifier is equivalent to the "zz" format specifier, and produces a result string containing the [DateTimeOffset](xref:System.DateTimeOffset) value's offset from UTC. -If the "K" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). 
For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "K" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example displays the string that results from using the "K" custom format specifier with various [DateTime](xref:System.DateTime) and [DateTimeOffset](xref:System.DateTimeOffset) values on a system in the U.S. Pacific Time zone. @@ -802,7 +803,7 @@ Console.WriteLine(New DateTimeOffset(2008, 5, 1, 6, 30, 0, _ The "m" custom format specifier represents the minute as a number from 0 through 59. The minute represents whole minutes that have passed since the last hour. A single-digit minute is formatted without a leading zero. -If the "m" format specifier is used without other custom format specifiers, it is interpreted as the "m" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "m" format specifier is used without other custom format specifiers, it is interpreted as the "m" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "m" custom format specifier in a custom format string. 
@@ -888,7 +889,7 @@ Console.WriteLine(date1.ToString("hh:mm:ss.ff tt", _ The "M" custom format specifier represents the month as a number from 1 through 12 (or from 1 through 13 for calendars that have 13 months). A single-digit month is formatted without a leading zero. -If the "M" format specifier is used without other custom format specifiers, it is interpreted as the "M" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "M" format specifier is used without other custom format specifiers, it is interpreted as the "M" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "M" custom format specifier in a custom format string. @@ -1000,7 +1001,7 @@ Console.WriteLine(date1.ToString("dddd dd MMMM", _ The "s" custom format specifier represents the seconds as a number from 0 through 59. The result represents whole seconds that have passed since the last minute. A single-digit second is formatted without a leading zero. -If the "s" format specifier is used without other custom format specifiers, it is interpreted as the "s" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "s" format specifier is used without other custom format specifiers, it is interpreted as the "s" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "s" custom format specifier in a custom format string. 
@@ -1086,7 +1087,7 @@ Console.WriteLine(date1.ToString("hh:mm:ss.ff tt", _ The "t" custom format specifier represents the first character of the AM/PM designator. The appropriate localized designator is retrieved from the [DateTimeFormatInfo.AMDesignator](xref:System.Globalization.DateTimeFormatInfo.AMDesignator) or [DateTimeFormatInfo.PMDesignator](xref:System.Globalization.DateTimeFormatInfo.PMDesignator) property of the current or specific culture. The AM designator is used for all times from 0:00:00 (midnight) to 11:59:59.999. The PM designator is used for all times from 12:00:00 (noon) to 23:59:59.999. -If the "t" format specifier is used without other custom format specifiers, it is interpreted as the "t" standard date and time format specifier. For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "t" format specifier is used without other custom format specifiers, it is interpreted as the "t" standard date and time format specifier. For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "t" custom format specifier in a custom format string. @@ -1174,7 +1175,7 @@ Console.WriteLine(date1.ToString("hh:mm:ss.ff tt", _ The "y" custom format specifier represents the year as a one-digit or two-digit number. If the year has more than two digits, only the two low-order digits appear in the result. If the first digit of a two-digit year begins with a zero (for example, 2008), the number is formatted without a leading zero. -If the "y" format specifier is used without other custom format specifiers, it is interpreted as the "y" standard date and time format specifier. 
For more information about using a single format specifier, [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "y" format specifier is used without other custom format specifiers, it is interpreted as the "y" standard date and time format specifier. For more information about using a single format specifier, [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "y" custom format specifier in a custom format string. @@ -1541,7 +1542,7 @@ With [DateTimeOffset](xref:System.DateTimeOffset) values, this format specifier The offset is always displayed with a leading sign. A plus sign (+) indicates hours ahead of UTC, and a minus sign (-) indicates hours behind UTC. A single-digit offset is formatted without a leading zero. -If the "z" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "z" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. The following example includes the "z" custom format specifier in a custom format string. @@ -1647,13 +1648,13 @@ Console.WriteLine(String.Format("{0:%z}, {0:zz}, {0:zzz}", _ The ":" custom format specifier represents the time separator, which is used to differentiate hours, minutes, and seconds. 
-If the ":" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the ":" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. ## The "/" custom format specifier The "/" custom format specifier represents the date separator, which is used to differentiate years, months, and days. -If the "/" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using Single Custom Format Specifiers](#Using-Single-Custom-Format-Specifiers) later in this topic. +If the "/" format specifier is used without other custom format specifiers, it is interpreted as a standard date and time format specifier and throws a [FormatException](xref:System.FormatException). For more information about using a single format specifier, see [Using single custom format specifiers](#using-single-custom-format-specifiers) later in this topic. ## Character literals @@ -1729,7 +1730,7 @@ End Module There are two ways to indicate that characters are to be interpreted as literal characters and not as reserve characters, so that they can be included in a result string or successfully parsed in an input string: -* By escaping each reserved character. 
For more information, see [Using the Escape Character](#Using-the-Escape-Character). +* By escaping each reserved character. For more information, see [Using the escape character](#using-the-escape-character). The following example includes the literal characters "pst" (for Pacific Standard time) to represent the local time zone in a format string. Because both "s" and "t" are custom format strings, both characters must be escaped to be interpreted as character literals. @@ -1879,7 +1880,7 @@ Console.WriteLine("'{0:h }'", dat1) ' '1 ' ``` -### Using the Escape character +### Using the escape character The "d", "f", "F", "g", "h", "H", "K", "m", "M", "s", "t", "y", "z", ":", or "/" characters in a format string are interpreted as custom format specifiers rather than as literal characters. To prevent a character from being interpreted as a format specifier, you can precede it with a backslash (\), which is the escape character. The escape character signifies that the following character is a character literal that should be included in the result string unchanged. @@ -1917,7 +1918,7 @@ Formatting is influenced by properties of the current [DateTimeFormatInfo](xref: The result string produced by many of the custom date and time format specifiers also depends on properties of the current [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object. Your application can change the result produced by some custom date and time format specifiers by changing the corresponding [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) property. For example, the "ddd" format specifier adds an abbreviated weekday name found in the [AbbreviatedDayNames](xref:System.Globalization.DateTimeFormatInfo.AbbreviatedDayNames) string array to the result string. Similarly, the "MMMM" format specifier adds a full month name found in the [MonthNames](xref:System.Globalization.DateTimeFormatInfo.MonthNames) string array to the result string. 
-## See Also +## See also [System.DateTime](xref:System.DateTime) diff --git a/docs/standard/base-types/custom-numeric.md b/docs/standard/base-types/custom-numeric.md index 9801aef347602..830f8b9f83ce7 100644 --- a/docs/standard/base-types/custom-numeric.md +++ b/docs/standard/base-types/custom-numeric.md @@ -3,6 +3,7 @@ title: Custom numeric format strings description: Custom numeric format strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/25/2016 ms.topic: article @@ -18,7 +19,7 @@ You can create a custom numeric format string, which consists of one or more cus Custom numeric format strings are supported by some overloads of the `ToString` method of all numeric types. For example, you can supply a numeric format string to the [ToString(String)](xref:System.Int32.ToString(System.String)) and [ToString(String, IFormatProvider)](xref:System.Int32.ToString(System.String,System.IFormatProvider)) methods of the [Int32](xref:System.Int32) type. Custom numeric format strings are also supported by the .NET Framework [composite formatting](composite-format.md) feature, which is used by some `Write` and `WriteLine` methods of the [Console](xref:System.Console) and [StreamWriter](xref:System.IO.StreamWriter) classes, the [String.Format](xref:System.String.Format(System.IFormatProvider,System.String,System.Object)) method, and the [StringBuilder.AppendFormat](xref:System.Text.StringBuilder.AppendFormat(System.IFormatProvider,System.String,System.Object)) method. -The following table describes the custom numeric format specifiers and displays sample output produced by each format specifier. See the [Notes](#Notes) section for additional information about using custom numeric format strings, and the [Example](#Example) section for a comprehensive illustration of their use. +The following table describes the custom numeric format specifiers and displays sample output produced by each format specifier. 
See the [Notes](#notes) section for additional information about using custom numeric format strings, and the [Example](#example) section for a comprehensive illustration of their use. Format specifier | Name | Description | Examples ---------------- | ---- | ----------- | -------- diff --git a/docs/standard/base-types/custom-timespan.md b/docs/standard/base-types/custom-timespan.md index 103dfd3f4c2cf..4e496a0f34e7e 100644 --- a/docs/standard/base-types/custom-timespan.md +++ b/docs/standard/base-types/custom-timespan.md @@ -3,6 +3,7 @@ title: Custom TimeSpan format strings description: Custom TimeSpan format strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/25/2016 ms.topic: article @@ -14,14 +15,14 @@ ms.assetid: e79745eb-6ebd-4e62-85c4-4f2830c27285 # Custom TimeSpan format strings -A [TimeSpan](xref:System.TimeSpan) format string defines the string representation of a [TimeSpan](xref:System.TimeSpan) value that results from a formatting operation. A custom format string consists of one or more custom [TimeSpan](xref:System.TimeSpan) format specifiers along with any number of literal characters. Any string that is not a [Standard TimeSpan](standard-timespan.md) format string is interpreted as a custom [TimeSpan](xref:System.TimeSpan) format string. +A [TimeSpan](xref:System.TimeSpan) format string defines the string representation of a [TimeSpan](xref:System.TimeSpan) value that results from a formatting operation. A custom format string consists of one or more custom [TimeSpan](xref:System.TimeSpan) format specifiers along with any number of literal characters. Any string that is not a [standard TimeSpan](standard-timespan.md) format string is interpreted as a custom [TimeSpan](xref:System.TimeSpan) format string. 
> [!IMPORTANT] > The custom [TimeSpan](xref:System.TimeSpan) format specifiers do not include placeholder separator symbols, such as the symbols that separate days from hours, hours from minutes, or seconds from fractional seconds. Instead, these symbols must be included in the custom format string as string literals. For example, `"dd\.hh\:mm"` defines a period (.) as the separator between days and hours, and a colon (:) as the separator between hours and minutes. -> Custom [TimeSpan](xref:System.TimeSpan) format specifiers also do not include a sign symbol that enables you to differentiate between negative and positive time intervals. To include a sign symbol, you have to construct a format string by using conditional logic. The [Other Characters](#Other-Characters) section includes an example. +> Custom [TimeSpan](xref:System.TimeSpan) format specifiers also do not include a sign symbol that enables you to differentiate between negative and positive time intervals. To include a sign symbol, you have to construct a format string by using conditional logic. The [Other characters](#other-characters) section includes an example. -The string representations of [TimeSpan](xref:System.TimeSpan) values are produced by calls to the overloads of the [TimeSpan](xref:System.TimeSpan) `ToString` method, as well as by methods that support composite formatting, such as [String.Format](xref:System.String.Format(System.IFormatProvider,System.String,System.Object)). For more information, see [Formatting Types](formatting-types.md) and [Composite Formatting](composite-format.md). The following example illustrates the use of standard format strings in formatting operations. 
+The string representations of [TimeSpan](xref:System.TimeSpan) values are produced by calls to the overloads of the [TimeSpan](xref:System.TimeSpan) `ToString` method, as well as by methods that support composite formatting, such as [String.Format](xref:System.String.Format(System.IFormatProvider,System.String,System.Object)). For more information, see [Formatting types](formatting-types.md) and [Composite formatting](composite-format.md). The following example illustrates the use of standard format strings in formatting operations. ```csharp using System; @@ -172,7 +173,7 @@ Format specifier | Description | Examples \ | The escape character. | `new TimeSpan(14, 32, 17):` `hh\:mm\:ss --> "14:32:17"` Any other character | Any other unescaped character is interpreted as a custom format specifier. | `new TimeSpan(14, 32, 17):` `hh\:mm\:ss --> "14:32:17"` -## The "d" Custom Format Specifier +## The "d" custom format specifier The "d" custom format specifier outputs the value of the [TimeSpan.Days](xref:System.TimeSpan.Days) property, which represents the number of whole days in the time interval. It outputs the full number of days in a [TimeSpan](xref:System.TimeSpan) value, even if the value has more than one digit. If the value of the [TimeSpan.Days](xref:System.TimeSpan.Days) property is zero, the specifier outputs "0". @@ -214,7 +215,7 @@ Console.WriteLine(ts3.ToString("d\.hh\:mm\:ss")) ' 3.04:03:17 ``` -## The "dd"-"dddddddd" Custom Format Specifiers +## The "dd"-"dddddddd" custom format specifiers The "dd", "ddd", "dddd", "ddddd", "dddddd", "ddddddd", and "dddddddd" custom format specifiers output the value of the [TimeSpan.Days](xref:System.TimeSpan.Days) property, which represents the number of whole days in the time interval. 
@@ -289,7 +290,7 @@ Next ' dddddddd\.hh\:mm\:ss --> 00000365.21:19:45 ``` -## The "h" Custom Format Specifier +## The "h" custom format specifier The "h" custom format specifier outputs the value of the [TimeSpan.Hours](xref:System.TimeSpan.Hours) property, which represents the number of whole hours in the time interval that is not counted as part of its day component. It returns a one-digit string value if the value of the [TimeSpan.Hours](xref:System.TimeSpan.Hours) property is 0 through 9, and it returns a two-digit string value if the value of the [TimeSpan.Hours](xref:System.TimeSpan.Hours) property ranges from 10 to 23. @@ -360,7 +361,7 @@ Console.WriteLine(ts2.ToString("d\.h\:mm\:ss")) ' 3.4:03:17 ``` -## The "hh" Custom Format Specifier +## The "hh" custom format specifier The "hh" custom format specifier outputs the value of the [TimeSpan.Hours](xref:System.TimeSpan.Hours) property, which represents the number of whole hours in the time interval that is not counted as part of its day component. For values from 0 through 9, the output string includes a leading zero. @@ -415,7 +416,7 @@ Console.WriteLine(ts2.ToString("d\.hh\:mm\:ss")) ' 3.04:03:17 ``` -## The "m" Custom Format Specifier +## The "m" custom format specifier The "m" custom format specifier outputs the value of the [TimeSpan.Minutes](xref:System.TimeSpan.Minutes) property, which represents the number of whole minutes in the time interval that is not counted as part of its day component. It returns a one-digit string value if the value of the [TimeSpan.Minutes](xref:System.TimeSpan.Minutes) property is 0 through 9, and it returns a two-digit string value if the value of the [TimeSpan.Minutes](xref:System.TimeSpan.Minutes) property ranges from 10 to 59. 
@@ -486,7 +487,7 @@ Console.WriteLine("Elapsed time: {0:m\:ss}", ts2) ' Elapsed time: 18:44 ``` -## The "mm" Custom Format Specifier +## The "mm" custom format specifier The "mm" custom format specifier outputs the value of the [TimeSpan.Minutes](xref:System.TimeSpan.Minutes) property, which represents the number of whole minutes in the time interval that is not included as part of its hours or days component. For values from 0 through 9, the output string includes a leading zero. @@ -537,7 +538,7 @@ Console.WriteLine("Travel time: {0:hh\:mm}", ' Travel time: 05:16 ``` -## The "s" Custom Format Specifier +## The "s" custom format specifier The "s" custom format specifier outputs the value of the [TimeSpan.Seconds](xref:System.TimeSpan.Seconds) property, which represents the number of whole seconds in the time interval that is not included as part of its hours, days, or minutes component. It returns a one-digit string value if the value of the [TimeSpan.Seconds](xref:System.TimeSpan.Seconds) property is 0 through 9, and it returns a two-digit string value if the value of the [TimeSpan.Seconds](xref:System.TimeSpan.Seconds) property ranges from 10 to 59. @@ -604,7 +605,7 @@ Console.WriteLine("Elapsed Time: {0:s\:fff} seconds", ' Elapsed Time: 6:003 seconds ``` -## The "ss" Custom Format Specifier +## The "ss" custom format specifier The "ss" custom format specifier outputs the value of the [TimeSpan.Seconds](xref:System.TimeSpan.Seconds) property, which represents the number of whole seconds in the time interval that is not included as part of its hours, days, or minutes component. For values from 0 through 9, the output string includes a leading zero. @@ -667,7 +668,7 @@ Console.WriteLine(interval2.ToString("ss\.fff")) ' 06.485 ``` -## The "f" Custom Format Specifier +## The "f" custom format specifier The "f" custom format specifier outputs the tenths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. 
In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly one fractional digit. @@ -745,7 +746,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "ff" Custom Format Specifier +## The "ff" custom format specifier The "ff" custom format specifier outputs the hundredths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly two fractional digits. @@ -821,7 +822,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "fff" Custom Format Specifier +## The "fff" custom format specifier The "fff" custom format specifier (with three "f" characters) outputs the milliseconds in a time interval. In a formatting operation, any remaining fractional digits are truncated. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly three fractional digits. 
@@ -897,7 +898,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "ffff" Custom Format Specifier +## The "ffff" custom format specifier The "ffff" custom format specifier (with four "f" characters) outputs the ten-thousandths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly four fractional digits. The following example uses the "ffff" custom format specifier to display the ten-thousandths of a second in a [TimeSpan](xref:System.TimeSpan) value. "ffff" is used first as the only format specifier, and then combined with the "s" specifier in a custom format string. @@ -972,7 +973,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "fffff" Custom Format Specifier +## The "fffff" custom format specifier The "fffff" custom format specifier (with five "f" characters) outputs the hundred-thousandths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly five fractional digits. The following example uses the "fffff" custom format specifier to display the hundred-thousandths of a second in a [TimeSpan](xref:System.TimeSpan) value. "fffff" is used first as the only format specifier, and then combined with the "s" specifier in a custom format string. 
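The two behaviors these "f" sections describe (truncation when formatting, and an exact digit count when parsing) can be sketched as below. This is an illustrative sketch only: `FractionSpecifiers`, `Format`, and `TryParse` are hypothetical names, and the interval is built from ticks to reproduce the `29.8765432` value shown in the surrounding example output.

```csharp
using System;
using System.Globalization;

public static class FractionSpecifiers
{
    // Hypothetical helper: applies a custom TimeSpan format string.
    public static string Format(TimeSpan value, string format) =>
        value.ToString(format, CultureInfo.InvariantCulture);

    // Hypothetical helper: wraps TimeSpan.TryParseExact for a single format.
    public static bool TryParse(string input, string format, out TimeSpan result) =>
        TimeSpan.TryParseExact(input, format, CultureInfo.InvariantCulture, out result);

    public static void Main()
    {
        // 298,765,432 ticks of 100 ns each = 29.8765432 seconds.
        var interval = TimeSpan.FromTicks(298765432);

        // Formatting truncates; it does not round (note 876, not 877).
        Console.WriteLine(Format(interval, @"s\.fff"));     // 29.876
        Console.WriteLine(Format(interval, @"s\.ffff"));    // 29.8765
        Console.WriteLine(Format(interval, @"s\.fffffff")); // 29.8765432

        // Parsing with "ffff" requires exactly four fractional digits.
        Console.WriteLine(TryParse("29.8765", @"s\.ffff", out _)); // True
        Console.WriteLine(TryParse("29.876", @"s\.ffff", out _));  // False
    }
}
```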
@@ -1047,7 +1048,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "ffffff" Custom Format Specifier +## The "ffffff" custom format specifier The "ffffff" custom format specifier (with six "f" characters) outputs the millionths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly six fractional digits. @@ -1123,7 +1124,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "fffffff" Custom Format Specifier +## The "fffffff" custom format specifier The "fffffff" custom format specifier (with seven "f" characters) outputs the ten-millionths of a second (or the fractional number of ticks) in a time interval. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the input string must contain exactly seven fractional digits. @@ -1199,7 +1200,7 @@ Next ' s\.fffffff: 29.8765432 ``` -## The "F" Custom Format Specifier +## The "F" custom format specifier The "F" custom format specifier outputs the tenths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. If the value of the time interval's tenths of a second is zero, it is not included in the result string. 
In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths of a second digit is optional. @@ -1272,7 +1273,7 @@ Next ' Cannot parse 0:0:03.12 with 'h\:m\:ss\.F'. ``` -## The "FF" Custom Format Specifier +## The "FF" custom format specifier The "FF" custom format specifier outputs the hundredths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. If there are any trailing fractional zeros, they are not included in the result string. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths and hundredths of a second digit is optional. @@ -1343,7 +1344,7 @@ Next ' Cannot parse 0:0:03.127 with 'h\:m\:ss\.FF'. ``` -## The "FFF" Custom Format Specifier +## The "FFF" custom format specifier The "FFF" custom format specifier (with three "F" characters) outputs the milliseconds in a time interval. In a formatting operation, any remaining fractional digits are truncated. If there are any trailing fractional zeros, they are not included in the result string. 
In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths, hundredths, and thousandths of a second digit is optional. @@ -1414,7 +1415,7 @@ Next ' Cannot parse 0:0:03.1279 with 'h\:m\:ss\.FFF'. ``` -## The "FFFF" Custom Format Specifier +## The "FFFF" custom format specifier The "FFFF" custom format specifier (with four "F" characters) outputs the ten-thousandths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. If there are any trailing fractional zeros, they are not included in the result string. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths, hundredths, thousandths, and ten-thousandths of a second digit is optional. @@ -1485,7 +1486,7 @@ Next ' Cannot parse 0:0:03.12795 with 'h\:m\:ss\.FFFF'. ``` -## The "FFFFF" Custom Format Specifier +## The "FFFFF" custom format specifier The "FFFFF" custom format specifier (with five "F" characters) outputs the hundred-thousandths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. If there are any trailing fractional zeros, they are not included in the result string. 
In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths, hundredths, thousandths, ten-thousandths, and hundred-thousandths of a second digit is optional. @@ -1556,7 +1557,7 @@ Next ' Cannot parse 0:0:03.127956 with 'h\:m\:ss\.FFFF'. ``` -## The "FFFFFF" Custom Format Specifier +## The "FFFFFF" custom format specifier The "FFFFFF" custom format specifier (with six "F" characters) outputs the millionths of a second in a time interval. In a formatting operation, any remaining fractional digits are truncated. If there are any trailing fractional zeros, they are not included in the result string. In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the tenths, hundredths, thousandths, ten-thousandths, hundred-thousandths, and millionths of a second digit is optional. @@ -1627,7 +1628,7 @@ Next ' Cannot parse 0:0:03.1279569 with 'h\:m\:ss\.FFFFFF' ``` -## The "FFFFFFF" Custom Format Specifier +## The "FFFFFFF" custom format specifier The "FFFFFFF" custom format specifier (with seven "F" characters) outputs the ten-millionths of a second (or the fractional number of ticks) in a time interval. If there are any trailing fractional zeros, they are not included in the result string. 
In a parsing operation that calls the [TimeSpan.ParseExact](xref:System.TimeSpan.ParseExact(System.String,System.String,System.IFormatProvider)) or [TimeSpan.TryParseExact](xref:System.TimeSpan.TryParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.TimeSpanStyles,System.TimeSpan@)) method, the presence of the seven fractional digits in the input string is optional. diff --git a/docs/standard/base-types/escapes.md b/docs/standard/base-types/escapes.md index 700f6109ac771..ff8d0b8e1fe74 100644 --- a/docs/standard/base-types/escapes.md +++ b/docs/standard/base-types/escapes.md @@ -3,6 +3,7 @@ title: Character escapes in regular expressions description: Character escapes in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -31,14 +32,14 @@ Character or sequence | Description --------------------- | ----------- All characters except for the following: **. $ ^ { [ ( | ) * + ? \** | Characters other than those listed in the **Character or sequence** column have no special meaning in regular expressions; they match themselves. The characters included in the **Character or sequence** column are special regular expression language elements. To match them in a regular expression, they must be escaped or included in a positive character group. For example, the regular expression `\$\d+ or [$]\d+` matches "$1200". **\a** | Matches a bell (alarm) character, **\u0007**. -**\b** | In a __[__*character*_*group*__]__ character class, matches a backspace, **\u0008**. (See [Character Classes in Regular Expressions](classes.md).) Outside a character class, **\b** is an anchor that matches a word boundary. (See [Anchors in Regular Expressions](anchors.md).) +**\b** | In a __[__*character*_*group*__]__ character class, matches a backspace, **\u0008**. (See [Character classes in regular expressions](classes.md).) 
Outside a character class, **\b** is an anchor that matches a word boundary. (See [Anchors in regular expressions](anchors.md).) **\t** | Matches a tab, **\u0009**. **\r** | Matches a carriage return, **\u000D**. Note that **\r** is not equivalent to the newline character, **\n**. **\v** | Matches a vertical tab, **\u000B**. **\f** | Matches a form feed, **\u000C**. **\n** | Matches a new line, **\u000A**. **\e** | Matches an escape, **\u001B**. -**\**_nnn_ | Matches an ASCII character, where nnn consists of two or three digits that represent the octal character code. For example, `\040` represents a space character. This construct is interpreted as a backreference if it has only one digit (for example, `\2`) or if it corresponds to the number of a capturing group. (See [Backreference Constructs in Regular Expressions](backreference.md).) +**\**_nnn_ | Matches an ASCII character, where nnn consists of two or three digits that represent the octal character code. For example, `\040` represents a space character. This construct is interpreted as a backreference if it has only one digit (for example, `\2`) or if it corresponds to the number of a capturing group. (See [Backreference constructs in regular expressions](backreference.md).) **\x**_nn_ | Matches an ASCII character, where *nn* is a two-digit hexadecimal character code. **\c**_X_ | Matches an ASCII control character, where *X* is the letter of the control character. For example, `\cC` is CTRL-C. **\u**_nnnn_ | Matches a UTF-16 code unit whose value is *nnnn* hexadecimal. **Note** The Perl 5 character escape that is used to specify Unicode is not supported by .NET. The Perl 5 character escape has the form **\x{####…}**, where **####…** is a series of hexadecimal digits. Instead, use **\u**_nnnn_. 
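A few rows of the escapes table above can be exercised with a short sketch. It is not part of the changed file; the `EscapeSketch` class name and the sample input strings are assumptions made for illustration, while the patterns come from the table itself.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeSketch
{
    // Hypothetical helper: returns the first match of a pattern, or "" if none.
    public static string FirstMatch(string input, string pattern) =>
        Regex.Match(input, pattern).Value;

    public static void Main()
    {
        // "$" is special, so it is escaped or placed in a positive character group.
        Console.WriteLine(FirstMatch("Total: $1200", @"\$\d+"));  // $1200
        Console.WriteLine(FirstMatch("Total: $1200", @"[$]\d+")); // $1200

        // Octal, hexadecimal, and \u escapes that all denote a space character.
        Console.WriteLine(Regex.IsMatch("a b", @"a\040b"));   // True
        Console.WriteLine(Regex.IsMatch("a b", @"a\x20b"));   // True
        Console.WriteLine(Regex.IsMatch("a b", @"a\u0020b")); // True

        // \b is a word-boundary anchor outside a character class ...
        Console.WriteLine(Regex.IsMatch("a cat sat", @"\bcat\b")); // True
        // ... and a backspace (\u0008) inside one.
        Console.WriteLine(Regex.IsMatch("\u0008", @"[\b]"));       // True
    }
}
```

Verbatim strings (`@"..."`) keep the backslashes intact so that the regular expression engine, rather than the C# compiler, interprets the escapes.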
diff --git a/docs/standard/base-types/extract-day.md b/docs/standard/base-types/extract-day.md index 667e009f5cf6e..dc09e9621742a 100644 --- a/docs/standard/base-types/extract-day.md +++ b/docs/standard/base-types/extract-day.md @@ -3,6 +3,7 @@ title: "How to: extract the day of the week from a specific date" description: How to extract the day of the week from a specific date keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/26/2016 ms.topic: article @@ -18,7 +19,7 @@ ms.assetid: 88a8f8b9-f5c9-4503-b968-84468b52bb8e ## To extract a number indicating the day of the week from a specific date -1. If you are working with the string representation of a date, convert it to a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value by using the static [DateTime.Parse](xref:System.DateTime#System.DateTime.Parse(System.String)) or [DateTimeOffset.Parse](xref:System.DateTimeOffset.Parse(System.String)) method. +1. If you are working with the string representation of a date, convert it to a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value by using the static [DateTime.Parse](xref:System.DateTime.Parse(System.String)) or [DateTimeOffset.Parse](xref:System.DateTimeOffset.Parse(System.String)) method. 2. Use the [Datetime.DayOfWeek](xref:System.DateTime.DayOfWeek) or [DateTimeOffset.DayOfWeek](xref:System.DateTimeOffset.DayOfWeek) property to retrieve a [DayOfWeek](xref:System.DayOfWeek) value that indicates the day of the week. @@ -54,7 +55,7 @@ End Module ## To extract the abbreviated weekday name from a specific date -1. If you are working with the string representation of a date, convert it to a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value by using the static [DateTime.Parse](xref:System.DateTime#System.DateTime.Parse(System.String)) or [DateTimeOffset.Parse](xref:System.DateTimeOffset.Parse(System.String)) method. +1. 
If you are working with the string representation of a date, convert it to a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value by using the static [DateTime.Parse](xref:System.DateTime.Parse(System.String)) or [DateTimeOffset.Parse](xref:System.DateTimeOffset.Parse(System.String)) method. 2. You can extract the abbreviated weekday name of the current culture or of a specific culture: @@ -463,7 +464,7 @@ End Module ' lundi ``` -## See Also +## See also [Performing formatting operations](performing-formatting-operations.md) diff --git a/docs/standard/base-types/formatting-types.md b/docs/standard/base-types/formatting-types.md index 9dbce13a652c9..80a8642c087df 100644 --- a/docs/standard/base-types/formatting-types.md +++ b/docs/standard/base-types/formatting-types.md @@ -1,8 +1,9 @@ - --- +--- title: Formatting types description: Formatting types keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/20/2016 ms.topic: article @@ -12,60 +13,60 @@ ms.devlang: dotnet ms.assetid: cf497639-9f91-45cb-836f-998d1cea2f43 --- -# Formatting Types +# Formatting types Formatting is the process of converting an instance of a class, structure, or enumeration value to its string representation, often so that the resulting string can be displayed to users or deserialized to restore the original data type. This conversion can pose a number of challenges: -* The way that values are stored internally does not necessarily reflect the way that users want to view them. For example, a telephone number might be stored in the form **8009999999**, which is not user-friendly. It should instead be displayed as **800-999-9999**. See the [Custom Format Strings](#Custom-Format-Strings) section for an example that formats a number in this way. +* The way that values are stored internally does not necessarily reflect the way that users want to view them. 
For example, a telephone number might be stored in the form **8009999999**, which is not user-friendly. It should instead be displayed as **800-999-9999**. See the [Custom format strings](#custom-format-strings) section for an example that formats a number in this way. -* Sometimes the conversion of an object to its string representation is not intuitive. For example, it is not clear how the string representation of a **Temperature** object or a **Person** object should appear. For an example that formats a **Temperature** object in a variety of ways, see the [Standard Format Strings](#Standard-Format-Strings) section. +* Sometimes the conversion of an object to its string representation is not intuitive. For example, it is not clear how the string representation of a **Temperature** object or a **Person** object should appear. For an example that formats a **Temperature** object in a variety of ways, see the [Standard format strings](#standard-format-strings) section. -* Values often require culture-sensitive formatting. For example, in an application that uses numbers to reflect monetary values, numeric strings should include the current culture’s currency symbol, group separator (which, in most cultures, is the thousands separator), and decimal symbol. For an example, see the [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface) section. +* Values often require culture-sensitive formatting. For example, in an application that uses numbers to reflect monetary values, numeric strings should include the current culture’s currency symbol, group separator (which, in most cultures, is the thousands separator), and decimal symbol. For an example, see the [Culture-sensitive formatting with format providers and the IFormatProvider interface](#culture-sensitive-formatting-with-format-providers-and-the-iformatprovider-interface) section. 
-* An application may have to display the same value in different ways. For example, an application may represent an enumeration member by displaying a string representation of its name or by displaying its underlying value. For an example that formats a member of the [DayOfWeek](xref:System.DayOfWeek) enumeration in different ways, see the [Standard Format Strings](#Standard-Format-Strings) section. +* An application may have to display the same value in different ways. For example, an application may represent an enumeration member by displaying a string representation of its name or by displaying its underlying value. For an example that formats a member of the [DayOfWeek](xref:System.DayOfWeek) enumeration in different ways, see the [Standard format strings](#standard-format-strings) section. .NET provides rich formatting support that enables developers to address these requirements. > [!NOTE] -> Formatting converts the value of a type into a string representation. Parsing is the inverse of formatting. A parsing operation creates an instance of a data type from its string representation. For information about converting strings to other data types, see [Parsing Strings](parsing-strings.md). +> Formatting converts the value of a type into a string representation. Parsing is the inverse of formatting. A parsing operation creates an instance of a data type from its string representation. For information about converting strings to other data types, see [Parsing strings](parsing-strings.md). 
This overview contains the following sections: -* [Formatting in .NET](#Formatting-in-.NET) +* [Formatting in .NET](#formatting-in-net) -* [Default Formatting Using the ToString Method](#Default-Formatting-Using-the-ToString-Method) +* [Default formatting using the ToString method](#default-formatting-using-the-tostring-method) -* [Overriding the ToString Method](#Overriding-the-ToString-Method) +* [Overriding the ToString method](#overriding-the-tostring-method) -* [The ToString Method and Format Strings](#The-ToString-Method-and-Format-Strings) +* [The ToString method and format strings](#the-tostring-method-and-format-strings) - * [Standard Format Strings](#Standard-Format-Strings) + * [Standard format strings](#standard-format-strings) - * [Custom Format Strings](#Custom-Format-Strings) + * [Custom format strings](#custom-format-strings) - * [Format Strings and .NET Types](#Format-Strings-and-.NET-Types) + * [Format strings and .NET types](#format-strings-and-net-types) -* [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface) +* [Culture-sensitive formatting with format providers and the IFormatProvider interface](#culture-sensitive-formatting-with-format-providers-and-the-iformatprovider-interface) - * [Culture-Sensitive Formatting of Numeric Values](#Culture-Sensitive-Formatting-of-Numeric-Values) + * [Culture-sensitive formatting of numeric values](#culture-sensitive-formatting-of-numeric-values) - * [Culture-Sensitive Formatting of Date and Time Values](#Culture-Sensitive-Formatting-of-Date-and-Time-Values) + * [Culture-sensitive formatting of date and time values](#culture-sensitive-formatting-of-date-and-time-values) -* [The IFormattable Interface](#The-IFormattable-Interface) +* [The IFormattable interface](#the-iformattable-interface) -* [Composite Formatting](#Composite-Formatting) +* [Composite formatting](#composite-formatting) -* 
[Custom Formatting with ICustomFormatter](#Custom-Formatting-with-ICustomFormatter) +* [Custom formatting with ICustomFormatter](#custom-formatting-with-icustomformatter) -* [Related Topics](#Related-Topics) +* [Related topics](#related-topics) -* [Reference](#Reference) +* [Reference](#reference) ## Formatting in .NET -The basic mechanism for formatting is the default implementation of the [Object.ToString](xref:System.Object.ToString) method, which is discussed in the [Default Formatting Using the ToString Method](#Default-Formatting-Using-the-ToString-Method) section later in this topic. However, .NET provides several ways to modify and extend its default formatting support. These include the following: +The basic mechanism for formatting is the default implementation of the [Object.ToString](xref:System.Object.ToString) method, which is discussed in the [Default formatting using the ToString method](#default-formatting-using-the-tostring-method) section later in this topic. However, .NET provides several ways to modify and extend its default formatting support. These include the following: -* Overriding the [Object.ToString](xref:System.Object.ToString) method to define a custom string representation of an object’s value. For more information, see the [Overriding the ToString Method](#Overriding-the-ToString-Method) section later in this topic. +* Overriding the [Object.ToString](xref:System.Object.ToString) method to define a custom string representation of an object’s value. For more information, see the [Overriding the ToString method](#overriding-the-tostring-method) section later in this topic. * Defining format specifiers that enable the string representation of an object’s value to take multiple forms. For example, the "X" format specifier in the following statement converts an integer to the string representation of a hexadecimal value. @@ -79,7 +80,7 @@ The basic mechanism for formatting is the default implementation of the [Object. 
Console.WriteLine(integerValue.ToString("X")) ' Displays EB98. ``` - For more information about format specifiers, see the [The ToString Method and Format Strings](#The-ToString-Method-and-Format-Strings) section. + For more information about format specifiers, see the [The ToString method and format strings](#the-tostring-method-and-format-strings) section. * Using format providers to take advantage of the formatting conventions of a specific culture. For example, the following statement displays a currency value by using the formatting conventions of the en-US culture. @@ -98,17 +99,17 @@ The basic mechanism for formatting is the default implementation of the [Object. ' $1,632.54 ``` - For more information about formatting with format providers, see the [Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface](#Culture-Sensitive-Formatting-with-Format-Providers-and-the-IFormatProvider-Interface) section. + For more information about formatting with format providers, see the [Culture-sensitive formatting with format providers and the IFormatProvider interface](#culture-sensitive-formatting-with-format-providers-and-the-iformatprovider-interface) section. -* Implementing the [IFormattable](xref:System.IFormattable) interface to support both string conversion with the [Convert](xref:System.Convert) class and composite formatting. For more information, see the [The IFormattable Interface](#The-IFormattable-Interface) section. +* Implementing the [IFormattable](xref:System.IFormattable) interface to support both string conversion with the [Convert](xref:System.Convert) class and composite formatting. For more information, see the [The IFormattable interface](#the-iformattable-interface) section. -* Using composite formatting to embed the string representation of a value in a larger string. For more information, see the [Composite Formatting](#Composite-Formatting) section. 
+* Using composite formatting to embed the string representation of a value in a larger string. For more information, see the [Composite formatting](#composite-formatting) section. -* Implementing [ICustomFormatter](xref:System.ICustomFormatter) and [IFormatProvider](xref:System.IFormatProvider) to provide a complete custom formatting solution. For more information, see the [Custom Formatting with ICustomFormatter](#Custom-Formatting-with-ICustomFormatter) section. +* Implementing [ICustomFormatter](xref:System.ICustomFormatter) and [IFormatProvider](xref:System.IFormatProvider) to provide a complete custom formatting solution. For more information, see the [Custom formatting with ICustomFormatter](#custom-formatting-with-icustomformatter) section. The following sections examine these methods for converting an object to its string representation. -## Default Formatting Using the ToString Method +## Default formatting using the ToString method Every type that is derived from [System.Object](xref:System.Object) automatically inherits a parameterless [ToString](xref:System.Object.ToString) method, which returns the name of the type by default. The following example illustrates the default [ToString](xref:System.Object.ToString) method. It defines a class named `Automobile` that has no implementation. When the class is instantiated and its [ToString](xref:System.Object.ToString) method is called, it displays its type name. Note that the [ToString](xref:System.Object.ToString) method is not explicitly called in the example. The [Console.WriteLine(Object)](xref:System.Console.WriteLine(System.Object)) method implicitly calls the [ToString](xref:System.Object.ToString) method of the object passed to it as an argument. @@ -152,7 +153,7 @@ Because all types other than interfaces are derived from [Object](xref:System.Ob > [!NOTE] > Structures inherit from [ValueType](xref:System.ValueType), which in turn is derived from [Object](xref:System.Object). 
Although [ValueType](xref:System.ValueType) overrides [Object.ToString](xref:System.Object.ToString), its implementation is identical. -## Overriding the ToString Method +## Overriding the ToString method Displaying the name of a type is often of limited use and does not allow consumers of your types to differentiate one instance from another. However, you can override the [ToString](xref:System.Object.ToString) method to provide a more useful representation of an object’s value. The following example defines a `Temperature` object and overrides its [ToString](xref:System.Object.ToString) method to display the temperature in degrees Celsius. @@ -230,7 +231,7 @@ Type | ToString override [UInt32](xref:System.UInt32) | Calls `Int16.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [UInt32](xref:System.UInt32) value for the current culture. [UInt64](xref:System.UInt64) | Calls `Int16.ToString("G", NumberFormatInfo.CurrentInfo)` to format the [UInt64](xref:System.UInt64) value for the current culture. -## The ToString Method and Format Strings +## The ToString method and format strings Relying on the default [ToString](xref:System.Object.ToString) method or overriding [ToString](xref:System.Object.ToString) is appropriate when an object has a single string representation. However, the value of an object often has multiple representations. For example, a temperature can be expressed in degrees Fahrenheit, degrees Celsius, or kelvins. Similarly, the integer value 10 can be represented in numerous ways, including 10, 10.0, 1.0e01, or $10.00. @@ -238,7 +239,7 @@ To enable a single value to have multiple string representations, .NET uses form All numeric types, date and time types, and enumeration types in .NET support a predefined set of format specifiers. You can also use format strings to define multiple string representations of your application-defined data types. 
-### Standard Format Strings +### Standard format strings A standard format string contains a single format specifier, which is an alphabetic character that defines the string representation of the object to which it is applied, along with an optional precision specifier that affects how many digits are displayed in the result string. If the precision specifier is omitted or is not supported, a standard format specifier is equivalent to a standard format string. @@ -273,7 +274,7 @@ Next ' 00000001 ``` -For information about enumeration format strings, see [Enumeration Format Strings](enumeration-format.md). +For information about enumeration format strings, see [Enumeration format strings](enumeration-format.md). Standard format strings for numeric types usually define a result string whose precise appearance is controlled by one or more property values. For example, the "C" format specifier formats a number as a currency value. When you call the [ToString](xref:System.Object.ToString) method with the "C" format specifier as the only parameter, the following property values from the current culture’s [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object are used to define the string representation of the numeric value: @@ -320,9 +321,9 @@ Next ' 00FF ``` -For more information about standard numeric formatting strings, see [Standard Numeric Format Strings](standard-numeric.md). +For more information about standard numeric formatting strings, see [Standard numeric format strings](standard-numeric.md). -Standard format strings for date and time values are aliases for custom format strings stored by a particular [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) class. 
For example, calling the [ToString](xref:System.Object.ToString) method of a date and time value with the "D" format specifier displays the date and time by using the custom format string stored in the current culture’s [DateTimeFormatInfo.LongDatePattern](xref:System.Globalization.DateTimeFormatInfo.LongDatePattern) property. (For more information about custom format strings, see the [Custom Format Strings](#Custom-Format-Strings) section.) The following example illustrates this relationship. +Standard format strings for date and time values are aliases for custom format strings stored by a particular [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) class. For example, calling the [ToString](xref:System.Object.ToString) method of a date and time value with the "D" format specifier displays the date and time by using the custom format string stored in the current culture’s [DateTimeFormatInfo.LongDatePattern](xref:System.Globalization.DateTimeFormatInfo.LongDatePattern) property. (For more information about custom format strings, see the [Custom format strings](#custom-format-strings) section.) The following example illustrates this relationship. ```csharp using System; @@ -363,7 +364,7 @@ End Module ' 'dddd, MMMM dd, yyyy' custom format string: Tuesday, June 30, 2009 ``` -For more information about standard date and time format strings, see [Standard Date and Time Format Strings](standard-datetime.md). +For more information about standard date and time format strings, see [Standard date and time format strings](standard-datetime.md). You can also use standard format strings to define the string representation of an application-defined object that is produced by the object’s `ToString(String)` method. You can define the specific standard format specifiers that your object supports, and you can determine whether they are case-sensitive or case-insensitive. 
Your implementation of the `ToString(String)` method should support the following: @@ -575,9 +576,9 @@ End Module ' The temperature is now 16.00 °C. ``` -### Custom Format Strings +### Custom format strings -In addition to the standard format strings, .NET defines custom format strings for both numeric values and date and time values. A custom format string consists of one or more custom format specifiers that define the string representation of a value. For example, the custom date and time format string "yyyy/mm/dd hh:mm:ss.ffff t zzz" converts a date to its string representation in the form "2008/11/15 07:45:00.0000 P -08:00" for the en-US culture. Similarly, the custom format string "0000" converts the integer value 12 to "0012". For a complete list of custom format strings, see [Custom Date and Time Format Strings](custom-datetime.md) and [Custom Numeric Format Strings](custom-numeric.md). +In addition to the standard format strings, .NET defines custom format strings for both numeric values and date and time values. A custom format string consists of one or more custom format specifiers that define the string representation of a value. For example, the custom date and time format string "yyyy/mm/dd hh:mm:ss.ffff t zzz" converts a date to its string representation in the form "2008/11/15 07:45:00.0000 P -08:00" for the en-US culture. Similarly, the custom format string "0000" converts the integer value 12 to "0012". For a complete list of custom format strings, see [Custom date and time format strings](custom-datetime.md) and [Custom numeric format strings](custom-numeric.md). If a format string consists of a single custom format specifier, the format specifier should be preceded by the percent (%) symbol to avoid confusion with a standard format specifier. The following example uses the "M" custom format specifier to display a one-digit or two-digit number of the month of a particular date. 
@@ -646,19 +647,19 @@ End Module Although standard format strings can generally handle most of the formatting needs for your application-defined types, you may also define custom format specifiers to format your types. -### Format Strings and .NET Types +### Format strings and .NET types All numeric types (that is, the [Byte](xref:System.Byte), [Decimal](xref:System.Decimal), [Double](xref:System.Double), [Int16](xref:System.Int16), [Int32](xref:System.Int32), [Int64](xref:System.Int64), [SByte](xref:System.SByte), [Single](xref:System.Single), [UInt16](xref:System.UInt16), [UInt32](xref:System.UInt32), [UInt64](xref:System.UInt64), and [BigInteger](xref:System.Numerics.BigInteger) types), as well as the [DateTime](xref:System.DateTime), [DateTimeOffset](xref:System.DateTimeOffset), [TimeSpan](xref:System.TimeSpan), [Guid](xref:System.Guid), and all enumeration types, support formatting with format strings. For information on the specific format strings supported by each type, see the following topics: Title | Definition ----- | ---------- -[Standard Numeric Format Strings](standard-numeric.md) | Describes standard format strings that create commonly used string representations of numeric values. -[Custom Numeric Format Strings](custom-numeric.md) | Describes custom format strings that create application-specific formats for numeric values. -[Standard Date and Time Format Strings](standard-datetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](xref:System.DateTime) values. -[Custom Date and Time Format Strings](custom-datetime.md) | Describes custom format strings that create application-specific formats for [DateTime](xref:System.DateTime) values. -[Standard TimeSpan Format Strings](standard-timespan.md) | Describes standard format strings that create commonly used string representations of time intervals. 
-[Custom TimeSpan Format Strings](custom-timespan.md) | Describes custom format strings that create application-specific formats for time intervals. -[Enumeration Format Strings](enumeration-format.md) | Describes standard format strings that are used to create string representations of enumeration values. +[Standard numeric format strings](standard-numeric.md) | Describes standard format strings that create commonly used string representations of numeric values. +[Custom numeric format strings](custom-numeric.md) | Describes custom format strings that create application-specific formats for numeric values. +[Standard date and time format strings](standard-datetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](xref:System.DateTime) values. +[Custom date and time format strings](custom-datetime.md) | Describes custom format strings that create application-specific formats for [DateTime](xref:System.DateTime) values. +[Standard TimeSpan format strings](standard-timespan.md) | Describes standard format strings that create commonly used string representations of time intervals. +[Custom TimeSpan format strings](custom-timespan.md) | Describes custom format strings that create application-specific formats for time intervals. +[Enumeration format strings](enumeration-format.md) | Describes standard format strings that are used to create string representations of enumeration values. [Guid.ToString(String)](xref:System.Guid.ToString(System.String)) | Describes standard format strings for [Guid](xref:System.Guid) values. ## Culture-Sensitive Formatting with Format Providers and the IFormatProvider Interface @@ -725,7 +726,7 @@ Method | Type of *formatType* parameter You can also implement your own format provider to replace any one of these classes. 
However, your implementation’s `GetFormat` method must return an object of the type listed in the previous table if it has to provide formatting information to the `ToString` method. -### Culture-Sensitive Formatting of Numeric Values +### Culture-sensitive formatting of numeric values By default, the formatting of numeric values is culture-sensitive. If you do not specify a culture when you call a formatting method, the formatting conventions of the current thread culture are used. This is illustrated in the following example, which changes the current thread culture four times and then calls the [Decimal.ToString(String)](xref:System.Decimal.ToString(System.String)) method. In each case, the result string reflects the formatting conventions of the current culture. This is because the `ToString` and `ToString(String)` methods wrap calls to each numeric type's `ToString(String, IFormatProvider)` method. @@ -851,7 +852,7 @@ End Module ' fr: 1 043,630 ``` -### Culture-Sensitive Formatting of Date and Time Values +### Culture-sensitive formatting of date and time values By default, the formatting of date and time values is culture-sensitive. If you do not specify a culture when you call a formatting method, the formatting conventions of the current thread culture are used. This is illustrated in the following example, which changes the current thread culture four times and then calls the [DateTime.ToString(String)](xref:System.DateTime.ToString(System.String)) method. In each case, the result string reflects the formatting conventions of the current culture. 
This is because the [DateTime.ToString()](xref:System.DateTime.ToString), [DateTime.ToString(String)](xref:System.DateTime.ToString(System.String)), [DateTimeOffset.ToString()](xref:System.DateTimeOffset.ToString(System.String)), and [DateTimeOffset.ToString(String)](xref:System.DateTimeOffset.ToString(System.String)) methods wrap calls to the [DateTime.ToString(String, IFormatProvider)](xref:System.DateTime.ToString(System.String,System.IFormatProvider)) and [DateTimeOffset.ToString(String, IFormatProvider)](xref:System.DateTimeOffset.ToString(System.String,System.IFormatProvider)) methods. @@ -926,7 +927,7 @@ End Module You can also format a date and time value for a specific culture by calling a [DateTime.ToString](xref:System.DateTime.ToString(System.String,System.IFormatProvider)) or [DateTimeOffset.ToString](xref:System.DateTimeOffset.ToString(System.String,System.IFormatProvider)) overload that has a provider parameter and passing it either of the following: -* A [CultureInfo](xref:System.Globalization.CultureInfo) object that represents the culture whose formatting conventions are to be used. Its [CultureInfo.GetFormat](xref:System.Globalization.CultureInfo#.CultureInfo.GetFormat(System.Type)) method returns the value of the [CultureInfo.NumberFormat](xref:System.Globalization.CultureInfo.NumberFormat) property, which is the [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object that provides culture-specific formatting information for numeric values. +* A [CultureInfo](xref:System.Globalization.CultureInfo) object that represents the culture whose formatting conventions are to be used. 
Its [CultureInfo.GetFormat](xref:System.Globalization.CultureInfo.GetFormat(System.Type)) method returns the value of the [CultureInfo.NumberFormat](xref:System.Globalization.CultureInfo.NumberFormat) property, which is the [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object that provides culture-specific formatting information for numeric values. * A [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object that defines the culture-specific formatting conventions to be used. Its [GetFormat](xref:System.Globalization.DateTimeFormatInfo.GetFormat(System.Type)) method returns an instance of itself. @@ -977,7 +978,7 @@ End Module ' fr: 28/05/2012 11:30:00 ``` -## The IFormattable Interface +## The IFormattable interface Typically, types that overload the `ToString` method with a format string and an [IFormatProvider](xref:System.IFormatProvider) parameter also implement the [IFormattable](xref:System.IFormattable) interface. This interface has a single member, [IFormattable.ToString(String, IFormatProvider)](xref:System.IFormattable.ToString(System.String,System.IFormatProvider)), that includes both a format string and a format provider as parameters. @@ -985,7 +986,7 @@ Implementing the [IFormattable](xref:System.IFormattable) interface for your app * Support for string conversion by the [Convert](xref:System.Convert) class. Calls to the [Convert.ToString(Object)](xref:System.Convert.ToString(System.Object)) and [Convert.ToString(Object, IFormatProvider)](xref:System.Convert.ToString(System.Object,System.IFormatProvider)) methods call your [IFormattable](xref:System.IFormattable) implementation automatically. -* Support for composite formatting. If a format item that includes a format string is used to format your custom type, the Common Language Runtime automatically calls your [IFormattable](xref:System.IFormattable) implementation and passes it the format string. 
For more information about composite formatting with methods such as `String.Format` or `Console.WriteLine`, see the [Composite Formatting](#Composite-Formatting) section. +* Support for composite formatting. If a format item that includes a format string is used to format your custom type, the Common Language Runtime automatically calls your [IFormattable](xref:System.IFormattable) implementation and passes it the format string. For more information about composite formatting with methods such as `String.Format` or `Console.WriteLine`, see the [Composite formatting](#composite-formatting) section. The following example defines a `Temperature` class that implements the [IFormattable](xref:System.IFormattable) interface. It supports the "C" or "G" format specifiers to display the temperature in Celsius, the "F" format specifier to display the temperature in Fahrenheit, and the "K" format specifier to display the temperature in Kelvin. @@ -1156,7 +1157,7 @@ End Module ' Temperature: 71,60°F ``` -## Composite Formatting +## Composite formatting Some methods, such as `String.Format` and `StringBuilder.AppendFormat`, support composite formatting. A composite format string is a kind of template that returns a single string that incorporates the string representation of zero, one, or more objects. Each object is represented in the composite format string by an indexed format item. The index of the format item corresponds to the position of the object that it represents in the method's parameter list. Indexes are zero-based. For example, in the following call to the `String.Format` method, the first format item, `{0:D}`, is replaced by the string representation of `thatDate`; the second format item, `{1}`, is replaced by the string representation of `item1`; and the third format item, `{2:C2}`, is replaced by the string representation of `item1.Value`. 
@@ -1227,9 +1228,9 @@ Next Note that, if both the alignment string component and the format string component are present, the former precedes the latter (for example, `{0,-20:g}`. -For more information about composite formatting, see [Composite Formatting](composite-format.md). +For more information about composite formatting, see [Composite formatting](composite-format.md). -## Custom Formatting with ICustomFormatter +## Custom formatting with ICustomFormatter Two composite formatting methods, [String.Format(IFormatProvider, String, Object[])](xref:System.String.Format(System.IFormatProvider,System.String,System.Object[])) and [StringBuilder.AppendFormat(IFormatProvider, String, Object[])](xref:System.Text.StringBuilder.AppendFormat(System.IFormatProvider,System.String,System.Object)), include a format provider parameter that supports custom formatting. When either of these formatting methods is called, it passes a [Type](xref:System.Type) object that represents an [ICustomFormatter](xref:System.ICustomFormatter) interface to the format provider’s `GetFormat` method. The `GetFormat` method is then responsible for returning the [ICustomFormatter](xref:System.ICustomFormatter) implementation that provides custom formatting. @@ -1374,20 +1375,20 @@ End Module ' 3,210,662,321 ``` -## Related Topics +## Related topics Title | Definition ----- | ---------- -[Standard Numeric Format Strings](standard-numeric.md) | Describes standard format strings that create commonly used string representations of numeric values. -[Custom Numeric Format Strings](custom-numeric.md) | Describes custom format strings that create application-specific formats for numeric values. -[Standard Date and Time Format Strings](standard-datetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](xref:System.DateTime) values. 
-[Custom Date and Time Format Strings](custom-datetime.md) | Describes custom format strings that create application-specific formats for [DateTime](xref:System.DateTime) values. -[Standard TimeSpan Format Strings](standard-timespan.md) | Describes standard format strings that create commonly used string representations of time intervals. -[Custom TimeSpan Format Strings](custom-timespan.md) | Describes custom format strings that create application-specific formats for time intervals. -[Enumeration Format Strings](enumeration-format.md) | Describes standard format strings that are used to create string representations of enumeration values. -[Composite Formatting](composite-format.md) | Describes how to embed one or more formatted values in a string. The string can subsequently be displayed on the console or written to a stream. -[Performing Formatting Operations](performing-formatting-operations.md) | Lists topics that provide step-by-step instructions for performing specific formatting operations. -[Parsing Strings](parsing-strings.md) | Describes how to initialize objects to the values described by string representations of those objects. Parsing is the inverse operation of formatting. +[Standard numeric format strings](standard-numeric.md) | Describes standard format strings that create commonly used string representations of numeric values. +[Custom numeric format strings](custom-numeric.md) | Describes custom format strings that create application-specific formats for numeric values. +[Standard date and time format strings](standard-datetime.md) | Describes standard format strings that create commonly used string representations of [DateTime](xref:System.DateTime) values. +[Custom date and time format strings](custom-datetime.md) | Describes custom format strings that create application-specific formats for [DateTime](xref:System.DateTime) values. 
+[Standard TimeSpan format strings](standard-timespan.md) | Describes standard format strings that create commonly used string representations of time intervals. +[Custom TimeSpan format strings](custom-timespan.md) | Describes custom format strings that create application-specific formats for time intervals. +[Enumeration format strings](enumeration-format.md) | Describes standard format strings that are used to create string representations of enumeration values. +[Composite formatting](composite-format.md) | Describes how to embed one or more formatted values in a string. The string can subsequently be displayed on the console or written to a stream. +[Performing formatting operations](performing-formatting-operations.md) | Lists topics that provide step-by-step instructions for performing specific formatting operations. +[Parsing strings](parsing-strings.md) | Describes how to initialize objects to the values described by string representations of those objects. Parsing is the inverse operation of formatting. ## Reference diff --git a/docs/standard/base-types/grouping.md b/docs/standard/base-types/grouping.md index f882bc99c891b..133da90ca1820 100644 --- a/docs/standard/base-types/grouping.md +++ b/docs/standard/base-types/grouping.md @@ -3,6 +3,7 @@ title: Grouping constructs in regular expressions description: Grouping constructs in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -18,7 +19,7 @@ Grouping constructs delineate the subexpressions of a regular expression and cap * Match a subexpression that is repeated in the input string. -* Apply a quantifier to a subexpression that has multiple regular expression language elements. For more information about quantifiers, see [Quantifiers in Regular Expressions](quantifiers.md). +* Apply a quantifier to a subexpression that has multiple regular expression language elements. 
For more information about quantifiers, see [Quantifiers in regular expressions](quantifiers.md). * Include a subexpression in the string that is returned by the [Regex.Replace](xref:System.Text.RegularExpressions.Regex.Replace(System.String,System.String)) and [Match.Result](xref:System.Text.RegularExpressions.Match.Result(System.String)) methods. @@ -28,20 +29,20 @@ The following table lists the grouping constructs supported by .NET regular expr Grouping construct | Capturing or noncapturing ------------------ | ------------------------- -[Matched subexpressions](#Matched-subexpressions) | Capturing -[Named matched subexpressions](#Named-matched-subexpressions) | Capturing -[Balancing group definitions](#Balancing group definitions) | Capturing -[Noncapturing groups](#Noncapturing-groups) | Noncapturing -[Group options](#Group-options) | Noncapturing -[Zero-width positive lookahead assertions](#Zero-width-positive-lookahead-assertions) | Noncapturing -[Zero-width negative lookahead assertions](#Zero-width-negative-lookahead-assertions) | Noncapturing -[Zero-width positive lookbehind assertions](#Zero-width-positive-lookbehind-assertions) | Noncapturing -[Zero-width negative lookbehind assertions](#Zero-width-negative-lookbehind-assertions) | Noncapturing -[Nonbacktracking subexpressions](#Nonbacktracking-subexpressions) | Noncapturing +[Matched subexpressions](#matched-subexpressions) | Capturing +[Named matched subexpressions](#named-matched-subexpressions) | Capturing +[Balancing group definitions](#balancing-group-definitions) | Capturing +[Noncapturing groups](#noncapturing-groups) | Noncapturing +[Group options](#group-options) | Noncapturing +[Zero-width positive lookahead assertions](#zero-width-positive-lookahead-assertions) | Noncapturing +[Zero-width negative lookahead assertions](#zero-width-negative-lookahead-assertions) | Noncapturing +[Zero-width positive lookbehind assertions](#zero-width-positive-lookbehind-assertions) | Noncapturing +[Zero-width 
negative lookbehind assertions](#zero-width-negative-lookbehind-assertions) | Noncapturing +[Nonbacktracking subexpressions](#nonbacktracking-subexpressions) | Noncapturing -For information on groups and the regular expression object model, see [Grouping Constructs and Regular Expression Objects](#Grouping-constructs-and-regular-expression-objects). +For information on groups and the regular expression object model, see [Grouping Constructs and Regular Expression Objects](#grouping-constructs-and-regular-expression-objects). -## Matched Subexpressions +## Matched subexpressions The following grouping construct captures a matched subexpression: @@ -60,7 +61,7 @@ You can access captured groups in four ways: * By using the **$**_number_ replacement sequence in a [Regex.Replace](xref:System.Text.RegularExpressions.Regex.Replace(System.String,System.String)) or [Match.Result](xref:System.Text.RegularExpressions.Match.Result(System.String)) method call, where *number* is the ordinal number of the captured subexpression. -* Programmatically, by using the [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](xref:System.Text.RegularExpressions.Match.Groups) property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. For more information, see the [Grouping Constructs and Regular Expression Objects](#Grouping-constructs-and-regular-expression-objects) section. +* Programmatically, by using the [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](xref:System.Text.RegularExpressions.Match.Groups) property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. 
For more information, see the [Grouping Constructs and Regular Expression Objects](#grouping-constructs-and-regular-expression-objects) section. The following example illustrates a regular expression that identifies duplicated words in text. The regular expression pattern's two capturing groups represent the two instances of the duplicated word. The second instance is captured to report its starting position in the input string. @@ -117,7 +118,7 @@ Pattern | Description `(\1)` | Match the string in the first captured group. This is the second capturing group. The example assigns it to a captured group so that the starting position of the duplicate word can be retrieved from the `Match.Index` property. `\W` | Match a non-word character, including white space and punctuation. This prevents the regular expression pattern from matching a word that starts with the word from the first captured group. -## Named Matched Subexpressions +## Named matched subexpressions The following grouping construct captures a matched subexpression and lets you access it by name or by number: @@ -134,7 +135,7 @@ or: where *name* is a valid group name, and *subexpression* is any valid regular expression pattern. *name* must not contain any punctuation characters and cannot begin with a number. > [!NOTE] -> If the [RegexOptions](xref:System.Text.RegularExpressions.RegexOptions) parameter of a regular expression pattern matching method includes the [RegexOptions.ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) flag, or if the **n** option is applied to this subexpression (see [Group options](#Group-options) later in this topic), the only way to capture a subexpression is to explicitly name capturing groups. 
+> If the [RegexOptions](xref:System.Text.RegularExpressions.RegexOptions) parameter of a regular expression pattern matching method includes the [RegexOptions.ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) flag, or if the **n** option is applied to this subexpression (see [Group options](#group-options) later in this topic), the only way to capture a subexpression is to explicitly name capturing groups. You can access named captured groups in the following ways: @@ -310,7 +311,7 @@ Pattern | Description `\D+` | Match one or more non-decimal digit characters. `(?\d+)?` | Match zero or one occurrence of one or more decimal digit characters. Assign the match to the `digit` named group. -## Balancing Group Definitions +## Balancing group definitions A balancing group definition deletes the definition of a previously defined group and stores, in the current group, the interval between the previously defined group and the current group. This grouping construct has the following format: @@ -470,7 +471,7 @@ Pattern | Description `(?(Open)(?!))` | If the `Open` group exists, abandon the match if an empty string can be matched, but do not advance the position of the regular expression engine in the string. This is a zero-width negative lookahead assertion. Because an empty string is always implicitly present in an input string, this match always fails. Failure of this match indicates that the angle brackets are not balanced. `$` | Match the end of the input string. -The final subexpression, `(?(Open)(?!))`, indicates whether the nesting constructs in the input string are properly balanced (for example, whether each left angle bracket is matched by a right angle bracket). It uses conditional matching based on a valid captured group; for more information, see [Alternation Constructs in Regular Expressions](alternation.md). 
If the `Open` group is defined, the regular expression engine attempts to match the subexpression `(?!)` in the input string. The `Open` group should be defined only if nesting constructs are unbalanced. Therefore, the pattern to be matched in the input string should be one that always causes the match to fail. In this case, `(?!)` is a zero-width negative lookahead assertion that always fails, because an empty string is always implicitly present at the next position in the input string. +The final subexpression, `(?(Open)(?!))`, indicates whether the nesting constructs in the input string are properly balanced (for example, whether each left angle bracket is matched by a right angle bracket). It uses conditional matching based on a valid captured group; for more information, see [Alternation constructs in regular expressions](alternation.md). If the `Open` group is defined, the regular expression engine attempts to match the subexpression `(?!)` in the input string. The `Open` group should be defined only if nesting constructs are unbalanced. Therefore, the pattern to be matched in the input string should be one that always causes the match to fail. In this case, `(?!)` is a zero-width negative lookahead assertion that always fails, because an empty string is always implicitly present at the next position in the input string. In the example, the regular expression engine evaluates the input string ">" as shown in the following table. @@ -501,7 +502,7 @@ Step | Pattern | Result 23 | `(?(Open)(?!))` | The `Open` group is not defined, so no match is attempted. 24 | `$` | Matches the end of the input string. -## Noncapturing Groups +## Noncapturing groups The following grouping construct does not capture the substring that is matched by a subexpression: @@ -564,13 +565,13 @@ Pattern | Description `(?:\b(?:\w+)\W*)+` | Match the pattern of one or more word characters starting at a word boundary, followed by zero or more non-word characters, one or more times. 
Do not assign the matched text to a captured group. `\.` | Match a period. -## Group Options +## Group options The following grouping construct applies or disables the specified options within a subexpression: **(?imnsx-imnsx:**_subexpression_**)** -where *subexpression* is any valid regular expression pattern. For example, `(?i-s:)` turns on case insensitivity and disables single-line mode. For more information about the inline options you can specify, see [Regular Expression Options](options.md). +where *subexpression* is any valid regular expression pattern. For example, `(?i-s:)` turns on case insensitivity and disables single-line mode. For more information about the inline options you can specify, see [Regular expression options](options.md). > [!NOTE] > You can specify options that apply to an entire regular expression rather than a subexpression by using a [System.Text.RegularExpressions.Regex](xref:System.Text.RegularExpressions.Regex) class constructor or a static method. You can also specify inline options that apply after a specific point in a regular expression by using the `(?imnsx-imnsx)` language construct. @@ -608,7 +609,7 @@ Next ' 'decidedly ' found at index 9. ``` -## Zero-Width Positive Lookahead Assertions +## Zero-width positive lookahead assertions The following grouping construct defines a zero-width positive lookahead assertion: @@ -687,7 +688,7 @@ Pattern | Description `\w+` | Match one or more word characters. `(?=\sis\b)` | Determine whether the word characters are followed by a white-space character and the string "is", which ends on a word boundary. If so, the match is successful. -## Zero-Width Negative Lookahead Assertions +## Zero-width negative lookahead assertions The following grouping construct defines a zero-width negative lookahead assertion: @@ -801,7 +802,7 @@ Pattern | Description `\b` | End the match at a word boundary. 
`\p{P})` | If the next character is not a punctuation symbol (such as a period or a comma), the match succeeds. -## Zero-Width Positive Lookbehind Assertions +## Zero-width positive lookbehind assertions The following grouping construct defines a zero-width positive lookbehind assertion: @@ -861,7 +862,7 @@ Pattern | Description Zero-width positive lookbehind assertions are also used to limit backtracking when the last character or characters in a captured group must be a subset of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width positive lookbehind assertion to require that the last character be alphabetical. -## Zero-Width Negative Lookbehind Assertions +## Zero-width negative lookbehind assertions The following grouping construct defines a zero-width negative lookbehind assertion: @@ -938,7 +939,7 @@ Pattern | Description `\d{4}\b` | Match four decimal digits, and end the match at a word boundary. `(?(\w)\1+)` | Match one or more occurrences of a duplicated word character, but do not backtrack to match the last character on a word boundary. -## Grouping Constructs and Regular Expression Objects +## Grouping constructs and regular expression objects Substrings that are matched by a regular expression capturing group are represented by [System.Text.RegularExpressions.Group](xref:System.Text.RegularExpressions.Group) objects, which can be retrieved from the [System.Text.RegularExpressions.GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object that is returned by the [Match.Groups](xref:System.Text.RegularExpressions.Match.Groups) property. The [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object is populated as follows: * The first [Group](xref:System.Text.RegularExpressions.Group) object in the collection (the object at index zero) represents the entire match. 
-* The next set of [Group](xref:System.Text.RegularExpressions.Group) objects represent unnamed (numbered) capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index values of these groups range from 1 to the number of unnamed capturing groups in the collection. (The index of a particular group is equivalent to its numbered backreference. For more information about backreferences, see [Backreference Constructs in Regular Expressions](backreference.md) +* The next set of [Group](xref:System.Text.RegularExpressions.Group) objects represent unnamed (numbered) capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index values of these groups range from 1 to the number of unnamed capturing groups in the collection. (The index of a particular group is equivalent to its numbered backreference. For more information about backreferences, see [Backreference constructs in regular expressions](backreference.md).) * The final set of [Group](xref:System.Text.RegularExpressions.Group) objects represent named capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index value of the first named capturing group is one greater than the index of the last unnamed capturing group. If there are no unnamed capturing groups in the regular expression, the index value of the first named capturing group is one. @@ -1151,7 +1152,7 @@ Pattern | Description The first capturing group matches each word of the sentence. The second capturing group matches each word along with the punctuation and white space that follow the word. The [Group](xref:System.Text.RegularExpressions.Group) object whose index is 2 provides information about the text matched by the second capturing group.
The complete set of words captured by the capturing group are available from the [CaptureCollection](xref:System.Text.RegularExpressions.CaptureCollection) object returned by the [Group.Captures](xref:System.Text.RegularExpressions.Group.Captures) property. -## See Also +## See also [Regular expression language - quick reference](quick-ref.md) diff --git a/docs/standard/base-types/manipulating-strings.md b/docs/standard/base-types/manipulating-strings.md index cb34ec03ad0cb..c4d87e6d352de 100644 --- a/docs/standard/base-types/manipulating-strings.md +++ b/docs/standard/base-types/manipulating-strings.md @@ -3,6 +3,7 @@ title: Manipulating strings description: Manipulating strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/22/2016 ms.topic: article @@ -16,19 +17,19 @@ ms.assetid: da3c277e-b06e-48bd-ae1f-1e7e4240b93e .NET provides an extensive set of routines that enable you to efficiently create, compare, and modify strings as well as rapidly parse large amounts of text and data to search for, remove, and replace text patterns. -## In This Section +## In this section -[Best Practices for Using Strings](best-practices-strings.md) - Examines string-sorting, comparison, and casing methods in .NET, and provides recommendations for selecting a string-handling method . +[Best practices for using strings](best-practices-strings.md) - Examines string-sorting, comparison, and casing methods in .NET, and provides recommendations for selecting a string-handling method. -[Regular Expressions](regular-expressions.md) - Provides detailed information about .NET regular expressions, including language elements, regular expression behavior, and examples. +[Regular expressions](regular-expressions.md) - Provides detailed information about .NET regular expressions, including language elements, regular expression behavior, and examples.
-[Basic String Operations](basic-string-operations.md) - Describes string operations provided by the @System.String and @System.Text.StringBuilder classes, including creating new strings from arrays of bytes, comparing string values, and modifying existing strings. +[Basic string operations](basic-string-operations.md) - Describes string operations provided by the @System.String and @System.Text.StringBuilder classes, including creating new strings from arrays of bytes, comparing string values, and modifying existing strings. -[Character Encoding in .NET](character-encoding.md) - Describes how to encode and decode character formats such as Unicode. +[Character encoding in .NET](character-encoding.md) - Describes how to encode and decode character formats such as Unicode. -[Type Conversion](type-conversion.md) - Describes how to convert from one type to another. +[Type conversion](type-conversion.md) - Describes how to convert from one type to another. -[Formatting Types](formatting-types.md) - Describes how to format strings using the string format specifiers. +[Formatting types](formatting-types.md) - Describes how to format strings using the string format specifiers. -[Parsing Strings](parsing-strings.md) - Describes how to convert strings into types. +[Parsing strings](parsing-strings.md) - Describes how to convert strings into types. 
diff --git a/docs/standard/base-types/miscellaneous.md b/docs/standard/base-types/miscellaneous.md index 5f2a7eaf2cac4..db0fa34f07b9d 100644 --- a/docs/standard/base-types/miscellaneous.md +++ b/docs/standard/base-types/miscellaneous.md @@ -3,6 +3,7 @@ title: Miscellaneous constructs in regular expressions description: Miscellaneous constructs in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -25,7 +26,7 @@ You can set or disable specific pattern matching options for part of a regular e (?imnsx-imnsx) ``` -You list the options you want to enable after the question mark, and the options you want to disable after the minus sign. The following table describes each option. For more information about each option, see [Regular Expression Options](options.md). +You list the options you want to enable after the question mark, and the options you want to disable after the minus sign. The following table describes each option. For more information about each option, see [Regular expression options](options.md). Option | Description ------ | ----------- @@ -38,7 +39,7 @@ Option | Description Any change in regular expression options defined by the **(?imnsx-imnsx)** construct remains in effect until the end of the enclosing group. > [!NOTE] -> The **(?imnsx-imnsx**:_subexpression_**)** grouping construct provides identical functionality for a subexpression. For more information, see [Grouping Constructs in Regular Expressions](grouping.md). +> The **(?imnsx-imnsx**:_subexpression_**)** grouping construct provides identical functionality for a subexpression. For more information, see [Grouping constructs in regular expressions](grouping.md). The following example uses the **i**, **n**, and **x** options to enable case insensitivity and explicit captures, and to ignore white space in the regular expression pattern in the middle of a regular expression. 
diff --git a/docs/standard/base-types/object-model.md b/docs/standard/base-types/object-model.md index 8b7b5dd5b1762..5318ec3a78843 100644 --- a/docs/standard/base-types/object-model.md +++ b/docs/standard/base-types/object-model.md @@ -3,6 +3,7 @@ title: The regular expression object model description: The regular expression object model keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -16,19 +17,19 @@ ms.assetid: a1e611ec-c6a2-48c6-9c52-0ed845787621 This topic describes the object model used in working with.NET regular expressions. It contains the following sections: -* [The Regular Expression Engine](#The-Regular-Expression-Engine) +* [The regular expression engine](#the-regular-expression-engine) -* [The MatchCollection and Match Objects](#The-MatchCollection-and-Match-Objects) +* [The MatchCollection and Match objects](#the-matchcollection-and-match-objects) -* [The Group Collection](#The-Group-Collection) +* [The Group collection](#the-group-collection) -* [The Captured Group](#The-Captured-Group) +* [The captured group](#the-captured-group) -* [The Capture Collection](#The-Capture-Collection) +* [The capture collection](#the-capture-collection) -* [The Individual Capture](#The-Individual-Capture) +* [The individual capture](#the-individual-capture) -## The Regular Expression Engine +## The regular expression engine The regular expression engine in .NET is represented by the [Regex](xref:System.Text.RegularExpressions.Regex) class. The regular expression engine is responsible for parsing and compiling a regular expression, and for performing operations that match the regular expression pattern with an input string. The engine is the central component in .NET regular expression object model. @@ -52,7 +53,7 @@ You can call the methods of the [Regex](xref:System.Text.RegularExpressions.Rege These operations are described in the following sections. 
-### Matching a Regular Expression Pattern +### Matching a regular expression pattern The [Regex.IsMatch](xref:System.Text.RegularExpressions.Regex.IsMatch(System.String)) method returns `true` if the string matches the pattern, or `false` if it does not. The [IsMatch](xref:System.Text.RegularExpressions.Regex.IsMatch(System.String)) method is often used to validate string input. For example, the following code ensures that a string matches a valid social security number in the United States. @@ -112,7 +113,7 @@ Pattern | Description `\d{4}` | Match four decimal digits. `$` | Match the end of the input string. -### Extracting a Single Match or the First Match +### Extracting a single match or the first match The [Regex.Match](xref:System.Text.RegularExpressions.Regex.Match(System.String)) method returns a [Match](xref:System.Text.RegularExpressions.Match) object that contains information about the first substring that matches a regular expression pattern. If the `Match.Success` property returns `true`, indicating that a match was found, you can retrieve information about subsequent matches by calling the [Match.NextMatch](xref:System.Text.RegularExpressions.Match.NextMatch) method. These method calls can continue until the `Match.Success` property returns `false`. For example, the following code uses the [Regex.Match(String, String)](xref:System.Text.RegularExpressions.Regex.Match(System.String,System.String)) method to find the first occurrence of a duplicated word in a string. It then calls the [Match.NextMatch](xref:System.Text.RegularExpressions.Match.NextMatch) method to find any additional occurrences. The example examines the `Match.Success` property after each method call to determine whether the current match was successful and whether a call to the [Match.NextMatch](xref:System.Text.RegularExpressions.Match.NextMatch) method should follow. @@ -170,7 +171,7 @@ Pattern | Description `(\1)` | Match the first captured string. 
This is the second capturing group. `\b` | End the match on a word boundary. -### Extracting All Matches +### Extracting all matches The [Regex.Matches](xref:System.Text.RegularExpressions.Regex.Matches(System.String)) method returns a [MatchCollection](xref:System.Text.RegularExpressions.MatchCollection) object that contains information about all matches that the regular expression engine found in the input string. For example, the previous example could be rewritten to call the [Matches](xref:System.Text.RegularExpressions.Regex.Matches(System.String)) method instead of the [Match](xref:System.Text.RegularExpressions.Regex.Match(System.String)) and [NextMatch](xref:System.Text.RegularExpressions.Match.NextMatch) methods. @@ -212,7 +213,7 @@ End Module ' Duplicate 'that' found at position 22. ``` -### Replacing a Matched Substring +### Replacing a matched substring The [Regex.Replace](xref:System.Text.RegularExpressions.Regex.Replace(System.String,System.String)) method replaces each substring that matches the regular expression pattern with a specified string or regular expression pattern, and returns the entire input string with replacements. For example, the following code adds a U.S. currency symbol before a decimal number in a string. @@ -266,7 +267,7 @@ Pattern | Replacement string `$$` | The dollar sign (**$**) character. `$&` | The entire matched substring. -### Splitting a Single String into an Array of Strings +### Splitting a single string into an array of strings The [Regex.Split](xref:System.Text.RegularExpressions.Regex.Split(System.String)) method splits the input string at the positions defined by a regular expression match. For example, the following code places the items in a numbered list into a string array. @@ -326,11 +327,11 @@ Pattern | Description `\.` | Match a period. `\s` | Match a white-space character. 
-## The MatchCollection and Match Objects +## The MatchCollection and Match objects [Regex](xref:System.Text.RegularExpressions.Regex) methods return two objects that are part of the regular expression object model: the [MatchCollection](xref:System.Text.RegularExpressions.MatchCollection) object, and the [Match](xref:System.Text.RegularExpressions.Match) object. -### The Match Collection +### The Match collection The [Regex.Matches](xref:System.Text.RegularExpressions.Regex.Matches(System.String)) method returns a [MatchCollection](xref:System.Text.RegularExpressions.MatchCollection) object that contains [Match](xref:System.Text.RegularExpressions.Match) objects that represent all the matches that the regular expression engine found, in the order in which they occur in the input string. If there are no matches, the method returns a [MatchCollection](xref:System.Text.RegularExpressions.MatchCollection) object that contains [Match](xref:System.Text.RegularExpressions.Match) object with no members. The [MatchCollection](xref:System.Text.RegularExpressions.MatchCollection) `Item` property lets you access individual members of the collection by index, from zero to one less than the value of the [MatchCollection.Count](xref:System.Text.RegularExpressions.MatchCollection.Count) property. 'Item` is the collection's indexer (in C#) and default property (in Visual Basic).. @@ -511,7 +512,7 @@ Two properties of the [Match](xref:System.Text.RegularExpressions.Match) class r * The `Match.Captures` property returns a [CaptureCollection](xref:System.Text.RegularExpressions.CaptureCollection) object that is of limited use. The collection is not populated for a [Match](xref:System.Text.RegularExpressions.Match) object whose `Success` property is `false`. Otherwise, it contains a single [Capture](xref:System.Text.RegularExpressions.Capture) object that has the same information as the [Match](xref:System.Text.RegularExpressions.Match) object. 
-For more information about these objects, see the [The Group Collection](#The-Group-Collection) and [The Capture Collection](#The-Capture-Collection) sections later in this topic. +For more information about these objects, see [The Group collection](#the-group-collection) and [The capture collection](#the-capture-collection) later in this topic. Two additional properties of the [Match](xref:System.Text.RegularExpressions.Match) class provide information about the match. The `Match.Value` property returns the substring in the input string that matches the regular expression pattern. The `Match.Index` property returns the zero-based starting position of the matched string in the input string. @@ -577,7 +578,7 @@ Pattern | Description The replacement pattern **$$ $&** indicates that the matched substring should be replaced by a dollar sign (**$**) symbol (the `$$` pattern), a space, and the value of the match (the `$&` pattern). -## The Group Collection +## The Group collection The [Match.Groups](xref:System.Text.RegularExpressions.Match.Groups) property returns a [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object that contains [Group](xref:System.Text.RegularExpressions.Group) objects that represent captured groups in a single match. The first [Group](xref:System.Text.RegularExpressions.Group) object in the collection (at index 0) represents the entire match. Each object that follows represents the results of a single capturing group. @@ -653,7 +654,7 @@ Pattern | Description `(\d{4})` | Match four decimal digits. This is the third capturing group. `\b` | End the match on a word boundary. -## The Captured Group +## The captured group The [Group](xref:System.Text.RegularExpressions.Group) class represents the result from a single capturing group.
[Group](xref:System.Text.RegularExpressions.Group) objects that represent the capturing groups defined in a regular expression are returned by the [Item](xref:System.Text.RegularExpressions.GroupCollection.Item(System.Int32)) property of the [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object returned by the [Match.Groups](xref:System.Text.RegularExpressions.Match.Groups) property. The [Item](xref:System.Text.RegularExpressions.GroupCollection.Item(System.Int32)) property is the indexer (in C#) and the default property (in Visual Basic) of the [Group](xref:System.Text.RegularExpressions.Group) class. You can also retrieve individual members by iterating the collection using the `foreach` construct. For an example, see the previous section. @@ -742,7 +743,7 @@ Pattern | Description The properties of the [Group](xref:System.Text.RegularExpressions.Group) class provide information about the captured group: The `Group.Value` property contains the captured substring, the `Group.Index` property indicates the starting position of the captured group in the input text, the `Group.Length` property contains the length of the captured text, and the `Group.Success` property indicates whether a substring matched the pattern defined by the capturing group. -Applying quantifiers to a group (for more information, see [Quantifiers in Regular Expressions](quantifiers.md)) modifies the relationship of one capture per capturing group in two ways: +Applying quantifiers to a group (for more information, see [Quantifiers in regular expressions](quantifiers.md)) modifies the relationship of one capture per capturing group in two ways: * If the __*__ or __*?__ quantifier (which specifies zero or more matches) is applied to a group, a capturing group may not have a match in the input string. When there is no captured text, the properties of the [Group](xref:System.Text.RegularExpressions.Group) object are set as shown in the following table. 
@@ -842,7 +843,7 @@ Applying quantifiers to a group (for more information, see [Quantifiers in Regul ' Group 2: sentence ``` -## The Capture Collection +## The capture collection The [Group](xref:System.Text.RegularExpressions.Group) object contains information only about the last capture. However, the entire set of captures made by a capturing group is still available from the [CaptureCollection](xref:System.Text.RegularExpressions.CaptureCollection) object that is returned by the [Group.Captures](xref:System.Text.RegularExpressions.Group.Captures) property. Each member of the collection is a [Capture](xref:System.Text.RegularExpressions.Capture) object that represents a capture made by that capturing group, in the order in which they were captured (and, therefore, in the order in which the captured strings were matched from left to right in the input string). You can retrieve individual [Capture](xref:System.Text.RegularExpressions.Capture) objects from the collection in either of two ways: @@ -1034,7 +1035,7 @@ End Module ' Capture 2: 'b' at position 7 ``` -## The Individual Capture +## The individual capture The [Capture](xref:System.Text.RegularExpressions.Capture) class contains the results from a single subexpression capture. The [Capture.Value](xref:System.Text.RegularExpressions.Capture.Value) property contains the matched text, and the [Capture.Index](xref:System.Text.RegularExpressions.Capture.Index) property indicates the zero-based position in the input string at which the matched substring begins. @@ -1105,7 +1106,7 @@ Pattern | Description `;` | Match a semicolon. `((\w+(\s\w+)*),(\d+);)+` | Match the pattern of a word followed by any additional words followed by a comma, one or more digits, and a semicolon, one or more times. This is the first capturing group. 
-## See Also +## See also [System.Text.RegularExpressions](xref:System.Text.RegularExpressions) diff --git a/docs/standard/base-types/options.md b/docs/standard/base-types/options.md index 4479d78147a67..3cf075eb3431a 100644 --- a/docs/standard/base-types/options.md +++ b/docs/standard/base-types/options.md @@ -3,6 +3,7 @@ title: Regular expression options description: Regular expression options keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -18,18 +19,18 @@ By default, the comparison of an input string with any literal characters in a r RegexOptions member | Inline character | Effect ------------------- | ---------------- | ------ -[None](xref:System.Text.RegularExpressions.RegexOptions.None) | Not available | Use default behavior. For more information, see [Default Options](#Default-Options). -[IgnoreCase](xref:System.Text.RegularExpressions.RegexOptions.IgnoreCase) | **i** | Use case-insensitive matching. For more information, see [Case-Insensitive Matching](#Case-Insensitive-Matching). -[Multiline](xref:System.Text.RegularExpressions.RegexOptions.Multiline) | **m** | Use multiline mode, where **^** and **$** match the beginning and end of each line (instead of the beginning and end of the input string). For more information, see [Multiline Mode](#Multiline-Mode). -[Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) | **s** | Use single-line mode, where the period (**.**) matches every character (instead of every character except **\n**). For more information, see [Singleline Mode](#Singleline-Mode). -[ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) | **n** | Do not capture unnamed groups. The only valid captures are explicitly named or numbered groups of the form **(?<**_name_**>** _subexpression_**)**. For more information, see [Explicit Captures Only](#Explicit-Captures-Only). 
-[Compiled](xref:System.Text.RegularExpressions.RegexOptions.Compiled) | Not available | Compile the regular expression to an assembly. For more information, see [Compiled Regular Expressions](#Compiled-Regular-Expressions). -[IgnorePatternWhitespace](xref:System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace) | **x** | Exclude unescaped white space from the pattern, and enable comments after a number sign (**#**). For more information, see [Ignore Whitespace](#Ignore-Whitespace). -[RightToLeft](xref:System.Text.RegularExpressions.RegexOptions.RightToLeft) | Not available | Change the search direction. Search moves from right to left instead of from left to right. For more information, see [Right-to-Left Mode](#Right-to-Left-Mode). -[ECMAScript](xref:System.Text.RegularExpressions.RegexOptions.ECMAScript) | Not available | Enable ECMAScript-compliant behavior for the expression. For more information, see [ECMAScript Matching Behavior](#ECMAScript-Matching-Behavior). -[CultureInvariant](xref:System.Text.RegularExpressions.RegexOptions.CultureInvariant) | Not available | Ignore cultural differences in language. For more information, see [Comparison Using the Invariant Culture](#Comparison-Using-the-Invariant-Culture). +[None](xref:System.Text.RegularExpressions.RegexOptions.None) | Not available | Use default behavior. For more information, see [Default options](#default-options). +[IgnoreCase](xref:System.Text.RegularExpressions.RegexOptions.IgnoreCase) | **i** | Use case-insensitive matching. For more information, see [Case-insensitive matching](#case-insensitive-matching). +[Multiline](xref:System.Text.RegularExpressions.RegexOptions.Multiline) | **m** | Use multiline mode, where **^** and **$** match the beginning and end of each line (instead of the beginning and end of the input string). For more information, see [Multiline mode](#multiline-mode). 
+[Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) | **s** | Use single-line mode, where the period (**.**) matches every character (instead of every character except **\n**). For more information, see [Single-line mode](#single-line-mode). +[ExplicitCapture](xref:System.Text.RegularExpressions.RegexOptions.ExplicitCapture) | **n** | Do not capture unnamed groups. The only valid captures are explicitly named or numbered groups of the form **(?<**_name_**>** _subexpression_**)**. For more information, see [Explicit captures only](#explicit-captures-only). +[Compiled](xref:System.Text.RegularExpressions.RegexOptions.Compiled) | Not available | Compile the regular expression to an assembly. For more information, see [Compiled regular expressions](#compiled-regular-expressions). +[IgnorePatternWhitespace](xref:System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace) | **x** | Exclude unescaped white space from the pattern, and enable comments after a number sign (**#**). For more information, see [Ignore white space](#ignore-white-space). +[RightToLeft](xref:System.Text.RegularExpressions.RegexOptions.RightToLeft) | Not available | Change the search direction. Search moves from right to left instead of from left to right. For more information, see [Right-to-left mode](#right-to-left-mode). +[ECMAScript](xref:System.Text.RegularExpressions.RegexOptions.ECMAScript) | Not available | Enable ECMAScript-compliant behavior for the expression. For more information, see [ECMAScript matching behavior](#ecmascript-matching-behavior). +[CultureInvariant](xref:System.Text.RegularExpressions.RegexOptions.CultureInvariant) | Not available | Ignore cultural differences in language. For more information, see [Comparison using the invariant culture](#comparison-using-the-invariant-culture). 
-## Specifying the Options +## Specifying the options You can specify options for regular expressions in one of three ways: @@ -64,7 +65,7 @@ You can specify options for regular expressions in one of three ways: ' 'decidedly ' found at index 9. ``` -* By applying inline options in a regular expression pattern with the syntax **(?imnsx-imnsx)**. The option applies to the pattern from the point that the option is defined to either the end of the pattern or to the point at which the option is undefined by another inline option. Note that the [System.Text.RegularExpressions.RegexOptions](xref:System.Text.RegularExpressions.RegexOptions) property of a [Regex](xref:System.Text.RegularExpressions.Regex) instance does not reflect these inline options. For more information, see the [Miscellaneous Constructs in Regular Expressions](miscellaneous.md) topic. +* By applying inline options in a regular expression pattern with the syntax **(?imnsx-imnsx)**. The option applies to the pattern from the point that the option is defined to either the end of the pattern or to the point at which the option is undefined by another inline option. Note that the [System.Text.RegularExpressions.RegexOptions](xref:System.Text.RegularExpressions.RegexOptions) property of a [Regex](xref:System.Text.RegularExpressions.Regex) instance does not reflect these inline options. For more information, see the [Miscellaneous constructs in regular expressions](miscellaneous.md) topic. The following example provides an illustration. It uses inline options to enable case-insensitive matching and to ignore pattern white space when identifying words that begin with the letter "d". @@ -91,7 +92,7 @@ You can specify options for regular expressions in one of three ways: ' 'decidedly ' found at index 9. ``` -* By applying inline options in a particular grouping construct in a regular expression pattern with the syntax **(?imnsx-imnsx:**_subexpression_**)**. 
No sign before a set of options turns the set on; a minus sign before a set of options turns the set off. (**?** is a fixed part of the language construct's syntax that is required whether options are enabled or disabled.) The option applies only to that group. For more information, see [Grouping Constructs in Regular Expressions](grouping.md). +* By applying inline options in a particular grouping construct in a regular expression pattern with the syntax **(?imnsx-imnsx:**_subexpression_**)**. No sign before a set of options turns the set on; a minus sign before a set of options turns the set off. (**?** is a fixed part of the language construct's syntax that is required whether options are enabled or disabled.) The option applies only to that group. For more information, see [Grouping constructs in regular expressions](grouping.md). The following example provides an illustration. It uses inline options in a grouping construct to enable case-insensitive matching and to ignore pattern white space when identifying words that begin with the letter "d". @@ -149,7 +150,7 @@ The following five regular expression options can be set using the *options* par * [RegexOptions.ECMAScript](xref:System.Text.RegularExpressions.RegexOptions.ECMAScript) -## Determining the Options +## Determining the options You can determine which options were provided to a [Regex](xref:System.Text.RegularExpressions.Regex) object when it was instantiated by retrieving the value of the read-only [Regex.Options](xref:System.Text.RegularExpressions.Regex.Options) property. @@ -185,7 +186,7 @@ End If The following sections list the options supported by regular expression in .NET. -## Default Options +## Default options The [RegexOptions.None](xref:System.Text.RegularExpressions.RegexOptions.None) option indicates that no options have been specified, and the regular expression engine uses its default behavior. 
This includes the following: @@ -210,7 +211,7 @@ The [RegexOptions.None](xref:System.Text.RegularExpressions.RegexOptions.None) o Because the [RegexOptions.None](xref:System.Text.RegularExpressions.RegexOptions.None) option represents the default behavior of the regular expression engine, it is rarely explicitly specified in a method call. A constructor or static pattern-matching method without an options parameter is called instead. -## Case-Insensitive Matching +## Case-insensitive matching The [RegexOptions.IgnoreCase](xref:System.Text.RegularExpressions.RegexOptions.IgnoreCase) option, or the **i** inline option, provides case-insensitive matching. By default, the casing conventions of the current culture are used. @@ -329,7 +330,7 @@ End Module ' Found them at index 18. ``` -## Multiline Mode +## Multiline mode The [RegexOptions.Multiline](xref:System.Text.RegularExpressions.RegexOptions.Multiline) option, or the **m** inline option, enables the regular expression engine to handle an input string that consists of multiple lines. It changes the interpretation of the **^** and **$** language elements so that they match the beginning and end of a line, instead of the beginning and end of the input string. @@ -538,7 +539,7 @@ End Class ' Joe: 164 ``` -## Single-line Mode +## Single-line mode The [RegexOptions.Singleline](xref:System.Text.RegularExpressions.RegexOptions.Singleline) option, or the s inline option, causes the regular expression engine to treat the input string as if it consists of a single line. It does this by changing the behavior of the period (**.**) language element so that it matches every character, instead of matching every character except for the newline character **\n** or \u000A. @@ -628,7 +629,7 @@ End Module ' This\ is\ one\ line\ and\r\nthis\ is\ the\ second\. ``` -## Explicit Captures Only +## Explicit captures only By default, capturing groups are defined by the use of parentheses in the regular expression pattern. 
Named groups are assigned a name or number by the **(?<**_name_**>** _subexpression_**)** language option, whereas unnamed groups are accessible by index. In the [GroupCollection](xref:System.Text.RegularExpressions.GroupCollection) object, unnamed groups precede named groups. @@ -1084,7 +1085,7 @@ End Module ' Capture 0: Instead, it is a nonsensical paragraph. ``` -## Compiled Regular Expressions +## Compiled regular expressions By default, regular expressions in .NET are interpreted. When a [Regex](xref:System.Text.RegularExpressions.Regex) object is instantiated or a static [Regex](xref:System.Text.RegularExpressions.Regex) method is called, the regular expression pattern is parsed into a set of custom opcodes, and an interpreter uses these opcodes to run the regular expression. This involves a tradeoff: The cost of initializing the regular expression engine is minimized at the expense of run-time performance. @@ -1104,7 +1105,7 @@ However, this improvement in performance occurs only under the following conditi * A static regular expression is used in multiple calls to regular expression pattern-matching methods. (The performance improvement is possible because regular expressions used in static method calls are cached by the regular expression engine.) -## Ignore White Space +## Ignore white space By default, white space in a regular expression pattern is significant; it forces the regular expression engine to match a white-space character in the input string. Because of this, the regular expression `"\b\w+\s"` and `"\b\w+ "` are roughly equivalent regular expressions. In addition, when the number sign (**#**) is encountered in a regular expression pattern, it is interpreted as a literal character to be matched. @@ -1132,7 +1133,7 @@ The following example defines the following regular expression pattern: `\b \(? ( (?>\w+) ,?\s? )+ [\.!?] \)? # Matches an entire sentence`. 
-This pattern is similar to the pattern defined in the [Explicit Captures Only](#Explicit-Captures-Only) section, except that it uses the [RegexOptions.IgnorePatternWhitespace](xref:System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace) option to ignore pattern white space. +This pattern is similar to the pattern defined in the [Explicit captures only](#explicit-captures-only) section, except that it uses the [RegexOptions.IgnorePatternWhitespace](xref:System.Text.RegularExpressions.RegexOptions.IgnorePatternWhitespace) option to ignore pattern white space. ```csharp using System; @@ -1228,7 +1229,7 @@ End Module ' Instead, it is a nonsensical paragraph. ``` -## Right-to-Left Mode +## Right-to-left mode By default, the regular expression engine searches from left to right. You can reverse the search direction by using the [RegexOptions.RightToLeft](xref:System.Text.RegularExpressions.RegexOptions.RightToLeft) option. The search automatically begins at the last character position of the string. For pattern-matching methods that include a starting position parameter, such as [Regex.Match(String, Int32)](xref:System.Text.RegularExpressions.Regex.Match(System.String,System.Int32)), the starting position is the index of the rightmost character position at which the search is to begin. @@ -1272,7 +1273,7 @@ End Module ' 'builder ' found at position 0. ``` -Also note that the lookahead assertion (the **(?**=_subexpression_**)** language element) and the lookbehind assertion (the **(?<**=_subexpression_**)** language element) do not change direction. The lookahead assertions look to the right; the lookbehind assertions look to the left. For example, the regular expression `(?<=\d{1,2}\s)\w+,?\s\d{4}` uses the lookbehind assertion to test for a date that precedes a month name. The regular expression then matches the month and the year. For information on lookahead and lookbehind assertsions, see [Grouping Constructs in Regular Expressions](grouping.md). 
+Also note that the lookahead assertion (the **(?**=_subexpression_**)** language element) and the lookbehind assertion (the **(?<**=_subexpression_**)** language element) do not change direction. The lookahead assertions look to the right; the lookbehind assertions look to the left. For example, the regular expression `(?<=\d{1,2}\s)\w+,?\s\d{4}` uses the lookbehind assertion to test for a one- or two-digit day that precedes a month name. The regular expression then matches the month and the year. For information on lookahead and lookbehind assertions, see [Grouping constructs in regular expressions](grouping.md). ```csharp using System; @@ -1333,7 +1334,7 @@ Pattern | Description `\s` | Match a white-space character. `\d{4}` | Match four decimal digits. -## ECMAScript Matching Behavior +## ECMAScript matching behavior By default, the regular expression engine uses canonical behavior when matching a regular expression pattern to input text. However, you can instruct the regular expression engine to use ECMAScript matching behavior by specifying the [RegexOptions.ECMAScript](xref:System.Text.RegularExpressions.RegexOptions.ECMAScript) option. @@ -1344,7 +1345,7 @@ The [RegexOptions.ECMAScript](xref:System.Text.RegularExpressions.RegexOptions.E The behavior of ECMAScript and canonical regular expressions differs in three areas: character class syntax, self-referencing capturing groups, and octal versus backreference interpretation. -* Character class syntax. Because canonical regular expressions support Unicode whereas ECMAScript does not, character classes in ECMAScript have a more limited syntax, and some character class language elements have a different meaning. For example, ECMAScript does not support language elements such as the Unicode category or block elements *\p* and **\P**.
Similarly, the **\w** element, which matches a word character, is equivalent to the **[a-zA-Z_0-9]** character class when using ECMAScript and **[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}]** when using canonical behavior. For more information, see [Character Classes in Regular Expressions](classes.md). +* Character class syntax. Because canonical regular expressions support Unicode whereas ECMAScript does not, character classes in ECMAScript have a more limited syntax, and some character class language elements have a different meaning. For example, ECMAScript does not support language elements such as the Unicode category or block elements *\p* and **\P**. Similarly, the **\w** element, which matches a word character, is equivalent to the **[a-zA-Z_0-9]** character class when using ECMAScript and **[\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}\p{Lm}]** when using canonical behavior. For more information, see [Character classes in regular expressions](classes.md). The following example illustrates the difference between canonical and ECMAScript pattern matching. It defines a regular expression, `\b(\w+\s*)+`, that matches words followed by white-space characters. The input consists of two strings, one that uses the Latin character set and the other that uses the Cyrillic character set. As the output shows, the call to the [Regex.IsMatch(String, String, RegexOptions)](xref:System.Text.RegularExpressions.Regex.IsMatch(System.String,System.String,System.Text.RegularExpressions.RegexOptions)) method that uses ECMAScript matching fails to match the Cyrillic words, whereas the method call that uses canonical matching does match these words. @@ -1558,7 +1559,7 @@ Regular expression | Canonical behavior | ECMAScript behavior **\** followed by a digit from 1 to 9, followed by no additional decimal digits | Interpret as a backreference. For example, \9 always means backreference 9, even if a ninth capturing group does not exist. 
If the capturing group does not exist, the regular expression parser throws an [ArgumentException](xref:System.ArgumentException). | If a single decimal digit capturing group exists, backreference to that digit. Otherwise, interpret the value as a literal. **\** followed by a digit from 1 to 9, followed by additional decimal digits | Interpret the digits as a decimal value. If that capturing group exists, interpret the expression as a backreference. Otherwise, interpret the leading octal digits up to octal 377; that is, consider only the low 8 bits of the value. Interpret the remaining digits as literals. For example, in the expression `\3000`, if capturing group 300 exists, interpret as backreference 300; if capturing group 300 does not exist, interpret as octal 300 followed by 0. | Interpret as a backreference by converting as many digits as possible to a decimal value that can refer to a capture. If no digits can be converted, interpret as an octal by using the leading octal digits up to octal 377; interpret the remaining digits as literals. -## Comparison Using the Invariant Culture +## Comparison using the invariant culture By default, when the regular expression engine performs case-insensitive comparisons, it uses the casing conventions of the current culture to determine equivalent uppercase and lowercase characters. @@ -1655,7 +1656,7 @@ Thread.CurrentThread.CurrentCulture = defaultCulture ' URLs that access files are not allowed. 
``` -## See Also +## See also [Regular expression language - quick reference](quick-ref.md) diff --git a/docs/standard/base-types/parsing-numeric.md b/docs/standard/base-types/parsing-numeric.md index fbc03d4024e45..9cefa756f31ff 100644 --- a/docs/standard/base-types/parsing-numeric.md +++ b/docs/standard/base-types/parsing-numeric.md @@ -3,6 +3,7 @@ title: Parsing numeric strings in .NET description: Parsing numeric strings in .NET keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -16,13 +17,13 @@ ms.assetid: e393430a-731a-49fa-83de-ff7ed52d5704 All numeric types have two static parsing methods, `Parse` and `TryParse`, that you can use to convert the string representation of a number into a numeric type. These methods enable you to parse strings that were produced by using the format strings documented in [Standard Numeric Format Strings](standard-numeric.md) and [Custom Numeric Format Strings](custom-numeric.md). By default, the `Parse` and `TryParse` methods can successfully convert strings that contain integral decimal digits only to integer values. They can successfully convert strings that contain integral and fractional decimal digits, group separators, and a decimal separator to floating-point values. The `Parse` method throws an exception if the operation fails, whereas the `TryParse` method returns `false`. -## Parsing and Format Providers +## Parsing and format providers Typically, the string representations of numeric values differ by culture. Elements of numeric strings such as currency symbols, group (or thousands) separators, and decimal separators all vary by culture. Parsing methods either implicitly or explicitly use a format provider that recognizes these culture-specific variations. 
If no format provider is specified in a call to the `Parse` or `TryParse` method, the format provider associated with the current thread culture (the [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object returned by the [NumberFormatInfo.CurrentInfo](xref:System.Globalization.NumberFormatInfo.CurrentInfo) property) is used. A format provider is represented by an [IFormatProvider](xref:System.IFormatProvider) implementation. This interface has a single member, the [GetFormat](xref:System.IFormatProvider.GetFormat(System.Type)) method, whose single parameter is a [Type](xref:System.Type) object that represents the type to be formatted. This method returns the object that provides formatting information. .NET supports the following two [IFormatProvider](xref:System.IFormatProvider) implementations for parsing numeric strings: -* A [CultureInfo](xref:System.Globalization.CultureInfo) object whose [CultureInfo.GetFormat](xref:System.Globalization.CultureInfo#System.Globalization.CultureInfo.GetFormat(System.Type)) method returns a [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object that provides culture-specific formatting information. +* A [CultureInfo](xref:System.Globalization.CultureInfo) object whose [CultureInfo.GetFormat](xref:System.Globalization.CultureInfo.GetFormat(System.Type)) method returns a [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object that provides culture-specific formatting information. * A [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object whose [NumberFormatInfo.GetFormat](xref:System.Globalization.NumberFormatInfo.GetFormat(System.Type)) method returns itself. @@ -135,7 +136,7 @@ End Module ' fr-FR: Unable to parse 'Ae9f'.
``` -## Parsing and NumberStyles Values +## Parsing and NumberStyles values The style elements (such as white space, group separators, and decimal separator) that the parse operation can handle are defined by a [NumberStyles](xref:System.Globalization.NumberStyles) enumeration value. By default, strings that represent integer values are parsed by using the [NumberStyles.Integer](xref:System.Globalization.NumberStyles.Integer) value, which permits only numeric digits, leading and trailing white space, and a leading sign. Strings that represent floating-point values are parsed using a combination of the [NumberStyles.Float](xref:System.Globalization.NumberStyles.Float) and [NumberStyles.AllowThousands](xref:System.Globalization.NumberStyles.AllowThousands) values; this composite style permits decimal digits along with leading and trailing white space, a leading sign, a decimal separator, a group separator, and an exponent. By calling an overload of the `Parse` or `TryParse` method that includes a parameter of type [NumberStyles](xref:System.Globalization.NumberStyles) and setting one or more [NumberStyles](xref:System.Globalization.NumberStyles) flags, you can control the style elements that can be present in the string for the parse operation to succeed. @@ -227,7 +228,7 @@ Composite NumberStyles value | Includes members [NumberStyles.Any](xref:System.Globalization.NumberStyles.Any) | Includes all styles except [NumberStyles.AllowHexSpecifier](xref:System.Globalization.NumberStyles.AllowHexSpecifier). [NumberStyles.HexNumber](xref:System.Globalization.NumberStyles.HexNumber) | Includes the [NumberStyles.AllowLeadingWhite](xref:System.Globalization.NumberStyles.AllowLeadingWhite), [NumberStyles.AllowTrailingWhite](xref:System.Globalization.NumberStyles.AllowTrailingWhite), and [NumberStyles.AllowHexSpecifier](xref:System.Globalization.NumberStyles.AllowHexSpecifier) styles. 
-## Parsing and Unicode Digits +## Parsing and Unicode digits The Unicode standard defines code points for digits in various writing systems. For example, code points from U+0030 to U+0039 represent the basic Latin digits 0 through 9, code points from U+09E6 to U+09EF represent the Bangla digits 0 through 9, and code points from U+FF10 to U+FF19 represent the Fullwidth digits 0 through 9. However, the only numeric digits recognized by parsing methods are the basic Latin digits 0-9 with code points from U+0030 to U+0039. If a numeric parsing method is passed a string that contains any other digits, the method throws a [FormatException](xref:System.FormatException). @@ -313,7 +314,7 @@ End Module ' Unable to parse '১২৩৪৫'. ``` -## See Also +## See also [System.Globalization.NumberStyles](xref:System.Globalization.NumberStyles) diff --git a/docs/standard/base-types/quantifiers.md b/docs/standard/base-types/quantifiers.md index 671ab7adc90a4..2fa3d5e2c24aa 100644 --- a/docs/standard/base-types/quantifiers.md +++ b/docs/standard/base-types/quantifiers.md @@ -3,6 +3,7 @@ title: Quantifiers in regular expressions description: Quantifiers in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -18,28 +19,28 @@ Quantifiers specify how many instances of a character, group, or character class Greedy quantifier | Lazy quantifier | Description ----------------- | --------------- | ----------- -__*+__ | __*?__ | Match zero or more times. +**\*** | **\*?** | Match zero or more times. **+** | **+?** | Match one or more times. **?** | **??** | Match zero or one time. **{**_n_**}** | **{**_n_**}?** | Match exactly n times. **{**_n_**,}** | **{**_n_**,}?** | Match at least n times. **{**_n_**,**_m_**}** | **{**_n_**,**_m_**}?** | Match from n to m times. -The quantities *n* and *m* are integer constants. 
Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the `?` character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section [Greedy and Lazy Quantifiers](#Greedy-and-Lazy-Quantifiers) later in this topic. +The quantities *n* and *m* are integer constants. Ordinarily, quantifiers are greedy; they cause the regular expression engine to match as many occurrences of particular patterns as possible. Appending the `?` character to a quantifier makes it lazy; it causes the regular expression engine to match as few occurrences as possible. For a complete description of the difference between greedy and lazy quantifiers, see the section [Greedy and lazy quantifiers](#greedy-and-lazy-quantifiers) later in this topic. > [!IMPORTANT] -> Nesting quantifiers (for example, as the regular expression pattern `(a*)*` does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see [Backtracking in Regular Expressions](backtracking.md). +> Nesting quantifiers (for example, as the regular expression pattern `(a*)*` does) can increase the number of comparisons that the regular expression engine must perform, as an exponential function of the number of characters in the input string. For more information about this behavior and its workarounds, see [Backtracking in regular expressions](backtracking.md). -## Regular Expression Quantifiers +## Regular expression quantifiers The following sections list the quantifiers supported by .NET regular expressions. 
> [!NOTE] > If the \*, +, ?, {, and } characters are encountered in a regular expression pattern, the regular expression engine interprets them as quantifiers or part of quantifier constructs unless they are included in a [character class](classes.md). To interpret these as literal characters outside a character class, you must escape them by preceding them with a backslash. For example, the string `\*` in a regular expression pattern is interpreted as a literal asterisk ("*") character. -### Match Zero or More Times: * +### Match zero or more times: \* -The \* quantifier matches the preceding element zero or more times. It is equivalent to the **{0,}** quantifier. __*__ is a greedy quantifier whose lazy equivalent is __*?__. +The \* quantifier matches the preceding element zero or more times. It is equivalent to the **{0,}** quantifier. **\*** is a greedy quantifier whose lazy equivalent is **\*?**. The following example illustrates this regular expression. Of the nine digits in the input string, five match the pattern and four (`95`, `929`, `9129`, and `9919`) do not. @@ -80,7 +81,7 @@ Pattern | Description `9*` | Match zero or more "9" characters. `\b` | End at a word boundary. -### Match One or More Times: + +### Match one or more times: + The **+** quantifier matches the preceding element one or more times. It is equivalent to **{1,}**. **+** is a greedy quantifier whose lazy equivalent is **+?**. @@ -123,7 +124,7 @@ Pattern | Description `\w*?` | Match a word character zero or more times, but as few times as possible. `\b` | End at a word boundary. -### Match Zero or One Time: ? +### Match zero or one time: ? The **?** quantifier matches the preceding element zero or one time. It is equivalent to **{0,1}**. **?** is a greedy quantifier whose lazy equivalent is **??**. @@ -161,7 +162,7 @@ Pattern | Description `an?` | Match an "a" followed by zero or one "n" character. `\b` | End at a word boundary. 
-### Match Exactly n Times: {n} +### Match exactly n times: {n} The **{**_n_**}** quantifier matches the preceding element exactly *n* times, where *n* is any integer. **{**_n_**}** is a greedy quantifier whose lazy equivalent is **{**_n_**}?**. @@ -203,7 +204,7 @@ Pattern | Description `\d{3}` | Match three decimal digits. `\b` | End at a word boundary. -### Match at Least n Times: {n,} +### Match at least n times: {n,} The **{**_n_**,}** quantifier matches the preceding element at least *n* times, where *n* is any integer. **{**_n_**,}** is a greedy quantifier whose lazy equivalent is **{**_n_**,}?**. @@ -241,7 +242,7 @@ Pattern | Description `\b` | Match a word boundary. `\D+` | Match at least one non-decimal digit. -### Match Between n and m Times: {n,m} +### Match between n and m times: {n,m} The **{**_n_**,**_m_**}** quantifier matches the preceding element at least *n* times, but no more than *m* times, where *n* and *m* are integers. **{**_n_**,**_m_**}** is a greedy quantifier whose lazy equivalent is **{**_n_**,**_m_**}?**. @@ -271,9 +272,9 @@ Next ' '00 00 00 00 ' found at position 35. ``` -### Match Zero or More Times (Lazy Match): *? +### Match zero or more times (lazy match): \*? -The __*?__ quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier __*__. +The **\*?** quantifier matches the preceding element zero or more times, but as few times as possible. It is the lazy counterpart of the greedy quantifier **\***. In the following example, the regular expression `\b\w*?oo\w*?\b` matches all words that contain the string `oo`. @@ -315,7 +316,7 @@ Pattern | Description `\w*?` | Match zero or more word characters, but as few characters as possible. `\b` | End on a word boundary. -### Match One or More Times (Lazy Match): +? +### Match one or more times (lazy match): +? The **+?** quantifier matches the preceding element one or more times, but as few times as possible.
It is the lazy counterpart of the greedy quantifier **+**. @@ -351,7 +352,7 @@ Next ' 'Ff' found at position 15. ``` -### Match Zero or One Time (Lazy Match): ?? +### Match zero or one time (lazy match): ?? The **??** quantifier matches the preceding element zero or one time, but as few times as possible. It is the lazy counterpart of the greedy quantifier **?**. @@ -406,7 +407,7 @@ Pattern | Description `(Line)??` | Match zero or one occurrence of the string "Line". `\(??` | Match zero or one occurrence of the opening parenthesis. -### Match Exactly n Times (Lazy Match): {n}? +### Match exactly n times (lazy match): {n}? The **{**_n_**}?** quantifier matches the preceding element exactly *n* times, where *n* is any integer. It is the lazy counterpart of the greedy quantifier **{**_n_**}**. @@ -443,13 +444,13 @@ Pattern | Description `(\w{3,}?\.){2}?` | Match the pattern in the first group two times, but as few times as possible. `\b` | End the match on a word boundary. -### Match at Least n Times (Lazy Match): {n,}? +### Match at least n times (lazy match): {n,}? The **{**_n_**,}?** quantifier matches the preceding element at least *n* times, where *n* is any integer, but as few times as possible. It is the lazy counterpart of the greedy quantifier **{**_n_**,}**. See the example for the **{**_n_**}?** quantifier in the previous section for an illustration. The regular expression in that example uses the **{**_n_**,}** quantifier to match a string that has at least three characters followed by a period. -### Match Between n and m Times (Lazy Match): {n,m}? +### Match between n and m times (lazy match): {n,m}? The **{**_n_**,**_m_**}?** quantifier matches the preceding element between *n* and *m* times, where *n* and *m* are integers, but as few times as possible. It is the lazy counterpart of the greedy quantifier **{**_n_**,**_m_**}**.
@@ -495,7 +496,7 @@ Pattern | Description `{1,10}?` | Match the previous pattern between 1 and 10 times, but as few times as possible. `[.!?]` | Match any one of the punctuation characters ".", "!", or "?". -## Greedy and Lazy Quantifiers +## Greedy and lazy quantifiers A number of the quantifiers have two versions: @@ -508,7 +509,7 @@ A number of the quantifiers have two versions: A non-greedy quantifier tries to match an element as few times as possible. You can turn a greedy quantifier into a lazy quantifier by simply adding a **?**. -Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the __*__ greedy quantifier is `\b.*([0-9]{4})\b`. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows. +Consider a simple regular expression that is intended to extract the last four digits from a string of numbers such as a credit card number. The version of the regular expression that uses the **\*** greedy quantifier is `\b.*([0-9]{4})\b`. However, if a string contains two numbers, this regular expression matches the last four digits of the second number only, as the following example shows. ```csharp string greedyPattern = @"\b.*([0-9]{4})\b"; @@ -530,9 +531,9 @@ Next ' Account ending in ******1999. ``` -The regular expression fails to match the first number because the __*__ quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string. +The regular expression fails to match the first number because the **\*** quantifier tries to match the previous element as many times as possible in the entire string, and so it finds its match at the end of the string. -This is not the desired behavior. 
Instead, you can use the __*?__ lazy quantifier to extract digits from both numbers, as the following example shows. +This is not the desired behavior. Instead, you can use the **\*?** lazy quantifier to extract digits from both numbers, as the following example shows. ```csharp string lazyPattern = @"\b.*?([0-9]{4})\b"; @@ -558,11 +559,11 @@ Next In most cases, regular expressions with greedy and lazy quantifiers return the same matches. They most commonly return different results when they are used with the wildcard (**.**) metacharacter, which matches any character. -## Quantifiers and Empty Matches +## Quantifiers and empty matches -The quantifiers __*__, **+**, and **{**_n_**,**_m_**}** and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite. +The quantifiers **\***, **+**, and **{**_n_**,**_m_**}** and their lazy counterparts never repeat after an empty match when the minimum number of captures has been found. This rule prevents quantifiers from entering infinite loops on empty subexpression matches when the maximum number of possible group captures is infinite or near infinite. -For example, the following code shows the result of a call to the [Regex.Match](https://docs.microsoft.com/dotnet/core/api/System.Text.RegularExpressions.Regex.Match(System.String)) method with the regular expression pattern `(a?)*,` which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String.Empty), but that there is no second empty match, because the first empty match causes the quantifier to stop repeating. 
+For example, the following code shows the result of a call to the [Regex.Match](xref:System.Text.RegularExpressions.Regex.Match(System.String)) method with the regular expression pattern `(a?)*,` which matches zero or one "a" character zero or more times. Note that the single capturing group captures each "a" as well as [String.Empty](xref:System.String.Empty), but that there is no second empty match, because the first empty match causes the quantifier to stop repeating. ```csharp using System; @@ -645,9 +646,9 @@ Pattern | Description ------- | ----------- `(a\1` | Either match "a" along with the value of the first captured group … `|(?(1)` | … or test whether the first captured group has been defined. (Note that the **(?(1)** construct does not define a capturing group.) -`\1))` | If the first captured group exists, match its value. If the group does not exist, the group will match [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String.Empty). +`\1))` | If the first captured group exists, match its value. If the group does not exist, the group will match [String.Empty](xref:System.String.Empty). -The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. Because the first pattern reaches its minimum number of captures with its first capture of [String.Empty](https://docs.microsoft.com/dotnet/core/api/System.String.Empty), it never repeats to try to match `a\1;` the `{0,2}` quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates `a\1` a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match. +The first regular expression tries to match this pattern between zero and two times; the second, exactly two times. 
Because the first pattern reaches its minimum number of captures with its first capture of [String.Empty](xref:System.String.Empty), it never repeats to try to match `a\1;` the `{0,2}` quantifier allows only empty matches in the last iteration. In contrast, the second regular expression does match "a" because it evaluates `a\1` a second time; the minimum number of iterations, 2, forces the engine to repeat after an empty match. ```csharp using System; @@ -778,7 +779,7 @@ End Module ' Capture: 2: 'a' at position 0. ``` -## See Also +## See also [Regular expression language - quick reference](quick-ref.md) diff --git a/docs/standard/base-types/quick-ref.md b/docs/standard/base-types/quick-ref.md index b64e8b044cbf3..75b17c40a7705 100644 --- a/docs/standard/base-types/quick-ref.md +++ b/docs/standard/base-types/quick-ref.md @@ -3,6 +3,7 @@ title: Regular expression language - quick reference description: Regular expression language - quick reference keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -14,29 +15,29 @@ ms.assetid: 8c5dee8c-7bc7-4e6e-aff1-986965c4d98e # Regular expression language - quick reference -A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs. For a brief introduction, see [Regular Expressions in .NET](regular-expressions.md). +A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs. For a brief introduction, see [Regular expressions in .NET](regular-expressions.md). 
Each section in this quick reference lists a particular category of characters, operators, and constructs that you can use to define regular expressions: -* [Character escapes](#Character-escapes) - -* [Character classes](#Character-classes) +* [Character escapes](#character-escapes) + +* [Character classes](#character-classes) -* [Anchors](#Anchors) +* [Anchors](#anchors) -* [Grouping constructs](#Grouping-constructs) +* [Grouping constructs](#grouping-constructs) -* [Quantifiers](#Quantifiers) +* [Quantifiers](#quantifiers) -* [Backreference constructs](#Backreference-constructs) +* [Backreference constructs](#backreference-constructs) -* [Alternation constructs](#Alternation-constructs) +* [Alternation constructs](#alternation-constructs) -* [Substitutions](#Substitutions) +* [Substitutions](#substitutions) -* [Regular expression options](#Regular-expression-options) +* [Regular expression options](#regular-expression-options) -* [Miscellaneous constructs](#Miscellaneous-constructs) +* [Miscellaneous constructs](#miscellaneous-constructs) We’ve also provided this information in two formats that you can download and print for easy reference: @@ -46,7 +47,7 @@ We’ve also provided this information in two formats that you can download and ## Character Escapes -The backslash character (\) in a regular expression indicates that the character that follows it either is a special character (as shown in the following table), or should be interpreted literally. For more information, see [Character Escapes in Regular Expressions](escapes.md). +The backslash character (\) in a regular expression indicates that the character that follows it either is a special character (as shown in the following table), or should be interpreted literally. For more information, see [Character escapes in regular expressions](escapes.md). 
Escaped character | Description | Pattern | Matches ----------------- | ----------- | ------- | ------- @@ -62,11 +63,11 @@ Escaped character | Description | Pattern | Matches **\x**_nn_ | Uses hexadecimal representation to specify a character (*nn* consists of exactly two digits). | `\w\x20\w` | "a b", "c d" in "a bc d" **\c**_X_ or **\c**_x_ | Matches the ASCII control character that is specified by *X* or *x*, where *X* or *x* is the letter of the control character. | `\cC` | "\x0003" in "\x0003" (Ctrl-C) **\u**_nnnn_ | Matches a Unicode character by using hexadecimal representation (exactly four digits, as represented by *nnnn*). | `\w\u0020\w` | "a b", "c d" in "a bc d" -**\** | When followed by a character that is not recognized as an escaped character in this and other tables in this topic, matches that character. For example, __\*__ is the same as **\x2A**, and **\.** is the same as **\x2E**. This allows the regular expression engine to disambiguate language elements (such as `*` or `?`) and character literals (represented by `\*` or `\?)`. | `\d+[\+-x\*]\d+` | "2+2" and "3*9" in "(2+2) * 3*9" +**\\** | When followed by a character that is not recognized as an escaped character in this and other tables in this topic, matches that character. For example, __\*__ is the same as **\x2A**, and **\.** is the same as **\x2E**. This allows the regular expression engine to disambiguate language elements (such as `*` or `?`) and character literals (represented by `\*` or `\?)`. | `\d+[\+-x\*]\d+` | "2+2" and "3*9" in "(2+2) * 3*9" ## Character Classes -A character class matches any one of a set of characters. Character classes include the language elements listed in the following table. For more information, see [Character Classes in Regular Expressions](classes.md). +A character class matches any one of a set of characters. Character classes include the language elements listed in the following table. 
For more information, see [Character classes in regular expressions](classes.md). Character class | Description | Pattern | Matches --------------- | ----------- | ------- | ------- @@ -85,7 +86,7 @@ __\P{__*name*__}__ | Matches any single character that is not in the Unicode gen ## Anchors -Anchors, or atomic zero-width assertions, cause a match to succeed or fail depending on the current position in the string, but they do not cause the engine to advance through the string or consume characters. The metacharacters listed in the following table are anchors. For more information, see [Anchors in Regular Expressions](anchors.md). +Anchors, or atomic zero-width assertions, cause a match to succeed or fail depending on the current position in the string, but they do not cause the engine to advance through the string or consume characters. The metacharacters listed in the following table are anchors. For more information, see [Anchors in regular expressions](anchors.md). Assertion | Description | Pattern | Matches --------- | ----------- | ------- | ------- @@ -100,15 +101,15 @@ Assertion | Description | Pattern | Matches ## Grouping Constructs -Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string. Grouping constructs include the language elements listed in the following table. For more information, see [Grouping Constructs in Regular Expressions](grouping.md). +Grouping constructs delineate subexpressions of a regular expression and typically capture substrings of an input string. Grouping constructs include the language elements listed in the following table. For more information, see [Grouping constructs in regular expressions](grouping.md). Grouping construct | Description | Pattern | Matches ------------------ | ----------- | ------- | ------- **(**_subexpression_**)** | Captures the matched subexpression and assigns it a one-based ordinal number. 
| `(\w)\1` | "ee" in "deep" **(?<**_name_**>** _subexpression_**)** | Captures the matched subexpression into a named group. | `(?<double>\w)\k<double>` | "ee" in "deep" -**(?<**_name1-name2_**>** _subexpression_**)** | Defines a balancing group definition. For more information, see the "Balancing Group Definition" section in [Grouping Constructs in Regular Expressions](grouping.md). | `(((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$` | "((1-3)*(3-1))" in "3+2^((1-3)*(3-1))" +**(?<**_name1-name2_**>** _subexpression_**)** | Defines a balancing group definition. For more information, see the [Balancing Group Definitions](grouping.md#balancing-group-definitions) section in [Grouping constructs in regular expressions](grouping.md). | `(((?'Open'\()[^\(\)]*)+((?'Close-Open'\))[^\(\)]*)+)*(?(Open)(?!))$` | "((1-3)*(3-1))" in "3+2^((1-3)*(3-1))" **(?**: subexpression**)** | Defines a noncapturing group. | `Write(?:Line)?` | "WriteLine" in "Console.WriteLine()", "Write" in "Console.Write(value)" -**(?imnsx-imnsx**: _subexpression_**)** | Applies or disables the specified options within _subexpression_. For more information, see [Regular Expression Options](options.md). | `A\d{2}(?i:\w+)\b` | "A12xl", "A12XL" in "A12xl A12XL a12xl" +**(?imnsx-imnsx**: _subexpression_**)** | Applies or disables the specified options within _subexpression_. For more information, see [Regular expression options](options.md). | `A\d{2}(?i:\w+)\b` | "A12xl", "A12XL" in "A12xl A12XL a12xl" **(?**= _subexpression_**)** | Zero-width positive lookahead assertion. | `\w+(?=\.)` | "is", "ran", and "out" in "He is. The dog ran. The sun is out." **(?!** _subexpression_**)** | Zero-width negative lookahead assertion. | `\b(?!un)\w+\b` | "sure", "used" in "unsure sure unity used" **(?**<= _subexpression_**)** | Zero-width positive lookbehind assertion.
| `(?<=19)\d{2}\b` | "99", "50", "05" in "1851 1999 1950 1905 2003" @@ -117,7 +118,7 @@ Grouping construct | Description | Pattern | Matches ## Quantifiers -A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur. Quantifiers include the language elements listed in the following table. For more information, see [Quantifiers in Regular Expressions](quantifiers.md). +A quantifier specifies how many instances of the previous element (which can be a character, a group, or a character class) must be present in the input string for a match to occur. Quantifiers include the language elements listed in the following table. For more information, see [Quantifiers in regular expressions](quantifiers.md). Quantifier | Description | Pattern | Matches ---------- | ----------- | ------- | ------- @@ -136,7 +137,7 @@ __*?__ | Matches the previous element zero or more times, but as few times as po ## Backreference Constructs -A backreference allows a previously matched subexpression to be identified subsequently in the same regular expression. The following table lists the backreference constructs supported by regular expressions in the .NET Framework. For more information, see [Backreference Constructs in Regular Expressions](backreference.md). +A backreference allows a previously matched subexpression to be identified subsequently in the same regular expression. The following table lists the backreference constructs supported by regular expressions in the .NET Framework. For more information, see [Backreference constructs in regular expressions](backreference.md). 
Backreference construct | Description | Pattern | Matches ----------------------- | ----------- | ------- | ------- @@ -145,7 +146,7 @@ Backreference construct | Description | Pattern | Matches ## Alternation Constructs -Alternation constructs modify a regular expression to enable either/or matching. These constructs include the language elements listed in the following table. For more information, see [Alternation Constructs in Regular Expressions](alternation.md). +Alternation constructs modify a regular expression to enable either/or matching. These constructs include the language elements listed in the following table. For more information, see [Alternation constructs in regular expressions](alternation.md). Alternation construct | Description | Pattern | Matches --------------------- | ----------- | ------- | ------- @@ -155,7 +156,7 @@ __(?(__*expression*__)__*yes*__|__*no*__)__ | Matches *yes* if the regular ## Substitutions -Substitutions are regular expression language elements that are supported in replacement patterns. For more information, see [Substitutions in Regular Expressions](substitutions.md). The metacharacters listed in the following table are atomic zero-width assertions. +Substitutions are regular expression language elements that are supported in replacement patterns. For more information, see [Substitutions in regular expressions](substitutions.md). The metacharacters listed in the following table are atomic zero-width assertions. Character | Description | Pattern | Replacement pattern | Input string | Result string --------- | ----------- | ------- | ------------------- | ------------ | ------------- @@ -170,7 +171,7 @@ Character | Description | Pattern | Replacement pattern | Input string | Result ## Regular Expression Options -You can specify options that control how the regular expression engine interprets a regular expression pattern. 
Many of these options can be specified either inline (in the regular expression pattern) or as one or more `RegexOptions` constants. This quick reference lists only inline options. For more information about inline and `RegexOptions` options, see the article [Regular Expression Options](options.md). +You can specify options that control how the regular expression engine interprets a regular expression pattern. Many of these options can be specified either inline (in the regular expression pattern) or as one or more `RegexOptions` constants. This quick reference lists only inline options. For more information about inline and `RegexOptions` options, see the article [Regular expression options](options.md). You can specify an inline option in two ways: @@ -183,9 +184,9 @@ The .NET regular expression engine supports the following inline options. Option | Description | Pattern | Matches ------ | ----------- | ------- | ------- **i** | Use case-insensitive matching. | **\b(?i)a(?-i)a\w+\b** | "aardvark", "aaaAuto" in "aardvark AAAuto aaaAuto Adam breakfast" -**m** | Use multiline mode. **^** and **$** match the beginning and end of a line, instead of the beginning and end of a string. | For an example, see the "Multiline Mode" section in [Regular Expression Options](options.md). | -**n*** | Do not capture unnamed groups. | For an example, see the "Explicit Captures Only" section in [Regular Expression Options](options.md). | -**s** | Use single-line mode. | For an example, see the "Single-line Mode" section in [Regular Expression Options](options.md). | +**m** | Use multiline mode. **^** and **$** match the beginning and end of a line, instead of the beginning and end of a string. | For an example, see the "Multiline Mode" section in [Regular expression options](options.md). | +**n*** | Do not capture unnamed groups. | For an example, see the "Explicit Captures Only" section in [Regular expression options](options.md). | +**s** | Use single-line mode. 
| For an example, see the "Single-line Mode" section in [Regular expression options](options.md). | **x** | Ignore unescaped white space in the regular expression pattern. | **\b(?x) \d+ \s \w+** | "1 aardvark", "2 cats" in "1 aardvark 2 cats IV centurions" ##Miscellaneous Constructs @@ -194,7 +195,7 @@ Miscellaneous constructs either modify a regular expression pattern or provide i Construct | Definition | Example --------- | ---------- | ------- -**(?imnsx-imnsx)** | Sets or disables options such as case insensitivity in the middle of a pattern. For more information, see [Regular Expression Options](options.md). | `\bA(?i)b\w+\b` matches "ABA", "Able" in "ABA Able Act" +**(?imnsx-imnsx)** | Sets or disables options such as case insensitivity in the middle of a pattern. For more information, see [Regular expression options](options.md). | `\bA(?i)b\w+\b` matches "ABA", "Able" in "ABA Able Act" **(?#** _comment_**)** | Inline comment. The comment ends at the first closing parenthesis. | `\bA(?#` matches words starting with `A)\w+\b` **#** [to end of line] | X-mode comment. The comment starts at an unescaped # and continues to the end of the line. | `(?x)\bA\w+\b#` matches words starting with `A` diff --git a/docs/standard/base-types/regex-behavior.md b/docs/standard/base-types/regex-behavior.md index f2c34834474a4..6476990dc48bf 100644 --- a/docs/standard/base-types/regex-behavior.md +++ b/docs/standard/base-types/regex-behavior.md @@ -3,6 +3,7 @@ title: Details of regular expression behavior description: Details of regular expression behavior keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/28/2016 ms.topic: article @@ -117,7 +118,7 @@ Pattern | Description `(\d+)` | Match at least one numeric character, and assign it to the first capturing group. `\.` | Match a period. -For more information about lazy quantifiers, see [Quantifiers in Regular Expressions](quantifiers.md). 
+For more information about lazy quantifiers, see [Quantifiers in regular expressions](quantifiers.md). ### Positive lookahead @@ -170,7 +171,7 @@ Pattern | Description `\b` | End the match at a word boundary. `(?=\P{P})` | Look ahead to determine whether the next character is a punctuation symbol. If it is not, the match succeeds. -For more information about positive lookahead assertions, see [Grouping Constructs in Regular Expressions](grouping.md). +For more information about positive lookahead assertions, see [Grouping constructs in regular expressions](grouping.md). ### Negative lookahead @@ -225,7 +226,7 @@ Pattern | Description `(\w+)` | Match one or more word characters. `\b` | End the match at a word boundary. -For more information about negative lookahead assertions, see [Grouping Constructs in Regular Expressions](grouping.md). +For more information about negative lookahead assertions, see [Grouping constructs in regular expressions](grouping.md). ### Conditional evaluation @@ -319,11 +320,11 @@ Pattern | Description `|((\w+\p{P}?\s)+))` | If the `Pvt` capturing group does not exist, match one or more occurrences of one or more word characters followed by zero or one punctuation separator followed by a white-space character. Assign the substring to the third capturing group. `\r?$` | Match the end of a line or the end of the string. -For more information about conditional evaluation, see [Alternation Constructs in Regular Expressions](alternation.md). +For more information about conditional evaluation, see [Alternation constructs in regular expressions](alternation.md). ### Balancing group definitions -Balancing group definitions: **(?<**_name1-name2_**>** _subexpression_**)**. This feature allows the regular expression engine to keep track of nested constructs such as parentheses or opening and closing brackets. For an example, see [Grouping Constructs in Regular Expressions](grouping.md). 
+Balancing group definitions: **(?<**_name1-name2_**>** _subexpression_**)**. This feature allows the regular expression engine to keep track of nested constructs such as parentheses or opening and closing brackets. For an example, see [Grouping constructs in regular expressions](grouping.md). ### Nonbacktracking subexpressions @@ -473,7 +474,7 @@ End Module ' Group 1: aaaaa ``` -For more information about nonbacktracking subexpressions, see [Grouping Constructs in Regular Expressions](grouping.md). +For more information about nonbacktracking subexpressions, see [Grouping constructs in regular expressions](grouping.md). ### Right-to-left matching @@ -546,7 +547,7 @@ End Module ' Number at end of sentence (right-to-left): 107325 ``` -For more information about right-to-left matching, see [Regular Expression Options](options.md). +For more information about right-to-left matching, see [Regular expression options](options.md). ### Positive and negative lookbehind @@ -614,7 +615,7 @@ Pattern | Description `(?<=[A-Z0-9])` | Look behind to the previous character, which must be numeric or alphanumeric. (The comparison is case-insensitive.) `$` | End the match at the end of the string. -For more information about positive and negative lookbehind, see [Grouping Constructs in Regular Expressions](grouping.md). +For more information about positive and negative lookbehind, see [Grouping constructs in regular expressions](grouping.md). 
## Related Topics diff --git a/docs/standard/base-types/standard-datetime.md b/docs/standard/base-types/standard-datetime.md index 044bcf30ede8a..bf300c93d5b79 100644 --- a/docs/standard/base-types/standard-datetime.md +++ b/docs/standard/base-types/standard-datetime.md @@ -3,6 +3,7 @@ title: Standard date and time format strings description: Standard date and time format strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/25/2016 ms.topic: article @@ -14,7 +15,7 @@ ms.assetid: be239871-10cc-4949-b548-200bb260630a # Standard date and time format strings -A standard date and time format string uses a single format specifier to define the text representation of a date and time value. Any date and time format string that contains more than one character, including white space, is interpreted as a custom date and time format string; for more information, see [Custom Date and Time Format Strings](custom-datetime.md). A standard or custom format string can be used in two ways: +A standard date and time format string uses a single format specifier to define the text representation of a date and time value. Any date and time format string that contains more than one character, including white space, is interpreted as a custom date and time format string; for more information, see [Custom date and time format strings](custom-datetime.md). A standard or custom format string can be used in two ways: * To define the string that results from a formatting operation. @@ -22,7 +23,7 @@ A standard date and time format string uses a single format specifier to define Standard date and time format strings can be used with both [DateTime](xref:System.DateTime) and [DateTimeOffset](xref:System.DateTimeOffset) values. -The following table describes the standard date and time format specifiers. 
Unless otherwise noted, a particular standard date and time format specifier produces an identical string representation regardless of whether it is used with a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value. See the [Notes](#Notes) section for additional information about using standard date and time format strings. +The following table describes the standard date and time format specifiers. Unless otherwise noted, a particular standard date and time format specifier produces an identical string representation regardless of whether it is used with a [DateTime](xref:System.DateTime) or a [DateTimeOffset](xref:System.DateTimeOffset) value. See the [Notes](#notes) section for additional information about using standard date and time format strings. Format specifier | Description | Examples ---------------- | ----------- | -------- @@ -43,7 +44,7 @@ Format specifier | Description | Examples "Y", "y" | Year month pattern. | `2009-06-15T13:45:30 -> June, 2009 (en-US)`; `2009-06-15T13:45:30 -> juni 2009 (da-DK)`; `2009-06-15T13:45:30 -> Juni 2009 (id-ID)` Any other single character | Unknown specifier. | Throws a run-time [FormatException](xref:System.FormatException). -## How Standard Format Strings Work +## How standard format strings work In a formatting operation, a standard format string is simply an alias for a custom format string. The advantage of using an alias to refer to a custom format string is that, although the alias remains invariant, the custom format string itself can vary. This is important because the string representations of date and time values typically vary by culture. For example, the "d" standard format string indicates that a date and time value is to be displayed using a short date pattern. For the invariant culture, this pattern is "MM/dd/yyyy". For the fr-FR culture, it is "dd/MM/yyyy". For the ja-JP culture, it is "yyyy/MM/dd". 
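The aliasing described above can be sketched in a few lines of C#. This is only an illustration: the class name, helper names, and date value are assumptions made for the example, and it sticks to the invariant culture so that the result does not depend on installed culture data.

```csharp
using System;
using System.Globalization;

public static class ShortDateAliasSketch
{
    // Formats with the "d" standard specifier (the alias).
    public static string FormatWithAlias(DateTime value, CultureInfo culture) =>
        value.ToString("d", culture);

    // Formats with the custom pattern that the culture reports for short dates.
    public static string FormatWithPattern(DateTime value, CultureInfo culture) =>
        value.ToString(culture.DateTimeFormat.ShortDatePattern, culture);

    public static void Main()
    {
        DateTime date1 = new DateTime(2008, 4, 10);   // illustrative value
        CultureInfo culture = CultureInfo.InvariantCulture;

        Console.WriteLine(FormatWithAlias(date1, culture));   // 04/10/2008
        // The alias and the culture's own pattern produce the same string.
        Console.WriteLine(FormatWithAlias(date1, culture) == FormatWithPattern(date1, culture));   // True
    }
}
```

Passing a different `CultureInfo` to the same two helpers should keep them in agreement while changing the text they produce, which is the point of using the alias.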
@@ -106,7 +107,7 @@ Standard format string | Defined by DateTimeFormatInfo.InvariantInfo property | The following sections describe the standard format specifiers for [DateTime](xref:System.DateTime) and [DateTimeOffset](xref:System.DateTimeOffset) values. -## The Short Date ("d") Format Specifier +## The short date ("d") format specifier The "d" standard format specifier represents a custom date and time format string that is defined by a specific culture's [DateTimeFormatInfo.ShortDatePattern](xref:System.Globalization.DateTimeFormatInfo.ShortDatePattern) property. For example, the custom format string that is returned by the [ShortDatePattern](xref:System.Globalization.DateTimeFormatInfo.ShortDatePattern) property of the invariant culture is "MM/dd/yyyy". @@ -143,7 +144,7 @@ Console.WriteLine(date1.ToString("d", _ ' Displays 10.04.2008 ``` -## The Long Date ("D") Format Specifier +## The long date ("D") format specifier The "D" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.LongDatePattern](xref:System.Globalization.DateTimeFormatInfo.LongDatePattern) property. For example, the custom format string for the invariant culture is "dddd, dd MMMM yyyy". @@ -183,7 +184,7 @@ Console.WriteLine(date1.ToString("D", _ ' Displays jueves, 10 de abril de 2008 ``` -## The Full Date Short Time ("f") Format Specifier +## The full date short time ("f") format specifier The "f" standard format specifier represents a combination of the long date ("D") and short time ("t") patterns, separated by a space. 
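To make the "long date plus short time" composition concrete, here is a small hedged sketch. The `Combine` helper and the sample value are hypothetical and exist only for this illustration; the invariant culture is used so the output is predictable.

```csharp
using System;
using System.Globalization;

public static class FullDateShortTimeSketch
{
    // Builds the "f" result by hand: long date ("D"), a space, then short time ("t").
    public static string Combine(DateTime value, CultureInfo culture) =>
        value.ToString("D", culture) + " " + value.ToString("t", culture);

    public static void Main()
    {
        DateTime date1 = new DateTime(2008, 4, 10, 6, 30, 0);   // illustrative value
        CultureInfo culture = CultureInfo.InvariantCulture;

        Console.WriteLine(date1.ToString("f", culture));   // Thursday, 10 April 2008 06:30
        // The hand-built string matches what "f" produces.
        Console.WriteLine(Combine(date1, culture) == date1.ToString("f", culture));   // True
    }
}
```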
@@ -220,7 +221,7 @@ Console.WriteLine(date1.ToString("f", _ ' Displays jeudi 10 avril 2008 06:30 ``` -## The Full Date Long Time ("F") Format Specifier +## The full date long time ("F") format specifier The "F" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.FullDateTimePattern](xref:System.Globalization.DateTimeFormatInfo.FullDateTimePattern) property. For example, the custom format string for the invariant culture is "dddd, dd MMMM yyyy HH:mm:ss". @@ -256,7 +257,7 @@ Console.WriteLine(date1.ToString("F", _ ' Displays jeudi 10 avril 2008 06:30:00 ``` -## The General Date Short Time ("g") Format Specifier +## The general date short time ("g") format specifier The "g" standard format specifier represents a combination of the short date ("d") and short time ("t") patterns, separated by a space. @@ -297,7 +298,7 @@ Console.WriteLine(date1.ToString("g", _ ' Displays 10/04/2008 6:30 ``` -## The General Date Long Time ("G") Format Specifier +## The general date long time ("G") format specifier The "G" standard format specifier represents a combination of the short date ("d") and long time ("T") patterns, separated by a space. @@ -338,7 +339,7 @@ Console.WriteLine(date1.ToString("G", _ ' Displays 10/04/2008 6:30:00 ``` -## The Month ("M", "m") Format Specifier +## The month ("M", "m") format specifier The "M" or "m" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.MonthDayPattern](xref:System.Globalization.DateTimeFormatInfo.MonthDayPattern) property. For example, the custom format string for the invariant culture is "MMMM dd". 
@@ -371,7 +372,7 @@ Console.WriteLine(date1.ToString("m", _ ' Displays 10 April ``` -## The Round-trip ("O", "o") Format Specifier +## The round-trip ("O", "o") format specifier The "O" or "o" standard format specifier represents a custom date and time format string using a pattern that preserves time zone information and emits a result string that complies with ISO 8601. For [DateTime](xref:System.DateTime) values, this format specifier is designed to preserve date and time values along with the [DateTime.Kind](xref:System.DateTime.Kind) property in text. The formatted string can be parsed back by using the [DateTime.Parse(String, IFormatProvider, DateTimeStyles)](xref:System.DateTime.Parse(System.String,System.IFormatProvider,System.Globalization.DateTimeStyles)) or [DateTime.ParseExact](xref:System.DateTime.ParseExact(System.String,System.String,System.IFormatProvider,System.Globalization.DateTimeStyles)) method if the styles parameter is set to [DateTimeStyles.RoundtripKind](xref:System.Globalization.DateTimeStyles.RoundtripKind). @@ -520,7 +521,7 @@ Console.WriteLine("Round-tripped {0} to {1}.", originalDTO, newDTO) ' Round-tripped 4/12/2008 9:30:00 AM -08:00 to 4/12/2008 9:30:00 AM -08:00. ``` -## The RFC1123 ("R", "r") Format Specifier +## The RFC1123 ("R", "r") format specifier The "R" or "r" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.RFC1123Pattern](xref:System.Globalization.DateTimeFormatInfo.RFC1123Pattern) property. The pattern reflects a defined standard, and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "ddd, dd MMM yyyy HH':'mm':'ss 'GMT'". When this standard format specifier is used, the formatting or parsing operation always uses the invariant culture. 
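One way to see that the "r" specifier ignores the supplied format provider is to pass it a provider whose day names differ from the invariant ones. The sketch below does that with a cloned `DateTimeFormatInfo`; the French-style abbreviations and the `ToRfc1123` helper are made up for the example and are not part of the original article.

```csharp
using System;
using System.Globalization;

public static class Rfc1123Sketch
{
    // Hypothetical helper: formats a value with the "r" specifier and the given provider.
    public static string ToRfc1123(DateTime value, IFormatProvider provider) =>
        value.ToString("r", provider);

    public static void Main()
    {
        // "r" does not convert the value to UTC, so start from a UTC value.
        DateTime utc = new DateTime(2008, 4, 10, 13, 30, 0, DateTimeKind.Utc);

        // A made-up provider whose abbreviated day names differ from the invariant ones.
        DateTimeFormatInfo custom = (DateTimeFormatInfo)DateTimeFormatInfo.InvariantInfo.Clone();
        custom.AbbreviatedDayNames = new[] { "dim.", "lun.", "mar.", "mer.", "jeu.", "ven.", "sam." };

        Console.WriteLine(utc.ToString("ddd", custom));                    // jeu.
        Console.WriteLine(ToRfc1123(utc, custom));                         // Thu, 10 Apr 2008 13:30:00 GMT
        Console.WriteLine(ToRfc1123(utc, CultureInfo.InvariantCulture));   // Thu, 10 Apr 2008 13:30:00 GMT
    }
}
```

The custom "ddd" pattern picks up the provider's day names, while "r" keeps the invariant "Thu".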
@@ -555,7 +556,7 @@ Console.WriteLine(dateOffset.ToUniversalTime.ToString("r")) ' Displays Thu, 10 Apr 2008 13:30:00 GMT ``` -## The Sortable ("s") Format Specifier +## The sortable ("s") format specifier The "s" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.SortableDateTimePattern](xref:System.Globalization.DateTimeFormatInfo.SortableDateTimePattern) property. The pattern reflects a defined standard (ISO 8601), and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "yyyy'-'MM'-'dd'T'HH':'mm':'ss". @@ -577,7 +578,7 @@ Console.WriteLine(date1.ToString("s")) ' Displays 2008-04-10T06:30:00 ``` -## The Short Time ("t") Format Specifier +## The short time ("t") format specifier The "t" standard format specifier represents a custom date and time format string that is defined by the current [DateTimeFormatInfo.ShortTimePattern](xref:System.Globalization.DateTimeFormatInfo.ShortTimePattern) property. For example, the custom format string for the invariant culture is "HH:mm". @@ -611,7 +612,7 @@ Console.WriteLine(date1.ToString("t", _ ' Displays 6:30 ``` -## The Long Time ("T") Format Specifier +## The long time ("T") format specifier The "T" standard format specifier represents a custom date and time format string that is defined by a specific culture's [DateTimeFormatInfo.LongTimePattern](xref:System.Globalization.DateTimeFormatInfo.LongTimePattern) property. For example, the custom format string for the invariant culture is "HH:mm:ss". 
@@ -645,7 +646,7 @@ Console.WriteLine(date1.ToString("T", _ ' Displays 6:30:00 ``` -## The Universal Sortable ("u") Format Specifier +## The universal sortable ("u") format specifier The "u" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.UniversalSortableDateTimePattern](xref:System.Globalization.DateTimeFormatInfo.UniversalSortableDateTimePattern) property. The pattern reflects a defined standard, and the property is read-only. Therefore, it is always the same, regardless of the culture used or the format provider supplied. The custom format string is "yyyy'-'MM'-'dd HH':'mm':'ss'Z'". When this standard format specifier is used, the formatting or parsing operation always uses the invariant culture. @@ -665,7 +666,7 @@ Console.WriteLine(date1.ToUniversalTime.ToString("u")) ' Displays 2008-04-10 13:30:00Z ``` -## The Universal Full ("U") Format Specifier +## The universal full ("U") format specifier The "U" standard format specifier represents a custom date and time format string that is defined by a specified culture's [DateTimeFormatInfo.FullDateTimePattern](xref:System.Globalization.DateTimeFormatInfo.FullDateTimePattern) property. The pattern is the same as the "F" pattern. However, the [DateTime](xref:System.DateTime) value is automatically converted to UTC before it is formatted. @@ -701,7 +702,7 @@ Console.WriteLine(date1.ToString("U", CultureInfo.CreateSpecificCulture("sv-FI") ' Displays den 10 april 2008 13:30:00 ``` -## The Year Month ("Y", "y") Format Specifier +## The year month ("Y", "y") format specifier The "Y" or "y" standard format specifier represents a custom date and time format string that is defined by the [DateTimeFormatInfo.YearMonthPattern](xref:System.Globalization.DateTimeFormatInfo.YearMonthPattern) property of a specified culture. For example, the custom format string for the invariant culture is "yyyy MMMM". 
@@ -734,11 +735,11 @@ Console.WriteLine(date1.ToString("y", CultureInfo.CreateSpecificCulture("af-ZA") ## Notes -### DateTimeFormatInfo Properties +### DateTimeFormatInfo properties Formatting is influenced by properties of the current [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object, which is provided implicitly by the current thread culture or explicitly by the [IFormatProvider](xref:System.IFormatProvider) parameter of the method that invokes formatting. For the [IFormatProvider](xref:System.IFormatProvider) parameter, your application should specify a [CultureInfo](xref:System.Globalization.CultureInfo) object, which represents a culture, or a [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object, which represents a particular culture's date and time formatting conventions. Many of the standard date and time format specifiers are aliases for formatting patterns defined by properties of the current [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) object. Your application can change the result produced by some standard date and time format specifiers by changing the corresponding date and time format patterns of the corresponding [DateTimeFormatInfo](xref:System.Globalization.DateTimeFormatInfo) property. 
-## See Also +## See also [Formatting types](formatting-types.md) diff --git a/docs/standard/base-types/standard-numeric.md b/docs/standard/base-types/standard-numeric.md index ed0179c0ffc5f..27c019b0b6251 100644 --- a/docs/standard/base-types/standard-numeric.md +++ b/docs/standard/base-types/standard-numeric.md @@ -3,6 +3,7 @@ title: Standard numeric format strings description: Standard numeric format strings keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/26/2016 ms.topic: article @@ -27,7 +28,7 @@ When *precision specifier* controls the number of fractional digits in the resul Standard numeric format strings are supported by some overloads of the `ToString` method of all numeric types. For example, you can supply a numeric format string to the [ToString(String)](xref:System.Int32.ToString(System.String)) and [ToString(String, IFormatProvider)](xref:System.Int32.ToString(System.String,System.IFormatProvider)) methods of the [Int32](xref:System.Int32) type. Standard numeric format strings are also supported by the .NET [composite formatting](composite-format.md) feature, which is used by some `Write` and `WriteLine` methods of the [Console](xref:System.Console) and [StreamWriter](xref:System.IO.StreamWriter) classes, the [String.Format](xref:System.String.Format(System.IFormatProvider,System.String,System.Object)) method, and the [StringBuilder.AppendFormat](xref:System.Text.StringBuilder.AppendFormat(System.IFormatProvider,System.String,System.Object)) method. The composite format feature allows you to include the string representation of multiple data items in a single string, to specify field width, and to align numbers in a field. For more information, see [Composite Formatting](composite-format.md). -The following table describes the standard numeric format specifiers and displays sample output produced by each format specifier. 
See the [Notes](#Notes) section for additional information about using standard numeric format strings, and the [Example](#Example) section for a comprehensive illustration of their use. +The following table describes the standard numeric format specifiers and displays sample output produced by each format specifier. See the [Notes](#notes) section for additional information about using standard numeric format strings, and the [Example](#example) section for a comprehensive illustration of their use. |Format specifier|Name|Description|Examples| |----------------------|----------|-----------------|--------------| @@ -42,7 +43,7 @@ The following table describes the standard numeric format specifiers and display |"X" or "x"|Hexadecimal|Result: A hexadecimal string.

Supported by: Integral types only.

Precision specifier: Number of digits in the result string.

|255 ("X") -> FF

-1 ("x") -> ff

255 ("x4") -> 00ff

-1 ("X4") -> 00FF| |Any other single character|Unknown specifier|Result: Throws a [FormatException](xref:System.FormatException) at run time.|| -## Using Standard Numeric Format Strings +## Using standard numeric format strings A standard numeric format string can be used to define the formatting of a numeric value in one of two ways: @@ -96,7 +97,7 @@ A standard numeric format string can be used to define the formatting of a numer The following sections provide detailed information about each of the standard numeric format strings. -## The Currency ("C") Format Specifier +## The currency ("C") format specifier The "C" (or currency) format specifier converts a number to a string that represents a currency amount. The precision specifier indicates the desired number of decimal places in the result string. If the precision specifier is omitted, the default precision is defined by the [NumberFormatInfo.CurrencyDecimalDigits](xref:System.Globalization.NumberFormatInfo.CurrencyDecimalDigits) property. @@ -147,7 +148,7 @@ Console.WriteLine(value.ToString("C3", _ ' kr 12.345,679 ``` -## The Decimal ("D") Format Specifier +## The decimal ("D") format specifier The "D" (or decimal) format specifier converts a number to a string of decimal digits (0-9), prefixed by a minus sign if the number is negative. This format is supported only for integral types. @@ -193,7 +194,7 @@ Console.WriteLine(value.ToString("D8")) ' Displays -00012345 ``` -## The Exponential ("E") Format Specifier +## The exponential ("E") format specifier The exponential ("E") format specifier converts a number to a string of the form "-d.ddd…E+ddd" or "-d.ddd…e+ddd", where each "d" indicates a digit (0-9). The string starts with a minus sign if the number is negative. Exactly one digit always precedes the decimal point. 
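As a brief sketch of the exponential form described above, the following C# formats one value with and without a precision specifier. The `FormatExponential` helper and the sample value are assumptions for illustration; the invariant culture is used so the decimal separator is a period.

```csharp
using System;
using System.Globalization;

public static class ExponentialSketch
{
    // Hypothetical helper: applies an "E"/"e" format string under the invariant culture.
    public static string FormatExponential(double value, string format) =>
        value.ToString(format, CultureInfo.InvariantCulture);

    public static void Main()
    {
        double value = 12345.6789;   // illustrative value

        Console.WriteLine(FormatExponential(value, "E"));    // 1.234568E+004
        Console.WriteLine(FormatExponential(value, "E2"));   // 1.23E+004
        Console.WriteLine(FormatExponential(value, "e3"));   // 1.235e+004
    }
}
```

In each result exactly one digit precedes the decimal point, and the case of the specifier carries over to the exponent character.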
@@ -243,7 +244,7 @@ Console.WriteLine(value.ToString("E", _ ' Displays 1,234568E+004 ``` -## The Fixed-Point ("F") Format Specifier +## The fixed-point ("F") format specifier The fixed-point ("F) format specifier converts a number to a string of the form "-ddd.ddd…" where each "d" indicates a digit (0-9). The string starts with a minus sign if the number is negative. @@ -315,7 +316,7 @@ Console.WriteLine(doubleNumber.ToString("F3", _ ' Displays -1898300,199 ``` -## The General ("G") Format Specifier +## The general ("G") format specifier The general ("G") format specifier converts a number to the more compact of either fixed-point or scientific notation, depending on the type of the number and whether a precision specifier is present. The precision specifier defines the maximum number of significant digits that can appear in the result string. If the precision specifier is omitted or zero, the type of the number determines the default precision, as indicated in the following table. @@ -413,7 +414,7 @@ Console.WriteLine(number.ToString("G5", CultureInfo.InvariantCulture)) ' Displays 3.1416 ``` -## The Numeric ("N") Format Specifier +## The numeric ("N") format specifier The numeric ("N") format specifier converts a number to a string of the form "-d,ddd,ddd.ddd…", where "-" indicates a negative number symbol if required, "d" indicates a digit (0-9), "," indicates a group separator, and "." indicates a decimal point symbol. The precision specifier indicates the desired number of digits after the decimal point. If the precision specifier is omitted, the number of decimal places is defined by the current [NumberFormatInfo.NumberDecimalDigits](xref:System.Globalization.NumberFormatInfo.NumberDecimalDigits) property. 
@@ -456,7 +457,7 @@ Console.WriteLine(intValue.ToString("N1", CultureInfo.InvariantCulture)) ' Displays 123,456,789.0 ``` -## The Percent ("P") Format Specifier +## The percent ("P") format specifier The percent ("P") format specifier multiplies a number by 100 and converts it to a string that represents a percentage. The precision specifier indicates the desired number of decimal places. If the precision specifier is omitted, the default numeric precision supplied by the current [PercentDecimalDigits](xref:System.Globalization.NumberFormatInfo.PercentDecimalDigits) property is used. @@ -497,7 +498,7 @@ Console.WriteLine(number.ToString("P1", CultureInfo.InvariantCulture)) ' Displays 24.7 % ``` -## The Round-trip ("R") Format Specifier +## The round-trip ("R") format specifier The round-trip ("R") format specifier is used to ensure that a numeric value that is converted to a string will be parsed back into the same numeric value. This format is supported only for the [Single](xref:System.Single), [Double](xref:System.Double), and [BigInteger](xref:System.Numerics.BigInteger) types. @@ -543,7 +544,7 @@ Console.WriteLine(value.ToString("r")) ' Displays 1.623E-21 ``` -## The Hexadecimal ("X") Format Specifier +## The hexadecimal ("X") format specifier The hexadecimal ("X") format specifier converts a number to a string of hexadecimal digits. The case of the format specifier indicates whether to use uppercase or lowercase characters for hexadecimal digits that are greater than 9. For example, use "X" to produce "ABCDEF", and "x" to produce "abcdef". This format is supported only for integral types. 
@@ -591,15 +592,15 @@ Console.WriteLine(value.ToString("X2")) ## Notes -### NumberFormatInfo Properties +### NumberFormatInfo properties Formatting is influenced by the properties of the current [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object, which is provided implicitly by the current thread culture or explicitly by the [IFormatProvider](xref:System.IFormatProvider) parameter of the method that invokes formatting. Specify a [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) or [CultureInfo](xref:System.Globalization.CultureInfo) object for that parameter. -### Integral and Floating-Point Numeric Types +### Integral and floating-point numeric types Some descriptions of standard numeric format specifiers refer to integral or floating-point numeric types. The integral numeric types are [Byte](xref:System.Byte), [SByte](xref:System.SByte), [Int16](xref:System.Int16), [Int32](xref:System.Int32), [Int64](xref:System.Int64), [UInt16](xref:System.UInt16), [UInt32](xref:System.UInt32), [UInt64](xref:System.UInt64), and [BigInteger](xref:System.Numerics.BigInteger). The floating-point numeric types are [Decimal](xref:System.Decimal), [Single](xref:System.Single), and [Double](xref:System.Double). -### Floating-Point Infinities and NaN +### Floating-point infinities and NaN Regardless of the format string, if the value of a [Single](xref:System.Single) or [Double](xref:System.Double) floating-point type is positive infinity, negative infinity, or not a number (NaN), the formatted string is the value of the respective [PositiveInfinitySymbol](xref:System.Globalization.NumberFormatInfo.PositiveInfinitySymbol), [NegativeInfinitySymbol](xref:System.Globalization.NumberFormatInfo.NegativeInfinitySymbol), or [NaNSymbol](xref:System.Globalization.NumberFormatInfo.NaNSymbol) property that is specified by the currently applicable [NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) object. 
@@ -712,7 +713,7 @@ Module NumericFormats End Module ``` -## See Also +## See also [System.Globalization.NumberFormatInfo](xref:System.Globalization.NumberFormatInfo) diff --git a/docs/standard/base-types/substitutions.md b/docs/standard/base-types/substitutions.md index 930d31a9802f5..d84bc6c5ecbbd 100644 --- a/docs/standard/base-types/substitutions.md +++ b/docs/standard/base-types/substitutions.md @@ -3,6 +3,7 @@ title: Substitutions in regular expressions description: Substitutions in regular expressions keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/29/2016 ms.topic: article @@ -20,29 +21,29 @@ Substitutions are language elements that are recognized only within replacement Substitution | Description ------------ | ----------- -**$**_number_ | Includes the last substring matched by the capturing group that is identified by *number*, where *number* is a decimal value, in the replacement string. For more information, see [Substituting a Numbered Group](#Substituting-a-Numbered-Group). -**${**_name_**}** | Includes the last substring matched by the named group that is designated by **(?<**_name_**>)** in the replacement string. For more information, see [Substituting a Named Group](#Substituting-a-Named-Group). -**$$** | Includes a single "$" literal in the replacement string. For more information, see [Substituting a "$" Symbol](#Substituting-a-$-Symbol). -**$&** | Includes a copy of the entire match in the replacement string. For more information, see [Substituting the Entire Match](#Substituting-the-Entire-Match). -**$`** | Includes all the text of the input string before the match in the replacement string. For more information, see [Substituting the Text before the Match](#Substituting-the-Text-before-the-Match). -**$'** | Includes all the text of the input string after the match in the replacement string. For more information, see [Substituting the Text after the Match](#Substituting-the-Text-after-the-Match). 
-**$+** | Includes the last group captured in the replacement string. For more information, see [Substituting the Last Captured Group](#Substituting-the-Last-Captured-Group). -**$_** | Includes the entire input string in the replacement string. For more information, see [Substituting the Entire Input String](#Substituting-the-Entire-Input-String). +**$**_number_ | Includes the last substring matched by the capturing group that is identified by *number*, where *number* is a decimal value, in the replacement string. For more information, see [Substituting a numbered group](#substituting-a-numbered-group). +**${**_name_**}** | Includes the last substring matched by the named group that is designated by **(?<**_name_**>)** in the replacement string. For more information, see [Substituting a named group](#substituting-a-named-group). +**$$** | Includes a single "$" literal in the replacement string. For more information, see [Substituting a "$" character](#substituting-a--character). +**$&** | Includes a copy of the entire match in the replacement string. For more information, see [Substituting the entire match](#substituting-the-entire-match). +**$`** | Includes all the text of the input string before the match in the replacement string. For more information, see [Substituting the text before the match](#substituting-the-text-before-the-match). +**$'** | Includes all the text of the input string after the match in the replacement string. For more information, see [Substituting the text after the match](#substituting-the-text-after-the-match). +**$+** | Includes the last group captured in the replacement string. For more information, see [Substituting the last captured group](#substituting-the-last-captured-group). +**$_** | Includes the entire input string in the replacement string. For more information, see [Substituting the entire input string](#substituting-the-entire-input-string). 
-## Substitution Elements and Replacement Patterns +## Substitution elements and replacement patterns Substitutions are the only special constructs recognized in a replacement pattern. None of the other regular expression language elements, including character escapes and the period (**.**), which matches any character, are supported. Similarly, substitution language elements are recognized only in replacement patterns and are never valid in regular expression patterns. The only character that can appear either in a regular expression pattern or in a substitution is the **$** character, although it has a different meaning in each context. In a regular expression pattern, **$** is an anchor that matches the end of the string. In a replacement pattern, **$** indicates the beginning of a substitution. > [!NOTE] -> For functionality similar to a replacement pattern within a regular expression, use a backreference. For more information about backreferences, see [Backreference Constructs](backreference.md). +> For functionality similar to a replacement pattern within a regular expression, use a backreference. For more information about backreferences, see [Backreference constructs](backreference.md). -## Substituting a Numbered Group +## Substituting a numbered group -The **$**_number_ language element includes the last substring matched by the number capturing group in the replacement string, where *number* is the index of the capturing group. For example, the replacement pattern `$1` indicates that the matched substring is to be replaced by the first captured group. For more information about numbered capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md). +The **$**_number_ language element includes the last substring matched by the number capturing group in the replacement string, where *number* is the index of the capturing group. 
For example, the replacement pattern `$1` indicates that the matched substring is to be replaced by the first captured group. For more information about numbered capturing groups, see [Grouping constructs in regular expressions](grouping.md). -All digits that follow **$** are interpreted as belonging to the number group. If this is not your intent, you can substitute a named group instead. For example, you can use the replacement string **${1}1** instead of **$11** to define the replacement string as the value of the first captured group along with the number "1". For more information, see [Substituting a Named Group](#Substituting-a-Named-Group). +All digits that follow **$** are interpreted as belonging to the number group. If this is not your intent, you can substitute a named group instead. For example, you can use the replacement string **${1}1** instead of **$11** to define the replacement string as the value of the first captured group along with the number "1". For more information, see [Substituting a named group](#substituting-a-named-group). Capturing groups that are not explicitly assigned names using the **(?<**_name_**>)** syntax are numbered from left to right starting at one. Named groups are also numbered from left to right, starting at one greater than the index of the last unnamed group. For example, in the regular expression `(\w)(?<digit>\d)`, the index of the `digit` named group is 2. @@ -96,9 +97,9 @@ Pattern | Description `\d*` | Match zero or more decimal digits. `(\s?\d+[.,]?\d*)` | Match a white space followed by one or more decimal digits, followed by zero or one period or comma, followed by zero or more decimal digits. This is the first capturing group. Because the replacement pattern is `$1`, the call to the [Regex.Replace](xref:System.Text.RegularExpressions.Regex.Replace(System.String,System.String,System.String,System.Text.RegularExpressions.RegexOptions)) method replaces the entire matched substring with this captured group. 
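As a minimal sketch of the **${1}1** form described above, the following C# example appends a literal "1" to each value captured by the first group. The input string, pattern, and class name are hypothetical and chosen only for illustration; they do not come from the original topic.

```csharp
using System;
using System.Text.RegularExpressions;

public class NumberedGroupExample
{
   public static void Main()
   {
      // Hypothetical input and pattern, chosen only for illustration.
      string input = "abc";
      string pattern = @"(\w)";

      // ${1}1 is the first captured group followed by a literal "1".
      // The braces keep the trailing digit from being read as part of
      // the group number (that is, as group 11).
      string result = Regex.Replace(input, pattern, "${1}1");
      Console.WriteLine(result);
      // Displays a1b1c1
   }
}
```

The braces are what separate the group number from the digit that follows it, which is why this form is preferable whenever a replacement places a digit directly after a numbered group.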
-## Substituting a Named Group +## Substituting a named group -The **${**_name_**}** language element substitutes the last substring matched by the *name* capturing group, where *name* is the name of a capturing group defined by the **(?<**_name_**>)** language element. For more information about named capturing groups, see [Grouping Constructs in Regular Expressions](grouping.md). +The **${**_name_**}** language element substitutes the last substring matched by the *name* capturing group, where *name* is the name of a capturing group defined by the **(?<**_name_**>)** language element. For more information about named capturing groups, see [Grouping constructs in regular expressions](grouping.md). If *name* doesn't specify a valid named capturing group defined in the regular expression pattern but consists of digits, **${**_name_**}** is interpreted as a numbered group. @@ -152,7 +153,7 @@ Pattern | Description `\d*` | Match zero or more decimal digits. `(?<amount>\s?\d[.,]?\d*)` | Match a white space, followed by one or more decimal digits, followed by zero or one period or comma, followed by zero or more decimal digits. This is the capturing group named amount. Because the replacement pattern is `${amount}`, the call to the [Regex.Replace](xref:System.Text.RegularExpressions.Regex.Replace(System.String,System.String,System.String,System.Text.RegularExpressions.RegexOptions)) method replaces the entire matched substring with this captured group. -## Substituting a $ Character +## Substituting a $ character The **$$** substitution inserts a literal "$" character in the replaced string. @@ -236,7 +237,7 @@ Pattern | Description `(\d+)` | Match one or more decimal digits. This is the third capturing group. `(\.(\d+))?` | Match zero or one occurrence of a period followed by one or more decimal digits. This is the second capturing group. 
-## Substituting the Entire Match +## Substituting the entire match The **$&** substitution includes the entire match in the replacement string. Often, it is used to add a substring to the beginning or end of the matched string. For example, the `($&)` replacement pattern adds parentheses to the beginning and end of each match. If there is no match, the **$&** substitution has no effect. @@ -300,7 +301,7 @@ Pattern | Description The `"$&"` replacement pattern adds a literal quotation mark to the beginning and end of each match. -## Substituting the Text Before the Match +## Substituting the text before the match The **$`** substitution replaces the matched string with the entire input string before the match. That is, it duplicates the input string up to the match while removing the matched text. Any text that follows the matched text is unchanged in the result string. If there are multiple matches in an input string, the replacement text is derived from the original input string, rather than from the string in which text has been replaced by earlier matches. (The example provides an illustration.) If there is no match, the **$`** substitution has no effect. @@ -375,7 +376,7 @@ Match | Position | String before match | Result string 4 | 11 | aa1bb2cc3dd | aaaabbaa1bbccaa1bb2ccdd**aa1bb2cc3dd**ee5 5 | 14 | aa1bb2cc3dd4ee | aaaabbaa1bbccaa1bb2ccddaa1bb2cc3ddee **aa1bb2cc3dd4ee** -## Substituting the Text After the Match +## Substituting the text after the match The **$'** substitution replaces the matched string with the entire input string after the match. That is, it duplicates the input string after the match while removing the matched text. Any text that precedes the matched text is unchanged in the result string. If there is no match, the **$'** substitution has no effect. 
@@ -449,7 +450,7 @@ Match | Position | String before match | Result string 4 | 11 | ee5 | aabb2cc3dd4ee5bbcc3dd4ee5ccdd4ee5dd**ee5**ee5 5 | 14 | [String.Empty](xref:System.String.Empty) | aabb2cc3dd4ee5bbcc3dd4ee5ccdd4ee5ddee5ee -## Substituting the Last Captured Group +## Substituting the last captured group The **$+** substitution replaces the matched string with the last captured group. If there are no captured groups or if the value of the last captured group is [String.Empty](xref:System.String.Empty), the **$+** substitution has no effect. @@ -500,7 +501,7 @@ Pattern | Description `\1` | Match the first captured group. `\b` | End the match at a word boundary. -## Substituting the Entire Input String +## Substituting the entire input string The **$_** substitution replaces the matched string with the entire input string. That is, it removes the matched text and replaces it with the entire string, including the matched text. @@ -552,7 +553,7 @@ Match | Position | String before match | Result string 1 | 3 | 123 | ABC**ABC123DEF456**DEF456 2 | 5 | 456 | ABCABC123DEF456DEF**ABC123DEF456** -## See Also +## See also [Regular expression language - quick reference](quick-ref.md) diff --git a/docs/standard/base-types/type-conversion.md b/docs/standard/base-types/type-conversion.md index 24a1ab6d8e987..377df192bb85e 100644 --- a/docs/standard/base-types/type-conversion.md +++ b/docs/standard/base-types/type-conversion.md @@ -3,6 +3,7 @@ title: Type conversion description: Type conversion keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/22/2016 ms.topic: article @@ -28,17 +29,17 @@ Every value has an associated type, which defines attributes such as the amount In addition to these automatic conversions, .NET provides several features that support custom type conversion. These include the following: -* The `Implicit` operator, which defines the available widening conversions between types. 
For more information, see the [Implicit Conversion with the Implicit Operator](#Implicit-Conversion-with-the-Implicit-Operator) section. +* The `Implicit` operator, which defines the available widening conversions between types. For more information, see the [Implicit conversion with the Implicit operator](#implicit-conversion-with-the-implicit-operator) section. -* The `Explicit` operator, which defines the available narrowing conversions between types. For more information, see the [Explicit Conversion with the Explicit Operator](#Explicit-Conversion-with-the-Explicit-Operator) section. +* The `Explicit` operator, which defines the available narrowing conversions between types. For more information, see the [Explicit conversion with the Explicit operator](#explicit-conversion-with-the-explicit-operator) section. -* The [IConvertible](xref:System.IConvertible) interface, which defines conversions to each of the base .NET data types. For more information, see the [The IConvertible Interface](#The-IConvertible-Interface) section. +* The [IConvertible](xref:System.IConvertible) interface, which defines conversions to each of the base .NET data types. For more information, see the [The IConvertible interface](#the-iconvertible-interface) section. -* The [Convert](xref:System.Convert) class, which provides a set of methods that implement the methods in the `IConvertible` interface. For more information, see the [The Convert Class](#The-Convert-Class) section. +* The [Convert](xref:System.Convert) class, which provides a set of methods that implement the methods in the `IConvertible` interface. For more information, see the [The Convert class](#the-convert-class) section. -* The [TypeConverter](xref:System.ComponentModel.TypeConverter) class, which is a base class that can be extended to support the conversion of a specified type to any other type. For more information, see the [The TypeConverter Class](#The-TypeConverter-Class) section. 
+* The [TypeConverter](xref:System.ComponentModel.TypeConverter) class, which is a base class that can be extended to support the conversion of a specified type to any other type. For more information, see the [The TypeConverter class](#the-typeconverter-class) section. -## Implicit Conversion with the Implicit Operator +## Implicit conversion with the Implicit operator Widening conversions involve the creation of a new value from the value of an existing type that has either a more restrictive range or a more restricted member list than the target type. Widening conversions cannot result in data loss (although they may result in a loss of precision). Because data cannot be lost, compilers can handle the conversion implicitly or transparently, without requiring the use of an explicit conversion method or a casting operator. @@ -199,7 +200,7 @@ Console.WriteLine(value.ToString()) ' 255 ``` -## Explicit Conversion with the Explicit Operator +## Explicit conversion with the Explicit operator Narrowing conversions involve the creation of a new value from the value of an existing type that has either a greater range or a larger member list than the target type. Because a narrowing conversion can result in a loss of data, compilers often require that the conversion be made explicit through a call to a conversion method or a casting operator. That is, the conversion must be handled explicitly in developer code. @@ -214,7 +215,7 @@ Type | Comparison with range of Int32 [UInt32](xref:System.UInt32) | [UInt32.MaxValue](xref:System.UInt32.MaxValue) is greater than [Int32.MaxValue](xref:System.Int32.MaxValue). [UInt64](xref:System.UInt64) | [UInt64.MaxValue](xref:System.UInt64.MaxValue) is greater than [Int32.MaxValue](xref:System.Int32.MaxValue). -To handle such narrowing conversions, .NET allows types to define an `Explicit` operator. 
Individual language compilers can then implement this operator using their own syntax, or a member of the [Convert](xref:System.Convert) class can be called to perform the conversion. (For more information about the `Convert` class, see [The Convert Class](#The-Convert-Class) later in this topic.) The following example illustrates the use of language features to handle the explicit conversion of these potentially out-of-range integer values to [Int32](xref:System.Int32) values. +To handle such narrowing conversions, .NET allows types to define an `Explicit` operator. Individual language compilers can then implement this operator using their own syntax, or a member of the [Convert](xref:System.Convert) class can be called to perform the conversion. (For more information about the `Convert` class, see [The Convert class](#the-convert-class) later in this topic.) The following example illustrates the use of language features to handle the explicit conversion of these potentially out-of-range integer values to [Int32](xref:System.Int32) values. ```csharp long number1 = int.MaxValue + 20L; @@ -482,7 +483,7 @@ End Try ' '1024' is out of range of the ByteWithSign data type. ``` -## The IConvertible Interface +## The IConvertible interface To support the conversion of any type to a common language runtime base type, .NET provides the [IConvertible](xref:System.IConvertible) interface. The implementing type is required to provide the following: @@ -508,18 +509,18 @@ Dim ch As Char = iConv.ToChar(Nothing) Console.WriteLine("Converted {0} to {1}.", codePoint, ch) ``` -The requirement to call the conversion method on its interface rather than on the implementing type makes explicit interface implementations relatively expensive. Instead, we recommend that you call the appropriate member of the [Convert](xref:System.Convert) class to convert between common language runtime base types. For more information, see the next section, [The Convert Class](#The-Convert-Class). 
+The requirement to call the conversion method on its interface rather than on the implementing type makes explicit interface implementations relatively expensive. Instead, we recommend that you call the appropriate member of the [Convert](xref:System.Convert) class to convert between common language runtime base types. For more information, see the next section, [The Convert class](#the-convert-class). > [!NOTE] > In addition to the [IConvertible](xref:System.IConvertible) interface and the [Convert](xref:System.Convert) class provided by .NET, individual languages may also provide ways to perform conversions. For example, C# uses casting operators; Visual Basic uses compiler-implemented conversion functions such as `CType`, `CInt`, and `DirectCast`. -For the most part, the [IConvertible](xref:System.IConvertible) interface is designed to support conversion between the base types in .NET. However, the interface can also be implemented by a custom type to support conversion of that type to other custom types. For more information, see the section [Custom Conversions with the ChangeType Method](#Custom-Conversions-with-the-ChangeType-Method) later in this topic. +For the most part, the [IConvertible](xref:System.IConvertible) interface is designed to support conversion between the base types in .NET. However, the interface can also be implemented by a custom type to support conversion of that type to other custom types. For more information, see the section [Custom conversions with the ChangeType method](#custom-conversions-with-the-changetype-method) later in this topic. -## The Convert Class +## The Convert class Although each base type's [IConvertible](xref:System.IConvertible) interface implementation can be called to perform a type conversion, calling the methods of the [System.Convert](xref:System.Convert) class is the recommended language-neutral way to convert from one base type to another. 
In addition, the [Convert.ChangeType(Object, Type, IFormatProvider)](xref:System.Convert.ChangeType(System.Object,System.Type,System.IFormatProvider)) method can be used to convert from a specified custom type to another type. -### Conversions Between Base Types +### Conversions between base types The [Convert](xref:System.Convert) class provides a language-neutral way to perform conversions between base types and is available to all languages that target the common language runtime. It provides a complete set of methods for both widening and narrowing conversions, and throws an [InvalidCastException](xref:System.InvalidCastException) for conversions that are not supported (such as the conversion of a [DateTime](xref:System.DateTime) value to an integer value). Narrowing conversions are performed in a checked context, and an [OverflowException](xref:System.OverflowException) is thrown if the conversion fails. @@ -689,9 +690,9 @@ End Try ' 42.72 converted to 43. ``` -For a table that lists both the widening and narrowing conversions supported by the [Convert](xref:System.Convert) class, see [Type Conversion Tables](conversion-tables.md). +For a table that lists both the widening and narrowing conversions supported by the [Convert](xref:System.Convert) class, see [Type conversion tables](conversion-tables.md). -### Custom Conversions with the ChangeType Method +### Custom conversions with the ChangeType method In addition to supporting conversions to each of the base types, the [Convert](xref:System.Convert) class can be used to convert a custom type to one or more predefined types. This conversion is performed by the [Convert.ChangeType(Object, Type, IFormatProvider)](xref:System.Convert.ChangeType(System.Object,System.Type,System.IFormatProvider)) method, which in turn wraps a call to the [IConvertible.ToType](xref:System.IConvertible.ToType(System.Type,System.IFormatProvider)) method of the value parameter. 
This means that the object represented by the value parameter must provide an implementation of the [IConvertible](xref:System.IConvertible) interface. @@ -1198,7 +1199,7 @@ Console.WriteLine("{0} equals {1}.", tempF2, tempF3) ' 212°F equals 212°F. ``` -## The TypeConverter Class +## The TypeConverter class .NET also allows you to define a type converter for a custom type by extending the [System.ComponentModel.TypeConverter](xref:System.ComponentModel.TypeConverter) class and associating the type converter with the type through a [System.ComponentModel.TypeConverterAttribute](xref:System.ComponentModel.TypeConverterAttribute) attribute. The following table highlights the differences between this approach and implementing the [IConvertible](xref:System.IConvertible) interface for a custom type. @@ -1214,11 +1215,10 @@ Allows two-way type conversions from the custom type to other data types, and fr For more information about using type converters to perform conversions, see [System.ComponentModel.TypeConverter](xref:System.ComponentModel.TypeConverter). -## See Also +## See also [System.Convert](xref:System.Convert) [IConvertible](xref:System.IConvertible) -[Type Conversion Tables](conversion-tables.md) - +[Type conversion tables](conversion-tables.md) \ No newline at end of file diff --git a/docs/standard/datetime/converting-between-time-zones.md b/docs/standard/datetime/converting-between-time-zones.md index 739be8e6df15f..254f9899d4f94 100644 --- a/docs/standard/datetime/converting-between-time-zones.md +++ b/docs/standard/datetime/converting-between-time-zones.md @@ -3,6 +3,7 @@ title: Converting times between time zones description: Converting times between time zones keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 08/15/2016 ms.topic: article @@ -26,7 +27,7 @@ Coordinated Universal Time (UTC) is a high-precision, atomic time standard. 
The The easiest way to convert a time to UTC is to call the `static` (`Shared` in Visual Basic) [TimeZoneInfo.ConvertTimeToUtc(DateTime)](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method. > [!IMPORTANT] -> The `TimeZoneInfo.ConvertTimeToUtc(DateTime)' method isn't currently available in .NET Core. +> The `TimeZoneInfo.ConvertTimeToUtc(DateTime)` method isn't currently available in .NET Core. The exact conversion performed by the method depends on the value of the `DateTime` parameter's [Kind](xref:System.DateTime.Kind) property, as the following table shows. @@ -53,7 +54,7 @@ Console.WriteLine("The date and time are {0} UTC.", _ > [!NOTE] >The [TimeZoneInfo.ConvertTimeToUtc(DateTime)](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method does not necessarily produce results that are identical to the [TimeZone.ToUniversalTime](https://msdn.microsoft.com/en-us/library/System.TimeZone.ToUniversalTime(v=vs.110).aspx) and [DateTime.ToUniversalTime](xref:System.DateTime.ToUniversalTime) methods. If the host system's local time zone includes multiple adjustment rules, [TimeZoneInfo.ConvertTimeToUtc(DateTime)](https://msdn.microsoft.com/en-us/library/System.TimeZone.ConvertTimeToUtc(v=vs.110).aspx) applies the appropriate rule to a particular date and time. The other two methods always apply the latest adjustment rule. -If the date and time value does not represent either the local time or UTC, the [ToUniversalTime](https://msdn.microsoft.com/en-us/library/System.TimeZone.ToUniversalTime(v=vs.110).aspx) method will likely return an erroneous result. However, you can use the [TimeZoneInfo.ConvertTimeToUtc](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method to convert the date and time from a specified time zone. (For details on retrieving a TimeZoneInfo object that represents the destination time zone, see [Finding the Time Zones Defined on a Local System](finding-the-time-zones-on-local-system.md). 
The following code uses the [TimeZoneInfo.ConvertTimeToUtc](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method to convert Eastern Standard Time to UTC. +If the date and time value does not represent either the local time or UTC, the [ToUniversalTime](https://msdn.microsoft.com/en-us/library/System.TimeZone.ToUniversalTime(v=vs.110).aspx) method will likely return an erroneous result. However, you can use the [TimeZoneInfo.ConvertTimeToUtc](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method to convert the date and time from a specified time zone. (For details on retrieving a TimeZoneInfo object that represents the destination time zone, see [Finding the time zones defined on a local system](finding-the-time-zones-on-local-system.md). The following code uses the [TimeZoneInfo.ConvertTimeToUtc](https://msdn.microsoft.com/en-us/library/bb381744(v=vs.110).aspx) method to convert Eastern Standard Time to UTC. ```csharp DateTime easternTime = new DateTime(2007, 01, 02, 12, 16, 00); @@ -178,9 +179,9 @@ Console.WriteLine() ' 6/15/2007 7:00:00 PM +00:00 exactly equals 6/15/2007 7:00:00 PM +00:00: True ``` -## Converting UTC to a Designated Time Zone +## Converting UTC to a designated time zone -To convert UTC to local time, see the [Converting UTC to Local Time](#Converting-UTC-to-Local-Time) section that follows. +To convert UTC to local time, see the [Converting UTC to local time](#converting-utc-to-local-time) section that follows. To convert UTC to the time in any time zone that you designate, call the [ConvertTimeFromUtc](https://msdn.microsoft.com/en-us/library/System.TimeZoneInfo.converttimefromutc(v=vs.110).aspx) method. @@ -232,7 +233,7 @@ Catch e As InvalidTimeZoneException End Try ``` -## Converting UTC to Local Time +## Converting UTC to local time To convert UTC to local time, call the [DateTime.ToLocalTime](xref:System.DateTime) method of the [DateTime](xref:System.DateTime) object whose time you want to convert. 
The exact behavior of the method depends on the value of the object’s [Kind](xref:System.DateTime.Kind) property, as the following table shows. @@ -242,7 +243,7 @@ To convert UTC to local time, call the [DateTime.ToLocalTime](xref:System.DateTi [DateTimeKind.Unspecified](xref:System.DateTimeKind.Unspecified) | Assumes that the [DateTime](xref:System.DateTime) value is UTC and converts the UTC to local time. [DateTimeKind.Utc](xref:System.DateTimeKind.Utc) | Converts the [DateTime](xref:System.DateTime) value to local time. -## Converting Between Any Two Time Zones +## Converting between any two time zones You can convert between any two time zones by using the static [TimeZoneInfo.ConvertTime](xref:System.TimeZoneInfo.ConvertTime(System.DateTime,System.TimeZoneInfo)) method. This method's parameters are the [DateTime](xref:System.DateTime) value to convert, a [TimeZoneInfo](xref:System.TimeZoneInfo) object that represents the time zone of the date and time value, and a [TimeZoneInfo](xref:System.TimeZoneInfo) object that represents the time zone to convert the date and time value to. @@ -285,7 +286,7 @@ Catch e As InvalidTimeZoneException End Try ``` -## Converting DateTimeOffset Values +## Converting DateTimeOffset values Date and time values represented by [System.DateTimeOffset](xref:System.DateTimeOffset) objects are not fully time-zone aware because the object is disassociated from its time zone at the time it is instantiated. However, in many cases an application simply needs to convert a date and time based on two different offsets from UTC rather than on the time in particular time zones. To perform this conversion, you can call the current instance's [ToOffset](xref:System.DateTimeOffset.ToOffset(System.TimeSpan)) method. The method's single parameter is [TimeSpan](xref:System.TimeSpan) representing the offset of the new date and time value that the method is to return. 
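As a minimal sketch of the `ToOffset` conversion described above: the helper name `ConvertToOffset`, the class name `ToOffsetSketch`, and the sample date and offsets are assumptions made for illustration and are not taken from the article. The example re-expresses one `DateTimeOffset` value at a different offset from UTC and checks that both values still refer to the same instant.

```csharp
using System;

public static class ToOffsetSketch
{
    // Hypothetical helper (not part of the article): returns the same instant
    // expressed at a different offset from UTC.
    public static DateTimeOffset ConvertToOffset(DateTimeOffset source, TimeSpan targetOffset)
    {
        return source.ToOffset(targetOffset);
    }

    public static void Main()
    {
        // Illustrative value: 7:00 PM at an offset of -07:00 from UTC.
        var source = new DateTimeOffset(2007, 6, 15, 19, 0, 0, TimeSpan.FromHours(-7));

        // Re-express the value at an offset of -05:00 from UTC.
        DateTimeOffset target = ConvertToOffset(source, TimeSpan.FromHours(-5));

        Console.WriteLine(target.Hour);        // 21: the wall-clock hour moves two hours forward
        Console.WriteLine(target.Offset);      // -05:00:00
        Console.WriteLine(source == target);   // True: equality compares the underlying UTC instant
    }
}
```

Because `ToOffset` takes only an offset, it applies no time zone adjustment rules such as daylight saving time; when those rules matter, the `TimeZoneInfo` methods covered earlier in the article are the better fit.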
@@ -363,7 +364,7 @@ Public Function ReturnTimeOnServer(clientString As String) As DateTimeOffset End Function ``` -## See Also +## See also [TimeZoneInfo](xref:System.TimeZoneInfo) diff --git a/docs/standard/garbagecollection/fundamentals.md b/docs/standard/garbagecollection/fundamentals.md index 94f8d15d5ce83..b90b19f1924c2 100644 --- a/docs/standard/garbagecollection/fundamentals.md +++ b/docs/standard/garbagecollection/fundamentals.md @@ -3,6 +3,7 @@ title: Fundamentals of garbage collection description: Fundamentals of garbage collection keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 08/16/2016 ms.topic: article @@ -27,17 +28,17 @@ In the Common Language Runtime (CLR), the garbage collector serves as an automat This topic describes the core concepts of garbage collection. It contains the following sections: -* [Fundamentals of memory](#Fundamentals-of-memory) +* [Fundamentals of memory](#fundamentals-of-memory) -* [Conditions for a garbage collection](#Conditions-for-a-garbage-collection) +* [Conditions for a garbage collection](#conditions-for-a-garbage-collection) -* [The managed heap](#The-managed-heap) +* [The managed heap](#the-managed-heap) -* [Generations](#Generations) +* [Generations](#generations) -* [What happens during a garbage collection](#What-happens-during-a-garbage-collection) +* [What happens during a garbage collection](#what-happens-during-a-garbage-collection) -* [Manipulating unmanaged resources](#Manipulating-unmanaged-resources) +* [Manipulating unmanaged resources](#manipulating-unmanaged-resources) ## Fundamentals of memory @@ -163,6 +164,6 @@ Users of your managed object may not dispose the native resources used by the ob When a finalizable object is discovered to be dead, its finalizer is put in a queue so that its cleanup actions are executed, but the object itself is promoted to the next generation. 
Therefore, you have to wait until the next garbage collection that occurs on that generation (which is not necessarily the next garbage collection) to determine whether the object has been reclaimed. -## See Also +## See also [Garbage collection in .NET](gc.md) diff --git a/docs/standard/garbagecollection/implementing-dispose.md b/docs/standard/garbagecollection/implementing-dispose.md index 96501a80c664a..693caf7940294 100644 --- a/docs/standard/garbagecollection/implementing-dispose.md +++ b/docs/standard/garbagecollection/implementing-dispose.md @@ -3,6 +3,7 @@ title: Implementing a dispose method description: Implementing a dispose method keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 08/16/2016 ms.topic: article @@ -23,7 +24,7 @@ The dispose pattern has two variations: * You wrap each unmanaged resource that a type uses in a safe handle (that is, in a class derived from [System.Runtime.InteropServices.SafeHandle](xref:System.Runtime.InteropServices.SafeHandle)). In this case, you implement the [IDisposable](xref:System.IDisposable) interface and an additional `Dispose(Boolean)` method. This is the recommended variation and doesn't require overriding the [Object.Finalize](xref:System.Object.Finalize) method. > [!NOTE] -> The [Microsoft.Win32.SafeHandles](xref:Microsoft.Win32.SafeHandles) namespace provides a set of classes derived from [SafeHandle](xref:System.Runtime.InteropServices.SafeHandle), which are listed in the [Using safe handles](#Using-safe-handles) section. If you can't find a class that is suitable for releasing your unmanaged resource, you can implement your own subclass of [SafeHandle](xref:System.Runtime.InteropServices.SafeHandle). +> The [Microsoft.Win32.SafeHandles](xref:Microsoft.Win32.SafeHandles) namespace provides a set of classes derived from [SafeHandle](xref:System.Runtime.InteropServices.SafeHandle), which are listed in the [Using safe handles](#using-safe-handles) section. 
If you can't find a class that is suitable for releasing your unmanaged resource, you can implement your own subclass of [SafeHandle](xref:System.Runtime.InteropServices.SafeHandle). * You implement the [IDisposable](xref:System.IDisposable) interface and an additional `Dispose(Boolean`) method, and you also override the [Object.Finalize](xref:System.Object.Finalize) method. You must override [Finalize](xref:System.Object.Finalize) to ensure that unmanaged resources are disposed of if your [IDisposable.Dispose](xref:System.IDisposable.Dispose) implementation is not called by a consumer of your type. If you use the recommended technique discussed in the previous bullet, the [System.Runtime.InteropServices.SafeHandle](xref:System.Runtime.InteropServices.SafeHandle) class does this on your behalf. @@ -712,7 +713,7 @@ Public Class DisposableStreamResource2 : Inherits DisposableStreamResource End Class ``` -## See Also +## See also [SuppressFinalize](xref:System.GC.SuppressFinalize(System.Object)) diff --git a/docs/standard/language-independence.md b/docs/standard/language-independence.md index d0259f961de6e..f8b6df1ce6462 100644 --- a/docs/standard/language-independence.md +++ b/docs/standard/language-independence.md @@ -3,6 +3,7 @@ title: Language independence and language-independent components description: Language independence and language-independent components keywords: .NET, .NET Core author: stevehoag +ms.author: shoag manager: wpickett ms.date: 07/22/2016 ms.topic: article @@ -27,7 +28,7 @@ In this article: * [CLS compliance rules](#cls-compliance-rules) - * [Types and type member signatures](#Types-and-type-member-signatures) + * [Types and type member signatures](#types-and-type-member-signatures) * [Naming conventions](#naming-conventions) @@ -59,7 +60,7 @@ In this article: * [CLSCompliantAttribute attribute](#the-clscompliantattribute-attribute) -* [Cross-Language Interoperability](#Cross-Language-Interoperability) +* [Cross-Language 
Interoperability](#cross-language-interoperability) ## CLS compliance rules @@ -193,12 +194,12 @@ Properties | [Properties](#properties) | A property’s accessors shall all be s Properties | [Properties](#properties) | The type of a property shall be the return type of the getter and the type of the last argument of the setter. The types of the parameters of the property shall be the types of the parameters to the getter and the types of all but the final parameter of the setter. All of these types shall be CLS-compliant, and shall not be managed pointers (i.e., shall not be passed by reference). | 27 Properties | [Properties](#properties) | Properties shall adhere to a specific naming pattern. The `SpecialName` attribute referred to in CLS rule 24 shall be ignored in appropriate name comparisons and shall adhere to identifier rules. A property shall have a getter method, a setter method, or both. | 28 Type conversion | [Type conversion](#type-conversion) | If either op_Implicit or op_Explicit is provided, an alternate means of providing the coercion shall be provided. | 39 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | Boxed value types are not CLS-compliant. | 3 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | All types appearing in a signature shall be CLS-compliant. All types composing an instantiated generic type shall be CLS-compliant. | 11 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | Typed references are not CLS-compliant. | 14 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | Unmanaged pointer types are not CLS-compliant. 
| 17 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | CLS-compliant classes, value types, and interfaces shall not require the implementation of non-CLS-compliant members | 20 -Types | [Types and type member signatures](#Types-and-type-member-signatures) | [System.Object](xref:System.Object) is CLS-compliant. Any other CLS-compliant class shall inherit from a CLS-compliant class. | 23 +Types | [Types and type member signatures](#types-and-type-member-signatures) | Boxed value types are not CLS-compliant. | 3 +Types | [Types and type member signatures](#types-and-type-member-signatures) | All types appearing in a signature shall be CLS-compliant. All types composing an instantiated generic type shall be CLS-compliant. | 11 +Types | [Types and type member signatures](#types-and-type-member-signatures) | Typed references are not CLS-compliant. | 14 +Types | [Types and type member signatures](#types-and-type-member-signatures) | Unmanaged pointer types are not CLS-compliant. | 17 +Types | [Types and type member signatures](#types-and-type-member-signatures) | CLS-compliant classes, value types, and interfaces shall not require the implementation of non-CLS-compliant members | 20 +Types | [Types and type member signatures](#types-and-type-member-signatures) | [System.Object](xref:System.Object) is CLS-compliant. Any other CLS-compliant class shall inherit from a CLS-compliant class. | 23 ### Types and type member signatures @@ -1104,7 +1105,7 @@ CLS-compliant arrays conform to the following rules: Return numbersOut End Function End Module - ``` +``` ### Interfaces