mdn · yin1999 · May 11, 2024 · Mar 3, 2024 · Mar 3, 2024 · Mar 3, 2024
@@ -1,6 +1,8 @@
 ---
 title: Quantifiers
 slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
+l10n:
+  sourceCommit: 95a838d5d8e0e40aaa15897d23de476efade14b1
 ---
 
 {{jsSidebar("JavaScript Guide")}}
@@ -11,6 +13,8 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
 
 ## 类型
 
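+> **Note:** In the following, *item* refers not only to single characters, but also includes [character classes](/zh-CN/docs/Web/JavaScript/Guide/Regular_expressions/Character_classes) and [groups and backreferences](/zh-CN/docs/Web/JavaScript/Guide/Regular_expressions/Groups_and_backreferences).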
+
 <table class="standard-table">
   <thead>
     <tr>
@@ -25,7 +29,7 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
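-          Matches the preceding item "x" 0 or more times. For example, /bo*/ matches "boooo" in "A ghost
+          Matches the preceding item "x" 0 or more times. For example, <code>/bo*/</code> matches "boooo" in "A ghost
           booooed" and "b" in "A bird warbled", but nothing in "A goat
           grunt".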
         </p>
@@ -48,12 +52,10 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
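-          Matches the preceding item "x" 0 or 1 times. For example, /e?le?/ matches the el in angel and the le
-          in angle.
+          Matches the preceding item "x" 0 or 1 times. For example, <code>/e?le?/</code> matches "el" in "angel" and "le" in "angle".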
         </p>
         <p>
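-          If used immediately after any of the quantifiers *, +, ?, or {}, makes the quantifier non-greedy
-          (matching the minimum number of times), as opposed to the default, which is greedy (matching the maximum number of times).
+          If used immediately after any of the quantifiers <code>*</code>, <code>+</code>, <code>?</code>, or <code>{}</code>, makes the quantifier non-greedy (matching the minimum number of times), as opposed to the default, which is greedy (matching the maximum number of times).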
         </p>
       </td>
     </tr>
@@ -63,8 +65,8 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
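-          Where "n" is a positive integer, matches exactly n occurrences of the preceding item "x". For example, <code>/a{2}/ </code
-          >doesn't match the "a" in "candy", but it matches all of the "a"s in "caandy", and the first two "a"s in "caaandy".
+          Where "n" is a non-negative integer, matches exactly "n" occurrences of the preceding item "x". For example, <code>/a{2}/</code
+          > doesn't match the "a" in "candy", but it matches all of the "a"s in "caandy", and the first two "a"s in "caaandy".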
         </p>
       </td>
     </tr>
@@ -74,7 +76,7 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
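-          Where "n" is a positive integer, matches at least "n" occurrences of the preceding item "x". For example, <code>/a{2，}/</code>doesn't match the "a" in "candy", but matches all of the
+          Where "n" is a non-negative integer, matches at least "n" occurrences of the preceding item "x". For example, <code>/a{2,}/</code> doesn't match the "a" in "candy", but matches all of the
           "a"s in "caandy" and in "caaaaaaandy".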
         </p>
       </td>
@@ -85,8 +87,7 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
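-          Where "n" is 0 or a positive integer, "m" is a positive integer, and m > n,
-          matches at least "n" and at most "m" occurrences of the preceding item "x". For example, /a{1,3}/ matches nothing in "cndy", the "a" in "candy", the two "a"s in "caandy", and the first three "a"s in "caaaaaaandy". Notice that when matching "caaaaaaandy", the match is "aaa", even though the original string had more "a"s in it.
+          Where "n" and "m" are non-negative integers and <code><em>m</em> >= <em>n</em></code>, matches at least "n" and at most "m" occurrences of the preceding item "x". For example, <code>/a{1,3}/</code> matches nothing in "cndy", the "a" in "candy", the two "a"s in "caandy", and the first three "a"s in "caaaaaandy". Notice that when matching "caaaaaandy", the match is "aaa", even though the original string had more "a"s in it.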
         </p>
       </td>
     </tr>
@@ -101,10 +102,7 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
       </td>
       <td>
         <p>
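-          By default quantifiers like <code>* </code>and
-          <code>+ </code
-          >are "greedy", meaning that they try to match as much of the string as possible. The `?` character after the quantifier makes the quantifier "non-greedy": meaning that it will stop as soon as it finds a match. For example, given a string like "some
-          &#x3C;foo> &#x3C;bar> new &#x3C;/bar> &#x3C;/foo> thing":
+          By default, quantifiers like <code>*</code> and <code>+</code> are "greedy", meaning that they try to match as much of the string as possible. The <code>?</code> character after the quantifier makes the quantifier "non-greedy": it stops as soon as it finds a match. For example, given a string like "some &#x3C;foo> &#x3C;bar> new &#x3C;/bar> &#x3C;/foo> thing":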
         </p>
         <ul>
           <li>
@@ -123,37 +121,37 @@ slug: Web/JavaScript/Guide/Regular_expressions/Quantifiers
 ### Repeated pattern
 
 ```js
-var wordEndingWithAs = /\w+a+/;
-var delicateMessage = "This is Spartaaaaaaa";
+const wordEndingWithAs = /\w+a+\b/;
+const delicateMessage = "This is Spartaaaaaaa";
 
 console.table(delicateMessage.match(wordEndingWithAs)); // [ "Spartaaaaaaa" ]
 ```
 
-### Counting characters
+### Counting words
 
 ```js
-var singleLetterWord = /\b\w\b/g;
-var notSoLongWord = /\b\w{1,6}\b/g;
-var loooongWord = /\b\w{13,}\b/g;
+const singleLetterWord = /\b\w\b/g;
+const notSoLongWord = /\b\w{2,6}\b/g;
+const longWord = /\b\w{13,}\b/g;
 
-var sentence = "Why do I have to learn multiplication table?";
+const sentence = "Why do I have to learn multiplication table?";
 
 console.table(sentence.match(singleLetterWord)); // ["I"]
-console.table(sentence.match(notSoLongWord)); // [ "Why", "do", "I", "have", "to", "learn", "table" ]
-console.table(sentence.match(loooongWord)); // ["multiplication"] 可选可选字符
+console.table(sentence.match(notSoLongWord)); // [ "Why", "do", "have", "to", "learn", "table" ]
+console.table(sentence.match(longWord)); // ["multiplication"]
 ```
 
 ### Optional character
 
 ```js
-var britishText = "He asked his neighbour a favour.";
-var americanText = "He asked his neighbor a favor.";
+const britishText = "He asked his neighbour a favour.";
+const americanText = "He asked his neighbor a favor.";
 
-var regexpEnding = /\w+ou?r/g;
-// \w+ One or several letters
-// o   followed by an "o",
-// u?  optionally followed by a "u"
-// r   followed by an "r"
+const regexpEnding = /\w+ou?r/g;
+// \w+ One or more word characters
+// o   followed by an "o",
+// u?  optionally followed by a "u"
+// r   followed by an "r"
 
 console.table(britishText.match(regexpEnding));
 // ["neighbour", "favour"]
@@ -165,19 +163,19 @@ console.table(americanText.match(regexpEnding));
 ### Greedy versus non-greedy
 
 ```js
-var text = "I must be getting somewhere near the centre of the earth.";
-var greedyRegexp = /[\w ]+/;
-// [\w ]      a letter of the latin alphabet or a whitespace
-//      +     one or several times
+const text = "I must be getting somewhere near the center of the earth.";
+const greedyRegexp = /[\w ]+/;
+// [\w ]      a word character or a space
+//      +     one or more times
 
 console.log(text.match(greedyRegexp)[0]);
-// "I must be getting somewhere near the centre of the earth"
-// almost all of the text matches (leaves out the dot character)
+// "I must be getting somewhere near the center of the earth"
+// almost all of the text matches (leaves out the dot character)
 
-var nonGreedyRegexp = /[\w ]+?/; // Notice the question mark
+const nonGreedyRegexp = /[\w ]+?/; // Notice the question mark
 console.log(text.match(nonGreedyRegexp));
 // "I"
-// The match is the smallest one possible
+// The match is the smallest one possible
 ```
 
 ## See also
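
The quantifier rows touched in the guide hunks above can be spot-checked with a short sketch. The `firstMatch` helper and the `cases` table are illustrative names rather than part of the page. The inputs and expected matches are the ones quoted in the table, with `x{n}` treated as an exact count and the long "caaa…ndy" string assumed to contain seven "a"s.

```js
// Returns the first match of `regex` in `input`, or null when nothing matches.
function firstMatch(regex, input) {
  const m = input.match(regex);
  return m ? m[0] : null;
}

// [regex, input, expected first match]
const cases = [
  [/bo*/, "A ghost booooed", "boooo"],
  [/bo*/, "A bird warbled", "b"],
  [/bo*/, "A goat grunt", null],
  [/e?le?/, "angel", "el"],
  [/e?le?/, "angle", "le"],
  [/a{2}/, "candy", null],
  [/a{2}/, "caaandy", "aa"], // exactly two, even though three are available
  [/a{2,}/, "c" + "a".repeat(7) + "ndy", "a".repeat(7)],
  [/a{1,3}/, "cndy", null],
  [/a{1,3}/, "c" + "a".repeat(7) + "ndy", "aaa"], // capped at three
];

for (const [regex, input, expected] of cases) {
  const actual = firstMatch(regex, input);
  console.log(String(regex), JSON.stringify(input), "->", JSON.stringify(actual), actual === expected ? "ok" : "MISMATCH");
}

// Greedy versus non-greedy on the string used in the `x*?` row.
const tagged = "some <foo> <bar> new </bar> </foo> thing";
console.log(firstMatch(/<.*>/, tagged)); // "<foo> <bar> new </bar> </foo>"
console.log(firstMatch(/<.*?>/, tagged)); // "<foo>"
```

Every row should print `ok`. A `MISMATCH` would point at a description in the table that disagrees with the engine.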

@@ -0,0 +1,182 @@
+---
+title: "Quantifier: *, +, ?, {n}, {n,}, {n,m}"
+slug: Web/JavaScript/Reference/Regular_expressions/Quantifier
+l10n:
+  sourceCommit: 4f86aad2b0b66c0d2041354ec81400c574ab56ca
+---
+
+{{jsSidebar}}
+
+A **quantifier** repeats an [atom](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions#原子) a certain number of times. The quantifier is placed after the atom it applies to.
+
+## Syntax
+
+```regex
+// Greedy
+atom?
+atom*
+atom+
+atom{count}
+atom{min,}
+atom{min,max}
+
+// Non-greedy
+atom??
+atom*?
+atom+?
+atom{count}?
+atom{min,}?
+atom{min,max}?
+```
+
+### Parameters
+
+- `atom`
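+  - : A single [atom](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions#原子).
+- `count`
+  - : A non-negative integer. The number of times the atom should be repeated.
+- `min`
+  - : A non-negative integer. The minimum number of times the atom can be repeated.
+- `max` {{optional_inline}}
+  - : A non-negative integer. The maximum number of times the atom can be repeated. If omitted, the atom can be repeated as many times as needed.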
+
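+## Description
+
+A quantifier is placed after an [atom](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions#原子) to repeat it a certain number of times. It cannot appear on its own. Each quantifier specifies a minimum and a maximum number of times that a pattern must be repeated.
+
+| Quantifier  | Minimum | Maximum  |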
+| ----------- | ------- | -------- |
+| `?`         | 0       | 1        |
+| `*`         | 0       | Infinity |
+| `+`         | 1       | Infinity |
+| `{count}`   | `count` | `count`  |
+| `{min,}`    | `min`   | Infinity |
+| `{min,max}` | `min`   | `max`    |
+
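+For the `{count}`, `{min,}`, and `{min,max}` syntaxes, there cannot be white space around the numbers. Otherwise, the whole sequence becomes a [literal character](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Literal_character) pattern.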
+
+```js example-bad
+const re = /a{1, 3}/;
+re.test("aa"); // false
+re.test("a{1, 3}"); // true
+```
+
+This behavior is fixed in [Unicode-aware mode](/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode#unicode_感知模式), where braces cannot appear literally without [escaping](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Character_escape). The ability to use `{` and `}` literally without escaping is a [deprecated syntax for web compatibility](/zh-CN/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it.
+
+```js-nolint example-bad
+/a{1, 3}/u; // SyntaxError: Invalid regular expression: Incomplete quantifier
+```
+
+It is a syntax error if the minimum is greater than the maximum.
+
+```js-nolint example-bad
+/a{3,2}/; // SyntaxError: Invalid regular expression: numbers out of order in {} quantifier
+```
+
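+Quantifiers can cause [capturing groups](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Capturing_group) to match multiple times. See the capturing groups page for more information on the behavior in this case.
+
+Each repetition does not have to match the same string.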
+
+```js
+/[ab]*/.exec("aba"); // ['aba']
+```
+
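+By default, quantifiers are *greedy*, which means they try to match as many times as possible until the maximum is reached or until it is not possible to match further. You can make a quantifier *non-greedy* by adding a `?` after it. In this case, the quantifier tries to match as few times as possible, and only matches more times if the rest of the pattern cannot be matched with the current number of repetitions.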
+
+```js
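+/a*/.exec("aaa"); // ['aaa']; the entire input is consumed
+/a*?/.exec("aaa"); // ['']; it's possible to consume no characters and still match successfully
+/^a*?$/.exec("aaa"); // ['aaa']; it's not possible to consume fewer characters and still match successfully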
+```
+
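+However, as soon as the regex successfully matches the string at some index, it does not try subsequent indices, even though starting later might consume fewer characters.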
+
+```js
+/a*?$/.exec("aaa"); // ['aaa']; the match already succeeds at the first character, so the regex never attempts to start matching at the second character
+```
+
+Greedy quantifiers may try fewer repetitions if it is otherwise impossible to match the rest of the pattern.
+
+```js
+/[ab]+[abc]c/.exec("abbc"); // ['abbc']
+```
+
+In this example, `[ab]+` first greedily matches `"abb"`, but then `[abc]c` cannot match what is left of the string (`"c"`), so the quantifier backs off and matches only `"ab"`.
+
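+Greedy quantifiers avoid matching infinitely many empty strings. If the minimum number of repetitions has been reached and the atom no longer consumes any characters at this position, the quantifier stops matching. This is why `/(a*)*/.exec("b")` does not result in an infinite loop.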
+
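+Greedy quantifiers try to match as many *times* as possible; they do not maximize the *length* of the match. For example, `/(aa|aabaac|ba)*/.exec("aabaac")` matches `"aa"` and then `"ba"` instead of `"aabaac"`.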
+
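+Quantifiers apply to a single atom. If you want to quantify a longer pattern or a disjunction, you must [group](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Non-capturing_group) it. Quantifiers cannot be applied to [assertions](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions#断言).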
+
+```js-nolint example-bad
+/^*/; // SyntaxError: Invalid regular expression: nothing to repeat
+```
+
+Outside [Unicode-aware mode](/zh-CN/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode#unicode_感知模式), [lookahead assertions](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Lookahead_assertion) can be quantified. This is a [deprecated syntax for web compatibility](/zh-CN/docs/Web/JavaScript/Reference/Deprecated_and_obsolete_features#regexp), and you should not rely on it.
+
+```js
+/(?=a)?b/.test("b"); // true; the lookahead is matched 0 times
+```
+
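+## Examples
+
+### Removing HTML tags
+
+The following example removes HTML tags enclosed in angle brackets. Note the use of `?` to avoid consuming too many characters at once.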
+
+```js
+function stripTags(str) {
+  return str.replace(/<.+?>/g, "");
+}
+
+stripTags("<p><em>lorem</em> <strong>ipsum</strong></p>"); // 'lorem ipsum'
+```
+
+The same effect can be achieved with a greedy match, as long as the repeated pattern is not allowed to match `>`.
+
+```js
+function stripTags(str) {
+  return str.replace(/<[^>]+>/g, "");
+}
+
+stripTags("<p><em>lorem</em> <strong>ipsum</strong></p>"); // 'lorem ipsum'
+```
+
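+> **Warning:** This is for demonstration only; it doesn't handle `>` in attribute values. Use a proper HTML sanitizer like the [HTML sanitizer API](/zh-CN/docs/Web/API/HTML_Sanitizer_API) instead.
+
+### Locating Markdown paragraphs
+
+In Markdown, paragraphs are separated by one or more blank lines. The following example counts all paragraphs in a string by matching two or more line breaks.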
+
+```js
+function countParagraphs(str) {
+  return str.match(/(?:\r?\n){2,}/g).length + 1;
+}
+
+countParagraphs(`
+Paragraph 1
+
+Paragraph 2
+Containing some line breaks, but still the same paragraph
+
+Another paragraph
+`); // 3
+```
+
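+> **Warning:** This is for demonstration only; it doesn't handle line breaks in code blocks or other Markdown block elements like headings. Use a proper Markdown parser instead.
+
+## Specifications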
+
+{{Specifications}}
+
+## Browser compatibility
+
+{{Compat}}
+
+## See also
+
+- [Quantifiers](/zh-CN/docs/Web/JavaScript/Guide/Regular_expressions/Quantifiers) guide
+- [Regular expressions](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions)
+- [Disjunction: `|`](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Disjunction)
+- [Character class: `[...]`, `[^...]`](/zh-CN/docs/Web/JavaScript/Reference/Regular_expressions/Character_class)
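
The behaviors described in the new reference page (literal braces, non-greedy matching, backtracking, repetition count versus match length, and quantified lookaheads) can be exercised together in one sketch. `isSyntaxError` is a hypothetical helper introduced only for this check, and the capture groups in the backtracking pattern are an assumption added to make the split between the two parts visible.

```js
// Hypothetical helper: true when the pattern/flags pair fails to compile.
function isSyntaxError(source, flags = "") {
  try {
    new RegExp(source, flags);
    return false;
  } catch (e) {
    return e instanceof SyntaxError;
  }
}

// White space inside the braces turns the would-be quantifier into literal text.
console.log(/a{1, 3}/.test("aa")); // false
console.log(/a{1, 3}/.test("a{1, 3}")); // true
console.log(isSyntaxError("a{1, 3}", "u")); // true: unescaped braces are rejected with the u flag

// Greedy versus non-greedy repetition.
console.log(JSON.stringify(/a*/.exec("aaa")[0])); // "aaa"
console.log(JSON.stringify(/a*?/.exec("aaa")[0])); // ""
console.log(JSON.stringify(/^a*?$/.exec("aaa")[0])); // "aaa"

// Backtracking: the greedy part gives back one character so the rest can match.
const [, repeated, single] = /([ab]+)([abc])c/.exec("abbc");
console.log(JSON.stringify([repeated, single])); // ["ab","b"]

// The number of repetitions is maximized, not the length of the match.
console.log(/(aa|aabaac|ba)*/.exec("aabaac")[0]); // "aaba"

// An atom that can match the empty string does not loop forever.
console.log(JSON.stringify(/(a*)*/.exec("b")[0])); // ""

// Quantified lookaheads compile only without the u flag.
console.log(isSyntaxError("(?=a)?b")); // false
console.log(isSyntaxError("(?=a)?b", "u")); // true
```

Building the patterns through `new RegExp` inside `isSyntaxError` keeps the invalid cases from being early errors, so the whole file still parses and the failures can be observed at run time.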