Parse negative numbers in Norwegian (and 59 other languages)
The DecimalFormatSymbols for Norwegian and 59 other languages use the
minus-sign (unicode 8722) instead of the hyphen-minus sign (ascii 45).

While this is technically correct, Gherkin is written on regular keyboards,
where there is no practical way to type a minus sign. Patching the
`DecimalFormatSymbols` to use the hyphen-minus instead solves this problem.
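
The difference between the two characters can be sketched in isolation. The example below is only an illustration and is not part of the commit: the class `MinusSignSketch` and its methods are hypothetical, and the symbols are set by hand (comma for decimals, period for thousands) so that the result does not depend on the locale data shipped with a particular JDK.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of cucumber-expressions.
class MinusSignSketch {

    // Builds a format with explicitly chosen symbols; only the minus sign varies.
    static DecimalFormat formatWithMinusSign(char minusSign) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(',');
        symbols.setGroupingSeparator('.');
        symbols.setMinusSign(minusSign);
        DecimalFormat format = new DecimalFormat("#,##0.###", symbols);
        format.setParseBigDecimal(true);
        return format;
    }

    // Returns the parsed number, or empty if the text is rejected.
    static Optional<BigDecimal> tryParse(DecimalFormat format, String text) {
        try {
            return Optional.of((BigDecimal) format.parse(text));
        } catch (ParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        DecimalFormat unpatched = formatWithMinusSign('\u2212'); // minus sign (8722)
        DecimalFormat patched = formatWithMinusSign('-');        // hyphen-minus (45)
        String typed = "-1.042,2"; // written with the hyphen-minus of a regular keyboard

        System.out.println("minus sign symbol:   " + tryParse(unpatched, typed));
        System.out.println("hyphen-minus symbol: " + tryParse(patched, typed));
    }
}
```

With the hyphen-minus as the configured symbol, the typed text parses as a negative number. How the format with the minus sign symbol treats the same text may vary between JDK versions, so no output is claimed for it here.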

Additionally, for the same reason, the non-breaking space (unicode 160)
and right single quotation mark (unicode 8217) used as thousands separators
are also replaced with either a period or a comma.

Fixes: #287
mpkorstanje committed Mar 21, 2024
1 parent b3f0892 commit ca1ecd2
Showing 8 changed files with 220 additions and 15 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -6,7 +6,11 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## [Unreleased]
### Added
- [Java] Assume numbers use either a comma or period for the thousands separator instead of non-breaking spaces. ([#290](https://github.com/cucumber/cucumber-expressions/pull/290))

### Fixed
- [Java] Parse negative numbers in Norwegian (and 59 other languages) ([#290](https://github.com/cucumber/cucumber-expressions/pull/290))
- [Python] Remove support for Python 3.7 and extend support to 3.12 ([#280](https://github.com/cucumber/cucumber-expressions/pull/280))
- [Python] The `ParameterType` constructor's `transformer` should be optional ([#288](https://github.com/cucumber/cucumber-expressions/pull/288))

25 changes: 23 additions & 2 deletions README.md
@@ -65,14 +65,35 @@ the following built-in parameter types:
| `{short}` | Matches the same as `{int}`, but converts to a 16 bit signed integer if the platform supports it. |
| `{long}` | Matches the same as `{int}`, but converts to a 64 bit signed integer if the platform supports it. |

### Cucumber-JVM
### Java

### The Anonymous Parameter

The *anonymous* parameter type will be converted to the parameter type of the step definition using an object mapper.
Cucumber comes with a built-in object mapper that can handle all numeric types as well as `Enum`.

To automatically convert to other types it is recommended to install an object mapper. See [configuration](https://cucumber.io/docs/cucumber/configuration)
To automatically convert to other types it is recommended to install an object mapper. See [cucumber-java - Default Transformers](https://github.com/cucumber/cucumber-jvm/tree/main/cucumber-java#default-transformers)
to learn how.

### Number formats

Java supports parsing localised numbers. For example, in an English feature file
you can format a thousand and one tenth as '1,000.1', while in French you would
format it as '1.000,1'.

Parsing is facilitated by Java's [`DecimalFormat`](https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/DecimalFormat.html)
and includes support for scientific notation. Unfortunately, the default
localisations include symbols that cannot easily be written on a regular
keyboard, so a few substitutions are made:

* The minus sign is always hyphen-minus - (ascii 45).
* If the decimal separator is a period (. ascii 46) the thousands separator is a comma (, ascii 44).
So '1 000.1' and '1’000.1' should always be written as '1,000.1'.
* If the decimal separator is a comma (, ascii 44) the thousands separator is a period (. ascii 46).
So '1 000,1' or '1’000,1' should always be written as '1.000,1'.
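
The substitutions above can be sketched with the plain JDK. This is a minimal illustration, not the library's implementation: the class `KeyboardFriendlyParsingSketch` and its methods are hypothetical names, and it assumes that `NumberFormat.getNumberInstance` returns a `DecimalFormat`, which is the usual case on standard JDKs.

```java
import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper, not part of cucumber-expressions.
class KeyboardFriendlyParsingSketch {

    // Applies the three substitutions listed above to a locale's symbols.
    static DecimalFormatSymbols keyboardFriendlySymbols(Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        symbols.setMinusSign('-');
        if (symbols.getDecimalSeparator() == '.') {
            symbols.setGroupingSeparator(',');
        } else if (symbols.getDecimalSeparator() == ',') {
            symbols.setGroupingSeparator('.');
        }
        return symbols;
    }

    // Parses text as a BigDecimal using the substituted symbols.
    static BigDecimal parse(Locale locale, String text) {
        NumberFormat numberFormat = NumberFormat.getNumberInstance(locale);
        if (!(numberFormat instanceof DecimalFormat)) {
            throw new IllegalStateException("Expected a DecimalFormat for " + locale);
        }
        DecimalFormat decimalFormat = (DecimalFormat) numberFormat;
        decimalFormat.setParseBigDecimal(true);
        decimalFormat.setDecimalFormatSymbols(keyboardFriendlySymbols(locale));
        try {
            return (BigDecimal) decimalFormat.parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Could not parse " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(Locale.ENGLISH, "1,000.1"));  // comma for thousands
        System.out.println(parse(Locale.FRENCH, "1.000,1"));   // period for thousands
        System.out.println(parse(Locale.GERMAN, "-1.042,2"));  // hyphen-minus
        System.out.println(parse(Locale.ENGLISH, "1.00E2"));   // scientific notation
    }
}
```

The sketch replaces the minus sign unconditionally, which matches the first rule as written; an implementation may prefer to replace it only when the locale actually uses a different character.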

If support for your preferred language could be improved, please create an issue!

### Custom Parameter types

Cucumber Expressions can be extended so they automatically convert
@@ -0,0 +1,34 @@
package io.cucumber.cucumberexpressions;

import java.text.DecimalFormatSymbols;
import java.util.Locale;

/**
 * A set of localized decimal symbols that can be written on a regular keyboard.
 * <p>
 * Not quite complete, feel free to make a suggestion.
 */
class KeyboardFriendlyDecimalFormatSymbols {

    static DecimalFormatSymbols getInstance(Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);

        // Replace the minus sign with the hyphen-minus available on most keyboards.
        if (symbols.getMinusSign() == '\u2212') {
            symbols.setMinusSign('-');
        }

        if (symbols.getDecimalSeparator() == '.') {
            // For locales that use the period as the decimal separator
            // always use the comma for thousands. The alternatives are
            // not available on a keyboard.
            symbols.setGroupingSeparator(',');
        } else if (symbols.getDecimalSeparator() == ',') {
            // For locales that use the comma as the decimal separator
            // always use the period for thousands. The alternatives are
            // not available on a keyboard.
            symbols.setGroupingSeparator('.');
        }
        return symbols;
    }
}
@@ -2,6 +2,7 @@

import java.math.BigDecimal;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;
@@ -14,6 +15,8 @@ final class NumberParser {
if (numberFormat instanceof DecimalFormat) {
DecimalFormat decimalFormat = (DecimalFormat) numberFormat;
decimalFormat.setParseBigDecimal(true);
DecimalFormatSymbols symbols = KeyboardFriendlyDecimalFormatSymbols.getInstance(locale);
decimalFormat.setDecimalFormatSymbols(symbols);
}
}

@@ -58,7 +58,7 @@ private ParameterTypeRegistry(ParameterByTypeTransformer defaultParameterTransfo
this.internalParameterTransformer = defaultParameterTransformer;
this.defaultParameterTransformer = defaultParameterTransformer;

DecimalFormatSymbols numberFormat = DecimalFormatSymbols.getInstance(locale);
DecimalFormatSymbols numberFormat = KeyboardFriendlyDecimalFormatSymbols.getInstance(locale);

List<String> localizedFloatRegexp = singletonList(FLOAT_REGEXPS
.replace("{decimal}", "" + numberFormat.getDecimalSeparator())
@@ -0,0 +1,98 @@
package io.cucumber.cucumberexpressions;

import org.junit.jupiter.api.Test;

import java.text.DecimalFormatSymbols;
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Stream;

import static java.util.Comparator.comparing;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

class KeyboardFriendlyDecimalFormatSymbolsTest {

@Test
void listMinusSigns(){
System.out.println("Original minus signs:");
listMinusSigns(DecimalFormatSymbols::getInstance);
System.out.println();
System.out.println("Friendly minus signs:");
listMinusSigns(KeyboardFriendlyDecimalFormatSymbols::getInstance);
System.out.println();
}

private static void listMinusSigns(Function<Locale, DecimalFormatSymbols> supplier) {
getAvailableLocalesAsStream()
.collect(groupingBy(locale -> supplier.apply(locale).getMinusSign()))
.forEach((c, locales) -> System.out.println(render(c) + " " + render(locales)));
}

@Test
void listDecimalAndGroupingSeparators(){
System.out.println("Original decimal and group separators:");
listDecimalAndGroupingSeparators(DecimalFormatSymbols::getInstance);
System.out.println();
System.out.println("Friendly decimal and group separators:");
listDecimalAndGroupingSeparators(KeyboardFriendlyDecimalFormatSymbols::getInstance);
System.out.println();
}

private static void listDecimalAndGroupingSeparators(Function<Locale, DecimalFormatSymbols> supplier) {
getAvailableLocalesAsStream()
.collect(groupingBy(locale -> {
DecimalFormatSymbols symbols = supplier.apply(locale);
return new SimpleEntry<>(symbols.getDecimalSeparator(), symbols.getGroupingSeparator());
}))
.entrySet()
.stream()
.sorted(comparing(entry -> entry.getKey().getKey()))
.forEach((entry) -> {
SimpleEntry<Character, Character> characters = entry.getKey();
List<Locale> locales = entry.getValue();
System.out.println(render(characters.getKey()) + " " + render(characters.getValue()) + " " + render(locales));
});
}

@Test
void listExponentSigns(){
System.out.println("Original exponent signs:");
listExponentSigns(DecimalFormatSymbols::getInstance);
System.out.println();
System.out.println("Friendly exponent signs:");
listExponentSigns(KeyboardFriendlyDecimalFormatSymbols::getInstance);
System.out.println();
}

private static void listExponentSigns(Function<Locale, DecimalFormatSymbols> supplier) {
getAvailableLocalesAsStream()
.collect(groupingBy(locale -> supplier.apply(locale).getExponentSeparator()))
.forEach((s, locales) -> {
if (s.length() == 1) {
System.out.println(render(s.charAt(0)) + " " + render(locales));
} else {
System.out.println(s + " " + render(locales));
}
});
}

private static Stream<Locale> getAvailableLocalesAsStream() {
return Arrays.stream(DecimalFormatSymbols.getAvailableLocales());
}

private static String render(Character character) {
return character + " (" + (int) character + ")";
}

private static String render(List<Locale> locales) {
return locales.size() + ": " + locales.stream()
.sorted(comparing(Locale::getDisplayName))
.map(Locale::getDisplayName)
.collect(toList());
}

}
@@ -5,36 +5,70 @@
import java.math.BigDecimal;
import java.util.Locale;

import static java.util.Locale.forLanguageTag;
import static org.junit.jupiter.api.Assertions.assertEquals;

public class NumberParserTest {
class NumberParserTest {

private final NumberParser english = new NumberParser(Locale.ENGLISH);
private final NumberParser german = new NumberParser(Locale.GERMAN);
private final NumberParser canadianFrench = new NumberParser(Locale.CANADA_FRENCH);
private final NumberParser norwegian = new NumberParser(forLanguageTag("no"));
private final NumberParser canadian = new NumberParser(Locale.CANADA);

@Test
public void can_parse_float() {
void can_parse_float() {
assertEquals(1042.2f, english.parseFloat("1,042.2"), 0);
assertEquals(1042.2f, german.parseFloat( "1.042,2"), 0);
assertEquals(1042.2f, canadianFrench.parseFloat( "1\u00A0042,2"), 0);
assertEquals(1042.2f, canadian.parseFloat("1,042.2"), 0);

assertEquals(1042.2f, german.parseFloat("1.042,2"), 0);
assertEquals(1042.2f, canadianFrench.parseFloat("1.042,2"), 0);
assertEquals(1042.2f, norwegian.parseFloat("1.042,2"), 0);
}

@Test
public void can_parse_double() {
void can_parse_double() {
assertEquals(1042.000000000000002, english.parseDouble("1,042.000000000000002"), 0);
assertEquals(1042.000000000000002, german.parseDouble( "1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadianFrench.parseDouble( "1\u00A0042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadian.parseDouble("1,042.000000000000002"), 0);

assertEquals(1042.000000000000002, german.parseDouble("1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, canadianFrench.parseDouble("1.042,000000000000002"), 0);
assertEquals(1042.000000000000002, norwegian.parseDouble("1.042,000000000000002"), 0);
}

@Test
public void can_parse_big_decimals() {
void can_parse_big_decimals() {
assertEquals(new BigDecimal("1042.0000000000000000000002"), english.parseBigDecimal("1,042.0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), german.parseBigDecimal( "1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadianFrench.parseBigDecimal( "1\u00A0042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadian.parseBigDecimal("1,042.0000000000000000000002"));

assertEquals(new BigDecimal("1042.0000000000000000000002"), german.parseBigDecimal("1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), canadianFrench.parseBigDecimal("1.042,0000000000000000000002"));
assertEquals(new BigDecimal("1042.0000000000000000000002"), norwegian.parseBigDecimal("1.042,0000000000000000000002"));
}

@Test
void can_parse_negative() {
assertEquals(-1042.2f, english.parseFloat("-1,042.2"), 0);
assertEquals(-1042.2f, canadian.parseFloat("-1,042.2"), 0);

assertEquals(-1042.2f, german.parseFloat("-1.042,2"), 0);
assertEquals(-1042.2f, canadianFrench.parseFloat("-1.042,2"), 0);
assertEquals(-1042.2f, norwegian.parseFloat("-1.042,2"), 0);
}

@Test
void can_parse_exponents() {
assertEquals(new BigDecimal("100"), english.parseBigDecimal("1.00E2"));
assertEquals(new BigDecimal("100"), canadian.parseBigDecimal("1.00e2"));
assertEquals(new BigDecimal("100"), german.parseBigDecimal("1,00E2"));
assertEquals(new BigDecimal("100"), canadianFrench.parseBigDecimal("1,00E2"));
assertEquals(new BigDecimal("100"), norwegian.parseBigDecimal("1,00E2"));

assertEquals(new BigDecimal("0.01"), english.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), canadian.parseBigDecimal("1e-2"));
assertEquals(new BigDecimal("0.01"), german.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), canadianFrench.parseBigDecimal("1E-2"));
assertEquals(new BigDecimal("0.01"), norwegian.parseBigDecimal("1E-2"));
}

}
@@ -171,8 +171,19 @@ public void parse_decimal_numbers_in_canadian_french() {
ExpressionFactory factory = new ExpressionFactory(new ParameterTypeRegistry(Locale.CANADA_FRENCH));
Expression expression = factory.createExpression("{bigdecimal}");

assertThat(expression.match("1\u00A0000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1\u00A0000\u00A0000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("1.000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1.000.000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("-1,1").get(0).getValue(), is(new BigDecimal("-1.1")));
assertThat(expression.match("-,1E1").get(0).getValue(), is(new BigDecimal("-1")));
}

@Test
public void parse_decimal_numbers_in_norwegian() {
ExpressionFactory factory = new ExpressionFactory(new ParameterTypeRegistry(Locale.forLanguageTag("no")));
Expression expression = factory.createExpression("{bigdecimal}");

assertThat(expression.match("1.000,1").get(0).getValue(), is(new BigDecimal("1000.1")));
assertThat(expression.match("1.000.000,1").get(0).getValue(), is(new BigDecimal("1000000.1")));
assertThat(expression.match("-1,1").get(0).getValue(), is(new BigDecimal("-1.1")));
assertThat(expression.match("-,1E1").get(0).getValue(), is(new BigDecimal("-1")));
}