docs(formula): add formulas i18n,algorithm contributing guide
Dushusir committed Apr 9, 2024
1 parent c7594c2 commit 5b85403
Showing 2 changed files with 35 additions and 23 deletions.
22 changes: 14 additions & 8 deletions packages/sheets-formula/README-zh.md
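@@ -181,7 +181,7 @@ univerAPI.unregisterFunction({

Uniscript uses `@univerjs/facade` under the hood. You can also use a Uniscript-like API directly in your project; see [Register Function](/guides/facade/register-function).

## How to add formulas when initializing Univer
### How to add formulas when initializing Univer

Follow these steps to implement a custom formula `CUSTOMSUM`.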

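@@ -719,7 +719,7 @@ univer.registerPlugin(UniverSheetsCustomFunctionPlugin);
- Distill the `description` from the source content, because some Excel descriptions are long and need to be simplified.
- `abstract` and `links` generally need no changes.
- `aliasFunctionName` is optional; most formulas do not need it (it can also be set for specific countries only). No reference documentation for formula aliases has been found yet. A function translation add-in may offer similar information: [Excel Functions Translator](https://support.microsoft.com/zh-cn/office/excel-%E5%87%BD%E6%95%B0%E7%BF%BB%E8%AF%91%E5%B7%A5%E5%85%B7-f262d0c0-991c-485b-89b6-32cc8d326889)
- `functionParameter` needs a name for each parameter. We recommend choosing names based on the parameter's meaning: for a numeric parameter the `key` is `number` (when there is only one) or `number1`, `number2` (when there are several), a range is `range`, a condition is `criteria`, and the sum range is `sum_range` (words separated by `_`).
- `functionParameter` needs a name for each parameter. We recommend choosing names based on the parameter's meaning: for a numeric parameter the `key` is `number` (when there is only one) or `number1`, `number2` (when there are several), a range is `range`, a condition is `criteria`, and the sum range is `sumRange`, in camelCase. Within each parameter's content, the English `name` uses the underscore form `sum_range`, other languages use the translated text, and `detail` is always translated.
- The Chinese translations in the Office function documentation appear to be machine-translated. Some are hard to understand and need manual revision. Some specialized terms are listed below.
- `单元格参考` (cell reference) => `单元格引用`
- Numeric parameters are uniformly translated as `数值`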
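@@ -745,11 +745,11 @@ univer.registerPlugin(UniverSheetsCustomFunctionPlugin);

Location: [packages/engine-formula/src/functions/math/sumif/index.ts](https://github.com/dream-num/univer/blob/dev/packages/engine-formula/src/functions/math/sumif/index.ts).

Create a new formula folder under the current formula's category folder, one folder per formula. Then create an `index.ts` file for the formula algorithm. The formula `class` name uses upper camel case: a formula is treated as one word, and a formula containing `_` or `.` is treated as two words, for example:
Create a new formula folder under the current formula's category folder. The folder name matches the formula name in kebab-case, one folder per formula. Then create an `index.ts` file for the formula algorithm. The formula `class` name uses PascalCase (also known as upper camel case): a formula is treated as one word, and a formula containing `_` or `.` is treated as two words, for example:

- `SUMIF` => `Sumif`
- `NETWORKDAYS.INTL` => `Networkdays_Intl`
- `ARRAY_CONSTRAIN` => `Array_Constrain`
- `SUMIF` => folder `sumif`, class `Sumif`
- `NETWORKDAYS.INTL` => folder `networkdays-intl`, class `NetworkdaysIntl`
- `ARRAY_CONSTRAIN` => folder `array-constrain`, class `ArrayConstrain`

Create a `__tests__` folder at the same level for unit tests. When you are done, remember to add the mapping between the formula algorithm and the function name to the `function-map` file in the category directory, which registers the algorithm.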

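@@ -776,7 +776,8 @@ univer.registerPlugin(UniverSheetsCustomFunctionPlugin);

### Considerations for Formula Implementation

- The inputs and outputs of any formula can be `A1`, `A1:B10`, and so on. When researching Excel, consider every case; for example, `=SIN(A1:B10)` expands into a range of computed sines.
- For most formula rules, refer to the latest version of Excel; where something seems unreasonable, refer to Google Sheets.
- The inputs and outputs of any formula can be `A1`, `A1:B10`, and so on, and cell content may be a number, string, boolean, empty cell, error value, array, etc. Although formula tutorials describe fixed data types, the implementation must handle all of them. When researching Excel, consider every case; for example, `=SIN(A1:B10)` expands into a range of computed sines.

- For example, the `XLOOKUP` function requires that its two inputs match in size along at least one of rows or columns so that the matrix calculation can proceed.
- For example, the `SUMIF` function is commonly thought of as a plain sum, but it can expand according to its second argument.
@@ -788,8 +789,13 @@ univer.registerPlugin(UniverSheetsCustomFunctionPlugin);
- For numerical calculations in formulas, use the built-in methods and avoid extracting values to compute them yourself, because formula arguments can be values, arrays, or references. See the existing `sum` and `minus` functions.
- Precision: the formula engine uses `big.js`, and the built-in methods call it, but it is nearly 100 times slower than native arithmetic, so for `js` methods such as `sin`, prefer the native implementation.
- For custom calculations, use the `product` function, which suits calculations over two arguments. Call `map` to iterate over a value's own elements, which suits transforming a single argument.
- A formula algorithm supports two configuration options, `needsExpandParams` and `needsReferenceObject`:
- `needsExpandParams`: whether the function needs its arguments expanded. This mainly covers cases like the `LOOKUP` function, which has to handle vectors of different sizes.
- `needsReferenceObject`: whether the function needs the reference object passed in. When set, a `BaseReferenceObject` is not converted into an `ArrayValueObject` but is passed directly to the formula algorithm, as in the `OFFSET` function.
- Formula calculation errors return fixed error types such as `#NAME?` and `#VALUE!`. These must match Excel, because functions such as `ISERR` and `ISNA` check the error type; if the type is wrong, the result may differ.
- In a formula algorithm, even a required argument must be checked for `null`, returning the `#N/A` error, because the user may enter no arguments at all. Excel blocks such input, while Google Sheets returns `#N/A`; we follow Google Sheets.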
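
To illustrate why the exact error type matters, here is a minimal sketch of the Excel semantics of `ISNA` and `ISERR`: `ISNA` is true only for `#N/A`, while `ISERR` is true for every error except `#N/A`. The helper names and the plain-string error model are assumptions made for this example; they are not Univer's error objects.

```typescript
// Hypothetical, simplified model: errors are represented as plain strings.
// Univer's real implementation uses its own error value objects.
const ERROR_CODES = ['#N/A', '#VALUE!', '#NAME?', '#REF!', '#DIV/0!', '#NUM!', '#NULL!'] as const;
type ErrorCode = (typeof ERROR_CODES)[number];

function isErrorCode(value: unknown): value is ErrorCode {
    return typeof value === 'string' && (ERROR_CODES as readonly string[]).indexOf(value) !== -1;
}

// ISNA: true only when the value is #N/A.
function isna(value: unknown): boolean {
    return value === '#N/A';
}

// ISERR: true for any error value except #N/A.
function iserr(value: unknown): boolean {
    return isErrorCode(value) && value !== '#N/A';
}

console.log(iserr('#VALUE!'), isna('#VALUE!')); // true false
console.log(iserr('#N/A'), isna('#N/A')); // false true
```

If a formula returned `#VALUE!` where Excel returns `#N/A`, a downstream `ISNA` check would flip from true to false, so the error type is part of the formula's observable behavior.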

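#### Formula Basic Tools
### Formula Basic Tools

1. `ValueObjectFactory` automatically recognizes the argument format and creates an argument instance. For range-type data, use `RangeReferenceObject` to create the instance.
2. The array `toArrayValueObject` can be operated on directly with values to produce a new array.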
36 changes: 21 additions & 15 deletions packages/sheets-formula/README.md
@@ -719,7 +719,7 @@ To implement a formula, you need to add formula description, internationalizatio
- Extract the `description` from the content, as some Excel descriptions are lengthy and need simplification.
- `abstract` and `links` generally do not need modification.
- `aliasFunctionName` is optional; most formulas do not need it (it can also be set for specific countries only). No reference documentation for formula aliases has been found yet. The [Excel Functions Translator](https://support.microsoft.com/en-us/office/excel-functions-translator-f262d0c0-991c-485b-89b6-32cc8d326889) add-in may offer similar information.
- `functionParameter` needs a name for each parameter. We recommend varying names based on the parameter's meaning, e.g., use `number` for a numeric parameter (if there is only one) or `number1`, `number2` for multiple numeric parameters. Use `range` for a range, `criteria` for conditions, and `sum_range` for the sum range (separated by `_` for multiple words).
- `functionParameter` needs a name for each parameter. We recommend choosing names based on the parameter's meaning, e.g., use `number` for a numeric parameter (if there is only one) or `number1`, `number2` for multiple numeric parameters. Use `range` for a range, `criteria` for a condition, and `sumRange` for the sum range; keys use `camelCase`. Within each parameter's content, the English `name` uses the underscore form `sum_range`, other languages use the translated text, and `detail` is always translated.
- Some Chinese translations in the Office function documentation are machine-translated and may be unclear. Modify as needed. For example, `单元格参考` (Cell Reference) should be translated as `单元格引用`. Numeric type parameters are uniformly translated as: `数值`.
- Do not end `abstract` with a period (used in the search list when users input cells), but end `description` and `detail` with a period (used in descriptions).
- Capitalize the first letter of English sentences.
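
The relationship between a camelCase parameter key and its English display `name` can be sketched as follows. `paramKeyToEnglishName` is a hypothetical helper written for this guide, not something Univer ships; in practice the names are written by hand in the locale files.

```typescript
// Hypothetical helper (not part of Univer): derives the English display name
// from a camelCase parameter key, e.g. the key `sumRange` becomes `sum_range`.
function paramKeyToEnglishName(key: string): string {
    // Insert "_" before each uppercase letter, then lowercase everything.
    return key.replace(/([A-Z])/g, '_$1').toLowerCase();
}

console.log(paramKeyToEnglishName('sumRange')); // sum_range
console.log(paramKeyToEnglishName('number1')); // number1
```

Single-word keys such as `range` and `criteria` are unchanged by the conversion, which matches the examples in the list above.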
@@ -743,11 +743,11 @@ To implement a formula, you need to add formula description, internationalizatio

Location: [packages/engine-formula/src/functions/math/sumif/index.ts](https://github.com/dream-num/univer/blob/dev/packages/engine-formula/src/functions/math/sumif/index.ts).

Create a new folder for the formula under the current formula category, with one folder per formula. Then create an `index.ts` file to write the formula algorithm. Use camel case for the formula `class` name, considering the formula as one word. If a formula contains `_` or `.`, treat it as two words, such as:
Create a new formula folder under the category folder of the current formula. The folder name matches the formula name in `kebab-case`, with one folder per formula. Then create an `index.ts` file to write the formula algorithm. The formula `class` name uses `PascalCase`: a formula is treated as one word, and a formula containing `_` or `.` is treated as two words, for example:

- `SUMIF` => `Sumif`
- `NETWORKDAYS.INTL` => `Networkdays_Intl`
- `ARRAY_CONSTRAIN` => `Array_Constrain`
- `SUMIF` => folder `sumif`, class `Sumif`
- `NETWORKDAYS.INTL` => folder `networkdays-intl`, class `NetworkdaysIntl`
- `ARRAY_CONSTRAIN` => folder `array-constrain`, class `ArrayConstrain`
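
The naming rule above can be expressed as a small function. This is only a sketch to make the rule concrete; `formulaNaming` is a hypothetical name, and the repository does not necessarily contain such a helper.

```typescript
// Hypothetical helper for illustration; Univer does not ship this function.
function formulaNaming(functionName: string): { folder: string; className: string } {
    // Split on "_" or "." so that each part is treated as one word.
    const words = functionName
        .toLowerCase()
        .split(/[_.]/)
        .filter((word) => word.length > 0);

    // Folder: kebab-case. Class: PascalCase.
    const folder = words.join('-');
    const className = words.map((word) => word.charAt(0).toUpperCase() + word.slice(1)).join('');
    return { folder, className };
}

console.log(formulaNaming('SUMIF')); // { folder: 'sumif', className: 'Sumif' }
console.log(formulaNaming('NETWORKDAYS.INTL')); // { folder: 'networkdays-intl', className: 'NetworkdaysIntl' }
console.log(formulaNaming('ARRAY_CONSTRAIN')); // { folder: 'array-constrain', className: 'ArrayConstrain' }
```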

Create a `__tests__` folder at the same level to write unit tests. After writing, remember to add the formula algorithm and function name mapping in the `function-map` file in the category directory to register the formula algorithm.

@@ -772,20 +772,26 @@ To implement a formula, you need to add formula description, internationalizatio
- After selecting `SUMIF` or entering `=sumif(`, trigger the formula details popup and carefully check the contents.
- Select the data range, trigger the calculation, and check if the formula calculation result is correct.

#### Considerations for Formula Implementation
### Considerations for Formula Implementation

- Any formula's input and output can be `A1`, `A1:B10`, etc. When researching Excel, consider all cases, such as `=SIN(A1:B10)`, which expands to the calculated range.
- For example, the `XLOOKUP` function requires at least one of the rows or columns of its two inputs to be of equal size for matrix calculation.
- For example, the `SUMIF` function, although commonly used for summation, can expand based on the second parameter.
- For most formula rules, refer to the latest version of Excel; where Excel's behavior seems unreasonable, refer to Google Sheets.
- The input and output parameters of any formula can be `A1`, `A1:B10`, etc., and the cell content may be a number, string, Boolean value, empty cell, error value, array, and so on. Although formula tutorials describe fixed data types for each parameter, the implementation must handle all of them. When researching Excel, consider all cases, such as `=SIN(A1:B10)`, which expands to the calculated range.
- For example, the `XLOOKUP` function requires at least one of the rows or columns of its two inputs to be of equal size for matrix calculation.
- For example, the `SUMIF` function, although commonly used for summation, can expand based on the second parameter.
![sumif array](./assets/sumif-array.png)
![sumif array result](./assets/sumif-array-result.png)
- Excel formula calculation is becoming more like numpy, for example:
- Excel formula calculation is becoming more like numpy, for example:
![numpy](./assets/numpy.png)
- For numerical calculations in formulas, use built-in methods and try to avoid obtaining values for manual calculation. Because formula parameters can be values, arrays, or references. You can refer to existing `sum` and `minus` functions.
- Precision issues: The formula introduces `big.js`, and using built-in methods will call this library. However, it is nearly 100 times slower than native calculations. Therefore, for methods like `sin`, it is advisable to use native implementations.
- For custom calculations, use the `product` function, suitable for calculating two input parameters. Call `map` to iterate over the values for changes to a parameter's own values.

#### Formula Basic Tools
- For numerical calculations in formulas, use the built-in methods and avoid extracting values to compute them manually, because formula parameters can be values, arrays, or references. See the existing `sum` and `minus` functions.
- Precision: the formula engine uses `big.js`, and the built-in methods call it. However, it is nearly 100 times slower than native calculation, so for methods like `sin`, prefer the native implementation.
- For custom calculations, use the `product` function, which suits calculations over two input parameters. Call `map` to iterate over a parameter's own values when only that parameter needs to change.
- A formula algorithm supports two configuration options, `needsExpandParams` and `needsReferenceObject`:
- `needsExpandParams`: whether the function needs its parameters expanded. This mainly covers cases like the `LOOKUP` function, which has to handle vectors of different sizes.
- `needsReferenceObject`: whether the function needs the reference object passed in. When set, a `BaseReferenceObject` is not converted into an `ArrayValueObject` but is passed directly to the formula algorithm, as in the `OFFSET` function.
- Formula calculation errors return fixed error types, such as `#NAME?` and `#VALUE!`. These must match Excel, because functions such as `ISERR` and `ISNA` check the error type; if the type is wrong, the result may differ.
- In a formula algorithm, even a required parameter must be checked for `null`, returning the `#N/A` error, because the user may not enter any parameters. Excel blocks this input, while Google Sheets returns `#N/A`; we follow Google Sheets.
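
Several of the points above (scalar or range input, mixed cell content, error propagation, and the missing-argument check) can be seen together in a small standalone sketch of a `SIN`-like function. Everything here is an assumption made for illustration: the types are simplified stand-ins rather than Univer's value objects, a missing argument is modeled as `undefined` while `null` models an empty cell, and the coercion rules are deliberately simplified.

```typescript
// Simplified stand-ins for illustration; these are NOT Univer's value objects.
interface FormulaError {
    kind: 'error';
    code: '#N/A' | '#VALUE!';
}
type Scalar = number | string | boolean | null | FormulaError; // null = empty cell
type Arg = Scalar | Scalar[][]; // a single value or a 2D range such as A1:B10

const err = (code: FormulaError['code']): FormulaError => ({ kind: 'error', code });

function isError(value: Scalar): value is FormulaError {
    return typeof value === 'object' && value !== null && value.kind === 'error';
}

// Applies SIN to one cell, handling every content type a cell may hold.
function sinScalar(value: Scalar): number | FormulaError {
    if (isError(value)) return value; // propagate an existing error unchanged
    if (value === null) return Math.sin(0); // empty cell is treated as 0
    if (typeof value === 'boolean') return Math.sin(value ? 1 : 0);
    if (typeof value === 'string') {
        const parsed = Number(value);
        // Blank or non-numeric text cannot be used as a number.
        return value.trim() === '' || Number.isNaN(parsed) ? err('#VALUE!') : Math.sin(parsed);
    }
    return Math.sin(value); // native Math.sin, as the precision note recommends
}

// SIN over a scalar or a range: a range input yields a range of the same shape.
function sin(arg?: Arg): number | FormulaError | (number | FormulaError)[][] {
    // A missing required argument returns #N/A (the Google Sheets behavior).
    if (arg === undefined) return err('#N/A');
    if (Array.isArray(arg)) return arg.map((row) => row.map(sinScalar));
    return sinScalar(arg);
}

console.log(sin(0)); // 0
console.log(sin([[0, 'x'], [null, true]])); // 2x2 result, with #VALUE! for 'x'
console.log(sin()); // the #N/A error object
```

In the real engine, the range case would normally go through the built-in array methods mentioned above rather than a hand-written nested `map`.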

### Formula Basic Tools

1. `ValueObjectFactory` is used to automatically recognize parameter formats and create a parameter instance. Use `RangeReferenceObject` to create parameter instances for range-type data.
2. The array `toArrayValueObject` can be operated directly with values to get a new array.
