Skip to content

Commit 112acd2

Browse files
Nondeterministic expressions (CallMethodViaReflection and MonotonicallyIncreasingID)
1 parent ee8db2c commit 112acd2

File tree

3 files changed

+118
-100
lines changed

3 files changed

+118
-100
lines changed
+50-62
Original file line numberDiff line numberDiff line change
@@ -1,86 +1,74 @@
1+
---
12
title: CallMethodViaReflection
3+
---
24

35
# CallMethodViaReflection Expression
46

5-
`CallMethodViaReflection` is an Expression.md[expression] that represents a static method call in Scala or Java using `reflect` and `java_method` functions.
7+
`CallMethodViaReflection` is a [non-deterministic expression](Nondeterministic.md) that represents a static method call (in Scala or Java) using `reflect` and `java_method` standard functions.
68

7-
NOTE: `reflect` and `java_method` functions are only supported in SparkSession.md#sql[SQL] and dataset/index.md#selectExpr[expression] modes.
9+
`CallMethodViaReflection` supports [fallback mode for expression code generation](Expression.md#CodegenFallback).
810

9-
.CallMethodViaReflection's DataType to JVM Types Mapping
10-
[cols="1,2",options="header",width="100%"]
11-
|===
12-
| DataType
13-
| JVM Type
11+
## Creating Instance
1412

15-
| `BooleanType`
16-
| `java.lang.Boolean` / `scala.Boolean`
13+
`CallMethodViaReflection` takes the following to be created:
1714

18-
| `ByteType`
19-
| `java.lang.Byte` / `Byte`
15+
* <span id="children"> Children [Expression](Expression.md)s
2016

21-
| `ShortType`
22-
| `java.lang.Short` / `Short`
17+
`CallMethodViaReflection` is created when:
2318

24-
| `IntegerType`
25-
| `java.lang.Integer` / `Int`
19+
* `reflect` standard function is used
2620

27-
| `LongType`
28-
| `java.lang.Long` / `Long`
21+
## evalInternal { #evalInternal }
2922

30-
| `FloatType`
31-
| `java.lang.Float` / `Float`
23+
??? note "Nondeterministic"
3224

33-
| `DoubleType`
34-
| `java.lang.Double` / `Double`
25+
```scala
26+
evalInternal(
27+
input: InternalRow): Any
28+
```
3529

36-
| `StringType`
37-
| `String`
38-
|===
30+
`evalInternal` is part of the [Nondeterministic](Nondeterministic.md#evalInternal) abstraction.
3931

40-
[source, scala]
41-
----
32+
`evalInternal`...FIXME
33+
34+
## initializeInternal { #initializeInternal }
35+
36+
??? note "Nondeterministic"
37+
38+
```scala
39+
initializeInternal(
40+
partitionIndex: Int): Unit
41+
```
42+
43+
`initializeInternal` is part of the [Nondeterministic](Nondeterministic.md#initializeInternal) abstraction.
44+
45+
`initializeInternal`...FIXME
46+
47+
## Demo
48+
49+
```scala
4250
import org.apache.spark.sql.catalyst.expressions.CallMethodViaReflection
4351
import org.apache.spark.sql.catalyst.expressions.Literal
44-
scala> val expr = CallMethodViaReflection(
45-
| Literal("java.time.LocalDateTime") ::
46-
| Literal("now") :: Nil)
47-
expr: org.apache.spark.sql.catalyst.expressions.CallMethodViaReflection = reflect(java.time.LocalDateTime, now)
52+
val expr = CallMethodViaReflection(
53+
Literal("java.time.LocalDateTime") ::
54+
Literal("now") :: Nil)
55+
```
56+
57+
```text
4858
scala> println(expr.numberedTreeString)
49-
00 reflect(java.time.LocalDateTime, now)
59+
00 reflect(java.time.LocalDateTime, now, true)
5060
01 :- java.time.LocalDateTime
5161
02 +- now
62+
```
5263

53-
// CallMethodViaReflection as the expression for reflect SQL function
54-
val q = """
55-
select reflect("java.time.LocalDateTime", "now") as now
56-
"""
64+
```scala
65+
val q = """SELECT reflect("java.time.LocalDateTime", "now") AS now"""
5766
val plan = spark.sql(q).queryExecution.logical
67+
```
68+
69+
```text
5870
// CallMethodViaReflection shows itself under "reflect" name
5971
scala> println(plan.numberedTreeString)
60-
00 Project [reflect(java.time.LocalDateTime, now) AS now#39]
61-
01 +- OneRowRelation$
62-
----
63-
64-
`CallMethodViaReflection` supports a Expression.md#CodegenFallback[fallback mode for expression code generation].
65-
66-
[[properties]]
67-
.CallMethodViaReflection's Properties
68-
[width="100%",cols="1,2",options="header"]
69-
|===
70-
| Property
71-
| Description
72-
73-
| [[dataType]] `dataType`
74-
| `StringType`
75-
76-
| [[deterministic]] `deterministic`
77-
| Disabled (i.e. `false`)
78-
79-
| [[nullable]] `nullable`
80-
| Enabled (i.e. `true`)
81-
82-
| [[prettyName]] `prettyName`
83-
| `reflect`
84-
|===
85-
86-
NOTE: `CallMethodViaReflection` is very similar to spark-sql-Expression-StaticInvoke.md[StaticInvoke] expression.
72+
00 'Project ['reflect(java.time.LocalDateTime, now) AS now#0]
73+
01 +- OneRowRelation
74+
```

docs/expressions/MonotonicallyIncreasingID.md

+11-7
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,14 @@
1-
# MonotonicallyIncreasingID
1+
---
2+
title: MonotonicallyIncreasingID
3+
---
4+
5+
# MonotonicallyIncreasingID Leaf Expression
26

37
`MonotonicallyIncreasingID` is a [non-deterministic](Nondeterministic.md) [leaf expression](Expression.md#LeafExpression) that represents `monotonically_increasing_id` [standard](../standard-functions/index.md#monotonically_increasing_id) and [SQL](../FunctionRegistry.md#monotonically_increasing_id) functions in [logical query plans](../logical-operators/LogicalPlan.md).
48

59
`MonotonicallyIncreasingID` supports [code-generated](#doGenCode) and [interpreted](#evalInternal) execution modes.
610

7-
## <span id="dataType"> Result DataType
11+
## Result DataType { #dataType }
812

913
```scala
1014
dataType: DataType
@@ -14,7 +18,7 @@ dataType: DataType
1418

1519
`dataType` is part of the [Expression](Expression.md#dataType) abstraction.
1620

17-
## <span id="nullable"> Never Nullable
21+
## Never Nullable { #nullable }
1822

1923
```scala
2024
nullable: Boolean
@@ -24,7 +28,7 @@ nullable: Boolean
2428

2529
`nullable` is part of the [Expression](Expression.md#nullable) abstraction.
2630

27-
## <span id="initializeInternal"> Initialization
31+
## Internal Initialize { #initializeInternal }
2832

2933
```scala
3034
initializeInternal(
@@ -45,7 +49,7 @@ scala> println(partitionMask.toBinaryString)
4549

4650
`initializeInternal` is part of the [Nondeterministic](Nondeterministic.md#initializeInternal) abstraction.
4751

48-
## <span id="evalInternal"> Interpreted Expression Evaluation
52+
## Internal Interpreted Expression Evaluation { #evalInternal }
4953

5054
```scala
5155
evalInternal(
@@ -58,7 +62,7 @@ evalInternal(
5862

5963
`evalInternal` is part of the [Nondeterministic](Nondeterministic.md#evalInternal) abstraction.
6064

61-
## <span id="doGenCode"> Code-Generated Expression Evaluation
65+
## Code-Generated Expression Evaluation { #doGenCode }
6266

6367
```scala
6468
doGenCode(
@@ -104,6 +108,6 @@ final long value_0 = partitionMask + count_0;
104108
count_0++;
105109
```
106110

107-
## <span id="Stateful"> Stateful
111+
## Stateful { #Stateful }
108112

109113
`MonotonicallyIncreasingID` is a [Stateful](Stateful.md).

docs/expressions/Nondeterministic.md

+57-31
Original file line numberDiff line numberDiff line change
@@ -1,73 +1,99 @@
1+
---
2+
title: Nondeterministic
3+
---
4+
15
# Nondeterministic Expressions
26

3-
`Nondeterministic` is an [extension](#contract) of the [Expression](Expression.md) abstraction for [non-deterministic and non-foldable expressions](#implementations).
7+
`Nondeterministic` is an [extension](#contract) of the [Expression](Expression.md) abstraction for [non-deterministic, non-foldable expressions](#implementations).
48

5-
Nondeterministic expression should be [initialized](#initialize) (with the partition ID) before [evaluation](#eval).
9+
`Nondeterministic` expressions should be [initialized](#initialize) (with the partition ID) before [evaluation](#eval).
610

711
## Contract
812

9-
### <span id="evalInternal"> evalInternal
13+
### Internal Interpreted Expression Evaluation { #evalInternal }
1014

1115
```scala
1216
evalInternal(
1317
input: InternalRow): Any
1418
```
1519

20+
See:
21+
22+
* [CallMethodViaReflection](CallMethodViaReflection.md#evalInternal)
23+
* [MonotonicallyIncreasingID](MonotonicallyIncreasingID.md#evalInternal)
24+
1625
Used when:
1726

18-
* `Nondeterministic` is requested to [eval](#eval)
27+
* `Nondeterministic` expression is requested to [evaluate](#eval)
1928

20-
### <span id="initializeInternal"> initializeInternal
29+
### Internal Initialize { #initializeInternal }
2130

2231
```scala
2332
initializeInternal(
2433
partitionIndex: Int): Unit
2534
```
2635

36+
See:
37+
38+
* [CallMethodViaReflection](CallMethodViaReflection.md#initializeInternal)
39+
* [MonotonicallyIncreasingID](MonotonicallyIncreasingID.md#initializeInternal)
40+
2741
Used when:
2842

2943
* `Nondeterministic` is requested to [initialize](#initialize)
3044

3145
## Implementations
3246

3347
* [CallMethodViaReflection](CallMethodViaReflection.md)
34-
* `CurrentBatchTimestamp`
35-
* `InputFileBlockLength`
36-
* `InputFileBlockStart`
37-
* `InputFileName`
38-
* `SparkPartitionID`
39-
* `Stateful`
48+
* [MonotonicallyIncreasingID](MonotonicallyIncreasingID.md)
49+
* _others_
50+
51+
## Deterministic { #deterministic }
52+
53+
??? note "Expression"
4054

41-
## Review Me
55+
```scala
56+
deterministic: Boolean
57+
```
4258

43-
NOTE: `Nondeterministic` expressions are the target of `PullOutNondeterministic` logical plan rule.
59+
`deterministic` is part of the [Expression](Expression.md#deterministic) abstraction.
4460

45-
=== [[initialize]] Initializing Expression -- `initialize` Method
61+
??? note "Final Method"
62+
`deterministic` is a Scala **final method** and may not be overridden in [subclasses](#implementations).
4663

47-
[source, scala]
48-
----
49-
initialize(partitionIndex: Int): Unit
50-
----
64+
Learn more in the [Scala Language Specification]({{ scala.spec }}/05-classes-and-objects.html#final).
5165

52-
Internally, `initialize` <<initializeInternal, initializes>> itself (with the input partition index) and turns the internal <<initialized, initialized>> flag on.
66+
`deterministic` is always `false`.
5367

54-
`initialize` is used when [InterpretedProjection](InterpretedProjection.md#initialize) and `InterpretedMutableProjection` are requested to `initialize` themselves.
68+
## Foldable { #foldable }
5569

56-
=== [[eval]] Evaluating Expression -- `eval` Method
70+
??? note "Expression"
5771

58-
[source, scala]
59-
----
60-
eval(input: InternalRow): Any
61-
----
72+
```scala
73+
foldable: Boolean
74+
```
6275

63-
`eval` is part of the [Expression](Expression.md#eval) abstraction.
76+
`foldable` is part of the [Expression](Expression.md#foldable) abstraction.
6477

65-
`eval` is just a wrapper of <<evalInternal, evalInternal>> that makes sure that <<initialize, initialize>> has already been executed (and so the expression is initialized).
78+
??? note "Final Method"
79+
`foldable` is a Scala **final method** and may not be overridden in [subclasses](#implementations).
6680

67-
Internally, `eval` makes sure that the expression was <<initialized, initialized>> and calls <<evalInternal, evalInternal>>.
81+
Learn more in the [Scala Language Specification]({{ scala.spec }}/05-classes-and-objects.html#final).
6882

69-
`eval` reports a `IllegalArgumentException` exception when the internal <<initialized, initialized>> flag is off, i.e. <<initialize, initialize>> has not yet been executed.
83+
`foldable` is always `false`.
7084

71-
```text
72-
requirement failed: Nondeterministic expression [name] should be initialized before eval.
85+
## Initialize { #initialize }
86+
87+
```scala
88+
initialize(
89+
partitionIndex: Int): Unit
7390
```
91+
92+
`initialize` [initializeInternal](#initializeInternal) and sets the [initialized](#initialized) internal flag to `true`.
93+
94+
---
95+
96+
`initialize` is used when:
97+
98+
* `ExpressionsEvaluator` is requested to `initializeExprs`
99+
* `GenerateExec` physical operator is requested to [doExecute](../physical-operators/GenerateExec.md#doExecute)

0 commit comments

Comments
 (0)