Expression Translation
Translate Cast, PromotePrecision and CheckOverflow:
- for PromotePrecision: just return the child translation.
- for CheckOverflow:
  - if nullOnOverflow is true: add a case check (a sketch follows).
  - if nullOnOverflow is false: do nothing? the translated oExpr will throw on overflow.
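A minimal sketch of what the nullOnOverflow case check could look like; this is not the project's exact template, and it assumes a target type of decimal(10, 2) purely for illustration:

  -- hypothetical bounds check for a cast to decimal(10, 2); {oraE} is the translated child expression
  case when {oraE} > -1 * power(10, 8) and {oraE} < power(10, 8)
       then cast({oraE} as number(10, 2))
       else null
  end

The Scala snippet below defines epoch-related helper expressions (built with the osql interpolator) that the cast translations on the rest of this page refer to as {epochTS}, {epochDt}, {epochTSAtSessionTZ} and {true_bool_TS}.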
val epochTS = {
// osql"to_timestamp_tz('1970-01-01 00:00:00 00:00', 'YYYY-MM-DD HH24:MI:SS TZH:TZM')"
osql"from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC')"
}
val epochDt = osql"date '1970-01-01'"
val epochTSAtSessionTZ = osql"to_timestamp('1970-01-01', 'YYYY-MM-DD')"
val true_bool_TS =
osql"from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC') + interval '0.001' second(0,3)"
dtToTimestamp (cast a date to a timestamp):
- if there is no catalystExpr#zoneId then generate: cast(${oraE} as timestamp)
- else generate: cast(${oraE} as timestamp) at time zone ${zoneOE}
timestampToDt (cast a timestamp to a date):
- if there is a catalystExpr#zoneId:
  - cast the input to a timestamp and set its timezone; then cast the result to a date
  - Translation expression: cast(from_tz(cast({oraE} as timestamp), {zoneOE}) as date)
- otherwise the translation expression is: cast({oraE} as date)
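Hypothetical instantiations of the two templates above, assuming zoneOE = 'UTC' and example input values:

  -- date -> timestamp, with a zone
  select cast(date '2020-10-23' as timestamp) at time zone 'UTC' from dual;
  -- timestamp -> date, with a zone
  select cast(from_tz(cast(systimestamp as timestamp), 'UTC') as date) from dual;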
epochToTimestamp - translation logic for converting millis since epoch to a timestamp is:
millisToInterval = numtodsinterval({oraE}/1000, 'SECOND')
millisToIntervalWithTZOffset = {millisToInterval} + {epochTS} - {epochTSAtSessionTZ}
result = {epochTSAtSessionTZ} + ({millisToIntervalWithTZOffset})
For example, for oraE = 1603425037802 the sql is:
to_timestamp('1970-01-01', 'YYYY-MM-DD') +
(numtodsinterval(1603425037802/1000, 'SECOND') +
 from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC') -
 to_timestamp('1970-01-01', 'YYYY-MM-DD')
)
Translation logic for converting millis since epoch to a date is:
millisToInterval = numtodsinterval({oraE}/1000, 'SECOND')
millisToIntervalWithTZOffset = {millisToInterval} + {epochTS} - {epochTSAtSessionTZ}
epoch_ts = {epochTSAtSessionTZ} + {millisToIntervalWithTZOffset}
result = trunc({epoch_ts}, 'DD')
For example, for oraE = 1603425037802 the sql is:
trunc(
  to_timestamp('1970-01-01', 'YYYY-MM-DD') +
  (numtodsinterval(1603425037802/1000, 'SECOND') +
   from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC') -
   to_timestamp('1970-01-01', 'YYYY-MM-DD')
  ),
  'DD'
)
timestampToEpoch - translation logic for converting a timestamp to millis since epoch is:
// using ora date arithmetic: ora_ts - ora_ts -> ora_interval
days = extract(day from ({oraE} - {epochTS})) * 24 * 60 * 60
hours = extract(hour from ({oraE} - {epochTS})) * 60 * 60
mins = extract(minute from ({oraE} - {epochTS})) * 60
secs = extract(second from ({oraE} - {epochTS}))
result = ({days} + {hours} + {mins} + {secs}) * 1000
For example, for oraE = systimestamp the sql is:
(extract(day from (systimestamp - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC'))) * 24 * 60 * 60 +
 extract(hour from (systimestamp - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC'))) * 60 * 60 +
 extract(minute from (systimestamp - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC'))) * 60 +
 extract(second from (systimestamp - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC')))
) * 1000
Translation logic for converting a date to millis since epoch is:
trunc_to_days = trunc(sysdate, 'DD')
// using ora date arithmetic: ora_date - ora_ts -> ora_interval
interval_from_epoch = trunc_to_days - epoch_ts
num_hours = extract(day from interval_from_epoch) * 24 +
            extract(hour from interval_from_epoch)
result = num_hours * 60 * 60 * 1000
For example, for sysdate:
(extract(day from (trunc(sysdate, 'DD') - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC'))) * 24 +
 extract(hour from (trunc(sysdate, 'DD') - from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC')))
) * 60 * 60 * 1000
for widening Cast: do nothing, return 'childOE'
for narrowing Cast: use the following sql expression template:
case when {childOE} > {toDT.MinValue} and {childOE} < {toDT.MaxValue}
then cast({childOE} as {toDT})
else null
end
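For illustration, a hypothetical instantiation of the narrowing template for a cast to Spark's ByteType (range -128 to 127), with number(3) as an assumed Oracle target type and c1 as a made-up column name:

  -- hypothetical: narrowing a NUMBER column c1 to ByteType bounds
  case when c1 > -128 and c1 < 127
       then cast(c1 as number(3))
       else null
  end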
from string:
- to numeric:
  - apply the TO_NUMBER oracle function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_NUMBER.html#GUID-D4807212-AFD7-48A7-9AED-BEC3E8809866)
    - so the sql template is: to_number({childOE}) (combined examples for the string casts follow this list)
- to date:
  - Spark uses DateTimeUtils.stringToDate; this tries a bunch of Date formats.
  - When translating we will use the default date format of the Oracle Connection.
    From the Oracle TO_DATE function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_DATE.html#GUID-D226FA7C-F7AD-41A0-BB1D-BD8EF9440118):
    the default date format is determined implicitly by the NLS_TERRITORY initialization parameter or can be set explicitly by the NLS_DATE_FORMAT parameter.
    - so the translation is: to_date({childOE})
- to timestamp:
  - Spark uses DateTimeUtils.stringToTimestamp; this tries a bunch of Date formats.
  - When translating we will use the default timestamp format of the Oracle Connection.
    From the Oracle TO_TIMESTAMP function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_TIMESTAMP.html#GUID-57E09334-E3CC-4CA2-809E-F0909458BCFA):
    the default format of the TIMESTAMP data type is determined by the NLS_TIMESTAMP_FORMAT initialization parameter.
    - so the translation is: to_timestamp({childOE})
- to boolean:
  - Spark uses:
    - StringUtils.isTrueString to translate to true
    - StringUtils.isFalseString to translate to false
    - else null
  - sql template:
    (case when ${childOE} in ('t', 'true', 'y', 'yes', '1') then 1 when ${childOE} in ('f', 'false', 'n', 'no', '0') then 0 else null end) = 1
  - The example below shows the boolean translations:
    -- This returns 1 row
    select 1 from dual where (case when 't' in ('t', 'true', 'y', 'yes', '1') then 1 when 't' in ('f', 'false', 'n', 'no', '0') then 0 else null end) = 1;
    -- These return 0 rows:
    select 1 from dual where (case when 'f' in ('t', 'true', 'y', 'yes', '1') then 1 when 'f' in ('f', 'false', 'n', 'no', '0') then 0 else null end) = 1;
    select 1 from dual where (case when 'u' in ('t', 'true', 'y', 'yes', '1') then 1 when 'u' in ('f', 'false', 'n', 'no', '0') then 0 else null end) = 1;
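A few hypothetical checks of the other string-cast templates above; the date and timestamp cases assume the connection's default NLS_DATE_FORMAT / NLS_TIMESTAMP_FORMAT settings:

  select to_number('12345.678') from dual;                        -- 12345.678
  select to_date('22-OCT-20') from dual;                          -- assumes the default DD-MON-RR date format
  select to_timestamp('22-OCT-20 01.16.32.740812 PM') from dual;  -- assumes the default DD-MON-RR HH.MI.SSXFF AM timestamp format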
to string:
- from numeric:
  - Spark applies UTF8String.fromString({childExpr.val}.toString)
  - translate using the TO_CHAR(number) oracle function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_CHAR-number.html#GUID-00DA076D-2468-41AB-A3AC-CC78DBA0D9CB)
    - so the translation template is: to_char({childOE}) (examples of the to_char templates follow this list)
    - For example to_char(12345678912345.345678900000) returns 12345678912345.345678900000
- from date:
  - Spark uses DateFormatter for the timeZoneId of the castExpr
    - the date pattern used is defaultPattern: String = "yyyy-MM-dd"
  - translate to a sql template using the TO_CHAR(date) oracle function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_CHAR-datetime.html#GUID-0C3EEFD1-AE3D-452D-BF23-2FC95664E78F).
    - the template is to_char({childOE}). This uses the default date format of the Oracle connection, which can be changed in Oracle by setting the 'NLS_TERRITORY' initialization parameter or explicitly by the 'NLS_DATE_FORMAT' parameter.
- from timestamp:
  - Spark uses FractionTimestampFormatter for the timeZoneId of the castExpr
    - the timestamp pattern used is formatter = DateTimeFormatterHelper.fractionFormatter, which parses/formats timestamps according to the pattern yyyy-MM-dd HH:mm:ss.[..fff..]
  - translate to a sql template using the TO_CHAR(date) oracle function
    (https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/TO_CHAR-datetime.html#GUID-0C3EEFD1-AE3D-452D-BF23-2FC95664E78F).
    - the template is to_char({childOE}). This uses the default timestamp format of the Oracle connection, which can be changed in Oracle by setting the 'NLS_TERRITORY' initialization parameter or explicitly by the 'NLS_TIMESTAMP_FORMAT' parameter.
    - For example, the output of to_char on a timestamp looks like '22-OCT-20 01.16.32.740812 PM'
- from boolean:
  - use template: case when {childOE} then 'true' else 'false' end
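Hypothetical checks of the to_char templates above; actual output follows the connection's NLS_DATE_FORMAT / NLS_TIMESTAMP_FORMAT settings:

  select to_char(12345678912345.3456789) from dual;   -- number -> string
  select to_char(date '2020-10-22') from dual;        -- date -> string, e.g. '22-OCT-20'
  select to_char(localtimestamp) from dual;           -- timestamp -> string, e.g. '22-OCT-20 01.16.32.740812 PM'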
from boolean:
- to numeric: {oraE} != 0
- to string: Same as Boolean -> String in StringCasting
- to date: Same as Boolean -> Date in DateCasting
- to timestamp: Same as Boolean -> Timestamp in TimestampCasting
to boolean:
- from numeric: {oraE}
- from string: Same as String -> Boolean in StringCasting
- from date: Same as Date -> Boolean in DateCasting
- from timestamp: Same as Timestamp -> Boolean in TimestampCasting
from date:
- to numeric:
  - In Spark: num_of_days since epoch.
  - translate to: {oraE} - {epochDt}. Based on oracle's date arithmetic (oraDt - oraDt -> number), this represents the number of days since the start of the epoch (see the example after this section).
- to string:
  - Same as Date -> String in StringCasting
- to timestamp:
  - In Spark: DateTimeUtils.daysToMicros(d, zoneId)
    - Converts days since 1970-01-01 at the given zone ID to microseconds since 1970-01-01 00:00:00Z.
  - translate to: cast({oraE} as timestamp), with an additional at time zone {castE.timeZoneId}. See the dtToTimestamp() method.
- to boolean:
  - In Spark: null
  - translate to: null
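A hypothetical check of the date -> numeric template above:

  -- days since the epoch, via oracle date arithmetic
  select (date '2020-10-23' - date '1970-01-01') from dual;   -- 18558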
to date:
- from numeric:
  - In Spark it is undefined
  - translate to: {epochDt} + {oraE}. Based on oracle's date arithmetic, this represents the date that is {oraE} days from the epoch (see the example after this section).
- from string:
  - same as String -> Date in StringCasting
- from timestamp:
  - In Spark: convert the timestamp at the given tz to a date
  - translate to: cast({oraE} as date); if {castE.timeZoneId} is specified, first convert to a timestamp in that timeZone. See timestampToDt()
- from boolean:
  - In Spark it is undefined
  - we throw during translation.
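A hypothetical check of the numeric -> date template above:

  -- the date that is 18558 days after the epoch
  select date '1970-01-01' + 18558 from dual;   -- 2020-10-23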
from timestamp:
- to numeric:
  - In Spark: convert to millis_since_epoch
  - translate to: see timestampToEpoch()
- to string: Same as Timestamp -> String in StringCasting
- to date: Same as Timestamp -> Date in DateCasting
- to boolean:
  - In Spark: millis_since_epoch != 0
  - translate to: timestampToEpoch({oraE}) != 0. See timestampToEpoch()
to timestamp:
- from numeric:
  - In Spark it is undefined
  - translate to: See epochToTimestamp()
- from string: same as String -> Timestamp in StringCasting
- from date:
  - In Spark: convert the date at the given tz to a timestamp
  - translate to: See dtToTimestamp()
- from boolean:
  - In Spark: true is interpreted as 1L millis_since_epoch, and false as 0L millis_since_epoch.
  - translate to: case when {oraE} then ${true_bool_TS} else ${epochTS} end (an expanded example follows)
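A hypothetical expansion of the boolean -> timestamp template, with the predicate 1 = 1 standing in for a true {oraE}:

  select case when 1 = 1
              then from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC') + interval '0.001' second(0,3)
              else from_tz(to_timestamp('1970-01-01', 'YYYY-MM-DD'), 'UTC')
         end
  from dual;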