This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Added following string functions: regex, substr, substring, ltrim, rtrim, trim, upper, lower, concat, concat_ws, length, strcmp (#750)

* Bug fix, support long type for aggregation (#522)

* Bug fix, support long type for aggregation

* change to datetime to JDBC format

* Opendistro Release 1.9.0 (#532)

* prepare odfe 1.9

* Fix all ES 7.8 compile and build errors

* Revert changes as Lombok is working now

* Update CustomExternalTestCluster.java

* Fix license headers check

* Use splitFieldsByMetadata to separate fields when calling SearchHit constructor

* More fixes for ODFE 1.9

* Remove todo statement

* Add ODFE 1.9.0 release notes

* Rename release notes to use 4 digit versions (#547)

* Revert changes ahead of develop branch in master (#551)

* Revert "Rename release notes to use 4 digit versions (#547)"

This reverts commit 33c6d3e.

* Revert "Opendistro Release 1.9.0 (#532)"

This reverts commit 254f2e0.

* Revert "Bug fix, support long type for aggregation (#522)"

This reverts commit fb2ed91.

* Merge all SQL repos and adjust workflows (#549) (#554)

* merge all sql repos

* fix test and build workflows

* fix workbench and odbc path

* fix workbench and odbc path

* restructure workbench dir and fix workflows

* fix workbench workflow

* fix workbench workflow

* fix workbench workflow

* fix workbench workflow

* fix workbench workflow

* revert workbench directory structure

* fix workbench workflow

* fix workbench workflow

* fix workbench workflow

* fix workbench workflow

* update workbench workflow for release

* Delete .github/ in sql-workbench directory

* Add cypress to sql-workbench

* Sync latest ODBC commits

* Sync latest workbench commits (will add cypress in separate PR)

* Add ignored ODBC libs

* add date and time support (#560)

* add date and time support

* update doc

* update doc

* Revert "add date and time support (#560)" (#567)

This reverts commit 4b33a2f.

* add error details for all server communication errors (#645)

- add null check to avoid crashing if details not initialized

* Revert "add error details for all server communication errors (#645)" (#653)

This reverts commit c11125d.

* Fix download link in package description (#729)

* [1] Initial commit, checking if server build passes

* [1] Committing expression documentation with REGEXP

* [1] Failure with REGEXP doc

* [1] Moved testing for regexp to binary predicates.

* [1] Updating parser for REGEX

* [1] Parser update

* [1] Making REGEXP like LIKE

* [1] Reverting change to legacy

* [1] Checking if same without NOT

* [1] testing adding to Ast Expr

* [1] Switching REGEX over to Integer.

* [1] Reversion test

* [1] Add back test

* [1] Fixing spacing

* [1] Regexp builder test.

* -2

* -1

* [1] trying with semicolon

* [1] Found the missing link >_<

* [1] Functions documentation

* [1] Fixing documentation mistake.

* [1] Retesting

* [1] Trying to debug python

* [1] more python debug info

* [1] Trying again.

* [1] More py info

* [1] Fixed except

* [1] Simplified concat and concat_ws

* [1] Added missing stuff to parser

* [1] Fixed some functions and removed some unused imports

* [1] Fixed STRCMP

* [1] Trying to fix aliasing issue with substring

* [1] Fixed stringcompare and substring

* [1] Removing unused imports

* [1] Fixed documentation

* [1] REGEXP not supported by sqllite so removing these for now.

* [1] Removed auto IT and added manual.

* [1] Fixed spacing

* [1] Fixed type definitions

* [1] Fixed integer values and ltrim

* [1] Correcting ltrim again

* [1] Changed patterns

* [1] Fixed some minor issues

* [1] reverting change i didnt make

* [1] Condensed logic
[2] Added EMPTY_STRING
[3] Removed unused function resolver

* [1] Removed SUBSTRING FunctionName.

* [1] Reverted failure issues

* [1] Combined substring and substr test.

* [1] Added ppl test and edited caps in textfunctiontest

* [1] Testing without source

* [1] Correcting format of string

* [1] Testing new queries

* [1] Adding resource and fixed tests

* [1] Added mapping and adjusted tests

* [1] minor corrections

* [1] Additional debug info

* [1] Removing unsupported ppl functions.

* [1] Added back unsupported

* [1] Checking regex

* [1] Removing printout

* [1] Trying to fix commit

* Revert "[1] Trying to fix commit"

This reverts commit 3fa954c.

* [1] Adding rest of commit

* [1] Adding workflow files

* [1] fixed docs

* [1] Updated based on PR comments.

* [1] Fixed code so tests pass

Co-authored-by: Peng Huo <penghuo@gmail.com>
Co-authored-by: Joshua <joshuali925@gmail.com>
Co-authored-by: Joshua Li <lijshu@amazon.com>
Co-authored-by: Jordan Wilson <37088125+jordanw-bq@users.noreply.github.com>
Co-authored-by: Chloe <chloezh1102@gmail.com>
Co-authored-by: chloe-zh <fizhang@amazon.com>
Co-authored-by: Sayali Gaikawad <61760125+gaiksaya@users.noreply.github.com>
8 people authored Sep 24, 2020
1 parent f98c51f commit 15bc237
Showing 25 changed files with 1,235 additions and 45 deletions.
@@ -273,6 +273,54 @@ public FunctionExpression module(Expression... expressions) {
return function(BuiltinFunctionName.MODULES, expressions);
}

public FunctionExpression substr(Expression... expressions) {
return function(BuiltinFunctionName.SUBSTR, expressions);
}

public FunctionExpression substring(Expression... expressions) {
return function(BuiltinFunctionName.SUBSTRING, expressions);
}

public FunctionExpression ltrim(Expression... expressions) {
return function(BuiltinFunctionName.LTRIM, expressions);
}

public FunctionExpression rtrim(Expression... expressions) {
return function(BuiltinFunctionName.RTRIM, expressions);
}

public FunctionExpression trim(Expression... expressions) {
return function(BuiltinFunctionName.TRIM, expressions);
}

public FunctionExpression upper(Expression... expressions) {
return function(BuiltinFunctionName.UPPER, expressions);
}

public FunctionExpression lower(Expression... expressions) {
return function(BuiltinFunctionName.LOWER, expressions);
}

public FunctionExpression regexp(Expression... expressions) {
return function(BuiltinFunctionName.REGEXP, expressions);
}

public FunctionExpression concat(Expression... expressions) {
return function(BuiltinFunctionName.CONCAT, expressions);
}

public FunctionExpression concat_ws(Expression... expressions) {
return function(BuiltinFunctionName.CONCAT_WS, expressions);
}

public FunctionExpression length(Expression... expressions) {
return function(BuiltinFunctionName.LENGTH, expressions);
}

public FunctionExpression strcmp(Expression... expressions) {
return function(BuiltinFunctionName.STRCMP, expressions);
}

public FunctionExpression and(Expression... expressions) {
return function(BuiltinFunctionName.AND, expressions);
}
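
Each DSL method above only picks a `BuiltinFunctionName` and delegates to `function(...)`. How `function(...)` resolves that name is not shown in this diff, so the lookup below is an assumption: a minimal standalone sketch of a name-keyed registry with thin wrappers. `FunctionRegistrySketch`, `Name`, and `REGISTRY` are hypothetical names, and plain `String` arguments stand in for `Expression`.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

public class FunctionRegistrySketch {

  /** Hypothetical stand-in for BuiltinFunctionName. */
  public enum Name { UPPER, LOWER, LENGTH }

  // Maps each function name to its implementation (String arguments only, for brevity).
  private static final Map<Name, Function<String[], Object>> REGISTRY =
      new EnumMap<>(Name.class);

  static {
    REGISTRY.put(Name.UPPER, args -> args[0].toUpperCase(Locale.ROOT));
    REGISTRY.put(Name.LOWER, args -> args[0].toLowerCase(Locale.ROOT));
    REGISTRY.put(Name.LENGTH, args -> args[0].length());
  }

  /** Assumed behaviour: look the implementation up by name, then apply it. */
  public static Object function(Name name, String... args) {
    Function<String[], Object> impl = REGISTRY.get(name);
    if (impl == null) {
      throw new IllegalArgumentException("unsupported function: " + name);
    }
    return impl.apply(args);
  }

  // Thin wrappers: each one only chooses the name, like the DSL methods in the diff.
  public static Object upper(String... args) {
    return function(Name.UPPER, args);
  }

  public static Object lower(String... args) {
    return function(Name.LOWER, args);
  }

  public static Object length(String... args) {
    return function(Name.LENGTH, args);
  }

  public static void main(String[] args) {
    System.out.println(upper("text"));   // TEXT
    System.out.println(length("text"));  // 4
  }
}
```

Keeping the wrappers this thin means a new function needs only a name constant, a registered implementation, and a one-line DSL method.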
@@ -24,6 +24,7 @@
import com.amazon.opendistroforelasticsearch.sql.expression.operator.arthmetic.MathematicalFunction;
import com.amazon.opendistroforelasticsearch.sql.expression.operator.predicate.BinaryPredicateOperator;
import com.amazon.opendistroforelasticsearch.sql.expression.operator.predicate.UnaryPredicateOperator;
import com.amazon.opendistroforelasticsearch.sql.expression.text.TextFunction;
import java.util.HashMap;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@@ -47,6 +48,7 @@ public BuiltinFunctionRepository functionRepository() {
AggregatorFunction.register(builtinFunctionRepository);
DateTimeFunction.register(builtinFunctionRepository);
IntervalClause.register(builtinFunctionRepository);
TextFunction.register(builtinFunctionRepository);
return builtinFunctionRepository;
}

@@ -94,6 +94,22 @@ public enum BuiltinFunctionName {
SUM(FunctionName.of("sum")),
COUNT(FunctionName.of("count")),

/**
* Text Functions.
*/
SUBSTR(FunctionName.of("substr")),
SUBSTRING(FunctionName.of("substring")),
RTRIM(FunctionName.of("rtrim")),
LTRIM(FunctionName.of("ltrim")),
TRIM(FunctionName.of("trim")),
UPPER(FunctionName.of("upper")),
LOWER(FunctionName.of("lower")),
REGEXP(FunctionName.of("regexp")),
CONCAT(FunctionName.of("concat")),
CONCAT_WS(FunctionName.of("concat_ws")),
LENGTH(FunctionName.of("length")),
STRCMP(FunctionName.of("strcmp")),

/**
* NULL Test.
*/
@@ -20,6 +20,7 @@
import static com.amazon.opendistroforelasticsearch.sql.data.model.ExprValueUtils.LITERAL_NULL;
import static com.amazon.opendistroforelasticsearch.sql.data.model.ExprValueUtils.LITERAL_TRUE;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.BOOLEAN;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.INTEGER;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.STRING;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprBooleanValue;
@@ -61,6 +62,7 @@ public static void register(BuiltinFunctionRepository repository) {
repository.register(gte());
repository.register(like());
repository.register(notLike());
repository.register(regexp());
}

/**
@@ -245,6 +247,12 @@ private static FunctionResolver like() {
STRING));
}

private static FunctionResolver regexp() {
return FunctionDSL.define(BuiltinFunctionName.REGEXP.getName(), FunctionDSL
.impl(FunctionDSL.nullMissingHandling(OperatorUtils::matchesRegexp),
INTEGER, STRING, STRING));
}

private static FunctionResolver notLike() {
return FunctionDSL.define(BuiltinFunctionName.NOT_LIKE.getName(), FunctionDSL
.impl(FunctionDSL.nullMissingHandling(
@@ -0,0 +1,229 @@
/*
*
* Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License").
* You may not use this file except in compliance with the License.
* A copy of the License is located at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* or in the "license" file accompanying this file. This file is distributed
* on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
* express or implied. See the License for the specific language governing
* permissions and limitations under the License.
*
*/

package com.amazon.opendistroforelasticsearch.sql.expression.text;

import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.INTEGER;
import static com.amazon.opendistroforelasticsearch.sql.data.type.ExprCoreType.STRING;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.define;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.impl;
import static com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionDSL.nullMissingHandling;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprIntegerValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprStringValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprValue;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionName;
import com.amazon.opendistroforelasticsearch.sql.expression.function.BuiltinFunctionRepository;
import com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionName;
import com.amazon.opendistroforelasticsearch.sql.expression.function.FunctionResolver;

import lombok.experimental.UtilityClass;


/**
* The definition of text functions.
* 1) provide a clear interface for defining functions.
* 2) implementations should rely on ExprValue.
*/
@UtilityClass
public class TextFunction {
private static final String EMPTY_STRING = "";

/**
* Register String Functions.
*
* @param repository {@link BuiltinFunctionRepository}.
*/
public void register(BuiltinFunctionRepository repository) {
repository.register(substr());
repository.register(substring());
repository.register(ltrim());
repository.register(rtrim());
repository.register(trim());
repository.register(lower());
repository.register(upper());
repository.register(concat());
repository.register(concat_ws());
repository.register(length());
repository.register(strcmp());
}

/**
* Gets substring starting at given point, for optional given length.
* The keyword form of this function, e.g. SUBSTRING(str FROM pos FOR len), is not supported;
* only comma-delimited arguments are accepted.
* Supports following signatures:
* (STRING, INTEGER)/(STRING, INTEGER, INTEGER) -> STRING
*/
private FunctionResolver substringSubstr(FunctionName functionName) {
return define(functionName,
impl(nullMissingHandling(TextFunction::exprSubstrStart),
STRING, STRING, INTEGER),
impl(nullMissingHandling(TextFunction::exprSubstrStartLength),
STRING, STRING, INTEGER, INTEGER));
}

private FunctionResolver substring() {
return substringSubstr(BuiltinFunctionName.SUBSTRING.getName());
}

private FunctionResolver substr() {
return substringSubstr(BuiltinFunctionName.SUBSTR.getName());
}

/**
* Removes leading whitespace from string.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver ltrim() {
return define(BuiltinFunctionName.LTRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripLeading())),
STRING, STRING));
}

/**
* Removes trailing whitespace from string.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver rtrim() {
return define(BuiltinFunctionName.RTRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().stripTrailing())),
STRING, STRING));
}

/**
* Removes leading and trailing whitespace from string.
* Specifying a String to trim instead of whitespace is not yet supported,
* because it requires parsing keywords inside the TRIM call.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver trim() {
return define(BuiltinFunctionName.TRIM.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue(v.stringValue().trim())),
STRING, STRING));
}

/**
* Converts String to lowercase.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver lower() {
return define(BuiltinFunctionName.LOWER.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toLowerCase()))),
STRING, STRING)
);
}

/**
* Converts String to uppercase.
* Supports following signatures:
* STRING -> STRING
*/
private FunctionResolver upper() {
return define(BuiltinFunctionName.UPPER.getName(),
impl(nullMissingHandling((v) -> new ExprStringValue((v.stringValue().toUpperCase()))),
STRING, STRING)
);
}

/**
* Concatenates two Strings.
* TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
* Extend to accept a variable number of arguments.
* Supports following signatures:
* (STRING, STRING) -> STRING
*/
private FunctionResolver concat() {
return define(BuiltinFunctionName.CONCAT.getName(),
impl(nullMissingHandling((str1, str2) ->
new ExprStringValue(str1.stringValue() + str2.stringValue())), STRING, STRING, STRING));
}

/**
* Concatenates two Strings with a separator String between them; the separator is the first argument.
* TODO: https://github.com/opendistro-for-elasticsearch/sql/issues/710
* Extend to accept a variable number of arguments.
* Supports following signatures:
* (STRING, STRING, STRING) -> STRING
*/
private FunctionResolver concat_ws() {
return define(BuiltinFunctionName.CONCAT_WS.getName(),
impl(nullMissingHandling((sep, str1, str2) ->
new ExprStringValue(str1.stringValue() + sep.stringValue() + str2.stringValue())),
STRING, STRING, STRING, STRING));
}

/**
* Calculates length of String in bytes, using the platform default charset.
* Supports following signatures:
* STRING -> INTEGER
*/
private FunctionResolver length() {
return define(BuiltinFunctionName.LENGTH.getName(),
impl(nullMissingHandling((str) ->
new ExprIntegerValue(str.stringValue().getBytes().length)), INTEGER, STRING));
}

/**
* Compares two Strings lexicographically and returns -1, 0, or 1
* when the first is less than, equal to, or greater than the second.
* Supports following signatures:
* (STRING, STRING) -> INTEGER
*/
private FunctionResolver strcmp() {
return define(BuiltinFunctionName.STRCMP.getName(),
impl(nullMissingHandling((str1, str2) ->
new ExprIntegerValue(Integer.signum(
str1.stringValue().compareTo(str2.stringValue())))),
INTEGER, STRING, STRING));
}

private static ExprValue exprSubstrStart(ExprValue exprValue, ExprValue start) {
int startIdx = start.integerValue();
if (startIdx == 0) {
return new ExprStringValue(EMPTY_STRING);
}
String str = exprValue.stringValue();
return exprSubStr(str, startIdx, str.length());
}

private static ExprValue exprSubstrStartLength(
ExprValue exprValue, ExprValue start, ExprValue length) {
int startIdx = start.integerValue();
int len = length.integerValue();
if ((startIdx == 0) || (len == 0)) {
return new ExprStringValue(EMPTY_STRING);
}
String str = exprValue.stringValue();
return exprSubStr(str, startIdx, len);
}

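private static ExprValue exprSubStr(String str, int start, int len) {
// Convert a 1-based start (or a negative start counted from the end) to a 0-based index
start = (start > 0) ? (start - 1) : (str.length() + start);

// A start outside the string or a non-positive length yields an empty string;
// without these checks String.substring would throw StringIndexOutOfBoundsException.
if (start < 0 || start > str.length() || len <= 0) {
return new ExprStringValue(EMPTY_STRING);
} else if (len > str.length() - start) {
// Subtracting instead of adding avoids int overflow for very large len
return new ExprStringValue(str.substring(start));
}
return new ExprStringValue(str.substring(start, start + len));
}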
}
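
The substring helpers above use 1-based start positions, count a negative start from the end of the string, and return an empty string when the start is 0. A minimal standalone sketch of those rules, together with the byte-length and sign-only comparison behaviour of `length` and `strcmp`, might look like the following. `TextFunctionSketch` and its methods are hypothetical names; plain `String` and `int` stand in for `ExprValue`. Returning an empty string for an out-of-range start or a non-positive length, and counting bytes as UTF-8, are assumptions and are not taken from the commit.

```java
import java.nio.charset.StandardCharsets;

public class TextFunctionSketch {

  /** 1-based start; a negative start counts back from the end of the string. */
  public static String substr(String str, int start, int len) {
    if (start == 0 || len <= 0) {
      return "";
    }
    // Convert to a 0-based index.
    int idx = (start > 0) ? (start - 1) : (str.length() + start);
    if (idx < 0 || idx >= str.length()) {
      return ""; // assumption: out-of-range start gives an empty result
    }
    // Clamp the end; comparing against the remainder avoids int overflow on idx + len.
    int end = (len > str.length() - idx) ? str.length() : idx + len;
    return str.substring(idx, end);
  }

  /** Two-argument form: everything from start to the end of the string. */
  public static String substr(String str, int start) {
    return substr(str, start, Integer.MAX_VALUE);
  }

  /** Length in bytes; UTF-8 is an assumption made so the result is platform independent. */
  public static int lengthInBytes(String str) {
    return str.getBytes(StandardCharsets.UTF_8).length;
  }

  /** Lexicographic comparison reduced to -1, 0, or 1. */
  public static int strcmp(String a, String b) {
    return Integer.signum(a.compareTo(b));
  }

  public static void main(String[] args) {
    System.out.println(substr("helloworld", 5));      // oworld
    System.out.println(substr("helloworld", 5, 3));   // owo
    System.out.println(substr("helloworld", -3));     // rld
    System.out.println(lengthInBytes("h\u00e9llo"));  // 6 (5 characters, one of them 2 bytes)
    System.out.println(strcmp("hello", "world"));     // -1
  }
}
```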

@@ -16,6 +16,7 @@
package com.amazon.opendistroforelasticsearch.sql.utils;

import com.amazon.opendistroforelasticsearch.sql.data.model.ExprBooleanValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprIntegerValue;
import com.amazon.opendistroforelasticsearch.sql.data.model.ExprValue;
import java.util.regex.Pattern;
import lombok.experimental.UtilityClass;
@@ -35,6 +36,16 @@ public static ExprBooleanValue matches(ExprValue text, ExprValue pattern) {
.matches());
}

/**
* Checks if text matches regular expression pattern.
* @param text string to test.
* @param pattern regular expression that the whole text must match.
* @return 1 if text matches pattern; otherwise 0.
*/
public static ExprIntegerValue matchesRegexp(ExprValue text, ExprValue pattern) {
return new ExprIntegerValue(Pattern.compile(pattern.stringValue()).matcher(text.stringValue())
.matches() ? 1 : 0);
}

private static final char DEFAULT_ESCAPE = '\\';

private static String patternToRegex(String patternString) {
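
`matchesRegexp` relies on `Matcher.matches()`, which succeeds only when the pattern covers the entire input, whereas `Matcher.find()` accepts a match anywhere in the text. A small standalone sketch of the difference, returning 1 or 0 as `matchesRegexp` does, is shown here. `RegexpSketch`, `matchesWhole`, and `matchesAnywhere` are hypothetical names, and plain `String` stands in for `ExprValue`.

```java
import java.util.regex.Pattern;

public class RegexpSketch {

  /** Whole-string match, mirroring the matches() call used by matchesRegexp. */
  public static int matchesWhole(String text, String pattern) {
    return Pattern.compile(pattern).matcher(text).matches() ? 1 : 0;
  }

  /** Partial match, shown only for contrast; the commit does not use find(). */
  public static int matchesAnywhere(String text, String pattern) {
    return Pattern.compile(pattern).matcher(text).find() ? 1 : 0;
  }

  public static void main(String[] args) {
    System.out.println(matchesWhole("Hello", "Hel"));     // 0: pattern covers only a prefix
    System.out.println(matchesWhole("Hello", "Hel.*"));   // 1: pattern covers the whole text
    System.out.println(matchesAnywhere("Hello", "Hel"));  // 1: a partial match is enough
  }
}
```

With whole-string semantics, a query that should match a fragment needs explicit `.*` on either side of the pattern.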