Changes from all commits
133 commits
c47d079
initial set of changes for local[4] in core
Oct 30, 2016
90d3b91
[SPARK-18103][SQL] Rename *FileCatalog to *FileIndex
ericl Oct 30, 2016
522359b
added mllib changes to local[4]
Oct 30, 2016
8ae2da0
[SPARK-18106][SQL] ANALYZE TABLE should raise a ParseException for in…
dongjoon-hyun Oct 30, 2016
2881a2d
[SPARK-17919] Make timeout to RBackend configurable in SparkR
falaki Oct 30, 2016
b6879b8
[SPARK-16137][SPARKR] randomForest for R
felixcheung Oct 30, 2016
7c37869
[SPARK-18110][PYTHON][ML] add missing parameter in Python for RandomF…
felixcheung Oct 30, 2016
d2923f1
[SPARK-18143][SQL] Ignore Structured Streaming event logs to avoid br…
zsxwing Oct 31, 2016
26b07f1
[BUILD] Close stale Pull Requests.
srowen Oct 31, 2016
8bfc3b7
[SPARK-17972][SQL] Add Dataset.checkpoint() to truncate large query p…
liancheng Oct 31, 2016
de3f87f
[SPARK-18030][TESTS] Fix flaky FileStreamSourceSuite by not deleting …
zsxwing Oct 31, 2016
6633b97
[SPARK-18167][SQL] Also log all partitions when the SQLQuerySuite tes…
ericl Oct 31, 2016
efc254a
[SPARK-18087][SQL] Optimize insert to not require REPAIR TABLE
ericl Nov 1, 2016
7d6c871
[SPARK-18167][SQL] Retry when the SQLQuerySuite test flakes
ericl Nov 1, 2016
d9d1465
[SPARK-18024][SQL] Introduce an internal commit protocol API
rxin Nov 1, 2016
dd85eb5
[SPARK-18107][SQL] Insert overwrite statement runs much slower in spa…
viirya Nov 1, 2016
623fc7f
[MINOR][DOC] Remove spaces following slashs
dongjoon-hyun Nov 1, 2016
cb80edc
[SPARK-18111][SQL] Wrong ApproximatePercentile answer when multiple r…
Nov 1, 2016
e34b4e1
[SPARK-15994][MESOS] Allow enabling Mesos fetch cache in coarse execu…
drcrallen Nov 1, 2016
ec6f479
[SPARK-16881][MESOS] Migrate Mesos configs to use ConfigEntry
techaddict Nov 1, 2016
9b377aa
[SPARK-18114][MESOS] Fix mesos cluster scheduler generage command opt…
Nov 1, 2016
f7c145d
[SPARK-17996][SQL] Fix unqualified catalog.getFunction(...)
hvanhovell Nov 1, 2016
5441a62
[SPARK-16839][SQL] redundant aliases after cleanupAliases
Nov 1, 2016
0cba535
Revert "[SPARK-16839][SQL] redundant aliases after cleanupAliases"
hvanhovell Nov 1, 2016
8ac0910
[SPARK-17848][ML] Move LabelCol datatype cast into Predictor.fit
zhengruifeng Nov 1, 2016
8cdf143
[SPARK-18103][FOLLOW-UP][SQL][MINOR] Rename `MetadataLogFileCatalog` …
lw-lin Nov 1, 2016
8a538c9
[SPARK-18189][SQL] Fix serialization issue in KeyValueGroupedDataset
seyfe Nov 1, 2016
d0272b4
[SPARK-18148][SQL] Misleading Error Message for Aggregation Without W…
jiangxb1987 Nov 1, 2016
cfac17e
[SPARK-18167] Disable flaky SQLQuerySuite test
ericl Nov 1, 2016
01dd008
[SPARK-17764][SQL] Add `to_json` supporting to convert nested struct …
HyukjinKwon Nov 1, 2016
6e62981
[SPARK-17350][SQL] Disable default use of KryoSerializer in Thrift Se…
JoshRosen Nov 1, 2016
b929537
[SPARK-18182] Expose ReplayListenerBus.read() overload which takes st…
JoshRosen Nov 1, 2016
91c33a0
[SPARK-18088][ML] Various ChiSqSelector cleanups
jkbradley Nov 2, 2016
77a9816
[SPARK-18025] Use commit protocol API in structured streaming
rxin Nov 2, 2016
ad4832a
[SPARK-18216][SQL] Make Column.expr public
rxin Nov 2, 2016
1ecfafa
[SPARK-17838][SPARKR] Check named arguments for options and use forma…
HyukjinKwon Nov 2, 2016
1bbf9ff
[SPARK-17992][SQL] Return all partitions from HiveShim when Hive thro…
Nov 2, 2016
620da3b
[SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't u…
frreiss Nov 2, 2016
abefe2e
[SPARK-18183][SPARK-18184] Fix INSERT [INTO|OVERWRITE] TABLE ... PART…
ericl Nov 2, 2016
a36653c
[SPARK-18192] Support all file formats in structured streaming
rxin Nov 2, 2016
85c5424
[SPARK-18144][SQL] logging StreamingQueryListener$QueryStartedEvent
CodingCat Nov 2, 2016
2dc0480
[SPARK-17532] Add lock debugging info to thread dumps.
rdblue Nov 2, 2016
bcbe444
[MINOR] Use <= for clarity in Pi examples' Monte Carlo process
mrydzy Nov 2, 2016
98ede49
[SPARK-18198][DOC][STREAMING] Highlight code snippets
lw-lin Nov 2, 2016
70a5db7
[SPARK-18204][WEBUI] Remove SparkUI.appUIAddress
jaceklaskowski Nov 2, 2016
9c8deef
[SPARK-18076][CORE][SQL] Fix default Locale used in DateFormat, Numbe…
srowen Nov 2, 2016
f151bd1
[SPARK-16839][SQL] Simplify Struct creation code path
Nov 2, 2016
4af0ce2
[SPARK-17683][SQL] Support ArrayType in Literal.apply
maropu Nov 2, 2016
742e0fe
[SPARK-17895] Improve doc for rangeBetween and rowsBetween
david-weiluo-ren Nov 2, 2016
02f2031
[SPARK-14393][SQL] values generated by non-deterministic functions sh…
mengxr Nov 2, 2016
3c24299
[SPARK-18160][CORE][YARN] spark.files & spark.jars should not be pass…
zjffdu Nov 2, 2016
37d9522
[SPARK-17058][BUILD] Add maven snapshots-and-staging profile to build…
steveloughran Nov 2, 2016
fd90541
[SPARK-18214][SQL] Simplify RuntimeReplaceable type coercion
rxin Nov 2, 2016
3a1bc6f
[SPARK-17470][SQL] unify path for data source table and locationUri f…
cloud-fan Nov 3, 2016
7eb2ca8
[SPARK-17963][SQL][DOCUMENTATION] Add examples (extend) in each expre…
HyukjinKwon Nov 3, 2016
9ddec86
[SPARK-18175][SQL] Improve the test case coverage of implicit type ca…
gatorsmile Nov 3, 2016
d24e736
[SPARK-18200][GRAPHX] Support zero as an initial capacity in OpenHashSet
dongjoon-hyun Nov 3, 2016
96cc1b5
[SPARK-17122][SQL] support drop current database
adrian-wang Nov 3, 2016
937af59
[SPARK-18219] Move commit protocol API (internal) from sql/core to co…
rxin Nov 3, 2016
0ea5d5b
[SQL] minor - internal doc improvement for InsertIntoTable.
rxin Nov 3, 2016
9dc9f9a
[SPARK-18177][ML][PYSPARK] Add missing 'subsamplingRate' of pyspark G…
zhengruifeng Nov 3, 2016
66a99f4
[SPARK-17981][SPARK-17957][SQL] Fix Incorrect Nullability Setting to …
gatorsmile Nov 3, 2016
27daf6b
[SPARK-17949][SQL] A JVM object based aggregate operator
liancheng Nov 3, 2016
b17057c
[SPARK-18244][SQL] Rename partitionProviderIsHive -> tracksPartitions…
rxin Nov 3, 2016
1629331
[SPARK-18237][HIVE] hive.exec.stagingdir have no effect
Nov 3, 2016
098e4ca
[SPARK-18099][YARN] Fail if same files added to distributed cache for…
kishorvpatil Nov 3, 2016
cf36e3a
added local[4] to repl,sparksql,streaming, all tests pass
Nov 3, 2016
67659c9
[SPARK-18212][SS][KAFKA] increase executor poll timeout
koeninger Nov 3, 2016
e892025
[SPARKR][TEST] remove unnecessary suppressWarnings
wangmiao1981 Nov 3, 2016
f22954a
[SPARK-18257][SS] Improve error reporting for FileStressSuite
rxin Nov 3, 2016
dc4c600
[SPARK-18138][DOCS] Document that Java 7, Python 2.6, Scala 2.10, Had…
srowen Nov 4, 2016
aa412c5
[SPARK-18259][SQL] Do not capture Throwable in QueryExecution
hvanhovell Nov 4, 2016
a08463b
[SPARK-14393][SQL][DOC] update doc for python and R
felixcheung Nov 4, 2016
27602c3
[SPARK-18200][GRAPHX][FOLLOW-UP] Support zero as an initial capacity …
dongjoon-hyun Nov 4, 2016
14f235d
Closing some stale/invalid pull requests
rxin Nov 4, 2016
a42d738
[SPARK-18197][CORE] Optimise AppendOnlyMap implementation
a-roberts Nov 4, 2016
550cd56
[SPARK-17337][SQL] Do not pushdown predicates through filters with p…
hvanhovell Nov 4, 2016
4cee2ce
[SPARK-18167] Re-enable the non-flaky parts of SQLQuerySuite
ericl Nov 4, 2016
0e3312e
[SPARK-18256] Improve the performance of event log replay in HistoryS…
JoshRosen Nov 5, 2016
0f7c9e8
[SPARK-18189] [SQL] [Followup] Move test from ReplSuite to prevent ja…
rxin Nov 5, 2016
8a9ca19
[SPARK-17710][FOLLOW UP] Add comments to state why 'Utils.classForNam…
weiqingy Nov 5, 2016
6e27018
[SPARK-18260] Make from_json null safe
brkyvz Nov 5, 2016
95ec4e2
[SPARK-17183][SPARK-17983][SPARK-18101][SQL] put hive serde table sch…
cloud-fan Nov 5, 2016
e2648d3
[SPARK-18287][SQL] Move hash expressions from misc.scala into hash.scala
rxin Nov 5, 2016
a87471c
[SPARK-18192][MINOR][FOLLOWUP] Missed json test in FileStreamSinkSuite
HyukjinKwon Nov 5, 2016
fb0d608
[SPARK-17849][SQL] Fix NPE problem when using grouping sets
Nov 5, 2016
9a87c31
[SPARK-17964][SPARKR] Enable SparkR with Mesos client mode and cluste…
susanxhuynh Nov 5, 2016
15d3926
[MINOR][DOCUMENTATION] Fix some minor descriptions in functions consi…
HyukjinKwon Nov 6, 2016
23ce0d1
[SPARK-18276][ML] ML models should copy the training summary and set …
sethah Nov 6, 2016
340f09d
[SPARK-17854][SQL] rand/randn allows null/long as input seed
HyukjinKwon Nov 6, 2016
b89d055
[SPARK-18210][ML] Pipeline.copy does not create an instance with the …
wojtek-szymanski Nov 6, 2016
556a3b7
[SPARK-18269][SQL] CSV datasource should read null properly when sche…
HyukjinKwon Nov 7, 2016
46b2e49
[SPARK-18173][SQL] data source tables should support truncating parti…
cloud-fan Nov 7, 2016
07ac3f0
[SPARK-18167][SQL] Disable flaky hive partition pruning test.
rxin Nov 7, 2016
9db06c4
[SPARK-18296][SQL] Use consistent naming for expression test suites
rxin Nov 7, 2016
57626a5
[SPARK-16904][SQL] Removal of Hive Built-in Hash Functions and TestHi…
gatorsmile Nov 7, 2016
a814eea
[SPARK-18125][SQL] Fix a compilation error in codegen due to splitExp…
viirya Nov 7, 2016
daa975f
[SPARK-18291][SPARKR][ML] SparkR glm predict should output original l…
yanboliang Nov 7, 2016
b06c23d
[SPARK-18283][STRUCTURED STREAMING][KAFKA] Added test to check whethe…
tdas Nov 7, 2016
0d95662
[SPARK-17108][SQL] Fix BIGINT and INT comparison failure in spark sql
weiqingy Nov 7, 2016
8f0ea01
[SPARK-14914][CORE] Fix Resource not closed after using, mostly for u…
HyukjinKwon Nov 7, 2016
19cf208
[SPARK-17490][SQL] Optimize SerializeFromObject() for a primitive array
kiszk Nov 7, 2016
3a710b9
[SPARK-18236] Reduce duplicate objects in Spark UI and HistoryServer
JoshRosen Nov 8, 2016
3eda057
[SPARK-18295][SQL] Make to_json function null safe (matching it to fr…
HyukjinKwon Nov 8, 2016
9b0593d
[SPARK-18086] Add support for Hive session vars.
rdblue Nov 8, 2016
c1a0c66
[SPARK-18261][STRUCTURED STREAMING] Add statistics to MemorySink for …
lw-lin Nov 8, 2016
1da64e1
[SPARK-18217][SQL] Disallow creating permanent views based on tempora…
gatorsmile Nov 8, 2016
6f36971
[SPARK-16575][CORE] partition calculation mismatch with sc.binaryFiles
fidato13 Nov 8, 2016
47731e1
[SPARK-18207][SQL] Fix a compilation error due to HashExpression.doGe…
kiszk Nov 8, 2016
c291bd2
[SPARK-18137][SQL] Fix RewriteDistinctAggregates UnresolvedException …
Nov 8, 2016
ee2e741
[SPARK-13770][DOCUMENTATION][ML] Document the ML feature Interaction
Nov 8, 2016
b1033fb
[MINOR][DOC] Unify example marks
zhengruifeng Nov 8, 2016
344dcad
[SPARK-17868][SQL] Do not use bitmasks during parsing and analysis of…
jiangxb1987 Nov 8, 2016
73feaa3
[SPARK-18346][SQL] TRUNCATE TABLE should fail if no partition is matc…
cloud-fan Nov 8, 2016
9c41969
[SPARK-18191][CORE] Port RDD API to use commit protocol
jiangxb1987 Nov 8, 2016
245e5a2
[SPARK-18357] Fix yarn files/archive broken issue andd unit tests
kishorvpatil Nov 8, 2016
26e1c53
[SPARK-17748][ML] Minor cleanups to one-pass linear regression with e…
jkbradley Nov 8, 2016
b6de0c9
[SPARK-18280][CORE] Fix potential deadlock in `StandaloneSchedulerBac…
zsxwing Nov 8, 2016
6f7ecb0
[SPARK-18342] Make rename failures fatal in HDFSBackedStateStore
brkyvz Nov 8, 2016
55964c1
[SPARK-18239][SPARKR] Gradient Boosted Tree for R
felixcheung Nov 9, 2016
4afa39e
[SPARK-18333][SQL] Revert hacks in parquet and orc reader to support …
ericl Nov 9, 2016
b9192bb
[SPARK-18368] Fix regexp_replace with task serialization.
rdblue Nov 9, 2016
e256392
[SPARK-17659][SQL] Partitioned View is Not Supported By SHOW CREATE T…
gatorsmile Nov 9, 2016
02c5325
[SPARK-18292][SQL] LogicalPlanToSQLSuite should not use resource depe…
dongjoon-hyun Nov 9, 2016
205e6d5
[SPARK-18338][SQL][TEST-MAVEN] Fix test case initialization order und…
liancheng Nov 9, 2016
06a13ec
[SPARK-16808][CORE] History Server main page does not honor APPLICATI…
vijoshi Nov 9, 2016
4763661
Revert "[SPARK-18368] Fix regexp_replace with task serialization."
yhuai Nov 9, 2016
d4028de
[SPARK-18368][SQL] Fix regexp replace when serialized
rdblue Nov 9, 2016
ec0b2a8
added streaming, repl to unit test runs, changed to local[4]
Nov 9, 2016
d8b81f7
[SPARK-18370][SQL] Add table information to InsertIntoHadoopFsRelatio…
hvanhovell Nov 9, 2016
64fbdf1
[SPARK-18191][CORE][FOLLOWUP] Call `setConf` if `OutputFormat` is `Co…
jiangxb1987 Nov 9, 2016
3f62e1b
[SPARK-17829][SQL] Stable format for offset log
Nov 9, 2016
39912d9
fixed merge conflicts
Nov 10, 2016
14 changes: 12 additions & 2 deletions R/pkg/NAMESPACE
@@ -44,7 +44,9 @@ exportMethods("glm",
"spark.gaussianMixture",
"spark.als",
"spark.kstest",
"spark.logit")
"spark.logit",
"spark.randomForest",
"spark.gbt")

# Job group lifecycle management methods
export("setJobGroup",
@@ -350,7 +352,11 @@ export("as.DataFrame",
"uncacheTable",
"print.summary.GeneralizedLinearRegressionModel",
"read.ml",
"print.summary.KSTest")
"print.summary.KSTest",
"print.summary.RandomForestRegressionModel",
"print.summary.RandomForestClassificationModel",
"print.summary.GBTRegressionModel",
"print.summary.GBTClassificationModel")

export("structField",
"structField.jobj",
@@ -375,6 +381,10 @@ S3method(print, structField)
S3method(print, structType)
S3method(print, summary.GeneralizedLinearRegressionModel)
S3method(print, summary.KSTest)
S3method(print, summary.RandomForestRegressionModel)
S3method(print, summary.RandomForestClassificationModel)
S3method(print, summary.GBTRegressionModel)
S3method(print, summary.GBTClassificationModel)
S3method(structField, character)
S3method(structField, jobj)
S3method(structType, jobj)
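
For context, the new exports above back the spark.randomForest and spark.gbt wrappers added by SPARK-16137 and SPARK-18239. A minimal usage sketch follows; it is illustrative only, modelled on the SparkR examples rather than taken from this diff:

# Illustrative sketch: fit, summarize, and predict with the newly exported wrappers.
df <- createDataFrame(longley)
rfModel <- spark.randomForest(df, Employed ~ ., type = "regression", maxDepth = 5, maxBins = 16)
summary(rfModel)            # printed via print.summary.RandomForestRegressionModel
head(predict(rfModel, df))

gbtModel <- spark.gbt(df, Employed ~ ., type = "regression", maxDepth = 5, maxBins = 16)
summary(gbtModel)           # printed via print.summary.GBTRegressionModel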
10 changes: 5 additions & 5 deletions R/pkg/R/DataFrame.R
@@ -788,7 +788,7 @@ setMethod("write.json",
function(x, path, mode = "error", ...) {
write <- callJMethod(x@sdf, "write")
write <- setWriteOptions(write, mode = mode, ...)
invisible(callJMethod(write, "json", path))
invisible(handledCallJMethod(write, "json", path))
})

#' Save the contents of SparkDataFrame as an ORC file, preserving the schema.
@@ -819,7 +819,7 @@ setMethod("write.orc",
function(x, path, mode = "error", ...) {
write <- callJMethod(x@sdf, "write")
write <- setWriteOptions(write, mode = mode, ...)
invisible(callJMethod(write, "orc", path))
invisible(handledCallJMethod(write, "orc", path))
})

#' Save the contents of SparkDataFrame as a Parquet file, preserving the schema.
@@ -851,7 +851,7 @@ setMethod("write.parquet",
function(x, path, mode = "error", ...) {
write <- callJMethod(x@sdf, "write")
write <- setWriteOptions(write, mode = mode, ...)
invisible(callJMethod(write, "parquet", path))
invisible(handledCallJMethod(write, "parquet", path))
})

#' @rdname write.parquet
@@ -895,7 +895,7 @@ setMethod("write.text",
function(x, path, mode = "error", ...) {
write <- callJMethod(x@sdf, "write")
write <- setWriteOptions(write, mode = mode, ...)
invisible(callJMethod(write, "text", path))
invisible(handledCallJMethod(write, "text", path))
})

#' Distinct
@@ -3342,7 +3342,7 @@ setMethod("write.jdbc",
jprops <- varargsToJProperties(...)
write <- callJMethod(x@sdf, "write")
write <- callJMethod(write, "mode", jmode)
invisible(callJMethod(write, "jdbc", url, tableName, jprops))
invisible(handledCallJMethod(write, "jdbc", url, tableName, jprops))
})

#' randomSplit
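
The only change in this file is swapping callJMethod for handledCallJMethod on the write paths, so JVM-side failures reach the R user as readable error messages instead of raw invokeJava failures. The public writer API is untouched; a hedged sketch of the unchanged call pattern (path and error text are illustrative):

# Illustrative sketch: the writer API is unchanged; only error reporting differs.
df <- createDataFrame(mtcars)
write.json(df, "/tmp/cars_json")              # first write succeeds
tryCatch(write.json(df, "/tmp/cars_json"),    # default mode = "error": path already exists
         error = function(e) message(conditionMessage(e)))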
17 changes: 9 additions & 8 deletions R/pkg/R/SQLContext.R
@@ -350,7 +350,7 @@ read.json.default <- function(path, ...) {
paths <- as.list(suppressWarnings(normalizePath(path)))
read <- callJMethod(sparkSession, "read")
read <- callJMethod(read, "options", options)
sdf <- callJMethod(read, "json", paths)
sdf <- handledCallJMethod(read, "json", paths)
dataFrame(sdf)
}

@@ -422,7 +422,7 @@ read.orc <- function(path, ...) {
path <- suppressWarnings(normalizePath(path))
read <- callJMethod(sparkSession, "read")
read <- callJMethod(read, "options", options)
sdf <- callJMethod(read, "orc", path)
sdf <- handledCallJMethod(read, "orc", path)
dataFrame(sdf)
}

@@ -444,7 +444,7 @@ read.parquet.default <- function(path, ...) {
paths <- as.list(suppressWarnings(normalizePath(path)))
read <- callJMethod(sparkSession, "read")
read <- callJMethod(read, "options", options)
sdf <- callJMethod(read, "parquet", paths)
sdf <- handledCallJMethod(read, "parquet", paths)
dataFrame(sdf)
}

@@ -496,7 +496,7 @@ read.text.default <- function(path, ...) {
paths <- as.list(suppressWarnings(normalizePath(path)))
read <- callJMethod(sparkSession, "read")
read <- callJMethod(read, "options", options)
sdf <- callJMethod(read, "text", paths)
sdf <- handledCallJMethod(read, "text", paths)
dataFrame(sdf)
}

@@ -914,12 +914,13 @@ read.jdbc <- function(url, tableName,
} else {
numPartitions <- numToInt(numPartitions)
}
sdf <- callJMethod(read, "jdbc", url, tableName, as.character(partitionColumn),
numToInt(lowerBound), numToInt(upperBound), numPartitions, jprops)
sdf <- handledCallJMethod(read, "jdbc", url, tableName, as.character(partitionColumn),
numToInt(lowerBound), numToInt(upperBound), numPartitions, jprops)
} else if (length(predicates) > 0) {
sdf <- callJMethod(read, "jdbc", url, tableName, as.list(as.character(predicates)), jprops)
sdf <- handledCallJMethod(read, "jdbc", url, tableName, as.list(as.character(predicates)),
jprops)
} else {
sdf <- callJMethod(read, "jdbc", url, tableName, jprops)
sdf <- handledCallJMethod(read, "jdbc", url, tableName, jprops)
}
dataFrame(sdf)
}
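
All three read.jdbc branches above (partitioned read, predicate read, plain read) now go through handledCallJMethod as well. An illustrative call for each branch, with a hypothetical URL, table, and credentials:

# Illustrative sketch only: url, table, and credentials are hypothetical.
url <- "jdbc:postgresql://db.example.com/sales"
byRange <- read.jdbc(url, "orders", partitionColumn = "order_id",
                     lowerBound = 0, upperBound = 1000000, numPartitions = 8,
                     user = "reader", password = "secret")
byPredicate <- read.jdbc(url, "orders", predicates = list("region = 'EU'"),
                         user = "reader", password = "secret")
wholeTable <- read.jdbc(url, "orders", user = "reader", password = "secret")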
20 changes: 17 additions & 3 deletions R/pkg/R/backend.R
@@ -108,13 +108,27 @@ invokeJava <- function(isStatic, objId, methodName, ...) {
conn <- get(".sparkRCon", .sparkREnv)
writeBin(requestMessage, conn)

# TODO: check the status code to output error information
returnStatus <- readInt(conn)
handleErrors(returnStatus, conn)

# Backend will send +1 as keep alive value to prevent various connection timeouts
# on very long running jobs. See spark.r.heartBeatInterval
while (returnStatus == 1) {
returnStatus <- readInt(conn)
handleErrors(returnStatus, conn)
}

readObject(conn)
}

# Helper function to check for returned errors and print appropriate error message to user
handleErrors <- function(returnStatus, conn) {
if (length(returnStatus) == 0) {
stop("No status is returned. Java SparkR backend might have failed.")
}
if (returnStatus != 0) {

# 0 is success and +1 is reserved for heartbeats. Other negative values indicate errors.
if (returnStatus < 0) {
stop(readString(conn))
}
readObject(conn)
}
2 changes: 1 addition & 1 deletion R/pkg/R/client.R
@@ -19,7 +19,7 @@

# Creates a SparkR client connection object
# if one doesn't already exist
connectBackend <- function(hostname, port, timeout = 6000) {
connectBackend <- function(hostname, port, timeout) {
if (exists(".sparkRcon", envir = .sparkREnv)) {
if (isOpen(.sparkREnv[[".sparkRCon"]])) {
cat("SparkRBackend client connection already exists\n")
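
Taken together, the backend.R and client.R changes make the connection timeout a caller-supplied value and add the +1 heartbeat handling, both driven by configuration introduced in SPARK-17919. A hedged sketch of setting those options; key names and defaults are as documented for SPARK-17919 and should be verified against the Spark configuration docs:

# Illustrative sketch: values shown are the documented defaults (in seconds).
sparkR.session(sparkConfig = list(
  spark.r.backendConnectionTimeout = "6000",  # idle-connection timeout used by connectBackend
  spark.r.heartBeatInterval = "100"           # interval between the +1 keep-alive beats above
))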
24 changes: 14 additions & 10 deletions R/pkg/R/functions.R
@@ -1485,7 +1485,7 @@ setMethod("soundex",

#' Return the partition ID as a column
#'
#' Return the partition ID of the Spark task as a SparkDataFrame column.
#' Return the partition ID as a SparkDataFrame column.
#' Note that this is nondeterministic because it depends on data partitioning and
#' task scheduling.
#'
@@ -2317,7 +2317,8 @@ setMethod("date_format", signature(y = "Column", x = "character"),

#' from_utc_timestamp
#'
#' Assumes given timestamp is UTC and converts to given timezone.
#' Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp
#' that corresponds to the same time of day in the given timezone.
#'
#' @param y Column to compute on.
#' @param x time zone to use.
@@ -2340,7 +2341,7 @@ setMethod("from_utc_timestamp", signature(y = "Column", x = "character"),
#' Locate the position of the first occurrence of substr column in the given string.
#' Returns null if either of the arguments are null.
#'
#' NOTE: The position is not zero based, but 1 based index, returns 0 if substr
#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
#' could not be found in str.
#'
#' @param y column to check
@@ -2391,7 +2392,8 @@ setMethod("next_day", signature(y = "Column", x = "character"),

#' to_utc_timestamp
#'
#' Assumes given timestamp is in given timezone and converts to UTC.
#' Given a timestamp, which corresponds to a certain time of day in the given timezone, returns
#' another timestamp that corresponds to the same time of day in UTC.
#'
#' @param y Column to compute on
#' @param x timezone to use
@@ -2539,7 +2541,7 @@ setMethod("shiftLeft", signature(y = "Column", x = "numeric"),

#' shiftRight
#'
#' Shift the given value numBits right. If the given value is a long value, it will return
#' (Signed) shift the given value numBits right. If the given value is a long value, it will return
#' a long value else it will return an integer value.
#'
#' @param y column to compute on.
@@ -2777,7 +2779,7 @@ setMethod("window", signature(x = "Column"),
#' locate
#'
#' Locate the position of the first occurrence of substr.
#' NOTE: The position is not zero based, but 1 based index, returns 0 if substr
#' NOTE: The position is not zero based, but 1 based index. Returns 0 if substr
#' could not be found in str.
#'
#' @param substr a character string to be matched.
@@ -2823,7 +2825,8 @@ setMethod("lpad", signature(x = "Column", len = "numeric", pad = "character"),

#' rand
#'
#' Generate a random column with i.i.d. samples from U[0.0, 1.0].
#' Generate a random column with independent and identically distributed (i.i.d.) samples
#' from U[0.0, 1.0].
#'
#' @param seed a random seed. Can be missing.
#' @family normal_funcs
@@ -2852,7 +2855,8 @@ setMethod("rand", signature(seed = "numeric"),

#' randn
#'
#' Generate a column with i.i.d. samples from the standard normal distribution.
#' Generate a column with independent and identically distributed (i.i.d.) samples from
#' the standard normal distribution.
#'
#' @param seed a random seed. Can be missing.
#' @family normal_funcs
@@ -3442,8 +3446,8 @@ setMethod("size",

#' sort_array
#'
#' Sorts the input array for the given column in ascending order,
#' according to the natural ordering of the array elements.
#' Sorts the input array in ascending or descending order according
#' to the natural ordering of the array elements.
#'
#' @param x A Column to sort
#' @param asc A logical flag indicating the sorting order.
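
The doc rewrites above mostly tighten wording; from_utc_timestamp and to_utc_timestamp get the largest clarification. A small illustrative example of the clarified semantics (how the result is rendered depends on the session time zone):

# Illustrative sketch: same instant, re-expressed as wall-clock time in another zone.
df <- createDataFrame(data.frame(t = as.POSIXct("2016-11-09 12:00:00", tz = "UTC")))
head(select(df,
            from_utc_timestamp(df$t, "America/Los_Angeles"),  # treat t as UTC, show LA wall-clock time
            to_utc_timestamp(df$t, "America/Los_Angeles")))   # treat t as LA wall-clock time, show UTC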
70 changes: 40 additions & 30 deletions R/pkg/R/generics.R
@@ -1310,9 +1310,11 @@ setGeneric("window", function(x, ...) { standardGeneric("window") })
#' @export
setGeneric("year", function(x) { standardGeneric("year") })

#' @rdname spark.glm
###################### Spark.ML Methods ##########################

#' @rdname fitted
#' @export
setGeneric("spark.glm", function(data, formula, ...) { standardGeneric("spark.glm") })
setGeneric("fitted")

#' @param x,y For \code{glm}: logical values indicating whether the response vector
#' and model matrix used in the fitting process should be returned as
@@ -1332,13 +1334,42 @@ setGeneric("predict", function(object, ...) { standardGeneric("predict") })
#' @export
setGeneric("rbind", signature = "...")

#' @rdname spark.als
#' @export
setGeneric("spark.als", function(data, ...) { standardGeneric("spark.als") })

#' @rdname spark.gaussianMixture
#' @export
setGeneric("spark.gaussianMixture",
function(data, formula, ...) { standardGeneric("spark.gaussianMixture") })

#' @rdname spark.gbt
#' @export
setGeneric("spark.gbt", function(data, formula, ...) { standardGeneric("spark.gbt") })

#' @rdname spark.glm
#' @export
setGeneric("spark.glm", function(data, formula, ...) { standardGeneric("spark.glm") })

#' @rdname spark.isoreg
#' @export
setGeneric("spark.isoreg", function(data, formula, ...) { standardGeneric("spark.isoreg") })

#' @rdname spark.kmeans
#' @export
setGeneric("spark.kmeans", function(data, formula, ...) { standardGeneric("spark.kmeans") })

#' @rdname fitted
#' @rdname spark.kstest
#' @export
setGeneric("fitted")
setGeneric("spark.kstest", function(data, ...) { standardGeneric("spark.kstest") })

#' @rdname spark.lda
#' @export
setGeneric("spark.lda", function(data, ...) { standardGeneric("spark.lda") })

#' @rdname spark.logit
#' @export
setGeneric("spark.logit", function(data, formula, ...) { standardGeneric("spark.logit") })

#' @rdname spark.mlp
#' @export
@@ -1348,13 +1379,14 @@ setGeneric("spark.mlp", function(data, ...) { standardGeneric("spark.mlp") })
#' @export
setGeneric("spark.naiveBayes", function(data, formula, ...) { standardGeneric("spark.naiveBayes") })

#' @rdname spark.survreg
#' @rdname spark.randomForest
#' @export
setGeneric("spark.survreg", function(data, formula) { standardGeneric("spark.survreg") })
setGeneric("spark.randomForest",
function(data, formula, ...) { standardGeneric("spark.randomForest") })

#' @rdname spark.lda
#' @rdname spark.survreg
#' @export
setGeneric("spark.lda", function(data, ...) { standardGeneric("spark.lda") })
setGeneric("spark.survreg", function(data, formula) { standardGeneric("spark.survreg") })

#' @rdname spark.lda
#' @export
@@ -1364,32 +1396,10 @@ setGeneric("spark.posterior", function(object, newData) { standardGeneric("spark
#' @export
setGeneric("spark.perplexity", function(object, data) { standardGeneric("spark.perplexity") })

#' @rdname spark.isoreg
#' @export
setGeneric("spark.isoreg", function(data, formula, ...) { standardGeneric("spark.isoreg") })

#' @rdname spark.gaussianMixture
#' @export
setGeneric("spark.gaussianMixture",
function(data, formula, ...) {
standardGeneric("spark.gaussianMixture")
})

#' @rdname spark.logit
#' @export
setGeneric("spark.logit", function(data, formula, ...) { standardGeneric("spark.logit") })

#' @param object a fitted ML model object.
#' @param path the directory where the model is saved.
#' @param ... additional argument(s) passed to the method.
#' @rdname write.ml
#' @export
setGeneric("write.ml", function(object, path, ...) { standardGeneric("write.ml") })

#' @rdname spark.als
#' @export
setGeneric("spark.als", function(data, ...) { standardGeneric("spark.als") })

#' @rdname spark.kstest
#' @export
setGeneric("spark.kstest", function(data, ...) { standardGeneric("spark.kstest") })