Changes from all commits (1090 commits)
9041223
[TRIVIAL][DOCS][STREAMING][SQL] The return type mentioned in the Java…
holdenk Jun 29, 2016
1b4d63f
[SPARK-16291][SQL] CheckAnalysis should capture nested aggregate func…
liancheng Jun 29, 2016
ba71cf4
[SPARK-16261][EXAMPLES][ML] Fixed incorrect appNames in ML Examples
BryanCutler Jun 29, 2016
d96e8c2
[MINOR][SPARKR] Fix arguments of survreg in SparkR
yanboliang Jun 29, 2016
1cde325
[SPARK-16140][MLLIB][SPARKR][DOCS] Group k-means method in generated …
keypointt Jun 29, 2016
edd1905
[SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in Data…
gatorsmile Jun 29, 2016
3cc258e
[SPARK-16256][SQL][STREAMING] Added Structured Streaming Programming …
tdas Jun 29, 2016
809af6d
[TRIVIAL] [PYSPARK] Clean up orc compression option as well
HyukjinKwon Jun 29, 2016
a7f66ef
[SPARK-16301] [SQL] The analyzer rule for resolving using joins shoul…
yhuai Jun 29, 2016
ef0253f
[SPARK-16006][SQL] Attemping to write empty DataFrame with no fields …
dongjoon-hyun Jun 29, 2016
c4cebd5
[SPARK-16238] Metrics for generated method and class bytecode size
ericl Jun 29, 2016
011befd
[SPARK-16228][SQL] HiveSessionCatalog should return `double`-param fu…
dongjoon-hyun Jun 29, 2016
8da4314
[SPARK-16134][SQL] optimizer rules for typed filter
cloud-fan Jun 30, 2016
e1bdf1e
Revert "[SPARK-16134][SQL] optimizer rules for typed filter"
liancheng Jun 30, 2016
b52bd80
[SPARK-16267][TEST] Replace deprecated `CREATE TEMPORARY TABLE ... US…
dongjoon-hyun Jun 30, 2016
a548523
[SPARK-16294][SQL] Labelling support for the include_example Jekyll p…
liancheng Jun 30, 2016
3134f11
[SPARK-12177][STREAMING][KAFKA] Update KafkaDStreams to new Kafka 0.1…
koeninger Jun 30, 2016
c8a7c23
[SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programmi…
tdas Jun 30, 2016
1d27445
[SPARK-16241][ML] model loading backward compatibility for ml NaiveBayes
zlpmichelle Jun 30, 2016
6a4f4c1
[SPARK-12177][TEST] Removed test to avoid compilation issue in scala …
tdas Jun 30, 2016
56207fc
[SPARK-16071][SQL] Checks size limit when doubling the array size in …
clockfly Jun 30, 2016
98056a1
[BUILD] Fix version in poms related to kafka-0-10
tdas Jun 30, 2016
f17ffef
[SPARK-13850] Force the sorter to Spill when number of elements in th…
Jun 30, 2016
03008e0
[SPARK-16256][DOCS] Fix window operation diagram
tdas Jun 30, 2016
4dc7d37
[SPARK-16336][SQL] Suggest doing table refresh upon FileNotFoundExcep…
petermaxlee Jun 30, 2016
17c7522
[SPARK-16313][SQL] Spark should not silently drop exceptions in file …
rxin Jun 30, 2016
d3027c4
[SPARK-16328][ML][MLLIB][PYSPARK] Add 'asML' and 'fromML' conversion …
Jul 1, 2016
79c96c9
[SPARK-15643][DOC][ML] Add breaking changes to ML migration guide
Jul 1, 2016
94d61de
[SPARK-15954][SQL] Disable loading test tables in Python tests
rxin Jul 1, 2016
80a7bff
[SPARK-15820][PYSPARK][SQL] Add Catalog.refreshTable into python API
WeichenXu123 Jun 30, 2016
cc3c44b
[SPARK-14608][ML] transformSchema needs better documentation
hhbyyh Jul 1, 2016
1932bb6
[SPARK-12177][STREAMING][KAFKA] limit api surface area
koeninger Jul 1, 2016
972106d
[SPARK-16182][CORE] Utils.scala -- terminateProcess() should call Pro…
srowen Jul 1, 2016
0b64543
[SPARK-15761][MLLIB][PYSPARK] Load ipython when default python is Pyt…
MechCoder Jul 1, 2016
3665927
[SPARK-16222][SQL] JDBC Sources - Handling illegal input values for `…
gatorsmile Jul 1, 2016
4c96ded
[SPARK-16012][SPARKR] Implement gapplyCollect which will apply a R fu…
Jul 1, 2016
d658811
[SPARK-16299][SPARKR] Capture errors from R workers in daemon.R to av…
sun-rui Jul 1, 2016
78387ce
[SPARK-16335][SQL] Structured streaming should fail if source directo…
rxin Jul 1, 2016
794d099
[SPARK-16233][R][TEST] ORC test should be enabled only when HiveConte…
dongjoon-hyun Jul 1, 2016
ab43038
[SPARK-16095][YARN] Yarn cluster mode should report correct state to …
renozhang Jul 1, 2016
f3a3599
[GRAPHX][EXAMPLES] move graphx test data directory and update graphx …
WeichenXu123 Jul 2, 2016
0d0b416
[SPARK-16345][DOCUMENTATION][EXAMPLES][GRAPHX] Extract graphx program…
WeichenXu123 Jul 2, 2016
0c6fd03
[MINOR][BUILD] Fix Java linter errors
dongjoon-hyun Jul 2, 2016
3ecee57
[SPARK-16260][ML][EXAMPLE] PySpark ML Example Improvements and Cleanup
wangmiao1981 Jul 4, 2016
ecbb447
[MINOR][DOCS] Remove unused images; crush PNGs that could use it for …
srowen Jul 4, 2016
d5683a7
[SPARK-16329][SQL][BACKPORT-2.0] Star Expansion over Table Containing…
gatorsmile Jul 4, 2016
cc100ab
[SPARK-16353][BUILD][DOC] Missing javadoc options for java unidoc
Jul 4, 2016
0754ccb
[SPARK-16311][SQL] Metadata refresh should work on temporary views
rxin Jul 5, 2016
cabee23
[SPARK-16212][STREAMING][KAFKA] use random port for embedded kafka
koeninger Jul 5, 2016
9c1596b
[SPARK-15730][SQL] Respect the --hiveconf in the spark-sql command line
chenghao-intel Jul 5, 2016
801fb79
[SPARK-16359][STREAMING][KAFKA] unidoc skip kafka 0.10
koeninger Jul 5, 2016
a2ef13a
[SPARK-16385][CORE] Catch correct exception when calling method via r…
Jul 5, 2016
0fe2a8c
[SPARK-16348][ML][MLLIB][PYTHON] Use full classpaths for pyspark ML J…
jkbradley Jul 6, 2016
4a55b23
Preparing Spark release v2.0.0-rc2
pwendell Jul 6, 2016
6e8fa86
Preparing development version 2.0.1-SNAPSHOT
pwendell Jul 6, 2016
521fc71
[SPARK-16339][CORE] ScriptTransform does not print stderr when outstr…
tejasapatil Jul 6, 2016
25006c8
[SPARK-16249][ML] Change visibility of Object ml.clustering.LDA to pu…
YY-OnCall Jul 6, 2016
d5d2457
[SPARK-15968][SQL] Nonempty partitioned metastore tables are not cached
rxin Jul 6, 2016
e956bd7
[SPARK-16229][SQL] Drop Empty Table After CREATE TABLE AS SELECT fails
gatorsmile Jul 6, 2016
091cd5f
[DOC][SQL] update out-of-date code snippets using SQLContext in all d…
WeichenXu123 Jul 6, 2016
03f336d
[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark docum…
HyukjinKwon Jul 6, 2016
2465f07
[SPARK-16371][SQL] Do not push down filters incorrectly when inner na…
HyukjinKwon Jul 6, 2016
d7926da
[SPARK-15740][MLLIB] Word2VecSuite "big model load / save" caused OOM…
tmnd1991 Jul 6, 2016
88be66b
[SPARK-16379][CORE][MESOS] Spark on mesos is broken due to race condi…
srowen Jul 6, 2016
2c2b8f1
[MESOS] expand coarse-grained mode docs
Jul 6, 2016
05ddc75
[SPARK-16371][SQL] Two follow-up tasks
rxin Jul 6, 2016
920162a
[SPARK-16212][STREAMING][KAFKA] apply test tweaks from 0-10 to 0-8 as…
koeninger Jul 6, 2016
d63428a
[SPARK-16368][SQL] Fix Strange Errors When Creating View With Unmatch…
gatorsmile Jul 7, 2016
2493335
[SPARK-16372][MLLIB] Retag RDD to tallSkinnyQR of RowMatrix
yinxusen Jul 7, 2016
cbfd94e
[SPARK-16350][SQL] Fix support for incremental planning in wirteStrea…
lw-lin Jul 7, 2016
30cb3f1
[SPARK-16415][SQL] fix catalog string error
adrian-wang Jul 7, 2016
5828da4
[SPARK-16310][SPARKR] R na.string-like default for csv source
felixcheung Jul 7, 2016
73c764a
[SPARK-16425][R] `describe()` should not fail with non-numeric columns
dongjoon-hyun Jul 8, 2016
88603bd
[SPARK-16276][SQL] Implement elt SQL function
petermaxlee Jun 30, 2016
7ef1d1c
[SPARK-16278][SPARK-16279][SQL] Implement map_keys/map_values SQL fun…
dongjoon-hyun Jul 3, 2016
a049754
[SPARK-16289][SQL] Implement posexplode table generating function
dongjoon-hyun Jun 30, 2016
144aa84
[SPARK-16271][SQL] Implement Hive's UDFXPathUtil
petermaxlee Jun 29, 2016
bb4b041
[SPARK-16274][SQL] Implement xpath_boolean
petermaxlee Jun 30, 2016
e32c29d
[SPARK-16288][SQL] Implement inline table generating function
dongjoon-hyun Jul 3, 2016
565e18c
[SPARK-16286][SQL] Implement stack table generating function
dongjoon-hyun Jul 6, 2016
18ace01
[SPARK-16430][SQL][STREAMING] Add option maxFilesPerTrigger
tdas Jul 8, 2016
221a4a7
[SPARK-16285][SQL] Implement sentences SQL functions
dongjoon-hyun Jul 8, 2016
8c81806
[SPARK-16369][MLLIB] tallSkinnyQR of RowMatrix should aware of empty …
yinxusen Jul 8, 2016
8dee2ec
[SPARK-13638][SQL] Add quoteAll option to CSV DataFrameWriter
jurriaan Jul 8, 2016
0e9333b
[SPARK-16420] Ensure compression streams are closed.
rdblue Jul 8, 2016
e3424fd
[SPARK-16281][SQL] Implement parse_url SQL function
janplus Jul 8, 2016
07f562f
[SPARK-16453][BUILD] release-build.sh is missing hive-thriftserver fo…
yhuai Jul 8, 2016
463cbf7
[SPARK-16387][SQL] JDBC Writer should use dialect to quote field names.
dongjoon-hyun Jul 8, 2016
c425230
[SPARK-13569][STREAMING][KAFKA] pattern based topic subscription
koeninger Jul 9, 2016
16202ba
[SPARK-16376][WEBUI][SPARK WEB UI][APP-ID] HTTP ERROR 500 when using …
srowen Jul 9, 2016
5024c4c
[SPARK-16432] Empty blocks fail to serialize due to assert in Chunked…
ericl Jul 9, 2016
50d7002
[SPARK-11857][MESOS] Deprecate fine grained
Jul 9, 2016
a33643c
[SPARK-16401][SQL] Data Source API: Enable Extending RelationProvider…
gatorsmile Jul 9, 2016
139d5ea
[SPARK-16476] Restructure MimaExcludes for easier union excludes
rxin Jul 11, 2016
aa8cbcd
[SPARK-16355][SPARK-16354][SQL] Fix Bugs When LIMIT/TABLESAMPLE is No…
gatorsmile Jul 11, 2016
7e4ba66
[SPARK-16381][SQL][SPARKR] Update SQL examples and programming guide …
keypointt Jul 11, 2016
f97dd8a
[SPARK-16459][SQL] Prevent dropping current database
dongjoon-hyun Jul 11, 2016
72cf743
[SPARK-16318][SQL] Implement all remaining xpath functions (branch-2.0)
petermaxlee Jul 11, 2016
aea33bf
[SPARK-16458][SQL] SessionCatalog should support `listColumns` for te…
dongjoon-hyun Jul 11, 2016
b938ca7
[SPARKR][DOC] SparkR ML user guides update for 2.0
yanboliang Jul 11, 2016
cb463b6
[SPARK-16144][SPARKR] update R API doc for mllib
felixcheung Jul 11, 2016
02d584c
[SPARK-16114][SQL] structured streaming event time window example
jjthomas Jul 12, 2016
81d7f48
[MINOR][STREAMING][DOCS] Minor changes on kinesis integration
keypointt Jul 12, 2016
b716e10
[SPARK-16433][SQL] Improve StreamingQuery.explain when no data arrives
zsxwing Jul 12, 2016
b37177c
[SPARK-16430][SQL][STREAMING] Fixed bug in the maxFilesPerTrigger in …
tdas Jul 12, 2016
6892614
[SPARK-16488] Fix codegen variable namespace collision in pmod and pa…
sameeragarwal Jul 12, 2016
9e0d2e2
[MINOR][ML] update comment where is inconsistent with code in ml.regr…
WeichenXu123 Jul 12, 2016
7b63e7d
[SPARK-16470][ML][OPTIMIZER] Check linear regression training whether…
WeichenXu123 Jul 12, 2016
f419476
[SPARK-16489][SQL] Guard against variable reuse mistakes in expressio…
rxin Jul 12, 2016
2f47b37
[SPARK-16414][YARN] Fix bugs for "Can not get user config when callin…
sharkdtu Jul 12, 2016
4303d29
[SPARK-16284][SQL] Implement reflect SQL function
petermaxlee Jul 13, 2016
41df62c
[SPARK-16514][SQL] Fix various regex codegen bugs
ericl Jul 13, 2016
5173f84
[SPARK-16303][DOCS][EXAMPLES] Updated SQL programming guide and examples
Jul 13, 2016
4b93a83
[SPARK-15889][STREAMING] Follow-up fix to erroneous condition in Stre…
srowen Jul 13, 2016
5301efc
[SPARK-16375][WEB UI] Fixed misassigned var: numCompletedTasks was as…
ajbozarth Jul 13, 2016
934e2aa
[MINOR] Fix Java style errors and remove unused imports
keypointt Jul 13, 2016
38787ec
[SPARK-16439] Fix number formatting in SQL UI
Jul 13, 2016
a34a544
[SPARK-16438] Add Asynchronous Actions documentation
phalodi Jul 13, 2016
74ad486
[MINOR][YARN] Fix code error in yarn-cluster unit test
sharkdtu Jul 13, 2016
5a71a05
[SPARK-16440][MLLIB] Undeleted broadcast variables in Word2Vec causin…
srowen Jul 13, 2016
7d9bd95
[SPARK-16469] enhanced simulate multiply
uzadude Jul 13, 2016
90f0e81
[SPARK-16435][YARN][MINOR] Add warning log if initialExecutors is les…
jerryshao Jul 13, 2016
2e97f3a
[SPARK-14812][ML][MLLIB][PYTHON] Experimental, DeveloperApi annotatio…
jkbradley Jul 13, 2016
7de183d
[SPARK-16531][SQL][TEST] Remove timezone setting from DataFrameTimeWi…
brkyvz Jul 13, 2016
86adc5c
[SPARK-16114][SQL] updated structured streaming guide
jjthomas Jul 13, 2016
18255a9
[SPARKR][MINOR] R examples and test updates
felixcheung Jul 13, 2016
9e3a598
[SPARKR][DOCS][MINOR] R programming guide to include csv data source …
felixcheung Jul 13, 2016
550d0e7
[SPARK-16482][SQL] Describe Table Command for Tables Requiring Runtim…
gatorsmile Jul 13, 2016
abb8023
[SPARK-16485][ML][DOC] Fix privacy of GLM members, rename sqlDataType…
jkbradley Jul 13, 2016
47eb9a6
Preparing Spark release v2.0.0-rc3
pwendell Jul 14, 2016
5244f86
Preparing development version 2.0.1-SNAPSHOT
pwendell Jul 14, 2016
f6eda6b
[SPARK-16503] SparkSession should provide Spark version
lw-lin Jul 14, 2016
48d1fa3
Preparing Spark release v2.0.0-rc3
pwendell Jul 14, 2016
b3ebecb
Preparing development version 2.0.1-SNAPSHOT
pwendell Jul 14, 2016
240c42b
[SPARK-16500][ML][MLLIB][OPTIMIZER] add LBFGS convergence warning for…
WeichenXu123 Jul 14, 2016
4e9080f
[SPARK-16509][SPARKR] Rename window.partitionBy and window.orderBy to…
sun-rui Jul 14, 2016
29281bc
[SPARK-16538][SPARKR] fix R call with namespace operator on SparkSess…
felixcheung Jul 14, 2016
e5f8c11
Preparing Spark release v2.0.0-rc4
pwendell Jul 14, 2016
0a651aa
Preparing development version 2.0.1-SNAPSHOT
pwendell Jul 14, 2016
7418019
[SPARK-16529][SQL][TEST] `withTempDatabase` should set `default` data…
dongjoon-hyun Jul 14, 2016
23e1ab9
[SPARK-16528][SQL] Fix NPE problem in HiveClientImpl
jacek-lewandowski Jul 14, 2016
1fe0bcd
[SPARK-16540][YARN][CORE] Avoid adding jars twice for Spark running o…
jerryshao Jul 14, 2016
5c56bc0
[SPARK-16553][DOCS] Fix SQL example file name in docs
shivaram Jul 14, 2016
aa4690b
[SPARK-16555] Work around Jekyll error-handling bug which led to sile…
JoshRosen Jul 14, 2016
c5f9355
[SPARK-16557][SQL] Remove stale doc in sql/README.md
rxin Jul 15, 2016
90686ab
[SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLl…
jkbradley Jul 15, 2016
e833c90
[SPARK-16538][SPARKR] Add more tests for namespace call to SparkSessi…
felixcheung Jul 15, 2016
34ac45a
[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if ther…
tejasapatil Jul 15, 2016
5d49529
[SPARK-16582][SQL] Explicitly define isNull = false for non-nullable …
sameeragarwal Jul 16, 2016
cad4693
[SPARK-3359][DOCS] More changes to resolve javadoc 8 errors that will…
srowen Jul 16, 2016
8c2ec44
[SPARK-16112][SPARKR] Programming guide for gapply/gapplyCollect
Jul 16, 2016
c527e9e
[SPARK-16507][SPARKR] Add a CRAN checker, fix Rd aliases
shivaram Jul 17, 2016
a4bf13a
[SPARK-16584][SQL] Move regexp unit tests to RegexpExpressionsSuite
rxin Jul 17, 2016
808d69a
[SPARK-16588][SQL] Deprecate monotonicallyIncreasingId in Scala/Java
rxin Jul 18, 2016
2365d63
[MINOR][TYPO] fix fininsh typo
WeichenXu123 Jul 18, 2016
085f3cc
[SPARK-16055][SPARKR] warning added while using sparkPackages with sp…
krishnakalyan3 Jul 18, 2016
33d92f7
[SPARK-16515][SQL] set default record reader and writer for script tr…
adrian-wang Jul 18, 2016
7889585
[SPARKR][DOCS] minor code sample update in R programming guide
felixcheung Jul 18, 2016
aac8608
[SPARK-16590][SQL] Improve LogicalPlanToSQLSuite to check generated S…
dongjoon-hyun Jul 19, 2016
1dd1526
[HOTFIX] Fix Scala 2.10 compilation
rxin Jul 19, 2016
24ea875
[SPARK-16615][SQL] Expose sqlContext in SparkSession
rxin Jul 19, 2016
ef2a6f1
[SPARK-16303][DOCS][EXAMPLES] Minor Scala/Java example update
liancheng Jul 19, 2016
504aa6f
[DOC] improve python doc for rdd.histogram and dataframe.join
mortada Jul 19, 2016
eb1c20f
[MINOR][BUILD] Fix Java Linter `LineLength` errors
dongjoon-hyun Jul 19, 2016
929fa28
[MINOR][SQL][STREAMING][DOCS] Fix minor typos, punctuations and grammar
ahmed-mahran Jul 19, 2016
2c74b6d
[SPARK-16600][MLLIB] fix some latex formula syntax error
WeichenXu123 Jul 19, 2016
6ca1d94
[SPARK-16620][CORE] Add back the tokenization process in `RDD.pipe(co…
lw-lin Jul 19, 2016
f18f9ca
[SPARK-16602][SQL] `Nvl` function should support numeric-string cases
dongjoon-hyun Jul 19, 2016
80ab8b6
[SPARK-15705][SQL] Change the default value of spark.sql.hive.convert…
yhuai Jul 19, 2016
13650fc
Preparing Spark release v2.0.0-rc5
pwendell Jul 19, 2016
307f892
Preparing development version 2.0.1-SNAPSHOT
pwendell Jul 19, 2016
f58fd46
[SPARK-16568][SQL][DOCUMENTATION] update sql programming guide refres…
WeichenXu123 Jul 20, 2016
6f209c8
[SPARK-10683][SPARK-16510][SPARKR] Move SparkR include jar test to Sp…
shivaram Jul 20, 2016
c2b5b3c
[SPARK-16632][SQL] Respect Hive schema when merging parquet schema.
Jul 20, 2016
3f6b272
[SPARK-16440][MLLIB] Destroy broadcasted variables even on driver
Jul 20, 2016
83b957e
[SPARK-15923][YARN] Spark Application rest api returns 'no such app: …
weiqingy Jul 20, 2016
b177e08
[SPARK-16613][CORE] RDD.pipe returns values for empty partitions
srowen Jul 20, 2016
81004f1
[SPARK-16634][SQL] Workaround JVM bug by moving some code out of ctor.
Jul 20, 2016
a804c92
[SPARK-16644][SQL] Aggregate should not propagate constraints contain…
cloud-fan Jul 21, 2016
c2b4228
[MINOR][DOCS][STREAMING] Minor docfix schema of csv rather than parqu…
holdenk Jul 21, 2016
f9367d6
[SPARK-16632][SQL] Use Spark requested schema to guide vectorized Par…
liancheng Jul 21, 2016
933d76a
[SPARK-16632][SQL] Revert PR #14272: Respect Hive schema when merging…
liancheng Jul 21, 2016
cd41e6a
[SPARK-16656][SQL] Try to make CreateTableAsSelectSuite more stable
yhuai Jul 21, 2016
4cb8ff7
[SPARK-16334] Maintain single dictionary per row-batch in vectorized …
sameeragarwal Jul 21, 2016
70bf8ce
[SPARK-16287][SQL] Implement str_to_map SQL function
techaddict Jul 22, 2016
0cc36ca
[SPARK-16287][HOTFIX][BUILD][SQL] Fix annotation argument needs to be…
jaceklaskowski Jul 22, 2016
fb944a1
[SPARK-16650] Improve documentation of spark.task.maxFailures
Jul 22, 2016
28bb2b0
[SPARK-16651][PYSPARK][DOC] Make `withColumnRenamed/drop` description…
dongjoon-hyun Jul 22, 2016
da34e8e
[SPARK-16380][EXAMPLES] Update SQL examples and programming guide for…
liancheng Jul 23, 2016
31c3bcb
[SPARK-16690][TEST] rename SQLTestUtils.withTempTable to withTempView
cloud-fan Jul 23, 2016
198b042
[SPARK-16515][SQL][FOLLOW-UP] Fix test `script` on OS X/Windows...
lw-lin Jul 24, 2016
d226dce
[SPARK-16699][SQL] Fix performance bug in hash aggregate on long stri…
ooq Jul 25, 2016
fcbb7f6
[SPARK-16648][SQL] Make ignoreNullsExpr a child expression of First a…
liancheng Jul 25, 2016
b52e639
[SPARK-16698][SQL] Field names having dots should be allowed for data…
HyukjinKwon Jul 25, 2016
57d65e5
[SPARK-16703][SQL] Remove extra whitespace in SQL generation for wind…
liancheng Jul 25, 2016
d9bd066
[SPARKR][DOCS] fix broken url in doc
felixcheung Jul 25, 2016
f0d05f6
[SPARK-16485][DOC][ML] Fixed several inline formatting in ml features…
lins05 Jul 25, 2016
1b4f7cf
[SQL][DOC] Fix a default name for parquet compression
maropu Jul 25, 2016
41e72f6
[SPARK-16715][TESTS] Fix a potential ExprId conflict for Subexpressio…
zsxwing Jul 25, 2016
b17fe4e
[SPARK-14131][STREAMING] SQL Improved fix for avoiding potential dead…
tdas Jul 25, 2016
9d581dc
[SPARK-16722][TESTS] Fix a StreamingContext leak in StreamingContextS…
zsxwing Jul 26, 2016
3d35474
Fix description of spark.speculation.quantile
nwbvt Jul 26, 2016
aeb6d5c
[SPARK-16672][SQL] SQLBuilder should not raise exceptions on EXISTS q…
dongjoon-hyun Jul 26, 2016
4b38a6a
[SPARK-16724] Expose DefinedByConstructorParams
marmbrus Jul 26, 2016
4391d4a
[SPARK-16633][SPARK-16642][SPARK-16721][SQL] Fixes three issues relat…
yhuai Jul 26, 2016
44234b1
[TEST][STREAMING] Fix flaky Kafka rate controlling test
tdas Jul 26, 2016
be9965b
[SPARK-16621][SQL] Generate stable SQLs in SQLBuilder
dongjoon-hyun Jul 27, 2016
4e98e69
[MINOR][ML] Fix some mistake in LinearRegression formula.
yanboliang Jul 27, 2016
8bc2877
[SPARK-16729][SQL] Throw analysis exception for invalid date casts
petermaxlee Jul 27, 2016
2f4e06e
[MINOR][DOC] missing keyword new
Jul 27, 2016
2d56a21
[SPARK-16730][SQL] Implement function aliases for type casts
petermaxlee Jul 28, 2016
0fd2dfb
[SPARK-15232][SQL] Add subquery SQL building tests to LogicalPlanToSQ…
dongjoon-hyun Jul 28, 2016
825c837
[SPARK-16639][SQL] The query with having condition that contains grou…
viirya Jul 28, 2016
f46a074
[SPARK-16740][SQL] Fix Long overflow in LongToUnsafeRowMap
sylvinus Jul 28, 2016
fb09a69
[SPARK-16764][SQL] Recommend disabling vectorized parquet reader on O…
sameeragarwal Jul 28, 2016
5cd79c3
[SPARK-16772] Correct API doc references to PySpark classes + formatt…
nchammas Jul 28, 2016
ed03d0a
[SPARK-16664][SQL] Fix persist call on Data frames with more than 200…
Jul 29, 2016
efad4aa
[SPARK-16750][ML] Fix GaussianMixture training failed due to feature …
yanboliang Jul 29, 2016
268bf14
[SPARK-16751] Upgrade derby to 10.12.1.1
a-roberts Jul 29, 2016
a32531a
[SPARK-16761][DOC][ML] Fix doc link in docs/ml-guide.md
sundapeng Jul 29, 2016
7d87fc9
[SPARK-16748][SQL] SparkExceptions during planning should not wrapped…
tdas Jul 30, 2016
26da5a7
[SPARK-16800][EXAMPLES][ML] Fix Java examples that fail to run due to…
BryanCutler Jul 30, 2016
75dd781
[SPARK-16812] Open up SparkILoop.getAddedJars
rxin Jul 31, 2016
d357ca3
[SPARK-16813][SQL] Remove private[sql] and private[spark] from cataly…
rxin Jul 31, 2016
c651ff5
[SPARK-16805][SQL] Log timezone when query result does not match
rxin Aug 1, 2016
4bdc558
[SPARK-16778][SQL][TRIVIAL] Fix deprecation warning with SQLContext
holdenk Aug 1, 2016
b49091e
[SPARK-16776][STREAMING] Replace deprecated API in KafkaTestUtils for…
HyukjinKwon Aug 1, 2016
1523bf6
[SPARK-16791][SQL] cast struct with timestamp field fails
Aug 1, 2016
4e73cb8
[SPARK-16774][SQL] Fix use of deprecated timestamp constructor & impr…
holdenk Aug 1, 2016
1813bbd
[SPARK-15869][STREAMING] Fix a potential NPE in StreamingJobProgressL…
zsxwing Aug 1, 2016
5fbf5f9
[SPARK-16818] Exchange reuse incorrectly reuses scans over different …
ericl Aug 2, 2016
9d9956e
[SPARK-16734][EXAMPLES][SQL] Revise examples of all language bindings
liancheng Aug 2, 2016
c5516ab
[SPARK-16558][EXAMPLES][MLLIB] examples/mllib/LDAExample should use M…
yinxusen Aug 2, 2016
fc18e25
[SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap (master branch)
Aug 2, 2016
22f0899
[SPARK-16837][SQL] TimeWindow incorrectly drops slideDuration in cons…
tmagrino Aug 2, 2016
ef7927e
[SPARK-16062] [SPARK-15989] [SQL] Fix two bugs of Python-only UDTs
viirya Aug 2, 2016
a937c9e
[SPARK-16836][SQL] Add support for CURRENT_DATE/CURRENT_TIMESTAMP lit…
hvanhovell Aug 2, 2016
f190bb8
[SPARK-16850][SQL] Improve type checking error message for greatest/l…
petermaxlee Aug 2, 2016
063a507
[SPARK-16787] SparkContext.addFile() should not throw if called twice…
JoshRosen Aug 2, 2016
d9d3504
[SPARK-16831][PYTHON] Fixed bug in CrossValidator.avgMetrics
pkch Aug 3, 2016
969313b
[SPARK-16796][WEB UI] Visible passwords on Spark environment page
Devian-ua Aug 2, 2016
2daab33
[SPARK-16714][SPARK-16735][SPARK-16646] array, map, greatest, least's…
cloud-fan Aug 3, 2016
b44da5b
[SPARK-14204][SQL] register driverClass rather than user-specified class
mchalek Aug 3, 2016
bb30a3d
[SPARK-16770][BUILD] Fix JLine dependency management and version (Sca…
stsc-pentasys Aug 4, 2016
11854e5
[SPARK-16873][CORE] Fix SpillReader NPE when spillFile has no data
sharkdtu Aug 4, 2016
182991e
[SPARK-16802] [SQL] fix overflow in LongToUnsafeRowMap
Aug 4, 2016
ddbff01
[SPARK-16875][SQL] Add args checking for DataSet randomSplit and sample
zhengruifeng Aug 4, 2016
c66338b
[SPARK-16880][ML][MLLIB] make ann training data persisted if needed
WeichenXu123 Aug 4, 2016
818ddcf
[SPARK-16877][BUILD] Add rules for preventing to use Java annotations…
HyukjinKwon Aug 4, 2016
824d626
[SPARK-16863][ML] ProbabilisticClassifier.fit check threshoulds' length
zhengruifeng Aug 4, 2016
1 change: 1 addition & 0 deletions .gitignore
@@ -72,6 +72,7 @@ metastore/
metastore_db/
sql/hive-thriftserver/test_warehouses
warehouse/
spark-warehouse/

# For R session data
.RData
3 changes: 2 additions & 1 deletion LICENSE
@@ -263,7 +263,7 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9.2 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.10.1 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -296,3 +296,4 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
(MIT License) modernizr (https://github.com/Modernizr/Modernizr/blob/master/LICENSE)
13 changes: 5 additions & 8 deletions NOTICE
@@ -1,5 +1,5 @@
Apache Spark
Copyright 2014 The Apache Software Foundation.
Copyright 2014 and onwards The Apache Software Foundation.

This product includes software developed at
The Apache Software Foundation (http://www.apache.org/).
@@ -12,7 +12,9 @@ Common Development and Distribution License 1.0
The following components are provided under the Common Development and Distribution License 1.0. See project link for details.

(CDDL 1.0) Glassfish Jasper (org.mortbay.jetty:jsp-2.1:6.1.14 - http://jetty.mortbay.org/project/modules/jsp-2.1)
(CDDL 1.0) JAX-RS (https://jax-rs-spec.java.net/)
(CDDL 1.0) Servlet Specification 2.5 API (org.mortbay.jetty:servlet-api-2.5:6.1.14 - http://jetty.mortbay.org/project/modules/servlet-api-2.5)
(CDDL 1.0) (GPL2 w/ CPE) javax.annotation API (https://glassfish.java.net/nonav/public/CDDL+GPL.html)
(COMMON DEVELOPMENT AND DISTRIBUTION LICENSE (CDDL) Version 1.0) (GNU General Public Library) Streaming API for XML (javax.xml.stream:stax-api:1.0-2 - no url defined)
(Common Development and Distribution License (CDDL) v1.0) JavaBeans Activation Framework (JAF) (javax.activation:activation:1.1 - http://java.sun.com/products/javabeans/jaf/index.jsp)

@@ -22,15 +24,10 @@ Common Development and Distribution License 1.1

The following components are provided under the Common Development and Distribution License 1.1. See project link for details.

(CDDL 1.1) (GPL2 w/ CPE) org.glassfish.hk2 (https://hk2.java.net)
(CDDL 1.1) (GPL2 w/ CPE) JAXB API bundle for GlassFish V3 (javax.xml.bind:jaxb-api:2.2.2 - https://jaxb.dev.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) JAXB RI (com.sun.xml.bind:jaxb-impl:2.2.3-1 - http://jaxb.java.net/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.8 - https://jersey.dev.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-core (com.sun.jersey:jersey-core:1.9 - https://jersey.java.net/jersey-core/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-guice (com.sun.jersey.contribs:jersey-guice:1.9 - https://jersey.java.net/jersey-contribs/jersey-guice/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.8 - https://jersey.dev.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-json (com.sun.jersey:jersey-json:1.9 - https://jersey.java.net/jersey-json/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.8 - https://jersey.dev.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) jersey-server (com.sun.jersey:jersey-server:1.9 - https://jersey.java.net/jersey-server/)
(CDDL 1.1) (GPL2 w/ CPE) Jersey 2 (https://jersey.java.net)

========================================================================
Common Public License 1.0
12 changes: 6 additions & 6 deletions R/DOCUMENTATION.md
@@ -1,12 +1,12 @@
# SparkR Documentation

SparkR documentation is generated using in-source comments annotated using using
`roxygen2`. After making changes to the documentation, to generate man pages,
SparkR documentation is generated by using in-source comments and annotated by using
[`roxygen2`](https://cran.r-project.org/web/packages/roxygen2/index.html). After making changes to the documentation and generating man pages,
you can run the following from an R console in the SparkR home directory

library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))

```R
library(devtools)
devtools::document(pkg="./pkg", roclets=c("rd"))
```
You can verify if your changes are good by running

R CMD check pkg/
32 changes: 18 additions & 14 deletions R/README.md
@@ -1,12 +1,13 @@
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.

### Installing sparkR

Libraries of sparkR need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the above script uses the system wide installation of R. However, this can be changed to any user installed location of R by setting the environment variable `R_HOME` the full path of the base directory where R is installed, before running install-dev.sh script.
Example:
```
```bash
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and RScript
export R_HOME=/home/username/R
./install-dev.sh
@@ -17,8 +18,9 @@ export R_HOME=/home/username/R
#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example to use the default Hadoop versions you can run
```
build/mvn -DskipTests -Psparkr package

```bash
build/mvn -DskipTests -Psparkr package
```

#### Running sparkR
@@ -37,8 +39,8 @@ To set other options like driver memory, executor memory etc. you can pass in th

#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```
If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```R
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
@@ -55,23 +57,25 @@ Once you have made your changes, please include unit tests for them and run exis

#### Generating documentation

The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script.
The SparkR documentation (Rd files and HTML files) are not a part of the source repository. To generate them you can run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs and these packages need to be installed on the machine before using the script. Also, you may need to install these [prerequisites](https://github.com/apache/spark/tree/master/docs#prerequisites). See also, `R/DOCUMENTATION.md`
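A minimal sketch of that documentation workflow, assuming the packages are installed from a CRAN mirror (the install command and mirror URL are assumptions; only `R/create-docs.sh`, `devtools`, and `knitr` are named in the paragraph above):

```bash
# Install the R packages the docs script relies on (assumed CRAN mirror).
R -e 'install.packages(c("devtools", "knitr"), repos="http://cran.us.r-project.org")'
# From the Spark root directory, generate the Rd and HTML documentation.
./R/create-docs.sh
```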

### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/spark-submit <filename> <args>`. For example:

./bin/spark-submit examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```bash
./bin/spark-submit examples/src/main/r/dataframe.R
```
You can also run the unit tests for SparkR by running. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:
```bash
R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh
```

### Running on YARN

The `./bin/spark-submit` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
```
```bash
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
20 changes: 20 additions & 0 deletions R/WINDOWS.md
@@ -11,3 +11,23 @@ include Rtools and R in `PATH`.
directory in Maven in `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`

## Unit tests

To run the SparkR unit tests on Windows, the following steps are required —assuming you are in the Spark root directory and do not have Apache Hadoop installed already:

1. Create a folder to download Hadoop related files for Windows. For example, `cd ..` and `mkdir hadoop`.

2. Download the relevant Hadoop bin package from [steveloughran/winutils](https://github.com/steveloughran/winutils). While these are not official ASF artifacts, they are built from the ASF release git hashes by a Hadoop PMC member on a dedicated Windows VM. For further reading, consult [Windows Problems on the Hadoop wiki](https://wiki.apache.org/hadoop/WindowsProblems).

3. Install the files into `hadoop\bin`; make sure that `winutils.exe` and `hadoop.dll` are present.

4. Set the environment variable `HADOOP_HOME` to the full path to the newly created `hadoop` directory.

5. Run unit tests for SparkR by running the command below. You need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first:

```
R -e "install.packages('testthat', repos='http://cran.us.r-project.org')"
.\bin\spark-submit2.cmd --conf spark.hadoop.fs.default.name="file:///" R\pkg\tests\run-all.R
```

52 changes: 52 additions & 0 deletions R/check-cran.sh
@@ -0,0 +1,52 @@
#!/bin/bash

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

set -o pipefail
set -e

FWDIR="$(cd `dirname $0`; pwd)"
pushd $FWDIR > /dev/null

if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
# if system wide R_HOME is not found, then exit
if [ ! `command -v R` ]; then
echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
exit 1
fi
R_SCRIPT_PATH="$(dirname $(which R))"
fi
echo "USING R_HOME = $R_HOME"

# Build the latest docs
$FWDIR/create-docs.sh

# Build a zip file containing the source package
"$R_SCRIPT_PATH/"R CMD build $FWDIR/pkg

# Run check as-cran.
# TODO(shivaram): Remove the skip tests once we figure out the install mechanism

VERSION=`grep Version $FWDIR/pkg/DESCRIPTION | awk '{print $NF}'`

"$R_SCRIPT_PATH/"R CMD check --as-cran --no-tests SparkR_"$VERSION".tar.gz

popd > /dev/null
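A usage sketch for the new check script above, run from a Spark checkout (the `R_HOME` path is a placeholder; as the script shows, it falls back to the `R` found on `PATH` when `R_HOME` is unset):

```bash
# Optionally point the script at a specific R installation; otherwise it uses `R` from PATH.
export R_HOME=/usr/lib/R
# Builds the SparkR docs, builds the source package, then runs `R CMD check --as-cran --no-tests`.
./R/check-cran.sh
```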
7 changes: 6 additions & 1 deletion R/install-dev.sh
@@ -38,7 +38,12 @@ pushd $FWDIR > /dev/null
if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
else
# if system wide R_HOME is not found, then exit
if [ ! `command -v R` ]; then
echo "Cannot find 'R_HOME'. Please specify 'R_HOME' or make sure R is properly installed."
exit 1
fi
R_SCRIPT_PATH="$(dirname $(which R))"
fi
echo "USING R_HOME = $R_HOME"
5 changes: 5 additions & 0 deletions R/pkg/.Rbuildignore
@@ -0,0 +1,5 @@
^.*\.Rproj$
^\.Rproj\.user$
^\.lintr$
^src-native$
^html$
10 changes: 5 additions & 5 deletions R/pkg/DESCRIPTION
@@ -1,20 +1,18 @@
Package: SparkR
Type: Package
Title: R frontend for Spark
Title: R Frontend for Apache Spark
Version: 2.0.0
Date: 2013-09-09
Date: 2016-07-07
Author: The Apache Software Foundation
Maintainer: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Imports:
methods
Depends:
R (>= 3.0),
methods,
Suggests:
testthat,
e1071,
survival
Description: R frontend for Spark
Description: The SparkR package provides an R frontend for Apache Spark.
License: Apache License (== 2.0)
Collate:
'schema.R'
@@ -26,6 +24,7 @@ Collate:
'pairRDD.R'
'DataFrame.R'
'SQLContext.R'
'WindowSpec.R'
'backend.R'
'broadcast.R'
'client.R'
Expand All @@ -38,4 +37,5 @@ Collate:
'stats.R'
'types.R'
'utils.R'
'window.R'
RoxygenNote: 5.0.1