Closed
2161 commits
0abee53
[SPARK-14069][SQL] Improve SparkStatusTracker to also track executor …
cloud-fan Mar 31, 2016
10508f3
[SPARK-11327][MESOS] Dispatcher does not respect all args from the Su…
jayv Mar 31, 2016
3cfbeb7
[SPARK-13710][SHELL][WINDOWS] Fix jline dependency on Windows
michellemay Mar 31, 2016
e785402
[SPARK-14304][SQL][TESTS] Fix tests that don't create temp files in t…
zsxwing Mar 31, 2016
b11887c
[SPARK-14264][PYSPARK][ML] Add feature importance for GBTs in pyspark
sethah Mar 31, 2016
a7af6cd
[SPARK-14281][TESTS] Fix java8-tests and simplify their build
JoshRosen Mar 31, 2016
8de201b
[SPARK-14277][CORE] Upgrade Snappy Java to 1.1.2.4
Mar 31, 2016
f0afafd
[SPARK-14267] [SQL] [PYSPARK] execute multiple Python UDFs within sin…
Mar 31, 2016
96941b1
[SPARK-14242][CORE][NETWORK] avoid copy in compositeBuffer for frame …
liyezhang556520 Apr 1, 2016
1b07063
[SPARK-14295][SPARK-14274][SQL] Implements buildReader() for LibSVM
liancheng Apr 1, 2016
26867eb
[SPARK-11262][ML] Unit test for gradient, loss layers, memory managem…
avulanov Apr 1, 2016
22249af
[SPARK-14303][ML][SPARKR] Define and use KMeansWrapper for SparkR::km…
yanboliang Apr 1, 2016
3715ecd
[SPARK-14295][MLLIB][HOTFIX] Fixes Scala 2.10 compilation failure
liancheng Apr 1, 2016
0b04f8f
[SPARK-14184][SQL] Support native execution of SHOW DATABASE command …
dilipbiswal Apr 1, 2016
a471c7f
[SPARK-14133][SQL] Throws exception for unsupported create/drop/alter…
sureshthalamati Apr 1, 2016
58e6bc8
[MINOR] [SQL] Update usage of `debug` by removing `typeCheck` and add…
dongjoon-hyun Apr 1, 2016
8ba2b7f
[SPARK-12343][YARN] Simplify Yarn client and client argument
jerryshao Apr 1, 2016
381358f
[SPARK-14305][ML][PYSPARK] PySpark ml.clustering BisectingKMeans supp…
yanboliang Apr 1, 2016
df68beb
[SPARK-13995][SQL] Extract correct IsNotNull constraints for Expression
viirya Apr 1, 2016
a884daa
[SPARK-14191][SQL] Remove invalid Expand operator constraints
viirya Apr 1, 2016
1e88615
[SPARK-14070][SQL] Use ORC data source for SQL queries on ORC tables
tejasapatil Apr 1, 2016
1b829ce
[SPARK-14160] Time Windowing functions for Datasets
brkyvz Apr 1, 2016
3e991db
[SPARK-13674] [SQL] Add wholestage codegen support to Sample
viirya Apr 1, 2016
bd7b91c
[SPARK-12864][YARN] initialize executorIdCounter after ApplicationMas…
zhonghaihua Apr 1, 2016
e41acb7
[SPARK-13992] Add support for off-heap caching
JoshRosen Apr 1, 2016
0b7d496
[SPARK-14316][SQL] StateStoreCoordinator should extend ThreadSafeRpcE…
zsxwing Apr 1, 2016
0fc4aaa
[SPARK-14255][SQL] Streaming Aggregation
marmbrus Apr 1, 2016
c16a396
[SPARK-13825][CORE] Upgrade to Scala 2.11.8
jaceklaskowski Apr 1, 2016
19f32f2
[SPARK-12857][STREAMING] Standardize "records" and "events" on "records"
lw-lin Apr 1, 2016
abc6c42
[SPARK-13241][WEB UI] Added long values for dates in ApplicationAttem…
ajbozarth Apr 1, 2016
36e8fb8
[SPARK-7425][ML] spark.ml Predictor should support other numeric type…
BenFradet Apr 2, 2016
4fc35e6
[SPARK-14308][ML][MLLIB] Remove unused mllib tree classes and move pr…
sethah Apr 2, 2016
27e71a2
[SPARK-14244][SQL] Don't use SizeBasedWindowFunction.n created on exe…
liancheng Apr 2, 2016
877dc71
[SPARK-14138] [SQL] [MASTER] Fix generated SpecificColumnarIterator c…
kiszk Apr 2, 2016
fa1af0a
[SPARK-14251][SQL] Add SQL command for printing out generated code fo…
dongjoon-hyun Apr 2, 2016
f414154
[SPARK-14285][SQL] Implement common type-safe aggregate functions
rxin Apr 2, 2016
d7982a3
[MINOR][SQL] Fix comments styl and correct several styles and nits in…
HyukjinKwon Apr 2, 2016
67d7535
[HOTFIX] Fix compilation break.
rxin Apr 2, 2016
06694f1
[MINOR] Typo fixes
jaceklaskowski Apr 2, 2016
a3e2935
[HOTFIX] Disable StateStoreSuite.maintenance
rxin Apr 2, 2016
f705037
[SPARK-14338][SQL] Improve `SimplifyConditionals` rule to handle `nul…
dongjoon-hyun Apr 3, 2016
4a6e78a
[MINOR][DOCS] Use multi-line JavaDoc comments in Scala code.
dongjoon-hyun Apr 3, 2016
03d130f
[SPARK-14342][CORE][DOCS][TESTS] Remove straggler references to Tachyon
lw-lin Apr 3, 2016
1cf7018
[SPARK-14056] Appends s3 specific configurations and spark.hadoop con…
Apr 3, 2016
c2f25b1
[SPARK-13996] [SQL] Add more not null attributes for Filter codegen
viirya Apr 3, 2016
7be4620
[HOTFIX] Fix Scala 2.10 compilation
rxin Apr 3, 2016
2262a93
[SPARK-14231] [SQL] JSON data source infers floating-point values as …
HyukjinKwon Apr 3, 2016
1f0c5dc
[SPARK-14350][SQL] EXPLAIN output should be in a single cell
dongjoon-hyun Apr 3, 2016
c238cd0
[SPARK-14341][SQL] Throw exception on unsupported create / drop macro…
Apr 3, 2016
9023015
[SPARK-14163][CORE] SumEvaluator and countApprox cannot reliably hand…
mtustin-handy Apr 4, 2016
3f749f7
[SPARK-14355][BUILD] Fix typos in Exception/Testcase/Comments and sta…
dongjoon-hyun Apr 4, 2016
76f3c73
[SPARK-14356] Update spark.sql.execution.debug to work on Datasets
mateiz Apr 4, 2016
0340b3d
[SPARK-14360][SQL] QueryExecution.debug.codegen() to dump codegen
rxin Apr 4, 2016
7454253
[SPARK-14137] [SQL] Cleanup hash join
Apr 4, 2016
89f3bef
[SPARK-13784][ML] Persistence for RandomForestClassifier, RandomFores…
jkbradley Apr 4, 2016
855ed44
[SPARK-14176][SQL] Add DataFrameWriter.trigger to set the stream batc…
zsxwing Apr 4, 2016
5743c64
[SPARK-12981] [SQL] extract Pyhton UDF in physical plan
Apr 4, 2016
27dad6f
[SPARK-14364][SPARK] HeartbeatReceiver object should be private
rxin Apr 4, 2016
7143904
[SPARK-14358] Change SparkListener from a trait to an abstract class
rxin Apr 4, 2016
cc70f17
[SPARK-14334] [SQL] add toLocalIterator for Dataset/DataFrame
Apr 4, 2016
400b2f8
[SPARK-14259] [SQL] Merging small files together based on the cost of…
Apr 4, 2016
24d7d2e
[SPARK-13579][BUILD] Stop building the main Spark assembly.
Apr 4, 2016
a172e11
[SPARK-14366] Remove sbt-idea plugin
lresende Apr 4, 2016
7201f03
[SPARK-12425][STREAMING] DStream union optimisation
gpoulin Apr 5, 2016
ba24d1e
[SPARK-14287] isStreaming method for Dataset
brkyvz Apr 5, 2016
8f50574
[SPARK-14386][ML] Changed spark.ml ensemble trees methods to return c…
jkbradley Apr 5, 2016
7db5624
[SPARK-14368][PYSPARK] Support python.spark.worker.memory with upper-…
yongtang Apr 5, 2016
0646230
[SPARK-14359] Create built-in functions for typed aggregates in Java
ericl Apr 5, 2016
2715bc6
[SPARK-14348][SQL] Support native execution of SHOW TBLPROPERTIES com…
dilipbiswal Apr 5, 2016
7807173
[SPARK-14349][SQL] Issue Error Messages for Unsupported Operators/DML…
gatorsmile Apr 5, 2016
d356901
[SPARK-14284][ML] KMeansSummary deprecating size; adding clusterSizes
shallys Apr 5, 2016
e4bd504
[SPARK-14397][WEBUI] <html> and <body> tags are nested in LogPage
sarutak Apr 5, 2016
f77f11c
[SPARK-14345][SQL] Decouple deserializer expression resolution from O…
cloud-fan Apr 5, 2016
463bac0
[SPARK-14257][SQL] Allow multiple continuous queries to be started fr…
zsxwing Apr 5, 2016
bc36df1
[SPARK-13063][YARN] Make the SPARK YARN STAGING DIR as configurable
Apr 5, 2016
72544d6
[SPARK-14123][SPARK-14384][SQL] Handle CreateFunction/DropFunction
yhuai Apr 5, 2016
9ee5c25
[SPARK-14353] Dataset Time Window `window` API for Python, and SQL
brkyvz Apr 5, 2016
c59abad
[SPARK-14402][SQL] initcap UDF doesn't match Hive/Oracle behavior in …
dongjoon-hyun Apr 5, 2016
45d8cde
[SPARK-14129][SPARK-14128][SQL] Alter table DDL commands
Apr 5, 2016
7329fe2
[SPARK-14411][SQL] Add a note to warn that onQueryProgress is asynchr…
zsxwing Apr 5, 2016
d5ee9d5
[SPARK-529][SQL] Modify SQLConf to use new config API from core.
Apr 5, 2016
48682f6
[HOTFIX] Fix `optional` to `createOptional`.
dongjoon-hyun Apr 5, 2016
1146c53
[SPARK-14353] Dataset Time Window `window` API for R
brkyvz Apr 6, 2016
7d29c72
[SPARK-14359] Unit tests for java 8 lambda syntax with typed aggregates
ericl Apr 6, 2016
8e5c1cb
[SPARK-13211][STREAMING] StreamingContext throws NoSuchElementExcepti…
srowen Apr 6, 2016
f6456fa
[SPARK-14296][SQL] whole stage codegen support for Dataset.map
cloud-fan Apr 6, 2016
adbfdb8
[SPARK-14128][SQL] Alter table DDL followup
Apr 6, 2016
48467f4
[SPARK-14416][CORE] Add thread-safe comments for CoarseGrainedSchedul…
zsxwing Apr 6, 2016
68be5b9
[SPARK-14396][SQL] Throw Exceptions for DDLs of Partitioned Views
gatorsmile Apr 6, 2016
78c1076
[SPARK-14252] Executors do not try to download remote cached blocks
ericl Apr 6, 2016
25a4c8e
[SPARK-14396][BUILD][HOT] Fix compilation against Scala 2.10
gatorsmile Apr 6, 2016
2401519
Added omitted word in error message
blazy2k9 Apr 6, 2016
5e64dab
[SPARK-14430][BUILD] use https while downloading binaries from build/mvn
infynyxx Apr 6, 2016
59236e5
[SPARK-14288][SQL] Memory Sink for streaming
marmbrus Apr 6, 2016
90ca184
[SPARK-14418][PYSPARK] fix unpersist of Broadcast in Python
Apr 6, 2016
10494fe
[SPARK-14426][SQL] Merge PerserUtils and ParseUtils
sarutak Apr 6, 2016
5abd02c
[SPARK-14429][SQL] Improve LIKE pattern in "SHOW TABLES / FUNCTIONS L…
Apr 6, 2016
3c8d882
[SPARK-14383][SQL] missing "|" in the g4 file
Apr 6, 2016
db0b06c
[SPARK-13786][ML][PYSPARK] Add save/load for pyspark.ml.tuning
yinxusen Apr 6, 2016
8cffcb6
[SPARK-14322][MLLIB] Use treeAggregate instead of reduce in OnlineLDA…
hhbyyh Apr 6, 2016
af73d97
[SPARK-13538][ML] Add GaussianMixture to ML
zhengruifeng Apr 6, 2016
bb1fa5b
[SPARK-14320][SQL] Make ColumnarBatch.Row mutable
sameeragarwal Apr 6, 2016
9c6556c
[SPARK-13430][PYSPARK][ML] Python API for training summaries of linea…
BryanCutler Apr 6, 2016
a4ead6d
[SPARK-14382][SQL] QueryProgress should be post after committedOffset…
zsxwing Apr 6, 2016
5a4b11a
[SPARK-14224] [SPARK-14223] [SPARK-14310] [SQL] fix RowEncoder and pa…
Apr 6, 2016
de47926
[SPARK-14391][LAUNCHER] Increase test timeouts.
Apr 6, 2016
9af5423
[SPARK-12133][STREAMING] Streaming dynamic allocation
tdas Apr 6, 2016
457e58b
[SPARK-14424][BUILD][DOCS] Update the build docs to switch from assem…
holdenk Apr 6, 2016
d717ae1
[SPARK-14444][BUILD] Add a new scalastyle `NoScalaDoc` to prevent Sca…
dongjoon-hyun Apr 6, 2016
c4bb02a
[SPARK-14290][CORE][NETWORK] avoid significant memory copy in netty's…
liyezhang556520 Apr 6, 2016
f1def57
[SPARK-13112][CORE] Make sure RegisterExecutorResponse arrive before …
zsxwing Apr 6, 2016
864d1b4
[SPARK-14436][SQL] Make JavaDatasetAggregatorSuiteBase public.
Apr 6, 2016
bb87375
[SPARK-12382][ML] Remove mllib GBT implementation and wrap ml
sethah Apr 7, 2016
611dbce
[SPARK-12555][SQL] Result should not be corrupted after input columns…
lresende Apr 7, 2016
4901086
[SPARK-14446][TESTS] Fix ReplSuite for Scala 2.10.
Apr 7, 2016
d765922
[SPARK-12610][SQL] Left Anti Join
hvanhovell Apr 7, 2016
21d5ca1
[SPARK-14134][CORE] Change the package name used for shading classes.
Apr 7, 2016
e11aa9e
[SPARK-14452][SQL] Explicit APIs in Scala for specifying encoders
rxin Apr 7, 2016
9ca0760
[SPARK-10063][SQL] Remove DirectParquetOutputCommitter
rxin Apr 7, 2016
db75ccb
Better host description for multi-master mesos
Apr 7, 2016
35e0db2
[SPARK-14245][WEB UI] Display the user in the application view
ajbozarth Apr 7, 2016
033d808
[SPARK-12384] Enables spark-clients to set the min(-Xms) and max(*.me…
dhruve Apr 7, 2016
3aa7d76
[SQL][TESTS] Fix for flaky test in ContinuousQueryManagerSuite
tdas Apr 7, 2016
8dcb0c7
[SPARK-14456][SQL][MINOR] Remove unused variables and logics in DataS…
sarutak Apr 7, 2016
aa85221
[SPARK-12740] [SPARK-13932] support grouping()/grouping_id() in havin…
Apr 7, 2016
ae1db91
[SPARK-14410][SQL] Push functions existence check into catalog
Apr 7, 2016
49fb237
[SPARK-14270][SQL] whole stage codegen support for typed filter
cloud-fan Apr 8, 2016
30e980a
[DOCS][MINOR] Remove sentence about Mesos not supporting cluster mode.
Apr 8, 2016
3e29e37
[SPARK-14468] Always enable OutputCommitCoordinator
Apr 8, 2016
692c748
[SPARK-14449][SQL] SparkContext should use SparkListenerInterface
marmbrus Apr 8, 2016
953ff89
[SPARK-13048][ML][MLLIB] keepLastCheckpoint option for LDA EM optimizer
jkbradley Apr 8, 2016
04fb7db
Replace getLocalizedMessage with just normal toString in exception ha…
rxin Apr 8, 2016
725b860
[SPARK-14103][SQL] Parse unescaped quotes in CSV data source.
HyukjinKwon Apr 8, 2016
73b56a3
[SPARK-14189][SQL] JSON data sources find compatible types even if in…
HyukjinKwon Apr 8, 2016
6447098
[SPARK-14402][HOTFIX] Fix ExpressionDescription annotation
jaceklaskowski Apr 8, 2016
583b5e0
[SPARK-14470] Allow for overriding both httpclient and httpcore versions
tokaaron Apr 8, 2016
a9b630f
[SPARK-14477][BUILD] Allow custom mirrors for downloading artifacts i…
markgrover Apr 8, 2016
e5d8d6e
[SPARK-14373][PYSPARK] PySpark RandomForestClassifier, Regressor supp…
vectorijk Apr 8, 2016
e0ad75f
[SPARK-12569][PYSPARK][ML] DecisionTreeRegressor: provide variance of…
wangmiao1981 Apr 8, 2016
94ac58b
[BUILD][HOTFIX] Download Maven from regular mirror network rather tha…
JoshRosen Apr 8, 2016
56af8e8
[SPARK-14298][ML][MLLIB] LDA should support disable checkpoint
yanboliang Apr 8, 2016
0275753
[SPARK-14448] Improvements to ColumnVector
tedyu Apr 8, 2016
f8c9bec
[SPARK-14394][SQL] Generate AggregateHashMap class for LongTypes duri…
sameeragarwal Apr 8, 2016
464a3c1
[SPARK-14435][BUILD] Shade Kryo in our custom Hive 1.2.1 fork
JoshRosen Apr 8, 2016
906eef4
[SPARK-11416][BUILD] Update to Chill 0.8.0 & Kryo 3.0.3
JoshRosen Apr 8, 2016
4d7c359
[SPARK-14437][CORE] Use the address that NettyBlockTransferService li…
zsxwing Apr 9, 2016
813e96e
[SPARK-14454] Better exception handling while marking tasks as failed
sameeragarwal Apr 9, 2016
d7af736
[SPARK-14498][ML][PYTHON][SQL] Many cleanups to ML and ML-related docs
jkbradley Apr 9, 2016
2f0b882
[SPARK-14482][SQL] Change default Parquet codec from gzip to snappy
rxin Apr 9, 2016
520dde4
[SPARK-14451][SQL] Move encoder definition into Aggregator interface
rxin Apr 9, 2016
90c0a04
[SPARK-14419] [SQL] Improve HashedRelation for key fit within Long
Apr 9, 2016
a9b8b65
[SPARK-14392][ML] CountVectorizer Estimator should include binary tog…
wangmiao1981 Apr 9, 2016
10a9578
[SPARK-14496][SQL] fix some javadoc typos
Apr 9, 2016
1598d11
[SPARK-14462][ML][MLLIB] add the mllib-local build to maven pom
Apr 9, 2016
adb9d73
[SPARK-14339][DOC] Add python examples for DCT,MinMaxScaler,MaxAbsScaler
zhengruifeng Apr 9, 2016
f7ec854
Revert "[SPARK-14419] [SQL] Improve HashedRelation for key fit within…
davies Apr 9, 2016
cd2fed7
[SPARK-14335][SQL] Describe function command returns wrong output
yongtang Apr 9, 2016
415446c
Revert "[SPARK-14462][ML][MLLIB] add the mllib-local build to maven pom"
mengxr Apr 9, 2016
9be5558
[SPARK-14481][SQL] Issue Exceptions for All Unsupported Options durin…
gatorsmile Apr 9, 2016
dfce966
[SPARK-14362][SPARK-14406][SQL] DDL Native Support: Drop View and Dro…
gatorsmile Apr 10, 2016
5cb5eda
[SPARK-14419] [SQL] Improve HashedRelation for key fit within Long
Apr 10, 2016
5989c85
[SPARK-14217] [SQL] Fix bug if parquet data has columns that use dict…
nongli Apr 10, 2016
00288ea
[SPARK-13687][PYTHON] Cleanup PySpark parallelize temporary files
holdenk Apr 10, 2016
72e66bb
[SPARK-14301][EXAMPLES] Java examples code merge and clean up.
yongtang Apr 10, 2016
aea30a1
[SPARK-14465][BUILD] Checkstyle should check all Java files
dongjoon-hyun Apr 10, 2016
3fb09af
[SPARK-14506][SQL] HiveClientImpl's toHiveTable misses a table proper…
yhuai Apr 10, 2016
2c95e4e
[SPARK-14455][STREAMING] Fix NPE in allocatedExecutors when calling i…
jerryshao Apr 10, 2016
22014e6
[SPARK-14357][CORE] Properly handle the root cause being a commit den…
jasonmoore2k Apr 10, 2016
f434458
[SPARK-14497][ML] Use top instead of sortBy() to get top N frequent w…
lionelfeng Apr 10, 2016
b5c7856
Update KMeansExample.scala
oluies Apr 10, 2016
a7ce473
[SPARK-14415][SQL] All functions should show usages by command `DESC …
dongjoon-hyun Apr 10, 2016
fbf8d00
[SPARK-14419] [MINOR] coding style cleanup
Apr 11, 2016
9f838bd
[SPARK-14362][SPARK-14406][SQL][FOLLOW-UP] DDL Native Support: Drop V…
gatorsmile Apr 11, 2016
1a0cca1
[MINOR][DOCS] Fix wrong data types in JSON Datasets example.
dongjoon-hyun Apr 11, 2016
e82d95b
[SPARK-14372][SQL] Dataset.randomSplit() needs a Java version
rekhajoshm Apr 11, 2016
1c751fc
[SPARK-14500] [ML] Accept Dataset[_] instead of DataFrame in MLlib APIs
mengxr Apr 11, 2016
643b4e2
[SPARK-14510][MLLIB] Add args-checking for LDA and StreamingKMeans
zhengruifeng Apr 11, 2016
efaf7d1
[SPARK-14462][ML][MLLIB] Add the mllib-local build to maven pom
Apr 11, 2016
652c470
[SPARK-14528] [SQL] Fix same result of Union
Apr 11, 2016
5de2619
[SPARK-14502] [SQL] Add optimization for Binary Comparison Simplifica…
dongjoon-hyun Apr 11, 2016
2dacc81
[SPARK-14494][SQL] Fix the race conditions in MemoryStream and Memory…
zsxwing Apr 11, 2016
89a41c5
[SPARK-13600][MLLIB] Use approxQuantile from DataFrame stats in Quant…
Apr 11, 2016
3f0f408
[SPARK-14298][ML][MLLIB] Add unit test for EM LDA disable checkpointing
yanboliang Apr 11, 2016
94de630
[SPARK-10521][SQL] Utilize Docker for test DB2 JDBC Dialect support
lresende Apr 11, 2016
6f27027
[SPARK-14475] Propagate user-defined context from driver to executors
ericl Apr 12, 2016
26d7af9
[SPARK-14520][SQL] Use correct return type in VectorizedParquetInputF…
viirya Apr 12, 2016
e9e1adc
[MINOR][ML] Fixed MLlib build warnings
jkbradley Apr 12, 2016
83fb964
[SPARK-14132][SPARK-14133][SQL] Alter table partition DDLs
Apr 12, 2016
2d81ba5
[SPARK-14362][SPARK-14406][SQL][FOLLOW-UP] DDL Native Support: Drop V…
gatorsmile Apr 12, 2016
52a8011
[SPARK-14554][SQL] disable whole stage codegen if there are too many …
cloud-fan Apr 12, 2016
678b96e
[SPARK-14535][SQL] Remove buildInternalScan from FileFormat
cloud-fan Apr 12, 2016
b0f5497
[SPARK-14508][BUILD] Add a new ScalaStyle Rule `OmitBracesInCase`
dongjoon-hyun Apr 12, 2016
124cbfb
[SPARK-14488][SPARK-14493][SQL] "CREATE TEMPORARY TABLE ... USING ...…
liancheng Apr 12, 2016
da60b34
[SPARK-3724][ML] RandomForest: More options for feature subset size.
yongtang Apr 12, 2016
6bf6921
[SPARK-14474][SQL] Move FileSource offset log into checkpointLocation
zsxwing Apr 12, 2016
75e05a5
[SPARK-12566][SPARK-14324][ML] GLM model family, link function suppor…
yanboliang Apr 12, 2016
101663f
[SPARK-13322][ML] AFTSurvivalRegression supports feature standardization
yanboliang Apr 12, 2016
7f024c4
[SPARK-13597][PYSPARK][ML] Python API for GeneralizedLinearRegression
vectorijk Apr 12, 2016
1995c2e
[SPARK-14563][ML] use a random table name instead of __THIS__ in SQLT…
mengxr Apr 12, 2016
111a624
[SPARK-14147][ML][SPARKR] SparkR predict should not output feature co…
yanboliang Apr 12, 2016
852bbc6
[SPARK-14556][SQL] Code clean-ups for package o.a.s.sql.execution.str…
lw-lin Apr 12, 2016
85e68b4
[SPARK-14562] [SQL] improve constraints propagation in Union
Apr 12, 2016
bcd2076
[SPARK-14414][SQL] improve the error message class hierarchy
Apr 12, 2016
3e53de4
[SPARK-14513][CORE] Fix threads left behind after stopping SparkContext
chtyim Apr 12, 2016
1ef5f8c
[SPARK-14544] [SQL] improve performance of SQL UI tab
Apr 12, 2016
c439d88
[SPARK-14547] Avoid DNS resolution for reusing connections
rxin Apr 12, 2016
d187e7d
[SPARK-14363] Fix executor OOM due to memory leak in the Sorter
Apr 12, 2016
372baf0
[SPARK-14578] [SQL] Fix codegen for CreateExternalRow with nested wid…
Apr 13, 2016
768b3d6
[SPARK-14579][SQL] Fix a race condition in StreamExecution.processAll…
zsxwing Apr 13, 2016
587cd55
[MINOR][SQL] Remove some unused imports in datasources.
HyukjinKwon Apr 13, 2016
a5f8c9b
[SPARK-14554][SQL][FOLLOW-UP] use checkDataset to check the result
cloud-fan Apr 13, 2016
23f93f5
[SPARK-13992][CORE][PYSPARK][FOLLOWUP] Update OFF_HEAP semantics for …
lw-lin Apr 13, 2016
dd11e40
[SPARK-14537][CORE] Make TaskSchedulerImpl waiting fail if context is…
drcrallen Apr 13, 2016
323e739
Revert "[SPARK-14154][MLLIB] Simplify the implementation for Kolmogor…
mengxr Apr 13, 2016
1018a1c
[SPARK-14568][ML] Instrumentation framework for logistic regression
thunterdb Apr 13, 2016
7d2ed8c
[SPARK-14388][SQL] Implement CREATE TABLE
Apr 13, 2016
f9d578e
[SPARK-13783][ML] Model export/import for spark.ml: GBTs
yanboliang Apr 13, 2016
dbbe149
[SPARK-14581] [SQL] push predicatese through more logical plans
Apr 13, 2016
b0adb9f
[SPARK-10386][MLLIB] PrefixSpanModel supports save/load
yanboliang Apr 13, 2016
0d17593
[SPARK-14461][ML] GLM training summaries should provide solver
yanboliang Apr 13, 2016
a91aaf5
[SPARK-14375][ML] Unit test for spark.ml KMeansSummary
yanboliang Apr 13, 2016
fcdd692
[SPARK-14509][DOC] Add python CountVectorizerExample
zhengruifeng Apr 13, 2016
781df49
[SPARK-13089][ML] [Doc] spark.ml Naive Bayes user guide and examples
hhbyyh Apr 13, 2016
fc3cd2f
[SPARK-14472][PYSPARK][ML] Cleanup ML JavaWrapper and related class h…
BryanCutler Apr 13, 2016
62b7f30
[SPARK-14607] [SPARK-14484] [SQL] fix case-insensitive predicates in …
Apr 14, 2016
b481940
[SPARK-14596][SQL] Remove not used SqlNewHadoopRDD and some more unus…
HyukjinKwon Apr 14, 2016
478af2f
[SPARK-14573][PYSPARK][BUILD] Fix PyDoc Makefile & highlighting issues
holdenk Apr 14, 2016
6fc3dc8
[MINOR][SQL] Remove extra anonymous closure within functional transfo…
HyukjinKwon Apr 14, 2016
3cf3db1
[SPARK-14518][SQL] Support Comment in CREATE VIEW
gatorsmile Apr 14, 2016
f83ba45
[SPARK-14572][DOC] Update config docs to allow -Xms in extraJavaOptions
dhruve Apr 14, 2016
0d22092
[SPARK-14125][SQL] Native DDL Support: Alter View
gatorsmile Apr 14, 2016
de2ad52
[SPARK-14625] TaskUIData and ExecutorUIData shouldn't be case classes
rxin Apr 14, 2016
3e27940
[SPARK-14630][BUILD][CORE][SQL][STREAMING] Code style: public abstrac…
lw-lin Apr 14, 2016
9fa43a3
[SPARK-14612][ML] Consolidate the version of dependencies in mllib an…
srowen Apr 14, 2016
dac40b6
[SPARK-14619] Track internal accumulators (metrics) by stage attempt
rxin Apr 14, 2016
a46f98d
[SPARK-14617] Remove deprecated APIs in TaskMetrics
rxin Apr 14, 2016
1d04c86
[SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it'…
cloud-fan Apr 14, 2016
c971aee
[SPARK-14499][SQL][TEST] Drop Partition Does Not Delete Data of Exter…
gatorsmile Apr 14, 2016
28efdd3
[SPARK-14592][SQL] Native support for CREATE TABLE LIKE DDL command
viirya Apr 14, 2016
c5172f8
[SPARK-13967][PYSPARK][ML] Added binary Param to Python CountVectorizer
BryanCutler Apr 14, 2016
bf65c87
[SPARK-14618][ML][DOC] Updated RegressionEvaluator.metricName param doc
jkbradley Apr 14, 2016
bc748b7
[SPARK-14238][ML][MLLIB][PYSPARK] Add binary toggle Param to PySpark …
yongtang Apr 14, 2016
d7e124e
[SPARK-14545][SQL] Improve `LikeSimplification` by adding `a%b` rule
dongjoon-hyun Apr 14, 2016
01dd1f5
[SPARK-14565][ML] RandomForest should use parseInt and parseDouble fo…
yongtang Apr 15, 2016
c80586d
[SPARK-12869] Implemented an improved version of the toIndexedRowMatrix
Apr 15, 2016
ff9ae61
[SPARK-14601][DOC] Minor doc/usage changes related to removal of Spar…
markgrover Apr 15, 2016
b5c60bc
[SPARK-14447][SQL] Speed up TungstenAggregate w/ keys using Vectorize…
sameeragarwal Apr 15, 2016
297ba3f
[SPARK-14275][SQL] Reimplement TypedAggregateExpression to Declarativ…
cloud-fan Apr 15, 2016
b961323
[SPARK-14374][ML][PYSPARK] PySpark ml GBTClassifier, Regressor suppor…
yanboliang Apr 15, 2016
a9324a0
Closes #12407
rxin Apr 15, 2016
The diff you're trying to view is too large. We only load the first 3000 changed files.
12 changes: 12 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE
@@ -0,0 +1,12 @@
## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

9 changes: 5 additions & 4 deletions .gitignore
@@ -17,8 +17,6 @@ cache
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
build/apache-maven*
build/zinc*
build/scala*
@@ -50,6 +48,7 @@ spark-tests.log
streaming-tests.log
dependency-reduced-pom.xml
.ensime
.ensime_cache/
.ensime_lucene
checkpoint
derby.log
@@ -59,8 +58,6 @@ dev/create-release/*final
spark-*-bin-*.tgz
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
@@ -74,3 +71,7 @@ metastore/
warehouse/
TempStatsStore/
sql/hive-thriftserver/test_warehouses

# For R session data
.RHistory
.RData
85 changes: 0 additions & 85 deletions .rat-excludes

This file was deleted.

36 changes: 20 additions & 16 deletions LICENSE
@@ -1,4 +1,3 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -237,9 +236,9 @@ The following components are provided under a BSD-style license. See project lin
The text of each license is also included at licenses/LICENSE-[project].txt.

(BSD 3 Clause) netlib core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.2.7 - https://github.com/jpmml/jpmml-model)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD License) ANTLR 4.5.2-1 (org.antlr:antlr4:4.5.2-1 - http://wwww.antlr.org/)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
(BSD licence) ANTLR StringTemplate (org.antlr:stringtemplate:3.2.1 - http://www.stringtemplate.org)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
@@ -250,22 +249,21 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.10.5 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.10:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.10:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.10:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware.kryo:kryo:2.21 - http://code.google.com/p/kryo/)
(New BSD License) MinLog (com.esotericsoftware.minlog:minlog:1.2 - http://code.google.com/p/minlog/)
(New BSD License) ReflectASM (com.esotericsoftware.reflectasm:reflectasm:1.07 - http://code.google.com/p/reflectasm/)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware:kryo:3.0.3 - https://github.com/EsotericSoftware/kryo)
(New BSD License) MinLog (com.esotericsoftware:minlog:1.3.0 - https://github.com/EsotericSoftware/minlog)
(New BSD license) Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9.2 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -284,11 +282,17 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) SLF4J API Module (org.slf4j:slf4j-api:1.7.5 - http://www.slf4j.org)
(MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(MIT License) scopt (com.github.scopt:scopt_2.11:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
(MIT License) graphlib-dot (https://github.com/cpettitt/graphlib-dot)
(MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)
(MIT License) sorttable (https://github.com/stuartlangridge/sorttable)
(MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)
(MIT License) datatables (http://datatables.net/license)
(MIT License) mustache (https://github.com/mustache/mustache/blob/master/LICENSE)
(MIT License) cookies (http://code.google.com/p/cookies/wiki/License)
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
62 changes: 60 additions & 2 deletions NOTICE
@@ -48,7 +48,6 @@ Eclipse Public License 1.0

The following components are provided under the Eclipse Public License 1.0. See project link for details.

(Eclipse Public License - Version 1.0) mqtt-client (org.eclipse.paho:mqtt-client:0.4.0 - http://www.eclipse.org/paho/mqtt-client)
(Eclipse Public License v1.0) Eclipse JDT Core (org.eclipse.jdt:core:3.1.1 - http://www.eclipse.org/jdt/)

========================================================================
@@ -606,4 +605,63 @@ Vis.js uses and redistributes the following third-party libraries:

- keycharm
https://github.com/AlexDM0/keycharm
The MIT License
The MIT License

===============================================================================

The CSS style for the navigation sidebar of the documentation was originally
submitted by Óscar Nájera for the scikit-learn project. The scikit-learn project
is distributed under the 3-Clause BSD license.
===============================================================================

For CSV functionality:

/*
* Copyright 2014 Databricks
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/*
* Copyright 2015 Ayasdi Inc
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/


===============================================================================
For dev/sparktestsupport/toposort.py:

Copyright 2014 True Blade Systems, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
20 changes: 15 additions & 5 deletions R/README.md
@@ -1,6 +1,16 @@
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.
### Installing SparkR

The SparkR libraries need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default the script uses the system-wide installation of R. To use a user-installed copy of R instead, set the environment variable `R_HOME` to the full path of the base directory where R is installed before running the `install-dev.sh` script.
Example:
```
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and Rscript
export R_HOME=/home/username/R
./install-dev.sh
```
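
As a rough sketch of the lookup described above (not the script's actual code), the directory holding the `R` and `Rscript` binaries could be resolved as follows. The function name `resolve_r_bin_dir` is hypothetical and used only for illustration; a POSIX shell is assumed.

```shell
# Hypothetical helper (not part of install-dev.sh): print the directory
# that is expected to contain the R and Rscript binaries.
resolve_r_bin_dir() {
  if [ -n "${R_HOME:-}" ]; then
    # A user-supplied R_HOME takes precedence over the system-wide R.
    printf '%s\n' "$R_HOME/bin"
  else
    # Fall back to whichever R is found first on the PATH;
    # return a non-zero status if there is none.
    r_path="$(command -v R)" || return 1
    dirname "$r_path"
  fi
}
```

With `R_HOME=/home/username/R` set, the function prints `/home/username/R/bin`. With `R_HOME` unset it prints the directory of the first `R` on the `PATH`, or returns a non-zero status if no `R` is found.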

### SparkR development

@@ -30,7 +40,7 @@ To set other options like driver memory, executor memory etc. you can pass in th
If you wish to use SparkR from RStudio or other R frontends you will need to set some environment variables which point SparkR to your Spark installation. For example
```
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
Sys.setenv(SPARK_HOME="/Users/username/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
@@ -41,7 +51,7 @@ sc <- sparkR.init(master="local")

The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
Once you have made your changes, please include unit tests for them and run existing unit tests using the `run-tests.sh` script as described below.
Once you have made your changes, please include unit tests for them and run existing unit tests using the `R/run-tests.sh` script as described below.

#### Generating documentation

@@ -50,17 +60,17 @@ The SparkR documentation (Rd files and HTML files) are not a part of the source
### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/sparkR <filename> <args>`. For example:
To run one of them, use `./bin/spark-submit <filename> <args>`. For example:

./bin/sparkR examples/src/main/r/dataframe.R
./bin/spark-submit examples/src/main/r/dataframe.R

You can also run the unit-tests for SparkR by running (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
./R/run-tests.sh

### Running on YARN
The `./bin/spark-submit` and `./bin/sparkR` can also be used to submit jobs to YARN clusters. You will need to set YARN conf dir before doing so. For example on CDH you can run
`./bin/spark-submit` can also be used to submit jobs to YARN clusters. You will need to set the YARN conf dir before doing so. For example, on CDH you can run
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
6 changes: 6 additions & 0 deletions R/install-dev.bat
@@ -25,3 +25,9 @@ set SPARK_HOME=%~dp0..
MKDIR %SPARK_HOME%\R\lib

R.exe CMD INSTALL --library="%SPARK_HOME%\R\lib" %SPARK_HOME%\R\pkg\

rem Zip the SparkR package so that it can be distributed to worker nodes on YARN
pushd %SPARK_HOME%\R\lib
%JAVA_HOME%\bin\jar.exe cfM "%SPARK_HOME%\R\lib\sparkr.zip" SparkR
popd

15 changes: 13 additions & 2 deletions R/install-dev.sh
@@ -35,11 +35,22 @@ LIB_DIR="$FWDIR/lib"
mkdir -p $LIB_DIR

pushd $FWDIR > /dev/null
if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
R_SCRIPT_PATH="$(dirname $(which R))"
fi
echo "USING R_HOME = $R_HOME"

# Generate Rd files if devtools is installed
Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
"$R_SCRIPT_PATH/"Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'

# Install SparkR to $LIB_DIR
R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/
"$R_SCRIPT_PATH/"R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
cd $LIB_DIR
jar cfM "$LIB_DIR/sparkr.zip" SparkR

popd > /dev/null