Closed
3208 commits
b1581ac
[SPARK-9854] [SQL] RuleExecutor.timeMap should be thread-safe
JoshRosen Aug 12, 2015
b85f9a2
[SPARK-8366] maxNumExecutorsNeeded should properly handle failed tasks
XuTingjun Aug 12, 2015
a807fcb
[SPARK-9806] [WEB UI] Don't share ReplayListenerBus between multiple …
Aug 12, 2015
4e3f4b9
[SPARK-9829] [WEBUI] Display the update value for peak execution memory
zsxwing Aug 12, 2015
bab8923
[SPARK-9426] [WEBUI] Job page DAG visualization is not shown
carsonwang Aug 12, 2015
5c99d8b
[SPARK-8798] [MESOS] Allow additional uris to be fetched with mesos
tnachen Aug 12, 2015
741a29f
[SPARK-9575] [MESOS] Add docuemntation around Mesos shuffle service.
tnachen Aug 12, 2015
9d08224
[SPARK-9182] [SQL] Filters are not passed through to jdbc source
yjshen Aug 12, 2015
3ecb379
[SPARK-9407] [SQL] Relaxes Parquet ValidTypeMap to allow ENUM predica…
liancheng Aug 12, 2015
2e68066
[SPARK-8625] [CORE] Propagate user exceptions in tasks back to driver
tomwhite Aug 12, 2015
be5d191
[SPARK-9795] Dynamic allocation: avoid double counting when killing s…
Aug 12, 2015
66d87c1
[SPARK-7583] [MLLIB] User guide update for RegexTokenizer
hhbyyh Aug 12, 2015
e011079
[SPARK-9747] [SQL] Avoid starving an unsafe operator in aggregation
Aug 12, 2015
57ec27d
[SPARK-9804] [HIVE] Use correct value for isSrcLocal parameter.
Aug 12, 2015
70fe558
[SPARK-9847] [ML] Modified copyValues to distinguish between default,…
jkbradley Aug 12, 2015
60103ec
[SPARK-9726] [PYTHON] PySpark DF join no longer accepts on=None
btashton Aug 12, 2015
762bacc
[SPARK-9766] [ML] [PySpark] check and add miss docs for PySpark ML
yanboliang Aug 12, 2015
551def5
[SPARK-9789] [ML] Added logreg threshold param back
jkbradley Aug 12, 2015
6f60298
[SPARK-8967] [DOC] add Since annotation
mengxr Aug 12, 2015
a17384f
[SPARK-9907] [SQL] Python crc32 is mistakenly calling md5
rxin Aug 12, 2015
738f353
[SPARK-9092] Fixed incompatibility when both num-executors and dynami…
Aug 12, 2015
ab7e721
[SPARK-9826] [CORE] Fix cannot use custom classes in log4j.properties
michellemay Aug 12, 2015
7035d88
[SPARK-9894] [SQL] Json writer should handle MapData.
yhuai Aug 12, 2015
caa14d9
[SPARK-9913] [MLLIB] LDAUtils should be private
mengxr Aug 12, 2015
6e409bc
[SPARK-9909] [ML] [TRIVIAL] move weightCol to shared params
holdenk Aug 12, 2015
e6aef55
[SPARK-9912] [MLLIB] QRDecomposition should use QType and RType for t…
mengxr Aug 13, 2015
fc1c7fd
[SPARK-9915] [ML] stopWords should use StringArrayParam
mengxr Aug 13, 2015
660e6dc
[SPARK-9449] [SQL] Include MetastoreRelation's inputFiles
marmbrus Aug 13, 2015
8ce6096
[SPARK-9780] [STREAMING] [KAFKA] prevent NPE if KafkaRDD instantiation …
koeninger Aug 13, 2015
0d1d146
[SPARK-9724] [WEB UI] Avoid unnecessary redirects in the Spark Web UI.
Aug 13, 2015
f4bc01f
[SPARK-9855] [SPARKR] Add expression functions into SparkR whose para…
yu-iskw Aug 13, 2015
7b13ed2
[SPARK-9870] Disable driver UI and Master REST server in SparkSubmitS…
JoshRosen Aug 13, 2015
7c35746
[SPARK-9827] [SQL] fix fd leak in UnsafeRowSerializer
Aug 13, 2015
4413d08
[SPARK-9908] [SQL] When spark.sql.tungsten.enabled is false, broadcas…
yhuai Aug 13, 2015
d2d5e7f
[SPARK-9704] [ML] Made ProbabilisticClassifier, Identifiable, VectorU…
jkbradley Aug 13, 2015
d7053be
[SPARK-9903] [MLLIB] skip local processing in PrefixSpan if there are…
mengxr Aug 13, 2015
2fb4901
[SPARK-9916] [BUILD] [SPARKR] removed left-over sparkr.zip copy/creat…
brkyvz Aug 13, 2015
2278219
[SPARK-9920] [SQL] The simpleString of TungstenAggregate does not sho…
yhuai Aug 13, 2015
a8ab263
[SPARK-9832] [SQL] add a thread-safe lookup for BytesToBytseMap
Aug 13, 2015
5fc058a
[SPARK-9917] [ML] add getMin/getMax and doc for originalMin/origianlM…
mengxr Aug 13, 2015
df54389
[SPARK-8922] [DOCUMENTATION, MLLIB] Add @since tags to mllib.evaluation
mosessky Aug 13, 2015
d7eb371
[SPARK-9914] [ML] define setters explicitly for Java and use setParam…
mengxr Aug 13, 2015
d0b1891
[SPARK-9927] [SQL] Revert 8049 since it's pushing wrong filter down
yjshen Aug 13, 2015
68f9957
[SPARK-9918] [MLLIB] remove runs from k-means and rename epsilon to tol
mengxr Aug 13, 2015
84a2791
[SPARK-9885] [SQL] Also pass barrierPrefixes and sharedPrefixes to Is…
yhuai Aug 13, 2015
6993031
[SPARK-9757] [SQL] Fixes persistence of Parquet relation with decimal…
liancheng Aug 13, 2015
2932e25
[SPARK-9073] [ML] spark.ml Models copy() should call setParent when t…
Lewuathe Aug 13, 2015
7a539ef
[SPARK-8965] [DOCS] Add ml-guide Python Example: Estimator, Transform…
Rosstin Aug 13, 2015
4b70798
[MINOR] [ML] change MultilayerPerceptronClassifierModel to Multilayer…
yanboliang Aug 13, 2015
65fec79
[MINOR] [DOC] fix mllib pydoc warnings
mengxr Aug 13, 2015
8815ba2
[SPARK-9649] Fix MasterSuite, third time's a charm
Aug 13, 2015
864de8e
[SPARK-9661] [MLLIB] [ML] Java compatibility
MechCoder Aug 13, 2015
a8d2f4c
[SPARK-9942] [PYSPARK] [SQL] ignore exceptions while try to import pa…
Aug 13, 2015
c2520f5
[SPARK-9935] [SQL] EqualNotNull not processed in ORC
HyukjinKwon Aug 13, 2015
6c5858b
[SPARK-9922] [ML] rename StringIndexerReverse to IndexToString
mengxr Aug 13, 2015
693949b
[SPARK-8976] [PYSPARK] fix open mode in python3
Aug 14, 2015
c50f97d
[SPARK-9943] [SQL] deserialized UnsafeHashedRelation should be serial…
Aug 14, 2015
8187b3a
[SPARK-9580] [SQL] Replace singletons in SQL tests
Aug 14, 2015
bd35385
[SPARK-9945] [SQL] pageSize should be calculated from executor.memory
Aug 14, 2015
7c7c752
[MINOR] [SQL] Remove canEqual in Row
viirya Aug 14, 2015
c8677d7
[SPARK-9958] [SQL] Make HiveThriftServer2Listener thread-safe and upd…
zsxwing Aug 14, 2015
a0e1abb
[SPARK-9661] [MLLIB] minor clean-up of SPARK-9661
mengxr Aug 14, 2015
7ecf0c4
[SPARK-9956] [ML] Make trees work with one-category features
jkbradley Aug 14, 2015
a7317cc
[SPARK-8744] [ML] Add a public constructor to StringIndexer
holdenk Aug 14, 2015
34d610b
[SPARK-9929] [SQL] support metadata in withColumn
cloud-fan Aug 14, 2015
57c2d08
[SPARK-9923] [CORE] ShuffleMapStage.numAvailableOutputs should be an …
Aug 14, 2015
3bc5528
[SPARK-9946] [SPARK-9589] [SQL] fix NPE and thread-safety in TaskMemo…
Aug 14, 2015
ece0056
[SPARK-9561] Re-enable BroadcastJoinSuite
Aug 14, 2015
ffa05c8
[SPARK-9828] [PYSPARK] Mutable values should not be default arguments
MechCoder Aug 14, 2015
33bae58
[SPARK-9809] Task crashes because the internal accumulators are not p…
carsonwang Aug 14, 2015
6518ef6
[SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators
Aug 14, 2015
9407baa
[SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting appl…
jerryshao Aug 14, 2015
11ed2b1
[SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile()
Aug 14, 2015
2a6590e
[SPARK-9981] [ML] Made labels public for StringIndexerModel
jkbradley Aug 14, 2015
1150a19
[SPARK-8670] [SQL] Nested columns can't be referenced in pyspark
cloud-fan Aug 14, 2015
f3bfb71
[SPARK-9966] [STREAMING] Handle couple of corner cases in PIDRateEsti…
tdas Aug 14, 2015
18a761e
[SPARK-9968] [STREAMING] Reduced time spent within synchronized block…
tdas Aug 14, 2015
932b24f
[SPARK-9949] [SQL] Fix TakeOrderedAndProject's output.
yhuai Aug 15, 2015
e5fd604
[SPARK-9934] Deprecate NIO ConnectionManager.
rxin Aug 15, 2015
37586e5
[HOTFIX] fix duplicated braces
Aug 15, 2015
ec29f20
[SPARK-9634] [SPARK-9323] [SQL] cleanup unnecessary Aliases in Logica…
cloud-fan Aug 15, 2015
6c4fdbe
[SPARK-8887] [SQL] Explicit define which data types can be used as dy…
yjshen Aug 15, 2015
609ce3c
[SPARK-9984] [SQL] Create local physical operator interface.
rxin Aug 15, 2015
71a3af8
[SPARK-9960] [GRAPHX] sendMessage type fix in LabelPropagation.scala
blindFS Aug 15, 2015
7c1e568
[SPARK-9725] [SQL] fix serialization of UTF8String across different JVM
Aug 15, 2015
a85fb6c
[SPARK-9980] [BUILD] Fix SBT publishLocal error due to invalid charac…
hvanhovell Aug 15, 2015
5705672
[SPARK-9955] [SQL] correct error message for aggregate
cloud-fan Aug 15, 2015
1db7179
[SPARK-9805] [MLLIB] [PYTHON] [STREAMING] Added _eventually for ml st…
jkbradley Aug 16, 2015
182f9b7
[SPARK-9973] [SQL] Correct in-memory columnar buffer size
viper-kun Aug 16, 2015
5f9ce73
[SPARK-8844] [SPARKR] head/collect is broken in SparkR.
Aug 16, 2015
cf01607
[SPARK-10008] Ensure shuffle locality doesn't take precedence over na…
mateiz Aug 16, 2015
ae2370e
[SPARK-10005] [SQL] Fixes schema merging for nested structs
liancheng Aug 16, 2015
26e7605
[SPARK-9871] [SPARKR] Add expression functions into SparkR which have…
yu-iskw Aug 17, 2015
3ff81ad
[SPARK-9199] [CORE] Upgrade Tachyon version from 0.7.0 -> 0.7.1.
calvinjia Aug 17, 2015
f7efda3
[SPARK-9959] [MLLIB] Association Rules Java Compatibility
Aug 17, 2015
76c155d
[SPARK-7837] [SQL] Avoids double closing output writers when commitTa…
liancheng Aug 17, 2015
ed092a0
[SPARK-9924] [WEB UI] Don't schedule checkForLogs while some of them …
Aug 17, 2015
f68d024
[SPARK-7736] [CORE] [YARN] Make pyspark fail YARN app on failure.
Aug 17, 2015
a4acdab
[SPARK-9950] [SQL] Wrong Analysis Error for grouping/aggregating on s…
cloud-fan Aug 17, 2015
f10660f
[SPARK-10036] [SQL] Load JDBC driver in DataFrameReader.jdbc and Data…
zsxwing Aug 17, 2015
b265e28
[SPARK-9526] [SQL] Utilize randomized tests to reveal potential bugs …
yjshen Aug 17, 2015
772e7c1
[SPARK-9592] [SQL] Fix Last function implemented based on AggregateEx…
yhuai Aug 17, 2015
fdaf17f
[SPARK-10068] [MLLIB] Adds links to MLlib types, algos, utilities lis…
Aug 17, 2015
088b11e
[SPARK-8920] [MLLIB] Add @since tags to mllib.linalg
Aug 17, 2015
52ae952
[SPARK-9974] [BUILD] [SQL] Makes sure com.twitter:parquet-hadoop-bund…
liancheng Aug 18, 2015
0076e82
[SPARK-9768] [PYSPARK] [ML] Add Python API and user guide for ml.feat…
yanboliang Aug 18, 2015
18523c1
SPARK-8916 [Documentation, MLlib] Add @since tags to mllib.regression
prayagchandran Aug 18, 2015
0b6b017
[SPARK-9898] [MLLIB] Prefix Span user guide
Aug 18, 2015
f9d1a92
[SPARK-7707] User guide and example code for KernelDensity
sryza Aug 18, 2015
c90c605
[SPARK-9902] [MLLIB] Add Java and Python examples to user guide for 1…
Aug 18, 2015
ee093c8
[SPARK-10059] [YARN] Explicitly add JSP dependencies for tests.
Aug 18, 2015
e290029
[SPARK-7808] [ML] add package doc for ml.feature
mengxr Aug 18, 2015
a091031
[MINOR] Format the comment of `translate` at `functions.scala`
yu-iskw Aug 18, 2015
5af3838
[SPARK-10038] [SQL] fix bug in generated unsafe projection when there…
Aug 18, 2015
dd0614f
[SPARK-10076] [ML] make MultilayerPerceptronClassifier layers and wei…
yanboliang Aug 18, 2015
c34e9ff
[MINOR] fix the comments in IndexShuffleBlockResolver
CodingCat Aug 18, 2015
5723d26
[SPARK-8118] [SQL] Redirects Parquet JUL logger via SLF4J
liancheng Aug 18, 2015
1968276
[SPARK-10007] [SPARKR] Update `NAMESPACE` file in SparkR for simple p…
yu-iskw Aug 18, 2015
354f458
[SPARK-9028] [ML] Add CountVectorizer as an estimator to generate Cou…
hhbyyh Aug 18, 2015
c1840a8
[SPARK-7736] [CORE] Fix a race introduced in PythonRunner.
Aug 18, 2015
f5ea391
[SPARK-9900] [MLLIB] User guide for Association Rules
Aug 18, 2015
f4fa61e
[SPARK-10029] [MLLIB] [DOC] Add Python examples for mllib IsotonicReg…
yanboliang Aug 18, 2015
747c2ba
[SPARK-10032] [PYSPARK] [DOC] Add Python example for mllib LDAModel u…
yanboliang Aug 18, 2015
8bae901
[SPARK-10085] [MLLIB] [DOCS] removed unnecessary numpy array import
stared Aug 18, 2015
bf1d661
[SPARK-9574] [STREAMING] Remove unnecessary contents of spark-streami…
zsxwing Aug 18, 2015
80cb25b
[SPARK-10080] [SQL] Fix binary incompatibility for $ column interpola…
marmbrus Aug 18, 2015
9b731fa
[SPARK-9782] [YARN] Support YARN application tags via SparkConf
dennishuo Aug 18, 2015
fa41e02
[SPARK-10089] [SQL] Add missing golden files.
Aug 18, 2015
492ac1f
[SPARK-10088] [SQL] Add support for "stored as avro" in HiveQL parser.
Aug 18, 2015
1dbffba
[SPARK-8924] [MLLIB, DOCUMENTATION] Added @since tags to mllib.tree
BryanCutler Aug 18, 2015
c635a16
[SPARK-10012] [ML] Missing test case for Params#arrayLengthGt
Lewuathe Aug 18, 2015
9108eff
[SPARK-10098] [STREAMING] [TEST] Cleanup active context after test in…
tdas Aug 19, 2015
badf7fa
[SPARK-8473] [SPARK-9889] [ML] User guide and example code for DCT
Aug 19, 2015
04e0fea
Bump SparkR version string to 1.5.0
falaki Aug 19, 2015
1f89029
[SPARK-9969] [YARN] Remove old MR classpath API support
jerryshao Aug 19, 2015
b4b35f1
[SPARKR] [MINOR] Get rid of a long line warning
yu-iskw Aug 19, 2015
1aeae05
[SPARK-10072] [STREAMING] BlockGenerator can deadlock when the queue …
tdas Aug 19, 2015
90273ef
[SPARK-10102] [STREAMING] Fix a race condition that startReceiver may…
zsxwing Aug 19, 2015
a5b5b93
[SPARK-9939] [SQL] Resorts to Java process API in CliSuite, HiveSpark…
liancheng Aug 19, 2015
bf32c1f
[SPARK-10075] [SPARKR] Add `when` expressino function in SparkR
yu-iskw Aug 19, 2015
270ee67
[SPARK-10095] [SQL] use public API of BigInteger
Aug 19, 2015
1ff0580
[SPARK-10093] [SPARK-10096] [SQL] Avoid transformation on executors &…
rxin Aug 19, 2015
de32238
[SPARK-9705] [DOC] fix docs about Python version
Aug 19, 2015
1c843e2
[SPARK-9508] GraphX Pregel docs update with new Pregel code
avulanov Aug 19, 2015
010b03e
[SPARK-9952] Fix N^2 loop when DAGScheduler.getPreferredLocsInternal …
JoshRosen Aug 19, 2015
bc9a0e0
[SPARK-9967] [SPARK-10099] [STREAMING] Renamed conf spark.streaming.b…
tdas Aug 19, 2015
b23c4d3
Fix Broken Link
bllchmbrs Aug 19, 2015
f141efe
[SPARK-10070] [DOCS] Remove Guava dependencies in user guides
srowen Aug 19, 2015
865a3df
[DOCS] [SQL] [PYSPARK] Fix typo in ntile function
moutai Aug 19, 2015
ba2a07e
[SPARK-9977] [DOCS] Update documentation for StringIndexer
Lewuathe Aug 19, 2015
3d16a54
[SPARK-8949] Print warnings when using preferred locations feature
darkjh Aug 19, 2015
39e4ebd
[SPARK-10060] [ML] [DOC] spark.ml DecisionTree user guide
jkbradley Aug 19, 2015
802b5b8
[SPARK-10084] [MLLIB] [DOC] Add Python example for mllib FP-growth us…
yanboliang Aug 19, 2015
f3e1779
[SPARK-5754] [YARN] Spark/Yarn/Windows driver/executor escaping Fix
cbvoxel Aug 19, 2015
2fcb9cb
[SPARK-9856] [SPARKR] Add expression functions into SparkR whose para…
yu-iskw Aug 19, 2015
5fd53c6
[SPARK-9833] [YARN] Add options to disable delegation token retrieval.
Aug 19, 2015
28a9846
[SPARK-10097] Adds `shouldMaximize` flag to `ml.evaluation.Evaluator`
Aug 19, 2015
d898c33
[SPARK-10106] [SPARKR] Add `ifelse` Column function to SparkR
yu-iskw Aug 19, 2015
5b62bef
[SPARK-8918] [MLLIB] [DOC] Add @since tags to mllib.clustering
mengxr Aug 19, 2015
f3391ff
[SPARK-8889] [CORE] Fix for OOM for graph creation
rekhajoshm Aug 19, 2015
e05da5c
[SPARK-10107] [SQL] fix NPE in format_number
Aug 19, 2015
0888736
[SPARK-10073] [SQL] Python withColumn should replace the old column
Aug 19, 2015
21bdbe9
[SPARK-9627] [SQL] Stops using Scala runtime reflection in Dictionary…
liancheng Aug 19, 2015
1f4c4fe
[SPARK-10090] [SQL] fix decimal scale of division
Aug 19, 2015
f3ff4c4
[SPARK-9899] [SQL] Disables customized output committer when speculat…
liancheng Aug 19, 2015
373a376
[SPARK-10083] [SQL] CaseWhen should support type coercion of DecimalT…
adrian-wang Aug 19, 2015
e0dd130
[SPARK-10119] [CORE] Fix isDynamicAllocationEnabled when config is ex…
Aug 19, 2015
b0dbaec
[SPARK-6489] [SQL] add column pruning for Generate
cloud-fan Aug 19, 2015
8e0a072
[SPARK-9895] User Guide for RFormula Feature Transformer
ericl Aug 19, 2015
ba5f7e1
[SPARK-10035] [SQL] Parquet filters does not process EqualNullSafe fi…
HyukjinKwon Aug 20, 2015
2f2686a
[SPARK-9242] [SQL] Audit UDAF interface.
rxin Aug 20, 2015
1f29d50
[SPARK-9812] [STREAMING] Fix Python 3 compatibility issue in PySpark …
zsxwing Aug 20, 2015
affc8a8
[SPARK-10125] [STREAMING] Fix a potential deadlock in JobGenerator.stop
zsxwing Aug 20, 2015
73431d8
[SPARK-10124] [MESOS] Fix removing queued driver in mesos cluster mode.
tnachen Aug 20, 2015
b762f99
[SPARK-10128] [STREAMING] Used correct classloader to deserialize WAL…
tdas Aug 20, 2015
43e0135
[SPARK-10092] [SQL] Multi-DB support follow up.
yhuai Aug 20, 2015
b4f4e91
[SPARK-10100] [SQL] Eliminate hash table lookup if there is no groupi…
rxin Aug 20, 2015
52c6053
[MINOR] [SQL] Fix sphinx warnings in PySpark SQL
MechCoder Aug 20, 2015
39e91fe
[SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal…
ashkurenko Aug 20, 2015
85f9a61
[SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive…
liancheng Aug 20, 2015
12de348
[SPARK-10126] [PROJECT INFRA] Fix typo in release-build.sh which brok…
JoshRosen Aug 20, 2015
907df2f
[SQL] [MINOR] remove unnecessary class
cloud-fan Aug 20, 2015
2a3d98a
[SPARK-10138] [ML] move setters to MultilayerPerceptronClassifier and…
mengxr Aug 20, 2015
7cfc075
[SPARK-10108] Add since tags to mllib.feature
MechCoder Aug 20, 2015
eaafe13
[SPARK-9245] [MLLIB] LDA topic assignments
jkbradley Aug 20, 2015
afe9f03
[SPARK-9400] [SQL] codegen for StringLocate
tarekbecker Aug 20, 2015
cdd9a2b
[SPARK-10140] [DOC] add target fields to @Since
mengxr Aug 21, 2015
dcfe0c5
[SPARK-9846] [DOCS] User guide for Multilayer Perceptron Classifier
avulanov Aug 21, 2015
bb220f6
[SPARK-10040] [SQL] Use batch insert for JDBC writing
viirya Aug 21, 2015
708036c
[SPARK-9439] [YARN] External shuffle service robust to NM restarts us…
squito Aug 21, 2015
3c462f5
[SPARK-10130] [SQL] type coercion for IF should have children resolve…
adrian-wang Aug 21, 2015
d89cc38
[SPARK-10122] [PYSPARK] [STREAMING] Fix getOffsetRanges bug in PySpar…
jerryshao Aug 21, 2015
f5b028e
[SPARK-9864] [DOC] [MLlib] [SQL] Replace since in scaladoc to Since a…
MechCoder Aug 21, 2015
e335509
[SPARK-10143] [SQL] Use parquet's block size (row group size) setting…
yhuai Aug 21, 2015
f01c422
[SPARK-10163] [ML] Allow single-category features for GBT models
jkbradley Aug 21, 2015
630a994
[SPARK-9893] User guide with Java test suite for VectorSlicer
yinxusen Aug 21, 2015
46fcb9e
Update programming-guide.md
yosssi Aug 22, 2015
90cb9f0
[SPARK-9401] [SQL] Fully implement code generation for ConcatWs
yjshen Aug 22, 2015
623c675
Update streaming-programming-guide.md
yosssi Aug 23, 2015
c6df5f6
[SPARK-10148] [STREAMING] Display active and inactive receiver number…
zsxwing Aug 24, 2015
b963c19
[SPARK-10164] [MLLIB] Fixed GMM distributed decomposition bug
jkbradley Aug 24, 2015
053d94f
[SPARK-10142] [STREAMING] Made python checkpoint recovery handle non-…
tdas Aug 24, 2015
4e0395d
[SPARK-10168] [STREAMING] Fix the issue that maven publishes wrong ar…
zsxwing Aug 24, 2015
7478c8b
[SPARK-9791] [PACKAGE] Change private class to private class to preve…
tdas Aug 24, 2015
9ce0c7a
[SPARK-7710] [SPARK-7998] [DOCS] Docs for DataFrameStatFunctions
brkyvz Aug 24, 2015
662bb96
[SPARK-10144] [UI] Actually show peak execution memory by default
Aug 24, 2015
a2f4cdc
[SPARK-8580] [SQL] Refactors ParquetHiveCompatibilitySuite and adds m…
liancheng Aug 24, 2015
cb2d2e1
[SPARK-9758] [TEST] [SQL] Compilation issue for hive test / wrong pac…
srowen Aug 24, 2015
13db11c
[SPARK-10061] [DOC] ML ensemble docs
jkbradley Aug 24, 2015
d7b4c09
[SPARK-10190] Fix NPE in CatalystTypeConverters Decimal toScala conve…
JoshRosen Aug 24, 2015
2bf338c
[SPARK-10165] [SQL] Await child resolution in ResolveFunctions
marmbrus Aug 25, 2015
6511bf5
[SPARK-10118] [SPARKR] [DOCS] Improve SparkR API docs for 1.5 release
yu-iskw Aug 25, 2015
642c43c
[SQL] [MINOR] [DOC] Clarify docs for inferring DataFrame from RDD of …
Aug 25, 2015
a0c0aae
[SPARK-10121] [SQL] Thrift server always use the latest class loader …
yhuai Aug 25, 2015
5175ca0
[SPARK-10178] [SQL] HiveComparisionTest should print out dependent ta…
marmbrus Aug 25, 2015
d9c25de
[SPARK-9786] [STREAMING] [KAFKA] fix backpressure so it works with defa…
koeninger Aug 25, 2015
f023aa2
[SPARK-10137] [STREAMING] Avoid to restart receivers if scheduleRecei…
zsxwing Aug 25, 2015
df7041d
[SPARK-10196] [SQL] Correctly saving decimals in internal rows to JSON.
yhuai Aug 25, 2015
bf03fe6
[SPARK-10136] [SQL] A more robust fix for SPARK-10136
liancheng Aug 25, 2015
82268f0
[SPARK-9293] [SPARK-9813] Analysis should check that set operations a…
JoshRosen Aug 25, 2015
d4549fe
[SPARK-10214] [SPARKR] [DOCS] Improve SparkR Column, DataFrame API docs
yu-iskw Aug 25, 2015
57b960b
[SPARK-6196] [BUILD] Remove MapR profiles in favor of hadoop-provided
srowen Aug 25, 2015
1fc3758
[SPARK-10210] [STREAMING] Filter out non-existent blocks before creat…
tdas Aug 25, 2015
2f493f7
[SPARK-10177] [SQL] fix reading Timestamp in parquet from Hive
Aug 25, 2015
7bc9a8c
[SPARK-10195] [SQL] Data sources Filter should not expose internal types
JoshRosen Aug 25, 2015
0e6368f
[SPARK-10197] [SQL] Add null check in wrapperFor (inside HiveInspecto…
yhuai Aug 25, 2015
5c14890
[DOC] add missing parameters in SparkContext.scala for scala doc
liyezhang556520 Aug 25, 2015
7f1e507
Fixed a typo in DAGScheduler.
zzvara Aug 25, 2015
69c9c17
[SPARK-9613] [CORE] Ban use of JavaConversions and migrate all existi…
srowen Aug 25, 2015
5c08c86
[SPARK-10198] [SQL] Turn off partition verification by default
marmbrus Aug 25, 2015
b37f0cc
[SPARK-8531] [ML] Update ML user guide for MinMaxScaler
hhbyyh Aug 25, 2015
881208a
[SPARK-10230] [MLLIB] Rename optimizeAlpha to optimizeDocConcentration
Aug 25, 2015
16a2be1
[SPARK-10231] [MLLIB] update @Since annotation for mllib.classification
mengxr Aug 25, 2015
71a138c
[SPARK-10048] [SPARKR] Support arbitrary nested Java array in serde.
Aug 25, 2015
c0e9ff1
[SPARK-9800] Adds docs for GradientDescent$.runMiniBatchSGD alias
Aug 25, 2015
c619c75
[SPARK-10237] [MLLIB] update since versions in mllib.fpm
mengxr Aug 25, 2015
9205907
[SPARK-9797] [MLLIB] [DOC] StreamingLinearRegressionWithSGD.setConver…
Aug 25, 2015
00ae4be
[SPARK-10239] [SPARK-10244] [MLLIB] update since versions in mllib.pm…
mengxr Aug 25, 2015
ec89bd8
[SPARK-10245] [SQL] Fix decimal literals with precision < scale
Aug 25, 2015
7467b52
[SPARK-10215] [SQL] Fix precision of division (follow the rule in Hive)
Aug 25, 2015
125205c
[SPARK-9888] [MLLIB] User guide for new LDA features
Aug 26, 2015
8668ead
[SPARK-10233] [MLLIB] update since version in mllib.evaluation
mengxr Aug 26, 2015
ab431f8
[SPARK-10238] [MLLIB] update since versions in mllib.linalg
mengxr Aug 26, 2015
c3a5484
[SPARK-10240] [SPARK-10242] [MLLIB] update since versions in mlilb.ra…
mengxr Aug 26, 2015
d703372
[SPARK-10234] [MLLIB] update since version in mllib.clustering
mengxr Aug 26, 2015
fb7e12f
[SPARK-10243] [MLLIB] update since versions in mllib.tree
mengxr Aug 26, 2015
4657fa1
[SPARK-10235] [MLLIB] update since versions in mllib.regression
mengxr Aug 26, 2015
321d775
[SPARK-10236] [MLLIB] update since versions in mllib.feature
mengxr Aug 26, 2015
75d4773
[SPARK-9316] [SPARKR] Add support for filtering using `[` (synonym fo…
felixcheung Aug 26, 2015
bb16405
Closes #8443
rxin Aug 26, 2015
5 changes: 5 additions & 0 deletions .gitignore
@@ -6,6 +6,7 @@
*.iml
*.iws
*.pyc
*.pyo
.idea/
.idea_modules/
build/*.jar
@@ -62,6 +63,10 @@ ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
R/unit-tests.out
python/lib/pyspark.zip
lint-r-report.log

# For Hive
metastore_db/
30 changes: 30 additions & 0 deletions .rat-excludes
@@ -1,4 +1,5 @@
target
cache
.gitignore
.gitattributes
.project
@@ -14,20 +15,28 @@ TAGS
RELEASE
control
docs
docker.properties.template
fairscheduler.xml.template
spark-defaults.conf.template
log4j.properties
log4j.properties.template
metrics.properties
metrics.properties.template
slaves
slaves.template
spark-env.sh
spark-env.cmd
spark-env.sh.template
log4j-defaults.properties
log4j-defaults-repl.properties
bootstrap-tooltip.js
jquery-1.11.1.min.js
d3.min.js
dagre-d3.min.js
graphlib-dot.min.js
sorttable.js
vis.min.js
vis.min.css
.*avsc
.*txt
.*json
@@ -65,3 +74,24 @@ logs
.*scalastyle-output.xml
.*dependency-reduced-pom.xml
known_translations
json_expectation
local-1422981759269/*
local-1422981780767/*
local-1425081759269/*
local-1426533911241/*
local-1426633911242/*
local-1430917381534/*
local-1430917381535_1
local-1430917381535_2
DESCRIPTION
NAMESPACE
test_support/*
.*Rd
help/*
html/*
INDEX
.lintr
gen-java.*
.*avpr
org.apache.spark.sql.sources.DataSourceRegister
.*parquet
22 changes: 13 additions & 9 deletions CONTRIBUTING.md
@@ -1,12 +1,16 @@
 ## Contributing to Spark
 
-Contributions via GitHub pull requests are gladly accepted from their original
-author. Along with any pull requests, please state that the contribution is
-your original work and that you license the work to the project under the
-project's open source license. Whether or not you state this explicitly, by
-submitting any copyrighted material via pull request, email, or other means
-you agree to license the material under the project's open source license and
-warrant that you have the legal authority to do so.
+*Before opening a pull request*, review the
+[Contributing to Spark wiki](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark).
+It lists steps that are required before creating a PR. In particular, consider:
+
+- Is the change important and ready enough to ask the community to spend time reviewing?
+- Have you searched for existing, related JIRAs and pull requests?
+- Is this a new feature that can stand alone as a package on http://spark-packages.org ?
+- Is the change being proposed clearly explained and motivated?
 
-Please see the [Contributing to Spark wiki page](https://cwiki.apache.org/SPARK/Contributing+to+Spark)
-for more information.
+When you contribute code, you affirm that the contribution is your original work and that you
+license the work to the project under the project's open source license. Whether or not you
+state this explicitly, by submitting any copyrighted material via pull request, email, or
+other means you agree to license the material under the project's open source license and
+warrant that you have the legal authority to do so.
114 changes: 112 additions & 2 deletions LICENSE
@@ -643,6 +643,36 @@ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
For d3 (core/src/main/resources/org/apache/spark/ui/static/d3.min.js):
========================================================================

Copyright (c) 2010-2015, Michael Bostock
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* The name Michael Bostock may not be used to endorse or promote products
derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL MICHAEL BOSTOCK BE LIABLE FOR ANY DIRECT,
INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

========================================================================
For Scala Interpreter classes (all .scala files in repl/src/main/scala
@@ -771,6 +801,22 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For TestTimSort (core/src/test/java/org/apache/spark/util/collection/TestTimSort.java):
========================================================================
Copyright (C) 2015 Stijn de Gouw

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For LimitedInputStream
@@ -790,6 +836,68 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

========================================================================
For vis.js (core/src/main/resources/org/apache/spark/ui/static/vis.min.js):
========================================================================
Copyright (C) 2010-2015 Almende B.V.

Vis.js is dual licensed under both

* The Apache 2.0 License
http://www.apache.org/licenses/LICENSE-2.0

and

* The MIT License
http://opensource.org/licenses/MIT

Vis.js may be distributed under either license.

========================================================================
For dagre-d3 (core/src/main/resources/org/apache/spark/ui/static/dagre-d3.min.js):
========================================================================
Copyright (c) 2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
For graphlib-dot (core/src/main/resources/org/apache/spark/ui/static/graphlib-dot.min.js):
========================================================================
Copyright (c) 2012-2013 Chris Pettitt

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

========================================================================
BSD-style licenses
@@ -798,7 +906,8 @@ BSD-style licenses
The following components are provided under a BSD-style license. See project link for details.

(BSD 3 Clause) core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.3 - http://jblas.org/)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD License) Javolution (javolution:javolution:5.5.1 - http://javolution.org)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
@@ -839,5 +948,6 @@ The following components are provided under the MIT License. See project link fo
(MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-all:1.8.5 - http://www.mockito.org)
(The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
6 changes: 6 additions & 0 deletions R/.gitignore
@@ -0,0 +1,6 @@
*.o
*.so
*.Rd
lib
pkg/man
pkg/html
12 changes: 12 additions & 0 deletions R/DOCUMENTATION.md
@@ -0,0 +1,12 @@
# SparkR Documentation

SparkR documentation is generated from in-source comments annotated using
`roxygen2`. After making changes to the documentation, you can generate the man pages
by running the following from an R console in the SparkR home directory:

    library(devtools)
    devtools::document(pkg="./pkg", roclets=c("rd"))

You can verify that your changes are good by running:

    R CMD check pkg/
67 changes: 67 additions & 0 deletions R/README.md
@@ -0,0 +1,67 @@
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.

### SparkR development

#### Build Spark

Build Spark with [Maven](http://spark.apache.org/docs/latest/building-spark.html#building-with-buildmvn) and include the `-Psparkr` profile to build the R package. For example, to use the default Hadoop versions, you can run:
```
build/mvn -DskipTests -Psparkr package
```

#### Running sparkR

You can start using SparkR by launching the SparkR shell with

    ./bin/sparkR

The `sparkR` script automatically creates a SparkContext, running Spark in local mode by
default. To specify the Spark master for the automatically created SparkContext, you can
run

    ./bin/sparkR --master "local[2]"

To set other options such as driver memory or executor memory, you can pass [spark-submit](http://spark.apache.org/docs/latest/submitting-applications.html) arguments to `./bin/sparkR`.
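
As a rough sketch of passing spark-submit arguments through `./bin/sparkR`, the snippet below assembles a command with a master and a driver memory setting and prints it rather than running it. The variable names and the values `local[2]` and `2g` are illustrative assumptions, not recommendations; `--driver-memory` is a standard spark-submit option.

```shell
# Illustrative values only; adjust them to your machine or cluster.
SPARKR_MASTER="local[2]"
SPARKR_DRIVER_MEMORY="2g"

# Assemble the command and print it so it can be inspected before running.
SPARKR_CMD="./bin/sparkR --master $SPARKR_MASTER --driver-memory $SPARKR_DRIVER_MEMORY"
echo "$SPARKR_CMD"
```

When typing the command directly, keep the quotes around `local[2]` so the shell does not treat the brackets as a glob pattern.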

#### Using SparkR from RStudio

If you wish to use SparkR from RStudio or other R frontends, you will need to set some environment variables that point SparkR to your Spark installation. For example:
```
# Set this to where Spark is installed
Sys.setenv(SPARK_HOME="/Users/shivaram/spark")
# This line loads SparkR from the installed directory
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sc <- sparkR.init(master="local")
```
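
Before starting RStudio, it can help to confirm that the SparkR package is actually present in the library directory the snippet above adds to `.libPaths()`. The helper below is a hypothetical sketch, not part of Spark; it assumes the package was installed into `$SPARK_HOME/R/lib/SparkR`, which is where an R package named SparkR would sit inside that library directory.

```shell
# Hypothetical helper (not part of Spark): check that SparkR is installed
# under <spark home>/R/lib and print that library path on success.
sparkr_installed() {
  # Use the first argument if given, otherwise fall back to $SPARK_HOME.
  spark_home="${1:-${SPARK_HOME:-}}"
  if [ -z "$spark_home" ]; then
    echo "SPARK_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "$spark_home/R/lib/SparkR" ]; then
    echo "SparkR not found in $spark_home/R/lib" >&2
    return 1
  fi
  echo "$spark_home/R/lib"
}
```

Calling `sparkr_installed /path/to/spark` prints the library path on success and returns a non-zero status otherwise.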

#### Making changes to SparkR

The [instructions](https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark) for making contributions to Spark also apply to SparkR.
If you only make R file changes (i.e. no Scala changes) then you can just re-install the R package using `R/install-dev.sh` and test your changes.
Once you have made your changes, please include unit tests for them and run existing unit tests using the `run-tests.sh` script as described below.

#### Generating documentation

The SparkR documentation (Rd files and HTML files) is not part of the source repository. To generate it, run the script `R/create-docs.sh`. This script uses `devtools` and `knitr` to generate the docs, so both packages need to be installed on the machine before you run it.

### Examples, Unit tests

SparkR comes with several sample programs in the `examples/src/main/r` directory.
To run one of them, use `./bin/sparkR <filename> <args>`. For example:

    ./bin/sparkR examples/src/main/r/dataframe.R

You can also run the unit tests for SparkR (you need to install the [testthat](http://cran.r-project.org/web/packages/testthat/index.html) package first):

    R -e 'install.packages("testthat", repos="http://cran.us.r-project.org")'
    ./R/run-tests.sh

### Running on YARN
The `./bin/spark-submit` and `./bin/sparkR` scripts can also be used to submit jobs to YARN clusters. You will need to set the YARN configuration directory (`YARN_CONF_DIR`) before doing so. For example, on CDH you can run:
```
export YARN_CONF_DIR=/etc/hadoop/conf
./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```
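
Since a missing or wrong `YARN_CONF_DIR` is easy to overlook, a small guard can fail early before the job is submitted. The function below is a hypothetical sketch, not part of Spark; it only checks that the variable is set and points at an existing directory.

```shell
# Hypothetical helper (not part of Spark): fail early if YARN_CONF_DIR
# is unset, empty, or does not point at a directory.
check_yarn_conf() {
  if [ -z "${YARN_CONF_DIR:-}" ]; then
    echo "YARN_CONF_DIR is not set" >&2
    return 1
  fi
  if [ ! -d "$YARN_CONF_DIR" ]; then
    echo "YARN_CONF_DIR is not a directory: $YARN_CONF_DIR" >&2
    return 1
  fi
  return 0
}

# Example use (commented out so the snippet is safe to source):
# check_yarn_conf && ./bin/spark-submit --master yarn examples/src/main/r/dataframe.R
```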
13 changes: 13 additions & 0 deletions R/WINDOWS.md
@@ -0,0 +1,13 @@
## Building SparkR on Windows

To build SparkR on Windows, the following steps are required:

1. Install R (>= 3.1) and [Rtools](http://cran.r-project.org/bin/windows/Rtools/). Make sure to
   include Rtools and R in `PATH`.
2. Install
   [JDK7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) and set
   `JAVA_HOME` in the system environment variables.
3. Download and install [Maven](http://maven.apache.org/download.html). Also add Maven's `bin`
   directory to `PATH`.
4. Set `MAVEN_OPTS` as described in [Building Spark](http://spark.apache.org/docs/latest/building-spark.html).
5. Open a command shell (`cmd`) in the Spark directory and run `mvn -DskipTests -Psparkr package`.