@kanzhang
Contributor

No description provided.

@AmplabJenkins

Can one of the admins verify this patch?

@kanzhang
Contributor Author

@marmbrus I tried to implement the formula you gave on the mailing list, but I'm not sure whether I missed anything. Please take a look. Note that I changed Count() to return Long to match RDD.count(). On the Python side, the original rdd.count() returns an int.

@ash211
Contributor

ash211 commented May 20, 2014

Thanks for the contribution! Could use it in my own workflows.

Python ints are signed 32-bit numbers, right? We should make that a long explicitly, unless Python does the right thing and promotes to a long rather than overflowing.

@kanzhang
Contributor Author

@ash211 In Python 2.x, an int is automatically promoted to a long when it overflows. The difference only shows up in doctests, where you have to be explicit about whether the expected result is 3 or 3L.
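
For readers on Python 3, where `int` and `long` are unified into a single arbitrary-precision `int`, the promotion behaviour discussed above can be checked with a small sketch. The helper name `add_one` and the two constants are illustrative assumptions, not code from this PR:

```python
# Illustrative sketch (not from the PR): Python integers grow past the
# 32-bit and 64-bit signed limits instead of wrapping around.
INT32_MAX = 2**31 - 1   # largest signed 32-bit value
INT64_MAX = 2**63 - 1   # largest signed 64-bit value (Scala/Java Long)


def add_one(n):
    """Return n + 1; the result is still an exact integer, never an overflow."""
    return n + 1


past_int32 = add_one(INT32_MAX)
past_int64 = add_one(INT64_MAX)

print(past_int32)  # 2147483648
print(past_int64)  # 9223372036854775808
```

A count coming back from the JVM as a Long therefore fits in a Python integer without explicit conversion; the `3` versus `3L` distinction mentioned above is a Python 2 repr detail that only affects expected doctest output.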

@rxin
Contributor

rxin commented May 21, 2014

He's on vacation this week so it might take a while for him to get back :)

@kanzhang
Contributor Author

@rxin thanks for the heads up. I'd appreciate help from anyone in burning down my open PRs, the oldest of which is over a month old.

Contributor

Do you mind adding Javadoc for this? Just explain that, unlike RDD's count, SchemaRDD's count actually invokes the optimizer.

Contributor Author

Sure, will do.

@kanzhang kanzhang changed the title [SPARK-1822] SchemaRDD.count() should use optimizer [SPARK-1822] SchemaRDD.count() should use SQL optimizer May 22, 2014
@kanzhang kanzhang changed the title [SPARK-1822] SchemaRDD.count() should use SQL optimizer [SPARK-1822] SchemaRDD.count() should use Catalyst optimizer May 23, 2014
@kanzhang kanzhang changed the title [SPARK-1822] SchemaRDD.count() should use Catalyst optimizer [SPARK-1822] SchemaRDD.count() should use query optimizer May 23, 2014
@rxin
Contributor

rxin commented May 25, 2014

Jenkins, add to whitelist.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15183/

@rxin
Contributor

rxin commented May 25, 2014

Thanks. I've merged this into master & branch-1.0.

asfgit pushed a commit that referenced this pull request May 25, 2014
Author: Kan Zhang <kzhang@apache.org>

Closes #841 from kanzhang/SPARK-1822 and squashes the following commits:

2f8072a [Kan Zhang] [SPARK-1822] Minor style update
cf4baa4 [Kan Zhang] [SPARK-1822] Adding Scaladoc
e67c910 [Kan Zhang] [SPARK-1822] SchemaRDD.count() should use optimizer

(cherry picked from commit 6052db9)
Signed-off-by: Reynold Xin <rxin@apache.org>
@asfgit asfgit closed this in 6052db9 May 25, 2014
asfgit pushed a commit that referenced this pull request May 25, 2014
Minor cleanup following #841.

Author: Reynold Xin <rxin@apache.org>

Closes #868 from rxin/schema-count and squashes the following commits:

5442651 [Reynold Xin] SPARK-1822: Some minor cleanup work on SchemaRDD.count()

(cherry picked from commit d66642e)
Signed-off-by: Reynold Xin <rxin@apache.org>
@kanzhang
Contributor Author

@rxin thanks for the cleanup!

@kanzhang kanzhang deleted the SPARK-1822 branch May 25, 2014 22:58