[ADAM-1615] Add transform and transmute APIs to Java, R, and Python #1627

fnothaft (Member) commented Jul 24, 2017

Resolves #1615. Adds GenomicDatasetConversion class to adam-core API, along with accompanying implementations in org.bdgenomics.adam.api.java package. Adds transformDataFrame and transmuteDataFrame to GenomicRDD API. To enable use from Python and R, we provide a DataFrameConversionWrapper that wraps the product of an R/Python function in a Spark Java API Function.
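
As a rough illustration of the distinction (this is not code from the PR, and every name below is a hypothetical stand-in rather than ADAM's API): a transform returns the same dataset type it was called on, while a transmute rewraps the result as a different dataset type. The `PrecomputedResult` class is one reading of "wraps the product of an R/Python function": it assumes the function has already been run on the Python/R side, so the wrapper only needs to hand the finished result back when called.

```python
class ToyDataset:
    """Hypothetical dataset that holds its rows as a list of dicts."""

    def __init__(self, rows):
        self.rows = list(rows)

    def transform_rows(self, fn):
        # transform: the result keeps the type of the dataset it came from
        return type(self)(fn(self.rows))

    def transmute_rows(self, fn, dest_class):
        # transmute: the result is rewrapped as a different dataset type
        return dest_class(fn(self.rows))


class ToyReads(ToyDataset):
    pass


class ToyFeatures(ToyDataset):
    pass


class PrecomputedResult:
    """Callable that ignores its input and returns an already-computed product.

    Assumption: this mimics wrapping the *product* of a user function,
    not the function itself.
    """

    def __init__(self, product):
        self._product = product

    def __call__(self, _ignored_input):
        return self._product


reads = ToyReads([{"name": "r1", "start": 10}, {"name": "r2", "start": 3}])

# Same type in, same type out.
kept = reads.transform_rows(lambda rows: [r for r in rows if r["start"] > 5])

# Same rows in, different dataset type out.
features = reads.transmute_rows(
    lambda rows: [{"feature": r["name"]} for r in rows], ToyFeatures)

# The wrapper returns the stored product regardless of what it is called with.
wrapped = PrecomputedResult(kept.rows)
```

In this sketch the caller of `transmute_rows` has to name the destination type explicitly, because a type change cannot be read off the row function alone.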

fnothaft added this to the 0.23.0 milestone Jul 24, 2017
coveralls commented Jul 24, 2017

Coverage decreased (-0.5%) to 83.464% when pulling b5b9a5f on fnothaft:issues/1615-transform-transmute-java-r-python into 9bc5e15 on bigdatagenomics:master.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2276/

devin-petersohn (Member) left a comment

Looks good!

}
import org.bdgenomics.adam.sql._
import scala.reflect.runtime.universe._
import scala.reflect.runtime.universe.TypeTag
Member

Can this line be removed?

fnothaft (Member, Author)

Ah yeah, it should be removed.

Member

There are a few more instances of this in this PR.

@@ -42,6 +42,8 @@ import org.bdgenomics.utils.interval.array.{
import scala.collection.JavaConversions._
import scala.math.max
import scala.reflect.ClassTag
import scala.reflect.runtime.universe._
import scala.reflect.runtime.universe.TypeTag
Member

This line can be removed also.

heuermh (Member) left a comment

Will review again after rebase

dfFn = self._wrapTransformation(tFn)

# if no conversion function is provided, try to infer
if convFn is None:
Member

This method name stuff is brittle, but I can't see any other way to do it
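
For readers outside the diff, a minimal sketch of what name-based inference can look like and why it is fragile. This is not the PR's code: the classes, the suffix table, and the `SourceToDestSuffix` naming scheme are all assumptions made for illustration.

```python
# Hypothetical dataset classes; stand-ins only.
class ToyReads:
    pass


class ToyFeatures:
    pass


class ToyVariants:
    pass


# Hypothetical table mapping a destination class to a converter-name suffix.
DEST_SUFFIXES = {
    ToyFeatures: "FeaturesConverter",
    ToyVariants: "VariantsConverter",
}


def infer_converter_name(source_prefix, dest_class, suffixes=DEST_SUFFIXES):
    """Build a converter name from a source prefix and a destination class.

    The result is only a string: nothing checks that a converter with this
    name exists, which is where the brittleness comes from.
    """
    try:
        suffix = suffixes[dest_class]
    except KeyError:
        raise ValueError(
            "No conversion known for destination class %s; "
            "pass a conversion function explicitly." % dest_class.__name__
        ) from None
    return "%sTo%s" % (source_prefix, suffix)


name = infer_converter_name("Reads", ToyFeatures)
```

Renaming a converter class on one side without updating the string pieces on the other would only surface at runtime, which is why an explicit conversion function remains available as a fallback in the snippet under review (`if convFn is None`).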

fnothaft force-pushed the issues/1615-transform-transmute-java-r-python branch from b5b9a5f to 6778abd on August 2, 2017 18:09
coveralls commented Aug 2, 2017

Coverage decreased (-0.5%) to 83.464% when pulling 6778abd on fnothaft:issues/1615-transform-transmute-java-r-python into 9032698 on bigdatagenomics:master.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2310/

heuermh merged commit ca3c587 into bigdatagenomics:master Aug 2, 2017

heuermh (Member) commented Aug 2, 2017

Thank you, @fnothaft!
