[ADAM-1783] Resolve check issues that block pushing to CRAN. #1849

Merged: 2 commits merged into bigdatagenomics:master from issues/1783-cran on Jan 4, 2018

@fnothaft fnothaft commented Jan 2, 2018

Based on #1848. Resolves #1783 and #1847. Cleans up a host of documentation issues that would cause warnings when submitting to CRAN. TODO for tomorrow:

- [ ] Make sure that JAR is packaged in properly
- [ ] Test on linux
- [ ] Possibly test on Windows via CRAN windows test server
- [ ] Add CRAN markdown readme

- Integrate into release flow

@fnothaft fnothaft added this to the 0.23.0 milestone Jan 2, 2018
@fnothaft fnothaft requested a review from heuermh January 2, 2018 08:27
@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2535/

Build result: FAILURE

[...truncated 15 lines...]
 > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1849/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 7c4f9a3 # timeout=10
Checking out Revision 7c4f9a3 (origin/pr/1849/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 7c4f9a3328d1e79afb3b23f453e4ec0a1c7b338b
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centos
Triggering ADAM-prb ? 2.7.3,2.10,1.6.3,centos
Triggering ADAM-prb ? 2.7.3,2.10,2.2.0,centos
Triggering ADAM-prb ? 2.7.3,2.11,1.6.3,centos
Triggering ADAM-prb ? 2.6.2,2.10,2.2.0,centos
Triggering ADAM-prb ? 2.6.2,2.10,1.6.3,centos
Triggering ADAM-prb ? 2.6.2,2.11,2.2.0,centos
Triggering ADAM-prb ? 2.7.3,2.11,2.2.0,centos
ADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result SUCCESS
ADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILURE
ADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILURE
ADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURE
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

#' @export
setClass("GenomicRDD",
         slots = list(jrdd = "jobj"))


#' A class that wraps DataFrame of genomic data with helpful metadata.

wraps DataFrame → wraps a DataFrame

FeatureRDD <- function(jrdd) {
  new("FeatureRDD", jrdd = jrdd)
}

#' A class that wraps an RDD of read fragments with helpful metadata.

I find "read fragments" to be a bit confusing, but I suppose we use that elsewhere. "Read pairs grouped by sequencing fragment" is probably too verbose?

@@ -474,89 +575,64 @@ setMethod("recalibrateBaseQualities",

#' Realigns indels using a concensus-based heuristic.

concensus → consensus

@heuermh

heuermh commented Jan 2, 2018

As far as I know, SparkR ran into trouble with the CRAN submission process because Spark dropped temporary files in places it shouldn't. We might need to look out for the same. I had some Spark JIRA issues on the topic bookmarked but apparently not on this computer.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2537/

Build result: FAILURE

[...truncated 15 lines...]
 > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse origin/pr/1849/merge^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 9628690 # timeout=10
Checking out Revision 9628690 (origin/pr/1849/merge)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 962869088b4d0d35fc231fca86e3b0aa120fb6c0
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centos
Triggering ADAM-prb ? 2.7.3,2.10,1.6.3,centos
Triggering ADAM-prb ? 2.7.3,2.10,2.2.0,centos
Triggering ADAM-prb ? 2.7.3,2.11,1.6.3,centos
Triggering ADAM-prb ? 2.6.2,2.10,2.2.0,centos
Triggering ADAM-prb ? 2.6.2,2.10,1.6.3,centos
Triggering ADAM-prb ? 2.6.2,2.11,2.2.0,centos
Triggering ADAM-prb ? 2.7.3,2.11,2.2.0,centos
ADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILURE
ADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result SUCCESS
ADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result SUCCESS
ADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result SUCCESS
ADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result SUCCESS
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@fnothaft

fnothaft commented Jan 3, 2018

Jenkins, retest this please.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2538/
Test PASSed.

@fnothaft

fnothaft commented Jan 3, 2018

As per discussion with @heuermh, we don't think we can push to CRAN until SparkR is back in CRAN due to:

Packages on which a CRAN package depends should be available from a mainstream
repository: if any mentioned in ‘Suggests’ or ‘Enhances’ fields are not from
such a repository, where to obtain them at a repository should be specified in an
‘Additional_repositories’ field of the DESCRIPTION file (as a comma-separated list
of repository URLs) or for other means of access, described in the ‘Description’ field.
A package listed in ‘Suggests’ or ‘Enhances’ should be used conditionally in examples
or tests if it cannot straightforwardly be installed on the major R platforms

With SparkR being temporarily removed from CRAN, it is not available from a mainstream repository. The above snippet is from the CRAN policy guide. In lieu of CRAN, we will distribute the R tarball.
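The policy quoted above can be checked mechanically before a submission. What follows is a minimal sketch in Python, not part of this PR: the DESCRIPTION text, the package name `examplepkg`, and the `MAINSTREAM` set are all hypothetical, and the sketch simply assumes that any `Additional_repositories` entry covers the packages in question.

```python
import re

def parse_description(text):
    """Parse 'Field: value' lines (continuation lines start with whitespace)."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and current is not None:
            # Continuation line: append to the field being read.
            fields[current] += " " + line.strip()
        else:
            name, _, value = line.partition(":")
            current = name.strip()
            fields[current] = value.strip()
    return fields

def package_names(field_value):
    """Split a comma-separated dependency field, dropping version constraints."""
    names = []
    for item in field_value.split(","):
        name = re.sub(r"\(.*?\)", "", item).strip()
        if name:
            names.append(name)
    return names

def unavailable_suggests(text, mainstream):
    """List Suggests/Enhances packages that are not in `mainstream`.

    Simplifying assumption: if Additional_repositories is set at all,
    it is taken to cover every such package.
    """
    fields = parse_description(text)
    if fields.get("Additional_repositories"):
        return []
    suggested = (package_names(fields.get("Suggests", ""))
                 + package_names(fields.get("Enhances", "")))
    return [p for p in suggested if p not in mainstream]

# Hypothetical DESCRIPTION and a hypothetical view of what CRAN carries.
EXAMPLE = """Package: examplepkg
Version: 0.1.0
Suggests: SparkR (>= 2.1.0),
    testthat
"""
MAINSTREAM = {"testthat"}

print(unavailable_suggests(EXAMPLE, MAINSTREAM))
```

A non-empty result would mean the package either needs an `Additional_repositories` field or has to wait until the dependency is back in a mainstream repository, which is the situation described in this comment.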

@fnothaft fnothaft mentioned this pull request Jan 3, 2018
5 tasks
Resolves bigdatagenomics#1847. Cribs heavily from PySpark's script flow for supporting a full,
self-contained pip install-able Spark by finding the JARs and bin scripts and
packaging them up as packages which are deployed to pip. We then needed to
modify the bin scripts to find the pip installed JARs.
Resolves bigdatagenomics#1783. Cleans up a host of documentation issues that would cause
warnings when submitting to CRAN.
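The first commit message describes packaging the JARs and bin scripts into the pip distribution and then teaching the scripts to find the pip-installed JARs. A rough sketch of that lookup idea is below. It is not the code from this PR: the function names, the `jars/adam.jar` layout under each root, and the use of an `ADAM_HOME` environment variable as the first candidate are assumptions made for illustration.

```python
import os

def find_adam_jar(candidate_roots, jar_name="adam.jar"):
    """Return the first existing <root>/jars/<jar_name>, or None if none exists."""
    for root in candidate_roots:
        path = os.path.join(root, "jars", jar_name)
        if os.path.isfile(path):
            return path
    return None

def default_candidates(module_file, environ=os.environ):
    """Build the search order: an explicit ADAM_HOME (assumed name) first,
    then the directory of the installed Python module."""
    roots = []
    if environ.get("ADAM_HOME"):
        roots.append(environ["ADAM_HOME"])
    roots.append(os.path.dirname(os.path.abspath(module_file)))
    return roots

if __name__ == "__main__":
    # Prints the resolved JAR path, or None when nothing is found.
    print(find_adam_jar(default_candidates(__file__)))
```

Checking an explicit override before the install location keeps a source checkout usable alongside a pip install.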
@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2539/
Test PASSed.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2540/
Test PASSed.

@heuermh

heuermh commented Jan 3, 2018

Are twine and pypandoc supposed to be installed as part of the virtualenv?

...
adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ virtualenv release-venv
Using base prefix '/Users/mheuer2/miniconda3'
New python executable in /Users/mheuer2/working/adam/adam-python/release-venv/bin/python
Installing setuptools, pip, wheel...done.

adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ . release-venv/bin/activate

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ pip install pyspark
Collecting pyspark
  Downloading pyspark-2.2.0.post0.tar.gz (188.3MB)
    100% |████████████████████████████████| 188.3MB 9.3kB/s
Collecting py4j==0.10.4 (from pyspark)
  Downloading py4j-0.10.4-py2.py3-none-any.whl (186kB)
    100% |████████████████████████████████| 194kB 4.4MB/s
Building wheels for collected packages: pyspark
  Running setup.py bdist_wheel for pyspark ... done
  Stored in directory: /Users/mheuer2/Library/Caches/pip/wheels/5f/0b/b3/5cb16b15d28dcc32f8e7ec91a044829642874bb7586f6e6cbe
Successfully built pyspark
Installing collected packages: py4j, pyspark
Successfully installed py4j-0.10.4 pyspark-2.2.0

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ rm -rf dist

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ make sdist
python2.7 setup.py sdist
Could not import pypandoc - required to package bdgenomics.adam
/usr/local/lib/python2.7/site-packages/setuptools/dist.py:360: UserWarning: The version specified ('0.23.0-SNAPSHOT') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
  "details." % self.metadata.version
running sdist
running egg_info
creating bdgenomics.adam.egg-info
writing requirements to bdgenomics.adam.egg-info/requires.txt
writing bdgenomics.adam.egg-info/PKG-INFO
writing top-level names to bdgenomics.adam.egg-info/top_level.txt
writing dependency_links to bdgenomics.adam.egg-info/dependency_links.txt
writing manifest file 'bdgenomics.adam.egg-info/SOURCES.txt'
package init file 'deps/jars/__init__.py' not found (or not a regular file)
package init file 'deps/bin/__init__.py' not found (or not a regular file)
reading manifest file 'bdgenomics.adam.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no previously-included files matching '*.py[cod]' found anywhere in distribution
warning: no previously-included files matching '__pycache__' found anywhere in distribution
warning: no previously-included files matching '.DS_Store' found anywhere in distribution
writing manifest file 'bdgenomics.adam.egg-info/SOURCES.txt'
running check
creating bdgenomics.adam-0.23.0-SNAPSHOT
creating bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics
creating bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
creating bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
creating bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
creating bdgenomics.adam-0.23.0-SNAPSHOT/deps
creating bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
creating bdgenomics.adam-0.23.0-SNAPSHOT/deps/jars
copying files to bdgenomics.adam-0.23.0-SNAPSHOT...
copying MANIFEST.in -> bdgenomics.adam-0.23.0-SNAPSHOT
copying README.md -> bdgenomics.adam-0.23.0-SNAPSHOT
copying setup.py -> bdgenomics.adam-0.23.0-SNAPSHOT
copying version.py -> bdgenomics.adam-0.23.0-SNAPSHOT
copying bdgenomics/__init__.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics
copying bdgenomics.adam.egg-info/PKG-INFO -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
copying bdgenomics.adam.egg-info/SOURCES.txt -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
copying bdgenomics.adam.egg-info/dependency_links.txt -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
copying bdgenomics.adam.egg-info/requires.txt -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
copying bdgenomics.adam.egg-info/top_level.txt -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics.adam.egg-info
copying bdgenomics/adam/__init__.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
copying bdgenomics/adam/adamContext.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
copying bdgenomics/adam/find_adam_home.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
copying bdgenomics/adam/rdd.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
copying bdgenomics/adam/stringency.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam
copying bdgenomics/adam/test/__init__.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying bdgenomics/adam/test/adamContext_test.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying bdgenomics/adam/test/alignmentRecordRdd_test.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying bdgenomics/adam/test/featureRdd_test.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying bdgenomics/adam/test/genotypeRdd_test.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying bdgenomics/adam/test/variantRdd_test.py -> bdgenomics.adam-0.23.0-SNAPSHOT/bdgenomics/adam/test
copying deps/bin/adam-shell -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/adam-submit -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/adamR -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/find-adam-assembly.sh -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/find-adam-egg.sh -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/find-adam-home -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/find-spark.sh -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/bin/pyadam -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/bin
copying deps/jars/adam.jar -> bdgenomics.adam-0.23.0-SNAPSHOT/deps/jars
Writing bdgenomics.adam-0.23.0-SNAPSHOT/setup.cfg
creating dist
Creating tar archive
removing 'bdgenomics.adam-0.23.0-SNAPSHOT' (and everything under it)

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ twine upload dist/*.tar.gz
-bash: twine: command not found

Then I suppose I should update the version in version.py to an rc before attempting to deploy to test pypi again.
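The setuptools warning in the log above flags `0.23.0-SNAPSHOT` as an invalid version under PEP 440, which is why an rc version is needed before uploading. The sketch below checks version strings against a deliberately simplified subset of PEP 440 (release digits, optional `a`/`b`/`rc`, `.post`, `.dev`). It is not the full specification, and the helper name is made up for this example.

```python
import re

# Simplified subset of PEP 440: N(.N)*[{a|b|rc}N][.postN][.devN]
# Epochs, local versions, and alternate spellings are intentionally ignored.
_PEP440_SUBSET = re.compile(
    r"^\d+(\.\d+)*((a|b|rc)\d+)?(\.post\d+)?(\.dev\d+)?$"
)

def looks_like_pep440(version):
    """Return True if `version` matches the simplified pattern above."""
    return _PEP440_SUBSET.match(version) is not None

# Version strings that appear in this thread.
for v in ("0.23.0-SNAPSHOT", "0.23.0rc19", "0.23.0", "2.2.0.post0"):
    print(v, looks_like_pep440(v))
```

Only the `-SNAPSHOT` form fails the check; the rc, final, and `.post0` forms pass.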

@heuermh

heuermh commented Jan 3, 2018

Regarding version.py, should that be committed as "0.23.0" before the release? Or do we want to modify it in place in release.sh?

@fnothaft

fnothaft commented Jan 3, 2018

@heuermh I'd commit version.py as 0.23.0 before the release.

> Are twine and pypandoc supposed to be installed as part of the virtualenv?

No, you'll need to pip install twine pypandoc, and yum install pandoc (or whatever is correct for your system) iirc.
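Since these tools are not pulled in automatically, a quick preflight check can catch a missing one before `make sdist` or `twine upload` fails. This is a small sketch assuming Python 3 (it relies on `shutil.which` and `importlib.util.find_spec`); the function name is hypothetical. It inspects the interpreter it runs under, so it should be run inside the activated virtualenv.

```python
import importlib.util
import shutil

def missing_release_tools(commands=("twine", "pandoc"), modules=("pypandoc",)):
    """Return a list describing commands not on PATH and modules not importable."""
    missing = []
    for cmd in commands:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(cmd) is None:
            missing.append("command: " + cmd)
    for mod in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(mod) is None:
            missing.append("module: " + mod)
    return missing

if __name__ == "__main__":
    for item in missing_release_tools():
        print("missing", item)
```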

@heuermh

heuermh commented Jan 3, 2018

ok, will try again this afternoon, and if it deploys to testpypi correctly, then I'll merge this pr and perform the release.

@heuermh

heuermh commented Jan 3, 2018

Back to my workstation, installed Miniconda for Python 2.7, sourced ~/.bash_profile, then

$ virtualenv release-venv
New python executable in /Users/heuermh/working/adam-tmp/adam-python/release-venv/bin/python
Installing setuptools, pip, wheel...
  Complete output from command /Users/heuermh/worki...ease-venv/bin/python - setuptools pip wheel:
  Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
  File "/Users/heuermh/.miniconda2/lib/python2.7/tempfile.py", line 32, in <module>
    import io as _io
  File "/Users/heuermh/.miniconda2/lib/python2.7/io.py", line 51, in <module>
    import _io
ImportError: dlopen(/Users/heuermh/working/adam-tmp/adam-python/release-venv/lib/python2.7/lib-dynload/_io.so, 2): Symbol not found: __PyCodecInfo_GetIncrementalDecoder
  Referenced from: /Users/heuermh/working/adam-tmp/adam-python/release-venv/lib/python2.7/lib-dynload/_io.so
  Expected in: flat namespace
 in /Users/heuermh/working/adam-tmp/adam-python/release-venv/lib/python2.7/lib-dynload/_io.so
----------------------------------------
...Installing setuptools, pip, wheel...done.
Traceback (most recent call last):
  File "/Users/heuermh/.miniconda2/bin/virtualenv", line 11, in <module>
    sys.exit(main())
  File "/Users/heuermh/.miniconda2/lib/python2.7/site-packages/virtualenv.py", line 713, in main
    symlink=options.symlink)
  File "/Users/heuermh/.miniconda2/lib/python2.7/site-packages/virtualenv.py", line 945, in create_environment
    download=download,
  File "/Users/heuermh/.miniconda2/lib/python2.7/site-packages/virtualenv.py", line 901, in install_wheel
    call_subprocess(cmd, show_stdout=False, extra_env=env, stdin=SCRIPT)
  File "/Users/heuermh/.miniconda2/lib/python2.7/site-packages/virtualenv.py", line 797, in call_subprocess
    % (cmd_desc, proc.returncode))
OSError: Command /Users/heuermh/worki...ease-venv/bin/python - setuptools pip wheel failed with error code 1

@heuermh

heuermh commented Jan 3, 2018

Next I tried installing pip, virtualenv, twine, and pypandoc against the system Python; pandoc is at /usr/local/bin/pandoc. Got farther, then

...
Installing collected packages: py4j, pyspark
Successfully installed py4j-0.10.4 pyspark-2.2.0

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ rm -rf dist

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ make sdist
python2.7 setup.py sdist
Could not import pypandoc - required to package bdgenomics.adam

@heuermh

heuermh commented Jan 3, 2018

Ran pip install pypandoc inside the virtualenv.

Now having trouble with twine, created .pypirc with [testpypi] credentials, then

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ twine upload dist/*.tar.gz
KeyError: Missing 'pypi' section from the configuration file
or not a complete URL in --repository.
Maybe you have a out-dated '~/.pypirc' format?
more info: https://docs.python.org/distutils/packageindex.html#pypirc

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ twine upload -r testpypi dist/*.tar.gz
KeyError: Missing 'testpypi' section from the configuration file
or not a complete URL in --repository.
Maybe you have a out-dated '~/.pypirc' format?
more info: https://docs.python.org/distutils/packageindex.html#pypirc

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*.tar.gz
Uploading distributions to https://test.pypi.org/legacy/
Enter your username: ***
Enter your password: 
Uploading bdgenomics.adam-0.23.0rc19.tar.gz
SSLError: HTTPSConnectionPool(host='test.pypi.org', port=443): Max retries exceeded with url: /legacy/ (Caused by SSLError(SSLError(1, u'[SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590)'),))

(release-venv) adam-python (HEAD detached at fnothaft/issues/1783-cran)!
$ twine upload --repository-url https://test.pypi.org/ dist/*.tar.gz
Uploading distributions to https://test.pypi.org/
Enter your username: ***
Enter your password: 
Uploading bdgenomics.adam-0.23.0rc19.tar.gz
SSLError: HTTPSConnectionPool(host='test.pypi.org', port=443): Max retries exceeded with url: / (Caused by SSLError(SSLError(1, u'[SSL: TLSV1_ALERT_PROTOCOL_VERSION] tlsv1 alert protocol version (_ssl.c:590)'),))
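The `KeyError: Missing 'testpypi' section` messages above suggest the `.pypirc` that twine read did not contain the sections it looked for. The thread does not show the file, so the cause is not confirmed. For reference, the sketch below holds a `.pypirc` in the conventional layout (an `index-servers` list under `[distutils]` plus one section per server) and a small checker. The usernames are placeholders, the function name is made up, and the checker does not reproduce twine's own lookup logic.

```python
import configparser

# Conventional .pypirc layout; credentials here are placeholders.
EXAMPLE_PYPIRC = """\
[distutils]
index-servers =
    pypi
    testpypi

[pypi]
username = your-username

[testpypi]
repository = https://test.pypi.org/legacy/
username = your-username
"""

def pypirc_problems(text, repository):
    """Return human-readable problems for `repository` in a .pypirc text."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    problems = []
    # index-servers is a whitespace-separated, possibly multi-line list.
    servers = parser.get("distutils", "index-servers", fallback="").split()
    if repository not in servers:
        problems.append(
            "'%s' is not listed under [distutils] index-servers" % repository
        )
    if not parser.has_section(repository):
        problems.append("missing [%s] section" % repository)
    return problems

print(pypirc_problems(EXAMPLE_PYPIRC, "testpypi"))
```

An empty list means both checks passed. Passing `--repository-url` on the command line, as in the later attempts, sidesteps the file lookup entirely.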

@fnothaft

fnothaft commented Jan 3, 2018

Run this:

twine upload --repository-url https://test.pypi.org/legacy/ dist/*
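The `TLSV1_ALERT_PROTOCOL_VERSION` error earlier in the thread generally means the server rejected the TLS version the client offered. One plausible cause, not confirmed anywhere in this conversation, is a Python build linked against an OpenSSL too old to speak TLS 1.2. The sketch below reports what the running interpreter's `ssl` module was built with; the function name is hypothetical.

```python
import ssl

def tls_report():
    """Summarize the OpenSSL build behind this interpreter's ssl module."""
    return {
        # Version string of the OpenSSL library the ssl module is linked against.
        "openssl_version": ssl.OPENSSL_VERSION,
        # The TLS 1.2 protocol constant is only defined when OpenSSL supports it.
        "has_tls_1_2": hasattr(ssl, "PROTOCOL_TLSv1_2"),
    }

if __name__ == "__main__":
    print(tls_report())
```

Running this with the same interpreter that runs twine shows whether the interpreter, rather than the upload command, is the thing to change.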

@heuermh heuermh merged commit 2986cf7 into bigdatagenomics:master Jan 4, 2018
@heuermh

heuermh commented Jan 4, 2018

DO
IT
LIVE

@fnothaft fnothaft deleted the issues/1783-cran branch January 4, 2018 00:15
@fnothaft

fnothaft commented Jan 4, 2018

w00t!
