Update to TileDB-VCF 0.21.0 #62

Merged
merged 1 commit into from
Dec 16, 2022

Conversation

awenocur
Contributor

No description provided.

@shortcut-integration

This pull request has been linked to Shortcut Story #20914: Design frequency-querying functionality.

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from 047d170 to 81a0629 Compare December 12, 2022 18:56
Member

@gspowley gspowley left a comment

Missing updates to tiledb 2.13?

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from 81a0629 to 724e4cc Compare December 12, 2022 19:23
@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from 724e4cc to 28eda02 Compare December 12, 2022 23:52
@jdblischak
Collaborator

Does anyone know what is causing the solver error?

The reported errors are:
- Encountered problems while solving:
-   - package libtiledbvcf-0.21.0-hfb5ef83_1 requires htslib >=1.16,<1.17.0a0, but none of the providers can be installed

conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable dependencies for platform linux-64: {MatchSpec("libtiledbvcf==0.21.0=hfb5ef83_1"), MatchSpec("htslib[version='>=1.16,<1.17.0a0']")}

htslib 1.16 definitely exists for linux-64:

https://anaconda.org/bioconda/htslib/files?version=1.16

From a quick skim of the logs, it appears that htslib is found when building libtiledbvcf, but then can't be found when building tiledbvcf-py
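The range in the error can be checked against htslib 1.16 by hand. The sketch below is a simplified comparison, not conda's actual version-ordering code; `version_key` and `satisfies` are hypothetical helpers written only for this illustration, and they assume versions made of dotted integers with an optional trailing pre-release tag such as `a0`.

```python
import re


def version_key(text, width=4):
    """Turn a version string into a sortable key (simplified, not conda's full rules)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)([a-z]+\d*)?", text)
    if match is None:
        raise ValueError(f"unsupported version: {text!r}")
    numbers = [int(part) for part in match.group(1).split(".")]
    numbers += [0] * (width - len(numbers))  # pad so that 1.17 equals 1.17.0
    # A trailing tag such as "a0" marks a pre-release, which sorts before
    # the final release with the same numbers.
    is_final = 0 if match.group(2) else 1
    return tuple(numbers), is_final


def satisfies(version, lower, upper):
    """Return True if lower <= version < upper."""
    return version_key(lower) <= version_key(version) < version_key(upper)


# Bounds taken from the solver message: htslib >=1.16,<1.17.0a0
for candidate in ["1.15.1", "1.16", "1.17"]:
    print(candidate, satisfies(candidate, "1.16", "1.17.0a0"))
```

Under these assumptions 1.16 falls inside the range, which is consistent with the observation that the pin itself is satisfiable and the conflict must come from another package in the environment.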

@awenocur
Contributor Author

> From a quick skim of the logs, it appears that htslib is found when building libtiledbvcf, but then can't be found when building tiledbvcf-py

Hi @jdblischak,

Thank you for taking a look! I was about to contact you asking exactly the same question. We've been banging our heads against this for the past day.

@jdblischak
Collaborator

Something strange is going on. I was able to reproduce the error locally. Next I activated the experimental improved error messages with the hope that this would better pinpoint the source of the conflict, but that time it succeeded! I'm now running it again without the experimental error messages.

@jdblischak
Collaborator

Alright, confirmed. I can now build the recipe locally, despite not changing anything. @awenocur could you please restart the Azure linux job?

@awenocur
Contributor Author

> @awenocur could you please restart the Azure linux job?

@jdblischak Thanks for investigating. We'll know in about fifteen minutes whether we have gained Conda's favor.

@jdblischak
Collaborator

My guess is that the recent backports to arrow-cpp/pyarrow are what changed the build status. My local build failed when it pulled arrow-cpp 9.0.0-py39h06993d0_13_cpu and succeeded when it pulled the recently uploaded 9.0.0-py39hdbe7bc9_13_cpu.

#61
conda-forge/arrow-cpp-feedstock#918
conda-forge/arrow-cpp-feedstock#900
https://github.com/conda-forge/arrow-cpp-feedstock/commits/9.0.x
https://anaconda.org/conda-forge/arrow-cpp/files?version=9.0.0

@awenocur
Contributor Author

@jdblischak Should we be expecting the backport to propagate somehow?

@jdblischak
Collaborator

I see, the problem is the Python versions. I built for Python 3.9 locally. The Azure build failed to build for Python 3.8. I think that's because the Python 3.8 job for arrow-cpp/pyarrow is still running as of this moment

https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=628181&view=logs&j=d0d954b5-f111-5dc4-4d76-03b6c9d0cf7e

@jdblischak
Collaborator

The conda binary arrow-cpp-9.0.0-py38h3ffc01c_14_cpu was uploaded to anaconda.org about 10 minutes ago. It can take up to a few hours for all the anaconda.org mirrors to sync. Depending on how much of a hurry you are in, I'd recommend returning to this in a few hours to restart

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from 28eda02 to b264966 Compare December 14, 2022 14:22
@jdblischak
Collaborator

Quick status update on my local builds:

  • Yesterday I was able to build tiledbvcf-py for Python 3.7, 3.9, and 3.11. It consistently failed for Python 3.8 and 3.10
  • This morning I tried again. I was able to build for Python 3.8, but the build is still failing for Python 3.10
$ ls ~/miniconda3/conda-bld/linux-64/*vcf*
~/miniconda3/conda-bld/linux-64/libtiledbvcf-0.21.0-heab887e_1.tar.bz2
~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py311heab887e_1.tar.bz2
~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py37heab887e_1.tar.bz2
~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py38heab887e_1.tar.bz2
~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py39heab887e_1.tar.bz2
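A listing like this can be checked mechanically for missing Python variants. The following is a minimal sketch that assumes the usual `<name>-<version>-<build>` file naming and a build string that starts with `py` plus the Python version digits; `parse_artifact` is a hypothetical helper, not part of conda or conda-build.

```python
import re
from pathlib import PurePosixPath

# File names copied from the `ls` output above.
ARTIFACTS = [
    "~/miniconda3/conda-bld/linux-64/libtiledbvcf-0.21.0-heab887e_1.tar.bz2",
    "~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py311heab887e_1.tar.bz2",
    "~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py37heab887e_1.tar.bz2",
    "~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py38heab887e_1.tar.bz2",
    "~/miniconda3/conda-bld/linux-64/tiledbvcf-py-0.21.0-py39heab887e_1.tar.bz2",
]

# Python versions attempted in this thread.
ATTEMPTED = ["3.7", "3.8", "3.9", "3.10", "3.11"]


def parse_artifact(path):
    """Split '<name>-<version>-<build>.tar.bz2' into its fields."""
    stem = PurePosixPath(path).name
    for extension in (".tar.bz2", ".conda"):
        if stem.endswith(extension):
            stem = stem[: -len(extension)]
            break
    # The package name may itself contain '-', so split from the right.
    name, version, build = stem.rsplit("-", 2)
    # Assumption: a Python-specific build string starts with 'py', one major
    # digit, then the minor digits, e.g. 'py311...' for Python 3.11.
    tag = re.match(r"py(\d)(\d+)", build)
    python = f"{tag.group(1)}.{tag.group(2)}" if tag else None
    return {"name": name, "version": version, "build": build, "python": python}


built = {
    info["python"]
    for info in map(parse_artifact, ARTIFACTS)
    if info["name"] == "tiledbvcf-py"
}
missing = [version for version in ATTEMPTED if version not in built]
print("missing Python builds:", missing)
```

For the five files listed, the only attempted version without a tiledbvcf-py artifact is 3.10, matching the status update.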

@jdblischak
Collaborator

Another update: I was able to build for Python 3.10 after downgrading pyarrow from version 9 to version 8.

We're running into the same confusing build conflicts between htslib and pyarrow that @Shelnutt2 mentioned in #61 (comment)

How crucial is it to upgrade to pyarrow 9? What would be the consequences of downgrading to pyarrow 8?

@Shelnutt2
Member

@jdblischak we can downgrade. Was trying to get some newer compute features for use in envs with tiledb-vcf installed. But downgrading back to 6.0 or anything earlier than 9.0 is acceptable to get this out now.

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from b264966 to 1d5e510 Compare December 14, 2022 17:53
@jdblischak
Collaborator

This is so frustrating. Not only are the errors cryptic, but they aren't consistent between Azure and my local machine. @awenocur could you please next try pyarrow 7 and then 6?

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from 1d5e510 to a8002b0 Compare December 15, 2022 16:46
@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from a8002b0 to f00a635 Compare December 15, 2022 17:04
@awenocur
Contributor Author

@jdblischak I'm trying many things to make this work locally, but one thing I noticed is that changing the htslib version doesn't affect what's pulled in at the end of the build, unless I also change the TileDB-VCF version to one that doesn't exist. This makes me suspect that a stale Conda artifact is being pulled in from the repo instead of the new TileDB-VCF that was just built.

@jdblischak
Collaborator

> This makes me suspect that a stale Conda artifact is being pulled in from the repo instead of the new TileDB-VCF that was just built.

That's an interesting idea! At first I was confused about how this could happen, but then I noticed that 0.21.0 packages have already been uploaded to the tiledb channel on anaconda.org. Has this feedstock always been set up to upload conda binaries from PR builds? I didn't even realize this could be a potential issue.

From the consistent error message we keep seeing, it always pulls linux-64/libtiledbvcf-0.21.0-hfb5ef83_0.conda, which was uploaded almost 3 days ago (i.e., when this PR was opened)!

conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable dependencies for platform linux-64: {MatchSpec("htslib[version='>=1.16,<1.17.0a0']"), MatchSpec("libtiledbvcf==0.21.0=hfb5ef83_0")}

Thus all of our tinkering with the pyarrow version has had no effect because the solver keeps picking that existing one.
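Reading the pinned build out of that error line can be scripted, which makes it easier to see which libtiledbvcf build the solver picked on each run. This is a rough sketch that only applies regular expressions to the message text shown above; `match_specs` and `exact_pin` are hypothetical helper names, and nothing here calls conda's own API.

```python
import re

# Error line copied from the Azure log quoted above.
ERROR_LINE = (
    "conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable "
    "dependencies for platform linux-64: "
    "{MatchSpec(\"htslib[version='>=1.16,<1.17.0a0']\"), "
    "MatchSpec(\"libtiledbvcf==0.21.0=hfb5ef83_0\")}"
)


def match_specs(line):
    """Return the quoted spec strings found inside MatchSpec("...")."""
    return re.findall(r'MatchSpec\("([^"]*)"\)', line)


def exact_pin(spec):
    """Split an exact 'name==version=build' spec; return None for other forms."""
    match = re.fullmatch(r"([^=\[]+)==([^=]+)=(.+)", spec)
    if match is None:
        return None
    name, version, build = match.groups()
    # Assumption: the build string ends in '_<build number>', as in 'hfb5ef83_0'.
    build_hash, _, build_number = build.rpartition("_")
    return {
        "name": name,
        "version": version,
        "hash": build_hash,
        "build_number": int(build_number),
    }


pins = [pin for pin in map(exact_pin, match_specs(ERROR_LINE)) if pin]
for pin in pins:
    print(pin["name"], pin["version"], "build number", pin["build_number"])
```

Comparing the reported build number with the one in the recipe shows whether the solver used the freshly built package or an older upload.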

I see that you've disabled the default of automated uploads. That's a good start. My next suggestion would be to bump the build number. That should hopefully give the local version of libtiledbvcf an edge in the solver. If it doesn't, we will need to delete the existing binary on anaconda.org. In fact, if it's not too much trouble, I recommend deleting all 0.21.0 versions of libtiledbvcf and tiledbvcf-py from anaconda.org. It's hard to interpret the Azure logs as we fiddle with the pyarrow version when we have no idea (without very close inspection of the logs) which versions are being pulled.

https://anaconda.org/tiledb/libtiledbvcf/files?version=0.21.0
https://anaconda.org/tiledb/tiledbvcf-py/files?version=0.21.0

Not to mention it is hard to trust these existing binaries. For example, how are there 3 different macOS binaries (build numbers 0, 1, and 2) but only one linux-64 binary? Did you previously bump the build number? Instead of always overwriting your past commits, another strategy to keep a clean Git history is to keep all the tinkering commits in the PR, and then do a final squash merge to combine them into a single final commit.

@awenocur
Contributor Author

awenocur commented Dec 15, 2022

@jdblischak I had bumped the build number a couple of times; the shipping build would have to be 3, as there's a macOS (but not Linux) artifact at build number 2. I realize there are plenty of integers left, but I was being conservative resetting it to zero for this test, just in case I hadn't prevented the upload successfully. When I had it at 2 though, the local build still did not have priority.

I can try again with number 2.

@awenocur awenocur force-pushed the adamwenocur/sc-20914 branch from f00a635 to cb9b343 Compare December 15, 2022 20:59
@awenocur
Contributor Author

> Instead of always overwriting your past commits, another strategy to keep a clean Git history is to keep all the tinkering commits in the PR, and then do a final squash merge to combine them into a single final commit.

This is of course good advice, given that tinkering inside of CI is a good idea. The build 1 and 2 binaries were uploaded when I was running CI from an experimental branch and had failed to disable uploading. Build 0 was from a commit to this PR where I had the wrong TileDB Embedded version.

@jdblischak
Collaborator

Ok, so messing with the build number at least ensures that we are using the locally built package. The binary with build number 2 doesn't exist for linux-64 on anaconda.org. Unfortunately we are still stuck with the same htslib conflict 😭

conda_build.exceptions.DependencyNeedsBuildingError: Unsatisfiable dependencies for platform linux-64: {MatchSpec("htslib[version='>=1.16,<1.17.0a0']"), MatchSpec("libtiledbvcf==0.21.0=hfb5ef83_2")}

@awenocur
Contributor Author

awenocur commented Dec 16, 2022

@jdblischak, @Shelnutt2 has a candidate branch making htslib build directly from source here. Unfortunately, it seems to have the same problem running actions that you reported on #60.

recipe/meta.yaml Outdated
@@ -65,7 +65,7 @@ outputs:
   - {{ pin_compatible('numpy', lower_bound='1.16') }}
   - {{ pin_subpackage('libtiledbvcf', exact=True) }}
   - python
-  - pyarrow 9.0.*
+  - pyarrow 7.0.*
Member

Let's revert this and rebase against master and see if it passes.
