Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion : Roadmap to 1.0.0 #638

Open
fulmicoton opened this issue Aug 20, 2019 · 6 comments
Open

Discussion : Roadmap to 1.0.0 #638

fulmicoton opened this issue Aug 20, 2019 · 6 comments

Comments

@fulmicoton
Copy link
Collaborator

fulmicoton commented Aug 20, 2019

This ticket is here to open a discussion. Feel free to chime in, especially if you are using tantivy somewhere.

@petr-tik is advocating for a clear roadmap to tantivy 1.0.0.

I am not against really against the idea, as long as we don't rush it.
I'm ok to put a 1.0.0 stamp on tantivy if

  • we are reasonably confident ppl can use this in production and sleep at night.
  • we reach a version for which we are confident we won't break backward compatibility within a year or so.

We both agree being super featureful is not a problem.

Required because related to the use of tantivy in production or the backward compatibility story:

Nice to have:

Note that prioritizing reaching 1.0.0 is very different from optimizing for adoption.

@sunface
Copy link

sunface commented Aug 20, 2019

nice hearing this ,especially for production-ready roadmap

@petr-tik petr-tik pinned this issue Sep 5, 2019
@petr-tik
Copy link
Contributor

petr-tik commented Sep 6, 2019

Thanks for opening this Paul.

To emphasise, I don't want to rush the 1.0.0 either. My main reason to start this conversation is to help us prioritise feature/bug work and broadcast serious intentions to prospective users of tantivy.

Apart from a consistent index format and a stable API, I believe security to be an equally important feature of tantivy.

Raph's recent post about cargo bloat and our long-ish compile times gave me time to think about another aspect of stability - security and dependencies.

Cargo makes it easy to import crates

Rust enables us to build systems software using the tooling of high-level languages like cargo and using great crates like fst (thanks BurntSushi), serde and crossbeam to name a few.

However, to the users of tantivy the buck of security stops with us. People adding tantivy to their Cargo.toml should have a great user experience and be confident that their applications will do nothing unexpected/nasty.

The sprawl of dependencies and the implicit rules about updating/reviewing might let us down. If any of our dependencies misbehave, it will make tantivy look bad and the Rust ecosystem less reliable (cough npm)

We control some dependencies: census, tantivy-fst, bitpacking, but most of them we import.

I suggest we do a light audit as part of 1.0.0 and a hard audit just before releasing 1.0.0.

Dependency light audit

Look for low hanging fruit in removing dependencies or updating their versions to reduce duplication.

For example, the scoped_pool crate looks dormant with outstanding PRs and no updates in the last 2 years.

One of the open PRs updates crossbeam to 0.4 (we are on 0.7), because there was an ordering bug in the MsQueue implementation. Scoped_pool occurs only in one file - our implementation of the search executor (src/core/executor.rs).

I don't know enough about executors to estimate the amount of effort to upgrade/replace this, but the security implications of using an out-of-date dependency seem reason enough to do so.

Minimising dependencies also carries the benefit of using rustc/cargo's caching mechanisms and speeding up builds, which improves user experience.

Upgrade regex-syntax dep in tantivy-fst to match that of regex

Using cargo-tree I found that we tantivy-fst uses a different version of regex-syntax than regex. If the API isn't too different, this should be doable.

regex-syntax v0.4.2 
└── tantivy-fst v0.1.0 
└── tantivy v0.11.0 

regex-syntax v0.6.6 
└── regex v1.1.5 
└── tantivy v0.11.0 

Won't make a massive impact, but looks like a small enough change with limited downside.

Move to Rust std futures

I expect Rust 1.39 and Futures to be released before tantivy 1.0.0. Currently, we use futures and futures_cpupool in the following files

src/indexer/segment_updater.rs
src/indexer/merger.rs 
src/indexer/index_writer.rs

If the functionality allows it, I think it's best we move to std-lib futures. Given how few files rely on it, it should be a relatively easy change.

Freeze dependencies for v1.0.0

Once we have completed the feature roadmap for 1.0.0, I suggest pinning tantivy 1.0.0 to exact versions (or version ranges like we have with combine) of dependencies in Cargo.toml.

Establish and automate future dependency audits

We can add dependency review like cargo-geiger and crev to CI.
Run it either:

  • on changes to Cargo.toml (new dependency or correction of a current dependency)
  • on every commit (since cargo automatically uses new minor versions unless specified otherwise).

If we pin 1.0.0 to exact versions, the latter should be redundant.

Publish a yank/replace procedure

If a problem arises with any of our dependencies, we should have a replacement procedure.

If the dependency is small, we can internalise the functionality in tantivy. If the dependency is large, we will need a way to push a hotfix.

@fulmicoton
Copy link
Collaborator Author

Once we have completed the feature roadmap for 1.0.0, I suggest pinning tantivy 1.0.0 to exact versions (or version ranges like we have with combine) of dependencies in Cargo.toml.

I wonder if this is a good idea...
Of course, we should not reexport symbol of crates external to our workspace, so conflict should not happen, but do we want to inflate clients binary because they use two libraries which import two different minors of the same library?

I thought the best practise was to not force the version down to the minor in libraries.

@petr-tik
Copy link
Contributor

petr-tik commented Sep 6, 2019

Fair point re minor version numbers, didn't think of that.

@fulmicoton
Copy link
Collaborator Author

Apart from that I agree with most of the stuff above.

I don't care much about having an official Publish a yank/replace procedure specific to dependencies. I think it should be the same story as for regular hotfixes. It is certainly worth having some guide somewhere, if other dev end up having to write or even (who knows?) publish hotfix

@Fiedzia
Copy link
Contributor

Fiedzia commented Sep 24, 2019

I suggest pinning tantivy 1.0.0 to exact versions

Tantivy is a library, it should work with whatever application author will decide to use it with. Excluding versions that are known to be broken is fine if there are any, but pinning is not.

@shikhar shikhar unpinned this issue Feb 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants