storage: add 'Has' feature. #276

Merged
merged 2 commits into from
Oct 25, 2021
71 changes: 71 additions & 0 deletions storage/api.go
@@ -7,11 +7,82 @@ import (

// --- basics --->

// Storage is one of the base interfaces in the storage APIs.
// This type is rarely seen by itself alone (and never useful to implement alone),
// but is included in both ReadableStorage and WritableStorage.
// Because it's included in both of the other two useful base interfaces,
// you can define functions that work on either one of them
// by using this type to describe your function's parameters.
//
// Library functions that work with storage systems should take either
// ReadableStorage, or WritableStorage, or Storage, as a parameter,
// depending on whether the function deals with the reading of data,
// the writing of data, or an operation that applies to either, respectively.
//
// An implementation of Storage may also support many other methods.
// At the very least, it should also support one of either ReadableStorage or WritableStorage.
// It may support even more interfaces beyond that for additional feature detection.
// See the package-wide docs for more discussion of this design.
//
// The Storage interface does not include much of use in itself alone,
// because ReadableStorage and WritableStorage are meant to be the most used types in declarations.
// However, it does include the Has function, because that function is reasonable to require ubiquitously from all implementations,
// and it serves as a reasonable marker to make sure the Storage interface is not trivially satisfied.
type Storage interface {
	Has(ctx context.Context, key string) (bool, error)
Contributor

Is there a reason for putting Has here instead of ReadableStorage? Write-only storages will have to figure out how to satisfy this method.

Collaborator Author

Mostly the fear of ending up with a Has2 function on the package scope. Avoiding that problem seems to mandate we have some interface on the bottom that's in common to both the base read and the base write interface.

Contributor

The Has2 function you posit seems to require a readable storage anyway. I don't really see a strong need for a predefined Storage interfaces. It seems more useful for the consumer of a storage to define what methods it needs satisfied by a provider that is passed.
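
To make the suggestion above concrete, here is a minimal sketch of a consumer declaring only the method it needs, so that any provider with a matching `Has` satisfies it. The names `hasser`, `mapStore`, `missingKeys`, and `demoMissing` are hypothetical and exist only for this illustration; they are not part of go-ipld-prime.

```go
package main

import (
	"context"
	"fmt"
)

// hasser is a hypothetical consumer-side interface: it names only the one
// method this consumer needs, instead of relying on a predefined Storage type.
type hasser interface {
	Has(ctx context.Context, key string) (bool, error)
}

// mapStore is a toy in-memory provider used only for this illustration.
type mapStore map[string][]byte

func (m mapStore) Has(_ context.Context, key string) (bool, error) {
	_, ok := m[key]
	return ok, nil
}

// missingKeys reports which of the given keys the store does not contain.
func missingKeys(ctx context.Context, s hasser, keys ...string) ([]string, error) {
	var missing []string
	for _, k := range keys {
		ok, err := s.Has(ctx, k)
		if err != nil {
			return nil, err
		}
		if !ok {
			missing = append(missing, k)
		}
	}
	return missing, nil
}

// demoMissing runs missingKeys against a small fixed store.
func demoMissing() []string {
	store := mapStore{"a": []byte("1")}
	missing, _ := missingKeys(context.Background(), store, "a", "b", "c")
	return missing
}

func main() {
	fmt.Println(demoMissing())
}
```

The trade-off is the one debated in this thread: consumer-defined interfaces keep each caller minimal, but they give the package no single type on which to hang package-scope functions.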

Contributor

I have to agree with Ian. I don't really understand why we want to require WritableStorage to implement Has, but not Get.

Collaborator Author

@warpfork warpfork Oct 24, 2021

recap: The holistic design goal I'm keeping aim on is: to consistently tell both users and intermediate API designers that they should be referring to either ReadableStorage or WritableStorage. And nothing else -- because then they should use the package-scope functions -- not the methods -- in order to do all their work (and let those functions take care of all the feature detection for them).

Now, Has is an operation that a user might want to ask for on either one of them.

(And surprisingly, I'd say one is almost more likely to want to ask Has on the writable side. To ask Has on the reader might be slightly cheaper than opening the read stream for some implementations; but to ask Has before writing can be significantly cheaper than opening a write stream for some implementations. But I digress; let's suffice to say we want the operation available on either direction.)

Now if I want to make a package-scope function for Has, just to keep 100% consistent with the messaging to the API user about "use the package-scope functions for all operations"... it turns out to be the one thing so far that's legitimately in common to both the read and the write directions.

If golang had function overloading (ad-hoc polymorphism), I'd write two functions:

func Has(Context, ReadableStorage, Key) (bool, error) {...}
func Has(Context, WritableStorage, Key) (bool, error) {...}

... but golang does not support overloading, so this is not an option.

Then our other options are:

  • Give up on having a package-scope Has function, and document the inconsistency of the "use the package-scope functions for all operations" rule for users.
  • Probably still have Has as methods on both anyway?
  • Or if not, and just putting Has on ReadableStorage only, then... we make people feature-detect Has on the writer side, when that's wanted?
  • Or we throw in the towel on separating reader and writer directions entirely? But I don't care for this at all -- this split has felt really good so far. It's like a soft form of "capabilities"-style APIs that keeps you from making mistakes in wiring, and it's also really great to be able to implement a read-only system and express that clearly at compile time. So surely we don't want to give all that up.

All of those choices sound worse to me than introducing an interface that's at the bottom of the type graph here.

Introducing an interface type that's at the bottom of the type graph gives us what we need to make the package API consistent and low-friction. Coincidentally, Has fits really nicely there.

(And in the future: if we find more features that might work on both the read or the write direction (and presumably, are optional extensions), we make those package-scope functions of the style func Foo(Context, Storage, ...) (...) too, reusing the same bottom type and doing feature-detection internally from there, to minimize burden to the caller.)
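
As an illustration of the design argued for above, the sketch below shows one package-scope `Has` accepting both a read-only and a write-only store through the shared bottom interface, plus a `Foo`-style function that feature-detects internally. The three base interfaces are local copies of those in this PR's diff; `keyCounter`, `CountKeys`, `readOnly`, `writeOnly`, and `demoBottomType` are hypothetical names invented for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Local copies of the base interfaces as they appear in this PR's diff.
type Storage interface {
	Has(ctx context.Context, key string) (bool, error)
}

type ReadableStorage interface {
	Storage
	Get(ctx context.Context, key string) ([]byte, error)
}

type WritableStorage interface {
	Storage
	Put(ctx context.Context, key string, content []byte) error
}

// Has mirrors the package-scope function added in storage/funcs.go.
func Has(ctx context.Context, store Storage, key string) (bool, error) {
	return store.Has(ctx, key)
}

// keyCounter is a HYPOTHETICAL optional extension, used only to show detection.
type keyCounter interface {
	CountKeys(ctx context.Context) (int, error)
}

var errUnsupported = errors.New("store does not support counting keys")

// CountKeys is a hypothetical "Foo"-style function: it takes the bottom type
// and does the feature detection so that callers never need a type assertion.
func CountKeys(ctx context.Context, store Storage) (int, error) {
	if kc, ok := store.(keyCounter); ok {
		return kc.CountKeys(ctx)
	}
	return 0, errUnsupported
}

// readOnly implements ReadableStorage and the optional keyCounter.
type readOnly map[string][]byte

func (r readOnly) Has(_ context.Context, key string) (bool, error) {
	_, ok := r[key]
	return ok, nil
}

func (r readOnly) Get(_ context.Context, key string) ([]byte, error) {
	v, ok := r[key]
	if !ok {
		return nil, errors.New("key not found")
	}
	return v, nil
}

func (r readOnly) CountKeys(_ context.Context) (int, error) { return len(r), nil }

// writeOnly implements WritableStorage only; it remembers keys, not content.
type writeOnly struct{ seen map[string]struct{} }

func (w *writeOnly) Has(_ context.Context, key string) (bool, error) {
	_, ok := w.seen[key]
	return ok, nil
}

func (w *writeOnly) Put(_ context.Context, key string, _ []byte) error {
	w.seen[key] = struct{}{}
	return nil
}

// demoBottomType calls the same functions on both directions of storage.
func demoBottomType() (readHas, writeHas bool, count int, countErr error) {
	ctx := context.Background()
	var r ReadableStorage = readOnly{"k": []byte("v")}
	var w WritableStorage = &writeOnly{seen: map[string]struct{}{}}
	_ = w.Put(ctx, "k", []byte("v"))
	readHas, _ = Has(ctx, r, "k")
	writeHas, _ = Has(ctx, w, "k")
	count, _ = CountKeys(ctx, r)
	_, countErr = CountKeys(ctx, w)
	return
}

func main() {
	fmt.Println(demoBottomType())
}
```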

Collaborator Author

@warpfork warpfork Oct 24, 2021

(This is sort of like the inverse of #277 . It makes the most sense if you look at how many casts show up when you actually try to use the API.)

}

// ReadableStorage is one of the base interfaces in the storage APIs;
// a storage system should implement at minimum either this, or WritableStorage,
// depending on whether it supports reading or writing.
// (One type may also implement both.)
//
// ReadableStorage implementations must at minimum provide
// a way to ask the store whether it contains a key,
// and a way to ask it to return the value.
//
// Library functions that work with storage systems should take either
// ReadableStorage, or WritableStorage, or Storage, as a parameter,
// depending on whether the function deals with the reading of data,
// the writing of data, or an operation that applies to either, respectively.
//
// An implementation of ReadableStorage may also support many other methods --
// for example, it may additionally match StreamingReadableStorage, or yet more interfaces.
// Usually, you should not need to check for this yourself; instead,
// you should use the storage package's functions to ask for the desired mode of interaction.
// Those functions will accept any ReadableStorage as an argument,
// detect the additional interfaces automatically and use them if present,
// or, fall back to synthesizing equivalent behaviors from the basics.
// See the package-wide docs for more discussion of this design.
type ReadableStorage interface {
	Storage
	Get(ctx context.Context, key string) ([]byte, error)
}

// WritableStorage is one of the base interfaces in the storage APIs;
// a storage system should implement at minimum either this, or ReadableStorage,
// depending on whether it supports reading or writing.
// (One type may also implement both.)
//
// WritableStorage implementations must at minimum provide
// a way to ask the store whether it contains a key,
// and a way to put a value into storage indexed by some key.
//
// Library functions that work with storage systems should take either
// ReadableStorage, or WritableStorage, or Storage, as a parameter,
// depending on whether the function deals with the reading of data,
// the writing of data, or an operation that applies to either, respectively.
//
// An implementation of WritableStorage may also support many other methods --
// for example, it may additionally match StreamingWritableStorage, or yet more interfaces.
// Usually, you should not need to check for this yourself; instead,
// you should use the storage package's functions to ask for the desired mode of interaction.
// Those functions will accept any WritableStorage as an argument,
// detect the additional interfaces automatically and use them if present,
// or, fall back to synthesizing equivalent behaviors from the basics.
// See the package-wide docs for more discussion of this design.
type WritableStorage interface {
	Storage
	Put(ctx context.Context, key string, content []byte) error
}
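
The WritableStorage contract above (a `Has` plus a `Put`) is what makes a cheap existence check before writing possible, as discussed in the review thread. The following is a minimal sketch of that usage; `putIfAbsent`, `countingStore`, and `demoPutIfAbsent` are hypothetical names, and the interface is flattened here rather than embedding Storage, purely to keep the example short.

```go
package main

import (
	"context"
	"fmt"
)

// WritableStorage is a flattened local copy of the interface in the diff
// (Has comes from the embedded Storage interface in the real declaration).
type WritableStorage interface {
	Has(ctx context.Context, key string) (bool, error)
	Put(ctx context.Context, key string, content []byte) error
}

// countingStore is a toy write-only store that counts how many Put calls reach it.
type countingStore struct {
	data map[string][]byte
	puts int
}

func (s *countingStore) Has(_ context.Context, key string) (bool, error) {
	_, ok := s.data[key]
	return ok, nil
}

func (s *countingStore) Put(_ context.Context, key string, content []byte) error {
	s.puts++
	if _, exists := s.data[key]; !exists { // first write wins
		s.data[key] = append([]byte(nil), content...)
	}
	return nil
}

// putIfAbsent is a hypothetical helper: it asks Has first and skips the
// (potentially expensive) write when the key is already stored.
func putIfAbsent(ctx context.Context, store WritableStorage, key string, content []byte) (bool, error) {
	exists, err := store.Has(ctx, key)
	if err != nil {
		return false, err
	}
	if exists {
		return false, nil
	}
	if err := store.Put(ctx, key, content); err != nil {
		return false, err
	}
	return true, nil
}

// demoPutIfAbsent writes the same key twice and returns how many Puts happened.
func demoPutIfAbsent() int {
	ctx := context.Background()
	s := &countingStore{data: map[string][]byte{}}
	_, _ = putIfAbsent(ctx, s, "k", []byte("v"))
	_, _ = putIfAbsent(ctx, s, "k", []byte("v"))
	return s.puts
}

func main() {
	fmt.Println(demoPutIfAbsent()) // prints 1
}
```

Note that Has followed by Put is not atomic; the sketch relies on the first-write-wins behavior described in the package docs to stay correct if two writers race.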

48 changes: 37 additions & 11 deletions storage/doc.go
@@ -5,25 +5,51 @@
//
// In IPLD, you can often avoid dealing with storage directly yourself,
// and instead use linking.LinkSystem to handle serialization, hashing, and storage all at once.
// You'll hand some values that match interfaces from this package to LinkSystem when configuring it.
// (You'll hand some values that match interfaces from this package to LinkSystem when configuring it.)
// It's probably best to work at that level and above as much as possible.
// If you do need to interact with storage more directly, then read on.
//
// The most basic APIs are ReadableStorage and WritableStorage.
// When writing code that works with storage systems, these two interfaces should be seen in almost all situations:
// user code is recommended to think in terms of these types;
// functions provided by this package will accept parameters of these types and work on them;
// implementations are expected to provide these types first;
// and any new library code is recommended to keep with the theme: use these interfaces preferentially.
//
// Users should decide which actions they want to take using a storage system,
// find the appropriate function in this package (n.b., package function -- not a method on an interface!
// You will likely find one of each, with the same name: pick the package function!),
// and use that function, providing it the storage system (e.g. either ReadableStorage, WritableStorage, or sometimes just Storage)
// as a parameter.
// That function will then use feature-detection (checking for matches to the other,
// more advanced and more specific interfaces in this package) and choose the best way
// to satisfy the request; or, if it can't feature-detect any relevant features,
// the function will fall back to synthesizing the requested behavior out of the most basic API.
// Using the package functions, and letting them do the feature detection for you,
// should provide the most consistent user experience and minimize the amount of work you need to do.
// (Bonus: It also gives us a convenient place to smooth out any future library migrations for you!)
//
// If writing new APIs that are meant to work reusably for any storage implementation:
// APIs should usually be designed around accepting ReadableStorage or WritableStorage as parameters
// (depending on which direction of data flow the API is regarding),
// (depending on which direction of data flow the API is regarding).
// and use the other interfaces (e.g. StreamingReadableStorage) thereafter internally for feature detection.
// Similarly, implementers of storage systems should implement ReadableStorage or WritableStorage
// before any other features.
// For APIs which may sometimes be found relating to either a read or a write direction of data flow,
// the Storage interface may be used in order to define a function that should accept either ReadableStorage or WritableStorage.
// In other words: when writing reusable APIs, one should follow the same pattern as this package's own functions do.
//
// Similarly, implementers of storage systems should always implement either ReadableStorage or WritableStorage first.
// Only after satisfying one of those should the implementation then move on to further supporting
// additional interfaces in this package (all of which are meant to support feature-detection).
// Beyond one of the basic two, all the other interfaces are optional:
// you can implement them if you want to advertise additional features,
// or advertise fastpaths that your storage system supports;
// but you don't have to implement any of those additional interfaces if you don't want to,
// or if your implementation can't offer useful fastpaths for them.
//
// Storage systems as described by this package are allowed to make some interesting trades.
// Generally, write operations are allowed to be first-write-wins.
// Furthermore, there is no requirement that the system return an error if a subsequent write to the same key has different content.
// These rules are reasonable for a content-addressed storage system, and allow great optimizitions to be made.
//
// If implementing a storage system, you should implement packages from this interface.
// Beyond the basic two (described above), all the other interfaces are optional:
// you can implement them if you want to advertise additional features,
// or advertise fastpaths that your storage system supports;
// but you don't have implement any of the additional interfaces if you don't want to.
// These rules are reasonable for a content-addressed storage system, and allow great optimizations to be made.
//
// Note that all of the interfaces in this package only use types that are present in the golang standard library.
// This is intentional, and was done very carefully.
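
The feature-detection pattern these package docs describe can be sketched as follows: a package-scope function accepts the basic interface, uses a richer interface when the store offers one, and otherwise synthesizes the behavior from the basic API. The interface names come from this package's docs, but the `GetStream` method signature, the `basicStore` type, and the `demoStream` helper are assumptions made for this illustration rather than confirmed details of the library.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
)

// ReadableStorage is a flattened local copy of the basic read interface.
type ReadableStorage interface {
	Has(ctx context.Context, key string) (bool, error)
	Get(ctx context.Context, key string) ([]byte, error)
}

// StreamingReadableStorage is named in the docs; this method set is ASSUMED.
type StreamingReadableStorage interface {
	GetStream(ctx context.Context, key string) (io.ReadCloser, error)
}

// GetStream is a sketch of a package-scope function in the style described:
// detect the fast path, else fall back to the basic Get.
func GetStream(ctx context.Context, store ReadableStorage, key string) (io.ReadCloser, error) {
	if s, ok := store.(StreamingReadableStorage); ok {
		return s.GetStream(ctx, key) // store advertises a streaming fast path
	}
	blob, err := store.Get(ctx, key)
	if err != nil {
		return nil, err
	}
	// Synthesize a stream from the in-memory bytes.
	return io.NopCloser(bytes.NewReader(blob)), nil
}

// basicStore implements only the basic interface, so the fallback is used.
type basicStore map[string][]byte

func (b basicStore) Has(_ context.Context, key string) (bool, error) {
	_, ok := b[key]
	return ok, nil
}

func (b basicStore) Get(_ context.Context, key string) ([]byte, error) {
	v, ok := b[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

// demoStream reads one value back through the synthesized stream.
func demoStream() (string, error) {
	rc, err := GetStream(context.Background(), basicStore{"k": []byte("hello")}, "k")
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	return string(data), err
}

func main() {
	fmt.Println(demoStream())
}
```

Callers only ever name ReadableStorage; whether the fast path or the fallback runs is decided inside the function.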
28 changes: 28 additions & 0 deletions storage/dsadapter/dsadapter.go
@@ -26,6 +26,34 @@ type Adapter struct {
	EscapingFunc func(string) string
}

// Has implements go-ipld-prime/storage.Storage.Has.
func (a *Adapter) Has(ctx context.Context, key string) (bool, error) {
	// Return early if the context is already closed.
	// This is also the last time we'll check the context,
	// since go-datastore doesn't take them.
	if ctx.Err() != nil {
		return false, ctx.Err()
	}

	// If we have an EscapingFunc, apply it.
	if a.EscapingFunc != nil {
		key = a.EscapingFunc(key)
	}

	// Wrap the key into go-datastore's concrete type that it requires.
	// Note that this does a bunch of actual work, which may be surprising.
	// The key may be transformed (as per path.Clean).
	// There will also be an allocation, if the key doesn't start with "/".
	// (Avoiding these performance drags is part of why we started
	// new interfaces in go-ipld-prime/storage.)
	k := datastore.NewKey(key)

	// Delegate the has call.
	// Note that for some datastore implementations, this will do *yet more*
	// validation on the key, and may return errors from that.
	return a.Wrapped.Has(k)
}
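
To illustrate the key transformation that the comments in Has warn about, here is a rough sketch built only on the standard library's `path.Clean`. The function `approxNewKey` is a hypothetical approximation written from the comment's description (clean the path, prepend "/" when missing); it is not go-datastore's actual implementation.

```go
package main

import (
	"fmt"
	"path"
)

// approxNewKey is a HYPOTHETICAL stand-in for the normalization described in
// the comment above: ensure a leading "/" and then apply path.Clean.
func approxNewKey(key string) string {
	if len(key) == 0 || key[0] != '/' {
		key = "/" + key // this concatenation is where an extra allocation occurs
	}
	return path.Clean(key)
}

func main() {
	for _, k := range []string{"foo", "/a//b/../c", "/trailing/"} {
		fmt.Printf("%q -> %q\n", k, approxNewKey(k))
	}
	// "foo" -> "/foo"
	// "/a//b/../c" -> "/a/c"
	// "/trailing/" -> "/trailing"
}
```

Distinct input strings can therefore collapse to the same stored key, which is one reason an EscapingFunc may be needed before keys reach the wrapped datastore.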

// Get implements go-ipld-prime/storage.ReadableStorage.Get.
func (a *Adapter) Get(ctx context.Context, key string) ([]byte, error) {
// Return early if the context is already closed.
5 changes: 5 additions & 0 deletions storage/funcs.go
@@ -17,6 +17,11 @@ import (
regardless of how much explicit support the storage implementation has for the exact behavior you requested.
*/

func Has(ctx context.Context, store Storage, key string) (bool, error) {
	// Okay, not much going on here -- this function is only here for consistency of style.
	return store.Has(ctx, key)
}

func Get(ctx context.Context, store ReadableStorage, key string) ([]byte, error) {
	// Okay, not much going on here -- this function is only here for consistency of style.
	return store.Get(ctx, key)
9 changes: 9 additions & 0 deletions storage/memstore/memstore.go
@@ -39,6 +39,15 @@ func (store *Store) beInitialized() {
	store.Bag = make(map[string][]byte)
}

// Has implements go-ipld-prime/storage.Storage.Has.
func (store *Store) Has(ctx context.Context, key string) (bool, error) {
	if store.Bag == nil {
		return false, nil
	}
	_, exists := store.Bag[key]
	return exists, nil
}
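
A short sketch of how a store shaped like this one behaves when used from its zero value. The `Store` type below is a local stand-in with the same `Bag` field, and its `Put` is a simplified assumption written for this example, not the memstore implementation. In Go, a lookup on a nil map is safe and reports the key as absent, so the explicit nil check in the diff is a readability choice rather than a requirement.

```go
package main

import (
	"context"
	"fmt"
)

// Store is a local stand-in shaped like the memstore type in the diff.
type Store struct {
	Bag map[string][]byte
}

// Has works on a zero-value Store: reading from a nil map yields "absent".
func (store *Store) Has(_ context.Context, key string) (bool, error) {
	_, exists := store.Bag[key]
	return exists, nil
}

// Put is a simplified, ASSUMED implementation: it lazily creates the map,
// keeps the first write for a key, and stores a defensive copy.
func (store *Store) Put(_ context.Context, key string, content []byte) error {
	if store.Bag == nil {
		store.Bag = make(map[string][]byte)
	}
	if _, exists := store.Bag[key]; exists {
		return nil
	}
	store.Bag[key] = append([]byte(nil), content...)
	return nil
}

// demoZeroValue checks a key before and after the first write.
func demoZeroValue() (before, after bool) {
	var s Store // zero value: Bag is nil
	ctx := context.Background()
	before, _ = s.Has(ctx, "k")
	_ = s.Put(ctx, "k", []byte("v"))
	after, _ = s.Has(ctx, "k")
	return
}

func main() {
	fmt.Println(demoZeroValue()) // prints: false true
}
```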

// Get implements go-ipld-prime/storage.ReadableStorage.Get.
//
// Note that this internally performs a defensive copy;