Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Must a ConfigSource implementation be thread safe? #369

Open
ljnelson opened this issue Jun 8, 2018 · 14 comments
Open

Must a ConfigSource implementation be thread safe? #369

ljnelson opened this issue Jun 8, 2018 · 14 comments
Labels
clarification 🏆 A clarification (PR) or request for clarification (issue) in the spec or API

Comments

@ljnelson
Copy link
Contributor

ljnelson commented Jun 8, 2018

I didn't see anywhere in the specification saying whether a ConfigSource must be thread safe. I assume it must, since it is accessible to an end user by way of Config#getConfigSources().

Specifically, suppose a ConfigSource is constantly changing (is that allowed? the specification suggests but does not define that this is possible). That means, potentially, that the return values of its getPropertyNames() and (redundant?) getProperties() methods must have their behavior specified in the context of concurrent access by multiple threads.

For example, are the property names returned by getPropertyNames() a snapshot-in-time of the property names, and hence safe to iterate over without synchronization, or are they "live" and backed by the underlying ConfigSource? If they are intended to be live, then obviously a mechanism for synchronizing iteration must be supplied by the specification.

@jmesnil jmesnil added the clarification 🏆 A clarification (PR) or request for clarification (issue) in the spec or API label Aug 2, 2019
@jmesnil
Copy link
Contributor

jmesnil commented Aug 2, 2019

  • We must clarify that default config sources (from Config implementations) are thread-safe
  • It's an implementation detail for other config sources

@ljnelson
Copy link
Contributor Author

ljnelson commented Aug 2, 2019

Does this mean (I think it must) that the mechanism by which Config iterates over its ConfigSources is actually expected to be a Certain Something™, even though that Certain Something™ isn't (currently) spelled out anywhere?

@dmlloyd
Copy link
Contributor

dmlloyd commented Sep 11, 2019

I think that config sources have to be thread safe, otherwise the thread safety of Config itself is in question. This means that the value returned from getPropertyNames() must be safe against concurrent access (this might mean a snapshot or it might mean a thread-safe iterator, depending on the implementation; I don't think this matters much until we actually start talking about explicitly mutable configuration and snapshot capabilities).

The discussion about mutability is in #432.

@ljnelson
Copy link
Contributor Author

I agree that ConfigSources must be thread safe, and that their thread safety cannot be a mere implementation detail.

Regarding mutability, while #432 has focused on the dubious idea of snapshots, haven't (meanwhile) ConfigSources been allowed to provide changing information since the beginning?

That is, I don't see anywhere in the specification where it says that the return values of several ConfigSource#getValue(String.class) method invocations must be the same (e.g. properties "reachable" by a ConfigSource, it seems to me, can come and go). I also don't see anywhere in the specification where it says that successive invocations of ConfigSource#getPropertyNames() must be the same. In general the spirit of this lenient specification seems to be (for better or for worse): if it ain't there, then you can do it. So you can do it.

Since this is the case, shouldn't the specification change sooner rather than later such that the behavior of Config with regard to how it accesses ConfigSource information is thoroughly-specified instead of unspecified?

@dmlloyd
Copy link
Contributor

dmlloyd commented Sep 11, 2019

I agree. In particular I think it might be good to deprecate the ConfigSource#getProperties() method.

But documenting the exact way in which an implementation is allowed to access the ConfigSource is definitely important, I agree 100%.

@ljnelson
Copy link
Contributor Author

Thoroughly agreed. Now: if the behavior of Config is fully specified (let's look into the future) with respect to calling ConfigSource#getPropertyNames(), then I wonder if Config itself shouldn't be an abstract class (I know, I know, interfaces with default methods are all the rage but) whose harvesting-of-the-ConfigSource-information methods are declared final? Then there's no ambiguity.

@dmlloyd
Copy link
Contributor

dmlloyd commented Sep 11, 2019

That would probably have been better but there's really no way to do that without preserving binary compatibility, unfortunately. And it also voids the ability for implementations to perform any clever optimizations.

@ljnelson
Copy link
Contributor Author

So fine, default implementations for such methods that fully specify the behavior. You see what I'm getting at. This stuff must not be left to interpretation.

@Emily-Jiang
Copy link
Member

@ljnelson Has your concern fixed by #497 ?

@ljnelson
Copy link
Contributor Author

Er, no; clearly not. That issue adds javadoc that says that the Config#getConfigSources() method must return an Iterable whose Iterable#iterator() method must return a non-null Iterator that must be able to iterate in a thread-safe manner over a set of ConfigSources whose elements can change in kind, and which can, potentially, increase in cardinality, but must never shrink in cardinality.

(Actually, that's not actually what the new javadoc says, but probably what it should say.)

This of course does not address any thread-safety concerns about the individual ConfigSources themselves, which is the subject of this issue.

@dmlloyd
Copy link
Contributor

dmlloyd commented Feb 13, 2020

I don't think that cardinality enters into it. I think that the guarantees provided by (for example) ConcurrentHashMap are a reasonable guideline:

Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration. They do not throw ConcurrentModificationException. However, iterators are designed to be used by only one thread at a time.

Tracking cardinality would add the guarantee that any ConfigSource returned cannot be subsequently observed to have been removed, but is that necessary to the goal of "thread safety" as given in this issue?

@ljnelson
Copy link
Contributor Author

Leaving aside the specifics of this particular Github issue, and issue #497, and looking at the contract as it currently exists on getConfigSources(), there are no guarantees whatsoever about the Iterable returned other than iteration should somehow be thread safe.

Given that the return type is an Iterable, then, there are several ways that are permitted by the specification (currently) that its reachable Iterator could proceed:

  1. It could iterate over an ever-increasing-in-number set whose members themselves are not changing.
  2. It could iterate over a set whose size doesn't change, but whose members are changing.
  3. It could iterate over an ever-increasing-in-number set whose members are changing.
  4. It could iterate over a set whose size does not change and whose members do not change (the "snapshot iterator" approach you mention and which is probably what everyone is assuming but not making explicit).

But:

  • It cannot iterate over a set whose size shrinks

…or the dance between hasNext and next would be surprising at best and nonfunctional at worst.

But all of that it seems to me is better discussed over at #497 or another issue since it doesn't have to do with the thread safety of an individual ConfigSource.

@dmlloyd
Copy link
Contributor

dmlloyd commented Feb 13, 2020

Leaving aside the specifics of this particular Github issue, and issue #497, and looking at the contract as it currently exists on getConfigSources(), there are no guarantees whatsoever about the Iterable returned other than iteration should somehow be thread safe.

Given that the return type is an Iterable, then, there are several ways that are permitted by the specification (currently) that its reachable Iterator could proceed:

  1. It could iterate over an ever-increasing-in-number set whose members themselves are not changing.
  2. It could iterate over a set whose size doesn't change, but whose members are changing.
  3. It could iterate over an ever-increasing-in-number set whose members are changing.
  4. It could iterate over a set whose size does not change and whose members do not change (the "snapshot iterator" approach you mention and which is probably what everyone is assuming but not making explicit).

But:

  • It cannot iterate over a set whose size shrinks

…or the dance between hasNext and next would be surprising at best and nonfunctional at worst.

Ah I see the source of confusion. The contract [1] of Iterator is that once hasNext returns true, the iterator is committed to returning the next element (usually this is done by prefetching the next element before returning true). But that does not in any way exclude the possibility that additional elements which have not yet been promised (by hasNext returning true) might be removed before the iterator sees them, without any risk to the contract of Iterator.

[1] "Returns true if the iteration has more elements. (In other words, returns true if next() would return an element rather than throwing an exception.)" (JDK JavaDoc)

But all of that it seems to me is better discussed over at #497 or another issue since it doesn't have to do with the thread safety of an individual ConfigSource.

Sure, that makes sense.

@ljnelson
Copy link
Contributor Author

Note that the Iterator contract specifically says "would", not "will", however, indicating something like "at the very same moment as this hasNext() invocation", not "whenever next() happens next to be invoked".

My much larger point is: any time iteration is not part of the object being returned from a method, or is otherwise not handled by encapsulation, the specification of how to perform iteration must be absolutely, unimpeachably clear. I don't feel like #497 et al. meets that bar and that the javadoc needs to be orders of magnitude larger and more specific.

Regarding this issue, I think the original question remains: in what ways will the iterable objects given to me by getProperties() and getPropertyNames() change? Do I need to synchronize on them if I iterate over them? Can I remove things from them (surely not)? And so on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification 🏆 A clarification (PR) or request for clarification (issue) in the spec or API
Projects
None yet
Development

No branches or pull requests

4 participants