-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Atomic record fields #13404
base: trunk
Are you sure you want to change the base?
Atomic record fields #13404
Conversation
Thanks for this contribution. I am starting to review this PR. I wondered why I also wondered whether it was considered to introduce
instead of
I'm not suggesting that we do this, and only curious to understand the choice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gasche suggested I take a look at this since I suggested this variant of the design. The implementation looks like it faithfully implements the described changes; I don't have many comments on the code that seems fairly straightforward.
With this patch, all regular atomic writes now incur an additional integer untagging operation. This is briefly discussed and brushed away in one of the commits as being "not noticeably more efficient". It is probably true for most (all?) hardware and workflows, especially given that this is inside a C call, but can you share the measurements (if any) that were made to support that assessment? I would personally err on the side of caution here and provide untagged variants, especially given that atomics are likely to be used for performance.
I noted while reading the PR that people who need the extra flexibility of proposal 1 (first-class location) from ocaml/RFCs#39 can do:
module Atomic_field = struct
type ('r, 'a) t
external get : 'r -> ('r, 'a) t -> 'a = "%atomic_load_field"
external exchange : 'r -> ('r, 'a) t -> 'a -> 'a = "%atomic_exchange_field"
external compare_and_set : 'r -> ('r, 'a) t -> 'a -> 'a -> bool = "%atomic_cas_field"
external fetch_and_add : 'r -> ('r, int) t -> int -> int = "%atomic_fetch_add_field"
end
The only missing piece would be an [%atomic.field bar]
extension point, although I don't think it necessarily makes sense to add it in this PR (especially given that there are some interactions e.g. with type disambiguation that would need to be worked out first). It was not necessarily obvious to me initially that both designs could work together this way.
end | ||
|
||
type !'a t = | ||
{ mutable contents: 'a [@atomic]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parallel with ref
feels somewhat satisfying!
No measurements, I assess that the cost of the OCaml/C context switch and of the atomic operation itself dwarf the cost of a bit shift. (I think that if people wanted to make the code more complex because they believe otherwise, they should try to disprove this with microbenchmarks first.) |
I think I would be more convenient to introduce the |
In principle I would be in favor of making I would even go so far as to suggest that one should write I would also be happy if we could find a better syntax for |
My appetite for syntax dicussions is fairly low. Maybe we could first consider the feature and its implementation, merge the present PR if we decide to go for it, and then discuss reserved syntax for those constructs? If a release were to happen in between the two changes, maintaining both options is very low-maintenance so it's no big deal. (Another option would be to move syntax discussions to ocaml/RFCs#39, in which case they could start right away without flooding the discussion of this one implementation.) |
Sure, my comments about syntax were not meant to delay this PR. I just happen to think that syntax is an important aspect of language design, so I hope that (sooner rather than later) it is possible to give a nice syntactic appearance to the proposed new features. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed today with @clef-men and @gasche present, and the implementation looks correct to me. I like the design, as I had expressed in the RFC thread.
I have mostly minor comments, apart from the fact that I have been unable to reproduce the first bootstrap commit (see my dedicated review comment about that).
Naive question: why is |
a445931
to
68c59b7
Compare
It reflects the fact that |
68c59b7
to
d7242bc
Compare
d7242bc
to
b183ff6
Compare
This morning I gave a talk on The Pattern-Matching Bug, and we realized (discussion with @clef-men and @fpottier) that there is a connection to atomic record fields:
Forbidding reading atomic fields in patterns is restrictive, we could relax the restriction later if users find it too strong and we understand better what would be reasonable. One unfortunate effect of this restriction is that adding the |
Naive question: why is this an issue? If the order of two atomic reads is critical, I would expect a parallel programmer not to perform them via pattern matching. |
I guess it doesn't help that we were discussing The Pattern-Matching Bug at the same time, whose longer name could be What Can Go Horribly Wrong If You Pattern-Match On Mutable Fields. A serious suggestion that emerged during the discussion is to simply forbid all matches on mutable fields in OCaml. I pointed out that this would break too many programs, so that we cannot do this right now, but maybe (the discussion went) it should have been forbidden from the start. In this context, the decision to forbid implicit atomic reads in patterns makes sense, I think -- if we identify a pain point, let's not make the problem worse. (This being said, I agree with you that this is not the only reasonable choice. In particular, |
The osx-arm64 fails due to a difference in cmm stamps (I'm not sure why an architecture has different stamps from the other, but oh well). We fixed previous such failures by using predicates to rule out different-cmm-output configurations (we use (In the long run we should try to change the cmm type definitions to keep more structure to the identifiers, to support the |
I'm not ! But I concur that I can't get many other people to care about this kind of issues. |
db6fddd
to
95b6c82
Compare
95b6c82
to
f09dc9c
Compare
A proposal for a Changes entry:
|
In that case, forbidding them in patterns seems reasonable, I guess. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My previous comments have been addressed (except for a minor point) and the added code to forbid matching on atomic fields looks good to me. There is a small conflict with trunk to resolve.
Consensus from the maintainer meeting: mild consensus in favor of the feature. People wondered about whether the current proposal (where the record value and the offset are packed together in an (I think that the just-the-offset version is more complex to use for users, and that it makes it harder to extend to arrays.) |
143916c
to
4f7f3a6
Compare
This is a breaking change because this function was (unfortunately) exposed outside CAML_INTERNALS, and is used by exactly one external user, you guessed it: https://github.com/ocaml-multicore/multicore-magic/blob/360c2e829c9addeca9ccaee1c71f4ad36bb14a79/src/Multicore_magic.mli#L181-L185 https://github.com/ocaml-multicore/multicore-magic/blob/360c2e829c9addeca9ccaee1c71f4ad36bb14a79/src/unboxed5/multicore_magic_atomic_array.ml#L36-L43 We chose to change the prototype to remain consistent with the naming convention for the new caml_atomic_*_field primitives, which will be added to support atomic record fields. User code can easily adapt to this new prototype we are using, but not in a way that is compatible with both old and new versions of OCaml (not without some preprocessing at least). Another option would be to expose int caml_atomic_cas_field(value obj, intnat fld, value, value) value caml_atomic_cas_field_boxed(value obj, value vfld, value, value) but no other group of primitives in the runtime uses this _boxed terminology, they instead use int caml_atomic_cas_field_unboxed(value obj, intnat fld, value, value) value caml_atomic_cas_field(value obj, value vfld, value, value) and this would again break compatiblity -- it is not easier to convert code to that two-version proposal, and not noticeably more efficient. So in this case we decided to break compatibility (of an obscure, experimental, undocumented but exposed feature) in favor of consistency and simplificity of the result.
…_exchange_field] and [caml_atomic_fetch_add_field].
Uses of existing atomic primitives %atomic_foo, which act on single-field references, are now translated into %atomic_foo_field, which act on a pointer and an offset -- passed as separate arguments. In particular, note that the arity of the internal Lambda primitive Patomic_load increases by one with this patchset. (Initially we renamed it into Patomic_load_field but this creates a lot of churn for no clear benefits.) We also support primitives of the form %atomic_foo_loc, which expects a pair of a pointer and an offset (as a single argument), as we proposed in the RFC on atomic fields ocaml/RFCs#39 (but there is no language-level support for atomic record fields yet) Co-authored-by: Clément Allain <clef-men@orange.fr>
Requires a bootstrap. Co-authored-by: Gabriel Scherer <gabriel.scherer@gmail.com>
This type will be used for ['a Atomic.Loc.t], as proposed in the RFC ocaml/RFCs#39 We implement this here to be able to use it in the stdlib later, after a bootstrap.
We want to use [mark_label_used] in a context where we cannot easily find the label declaration, only the label description (from the environment).
This bootstrap is not required by a compiler change, but it enables the use of the predefined type `'a atomic_loc` and the expression-former [%atomic.loc ...] in the standard library.
4f7f3a6
to
023fc54
Compare
Example:
This PR is implemented by myself (@clef-men) with help from @gasche, including for this PR description.
It is ready for review. We did our best to have a clean git history, so reading the PR commit-by-commit is recommended.
This PR sits on top of #13396, #13397 and #13398.
(Helping make decisions on those PRs will help the present PR move forward.)
(cc @OlivierNicole, @polytypic)
Why
Current OCaml 5 only supports atomic operations on a special type
'a Atomic.t
of atomic references, which are like'a ref
except that access operations are atomic. When implementing concurrent data structures, it would be desirable for performance to have records with atomic fields, instead of having an indirection on each atomic field -- just likemutable x : foo
is more efficient thanx : foo ref
. This is also helpful when combined with inline records inside variant construcors.Design
This PR implements the "Atomic Record Fields" RFC
ocaml/RFCs#39
more specifically the "design number 2" from
ocaml/RFCs#39 (comment)
proposed by @bclement-ocp.
(The description below is self-contained, reading the RFC and is discussion is not necessary.)
We implement two features in sequence, as described in the RFC.
First, atomic record fields are just record fields marked with an
[@atomic]
attribute. Reads and writes to these fields are compiled to atomic operations. In our example, the fieldreaders
is marked atomic, tov.readers
andv.readers <- 42
will be compiled to an atomic read and an atomic write.Second, we implement "atomic locations", which is a compiler-supported way to describe an atomic field within a record to perform other atomic operations than read and write. Continuing this example,
[%atomic.loc v.readers]
has typeint Atomic.Loc.t
, which indicates an atomic location of typeint
. The submoduleAtomic.Loc
exposes operations similar toAtomic
, but on this new typeAtomic.Loc.t
.Currently the only atomic locations supported are atomic record fields. In the future we hope to expose atomic arrays using a similar approach, but we limit the scope of the current PR to record fields.
Implementation (high-level)
In trunk,
Atomic.get
is implemented by a direct memory access, but all otherAtomic
primitives are implemented in C, for example:value caml_atomic_exchange(value ref, value newval)
We preserve this design, but introduce new C functions that take a pointer and an offset instead of just an atomic reference, for example:
value caml_atomic_exchange_field(value obj, value vfield, value newval)
(The old functions are kept around for backward-compatibility reasons, redefined from the new ones with offset
0
.)Internally, a value of type
'a Atomic.Loc.t
is a pair of a block and an offset inside the block. With the example above,[%atomic.loc v.readers]
is the pair(v, 1)
, indicating the second field of the recordv
. The callAtomic.Loc.exchange [%atomic.loc v.readers] x
gets rewritten to something like%atomic_exchange_field v 1 x
, which will eventually become the C callcaml_atomic_exchange_field(v, Val_long(1), x)
. (When an atomic primitive is directly applied to an[%atomic.loc ...]
expression, the compiler eliminates the pair construction on the fly. If it is passed around as a first-class location, then the pair may be constructed.)We reimplement the
Atomic.t
type as a record with a single atomic field, and the corresponding functions become calls to theAtomic.Loc.t
primitives, with offset0
.After this PR, the entire code of
stdlib/atomic.ml
is as follows. ('a atomic_loc
is a new builtin/predef type, used to typecheck the[%atomic.loc ..]
construction.)There is currently no support for something similar to
Atomic.make_contented
(placing values on isolated cache lines to avoid false sharing) for records with atomic fields . Workflows that requiremake_contended
must stick to the existingAtomic.t
type. Allocation directives for records or record fields could be future work -- outside the scope of the present PR.