Atomic record fields #13404

Open
wants to merge 20 commits into
base: trunk

clef-men
Contributor

Example:

type t = {
  mutable current : int;
  mutable readers: int [@atomic];
}

let add_new_reader v =
  Atomic.Loc.incr [%atomic.loc v.readers]

This PR is implemented by myself (@clef-men) with help from @gasche, including for this PR description.

It is ready for review. We did our best to have a clean git history, so reading the PR commit-by-commit is recommended.

This PR sits on top of #13396, #13397 and #13398.
(Helping make decisions on those PRs will help the present PR move forward.)

(cc @OlivierNicole, @polytypic)

Why

Current OCaml 5 only supports atomic operations on a special type 'a Atomic.t of atomic references, which are like 'a ref except that access operations are atomic. When implementing concurrent data structures, it would be desirable for performance to have records with atomic fields, instead of having an indirection on each atomic field -- just like mutable x : foo is more efficient than x : foo ref. This is also helpful when combined with inline records inside variant constructors.
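For comparison, here is a minimal sketch of how the example record has to be written with the existing Atomic.t type. The names make and reader_count are hypothetical helpers added for illustration; the point is only that readers is a pointer to a separately allocated atomic cell, which is the indirection this PR aims to remove.

```ocaml
(* Encoding available today: the atomic counter lives in its own
   'a Atomic.t block, reached through one extra pointer. *)
type t = {
  mutable current : int;
  readers : int Atomic.t;  (* pointer to a separately allocated cell *)
}

(* Hypothetical constructor, for illustration only. *)
let make () = { current = 0; readers = Atomic.make 0 }

let add_new_reader v = Atomic.incr v.readers

(* Hypothetical accessor, for illustration only. *)
let reader_count v = Atomic.get v.readers
```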

Design

This PR implements the "Atomic Record Fields" RFC
ocaml/RFCs#39
more specifically the "design number 2" from
ocaml/RFCs#39 (comment)
proposed by @bclement-ocp.

(The description below is self-contained, reading the RFC and its discussion is not necessary.)

We implement two features in sequence, as described in the RFC.

First, atomic record fields are just record fields marked with an [@atomic] attribute. Reads and writes to these fields are compiled to atomic operations. In our example, the field readers is marked atomic, so v.readers and v.readers <- 42 will be compiled to an atomic read and an atomic write.

Second, we implement "atomic locations", which is a compiler-supported way to describe an atomic field within a record to perform other atomic operations than read and write. Continuing this example, [%atomic.loc v.readers] has type int Atomic.Loc.t, which indicates an atomic location of type int. The submodule Atomic.Loc exposes operations similar to Atomic, but on this new type Atomic.Loc.t.

Currently the only atomic locations supported are atomic record fields. In the future we hope to expose atomic arrays using a similar approach, but we limit the scope of the current PR to record fields.

Implementation (high-level)

  • In trunk, Atomic.get is implemented by a direct memory access, but all other Atomic primitives are implemented in C, for example:

    value caml_atomic_exchange(value ref, value newval)

    We preserve this design, but introduce new C functions that take a pointer and an offset instead of just an atomic reference, for example:

    value caml_atomic_exchange_field(value obj, value vfield, value newval)

    (The old functions are kept around for backward-compatibility reasons, redefined from the new ones with offset 0.)

  • Internally, a value of type 'a Atomic.Loc.t is a pair of a block and an offset inside the block. With the example above, [%atomic.loc v.readers] is the pair (v, 1), indicating the second field of the record v. The call Atomic.Loc.exchange [%atomic.loc v.readers] x gets rewritten to something like %atomic_exchange_field v 1 x, which will eventually become the C call caml_atomic_exchange_field(v, Val_long(1), x). (When an atomic primitive is directly applied to an [%atomic.loc ...] expression, the compiler eliminates the pair construction on the fly. If it is passed around as a first-class location, then the pair may be constructed.)

  • We reimplement the Atomic.t type as a record with a single atomic field, and the corresponding functions become calls to the Atomic.Loc.t primitives, with offset 0.
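The relationship between the three layers above can be sketched with a toy model in plain OCaml. This is not the implementation: an array of Atomic.t cells stands in for a heap block, and the names block, loc, exchange_field, exchange_loc and exchange_ref are hypothetical. It only illustrates that a location is a (block, offset) pair, and that the single-cell reference is the special case of offset 0.

```ocaml
(* Toy model only: a "block" is an array of atomic cells, standing in
   for a heap block whose fields can be accessed atomically. *)
type 'a block = 'a Atomic.t array

(* Field-style primitive: the block and the offset are separate
   arguments, in the style of caml_atomic_exchange_field. *)
let exchange_field (b : 'a block) (offset : int) (v : 'a) : 'a =
  Atomic.exchange b.(offset) v

(* A location packs the block and the offset into one value that can
   be passed around. *)
type 'a loc = { block : 'a block; offset : int }

let exchange_loc (l : 'a loc) (v : 'a) : 'a =
  exchange_field l.block l.offset v

(* The single-cell reference is the special case of offset 0. *)
let exchange_ref (b : 'a block) (v : 'a) : 'a =
  exchange_field b 0 v
```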

After this PR, the entire code of stdlib/atomic.ml is as follows. ('a atomic_loc is a new builtin/predef type, used to typecheck the [%atomic.loc ..] construction.)

external ignore : 'a -> unit = "%ignore"

module Loc = struct
  type 'a t = 'a atomic_loc

  external get : 'a t -> 'a = "%atomic_load_loc"
  external exchange : 'a t -> 'a -> 'a = "%atomic_exchange_loc"
  external compare_and_set : 'a t -> 'a -> 'a -> bool = "%atomic_cas_loc"
  external fetch_and_add : int t -> int -> int = "%atomic_fetch_add_loc"

  let set t v =
    ignore (exchange t v)
  let incr t =
    ignore (fetch_and_add t 1)
  let decr t =
    ignore (fetch_and_add t (-1))
end

type !'a t = { mutable contents: 'a [@atomic]; }

let make v = { contents= v }

external make_contended : 'a -> 'a t = "caml_atomic_make_contended"

let get t =
  t.contents
let set t v =
  t.contents <- v

let exchange t v =
  Loc.exchange [%atomic.loc t.contents] v
let compare_and_set t old new_ =
  Loc.compare_and_set [%atomic.loc t.contents] old new_
let fetch_and_add t incr =
  Loc.fetch_and_add [%atomic.loc t.contents] incr
let incr t =
  Loc.incr [%atomic.loc t.contents]
let decr t =
  Loc.decr [%atomic.loc t.contents]

There is currently no support for something similar to Atomic.make_contended (placing values on isolated cache lines to avoid false sharing) for records with atomic fields. Workflows that require make_contended must stick to the existing Atomic.t type. Allocation directives for records or record fields could be future work -- outside the scope of the present PR.

@OlivierNicole added the run-thread-sanitizer label (this label makes the CI run the testsuite with TSAN enabled) on Aug 27, 2024
@kayceesrk
Contributor

Thanks for this contribution. I am starting to review this PR.

I wondered why [@atomic] attribute was necessary at all and why we can't just use [%atomic.loc ...] on arbitrary fields to perform the atomic operations. One answer is that the OCaml memory model separates atomic and non-atomic locations. If we allow [%atomic.loc ...] on arbitrary fields, then we open up the possibility of mixed-mode accesses, which the memory model doesn't specify the semantics of.

I also wondered whether it was considered to introduce atomic as a keyword to annotate atomic record fields:

type t = {
  mutable current : int;
  atomic readers: int;
}

instead of

type t = {
  mutable current : int;
  mutable readers: int [@atomic];
}

I'm not suggesting that we do this, and only curious to understand the choice.

@bclement-ocp left a comment
Contributor

@gasche suggested I take a look at this since I suggested this variant of the design. The implementation looks like it faithfully implements the described changes; I don't have many comments on the code, which seems fairly straightforward.

With this patch, all regular atomic writes now incur an additional integer untagging operation. This is briefly discussed and brushed away in one of the commits as being "not noticeably more efficient". It is probably true for most (all?) hardware and workflows, especially given that this is inside a C call, but can you share the measurements (if any) that were made to support that assessment? I would personally err on the side of caution here and provide untagged variants, especially given that atomics are likely to be used for performance.

I noted while reading the PR that people who need the extra flexibility of proposal 1 (first-class location) from ocaml/RFCs#39 can do:

module Atomic_field = struct
  type ('r, 'a) t

  external get : 'r -> ('r, 'a) t -> 'a = "%atomic_load_field"
  external exchange : 'r -> ('r, 'a) t -> 'a -> 'a = "%atomic_exchange_field"
  external compare_and_set : 'r -> ('r, 'a) t -> 'a -> 'a -> bool = "%atomic_cas_field"
  external fetch_and_add : 'r -> ('r, int) t -> int -> int = "%atomic_fetch_add_field"
end

The only missing piece would be an [%atomic.field bar] extension point, although I don't think it necessarily makes sense to add it in this PR (especially given that there are some interactions e.g. with type disambiguation that would need to be worked out first). It was not necessarily obvious to me initially that both designs could work together this way.

end

type !'a t =
{ mutable contents: 'a [@atomic];

The parallel with ref feels somewhat satisfying!

@gasche
Member

gasche commented Aug 28, 2024

With this patch, all regular atomic writes now incur an additional integer untagging operation. [..] It is probably true for most (all?) hardware and workflows, especially given that this is inside a C call, but can you share the measurements (if any) that were made to support that assessment?

No measurements, I assess that the cost of the OCaml/C context switch and of the atomic operation itself dwarf the cost of a bit shift. (I think that if people wanted to make the code more complex because they believe otherwise, they should try to disprove this with microbenchmarks first.)
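To make "the cost of a bit shift" concrete: the runtime represents the integer n as the machine word 2n + 1, so untagging the field offset is a single arithmetic shift. The sketch below models this on ordinary OCaml ints; val_long and long_val are illustrative stand-ins for the runtime's C macros Val_long and Long_val, and the model is only meant for small values of n.

```ocaml
(* Model of the runtime's integer tagging, on ordinary OCaml ints. *)

(* Tag: n becomes 2n + 1 (stand-in for the C macro Val_long). *)
let val_long (n : int) : int = (n lsl 1) lor 1

(* Untag: one arithmetic right shift (stand-in for Long_val). *)
let long_val (w : int) : int = w asr 1
```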

@clef-men
Contributor Author

Thanks for this contribution. I am starting to review this PR.

I wondered why [@atomic] attribute was necessary at all and why we can't just use [%atomic.loc ...] on arbitrary fields to perform the atomic operations. One answer is that the OCaml memory model separates atomic and non-atomic locations. If we allow [%atomic.loc ...] on arbitrary fields, then we open up the possibility of mixed-mode accesses, which the memory model doesn't specify the semantics of.

I also wondered whether it was considered to introduce atomic as a keyword to annotate atomic record fields:

type t = {
  mutable current : int;
  atomic readers: int;
}

instead of

type t = {
  mutable current : int;
  mutable readers: int [@atomic];
}

I'm not suggesting that we do this, and only curious to understand the choice.

I think it would be more convenient to introduce the atomic keyword.
We considered it but were concerned that it would be difficult to have it accepted. We chose the easy, non-invasive way.
However, if everyone agrees on it, we can include it in the PR.

@fpottier
Contributor

fpottier commented Sep 3, 2024

In principle I would be in favor of making atomic a keyword. I don't think an attribute should influence the semantics of the code.

I would even go so far as to suggest that one should write atomic mutable or mutable atomic as opposed to just atomic.

I would also be happy if we could find a better syntax for [%atomic.loc x.foo]. I am not sure what to propose. By analogy with C, perhaps &x.foo would make sense, but I suppose that this is impossible, for compatibility reasons?

@gasche
Member

gasche commented Sep 3, 2024

My appetite for syntax discussions is fairly low. Maybe we could first consider the feature and its implementation, merge the present PR if we decide to go for it, and then discuss reserved syntax for those constructs? If a release were to happen in between the two changes, maintaining both options is very low-maintenance so it's no big deal.

(Another option would be to move syntax discussions to ocaml/RFCs#39, in which case they could start right away without flooding the discussion of this one implementation.)

@fpottier
Contributor

fpottier commented Sep 3, 2024

Sure, my comments about syntax were not meant to delay this PR.

I just happen to think that syntax is an important aspect of language design, so I hope that (sooner rather than later) it is possible to give a nice syntactic appearance to the proposed new features.

@OlivierNicole left a comment
Contributor

I reviewed today with @clef-men and @gasche present, and the implementation looks correct to me. I like the design, as I had expressed in the RFC thread.

I have mostly minor comments, apart from the fact that I have been unable to reproduce the first bootstrap commit (see my dedicated review comment about that).

@fpottier
Contributor

Naive question: why is Atomic.Loc.set implemented in terms of Atomic.Loc.exchange? Isn't that needlessly costly?

@OlivierNicole
Contributor

It reflects the fact that Atomic.set is implemented in terms of Atomic.exchange, to enforce some properties of the OCaml memory model (see #10995).
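As a small illustration of that definition, a store can be written as an exchange whose result is discarded. The sketch below uses the existing Atomic.t API, and set_via_exchange is a hypothetical name; it mirrors the shape of Loc.set in the PR's stdlib/atomic.ml and does not reproduce the memory-model argument from #10995.

```ocaml
(* A store expressed as an exchange whose previous value is dropped,
   in the same shape as Loc.set in the PR description. *)
let set_via_exchange (r : 'a Atomic.t) (v : 'a) : unit =
  ignore (Atomic.exchange r v)
```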

@gasche
Member

gasche commented Sep 27, 2024

This morning I gave a talk on The Pattern-Matching Bug, and we realized (discussion with @clef-men and @fpottier) that there is a connection to atomic record fields:

  1. This PR does not handle reads to atomic record fields that come from pattern-matching, only those in Pexp_getfield expressions. Oops! At a minimum one would expect those reads to be atomic reads.
  2. It is probably not a good idea to synthesize atomic reads in the middle of pattern-matching control flow. For example if a and b are atomic fields of the same record, then the pattern {a; b} would perform two atomic reads in an implementation-defined order. Instead we propose to forbid all atomic field patterns (even those binding just variables), so users have to bind the record and then access it in the right-hand side.

Forbidding reading atomic fields in patterns is restrictive; we could relax the restriction later if users find it too strong and we understand better what would be reasonable. One unfortunate effect of this restriction is that adding the [@atomic] attribute to a record field that already exists may cause the program to be rejected statically, if the code contains pattern-matches on that field.
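A minimal sketch of what user code would look like under this restriction, assuming a compiler that includes this PR (a compiler without it ignores the unrecognized attribute and treats the fields as plain mutable fields). The type and the function sum are hypothetical. Instead of matching on {a; b}, the function binds the record and reads each field on the right-hand side, which also makes the order of the two reads explicit.

```ocaml
type t = {
  mutable a : int [@atomic];
  mutable b : int [@atomic];
}

(* With the proposed restriction, a pattern such as {a; b} on these
   fields would be rejected. Bind the record and read the fields
   explicitly instead. *)
let sum (r : t) : int =
  let x = r.a in  (* first read *)
  let y = r.b in  (* second read, after the first *)
  x + y
```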

@OlivierNicole
Contributor

2. It is probably not a good idea to synthesize atomic reads in the middle of pattern-matching control flow. For example if a and b are atomic fields of the same record, then the pattern {a; b} would perform two atomic reads in an implementation-defined order.

Naive question: why is this an issue? If the order of two atomic reads is critical, I would expect a parallel programmer not to perform them via pattern matching.

@gasche
Member

gasche commented Sep 27, 2024

I guess it doesn't help that we were discussing The Pattern-Matching Bug at the same time, whose longer name could be What Can Go Horribly Wrong If You Pattern-Match On Mutable Fields. A serious suggestion that emerged during the discussion is to simply forbid all matches on mutable fields in OCaml. I pointed out that this would break too many programs, so that we cannot do this right now, but maybe (the discussion went) it should have been forbidden from the start. In this context, the decision to forbid implicit atomic reads in patterns makes sense, I think -- if we identify a pain point, let's not make the problem worse.

(This being said, I agree with you that this is not the only reasonable choice. In particular, Atomic.get x + Atomic.get y also has unspecified atomic-read order and everyone is fine with that.)

@gasche
Member

gasche commented Sep 27, 2024

The osx-arm64 fails due to a difference in cmm stamps (I'm not sure why an architecture has different stamps from the other, but oh well). We fixed previous such failures by using predicates to rule out different-cmm-output configurations (we use not-windows; no-flambda; no-tsan;). But there is no flag to disable OSX, and no one has been willing to review "negation in ocamltest" (#13315) yet, so there is no easy way to silence this one. I will propose instead to restrict the test to a single known-good configuration (linux-amd64).

(In the long run we should try to change the cmm type definitions to keep more structure to the identifiers, to support the -dno-unique-ids option in -dcmm, to have more robust test output, or maybe to have an entirely different printer designed with reproducibility in mind.)

@lthls
Contributor

lthls commented Sep 27, 2024

In particular, Atomic.get x + Atomic.get y also has unspecified atomic-read order and everyone is fine with that.

I'm not! But I concur that I can't get many other people to care about this kind of issue.

@gasche
Member

gasche commented Sep 28, 2024

A proposal for a Changes entry:

- RFCs#39, #13404: atomic record fields
  (Clément Allain and Gabriel Scherer, review by KC Sivaramakrishnan, Basile Clément
   and Olivier Nicole)

@OlivierNicole
Contributor

I guess it doesn't help that we were discussing The Pattern-Matching Bug at the same time, whose longer name could be What Can Go Horribly Wrong If You Pattern-Match On Mutable Fields. A serious suggestion that emerged during the discussion is to simply forbid all matches on mutable fields in OCaml. I pointed out that this would break too many programs, so that we cannot do this right now, but maybe (the discussion went) it should have been forbidden from the start. In this context, the decision to forbid implicit atomic reads in patterns makes sense, I think -- if we identify a pain point, let's not make the problem worse.

In that case, forbidding them in patterns seems reasonable, I guess.

@OlivierNicole left a comment
Contributor

My previous comments have been addressed (except for a minor point) and the added code to forbid matching on atomic fields looks good to me. There is a small conflict with trunk to resolve.

@gasche
Member

gasche commented Oct 28, 2024

Consensus from the maintainer meeting: mild consensus in favor of the feature. People wondered about whether the current proposal (where the record value and the offset are packed together in an 'a Loc.t type, proposed by @bclement-ocp) is better than the previous proposal (with only the offset at a ('r, 'a) Loc.t type), and whether the previous could be better in some scenarios. We are planning to ask @polytypic about this.

(I think that the just-the-offset version is more complex to use for users, and that it makes it harder to extend to arrays.)

gasche and others added 19 commits November 21, 2024 14:50
This is a breaking change because this function was (unfortunately)
exposed outside CAML_INTERNALS, and is used by exactly one external
user, you guessed it:
  https://github.com/ocaml-multicore/multicore-magic/blob/360c2e829c9addeca9ccaee1c71f4ad36bb14a79/src/Multicore_magic.mli#L181-L185
  https://github.com/ocaml-multicore/multicore-magic/blob/360c2e829c9addeca9ccaee1c71f4ad36bb14a79/src/unboxed5/multicore_magic_atomic_array.ml#L36-L43

We chose to change the prototype to remain consistent with the naming
convention for the new caml_atomic_*_field primitives, which will be
added to support atomic record fields.

User code can easily adapt to this new prototype we are using, but not
in a way that is compatible with both old and new versions of
OCaml (not without some preprocessing at least).

Another option would be to expose

    int caml_atomic_cas_field(value obj, intnat fld, value, value)
    value caml_atomic_cas_field_boxed(value obj, value vfld, value, value)

but no other group of primitives in the runtime uses this _boxed
terminology, they instead use

    int caml_atomic_cas_field_unboxed(value obj, intnat fld, value, value)
    value caml_atomic_cas_field(value obj, value vfld, value, value)

and this would again break compatibility -- it is not easier to convert
code to that two-version proposal, and not noticeably more efficient.

So in this case we decided to break compatibility (of an obscure,
experimental, undocumented but exposed feature) in favor of
consistency and simplicity of the result.
…_exchange_field] and [caml_atomic_fetch_add_field].
Uses of existing atomic primitives %atomic_foo, which act on
single-field references, are now translated into %atomic_foo_field,
which act on a pointer and an offset -- passed as separate arguments.

In particular, note that the arity of the internal Lambda primitive
    Patomic_load
increases by one with this patchset. (Initially we renamed it into
    Patomic_load_field
but this creates a lot of churn for no clear benefits.)

We also support primitives of the form %atomic_foo_loc, which
expect a pair of a pointer and an offset (as a single argument),
as we proposed in the RFC on atomic fields
  ocaml/RFCs#39
(but there is no language-level support for atomic record fields yet)

Co-authored-by: Clément Allain <clef-men@orange.fr>
Requires a bootstrap.

Co-authored-by: Gabriel Scherer <gabriel.scherer@gmail.com>
This type will be used for ['a Atomic.Loc.t], as proposed
in the RFC
  ocaml/RFCs#39

We implement this here to be able to use it in the stdlib later,
after a bootstrap.
We want to use [mark_label_used] in a context where we cannot easily
find the label declaration, only the label description (from the
environment).
This bootstrap is not required by a compiler change, but it enables
the use of the predefined type `'a atomic_loc` and the
expression-former [%atomic.loc ...] in the standard library.