| MEMBAR_OPS(3) | Library Functions Manual | MEMBAR_OPS(3) |
membar_ops,
membar_acquire,
membar_release,
membar_producer,
membar_consumer,
membar_datadep_consumer,
membar_sync — memory
ordering barriers
#include <sys/atomic.h>
void
membar_acquire(void);
void
membar_release(void);
void
membar_producer(void);
void
membar_consumer(void);
void
membar_datadep_consumer(void);
void
membar_sync(void);
The membar_ops family of functions prevents
reordering of memory operations, as needed for synchronization in
multiprocessor execution environments that have relaxed load and store
order.
In general, memory barriers must come in pairs
— a barrier on one CPU, such as
membar_release(),
must pair with a barrier on another CPU, such as
membar_acquire(), in order to synchronize anything
between the two CPUs. Code using membar_ops should
generally be annotated with comments identifying how the barriers are paired.
membar_ops affect only operations on
regular memory, not on device memory; see
bus_space(9) and
bus_dma(9) for
machine-independent interfaces to handling device memory and DMA operations
for device drivers.
Unlike C11,
all memory operations
— that is, all loads and stores on regular memory — are
affected by membar_ops, not just C11 atomic
operations on _Atomic-qualified objects.
membar_acquire()
Any load preceding membar_acquire() will happen
before all memory operations following it.
A load followed by a
membar_acquire()
implies a load-acquire operation in the language of
C11. membar_acquire() should only be used after
atomic read/modify/write, such as
atomic_cas_uint(3).
For regular loads, instead of x = *p;
membar_acquire(), you should use x =
atomic_load_acquire(p).
membar_acquire()
is typically used in code that implements locking primitives to ensure
that a lock protects its data, and is typically paired with
membar_release(); see below for an example.
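The lock-acquisition use described above can be tried in userland with portable C11. This is only a sketch under assumptions: atomic_thread_fence(memory_order_acquire) stands in for membar_acquire(), a relaxed compare-and-swap stands in for atomic_cas_uint(3), and struct spin, spin_tryenter() and spin_exit() are hypothetical names that are not part of membar_ops.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
struct spin {
	atomic_uint locked;
};

/* Try once to take the lock; returns true on success. */
static bool
spin_tryenter(struct spin *l)
{
	unsigned expected = 0;

	/* Relaxed read/modify/write, playing the role of atomic_cas_uint(3). */
	if (!atomic_compare_exchange_strong_explicit(&l->locked, &expected, 1,
	    memory_order_relaxed, memory_order_relaxed))
		return false;

	/* Stand-in for membar_acquire(): order the CAS load before the body. */
	atomic_thread_fence(memory_order_acquire);
	return true;
}

/* Release the lock; the caller must currently hold it. */
static void
spin_exit(struct spin *l)
{
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed) == 1);

	/* Plain store-release, playing the role of atomic_store_release(9). */
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The fence is placed after the read/modify/write, matching the advice that membar_acquire() should only follow an atomic read/modify/write. The unlock path is a plain store, so it uses a store-release rather than a separate barrier.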
membar_release()
All memory operations preceding membar_release()
will happen before any store that follows it.
A
membar_release()
followed by a store implies a store-release operation
in the language of C11. membar_release() should
only be used before atomic read/modify/write, such as
atomic_inc_uint(3).
For regular stores, instead of membar_release(); *p =
x, you should use atomic_store_release(p,
x).
membar_release()
is typically paired with membar_acquire(), and
is typically used in code that implements locking or reference counting
primitives. Releasing a lock or reference count should use
membar_release(), and acquiring a lock or
handling an object after draining references should use
membar_acquire(), so that whatever happened
before releasing will also have happened before acquiring. For
example:
/* thread A -- release a reference */
obj->state.mumblefrotz = 42;
KASSERT(valid(&obj->state));
membar_release();
atomic_dec_uint(&obj->refcnt);

/*
 * thread B -- busy-wait until last reference is released,
 * then lock it by setting refcnt to UINT_MAX
 */
while (atomic_cas_uint(&obj->refcnt, 0, -1) != 0)
	continue;
membar_acquire();
KASSERT(valid(&obj->state));
obj->state.mumblefrotz--;
In this example,
if the load in
atomic_cas_uint()
in thread B witnesses the store in
atomic_dec_uint()
in thread A setting the reference count to zero,
then
everything in thread A before the
membar_release() is guaranteed to happen before
everything in thread B after the
membar_acquire(), as if the machine had
sequentially executed:
obj->state.mumblefrotz = 42;	/* from thread A */
KASSERT(valid(&obj->state));
...
KASSERT(valid(&obj->state));	/* from thread B */
obj->state.mumblefrotz--;
membar_release()
followed by a store, serving as a store-release
operation, may also be paired with a subsequent load followed by
membar_acquire(), serving as the corresponding
load-acquire operation. However, you should use
atomic_store_release(9)
and
atomic_load_acquire(9)
instead in that situation, unless the store is an atomic
read/modify/write which requires a separate
membar_release().
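The reference-counting pairing described above can be approximated in portable C11 for experimentation outside the kernel. This is a sketch under assumptions, not the kernel's implementation: atomic_thread_fence() with memory_order_release and memory_order_acquire stands in for membar_release() and membar_acquire(), and struct obj, obj_release() and the destroyed flag are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; not part of any real API. */
struct obj {
	atomic_uint refcnt;
	int mumblefrotz;	/* plain data covered by the refcount protocol */
	bool destroyed;
};

/*
 * Drop one reference.  Returns true if this call dropped the last
 * reference and tore the object down.
 */
static bool
obj_release(struct obj *o)
{
	/* Stand-in for membar_release(): prior accesses before the decrement. */
	atomic_thread_fence(memory_order_release);
	if (atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_relaxed) != 1)
		return false;	/* other references remain */

	/* Stand-in for membar_acquire(): the decrement before the teardown. */
	atomic_thread_fence(memory_order_acquire);
	assert(!o->destroyed);
	o->destroyed = true;
	return true;
}
```

Every caller issues the release barrier, but only the caller that observes the count reaching zero needs the acquire barrier, because only that caller goes on to touch the object.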
membar_producer()
All stores preceding membar_producer() will happen
before any stores following it.
membar_producer()
has no analogue in C11.
membar_producer()
is typically used in code that produces data for read-only consumers
which use membar_consumer(), such as
‘seqlocked’ snapshots of statistics; see below for an
example.
membar_consumer()
All loads preceding membar_consumer() will
complete before any loads after it.
membar_consumer()
has no analogue in C11.
membar_consumer()
is typically used in code that reads data from producers which use
membar_producer(), such as
‘seqlocked’ snapshots of statistics. For example:
struct {
	/* version number and in-progress bit */
	unsigned	seq;

	/* read-only statistics, too large for atomic load */
	unsigned	foo;
	int		bar;
	uint64_t	baz;
} *stats;

/* producer (must be serialized, e.g. with mutex(9)) */
stats->seq |= 1;	/* mark update in progress */
membar_producer();
stats->foo = count_foo();
stats->bar = measure_bar();
stats->baz = enumerate_baz();
membar_producer();
stats->seq++;		/* bump version number */

/* consumer (in parallel w/ producer, other consumers) */
restart:
while ((seq = stats->seq) & 1)	/* wait for update */
	SPINLOCK_BACKOFF_HOOK;
membar_consumer();
foo = stats->foo;	/* read out a candidate snapshot */
bar = stats->bar;
baz = stats->baz;
membar_consumer();
if (seq != stats->seq)	/* try again if version changed */
	goto restart;
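The seqlock example above can be tried in userland with a C11 approximation. Because membar_producer() and membar_consumer() have no C11 analogue, this sketch assumes the stronger release and acquire fences as stand-ins. It also makes the fields atomic and accesses them with relaxed ordering, so that the concurrent reads are not data races under the C11 model. The names struct stats, struct snapshot, stats_update() and stats_read() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block; seq is odd while an update is in progress. */
struct stats {
	atomic_uint seq;
	atomic_uint foo;
	atomic_int bar;
	_Atomic uint64_t baz;
};

struct snapshot {
	unsigned foo;
	int bar;
	uint64_t baz;
};

/* Producer; callers must serialize updates among themselves. */
static void
stats_update(struct stats *s, unsigned foo, int bar, uint64_t baz)
{
	unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	assert((seq & 1) == 0);	/* no update already in progress */
	atomic_store_explicit(&s->seq, seq | 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* for membar_producer() */
	atomic_store_explicit(&s->foo, foo, memory_order_relaxed);
	atomic_store_explicit(&s->bar, bar, memory_order_relaxed);
	atomic_store_explicit(&s->baz, baz, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* for membar_producer() */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_relaxed);
}

/* Consumer; retries until it reads a snapshot with a stable, even version. */
static struct snapshot
stats_read(struct stats *s)
{
	struct snapshot snap;
	unsigned seq;

	for (;;) {
		seq = atomic_load_explicit(&s->seq, memory_order_relaxed);
		if (seq & 1)
			continue;	/* update in progress; try again */
		atomic_thread_fence(memory_order_acquire);	/* for membar_consumer() */
		snap.foo = atomic_load_explicit(&s->foo, memory_order_relaxed);
		snap.bar = atomic_load_explicit(&s->bar, memory_order_relaxed);
		snap.baz = atomic_load_explicit(&s->baz, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);	/* for membar_consumer() */
		if (seq == atomic_load_explicit(&s->seq, memory_order_relaxed))
			return snap;
	}
}
```

As in the original, the two producer barriers pair with the two consumer barriers: the first pair brackets the start of the update and the second pair brackets its end.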
membar_datadep_consumer()
Like membar_consumer(), but limited to loads of
addresses dependent on prior loads, or ‘data-dependent’
loads:
int **pp, *p, v;

p = *pp;
membar_datadep_consumer();
v = *p;
consume(v);
membar_datadep_consumer()
is typically paired with membar_release() by
code that initializes an object before publishing it. However, you
should use
atomic_store_release(9)
and
atomic_load_consume(9)
instead, to avoid obscure edge cases in case the consumer is not
read-only.
membar_datadep_consumer()
does not guarantee ordering of loads in branches, or
‘control-dependent’ loads — you must use
membar_consumer() instead:
int *ok, *p, v;

if (*ok) {
	membar_consumer();
	v = *p;
	consume(v);
}
Most CPUs do not reorder
data-dependent loads (i.e., most CPUs guarantee that cached values are
not stale in that case), so
membar_datadep_consumer()
is a no-op on those CPUs.
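The initialize-then-publish pattern mentioned above, in the recommended store-release and load-consume form, might look as follows in portable C11. This is a sketch under assumptions: atomic_store_explicit() with memory_order_release and atomic_load_explicit() with memory_order_consume play the roles of atomic_store_release(9) and atomic_load_consume(9). Compilers commonly treat a consume load as an acquire load. The names struct config, config_publish() and config_limit() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that is initialized once and then published. */
struct config {
	int verbose;
	int limit;
};

static struct config storage;
static _Atomic(struct config *) current;	/* NULL until published */

/* Initialize the object, then publish a pointer to it.  Call at most once. */
static void
config_publish(int verbose, int limit)
{
	assert(atomic_load_explicit(&current, memory_order_relaxed) == NULL);

	storage.verbose = verbose;
	storage.limit = limit;
	/* Store-release: the initialization above is ordered before the pointer. */
	atomic_store_explicit(&current, &storage, memory_order_release);
}

/* Read the published limit, or return fallback if nothing is published yet. */
static int
config_limit(int fallback)
{
	/* Load-consume: later loads through c are ordered after this load. */
	struct config *c = atomic_load_explicit(&current, memory_order_consume);

	if (c == NULL)
		return fallback;
	return c->limit;	/* data-dependent load through c */
}
```

The load of c->limit depends on the address loaded from current, which is the data-dependent case. A load that is guarded by a flag without using the loaded address is only control-dependent and needs the stronger barrier.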
membar_sync()
All memory operations preceding membar_sync() will
happen before any memory operations following it.
membar_sync()
is a sequential consistency acquire/release barrier, analogous to
atomic_thread_fence(memory_order_seq_cst) in
C11.
membar_sync()
is typically paired with membar_sync().
membar_sync()
is typically not needed except in exotic synchronization schemes like
Dekker's algorithm that require store-before-load ordering. If you are
tempted to reach for it, see if there is another way to do what you're
trying to do first.
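To illustrate the store-before-load ordering that such schemes need, here is a userland C11 sketch of the entry step of a Dekker-style handshake between two threads. It is not a complete Dekker's algorithm, because it gives up instead of waiting for its turn. It assumes atomic_thread_fence(memory_order_seq_cst) as the stand-in for membar_sync(), following the analogy given above. The names want, dekker_try_enter() and dekker_leave() are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* want[i] is true while thread i wants, or holds, the critical section. */
static atomic_bool want[2];

/* Try to enter as thread self (0 or 1) without waiting. */
static bool
dekker_try_enter(int self)
{
	assert(self == 0 || self == 1);

	atomic_store_explicit(&want[self], true, memory_order_relaxed);

	/*
	 * Stand-in for membar_sync(): our store to want[self] must be
	 * ordered before our load of want[1 - self].  Acquire and
	 * release barriers alone do not order a store before a load.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&want[1 - self], memory_order_acquire)) {
		/* Contention: withdraw the request and report failure. */
		atomic_store_explicit(&want[self], false, memory_order_relaxed);
		return false;
	}
	return true;
}

/* Leave the critical section entered by a successful dekker_try_enter(). */
static void
dekker_leave(int self)
{
	assert(self == 0 || self == 1);
	atomic_store_explicit(&want[self], false, memory_order_release);
}
```

Each thread's barrier pairs with the same barrier in the other thread. Without it, both threads could store their own flag, load the other's stale false value, and enter together.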
The following memory barriers are deprecated. They were imported from Solaris, which describes them as providing ordering relative to ‘lock acquisition’, but the NetBSD documentation disagreed with both the implementation and actual use about their semantics.
membar_enter()
Implemented as an alias for membar_sync() everywhere, meaning a full
load/store-before-load/store sequential consistency barrier, in order to
guarantee both what the documentation claimed
and what
the implementation actually did.
New code should use
membar_acquire()
for load-before-load/store ordering, which is what most uses need, or
membar_sync() for store-before-load/store
ordering, which typically only appears in exotic synchronization schemes
like Dekker's algorithm.
membar_exit()
Alias for membar_release(). This was originally
meant to be paired with membar_enter().
New code should use
membar_release()
instead.
atomic_ops(3), atomic_loadstore(9), bus_dma(9), bus_space(9)
The membar_ops functions first appeared in
NetBSD 5.0.
The data-dependent load barrier,
membar_datadep_consumer(), first appeared in
NetBSD 7.0.
The membar_acquire() and
membar_release() functions first appeared, and the
membar_enter() and
membar_exit() functions were deprecated, in
NetBSD 10.0.
| March 30, 2022 | NetBSD 11.0 |