Post by Robert Haas
Post by Andres Freund
Post by Robert Haas
Post by Andres Freund
Which is why these acquire/release fences, in contrast to
acquire/release operations, have more guarantees... You put your finger
right onto the spot.
But, uh, we still don't seem to know what those guarantees actually ARE.
Paired together they form a synchronized-with relationship. Problem #1
is that the standard's language isn't, to me at least, clear about
whether there are cases where that doesn't hold. Problem #2 is that our
current README.barrier definition doesn't actually require barriers to
be paired. Which imo is bad, but still a fact.
I don't know what a "synchronized-with relationship" means.
I'm using the standard's language here, given that I'm trying to reason
about its behaviour...
What it means is that if you have a matching pair of acquire/release
operations or barriers/fences, everything that happened *before* the last
release fence will be visible *after* executing the next acquire
operation in a different thread of execution. 'After' here holds
whenever the 'acquiring' thread can see the result of the 'releasing'
operation.
I.e. no load after the acquire can see a value older than the ones
written before the release.
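
To make the pairing concrete, here is a minimal sketch using C11 fences and pthreads. The names payload, ready and run_pair() are made up for illustration, and it assumes the C11 fence rules apply as described above: the release fence comes before a relaxed store of the flag, and the acquire fence comes after a relaxed load that observes that store.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data published by the writer */
static atomic_int ready;   /* flag the two fences synchronize through */

static void *writer(void *arg)
{
    (void) arg;
    payload = 42;                                   /* before the release fence */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static void *reader(void *arg)
{
    int *out = arg;

    /* Spin until the writer's flag store becomes visible. */
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;
    atomic_thread_fence(memory_order_acquire);
    *out = payload;                                 /* after the acquire fence */
    return NULL;
}

/* Hypothetical driver: runs one writer and one reader, returns what the reader saw. */
int run_pair(void)
{
    pthread_t w, r;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);
    pthread_create(&r, NULL, reader, &seen);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    return seen;
}
```

Dropping either fence removes the guarantee: without the release fence the writer's two stores may be reordered, and without the acquire fence the reader's load of payload may be satisfied early.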
My problem with the standard is that it's not particularly clear how it
defines acquire fences *without* an underlying explicit atomic
operation.
I checked gcc's current code and it's fine in that regard. Also other
popular concurrent open source stuff like
http://git.qemu.org/?p=qemu.git;a=blob;f=include/qemu/atomic.h;hb=HEAD
does precisely what I'm talking about:
#ifndef smp_wmb
#ifdef __ATOMIC_RELEASE
#define smp_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#else
#define smp_wmb() __sync_synchronize()
#endif
#endif

#ifndef smp_rmb
#ifdef __ATOMIC_ACQUIRE
#define smp_rmb() __atomic_thread_fence(__ATOMIC_ACQUIRE)
#else
#define smp_rmb() __sync_synchronize()
#endif
#endif
The commit that added it
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=5444e768ee1abe6e021bece19a9a932351f88c88
was written by one gcc guy and reviewed by another one...
So I think we can be pretty sure that gcc's __atomic_thread_fence()
behaves like we want. We probably have to be a bit more careful about
extending that definition (by including atomic.h and doing
atomic_thread_fence(memory_order_acquire)) to use general C11. Which is
probably a couple years away anyway.
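
For illustration only, the C11 spelling could be layered on top of the same fallbacks roughly like this. The macro names my_write_barrier()/my_read_barrier() and the helper function are hypothetical, and the sketch simply assumes the C11 fences give the guarantees we want, which is exactly the part that still needs care.

```c
#include <assert.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L && \
    !defined(__STDC_NO_ATOMICS__)
/* General C11: fences from <stdatomic.h>. */
#include <stdatomic.h>
#define my_write_barrier() atomic_thread_fence(memory_order_release)
#define my_read_barrier()  atomic_thread_fence(memory_order_acquire)
#elif defined(__ATOMIC_RELEASE)
/* gcc-style builtins, as in the qemu macros quoted above. */
#define my_write_barrier() __atomic_thread_fence(__ATOMIC_RELEASE)
#define my_read_barrier()  __atomic_thread_fence(__ATOMIC_ACQUIRE)
#else
/* Older compilers: fall back to a full barrier. */
#define my_write_barrier() __sync_synchronize()
#define my_read_barrier()  __sync_synchronize()
#endif

/* Hypothetical helper: only checks that both macros compile and execute. */
int barrier_smoke_test(void)
{
    int value = 0;

    value = 1;
    my_write_barrier();
    my_read_barrier();
    return value;
}
```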
Post by Robert Haas
Also, I pretty much designed those definitions to match what Linux
does. And it doesn't require that either, though it says that in most
cases it will work out that way.
My point is that read barriers aren't particularly meaningful
without a defined store order from another thread/process. Without any
form of pairing you don't have that. The writing side could just have
reordered the writes in a way you didn't want. And the kernel docs
do say "A lack of appropriate pairing is almost certainly an error". But
since read barriers also pair with lock release operations, that's
normally not a big problem.
Post by Robert Haas
Post by Andres Freund
The definition of ACQ_REL is pretty clearly sufficient imo: "Full
barrier in both directions and synchronizes with acquire loads and
release stores in another thread.".
I dunno. What's an acquire load? What's a release store? I know
what loads and stores are; I don't know what the adjectives mean.
An acquire load is either an explicit atomic load (tas, cmpxchg, etc.
also count) or a normal load combined with an acquire barrier. The
symmetric definition is true for a release store.
(so, on x86 every load/store that prevents compiler reordering is
essentially an acquire load/release store)
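
In C11 terms the two forms look roughly like the sketch below. The variable flag and the function names are made up for illustration. Note one assumption: for the fence rules of the standard to apply, the "normal" load/store is written here as a relaxed atomic access rather than a plain one.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag;

/* Acquire load, form 1: the load itself carries acquire semantics. */
int load_acquire_explicit(void)
{
    return atomic_load_explicit(&flag, memory_order_acquire);
}

/* Acquire load, form 2: relaxed load followed by an acquire fence. */
int load_then_fence(void)
{
    int v = atomic_load_explicit(&flag, memory_order_relaxed);

    atomic_thread_fence(memory_order_acquire);
    return v;
}

/* Release store, form 1: the store itself carries release semantics. */
void store_release_explicit(int v)
{
    atomic_store_explicit(&flag, v, memory_order_release);
}

/* Release store, form 2: release fence followed by a relaxed store. */
void fence_then_store(int v)
{
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&flag, v, memory_order_relaxed);
}
```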
Post by Robert Haas
Post by Andres Freund
And realistically, in the above example, you'd have to read flag to see
that it's not already 1, right?
Not necessarily. You could be the only writer. Think about the way
the backend entries in the stats system work. The point of setting
the flag may be for other people to know whether the data is in the
middle of being modified.
So you're thinking about something seqlock-like... Isn't the problem
then that you actually don't want acquire semantics, but release or
write barrier semantics on that store? The acquire/read barrier part
would be on the reader side, no?
I'm still unsure what you want to show with that example?
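
To show what I mean by seqlock-like, here is a rough single-writer sketch in C11. The record type and both functions are hypothetical and not taken from the stats code. The payload fields are relaxed atomics only to keep the sketch free of data races under the C11 rules.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical single-writer record; seq is odd while an update is in progress. */
typedef struct
{
    atomic_uint seq;
    atomic_int  a;
    atomic_int  b;
} record;

/* Writer side: release/write-barrier semantics around the data stores. */
void record_write(record *r, int a, int b)
{
    unsigned s = atomic_load_explicit(&r->seq, memory_order_relaxed);

    atomic_store_explicit(&r->seq, s + 1, memory_order_relaxed);  /* "being modified" */
    atomic_thread_fence(memory_order_release);   /* flag store ordered before data stores */
    atomic_store_explicit(&r->a, a, memory_order_relaxed);
    atomic_store_explicit(&r->b, b, memory_order_relaxed);
    atomic_store_explicit(&r->seq, s + 2, memory_order_release);  /* data before "done" */
}

/* Reader side: acquire/read-barrier semantics, retrying on a concurrent update. */
void record_read(record *r, int *a, int *b)
{
    unsigned s0, s1;

    do
    {
        s0 = atomic_load_explicit(&r->seq, memory_order_acquire);
        *a = atomic_load_explicit(&r->a, memory_order_relaxed);
        *b = atomic_load_explicit(&r->b, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);   /* data loads before the re-check */
        s1 = atomic_load_explicit(&r->seq, memory_order_relaxed);
    } while ((s0 & 1) || s0 != s1);
}
```

The store that sets the flag needs the write-side ordering. The acquire part only shows up in record_read().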
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services