Post by Kevin GrittnerEvery way I look at it, inside a REPEATABLE READ or SERIALIZABLE
transaction a check for child rows when validating a parent DELETE
should consider both rows which exist according to the transaction
snapshot and according to a "current" snapshot. Interestingly, the
run of the query passes both snapshots through to the executor, but
for this query the estate->es_crosscheck_snapshot field (which
contains the transaction snapshot) doesn't seem to be consulted.
It makes me wonder whether we were at some point doing this right
and it later got broken.
I've been pondering a completely different way to fix this. Many years
ago I tried to get rid of the crosscheck snapshot completely by changing
the way locking conflicts are treated for REPEATABLE READ transactions
above.
The basic idea is that taking a share lock on a row implies that you're
going to apply further changes whose correctness depends on existence
of the row you lock. That, in particular, applies to the locks taken
by RI triggers -- we lock the parent row before we add children, because
the children's existence necessitates the existence of the parent. If
you take an exclusive lock, OTOH, that implies a modification of the row
itself (we never explicitly take that lock during an UPDATE or DELETE,
but we do so implicitly, because UPDATEs and DELETEs conflict with SHARE
locks). So after obtaining such a lock, its the lock holder's responsibility
to check that the desired update doesn't break anything, i.e. in the case
of RI that it doesn't create any orphaned children.
The only reason we need the crosscheck snapshot to do that is because
children may have been added (and the change committed) *after* the
transaction which removed the parent has taken its snapshot, but *before*
that transaction locks the parent row.
My proposal is to instead extend the locking protocol to prevent that.
Essentially, we have to raise a serialization error whenever
1) We attempt to exclusively lock a row (this includes updating or deleting
it), and
2) somebody else did hold a share lock on that row recently, and
3) That somebody is invisible to us according to our snapshot.
My initial attempt to do that failed, because we used to have very little
means of storing the locking history - the only source of information was
the xmax field, so any update of a tuple removed information about previous
lock holders - even if that update was later aborted. I pondered using
multi-xids for this, but at the time I deemed that too risky - back at the
time, they had a few wraparound issues and the like which were OK for share
locks, but not for what I intended.
But now that we have KEY SHARE locks, the situation changes. We now rely on
multi-xids to a much greater extent, and AFAIK multi-xid wraparound is now
correctly dealt with. We also already ensure that the information contained
in multi-xids are preserved across tuple upgrades (otherwise, updating a row
on which someone holds a KEY SHARE lock would be broken).
So all that is missing, I think, is
1) To make sure we only remove a multi-xid if none of the xids are invisible
to any snapshot (i.e. lie before GlobalXmin or something like that).
2) When we acquire a lock (either explicitly or implicitly by doing an
UPDATE or DELETE) check if all previous committed lock holders are
visible according to our snapshot, and raise a serialization error
if not.
The big advantage of doing that over fixing the crosscheck logic would be
that it'd make it possible to write concurrency-safe FK triggers in any
procedural language.
best regards,
Florian Pflug
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers