Discussion:
Support UPDATE table SET(*)=...
Atri Sharma
2014-10-15 08:02:56 UTC
Permalink
Hi All,

Please find attached a patch which implements support for UPDATE table1
SET(*)=...
The patch supports both UPDATE table SET(*)=(a,b,c) and UPDATE table1
SET(*)=(SELECT a,b,c FROM...).
It solves the problem of doing UPDATE from a record variable of the same
type as the table e.g. update foo set (*) = (select foorec.*) where ...;

The design is simple. It basically expands the * in transformation stage,
does the necessary type checking and adds it to the parse tree. This allows
for normal execution for the rest of the stages.


Thoughts/Comments?

Regards,

Atri
Marti Raudsepp
2014-10-15 08:44:48 UTC
Permalink
Hi
Post by Atri Sharma
Please find attached a patch which implements support for UPDATE table1
SET(*)=...
I presume you haven't read Tom Lane's proposal and discussion about
multiple column assignment in UPDATE:
http://www.postgresql.org/message-id/***@sss.pgh.pa.us
(Assigning all columns was also discussed there)

And there's a WIP patch:
http://www.postgresql.org/message-id/***@sss.pgh.pa.us

Regards,
Marti
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Atri Sharma
2014-10-15 08:48:25 UTC
Permalink
Post by Marti Raudsepp
Hi
Post by Atri Sharma
Please find attached a patch which implements support for UPDATE table1
SET(*)=...
I presume you haven't read Tom Lane's proposal and discussion about
(Assigning all columns was also discussed there)
Thanks for the links, but this patch only targets SET(*) case, which, if I
understand correctly, the patch you mentioned doesn't directly handle (If I
understand correctly, the target of the two patches is different).

Regards,

Atri
--
Regards,

Atri
*l'apprenant*
Atri Sharma
2014-10-15 09:32:52 UTC
Permalink
Post by Atri Sharma
Post by Marti Raudsepp
Hi
Post by Atri Sharma
Please find attached a patch which implements support for UPDATE table1
SET(*)=...
I presume you haven't read Tom Lane's proposal and discussion about
(Assigning all columns was also discussed there)
Thanks for the links, but this patch only targets SET(*) case, which, if I
understand correctly, the patch you mentioned doesn't directly handle (If I
understand correctly, the target of the two patches is different).
Digging more, I figured that the patch I posted builds on top of Tom's
patch, since it did not add whole row cases.

Regards,

Atri
Merlin Moncure
2014-10-17 14:15:27 UTC
Permalink
Post by Atri Sharma
Post by Marti Raudsepp
Hi
Post by Atri Sharma
Please find attached a patch which implements support for UPDATE table1
SET(*)=...
I presume you haven't read Tom Lane's proposal and discussion about
(Assigning all columns was also discussed there)
Thanks for the links, but this patch only targets SET(*) case, which, if I
understand correctly, the patch you mentioned doesn't directly handle (If I
understand correctly, the target of the two patches is different).
Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL. I'm
not sure about the proposed syntax though; it seems a little weird to
me. Any particular reason why you couldn't have just done:

UPDATE table1 SET * = a,b,c, ...

also,

UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);

seems cleaner than the proposed syntax for row assignment. Tom
objected though IIRC.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Atri Sharma
2014-10-17 14:50:25 UTC
Permalink
Post by Atri Sharma
Post by Atri Sharma
Post by Marti Raudsepp
Hi
Post by Atri Sharma
Please find attached a patch which implements support for UPDATE
table1
Post by Atri Sharma
Post by Marti Raudsepp
Post by Atri Sharma
SET(*)=...
I presume you haven't read Tom Lane's proposal and discussion about
(Assigning all columns was also discussed there)
Thanks for the links, but this patch only targets SET(*) case, which, if
I
Post by Atri Sharma
understand correctly, the patch you mentioned doesn't directly handle
(If I
Post by Atri Sharma
understand correctly, the target of the two patches is different).
Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL.
Thanks!
Post by Atri Sharma
I'm
not sure about the proposed syntax though; it seems a little weird to
UPDATE table1 SET * = a,b,c, ...
also,
UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
I honestly have not spent a lot of time thinking about the exact syntax
that may be acceptable. If we have arguments for or against a specific
syntax, I will be glad to incorporate them.
Marko Tiikkaja
2014-10-17 14:55:17 UTC
Permalink
Post by Merlin Moncure
UPDATE table1 SET * = a,b,c, ...
That just looks wrong to me. I'd prefer (*) = .. over that any day.
Post by Merlin Moncure
UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
seems cleaner than the proposed syntax for row assignment. Tom
objected though IIRC.
I don't know about Tom, but I didn't like that:
http://www.postgresql.org/message-id/***@joh.to


.marko
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure
2014-10-17 15:03:41 UTC
Permalink
Post by Marko Tiikkaja
Post by Merlin Moncure
UPDATE table1 SET * = a,b,c, ...
That just looks wrong to me. I'd prefer (*) = .. over that any day.
Post by Merlin Moncure
UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
seems cleaner than the proposed syntax for row assignment. Tom
objected though IIRC.
Hm, I didn't understand your objection:

<quoting>
So e.g.:
UPDATE foo f SET f = ..;

would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>

That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Marko Tiikkaja
2014-10-17 15:06:44 UTC
Permalink
Post by Merlin Moncure
<quoting>
UPDATE foo f SET f = ..;
would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>
That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.
local:marko=# show server_version;
server_version
----------------
9.1.13
(1 row)

local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0

This query would change meaning with your suggestion.

I'm not saying it would be a massive problem in practice, but I think we
should first consider options which don't break backwards compatibility,
even if some consider them "less clean".


.marko
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane
2014-10-17 15:16:11 UTC
Permalink
Post by Marko Tiikkaja
local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0
This query would change meaning with your suggestion.
I think it wouldn't; Merlin is proposing that f would be taken as the
field name. A more realistic objection goes like this:

create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works

It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.

regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure
2014-10-17 15:22:11 UTC
Permalink
Post by Tom Lane
Post by Marko Tiikkaja
local:marko=#* create table foo(f int);
CREATE TABLE
local:marko=#* update foo f set f=1;
UPDATE 0
This query would change meaning with your suggestion.
I think it wouldn't; Merlin is proposing that f would be taken as the
create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works
It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
That's exactly how SELECT works. I also dispute that the user should
be surprised in such cases.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane
2014-10-17 15:30:51 UTC
Permalink
Post by Merlin Moncure
Post by Tom Lane
I think it wouldn't; Merlin is proposing that f would be taken as the
create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works
It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
That's exactly how SELECT works. I also dispute that the user should
be surprised in such cases.
Well, the reason we have a problem in SELECT is that we support an
unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
"SELECT foo FROM foo" could represent a whole-row selection is nowhere
to be found in the SQL standard, for good reason IMO. But we've never
had the courage to break cleanly with this Berkeley leftover and
insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
I'd just as soon not introduce the same kind of ambiguity into UPDATE
if we have a reasonable alternative.

regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure
2014-10-17 15:47:57 UTC
Permalink
Post by Tom Lane
Post by Merlin Moncure
Post by Tom Lane
I think it wouldn't; Merlin is proposing that f would be taken as the
create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works
It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
That's exactly how SELECT works. I also dispute that the user should
be surprised in such cases.
Well, the reason we have a problem in SELECT is that we support an
unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
"SELECT foo FROM foo" could represent a whole-row selection is nowhere
to be found in the SQL standard, for good reason IMO. But we've never
had the courage to break cleanly with this Berkeley leftover and
insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
I'd just as soon not introduce the same kind of ambiguity into UPDATE
if we have a reasonable alternative.
Ah, interesting point (I happen to like the terse syntax and use it
often). This is for posterity anyways since you guys seem to like
Atri's proposal, which surprised me. However, I think you're over
simplifying things here. Syntax aside: I think
SELECT f FROM foo f;
and a hypothetical
SELECT row(f.*) FROM foo f;

give different semantics. The former gives an object of type 'f' and
the latter gives type 'row'. To get parity, you'd have to add an
extra cast which means you'd have to play tricky games to avoid extra
performance overhead besides being significantly more verbose. I'm
aware some of the other QUELisms are pretty dodgy and have burned us
in the past (like that whole function as record reference thing) but
pulling a record as a field in select is pretty nice. It's also
widely used and quite useful in json serialization.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure
2014-10-22 15:05:29 UTC
Permalink
Post by Merlin Moncure
Post by Tom Lane
Post by Merlin Moncure
Post by Tom Lane
I think it wouldn't; Merlin is proposing that f would be taken as the
create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works
It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
That's exactly how SELECT works. I also dispute that the user should
be surprised in such cases.
Well, the reason we have a problem in SELECT is that we support an
unholy miscegenation of SQL-spec and PostQUEL syntax: the idea that
"SELECT foo FROM foo" could represent a whole-row selection is nowhere
to be found in the SQL standard, for good reason IMO. But we've never
had the courage to break cleanly with this Berkeley leftover and
insist that people spell it SQL-style as "SELECT ROW(foo.*) FROM foo".
I'd just as soon not introduce the same kind of ambiguity into UPDATE
if we have a reasonable alternative.
Ah, interesting point (I happen to like the terse syntax and use it
often). This is for posterity anyways since you guys seem to like
Atri's proposal, which surprised me. However, I think you're over
simplifying things here. Syntax aside: I think
SELECT f FROM foo f;
and a hypothetical
SELECT row(f.*) FROM foo f;
give different semantics. The former gives an object of type 'f' and
the latter gives type 'row'. To get parity, you'd have to add an
extra cast which means you'd have to play tricky games to avoid extra
performance overhead besides being significantly more verbose. I'm
aware some of the other QUELisms are pretty dodgy and have burned us
in the past (like that whole function as record reference thing) but
pulling a record as a field in select is pretty nice. It's also
widely used and quite useful in json serialization.
Been thinking about this in the back of my mind the last couple of
days. There are some things you can do with the QUEL 'table as
column' in select syntax that are impossible otherwise, at least
today, and its usage is going to proliferate because of that. Row
construction via () or row() needs to be avoided whenever the column
names are important and there is no handy type to cast to. For
example, especially during json serialization it's convenient to do
things like:

select
array_agg((select q from (select a, b) q) order by ...)
from foo;

...where a,b are fields of foo. FWICT, this is the only way to do
json serialization of arbitrary runtime row constructions in a way
that does not anonymize the type. Until I figured out this trick I
used to create lots of composite types that served no purpose other
than to give me a type to cast to which is understandably annoying.

if:
select (x).* from (select (1, 2) as x) q;

worked and properly expanded x to given names should they exist and:
SELECT row(f.*) FROM foo f;

worked and did same, and:
SELECT (row(f.*)).* FROM foo f;

was as reasonably performant and gave the same results as:
SELECT (f).* FROM foo f;

...then IMSNHO you'd have a reasonable path to deprecating the QUEL
inspired syntax. Food for thought.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Marti Raudsepp
2014-10-17 16:05:09 UTC
Permalink
Post by Tom Lane
create table foo(f int, g int);
update foo x set x = (1,2); -- works
alter table foo add column x int;
update foo x set x = (1,2,3); -- no longer works
It's not a real good thing if a column addition or renaming can
so fundamentally change the nature of a query.
I think a significant use case for this feature is when you already have a
row-value and want to persist it in the database, like you can do with
INSERT:
insert into foo select * from populate_record_json(null::foo, '{...}');

In this case the opposite is true: requiring explicit column names would
break the query if you add columns to the table. The fact that you can't
reasonably use populate_record/_json with UPDATE is a significant omission.
IMO this really speaks for supporting shorthand whole-row assignment,
whatever the syntax.

Regards,
Marti
Tom Lane
2014-10-17 15:10:18 UTC
Permalink
Post by Merlin Moncure
<quoting>
UPDATE foo f SET f = ..;
would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>
That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.
The point is that it's fairly messy (and bug-prone) to have a syntax
where we have to make an arbitrary choice between two reasonable
interpretations.

If you look back at the whole thread Marko's above-cited message is in,
we'd considered a bunch of different possible syntaxes for this, and
none of them had much support. The "(*)" idea actually is starting to
look pretty good to me.

regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Merlin Moncure
2014-10-17 15:11:26 UTC
Permalink
Post by Tom Lane
Post by Merlin Moncure
<quoting>
UPDATE foo f SET f = ..;
would resolve to the table, despite there being a column called "f"?
That would break backwards compatibility.
</quoting>
That's not correct: it should work exactly as 'select' does; given a
conflict resolve the field name, so no backwards compatibility issue.
The point is that it's fairly messy (and bug-prone) to have a syntax
where we have to make an arbitrary choice between two reasonable
interpretations.
If you look back at the whole thread Marko's above-cited message is in,
we'd considered a bunch of different possible syntaxes for this, and
none of them had much support. The "(*)" idea actually is starting to
look pretty good to me.
Hm, I'll take it then.

merlin
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Tom Lane
2014-10-17 15:04:27 UTC
Permalink
Post by Merlin Moncure
Post by Atri Sharma
Thanks for the links, but this patch only targets SET(*) case, which, if I
understand correctly, the patch you mentioned doesn't directly handle (If I
understand correctly, the target of the two patches is different).
Yeah -- in fact, there was some discussion about this exact case.
This patch solves a very important problem: when doing record
operations to move data between databases with identical schema
there's currently no way to 'update' in a generic way without building
out the entire field list via complicated and nasty dynamic SQL. I'm
not sure about the proposed syntax though; it seems a little weird to
UPDATE table1 SET * = a,b,c, ...
also,
UPDATE table1 t SET t = (SELECT (a,b,c)::t FROM...);
seems cleaner than the proposed syntax for row assignment. Tom
objected though IIRC.
That last proposal is no good because it would be ambiguous if the
table contains a column by that name. The "(*)" idea actually seems
not bad, since it's shorthand for a parenthesized column list.

I'm not sure about the patch itself though --- in particular, it
doesn't seem to me that it should be touching transformTargetList,
since that doesn't have anything to do with expansion of multiassignments
today. Probably this is a symptom of having chosen a poor representation
of the construct in the raw grammar output. However, I've not exactly
wrapped my head around what that representation is ... the lack of any
comments explaining it doesn't help.

A larger question is whether it's appropriate to do the *-expansion
at parse time, or whether we'd need to postpone it to later in order
to handle reasonable use-cases. Since we expand "SELECT *" at parse
time (and are mandated to do so by the SQL spec, I believe), it seems
consistent to do this at parse time as well; but perhaps there is a
counter argument.

regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-***@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Loading...