Fix equivclass.c's not-quite-right strategy for handling X=X clauses.

The original coding correctly noted that these aren't just redundancies
(they're effectively X IS NOT NULL, assuming = is strict).  However, they
got treated that way if X happened to be in a single-member EquivalenceClass
already, which could happen if there was an ORDER BY X clause, for instance.
The simplest and most reliable solution seems to be to not try to process
such clauses through the EquivalenceClass machinery; just throw them back
for traditional processing.  The amount of work that'd be needed to be
smarter than that seems out of proportion to the benefit.

Per bug #5084 from Bernt Marius Johnsen, and analysis by Andrew Gierth.
This commit is contained in:
Tom Lane 2009-09-29 01:20:34 +00:00
parent 176c3c8db9
commit 25549edb26
4 changed files with 44 additions and 10 deletions

View File

@ -1,4 +1,4 @@
$PostgreSQL: pgsql/src/backend/optimizer/README,v 1.51 2009/09/17 20:49:28 tgl Exp $
$PostgreSQL: pgsql/src/backend/optimizer/README,v 1.52 2009/09/29 01:20:34 tgl Exp $
Optimizer
=========
@ -481,7 +481,10 @@ operator is mergejoinable, so there is no way for a volatile expression to
get into EquivalenceClasses otherwise. Aggregates are disallowed in WHERE
altogether, so will never be found in a mergejoinable clause.) This is just
a convenience to maintain a uniform PathKey representation: such an
EquivalenceClass will never be merged with any other.
EquivalenceClass will never be merged with any other. Note in particular
that a single-item EquivalenceClass {a.x} is *not* meant to imply an
assertion that a.x = a.x; the practical effect of this is that a.x could
be NULL.
An EquivalenceClass also contains a list of btree opfamily OIDs, which
determines what the equalities it represents actually "mean". All the

View File

@ -10,7 +10,7 @@
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/equivclass.c,v 1.20 2009/09/12 00:04:58 tgl Exp $
* $PostgreSQL: pgsql/src/backend/optimizer/path/equivclass.c,v 1.21 2009/09/29 01:20:34 tgl Exp $
*
*-------------------------------------------------------------------------
*/
@ -114,6 +114,19 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
item1_relids = restrictinfo->left_relids;
item2_relids = restrictinfo->right_relids;
/*
* Reject clauses of the form X=X. These are not as redundant as they
* might seem at first glance: assuming the operator is strict, this is
* really an expensive way to write X IS NOT NULL. So we must not risk
* just losing the clause, which would be possible if there is already
* a single-element EquivalenceClass containing X. The case is not
* common enough to be worth contorting the EC machinery for, so just
* reject the clause and let it be processed as a normal restriction
* clause.
*/
if (equal(item1, item2))
return false; /* X=X is not a useful equivalence */
/*
* If below outer join, check for strictness, else reject.
*/
@ -152,13 +165,10 @@ process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
*
* 4. We find neither. Make a new, two-entry EC.
*
* Note: since all ECs are built through this process, it's impossible
* that we'd match an item in more than one existing EC. It is possible
* to match more than once within an EC, if someone fed us something silly
* like "WHERE X=X". (However, we can't simply discard such clauses,
* since they should fail when X is null; so we will build a 2-member EC
* to ensure the correct restriction clause gets generated. Hence there
* is no shortcut here for item1 and item2 equal.)
* Note: since all ECs are built through this process or the similar
* search in get_eclass_for_sort_expr(), it's impossible that we'd match
* an item in more than one existing nonvolatile EC. So it's okay to stop
* at the first match.
*/
ec1 = ec2 = NULL;
em1 = em2 = NULL;

View File

@ -768,3 +768,19 @@ select sillysrf(-1) order by 1;
(4 rows)
drop function sillysrf(int);
-- X = X isn't a no-op, it's effectively X IS NOT NULL assuming = is strict
-- (see bug #5084)
select * from (values (2),(null),(1)) v(k) where k = k order by k;
k
---
1
2
(2 rows)
select * from (values (2),(null),(1)) v(k) where k = k;
k
---
2
1
(2 rows)

View File

@ -202,3 +202,8 @@ select sillysrf(42);
select sillysrf(-1) order by 1;
drop function sillysrf(int);
-- X = X isn't a no-op, it's effectively X IS NOT NULL assuming = is strict
-- (see bug #5084)
select * from (values (2),(null),(1)) v(k) where k = k order by k;
select * from (values (2),(null),(1)) v(k) where k = k;