<!-- doc/src/sgml/typeconv.sgml -->

<chapter id="typeconv">
<title>Type Conversion</title>

<indexterm zone="typeconv">
<primary>data type</primary>
<secondary>conversion</secondary>
</indexterm>

<para>
<acronym>SQL</acronym> statements can, intentionally or not, require
the mixing of different data types in the same expression.
<productname>PostgreSQL</productname> has extensive facilities for
evaluating mixed-type expressions.
</para>

<para>
In many cases a user does not need
to understand the details of the type conversion mechanism.
However, implicit conversions done by <productname>PostgreSQL</productname>
can affect the results of a query. When necessary, these results
can be tailored by using <emphasis>explicit</emphasis> type conversion.
</para>
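
<para>
As a brief illustration of how an explicit conversion can change a result,
consider the following sketch. It assumes only the built-in numeric types,
and the column aliases are chosen here purely for illustration. With two
<type>integer</type> operands the division operator truncates, whereas
casting one operand to <type>numeric</type> makes the parser select
numeric division instead:

<programlisting>
-- Both operands are integer, so the integer division operator is chosen.
SELECT 7 / 2 AS integer_division;

-- An explicit cast on one operand makes the parser choose numeric division.
SELECT CAST(7 AS numeric) / 2 AS numeric_division;
</programlisting>
</para>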

<para>
This chapter introduces the <productname>PostgreSQL</productname>
type conversion mechanisms and conventions.
Refer to the relevant sections in <xref linkend="datatype"/> and <xref linkend="functions"/>
for more information on specific data types and allowed functions and
operators.
</para>

<sect1 id="typeconv-overview">
<title>Overview</title>

<para>
<acronym>SQL</acronym> is a strongly typed language. That is, every data item
has an associated data type which determines its behavior and allowed usage.
<productname>PostgreSQL</productname> has an extensible type system that is
more general and flexible than those of other <acronym>SQL</acronym> implementations.
Hence, most type conversion behavior in <productname>PostgreSQL</productname>
is governed by general rules rather than by <foreignphrase>ad hoc</foreignphrase>
heuristics. This allows the use of mixed-type expressions even with
user-defined types.
</para>

<para>
The <productname>PostgreSQL</productname> scanner/parser divides lexical
elements into five fundamental categories: integers, non-integer numbers,
strings, identifiers, and key words. Constants of most non-numeric types are
first classified as strings. The <acronym>SQL</acronym> language definition
allows specifying type names with strings, and this mechanism can be used in
<productname>PostgreSQL</productname> to start the parser down the correct
path. For example, the query:

<screen>
SELECT text 'Origin' AS "label", point '(0,0)' AS "value";

 label  | value
--------+-------
 Origin | (0,0)
(1 row)
</screen>

has two literal constants, of type <type>text</type> and <type>point</type>.
If a type is not specified for a string literal, then the placeholder type
<type>unknown</type> is assigned initially, to be resolved in later
stages as described below.
</para>
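
<para>
The initial classification of literals can be inspected with the
<function>pg_typeof</function> function. The following query is only a
sketch for illustration; the column aliases are arbitrary, and the comments
state the type each literal is expected to report before any later
resolution stage has been applied:

<programlisting>
SELECT pg_typeof('abc')      AS untyped_string,   -- reports unknown
       pg_typeof(text 'abc') AS typed_string,     -- reports text
       pg_typeof(42)         AS integer_literal,  -- reports integer
       pg_typeof(4.2)        AS decimal_literal;  -- reports numeric
</programlisting>
</para>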

<para>
There are four fundamental <acronym>SQL</acronym> constructs requiring
distinct type conversion rules in the <productname>PostgreSQL</productname>
parser:

<variablelist>
<varlistentry>
<term>
Function calls
</term>
<listitem>
<para>
Much of the <productname>PostgreSQL</productname> type system is built around a
rich set of functions. Functions can have one or more arguments.
Since <productname>PostgreSQL</productname> permits function
overloading, the function name alone does not uniquely identify the function
to be called; the parser must select the right function based on the data
types of the supplied arguments.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
Operators
</term>
<listitem>
<para>
<productname>PostgreSQL</productname> allows expressions with
prefix (one-argument) operators,
as well as infix (two-argument) operators. Like functions, operators can
be overloaded, so the same problem of selecting the right operator
exists.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
Value Storage
</term>
<listitem>
<para>
<acronym>SQL</acronym> <command>INSERT</command> and <command>UPDATE</command> statements place the results of
expressions into a table. The expressions in the statement must be matched up
with, and perhaps converted to, the types of the target columns.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<literal>UNION</literal>, <literal>CASE</literal>, and related constructs
</term>
<listitem>
<para>
Since all query results from a <literal>UNION</literal> of <command>SELECT</command> statements
must appear in a single set of columns, the types of the results of each
<command>SELECT</command> clause must be matched up and converted to a uniform set.
Similarly, the result expressions of a <literal>CASE</literal> construct must be
converted to a common type so that the <literal>CASE</literal> expression as a whole
has a known output type. The same holds for <literal>ARRAY</literal> constructs,
and for the <function>GREATEST</function> and <function>LEAST</function> functions.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
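
<para>
The last group of constructs can be observed directly. The queries below are
a minimal sketch, assuming only built-in types; <function>pg_typeof</function>
is used to show the common type that the parser settles on when an
<type>integer</type> and a <type>numeric</type> value must share one output
column or one <literal>CASE</literal> result:

<programlisting>
-- UNION: the integer and numeric inputs are resolved to one column type.
SELECT pg_typeof(x) FROM (SELECT 1 AS x UNION ALL SELECT 2.5) AS u;

-- CASE: the result expressions are likewise resolved to a single type.
SELECT pg_typeof(CASE WHEN true THEN 1 ELSE 2.5 END);
</programlisting>

Both queries should report <type>numeric</type>, since <type>integer</type>
can be implicitly converted to <type>numeric</type> but not the reverse.
</para>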

<para>
The system catalogs store information about which conversions, or
<firstterm>casts</firstterm>, exist between which data types, and how to
perform those conversions. Additional casts can be added by the user
with the <xref linkend="sql-createcast"/>
command. (This is usually
done in conjunction with defining new data types. The set of casts
between built-in types has been carefully crafted and is best not
altered.)
</para>
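
<para>
The stored casts can be listed from the <classname>pg_cast</classname>
catalog. The query below is an illustrative sketch that shows the casts
whose source type is <type>integer</type>; the choice of source type and the
column aliases are assumptions made for the example. The
<structfield>castcontext</structfield> column reports <literal>e</literal>
for casts usable only explicitly, <literal>a</literal> for casts also
applied in assignment, and <literal>i</literal> for casts applied
implicitly in expressions:

<programlisting>
SELECT castsource::regtype AS source_type,
       casttarget::regtype AS target_type,
       castcontext
FROM pg_cast
WHERE castsource = 'integer'::regtype
ORDER BY casttarget::regtype::text;
</programlisting>
</para>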

<indexterm>
<primary>data type</primary>
<secondary>category</secondary>
</indexterm>

<para>
An additional heuristic provided by the parser allows improved determination
of the proper casting behavior among groups of types that have implicit casts.
Data types are divided into several basic <firstterm>type
categories</firstterm>, including <type>boolean</type>, <type>numeric</type>,
<type>string</type>, <type>bitstring</type>, <type>datetime</type>,
<type>timespan</type>, <type>geometric</type>, <type>network</type>, and
user-defined. (For a list see <xref linkend="catalog-typcategory-table"/>;
but note it is also possible to create custom type categories.) Within each
category there can be one or more <firstterm>preferred types</firstterm>, which
are preferred when there is a choice of possible types. With careful selection
of preferred types and available implicit casts, it is possible to ensure that
ambiguous expressions (those with multiple candidate parsing solutions) can be
resolved in a useful way.
</para>
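
<para>
The category and preferred-type markings are stored in
<classname>pg_type</classname>. As a sketch, the following query lists the
base types in the numeric (<literal>N</literal>) and string
(<literal>S</literal>) categories together with their preferred flag; the
restriction to these two categories is an arbitrary choice for the example:

<programlisting>
SELECT typname, typcategory, typispreferred
FROM pg_type
WHERE typcategory IN ('N', 'S')
  AND typtype = 'b'          -- base types only
ORDER BY typcategory, typname;
</programlisting>
</para>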

<para>
All type conversion rules are designed with several principles in mind:

<itemizedlist>
<listitem>
<para>
Implicit conversions should never have surprising or unpredictable outcomes.
</para>
</listitem>

<listitem>
<para>
There should be no extra overhead in the parser or executor
if a query does not need implicit type conversion.
That is, if a query is well-formed and the types already match, then the query should execute
without spending extra time in the parser and without introducing unnecessary implicit conversion
calls in the query.
</para>
</listitem>

<listitem>
<para>
Additionally, if a query usually requires an implicit conversion for a function, and
the user then defines a new function with the correct argument types, the parser
should use this new function and no longer do implicit conversion to use the old function.
</para>
</listitem>
</itemizedlist>
</para>
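
<para>
The last principle can be sketched as follows. The function name
<function>typeconv_demo</function> is hypothetical and exists only for this
illustration; the statements are wrapped in a transaction that is rolled
back so that nothing is left behind:

<programlisting>
BEGIN;

-- Hypothetical function that accepts text only.
CREATE FUNCTION typeconv_demo(text) RETURNS text
    LANGUAGE sql AS $$ SELECT 'text version' $$;

-- A varchar argument is implicitly converted to text to call it.
SELECT typeconv_demo('abc'::varchar);

-- Hypothetical overload that matches varchar exactly.
CREATE FUNCTION typeconv_demo(varchar) RETURNS text
    LANGUAGE sql AS $$ SELECT 'varchar version' $$;

-- The exact match is now chosen, so no implicit conversion is needed.
SELECT typeconv_demo('abc'::varchar);

ROLLBACK;
</programlisting>

The first call should return <literal>text version</literal> and the second
<literal>varchar version</literal>.
</para>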

</sect1>

<sect1 id="typeconv-oper">
<title>Operators</title>

<indexterm zone="typeconv-oper">
<primary>operator</primary>
<secondary>type resolution in an invocation</secondary>
</indexterm>

<para>
The specific operator that is referenced by an operator expression
is determined using the following procedure.
Note that this procedure is indirectly affected
by the precedence of the operators involved, since that will determine
which sub-expressions are taken to be the inputs of which operators.
See <xref linkend="sql-precedence"/> for more information.
</para>

<procedure>
<title>Operator Type Resolution</title>

<step id="op-resol-select" performance="required">
<para>
Select the operators to be considered from the
<classname>pg_operator</classname> system catalog. If a non-schema-qualified
operator name was used (the usual case), the operators
considered are those with the matching name and argument count that are
visible in the current search path (see <xref linkend="ddl-schemas-path"/>).
If a qualified operator name was given, only operators in the specified
schema are considered.
</para>

<substeps>
<step performance="optional">
<para>
If the search path finds multiple operators with identical argument types,
only the one appearing earliest in the path is considered. Operators with
different argument types are considered on an equal footing regardless of
search path position.
</para>
</step>
</substeps>
</step>

<step id="op-resol-exact-match" performance="required">
<para>
Check for an operator accepting exactly the input argument types.
If one exists (there can be only one exact match in the set of
operators considered), use it. Lack of an exact match creates a security
hazard when calling, via qualified name
<footnote id="op-qualified-security">
<!-- If you edit this, consider editing func-qualified-security. -->
<para>
The hazard does not arise with a non-schema-qualified name, because a
search path containing schemas that permit untrusted users to create
objects is not a <link linkend="ddl-schemas-patterns">secure schema usage
pattern</link>.
</para>
</footnote>
(not typical), any operator found in a schema that permits untrusted users to
create objects. In such situations, cast arguments to force an exact match.
</para>
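<para>
As a sketch of this advice, the query below calls an operator by qualified
name using the <literal>OPERATOR()</literal> syntax and casts both arguments
so that an exact match exists. Here <literal>pg_catalog</literal> merely
stands in for whatever schema is being referenced, and the alias is an
assumption made for the example:

<programlisting>
-- Both inputs are explicitly integer, so integer + integer matches exactly.
SELECT CAST(3 AS integer) OPERATOR(pg_catalog.+) CAST(4 AS integer) AS total;
</programlisting>
</para>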

<substeps>
<step id="op-resol-exact-unknown" performance="optional">
<para>
If one argument of a binary operator invocation is of the <type>unknown</type> type,
then assume it is the same type as the other argument for this check.
Invocations involving two <type>unknown</type> inputs, or a prefix operator
with an <type>unknown</type> input, will never find a match at this step.
</para>
</step>
<step id="op-resol-exact-domain" performance="optional">
<para>
If one argument of a binary operator invocation is of the <type>unknown</type>
type and the other is of a domain type, next check to see if there is an
operator accepting exactly the domain's base type on both sides; if so, use it.
</para>
</step>
</substeps>
</step>

<step id="op-resol-best-match" performance="required">
<para>
Look for the best match.
</para>
<substeps>
<step performance="required">
<para>
Discard candidate operators for which the input types do not match
and cannot be converted (using an implicit conversion) to match.
<type>unknown</type> literals are
assumed to be convertible to anything for this purpose. If only one
candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
If any input argument is of a domain type, treat it as being of the
domain's base type for all subsequent steps. This ensures that domains
act like their base types for purposes of ambiguous-operator resolution.
</para>
</step>
<step performance="required">
<para>
Run through all candidates and keep those with the most exact matches
on input types. Keep all candidates if none have exact matches.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
Run through all candidates and keep those that accept preferred types (of the
input data type's type category) at the most positions where type conversion
will be required.
Keep all candidates if none accept preferred types.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step performance="required">
<para>
If any input arguments are <type>unknown</type>, check the type
categories accepted at those argument positions by the remaining
candidates. At each position, select the <type>string</type> category
if any candidate accepts that category. (This bias towards string is appropriate
since an unknown-type literal looks like a string.) Otherwise, if
all the remaining candidates accept the same type category, select that
category; otherwise fail because the correct choice cannot be deduced
without more clues. Now discard
candidates that do not accept the selected type category. Furthermore,
if any candidate accepts a preferred type in that category,
discard candidates that accept non-preferred types for that argument.
Keep all candidates if none survive these tests.
If only one candidate remains, use it; else continue to the next step.
</para>
</step>
<step id="op-resol-last-unknown" performance="required">
<para>
If there are both <type>unknown</type> and known-type arguments, and all
the known-type arguments have the same type, assume that the
<type>unknown</type> arguments are also of that type, and check which
candidates can accept that type at the <type>unknown</type>-argument
positions. If exactly one candidate passes this test, use it.
Otherwise, fail.
</para>
</step>
</substeps>
</step>
</procedure>
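
<para>
The candidate set that the procedure starts from can be inspected in
<classname>pg_operator</classname>. The following query is a sketch that
lists every operator named <literal>||</literal> with its argument and
result types; the operator name and the column aliases are choices made for
this illustration. For a prefix operator, the left argument type is zero and
is displayed as <literal>-</literal>:

<programlisting>
SELECT oprnamespace::regnamespace AS schema_name,
       oprname,
       oprleft::regtype   AS left_type,
       oprright::regtype  AS right_type,
       oprresult::regtype AS result_type
FROM pg_operator
WHERE oprname = '||'
ORDER BY oprleft, oprright;
</programlisting>
</para>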
|
|
|
|
|
2003-03-13 02:30:29 +01:00
|
|
|
<para>
|
|
|
|
Some examples follow.
|
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
Mark factorial operator, and postfix operators in general, as deprecated.
Per discussion, we're planning to remove parser support for postfix
operators in order to simplify the grammar. So it behooves us to
put out a deprecation notice at least one release before that.
There is only one built-in postfix operator, ! for factorial.
Label it deprecated in the docs and in pg_description, and adjust
some examples that formerly relied on it. (The sister prefix
operator !! is also deprecated. We don't really have to remove
that one, but since we're suggesting that people use factorial()
instead, it seems better to remove both operators.)
Also state in the CREATE OPERATOR ref page that postfix operators
in general are going away.
Although this changes the initial contents of pg_description,
I did not force a catversion bump; it doesn't seem essential.
In v13, also back-patch 4c5cf5431, so that there's someplace for
the <link>s to point to.
Mark Dilger and John Naylor, with some adjustments by me
Discussion: https://postgr.es/m/BE2DF53D-251A-4E26-972F-930E523580E9@enterprisedb.com
2020-08-30 20:37:24 +02:00
|
|
|
<title>Square Root Operator Type Resolution</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
Mark factorial operator, and postfix operators in general, as deprecated.
Per discussion, we're planning to remove parser support for postfix
operators in order to simplify the grammar. So it behooves us to
put out a deprecation notice at least one release before that.
There is only one built-in postfix operator, ! for factorial.
Label it deprecated in the docs and in pg_description, and adjust
some examples that formerly relied on it. (The sister prefix
operator !! is also deprecated. We don't really have to remove
that one, but since we're suggesting that people use factorial()
instead, it seems better to remove both operators.)
Also state in the CREATE OPERATOR ref page that postfix operators
in general are going away.
Although this changes the initial contents of pg_description,
I did not force a catversion bump; it doesn't seem essential.
In v13, also back-patch 4c5cf5431, so that there's someplace for
the <link>s to point to.
Mark Dilger and John Naylor, with some adjustments by me
Discussion: https://postgr.es/m/BE2DF53D-251A-4E26-972F-930E523580E9@enterprisedb.com
2020-08-30 20:37:24 +02:00
|
|
|
There is only one square root operator (prefix <literal>|/</literal>)
|
Replace the hard-wired type knowledge in TypeCategory() and IsPreferredType()
with system catalog lookups, as was foreseen to be necessary almost since
their creation. Instead put the information into two new pg_type columns,
typcategory and typispreferred. Add support for setting these when
creating a user-defined base type.
The category column is just a "char" (i.e. a poor man's enum), allowing
a crude form of user extensibility of the category list: just use an
otherwise-unused character. This seems sufficient for foreseen uses,
but we could upgrade to having an actual category catalog someday, if
there proves to be a huge demand for custom type categories.
In this patch I have attempted to hew exactly to the behavior of the
previous hardwired logic, except for introducing new type categories for
arrays, composites, and enums. In particular the default preferred state
for user-defined types remains TRUE. That seems worth revisiting, but it
should be done as a separate patch from introducing the infrastructure.
Likewise, any adjustment of the standard set of categories should be done
separately.
2008-07-30 19:05:05 +02:00
|
|
|
defined in the standard catalog, and it takes an argument of type
|
|
|
|
<type>double precision</type>.
|
2007-06-05 23:31:09 +02:00
|
|
|
The scanner assigns an initial type of <type>integer</type> to the argument
|
|
|
|
in this query expression:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
|
|
|
SELECT |/ 40 AS "square root of 40";
|
|
|
|
square root of 40
|
|
|
|
-------------------
|
|
|
|
6.324555320336759
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2007-06-05 23:31:09 +02:00
|
|
|
So the parser does a type conversion on the operand and the query
|
2009-04-27 18:27:36 +02:00
|
|
|
is equivalent to:
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
|
|
|
SELECT |/ CAST(40 AS double precision) AS "square root of 40";
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
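<para>
As a supplementary sketch (not part of the original example), the
<function>pg_typeof</function> function can be used to observe both the
initial type of the constant and the type produced once the operator has
been resolved. The column aliases are chosen only for illustration:
<screen>
-- The bare constant keeps its initial type; the operator result shows
-- that the operand was converted to match the only available operator.
SELECT pg_typeof(40)    AS "literal type",
       pg_typeof(|/ 40) AS "result type";
</screen>
The first column should report <type>integer</type> and the second
<type>double precision</type>.
</para>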
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
|
|
|
<title>String Concatenation Operator Type Resolution</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2009-04-27 18:27:36 +02:00
|
|
|
A string-like syntax is used for working with string types and for
|
2003-03-13 02:30:29 +01:00
|
|
|
working with complex extension types.
|
1998-07-08 15:53:15 +02:00
|
|
|
Strings with unspecified type are matched with likely operator candidates.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2001-10-15 03:00:59 +02:00
|
|
|
An example with one unspecified argument:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT text 'abc' || 'def' AS "text and unknown";
|
|
|
|
|
|
|
|
text and unknown
|
2000-03-26 20:32:30 +02:00
|
|
|
------------------
|
|
|
|
abcdef
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
In this case the parser looks to see if there is an operator taking <type>text</type>
|
|
|
|
for both arguments. Since there is, it assumes that the second argument should
|
2009-04-27 18:27:36 +02:00
|
|
|
be interpreted as type <type>text</type>.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2011-11-18 00:28:41 +01:00
|
|
|
Here is a concatenation of two values of unspecified types:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT 'abc' || 'def' AS "unspecified";
|
|
|
|
|
|
|
|
unspecified
|
2000-03-26 20:32:30 +02:00
|
|
|
-------------
|
|
|
|
abcdef
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
In this case there is no initial hint for which type to use, since no types
|
|
|
|
are specified in the query. So, the parser looks for all candidate operators
|
2000-12-17 06:55:26 +01:00
|
|
|
and finds that there are candidates accepting both string-category and
|
2001-09-09 19:21:59 +02:00
|
|
|
bit-string-category inputs. Since string category is preferred when available,
|
2009-06-17 23:58:49 +02:00
|
|
|
that category is selected, and then the
|
2003-03-13 02:30:29 +01:00
|
|
|
preferred type for strings, <type>text</type>, is used as the specific
|
2011-11-18 00:28:41 +01:00
|
|
|
type to which the unknown-type literals are resolved.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
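<para>
The candidates mentioned above can be listed from the
<structname>pg_operator</structname> catalog, and the outcome of the
resolution can be checked with <function>pg_typeof</function>. This is only
a sketch; the set of rows returned by the first query depends on the server
version and on any installed extensions:
<screen>
-- List every operator named || together with its input and result types.
SELECT oprleft::regtype   AS "left",
       oprright::regtype  AS "right",
       oprresult::regtype AS "result"
FROM pg_operator
WHERE oprname = '||';

-- Both inputs are unknown-type literals, so the result should be text.
SELECT pg_typeof('abc' || 'def') AS "resolved type";
</screen>
</para>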
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2003-03-20 17:17:32 +01:00
|
|
|
<example>
|
2003-12-02 01:26:59 +01:00
|
|
|
<title>Absolute-Value and Negation Operator Type Resolution</title>
|
2003-03-20 17:17:32 +01:00
|
|
|
|
|
|
|
<para>
|
|
|
|
The <productname>PostgreSQL</productname> operator catalog has several
|
2017-10-09 03:44:17 +02:00
|
|
|
entries for the prefix operator <literal>@</literal>, all of which implement
|
2003-03-20 17:17:32 +01:00
|
|
|
absolute-value operations for various numeric data types. One of these
|
|
|
|
entries is for type <type>float8</type>, which is the preferred type in
|
|
|
|
the numeric category. Therefore, <productname>PostgreSQL</productname>
|
2017-10-09 03:44:17 +02:00
|
|
|
will use that entry when faced with an <type>unknown</type> input:
|
2003-03-20 17:17:32 +01:00
|
|
|
<screen>
|
|
|
|
SELECT @ '-4.5' AS "abs";
|
|
|
|
abs
|
|
|
|
-----
|
|
|
|
4.5
|
|
|
|
(1 row)
|
|
|
|
</screen>
|
2007-06-05 23:31:09 +02:00
|
|
|
Here the system has implicitly resolved the unknown-type literal as type
|
|
|
|
<type>float8</type> before applying the chosen operator. We can verify that
|
|
|
|
<type>float8</type> and not some other type was used:
|
2003-03-20 17:17:32 +01:00
|
|
|
<screen>
|
|
|
|
SELECT @ '-4.5e500' AS "abs";
|
|
|
|
|
2003-09-30 05:22:33 +02:00
|
|
|
ERROR: "-4.5e500" is out of range for type double precision
|
2003-03-20 17:17:32 +01:00
|
|
|
</screen>
|
|
|
|
</para>
|
2003-12-02 01:26:59 +01:00
|
|
|
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
On the other hand, the prefix operator <literal>~</literal> (bitwise negation)
|
2003-12-02 01:26:59 +01:00
|
|
|
is defined only for integer data types, not for <type>float8</type>. So, if we
|
2017-10-09 03:44:17 +02:00
|
|
|
try a similar case with <literal>~</literal>, we get:
|
2003-12-02 01:26:59 +01:00
|
|
|
<screen>
|
|
|
|
SELECT ~ '20' AS "negation";
|
|
|
|
|
|
|
|
ERROR: operator is not unique: ~ "unknown"
|
2007-06-05 23:31:09 +02:00
|
|
|
HINT: Could not choose a best candidate operator. You might need to add
|
|
|
|
explicit type casts.
|
2003-12-02 01:26:59 +01:00
|
|
|
</screen>
|
Wording cleanup for error messages. Also change can't -> cannot.
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
2007-02-01 20:10:30 +01:00
|
|
|
This happens because the system cannot decide which of the several
|
2017-10-09 03:44:17 +02:00
|
|
|
possible <literal>~</literal> operators should be preferred. We can help
|
2003-12-02 01:26:59 +01:00
|
|
|
it out with an explicit cast:
|
|
|
|
<screen>
|
|
|
|
SELECT ~ CAST('20' AS int8) AS "negation";
|
|
|
|
|
|
|
|
negation
|
|
|
|
----------
|
|
|
|
-21
|
|
|
|
(1 row)
|
|
|
|
</screen>
|
|
|
|
</para>
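<para>
To see why the parser cannot choose, it can help to list the prefix
operators named <literal>~</literal>. The following query is a sketch that
relies on the <structname>pg_operator</structname> catalog, where an
<structfield>oprkind</structfield> of <literal>l</literal> marks a prefix
operator; the exact rows returned vary between server versions:
<screen>
-- Each row is one candidate that an unknown-type operand could match.
SELECT oprright::regtype  AS "operand",
       oprresult::regtype AS "result"
FROM pg_operator
WHERE oprname = '~'
  AND oprkind = 'l';
</screen>
None of the integer types listed is the preferred type of the numeric
category, which is consistent with the error shown above.
</para>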
|
2003-03-20 17:17:32 +01:00
|
|
|
</example>
|
|
|
|
|
2011-11-18 00:28:41 +01:00
|
|
|
<example>
|
|
|
|
<title>Array Inclusion Operator Type Resolution</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Here is another example of resolving an operator with one known and one
|
|
|
|
unknown input:
|
|
|
|
<screen>
|
|
|
|
SELECT array[1,2] &lt;@ '{1,2,3}' as "is subset";
|
|
|
|
|
|
|
|
is subset
|
|
|
|
-----------
|
|
|
|
t
|
|
|
|
(1 row)
|
|
|
|
</screen>
|
|
|
|
The <productname>PostgreSQL</productname> operator catalog has several
|
2017-10-09 03:44:17 +02:00
|
|
|
entries for the infix operator <literal>&lt;@</literal>, but the only two that
|
2011-11-18 00:28:41 +01:00
|
|
|
could possibly accept an integer array on the left-hand side are
|
2017-10-09 03:44:17 +02:00
|
|
|
array inclusion (<type>anyarray</type> <literal>&lt;@</literal> <type>anyarray</type>)
|
|
|
|
and range inclusion (<type>anyelement</type> <literal>&lt;@</literal> <type>anyrange</type>).
|
2011-11-18 00:28:41 +01:00
|
|
|
Since none of these polymorphic pseudo-types (see <xref
|
2017-11-23 15:39:47 +01:00
|
|
|
linkend="datatype-pseudo"/>) are considered preferred, the parser cannot
|
2014-08-10 22:13:13 +02:00
|
|
|
resolve the ambiguity on that basis.
|
2017-11-23 15:39:47 +01:00
|
|
|
However, <xref linkend="op-resol-last-unknown"/> tells
|
2011-11-18 00:28:41 +01:00
|
|
|
it to assume that the unknown-type literal is of the same type as the other
|
|
|
|
input, that is, integer array. Now only one of the two operators can match,
|
|
|
|
so array inclusion is selected. (Had range inclusion been selected, we would
|
|
|
|
have gotten an error, because the string does not have the right format to be
|
|
|
|
a range literal.)
|
|
|
|
</para>
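<para>
When the intended operator matters, the ambiguity can be avoided altogether
by giving the literal an explicit type. The following sketch, which is an
illustration rather than part of the original example, selects each of the
two operators in turn:
<screen>
-- Explicit array type on the right: array inclusion is the only match.
SELECT array[1,2] &lt;@ '{1,2,3}'::integer[] AS "is subset";

-- Explicit range type on the right: range inclusion is selected instead.
SELECT 2 &lt;@ '[1,3]'::int4range AS "in range";
</screen>
Both queries should return <literal>t</literal>.
</para>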
|
|
|
|
</example>
|
|
|
|
|
2014-08-10 22:13:13 +02:00
|
|
|
<example>
|
|
|
|
<title>Custom Operator on a Domain Type</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Users sometimes try to declare operators applying just to a domain type.
|
|
|
|
This is possible but is not nearly as useful as it might seem, because the
|
|
|
|
operator resolution rules are designed to select operators applying to the
|
|
|
|
domain's base type. As an example, consider:
|
|
|
|
<screen>
|
|
|
|
CREATE DOMAIN mytext AS text CHECK(...);
|
|
|
|
CREATE FUNCTION mytext_eq_text (mytext, text) RETURNS boolean AS ...;
|
|
|
|
CREATE OPERATOR = (procedure=mytext_eq_text, leftarg=mytext, rightarg=text);
|
|
|
|
CREATE TABLE mytable (val mytext);
|
|
|
|
|
|
|
|
SELECT * FROM mytable WHERE val = 'foo';
|
|
|
|
</screen>
|
|
|
|
This query will not use the custom operator. The parser will first see if
|
2017-10-09 03:44:17 +02:00
|
|
|
there is a <type>mytext</type> <literal>=</literal> <type>mytext</type> operator
|
2017-11-23 15:39:47 +01:00
|
|
|
(<xref linkend="op-resol-exact-unknown"/>), which there is not;
|
2017-10-09 03:44:17 +02:00
|
|
|
then it will consider the domain's base type <type>text</type>, and see if
|
|
|
|
there is a <type>text</type> <literal>=</literal> <type>text</type> operator
|
2017-11-23 15:39:47 +01:00
|
|
|
(<xref linkend="op-resol-exact-domain"/>), which there is;
|
2017-10-09 03:44:17 +02:00
|
|
|
so it resolves the <type>unknown</type>-type literal as <type>text</type> and
|
|
|
|
uses the <type>text</type> <literal>=</literal> <type>text</type> operator.
|
2014-08-10 22:13:13 +02:00
|
|
|
The only way to get the custom operator to be used is to explicitly cast
|
|
|
|
the literal:
|
|
|
|
<screen>
|
|
|
|
SELECT * FROM mytable WHERE val = text 'foo';
|
|
|
|
</screen>
|
2017-10-09 03:44:17 +02:00
|
|
|
so that the <type>mytext</type> <literal>=</literal> <type>text</type> operator is found
|
2014-08-10 22:13:13 +02:00
|
|
|
immediately according to the exact-match rule. If the best-match rules
|
|
|
|
are reached, they actively discriminate against operators on domain types.
|
|
|
|
If they did not, such an operator would create too many ambiguous-operator
|
|
|
|
failures, because the casting rules always consider a domain as castable
|
|
|
|
to or from its base type, and so the domain operator would be considered
|
|
|
|
usable in all the same cases as a similarly-named operator on the base type.
|
|
|
|
</para>
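<para>
The behavior described above can be made visible with a deliberately
artificial operator. In this sketch all object names
(<literal>mytext_demo</literal>, <literal>mytext_demo_eq_text</literal>) and
the <literal>CHECK</literal> condition are hypothetical, and the support
function always returns false so that its use is easy to detect. It assumes
you are permitted to create objects in the current schema:
<screen>
CREATE DOMAIN mytext_demo AS text CHECK (VALUE IS NOT NULL);

-- Always returns false, purely so that its use is recognizable.
CREATE FUNCTION mytext_demo_eq_text (mytext_demo, text) RETURNS boolean
    LANGUAGE sql IMMUTABLE AS 'SELECT false';

CREATE OPERATOR = (procedure = mytext_demo_eq_text,
                   leftarg = mytext_demo, rightarg = text);

SELECT 'foo'::mytext_demo = 'foo'      AS "untyped literal",
       'foo'::mytext_demo = text 'foo' AS "text literal";
</screen>
According to the rules above, the first comparison should resolve to the
built-in <type>text</type> <literal>=</literal> <type>text</type> operator
and yield true, while the second should find the custom operator by exact
match and yield false.
</para>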
|
|
|
|
</example>
|
|
|
|
|
1998-12-29 03:24:47 +01:00
|
|
|
</sect1>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2000-09-29 22:21:34 +02:00
|
|
|
<sect1 id="typeconv-func">
|
1998-07-08 15:53:15 +02:00
|
|
|
<title>Functions</title>
|
|
|
|
|
2003-08-31 19:32:24 +02:00
|
|
|
<indexterm zone="typeconv-func">
|
|
|
|
<primary>function</primary>
|
|
|
|
<secondary>type resolution in an invocation</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<para>
|
2009-06-17 23:58:49 +02:00
|
|
|
The specific function that is referenced by a function call
|
|
|
|
is determined using the following procedure.
|
2001-09-15 02:48:59 +02:00
|
|
|
</para>
|
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<procedure>
|
2003-05-26 02:11:29 +02:00
|
|
|
<title>Function Type Resolution</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2002-04-25 22:14:43 +02:00
|
|
|
Select the functions to be considered from the
|
2009-04-27 18:27:36 +02:00
|
|
|
<classname>pg_proc</classname> system catalog. If a non-schema-qualified
|
2003-03-13 02:30:29 +01:00
|
|
|
function name was used, the functions
|
2009-06-17 23:58:49 +02:00
|
|
|
considered are those with the matching name and argument count that are
|
2017-11-23 15:39:47 +01:00
|
|
|
visible in the current search path (see <xref linkend="ddl-schemas-path"/>).
|
2002-04-25 22:14:43 +02:00
|
|
|
If a qualified function name was given, only functions in the specified
|
|
|
|
schema are considered.
|
|
|
|
</para>
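<para>
As a rough illustration of this first step, the visible candidates for an
unqualified name can be listed from <classname>pg_proc</classname>. The
sketch below uses <function>round</function> as the name and does not filter
by argument count, which the parser would also do:
<screen>
-- pg_function_is_visible() reports whether a function can be reached
-- through the current search path without schema qualification.
SELECT p.oid::regprocedure AS "candidate"
FROM pg_proc AS p
WHERE p.proname = 'round'
  AND pg_function_is_visible(p.oid);
</screen>
</para>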
|
|
|
|
|
|
|
|
<substeps>
|
|
|
|
<step performance="optional">
|
|
|
|
<para>
|
|
|
|
If the search path finds multiple functions of identical argument types,
|
2009-04-27 18:27:36 +02:00
|
|
|
only the one appearing earliest in the path is considered. Functions of
|
2002-04-25 22:14:43 +02:00
|
|
|
different argument types are considered on an equal footing regardless of
|
|
|
|
search path position.
|
|
|
|
</para>
|
|
|
|
</step>
|
2008-07-16 03:30:23 +02:00
|
|
|
<step performance="optional">
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
If a function is declared with a <literal>VARIADIC</literal> array parameter, and
|
|
|
|
the call does not use the <literal>VARIADIC</literal> keyword, then the function
|
2008-07-16 03:30:23 +02:00
|
|
|
is treated as if the array parameter were replaced by one or more occurrences
|
|
|
|
of its element type, as needed to match the call. After such expansion the
|
|
|
|
function might have effective argument types identical to some non-variadic
|
|
|
|
function. In that case the function appearing earlier in the search path is
|
|
|
|
used, or if the two functions are in the same schema, the non-variadic one is
|
2008-12-18 19:20:35 +01:00
|
|
|
preferred.
|
|
|
|
</para>
|
2018-07-29 05:08:01 +02:00
|
|
|
<para>
|
|
|
|
This creates a security hazard when calling, via qualified name
|
|
|
|
<footnote id="func-qualified-security">
|
|
|
|
<!-- If you edit this, consider editing op-qualified-security. -->
|
|
|
|
<para>
|
|
|
|
The hazard does not arise with a non-schema-qualified name, because a
|
|
|
|
search path containing schemas that permit untrusted users to create
|
|
|
|
objects is not a <link linkend="ddl-schemas-patterns">secure schema usage
|
|
|
|
pattern</link>.
|
|
|
|
</para>
|
|
|
|
</footnote>,
|
|
|
|
a variadic function found in a schema that permits untrusted users to create
|
|
|
|
objects. A malicious user can take control and execute arbitrary SQL
|
|
|
|
functions as though you executed them. Substitute a call bearing
|
|
|
|
the <literal>VARIADIC</literal> keyword, which bypasses this hazard. Calls
|
|
|
|
populating <literal>VARIADIC "any"</literal> parameters often have no
|
|
|
|
equivalent formulation containing the <literal>VARIADIC</literal> keyword. To
|
|
|
|
issue those calls safely, the function's schema must permit only trusted users
|
|
|
|
to create objects.
|
|
|
|
</para>
|
2008-12-18 19:20:35 +01:00
|
|
|
</step>
|
|
|
|
<step performance="optional">
|
|
|
|
<para>
|
|
|
|
Functions that have default values for parameters are considered to match any
|
|
|
|
call that omits zero or more of the defaultable parameter positions. If more
|
|
|
|
than one such function matches a call, the one appearing earliest in the
|
|
|
|
search path is used. If there are two or more such functions in the same
|
|
|
|
schema with identical parameter types in the non-defaulted positions (which is
|
|
|
|
possible if they have different sets of defaultable parameters), the system
|
|
|
|
will not be able to determine which to prefer, and so an <quote>ambiguous
|
2017-10-09 03:44:17 +02:00
|
|
|
function call</quote> error will result if no better match to the call can be
|
2008-12-18 19:20:35 +01:00
|
|
|
found.
|
2008-07-16 03:30:23 +02:00
|
|
|
</para>
|
2018-07-29 05:08:01 +02:00
|
|
|
<para>
|
|
|
|
This creates an availability hazard when calling, via qualified
|
|
|
|
name<footnoteref linkend="func-qualified-security"/>, any function found in a
|
|
|
|
schema that permits untrusted users to create objects. A malicious user can
|
|
|
|
create a function with the name of an existing function, replicating that
|
|
|
|
function's parameters and appending novel parameters having default values.
|
|
|
|
This precludes new calls to the original function. To forestall this hazard,
|
|
|
|
place functions in schemas that permit only trusted users to create objects.
|
|
|
|
</para>
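<para>
The ambiguity described above can be reproduced with two functions that
differ only in their defaultable parameters. The function name
<literal>default_example</literal> and its parameter names are hypothetical
and serve only as a sketch:
<screen>
CREATE FUNCTION default_example(a integer, b integer DEFAULT 0) RETURNS text
    LANGUAGE sql AS $$ SELECT 'integer variant'::text $$;

CREATE FUNCTION default_example(a integer, c text DEFAULT 'x') RETURNS text
    LANGUAGE sql AS $$ SELECT 'text variant'::text $$;

-- Both functions match once their defaults are applied, so this call
-- is expected to fail with an ambiguous function call error.
SELECT default_example(1);

-- Supplying the second argument with a definite type removes the ambiguity.
SELECT default_example(1, 2);
SELECT default_example(1, text 'y');
</screen>
</para>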
|
2008-07-16 03:30:23 +02:00
|
|
|
</step>
|
2002-04-25 22:14:43 +02:00
|
|
|
</substeps>
|
|
|
|
</step>
|
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
|
|
|
Check for a function accepting exactly the input argument types.
|
|
|
|
If one exists (there can be only one exact match in the set of
|
2018-07-29 05:08:01 +02:00
|
|
|
functions considered), use it. Lack of an exact match creates a security
|
|
|
|
hazard when calling, via qualified
|
|
|
|
name<footnoteref linkend="func-qualified-security"/>, a function found in a
|
|
|
|
schema that permits untrusted users to create objects. In such situations,
|
|
|
|
cast arguments to force an exact match. (Cases involving <type>unknown</type>
|
|
|
|
will never find a match at this step.)
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2009-04-27 18:27:36 +02:00
|
|
|
If no exact match is found, see if the function call appears
|
2007-06-05 23:31:09 +02:00
|
|
|
to be a special type conversion request. This happens if the function call
|
2001-10-05 00:06:46 +02:00
|
|
|
has just one argument and the function name is the same as the (internal)
|
|
|
|
name of some data type. Furthermore, the function argument must be either
|
2008-07-11 09:02:43 +02:00
|
|
|
an unknown-type literal, or a type that is binary-coercible to the named
|
2007-06-05 23:31:09 +02:00
|
|
|
data type, or a type that could be converted to the named data type by
|
|
|
|
applying that type's I/O functions (that is, the conversion is either to or
|
|
|
|
from one of the standard string types). When these conditions are met,
|
2017-10-09 03:44:17 +02:00
|
|
|
the function call is treated as a form of <literal>CAST</literal> specification.
|
2007-06-05 23:31:09 +02:00
|
|
|
<footnote>
|
|
|
|
<para>
|
|
|
|
The reason for this step is to support function-style cast specifications
|
|
|
|
in cases where there is not an actual cast function. If there is a cast
|
|
|
|
function, it is conventionally named after its output type, and so there
|
|
|
|
is no need to have a special case. See
|
2017-11-23 15:39:47 +01:00
|
|
|
<xref linkend="sql-createcast"/>
|
2007-06-05 23:31:09 +02:00
|
|
|
for additional commentary.
|
|
|
|
</para>
|
|
|
|
</footnote>
|
2001-10-05 00:06:46 +02:00
|
|
|
</para>
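<para>
Two calls that are believed to be handled by this step are sketched below.
In the first, the conversion goes through the I/O functions because the
target is a string type; in the second, the argument is an unknown-type
literal and the function name is the internal type name
<type>int4</type>:
<screen>
-- Treated like CAST(42 AS text).
SELECT text(42) AS "value", pg_typeof(text(42)) AS "type";

-- Treated like CAST('42' AS int4).
SELECT int4('42') AS "value", pg_typeof(int4('42')) AS "type";
</screen>
The reported types should be <type>text</type> and <type>integer</type>
respectively.
</para>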
|
|
|
|
</step>
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
1998-07-08 15:53:15 +02:00
|
|
|
Look for the best match.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
<substeps>
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2009-06-17 23:58:49 +02:00
|
|
|
Discard candidate functions for which the input types do not match
|
2003-03-13 02:30:29 +01:00
|
|
|
and cannot be converted (using an implicit conversion) to match.
|
2002-04-25 22:14:43 +02:00
|
|
|
<type>unknown</type> literals are
|
2003-03-13 02:30:29 +01:00
|
|
|
assumed to be convertible to anything for this purpose. If only one
|
2002-04-25 22:14:43 +02:00
|
|
|
candidate remains, use it; else continue to the next step.
|
2000-12-17 06:55:26 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2014-08-10 22:13:13 +02:00
|
|
|
If any input argument is of a domain type, treat it as being of the
|
|
|
|
domain's base type for all subsequent steps. This ensures that domains
|
|
|
|
act like their base types for purposes of ambiguous-function resolution.
|
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2000-12-17 06:55:26 +01:00
|
|
|
Run through all candidates and keep those with the most exact matches
|
2014-08-10 22:13:13 +02:00
|
|
|
on input types. Keep all candidates if none have exact matches.
|
2000-12-17 06:55:26 +01:00
|
|
|
If only one candidate remains, use it; else continue to the next step.
|
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2003-05-26 02:11:29 +02:00
|
|
|
Run through all candidates and keep those that accept preferred types (of the
|
2003-11-01 02:56:29 +01:00
|
|
|
input data type's type category) at the most positions where type conversion
|
2003-05-26 02:11:29 +02:00
|
|
|
will be required.
|
2000-12-17 06:55:26 +01:00
|
|
|
Keep all candidates if none accept preferred types.
|
|
|
|
If only one candidate remains, use it; else continue to the next step.
|
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2003-05-26 02:11:29 +02:00
|
|
|
If any input arguments are <type>unknown</type>, check the type categories
|
|
|
|
accepted
|
2000-12-17 06:55:26 +01:00
|
|
|
at those argument positions by the remaining candidates. At each position,
|
2003-03-13 02:30:29 +01:00
|
|
|
select the <type>string</type> category if any candidate accepts that category.
|
|
|
|
(This bias towards string
|
2009-04-27 18:27:36 +02:00
|
|
|
is appropriate since an unknown-type literal looks like a string.)
|
2000-12-17 06:55:26 +01:00
|
|
|
Otherwise, if all the remaining candidates accept the same type category,
|
2000-12-19 01:54:59 +01:00
|
|
|
select that category; otherwise fail because
|
2003-03-13 02:30:29 +01:00
|
|
|
the correct choice cannot be deduced without more clues.
|
2003-05-26 02:11:29 +02:00
|
|
|
Now discard candidates that do not accept the selected type category.
|
|
|
|
Furthermore, if any candidate accepts a preferred type in that category,
|
|
|
|
discard candidates that accept non-preferred types for that argument.
|
2011-11-18 00:28:41 +01:00
|
|
|
Keep all candidates if none survive these tests.
|
|
|
|
If only one candidate remains, use it; else continue to the next step.
|
2000-12-17 06:55:26 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2011-11-18 00:28:41 +01:00
|
|
|
If there are both <type>unknown</type> and known-type arguments, and all
|
|
|
|
the known-type arguments have the same type, assume that the
|
|
|
|
<type>unknown</type> arguments are also of that type, and check which
|
|
|
|
candidates can accept that type at the <type>unknown</type>-argument
|
|
|
|
positions. If exactly one candidate passes this test, use it.
|
|
|
|
Otherwise, fail.
|
2000-12-17 06:55:26 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
</substeps>
|
1998-12-29 03:24:47 +01:00
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
</procedure>
|
2000-12-17 06:55:26 +01:00
|
|
|
|
2003-03-13 02:30:29 +01:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
Note that the <quote>best match</quote> rules are identical for operator and
|
2003-05-26 02:11:29 +02:00
|
|
|
function type resolution.
|
2003-03-13 02:30:29 +01:00
|
|
|
Some examples follow.
|
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Rounding Function Argument Type Resolution</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2009-06-17 23:58:49 +02:00
|
|
|
There is only one <function>round</function> function that takes two
|
|
|
|
arguments; it takes a first argument of type <type>numeric</type> and
|
|
|
|
a second argument of type <type>integer</type>.
|
|
|
|
So the following query automatically converts
|
2003-03-13 02:30:29 +01:00
|
|
|
the first argument of type <type>integer</type> to
|
|
|
|
<type>numeric</type>:
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT round(4, 4);
|
|
|
|
|
|
|
|
round
|
|
|
|
--------
|
|
|
|
4.0000
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2009-04-27 18:27:36 +02:00
|
|
|
That query is actually transformed by the parser to:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT round(CAST (4 AS numeric), 4);
|
|
|
|
</screen>
|
|
|
|
</para>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
Since numeric constants with decimal points are initially assigned the
|
|
|
|
type <type>numeric</type>, the following query will require no type
|
2009-04-27 18:27:36 +02:00
|
|
|
conversion and therefore might be slightly more efficient:
|
2003-03-13 02:30:29 +01:00
|
|
|
<screen>
|
|
|
|
SELECT round(4.0, 4);
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
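<para>
The initial types of the two constants, and the type of the result, can be
checked with <function>pg_typeof</function>. This query is a supplementary
sketch with illustrative column aliases:
<screen>
SELECT pg_typeof(4)           AS "integer constant",
       pg_typeof(4.0)         AS "decimal constant",
       pg_typeof(round(4, 4)) AS "result";
</screen>
The three columns should report <type>integer</type>,
<type>numeric</type>, and <type>numeric</type>.
</para>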
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2018-07-29 05:08:01 +02:00
|
|
|
<example>
|
|
|
|
<title>Variadic Function Resolution</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<screen>
|
|
|
|
CREATE FUNCTION public.variadic_example(VARIADIC numeric[]) RETURNS int
|
|
|
|
LANGUAGE sql AS 'SELECT 1';
|
|
|
|
CREATE FUNCTION
|
|
|
|
</screen>
|
|
|
|
|
|
|
|
This function accepts, but does not require, the <literal>VARIADIC</literal> keyword. It
|
|
|
|
tolerates both integer and numeric arguments:
|
|
|
|
|
|
|
|
<screen>
|
|
|
|
SELECT public.variadic_example(0),
|
|
|
|
public.variadic_example(0.0),
|
|
|
|
public.variadic_example(VARIADIC array[0.0]);
|
|
|
|
variadic_example | variadic_example | variadic_example
|
|
|
|
------------------+------------------+------------------
|
|
|
|
1 | 1 | 1
|
|
|
|
(1 row)
|
|
|
|
</screen>
|
|
|
|
|
|
|
|
However, the first and second calls will prefer more-specific functions, if
|
|
|
|
available:
|
|
|
|
|
|
|
|
<screen>
|
|
|
|
CREATE FUNCTION public.variadic_example(numeric) RETURNS int
|
|
|
|
LANGUAGE sql AS 'SELECT 2';
|
|
|
|
CREATE FUNCTION
|
|
|
|
|
|
|
|
CREATE FUNCTION public.variadic_example(int) RETURNS int
|
|
|
|
LANGUAGE sql AS 'SELECT 3';
|
|
|
|
CREATE FUNCTION
|
|
|
|
|
|
|
|
SELECT public.variadic_example(0),
|
|
|
|
public.variadic_example(0.0),
|
|
|
|
public.variadic_example(VARIADIC array[0.0]);
|
|
|
|
variadic_example | variadic_example | variadic_example
|
|
|
|
------------------+------------------+------------------
|
|
|
|
3 | 2 | 1
|
|
|
|
(1 row)
|
|
|
|
</screen>
|
|
|
|
|
|
|
|
Given the default configuration and only the first function existing, the
|
|
|
|
first and second calls are insecure. Any user could intercept them by
|
|
|
|
creating the second or third function. By matching the argument type exactly
|
|
|
|
and using the <literal>VARIADIC</literal> keyword, the third call is secure.
|
|
|
|
</para>
|
|
|
|
</example>
|
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
|
|
|
<title>Substring Function Type Resolution</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2003-03-13 02:30:29 +01:00
|
|
|
There are several <function>substr</function> functions, one of which
|
|
|
|
takes types <type>text</type> and <type>integer</type>. If called
|
|
|
|
with a string constant of unspecified type, the system chooses the
|
|
|
|
candidate function that accepts an argument of the preferred category
|
|
|
|
<literal>string</literal> (namely of type <type>text</type>).
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT substr('1234', 3);
|
|
|
|
|
2000-03-26 20:32:30 +02:00
|
|
|
substr
|
|
|
|
--------
|
|
|
|
34
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
|
|
|
If the string is declared to be of type <type>varchar</type>, as might be the case
|
2003-03-13 02:30:29 +01:00
|
|
|
if it comes from a table, then the parser will try to convert it to become <type>text</type>:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT substr(varchar '1234', 3);
|
|
|
|
|
2000-03-26 20:32:30 +02:00
|
|
|
substr
|
|
|
|
--------
|
|
|
|
34
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
|
2009-04-27 18:27:36 +02:00
|
|
|
This is transformed by the parser to effectively become:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT substr(CAST (varchar '1234' AS text), 3);
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
<para>
|
1998-07-08 15:53:15 +02:00
|
|
|
<note>
|
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
The parser learns from the <structname>pg_cast</structname> catalog that
|
2003-05-26 02:11:29 +02:00
|
|
|
<type>text</type> and <type>varchar</type>
|
2003-03-13 02:30:29 +01:00
|
|
|
are binary-compatible, meaning that one can be passed to a function that
|
2000-12-17 06:55:26 +01:00
|
|
|
accepts the other without doing any physical conversion. Therefore, no
|
2007-06-05 23:31:09 +02:00
|
|
|
type conversion call is really inserted in this case.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
</note>
|
2001-09-15 02:48:59 +02:00
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2007-06-05 23:31:09 +02:00
|
|
|
And, if the function is called with an argument of type <type>integer</type>,
|
|
|
|
the parser will try to convert that to <type>text</type>:
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT substr(1234, 3);
|
2007-06-05 23:31:09 +02:00
|
|
|
ERROR: function substr(integer, integer) does not exist
|
|
|
|
HINT: No function matches the given name and argument types. You might need
|
|
|
|
to add explicit type casts.
|
|
|
|
</screen>
|
|
|
|
|
2017-10-09 03:44:17 +02:00
|
|
|
This does not work because <type>integer</type> does not have an implicit cast
|
|
|
|
to <type>text</type>. An explicit cast will work, however:
|
2007-06-05 23:31:09 +02:00
|
|
|
<screen>
|
|
|
|
SELECT substr(CAST (1234 AS text), 3);
|
2003-03-13 02:30:29 +01:00
|
|
|
|
2000-03-26 20:32:30 +02:00
|
|
|
substr
|
|
|
|
--------
|
|
|
|
34
|
1998-07-08 15:53:15 +02:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
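<para>
 The cast information the parser relies on in this example can be inspected
 directly. The following query is an illustrative sketch, not part of the
 resolution procedure itself; it lists any <structname>pg_cast</structname>
 entries from <type>varchar</type> or <type>integer</type> to
 <type>text</type>:
<programlisting>
-- castcontext: 'i' = implicit, 'a' = assignment, 'e' = explicit only
-- castmethod:  'f' = function, 'b' = binary-coercible, 'i' = I/O conversion
SELECT castsource::regtype AS source,
       casttarget::regtype AS target,
       castcontext,
       castmethod
FROM pg_catalog.pg_cast
WHERE castsource IN ('varchar'::regtype, 'integer'::regtype)
  AND casttarget = 'text'::regtype;
</programlisting>
 On a stock installation one would expect a binary-coercible, implicit entry
 for <type>varchar</type> and no implicit entry for <type>integer</type>,
 which is consistent with the behavior shown above.
</para>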
|
|
|
|
|
1998-12-29 03:24:47 +01:00
|
|
|
</sect1>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2000-09-29 22:21:34 +02:00
|
|
|
<sect1 id="typeconv-query">
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Value Storage</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-10-15 03:00:59 +02:00
|
|
|
<para>
|
2003-03-13 02:30:29 +01:00
|
|
|
Values to be inserted into a table are converted to the destination
|
2002-01-20 23:19:57 +01:00
|
|
|
column's data type according to the
|
2001-10-15 03:00:59 +02:00
|
|
|
following steps.
|
|
|
|
</para>
|
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<procedure>
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Value Storage Type Conversion</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
|
|
|
Check for an exact match with the target.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2015-03-04 17:04:30 +01:00
|
|
|
Otherwise, try to convert the expression to the target type. This is possible
|
2017-10-09 03:44:17 +02:00
|
|
|
if an <firstterm>assignment cast</firstterm> between the two types is registered in the
|
2017-11-23 15:39:47 +01:00
|
|
|
<structname>pg_cast</structname> catalog (see <xref linkend="sql-createcast"/>).
|
2015-03-04 17:04:30 +01:00
|
|
|
Alternatively, if the expression is an unknown-type literal, the contents of
|
2000-12-17 06:55:26 +01:00
|
|
|
the literal string will be fed to the input conversion routine for the target
|
|
|
|
type.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2004-12-24 00:07:38 +01:00
|
|
|
Check to see if there is a sizing cast for the target type. A sizing
|
|
|
|
cast is a cast from that type to itself. If one is found in the
|
2017-10-09 03:44:17 +02:00
|
|
|
<structname>pg_cast</structname> catalog, apply it to the expression before storing
|
2004-12-24 00:07:38 +01:00
|
|
|
into the destination column. The implementation function for such a cast
|
|
|
|
always takes an extra parameter of type <type>integer</type>, which receives
|
2017-10-09 03:44:17 +02:00
|
|
|
the destination column's <structfield>atttypmod</structfield> value (typically its
|
|
|
|
declared length, although the interpretation of <structfield>atttypmod</structfield>
|
|
|
|
varies for different data types), and it may take a third <type>boolean</type>
|
2011-09-06 18:14:51 +02:00
|
|
|
parameter that says whether the cast is explicit or implicit. The cast
|
|
|
|
function
|
2004-12-24 00:07:38 +01:00
|
|
|
is responsible for applying any length-dependent semantics such as size
|
|
|
|
checking or truncation.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
</procedure>
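<para>
 The steps above can be sketched with a small table. The table name
 <literal>storage_demo</literal> and its columns are hypothetical and serve
 only as an illustration:
<programlisting>
-- storage_demo is a hypothetical table used only for illustration.
CREATE TABLE storage_demo (i integer, d date);

-- Step 1: an integer expression stored into an integer column is an
-- exact match, so no conversion is needed.
INSERT INTO storage_demo (i) VALUES (42);

-- Step 2: 3.9 is a numeric literal; numeric-to-integer is registered as
-- an assignment cast, so the value is converted (and rounded) on insert.
INSERT INTO storage_demo (i) VALUES (3.9);

-- Step 2, unknown-type literal: the string contents are passed to the
-- input conversion routine of the target type, here date.
INSERT INTO storage_demo (d) VALUES ('2001-02-03');

SELECT i, d FROM storage_demo;
</programlisting>
 The same <type>numeric</type> value would not be accepted where only
 implicit casts are considered, such as a function argument of type
 <type>integer</type>.
</para>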
|
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
2001-10-15 03:00:59 +02:00
|
|
|
<title><type>character</type> Storage Type Conversion</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2011-09-06 18:14:51 +02:00
|
|
|
   For a target column declared as <type>character(20)</type>, the following
|
|
|
|
statement shows that the stored value is sized correctly:
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
CREATE TABLE vv (v character(20));
|
|
|
|
INSERT INTO vv SELECT 'abc' || 'def';
|
2011-09-06 18:14:51 +02:00
|
|
|
SELECT v, octet_length(v) FROM vv;
|
2003-03-13 02:30:29 +01:00
|
|
|
|
2011-09-06 18:14:51 +02:00
|
|
|
v | octet_length
|
|
|
|
----------------------+--------------
|
|
|
|
abcdef | 20
|
2000-12-17 06:55:26 +01:00
|
|
|
(1 row)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
2000-12-17 06:55:26 +01:00
|
|
|
|
2003-03-13 02:30:29 +01:00
|
|
|
<para>
|
2001-09-15 02:48:59 +02:00
|
|
|
What has really happened here is that the two unknown literals are resolved
|
2001-10-15 03:00:59 +02:00
|
|
|
to <type>text</type> by default, allowing the <literal>||</literal> operator
|
|
|
|
to be resolved as <type>text</type> concatenation. Then the <type>text</type>
|
2003-03-13 02:30:29 +01:00
|
|
|
result of the operator is converted to <type>bpchar</type> (<quote>blank-padded
|
2017-10-09 03:44:17 +02:00
|
|
|
char</quote>, the internal name of the <type>character</type> data type) to match the target
|
2008-07-11 09:02:43 +02:00
|
|
|
column type. (Since the conversion from <type>text</type> to
|
|
|
|
<type>bpchar</type> is binary-coercible, this conversion does
|
2001-10-15 03:00:59 +02:00
|
|
|
not insert any real function call.) Finally, the sizing function
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>bpchar(bpchar, integer, boolean)</literal> is found in the system catalog
|
2001-10-15 03:00:59 +02:00
|
|
|
and applied to the operator's result and the stored column length. This
|
|
|
|
type-specific function performs the required length check and addition of
|
|
|
|
padding spaces.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
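<para>
 The effect of the sizing cast's explicit/implicit flag can be sketched with
 a value that is too long for the column. The table name
 <literal>vv_short</literal> is hypothetical:
<programlisting>
-- vv_short is a hypothetical table used only for illustration.
CREATE TABLE vv_short (v character(3));

-- Assignment without an explicit cast: the sizing function is applied
-- implicitly, and an over-length value is expected to raise an error.
INSERT INTO vv_short VALUES ('abcdef');

-- With an explicit cast the sizing function truncates instead, so this
-- statement is expected to store 'abc'.
INSERT INTO vv_short VALUES (CAST('abcdef' AS character(3)));
</programlisting>
 This is a sketch of the length-dependent semantics mentioned in the
 procedure above; other types with type modifiers may interpret the
 modifier differently.
</para>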
|
1998-12-29 03:24:47 +01:00
|
|
|
</sect1>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2000-12-17 06:55:26 +01:00
|
|
|
<sect1 id="typeconv-union-case">
|
2005-06-27 00:05:42 +02:00
|
|
|
<title><literal>UNION</literal>, <literal>CASE</literal>, and Related Constructs</title>
|
2003-08-31 19:32:24 +02:00
|
|
|
|
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>UNION</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>CASE</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>ARRAY</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2006-09-18 21:54:01 +02:00
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>VALUES</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
2005-06-27 00:05:42 +02:00
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>GREATEST</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<indexterm zone="typeconv-union-case">
|
|
|
|
<primary>LEAST</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<para>
|
2017-10-09 03:44:17 +02:00
|
|
|
SQL <literal>UNION</literal> constructs must match up possibly dissimilar
|
2003-11-04 10:55:39 +01:00
|
|
|
types to become a single result set. The resolution algorithm is
|
|
|
|
applied separately to each output column of a union query. The
|
2017-10-09 03:44:17 +02:00
|
|
|
<literal>INTERSECT</literal> and <literal>EXCEPT</literal> constructs resolve
|
|
|
|
dissimilar types in the same way as <literal>UNION</literal>. The
|
|
|
|
<literal>CASE</literal>, <literal>ARRAY</literal>, <literal>VALUES</literal>,
|
|
|
|
<function>GREATEST</function> and <function>LEAST</function> constructs use the identical
|
2003-11-04 10:55:39 +01:00
|
|
|
algorithm to match up their component expressions and select a result
|
|
|
|
data type.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2003-03-13 02:30:29 +01:00
|
|
|
|
1998-07-08 15:53:15 +02:00
|
|
|
<procedure>
|
2005-06-27 00:05:42 +02:00
|
|
|
<title>Type Resolution for <literal>UNION</literal>, <literal>CASE</literal>,
|
|
|
|
and Related Constructs</title>
|
2000-12-17 06:55:26 +01:00
|
|
|
|
2007-11-26 17:46:51 +01:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
|
|
|
If all inputs are of the same type, and it is not <type>unknown</type>,
|
2014-08-10 22:13:13 +02:00
|
|
|
resolve as that type.
|
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
|
|
|
If any input is of a domain type, treat it as being of the
|
|
|
|
domain's base type for all subsequent steps.
|
|
|
|
<footnote>
|
|
|
|
<para>
|
|
|
|
Somewhat like the treatment of domain inputs for operators and
|
|
|
|
functions, this behavior allows a domain type to be preserved through
|
2017-10-09 03:44:17 +02:00
|
|
|
a <literal>UNION</literal> or similar construct, so long as the user is
|
2014-08-10 22:13:13 +02:00
|
|
|
careful to ensure that all inputs are implicitly or explicitly of that
|
2020-08-17 21:40:07 +02:00
|
|
|
exact type. Otherwise the domain's base type will be used.
|
2014-08-10 22:13:13 +02:00
|
|
|
</para>
|
|
|
|
</footnote>
|
2007-11-26 17:46:51 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
|
|
|
|
2000-12-17 06:55:26 +01:00
|
|
|
<step performance="required">
|
|
|
|
<para>
|
|
|
|
If all inputs are of type <type>unknown</type>, resolve as type
|
2003-03-13 02:30:29 +01:00
|
|
|
<type>text</type> (the preferred type of the string category).
|
2017-01-25 15:17:18 +01:00
|
|
|
Otherwise, <type>unknown</type> inputs are ignored for the purposes
|
|
|
|
of the remaining rules.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
2000-12-17 06:55:26 +01:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2000-12-19 01:54:59 +01:00
|
|
|
If the non-unknown inputs are not all of the same type category, fail.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2020-08-17 21:40:07 +02:00
|
|
|
Select the first non-unknown input type as the candidate type,
|
|
|
|
then consider each other non-unknown input type, left to right.
|
|
|
|
<footnote>
|
|
|
|
<para>
|
|
|
|
For historical reasons, <literal>CASE</literal> treats
|
|
|
|
its <literal>ELSE</literal> clause (if any) as the <quote>first</quote>
|
|
|
|
      input, with the <literal>THEN</literal> clause(s) considered after
|
|
|
|
that. In all other cases, <quote>left to right</quote> means the order
|
|
|
|
in which the expressions appear in the query text.
|
|
|
|
</para>
|
|
|
|
</footnote>
|
|
|
|
If the candidate type can be implicitly converted to the other type,
|
|
|
|
but not vice-versa, select the other type as the new candidate type.
|
|
|
|
Then continue considering the remaining inputs. If, at any stage of this
|
|
|
|
process, a preferred type is selected, stop considering additional
|
|
|
|
inputs.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
2000-12-17 06:55:26 +01:00
|
|
|
|
|
|
|
<step performance="required">
|
|
|
|
<para>
|
2020-08-17 21:40:07 +02:00
|
|
|
Convert all inputs to the final candidate type. Fail if there is not an
|
|
|
|
implicit conversion from a given input type to the candidate type.
|
2003-03-13 02:30:29 +01:00
|
|
|
</para>
|
|
|
|
</step>
|
1998-07-08 15:53:15 +02:00
|
|
|
</procedure>
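<para>
 As a sketch of the first two steps, the following statements use a
 hypothetical domain named <literal>posint</literal> to show when a domain
 type is preserved and when its base type is used instead:
<programlisting>
-- posint is a hypothetical domain used only for illustration.
CREATE DOMAIN posint AS integer CHECK (VALUE &gt; 0);

-- All inputs are exactly of type posint, so step 1 applies and the
-- result column should be reported as posint.
SELECT pg_typeof(x)
FROM (SELECT 1::posint AS x UNION ALL SELECT 2::posint) AS s;

-- One input is plain integer, so the domain input is treated as its base
-- type (step 2) and the result column should be reported as integer.
SELECT pg_typeof(x)
FROM (SELECT 1::posint AS x UNION ALL SELECT 2) AS s;
</programlisting>
</para>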
|
|
|
|
|
2003-03-13 02:30:29 +01:00
|
|
|
<para>
|
|
|
|
Some examples follow.
|
|
|
|
</para>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Type Resolution with Underspecified Types in a Union</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT text 'a' AS "text" UNION SELECT 'b';
|
|
|
|
|
|
|
|
text
|
2000-03-26 20:32:30 +02:00
|
|
|
------
|
|
|
|
a
|
|
|
|
b
|
1998-07-08 15:53:15 +02:00
|
|
|
(2 rows)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
2009-04-27 18:27:36 +02:00
|
|
|
Here, the unknown-type literal <literal>'b'</literal> will be resolved to type <type>text</type>.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Type Resolution in a Simple Union</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT 1.2 AS "numeric" UNION SELECT 1;
|
|
|
|
|
|
|
|
numeric
|
2002-09-18 23:35:25 +02:00
|
|
|
---------
|
|
|
|
1
|
|
|
|
1.2
|
1998-07-08 15:53:15 +02:00
|
|
|
(2 rows)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
2017-10-09 03:44:17 +02:00
|
|
|
The literal <literal>1.2</literal> is of type <type>numeric</type>,
|
|
|
|
and the <type>integer</type> value <literal>1</literal> can be cast implicitly to
|
|
|
|
<type>numeric</type>, so that type is used.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
2001-09-15 02:48:59 +02:00
|
|
|
<example>
|
2003-03-13 02:30:29 +01:00
|
|
|
<title>Type Resolution in a Transposed Union</title>
|
1998-07-08 15:53:15 +02:00
|
|
|
|
|
|
|
<para>
|
2001-09-15 02:48:59 +02:00
|
|
|
<screen>
|
2003-03-13 02:30:29 +01:00
|
|
|
SELECT 1 AS "real" UNION SELECT CAST('2.2' AS REAL);
|
|
|
|
|
|
|
|
real
|
2002-09-18 23:35:25 +02:00
|
|
|
------
|
|
|
|
1
|
|
|
|
2.2
|
2000-12-17 06:55:26 +01:00
|
|
|
(2 rows)
|
2001-09-15 02:48:59 +02:00
|
|
|
</screen>
|
2017-10-09 03:44:17 +02:00
|
|
|
Here, since type <type>real</type> cannot be implicitly cast to <type>integer</type>,
|
|
|
|
but <type>integer</type> can be implicitly cast to <type>real</type>, the union
|
|
|
|
result type is resolved as <type>real</type>.
|
1998-12-29 03:24:47 +01:00
|
|
|
</para>
|
2001-09-15 02:48:59 +02:00
|
|
|
</example>
|
2018-03-25 22:15:15 +02:00
|
|
|
|
|
|
|
<example>
|
|
|
|
<title>Type Resolution in a Nested Union</title>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<screen>
|
|
|
|
SELECT NULL UNION SELECT NULL UNION SELECT 1;
|
|
|
|
|
|
|
|
ERROR: UNION types text and integer cannot be matched
|
|
|
|
</screen>
|
|
|
|
This failure occurs because <productname>PostgreSQL</productname> treats
|
|
|
|
multiple <literal>UNION</literal>s as a nest of pairwise operations;
|
|
|
|
that is, this input is the same as
|
|
|
|
<screen>
|
|
|
|
(SELECT NULL UNION SELECT NULL) UNION SELECT 1;
|
|
|
|
</screen>
|
|
|
|
The inner <literal>UNION</literal> is resolved as emitting
|
|
|
|
type <type>text</type>, according to the rules given above. Then the
|
|
|
|
outer <literal>UNION</literal> has inputs of types <type>text</type>
|
|
|
|
and <type>integer</type>, leading to the observed error. The problem
|
|
|
|
can be fixed by ensuring that the leftmost <literal>UNION</literal>
|
|
|
|
has at least one input of the desired result type.
|
|
|
|
</para>
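<para>
 One way to apply that fix, shown here as a sketch, is to cast the first
 <literal>NULL</literal> so that the inner <literal>UNION</literal> already
 resolves to <type>integer</type>:
<programlisting>
-- The explicit cast gives the leftmost UNION an integer input, so the
-- inner result is integer and the outer UNION matches integer with integer.
SELECT NULL::integer UNION SELECT NULL UNION SELECT 1;
</programlisting>
</para>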
|
|
|
|
|
|
|
|
<para>
|
|
|
|
<literal>INTERSECT</literal> and <literal>EXCEPT</literal> operations are
|
|
|
|
likewise resolved pairwise. However, the other constructs described in this
|
|
|
|
section consider all of their inputs in one resolution step.
|
|
|
|
</para>
|
|
|
|
</example>
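<para>
 By contrast with the nested <literal>UNION</literal> case, constructs that
 consider all of their inputs in one step can resolve the same mix of
 <literal>NULL</literal> and <type>integer</type> inputs. The following
 sketch uses <function>pg_typeof</function> to report the selected type:
<programlisting>
-- VALUES resolves each column across all rows at once, so the unknown
-- NULLs are ignored and the column resolves to integer.
SELECT pg_typeof(x) FROM (VALUES (NULL), (NULL), (1)) AS t(x);

-- ARRAY likewise considers all elements together: integer[].
SELECT pg_typeof(ARRAY[NULL, NULL, 1]);

-- CASE with integer and numeric branches: integer can be implicitly
-- converted to numeric but not vice versa, so the result is numeric.
SELECT pg_typeof(CASE WHEN true THEN 1 ELSE 2.5 END);
</programlisting>
</para>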
|
2017-01-25 15:17:18 +01:00
|
|
|
</sect1>
|
|
|
|
|
|
|
|
<sect1 id="typeconv-select">
|
|
|
|
<title><literal>SELECT</literal> Output Columns</title>
|
|
|
|
|
|
|
|
<indexterm zone="typeconv-select">
|
|
|
|
<primary>SELECT</primary>
|
|
|
|
<secondary>determination of result type</secondary>
|
|
|
|
</indexterm>
|
|
|
|
|
|
|
|
<para>
|
|
|
|
The rules given in the preceding sections will result in assignment
|
2017-10-09 03:44:17 +02:00
|
|
|
of non-<type>unknown</type> data types to all expressions in a SQL query,
|
2017-01-25 15:17:18 +01:00
|
|
|
except for unspecified-type literals that appear as simple output
columns of a <command>SELECT</command> command. For example, in
<screen>
SELECT 'Hello World';
</screen>
there is nothing to identify what type the string literal should be
taken as. In this situation <productname>PostgreSQL</productname> will fall back
to resolving the literal's type as <type>text</type>.
</para>
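<para>
One way to observe this fallback, offered as a minimal sketch, is to wrap
the query in a subquery and apply <function>pg_typeof</function> to its
output column. The aliases <literal>greeting</literal> and
<literal>s</literal> are arbitrary names chosen for illustration, and the
result shown assumes <productname>PostgreSQL</productname> 10 or later:
<screen>
SELECT pg_typeof(greeting) FROM (SELECT 'Hello World' AS greeting) AS s;

 pg_typeof
-----------
 text
(1 row)
</screen>
</para>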
<para>
When the <command>SELECT</command> is one arm of a <literal>UNION</literal>
(or <literal>INTERSECT</literal> or <literal>EXCEPT</literal>) construct, or when it
appears within <command>INSERT ... SELECT</command>, this rule is not applied
since rules given in preceding sections take precedence. The type of an
unspecified-type literal can be taken from the other <literal>UNION</literal> arm
in the first case, or from the destination column in the second case.
</para>
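<para>
The two exceptions can be sketched as follows. In the first query the
literal <literal>'2'</literal> should take type <type>integer</type> from
the other <literal>UNION</literal> arm rather than falling back to
<type>text</type>. In the second, the literal should be resolved as
<type>date</type> from the destination column. The table
<literal>events</literal> and its column <literal>happened_on</literal>
are hypothetical names used only for this illustration:
<screen>
SELECT pg_typeof(n) FROM (SELECT 1 AS n UNION SELECT '2') AS s;

 pg_typeof
-----------
 integer
 integer
(2 rows)

CREATE TABLE events (happened_on date);
INSERT INTO events (happened_on) SELECT '2017-01-25';
</screen>
</para>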
<para>
<literal>RETURNING</literal> lists are treated the same as <command>SELECT</command>
output lists for this purpose.
</para>
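<para>
As a rough sketch of the <literal>RETURNING</literal> case, the following
uses a data-modifying <literal>WITH</literal> query so that the type of the
returned literal can be inspected. The table <literal>greetings</literal>
and the aliases <literal>ins</literal> and <literal>status</literal> are
hypothetical. Under the rule described above,
<function>pg_typeof</function> would be expected to report
<type>text</type> for the <literal>status</literal> column:
<screen>
CREATE TABLE greetings (id integer);

WITH ins AS (
    INSERT INTO greetings (id) VALUES (1)
    RETURNING 'inserted' AS status
)
SELECT pg_typeof(status) FROM ins;
</screen>
</para>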
<note>
<para>
Prior to <productname>PostgreSQL</productname> 10, this rule did not exist, and
unspecified-type literals in a <command>SELECT</command> output list were
left as type <type>unknown</type>. That had assorted bad consequences,
so the behavior was changed.
</para>
</note>
</sect1>
</chapter>