postgresql/doc/src/sgml/plpython.sgml

<!-- doc/src/sgml/plpython.sgml -->

<chapter id="plpython">
 <title>PL/Python &mdash; Python Procedural Language</title>

 <indexterm zone="plpython"><primary>PL/Python</primary></indexterm>
 <indexterm zone="plpython"><primary>Python</primary></indexterm>

 <para>
  The <application>PL/Python</application> procedural language allows
  <productname>PostgreSQL</productname> functions to be written in the
  <ulink url="https://www.python.org">Python language</ulink>.
 </para>

 <para>
  To install PL/Python in a particular database, use
  <literal>CREATE EXTENSION plpythonu</literal> (but
  see also <xref linkend="plpython-python23"/>).
 </para>

  <tip>
   <para>
    If a language is installed into <literal>template1</literal>, all subsequently
    created databases will have the language installed automatically.
   </para>
  </tip>

 <para>
  PL/Python is only available as an <quote>untrusted</quote> language, meaning
  it does not offer any way of restricting what users can do in it and
  is therefore named <literal>plpythonu</literal>.  A trusted
  variant <literal>plpython</literal> might become available in the future
  if a secure execution mechanism is developed in Python.  The
  writer of a function in untrusted PL/Python must take care that the
  function cannot be used to do anything unwanted, since it will be
  able to do anything that could be done by a user logged in as the
  database administrator.  Only superusers can create functions in
  untrusted languages such as <literal>plpythonu</literal>.
 </para>

 <note>
  <para>
   Users of source packages must specially enable the build of
   PL/Python during the installation process.  (Refer to the
   installation instructions for more information.)  Users of binary
   packages might find PL/Python in a separate subpackage.
  </para>
 </note>

 <sect1 id="plpython-python23">
  <title>Python 2 vs. Python 3</title>

  <para>
   PL/Python supports both the Python 2 and Python 3 language
   variants.  (The PostgreSQL installation instructions might contain
   more precise information about the exact supported minor versions
   of Python.)  Because the Python 2 and Python 3 language variants
   are incompatible in some important aspects, the following naming
   and transitioning scheme is used by PL/Python to avoid mixing them:

   <itemizedlist>
    <listitem>
     <para>
      The PostgreSQL language named <literal>plpython2u</literal>
      implements PL/Python based on the Python 2 language variant.
     </para>
    </listitem>

    <listitem>
     <para>
      The PostgreSQL language named <literal>plpython3u</literal>
      implements PL/Python based on the Python 3 language variant.
     </para>
    </listitem>

    <listitem>
     <para>
      The language named <literal>plpythonu</literal> implements
      PL/Python based on the default Python language variant, which is
      currently Python 2.  (This default is independent of what any
      local Python installations might consider to be
      their <quote>default</quote>, for example,
      what <filename>/usr/bin/python</filename> might be.)  The
      default will probably be changed to Python 3 in a distant future
      release of PostgreSQL, depending on the progress of the
      migration to Python 3 in the Python community.
     </para>
    </listitem>
   </itemizedlist>

   This scheme is analogous to the recommendations in <ulink
   url="https://www.python.org/dev/peps/pep-0394/">PEP 394</ulink> regarding the
   naming and transitioning of the <command>python</command> command.
  </para>

  <para>
   It depends on the build configuration or the installed packages
   whether PL/Python for Python 2 or Python 3 or both are available.
  </para>

  <tip>
   <para>
    The built variant depends on which Python version was found during
    the installation or which version was explicitly set using
    the <envar>PYTHON</envar> environment variable;
    see <xref linkend="install-procedure"/>.  To make both variants of
    PL/Python available in one installation, the source tree has to be
    configured and built twice.
   </para>
  </tip>

  <para>
   This results in the following usage and migration strategy:

   <itemizedlist>
    <listitem>
     <para>
      Existing users and users who are currently not interested in
      Python 3 use the language name <literal>plpythonu</literal> and
      don't have to change anything for the foreseeable future.  It is
      recommended to gradually <quote>future-proof</quote> the code
      via migration to Python 2.6/2.7 to simplify the eventual
      migration to Python 3.
     </para>

     <para>
      In practice, many PL/Python functions will migrate to Python 3
      with few or no changes.
     </para>
    </listitem>

    <listitem>
     <para>
      Users who know that they have heavily Python 2 dependent code
      and don't plan to ever change it can make use of
      the <literal>plpython2u</literal> language name.  This will
      continue to work into the very distant future, until Python 2
      support might be completely dropped by PostgreSQL.
     </para>
    </listitem>

    <listitem>
     <para>
      Users who want to dive into Python 3 can use
      the <literal>plpython3u</literal> language name, which will keep
      working forever by today's standards.  In the distant future,
      when Python 3 might become the default, they might like to
      remove the <quote>3</quote> for aesthetic reasons.
     </para>
    </listitem>

    <listitem>
     <para>
      Daredevils, who want to build a Python-3-only operating system
      environment, can change the contents of
      <literal>plpythonu</literal>'s extension control and script files
      to make <literal>plpythonu</literal> be equivalent
      to <literal>plpython3u</literal>, keeping in mind that this
      would make their installation incompatible with most of the rest
      of the world.
     </para>
    </listitem>
   </itemizedlist>
  </para>

  <para>
   See also the
   document <ulink url="https://docs.python.org/3/whatsnew/3.0.html">What's
   New In Python 3.0</ulink> for more information about porting to
   Python 3.
  </para>

  <para>
   It is not allowed to use PL/Python based on Python 2 and PL/Python
   based on Python 3 in the same session, because the symbols in the
   dynamic modules would clash, which could result in crashes of the
   PostgreSQL server process.  There is a check that prevents mixing
   Python major versions in a session, which will abort the session if
   a mismatch is detected.  It is possible, however, to use both
   PL/Python variants in the same database, from separate sessions.
  </para>
 </sect1>

 <sect1 id="plpython-funcs">
  <title>PL/Python Functions</title>

  <para>
   Functions in PL/Python are declared via the
   standard <xref linkend="sql-createfunction"/> syntax:

<programlisting>
CREATE FUNCTION <replaceable>funcname</replaceable> (<replaceable>argument-list</replaceable>)
  RETURNS <replaceable>return-type</replaceable>
AS $$
  # PL/Python function body
$$ LANGUAGE plpythonu;
</programlisting>
  </para>

  <para>
   The body of a function is simply a Python script. When the function
   is called, its arguments are passed as elements of the list
   <varname>args</varname>; named arguments are also passed as
   ordinary variables to the Python script.  Use of named arguments is
   usually more readable.  The result is returned from the Python code
   in the usual way, with <literal>return</literal> or
   <literal>yield</literal> (in case of a result-set statement).  If
   you do not provide a return value, Python returns the default
   <symbol>None</symbol>. <application>PL/Python</application> translates
   Python's <symbol>None</symbol> into the SQL null value.  In a procedure,
   the result from the Python code must be <symbol>None</symbol> (typically
   achieved by ending the procedure without a <literal>return</literal>
   statement or by using a <literal>return</literal> statement without
   argument); otherwise, an error will be raised.
  </para>

  <para>
   For example, a function to return the greater of two integers can be
   defined as:

<programlisting>
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if a &gt; b:
    return a
  return b
$$ LANGUAGE plpythonu;
</programlisting>

   The Python code that is given as the body of the function definition
   is transformed into a Python function. For example, the above results in:

<programlisting>
def __plpython_procedure_pymax_23456():
  if a &gt; b:
    return a
  return b
</programlisting>

   assuming that 23456 is the OID assigned to the function by
   <productname>PostgreSQL</productname>.
  </para>

  <para>
   The arguments are set as global variables.  Because of the scoping
   rules of Python, this has the subtle consequence that an argument
   variable cannot be reassigned inside the function to the value of
   an expression that involves the variable name itself, unless the
   variable is redeclared as global in the block.  For example, the
   following won't work:
<programlisting>
CREATE FUNCTION pystrip(x text)
  RETURNS text
AS $$
  x = x.strip()  # error
  return x
$$ LANGUAGE plpythonu;
</programlisting>
   because assigning to <varname>x</varname>
   makes <varname>x</varname> a local variable for the entire block,
   and so the <varname>x</varname> on the right-hand side of the
   assignment refers to a not-yet-assigned local
   variable <varname>x</varname>, not the PL/Python function
   parameter.  Using the <literal>global</literal> statement, this can
   be made to work:
<programlisting>
CREATE FUNCTION pystrip(x text)
  RETURNS text
AS $$
  global x
  x = x.strip()  # ok now
  return x
$$ LANGUAGE plpythonu;
</programlisting>
   But it is advisable not to rely on this implementation detail of
   PL/Python.  It is better to treat the function parameters as
   read-only.
  </para>
 </sect1>

 <sect1 id="plpython-data">
  <title>Data Values</title>
  <para>
   Generally speaking, the aim of PL/Python is to provide
   a <quote>natural</quote> mapping between the PostgreSQL and the
   Python worlds.  This informs the data mapping rules described
   below.
  </para>

  <sect2>
   <title>Data Type Mapping</title>
   <para>
    When a PL/Python function is called, its arguments are converted from
    their PostgreSQL data type to a corresponding Python type:

    <itemizedlist>
     <listitem>
      <para>
       PostgreSQL <type>boolean</type> is converted to Python <type>bool</type>.
      </para>
     </listitem>

     <listitem>
      <para>
       PostgreSQL <type>smallint</type> and <type>int</type> are
       converted to Python <type>int</type>.
       PostgreSQL <type>bigint</type> and <type>oid</type> are converted
       to <type>long</type> in Python 2 and to <type>int</type> in
       Python 3.
      </para>
     </listitem>

     <listitem>
      <para>
       PostgreSQL <type>real</type> and <type>double</type> are converted to
       Python <type>float</type>.
      </para>
     </listitem>

     <listitem>
      <para>
       PostgreSQL <type>numeric</type> is converted to
       Python <type>Decimal</type>.  This type is imported from
       the <literal>cdecimal</literal> package if that is available.
       Otherwise,
       <literal>decimal.Decimal</literal> from the standard library will be
       used.  <literal>cdecimal</literal> is significantly faster
       than <literal>decimal</literal>.  In Python 3.3 and up,
       however, <literal>cdecimal</literal> has been integrated into the
       standard library under the name <literal>decimal</literal>, so there is
       no longer any difference.
      </para>
     </listitem>

     <listitem>
      <para>
       PostgreSQL <type>bytea</type> is converted to
       Python <type>str</type> in Python 2 and to <type>bytes</type>
       in Python 3.  In Python 2, the string should be treated as a
       byte sequence without any character encoding.
      </para>
     </listitem>

     <listitem>
      <para>
       All other data types, including the PostgreSQL character string
       types, are converted to a Python <type>str</type>.  In Python
       2, this string will be in the PostgreSQL server encoding; in
       Python 3, it will be a Unicode string like all strings.
      </para>
     </listitem>

     <listitem>
      <para>
       For nonscalar data types, see below.
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <para>
    When a PL/Python function returns, its return value is converted to the
    function's declared PostgreSQL return data type as follows:

    <itemizedlist>
     <listitem>
      <para>
       When the PostgreSQL return type is <type>boolean</type>, the
       return value will be evaluated for truth according to the
       <emphasis>Python</emphasis> rules.  That is, 0 and empty string
       are false, but notably <literal>'f'</literal> is true.
      </para>
     </listitem>

     <listitem>
      <para>
       When the PostgreSQL return type is <type>bytea</type>, the
       return value will be converted to a string (Python 2) or bytes
       (Python 3) using the respective Python built-ins, with the
       result being converted to <type>bytea</type>.
      </para>
     </listitem>

     <listitem>
      <para>
       For all other PostgreSQL return types, the return value is converted
       to a string using the Python built-in <literal>str</literal>, and the
       result is passed to the input function of the PostgreSQL data type.
       (If the Python value is a <type>float</type>, it is converted using
       the <literal>repr</literal> built-in instead of <literal>str</literal>, to
       avoid loss of precision.)
      </para>

      <para>
       Strings in Python 2 are required to be in the PostgreSQL server
       encoding when they are passed to PostgreSQL.  Strings that are
       not valid in the current server encoding will raise an error,
       but not all encoding mismatches can be detected, so garbage
       data can still result when this is not done correctly.  Unicode
       strings are converted to the correct encoding automatically, so
       it can be safer and more convenient to use those.  In Python 3,
       all strings are Unicode strings.
      </para>
     </listitem>

     <listitem>
      <para>
       For nonscalar data types, see below.
      </para>
     </listitem>
    </itemizedlist>

    Note that logical mismatches between the declared PostgreSQL
    return type and the Python data type of the actual return object
    are not flagged; the value will be converted in any case.
   </para>
  </sect2>

  <sect2>
   <title>Null, None</title>
  <para>
   If an SQL null value<indexterm><primary>null value</primary><secondary
   sortas="PL/Python">in PL/Python</secondary></indexterm> is passed to a
   function, the argument value will appear as <symbol>None</symbol> in
   Python. For example, the function definition of <function>pymax</function>
   shown in <xref linkend="plpython-funcs"/> will return the wrong answer for null
   inputs. We could add <literal>STRICT</literal> to the function definition
   to make <productname>PostgreSQL</productname> do something more reasonable:
   if a null value is passed, the function will not be called at all,
   but will just return a null result automatically. Alternatively,
   we could check for null inputs in the function body:

<programlisting>
CREATE FUNCTION pymax (a integer, b integer)
  RETURNS integer
AS $$
  if (a is None) or (b is None):
    return None
  if a &gt; b:
    return a
  return b
$$ LANGUAGE plpythonu;
</programlisting>

   As shown above, to return an SQL null value from a PL/Python
   function, return the value <symbol>None</symbol>. This can be done whether the
   function is strict or not.
  </para>
  </sect2>

  <sect2 id="plpython-arrays">
   <title>Arrays, Lists</title>
  <para>
   SQL array values are passed into PL/Python as a Python list.  To
   return an SQL array value out of a PL/Python function, return a
   Python list:

<programlisting>
CREATE FUNCTION return_arr()
  RETURNS int[]
AS $$
return [1, 2, 3, 4, 5]
$$ LANGUAGE plpythonu;

SELECT return_arr();
 return_arr
-------------
 {1,2,3,4,5}
(1 row)
</programlisting>

   Multidimensional arrays are passed into PL/Python as nested Python lists.
   A 2-dimensional array is a list of lists, for example. When returning
   a multi-dimensional SQL array out of a PL/Python function, the inner
   lists at each level must all be of the same size. For example:

<programlisting>
CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[] AS $$
plpy.info(x, type(x))
return x
$$ LANGUAGE plpythonu;

SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
INFO:  ([[1, 2, 3], [4, 5, 6]], &lt;type 'list'&gt;)
 test_type_conversion_array_int4
---------------------------------
 {{1,2,3},{4,5,6}}
(1 row)
</programlisting>

   Other Python sequences, like tuples, are also accepted for
   backwards-compatibility with PostgreSQL versions 9.6 and below, when
   multi-dimensional arrays were not supported. However, they are always
   treated as one-dimensional arrays, because they are ambiguous with
   composite types. For the same reason, when a composite type is used in a
   multi-dimensional array, it must be represented by a tuple, rather than a
   list.
  </para>
  <para>
   Note that in Python, strings are sequences, which can have
   undesirable effects that might be familiar to Python programmers:

<programlisting>
CREATE FUNCTION return_str_arr()
  RETURNS varchar[]
AS $$
return "hello"
$$ LANGUAGE plpythonu;

SELECT return_str_arr();
 return_str_arr
----------------
 {h,e,l,l,o}
(1 row)
</programlisting>
  </para>
  </sect2>

  <sect2>
   <title>Composite Types</title>
  <para>
   Composite-type arguments are passed to the function as Python mappings. The
   element names of the mapping are the attribute names of the composite type.
   If an attribute in the passed row has the null value, it has the value
   <symbol>None</symbol> in the mapping. Here is an example:

<programlisting>
CREATE TABLE employee (
  name text,
  salary integer,
  age integer
);

CREATE FUNCTION overpaid (e employee)
  RETURNS boolean
AS $$
  if e["salary"] &gt; 200000:
    return True
  if (e["age"] &lt; 30) and (e["salary"] &gt; 100000):
    return True
  return False
$$ LANGUAGE plpythonu;
</programlisting>
  </para>

  <para>
   There are multiple ways to return row or composite types from a Python
   function. The following examples assume we have:

<programlisting>
CREATE TYPE named_value AS (
  name   text,
  value  integer
);
</programlisting>

   A composite result can be returned as a:

   <variablelist>
    <varlistentry>
     <term>Sequence type (a tuple or list, but not a set because
     it is not indexable)</term>
     <listitem>
      <para>
       Returned sequence objects must have the same number of items as the
       composite result type has fields. The item with index 0 is assigned to
       the first field of the composite type, 1 to the second and so on. For
       example:

<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return ( name, value )
  # or alternatively, as tuple: return [ name, value ]
$$ LANGUAGE plpythonu;
</programlisting>

       To return a SQL null for any column, insert <symbol>None</symbol> at
       the corresponding position.
      </para>
      <para>
       When an array of composite types is returned, it cannot be returned as a list,
       because it is ambiguous whether the Python list represents a composite type,
       or another array dimension.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>Mapping (dictionary)</term>
     <listitem>
      <para>
       The value for each result type column is retrieved from the mapping
       with the column name as key. Example:

<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  return { "name": name, "value": value }
$$ LANGUAGE plpythonu;
</programlisting>

       Any extra dictionary key/value pairs are ignored. Missing keys are
       treated as errors.
       To return a SQL null value for any column, insert
       <symbol>None</symbol> with the corresponding column name as the key.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>Object (any object providing method <literal>__getattr__</literal>)</term>
     <listitem>
      <para>
       This works the same as a mapping.
       Example:

<programlisting>
CREATE FUNCTION make_pair (name text, value integer)
  RETURNS named_value
AS $$
  class named_value:
    def __init__ (self, n, v):
      self.name = n
      self.value = v
  return named_value(name, value)

  # or simply
  class nv: pass
  nv.name = name
  nv.value = value
  return nv
$$ LANGUAGE plpythonu;
</programlisting>
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

   <para>
    Functions with <literal>OUT</literal> parameters are also supported.  For example:
<programlisting>
CREATE FUNCTION multiout_simple(OUT i integer, OUT j integer) AS $$
return (1, 2)
$$ LANGUAGE plpythonu;

SELECT * FROM multiout_simple();
</programlisting>
   </para>

   <para>
    Output parameters of procedures are passed back the same way.  For example:
<programlisting>
CREATE PROCEDURE python_triple(INOUT a integer, INOUT b integer) AS $$
return (a * 3, b * 3)
$$ LANGUAGE plpythonu;

CALL python_triple(5, 10);
</programlisting>
   </para>
  </sect2>

  <sect2>
   <title>Set-Returning Functions</title>
  <para>
   A <application>PL/Python</application> function can also return sets of
   scalar or composite types. There are several ways to achieve this because
   the returned object is internally turned into an iterator. The following
   examples assume we have composite type:

<programlisting>
CREATE TYPE greeting AS (
  how text,
  who text
);
</programlisting>

   A set result can be returned from a:

   <variablelist>
    <varlistentry>
     <term>Sequence type (tuple, list, set)</term>
     <listitem>
      <para>
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  # return tuple containing lists as composite types
  # all other combinations work also
  return ( [ how, "World" ], [ how, "PostgreSQL" ], [ how, "PL/Python" ] )
$$ LANGUAGE plpythonu;
</programlisting>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>Iterator (any object providing <symbol>__iter__</symbol> and
      <symbol>next</symbol> methods)</term>
     <listitem>
      <para>
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  class producer:
    def __init__ (self, how, who):
      self.how = how
      self.who = who
      self.ndx = -1

    def __iter__ (self):
      return self

    def next (self):
      self.ndx += 1
      if self.ndx == len(self.who):
        raise StopIteration
      return ( self.how, self.who[self.ndx] )

  return producer(how, [ "World", "PostgreSQL", "PL/Python" ])
$$ LANGUAGE plpythonu;
</programlisting>
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term>Generator (<literal>yield</literal>)</term>
     <listitem>
      <para>
<programlisting>
CREATE FUNCTION greet (how text)
  RETURNS SETOF greeting
AS $$
  for who in [ "World", "PostgreSQL", "PL/Python" ]:
    yield ( how, who )
$$ LANGUAGE plpythonu;
</programlisting>

      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

   <para>
    Set-returning functions with <literal>OUT</literal> parameters
    (using <literal>RETURNS SETOF record</literal>) are also
    supported.  For example:
<programlisting>
CREATE FUNCTION multiout_simple_setof(n integer, OUT integer, OUT integer) RETURNS SETOF record AS $$
return [(1, 2)] * n
$$ LANGUAGE plpythonu;

SELECT * FROM multiout_simple_setof(3);
</programlisting>
   </para>
  </sect2>
 </sect1>

 <sect1 id="plpython-sharing">
  <title>Sharing Data</title>
  <para>
   The global dictionary <varname>SD</varname> is available to store
   private data between repeated calls to the same function.
   The global dictionary <varname>GD</varname> is public data,
   that is available to all Python functions within a session;  use with
   care.<indexterm><primary>global data</primary>
   <secondary>in PL/Python</secondary></indexterm>
  </para>

  <para>
   Each function gets its own execution environment in the
   Python interpreter, so that global data and function arguments from
   <function>myfunc</function> are not available to
   <function>myfunc2</function>.  The exception is the data in the
   <varname>GD</varname> dictionary, as mentioned above.
  </para>
 </sect1>

 <sect1 id="plpython-do">
  <title>Anonymous Code Blocks</title>

  <para>
   PL/Python also supports anonymous code blocks called with the
   <xref linkend="sql-do"/> statement:

<programlisting>
DO $$
    # PL/Python code
$$ LANGUAGE plpythonu;
</programlisting>

   An anonymous code block receives no arguments, and whatever value it
   might return is discarded.  Otherwise it behaves just like a function.
  </para>
 </sect1>

 <sect1 id="plpython-trigger">
  <title>Trigger Functions</title>

  <indexterm zone="plpython-trigger">
   <primary>trigger</primary>
   <secondary>in PL/Python</secondary>
  </indexterm>

  <para>
   When a function is used as a trigger, the dictionary
   <literal>TD</literal> contains trigger-related values:
   <variablelist>
    <varlistentry>
     <term><literal>TD["event"]</literal></term>
     <listitem>
      <para>
       contains the event as a string:
       <literal>INSERT</literal>, <literal>UPDATE</literal>,
       <literal>DELETE</literal>, or <literal>TRUNCATE</literal>.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["when"]</literal></term>
     <listitem>
      <para>
       contains one of <literal>BEFORE</literal>, <literal>AFTER</literal>, or
       <literal>INSTEAD OF</literal>.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["level"]</literal></term>
     <listitem>
      <para>
       contains <literal>ROW</literal> or <literal>STATEMENT</literal>.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["new"]</literal></term>
     <term><literal>TD["old"]</literal></term>
     <listitem>
      <para>
       For a row-level trigger, one or both of these fields contain
       the respective trigger rows, depending on the trigger event.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["name"]</literal></term>
     <listitem>
      <para>
       contains the trigger name.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["table_name"]</literal></term>
     <listitem>
      <para>
       contains the name of the table on which the trigger occurred.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["table_schema"]</literal></term>
     <listitem>
      <para>
       contains the schema of the table on which the trigger occurred.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["relid"]</literal></term>
     <listitem>
      <para>
       contains the OID of the table on which the trigger occurred.
      </para>
     </listitem>
    </varlistentry>

    <varlistentry>
     <term><literal>TD["args"]</literal></term>
     <listitem>
      <para>
       If the <command>CREATE TRIGGER</command> command
       included arguments, they are available in <literal>TD["args"][0]</literal> to
       <literal>TD["args"][<replaceable>n</replaceable>-1]</literal>.
      </para>
     </listitem>
    </varlistentry>
   </variablelist>
  </para>

  <para>
   If <literal>TD["when"]</literal> is <literal>BEFORE</literal> or
   <literal>INSTEAD OF</literal> and
   <literal>TD["level"]</literal> is <literal>ROW</literal>, you can
   return <literal>None</literal> or <literal>"OK"</literal> from the
   Python function to indicate the row is unmodified,
   <literal>"SKIP"</literal> to abort the event, or if <literal>TD["event"]</literal>
   is <command>INSERT</command> or <command>UPDATE</command> you can return
   <literal>"MODIFY"</literal> to indicate you've modified the new row.
   Otherwise the return value is ignored.
  </para>
 </sect1>

 <sect1 id="plpython-database">
  <title>Database Access</title>

  <para>
   The PL/Python language module automatically imports a Python module
   called <literal>plpy</literal>.  The functions and constants in
   this module are available to you in the Python code as
   <literal>plpy.<replaceable>foo</replaceable></literal>.
  </para>

  <sect2>
    <title>Database Access Functions</title>

  <para>
   The <literal>plpy</literal> module provides several functions to execute
   database commands:
  </para>

  <variablelist>
   <varlistentry>
    <term><literal>plpy.<function>execute</function>(<replaceable>query</replaceable> [, <replaceable>max-rows</replaceable>])</literal></term>
    <listitem>
     <para>
      Calling <function>plpy.execute</function> with a query string and an
      optional row limit argument causes that query to be run and the result to
      be returned in a result object.
     </para>

     <para>
      The result object emulates a list or dictionary object.  The result
      object can be accessed by row number and column name.  For example:
<programlisting>
rv = plpy.execute("SELECT * FROM my_table", 5)
</programlisting>
      returns up to 5 rows from <literal>my_table</literal>.  If
      <literal>my_table</literal> has a column
      <literal>my_column</literal>, it would be accessed as:
<programlisting>
foo = rv[i]["my_column"]
</programlisting>
      The number of rows returned can be obtained using the built-in
      <function>len</function> function.
     </para>

     <para>
      The result object has these additional methods:
      <variablelist>
       <varlistentry>
        <term><literal><function>nrows</function>()</literal></term>
        <listitem>
         <para>
          Returns the number of rows processed by the command.  Note that this
          is not necessarily the same as the number of rows returned.  For
          example, an <command>UPDATE</command> command will set this value but
          won't return any rows (unless <literal>RETURNING</literal> is used).
         </para>
        </listitem>
       </varlistentry>

       <varlistentry>
        <term><literal><function>status</function>()</literal></term>
        <listitem>
         <para>
          The <function>SPI_execute()</function> return value.
         </para>
        </listitem>
       </varlistentry>

       <varlistentry>
        <term><literal><function>colnames</function>()</literal></term>
        <term><literal><function>coltypes</function>()</literal></term>
        <term><literal><function>coltypmods</function>()</literal></term>
        <listitem>
         <para>
          Return a list of column names, list of column type OIDs, and list of
          type-specific type modifiers for the columns, respectively.
         </para>

         <para>
          These methods raise an exception when called on a result object from
          a command that did not produce a result set, e.g.,
          <command>UPDATE</command> without <literal>RETURNING</literal>, or
          <command>DROP TABLE</command>.  But it is OK to use these methods on
          a result set containing zero rows.
         </para>
        </listitem>
       </varlistentry>

       <varlistentry>
        <term><literal><function>__str__</function>()</literal></term>
        <listitem>
         <para>
          The standard <literal>__str__</literal> method is defined so that it
          is possible for example to debug query execution results
          using <literal>plpy.debug(rv)</literal>.
         </para>
        </listitem>
       </varlistentry>
      </variablelist>
     </para>

     <para>
      The result object can be modified.
     </para>

     <para>
      Note that calling <literal>plpy.execute</literal> will cause the entire
      result set to be read into memory.  Only use that function when you are
      sure that the result set will be relatively small.  If you don't want to
      risk excessive memory usage when fetching large results,
      use <literal>plpy.cursor</literal> rather
      than <literal>plpy.execute</literal>.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>plpy.<function>prepare</function>(<replaceable>query</replaceable> [, <replaceable>argtypes</replaceable>])</literal></term>
    <term><literal>plpy.<function>execute</function>(<replaceable>plan</replaceable> [, <replaceable>arguments</replaceable> [, <replaceable>max-rows</replaceable>]])</literal></term>
    <listitem>
     <para>
      <indexterm><primary>preparing a query</primary><secondary>in PL/Python</secondary></indexterm>
      <function>plpy.prepare</function> prepares the execution plan for a
      query.  It is called with a query string and a list of parameter types,
      if you have parameter references in the query.  For example:
<programlisting>
plan = plpy.prepare("SELECT last_name FROM my_users WHERE first_name = $1", ["text"])
</programlisting>
      <literal>text</literal> is the type of the variable you will be passing
      for <literal>$1</literal>.  The second argument is optional if you don't
      want to pass any parameters to the query.
     </para>
     <para>
      After preparing a statement, you use a variant of the
      function <function>plpy.execute</function> to run it:
<programlisting>
rv = plpy.execute(plan, ["name"], 5)
</programlisting>
      Pass the plan as the first argument (instead of the query string), and a
      list of values to substitute into the query as the second argument.  The
      second argument is optional if the query does not expect any parameters.
      The third argument is the optional row limit as before.
     </para>

     <para>
      Alternatively, you can call the <function>execute</function> method on
      the plan object:
<programlisting>
rv = plan.execute(["name"], 5)
</programlisting>
     </para>

     <para>
      Query parameters and result row fields are converted between PostgreSQL
      and Python data types as described in <xref linkend="plpython-data"/>.
     </para>

     <para>
      When you prepare a plan using the PL/Python module it is automatically
      saved.  Read the SPI documentation (<xref linkend="spi"/>) for a
      description of what this means.  In order to make effective use of this
      across function calls one needs to use one of the persistent storage
      dictionaries <literal>SD</literal> or <literal>GD</literal> (see
      <xref linkend="plpython-sharing"/>). For example:
<programlisting>
CREATE FUNCTION usesavedplan() RETURNS trigger AS $$
    if "plan" in SD:
        plan = SD["plan"]
    else:
        plan = plpy.prepare("SELECT 1")
        SD["plan"] = plan
    # rest of function
$$ LANGUAGE plpythonu;
</programlisting>
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><literal>plpy.<function>cursor</function>(<replaceable>query</replaceable>)</literal></term>
    <term><literal>plpy.<function>cursor</function>(<replaceable>plan</replaceable> [, <replaceable>arguments</replaceable>])</literal></term>
    <listitem>
     <para>
      The <literal>plpy.cursor</literal> function accepts the same arguments
      as <literal>plpy.execute</literal> (except for the row limit) and returns
      a cursor object, which allows you to process large result sets in smaller
      chunks.  As with <literal>plpy.execute</literal>, either a query string
      or a plan object along with a list of arguments can be used, or
      the <function>cursor</function> function can be called as a method of
      the plan object.
     </para>

     <para>
      The cursor object provides a <literal>fetch</literal> method that accepts
      an integer parameter and returns a result object.  Each time you
      call <literal>fetch</literal>, the returned object will contain the next
      batch of rows, never larger than the parameter value.  Once all rows are
      exhausted, <literal>fetch</literal> starts returning an empty result
      object.  Cursor objects also provide an
      <ulink url="https://docs.python.org/library/stdtypes.html#iterator-types">iterator
      interface</ulink>, yielding one row at a time until all rows are
      exhausted.  Data fetched that way is not returned as result objects, but
      rather as dictionaries, each dictionary corresponding to a single result
      row.
     </para>

     <para>
      An example of two ways of processing data from a large table is:
<programlisting>
CREATE FUNCTION count_odd_iterator() RETURNS integer AS $$
odd = 0
for row in plpy.cursor("select num from largetable"):
    if row['num'] % 2:
         odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_fetch(batch_size integer) RETURNS integer AS $$
odd = 0
cursor = plpy.cursor("select num from largetable")
while True:
    rows = cursor.fetch(batch_size)
    if not rows:
        break
    for row in rows:
        if row['num'] % 2:
            odd += 1
return odd
$$ LANGUAGE plpythonu;

CREATE FUNCTION count_odd_prepared() RETURNS integer AS $$
odd = 0
plan = plpy.prepare("select num from largetable where num % $1 &lt;&gt; 0", ["integer"])
rows = list(plpy.cursor(plan, [2]))  # or: = list(plan.cursor([2]))

return len(rows)
$$ LANGUAGE plpythonu;
</programlisting>
     </para>

     <para>
      Cursors are automatically disposed of.  But if you want to explicitly
      release all resources held by a cursor, use the <literal>close</literal>
      method.  Once closed, a cursor cannot be fetched from anymore.
     </para>

     <tip>
      <para>
        Do not confuse objects created by <literal>plpy.cursor</literal> with
        DB-API cursors as defined by
        the <ulink url="https://www.python.org/dev/peps/pep-0249/">Python
        Database API specification</ulink>.  They don't have anything in common
        except for the name.
      </para>
     </tip>
    </listitem>
   </varlistentry>
  </variablelist>

  </sect2>

  <sect2 id="plpython-trapping">
   <title>Trapping Errors</title>

   <para>
    Functions accessing the database might encounter errors, which
    will cause them to abort and raise an exception.  Both
    <function>plpy.execute</function> and
    <function>plpy.prepare</function> can raise an instance of a subclass of
    <literal>plpy.SPIError</literal>, which by default will terminate
    the function.  This error can be handled just like any other
    Python exception, by using the <literal>try/except</literal>
    construct.  For example:
<programlisting>
CREATE FUNCTION try_adding_joe() RETURNS text AS $$
    try:
        plpy.execute("INSERT INTO users(username) VALUES ('joe')")
    except plpy.SPIError:
        return "something went wrong"
    else:
        return "Joe added"
$$ LANGUAGE plpythonu;
</programlisting>
   </para>

   <para>
    The actual class of the exception being raised corresponds to the
    specific condition that caused the error.  Refer
    to <xref linkend="errcodes-table"/> for a list of possible
    conditions.  The module
    <literal>plpy.spiexceptions</literal> defines an exception class
    for each <productname>PostgreSQL</productname> condition, deriving
    their names from the condition name.  For
    instance, <literal>division_by_zero</literal>
    becomes <literal>DivisionByZero</literal>, <literal>unique_violation</literal>
    becomes <literal>UniqueViolation</literal>, <literal>fdw_error</literal>
    becomes <literal>FdwError</literal>, and so on.  Each of these
    exception classes inherits from <literal>SPIError</literal>.  This
    separation makes it easier to handle specific errors, for
    instance:
<programlisting>
CREATE FUNCTION insert_fraction(numerator int, denominator int) RETURNS text AS $$
from plpy import spiexceptions
try:
    plan = plpy.prepare("INSERT INTO fractions (frac) VALUES ($1 / $2)", ["int", "int"])
    plpy.execute(plan, [numerator, denominator])
except spiexceptions.DivisionByZero:
    return "denominator cannot equal zero"
except spiexceptions.UniqueViolation:
    return "already have that fraction"
except plpy.SPIError as e:
    return "other error, SQLSTATE %s" % e.sqlstate
else:
    return "fraction inserted"
$$ LANGUAGE plpythonu;
</programlisting>
    Note that because all exceptions from
    the <literal>plpy.spiexceptions</literal> module inherit
    from <literal>SPIError</literal>, an <literal>except</literal>
    clause handling it will catch any database access error.
   </para>

   <para>
    As an alternative way of handling different error conditions, you
    can catch the <literal>SPIError</literal> exception and determine
    the specific error condition inside the <literal>except</literal>
    block by looking at the <literal>sqlstate</literal> attribute of
    the exception object.  This attribute is a string value containing
    the <quote>SQLSTATE</quote> error code.  This approach provides
    approximately the same functionality
   </para>
  </sect2>
 </sect1>

 <sect1 id="plpython-subtransaction">
  <title>Explicit Subtransactions</title>

  <para>
   Recovering from errors caused by database access as described in
   <xref linkend="plpython-trapping"/> can lead to an undesirable
   situation where some operations succeed before one of them fails,
   and after recovering from that error the data is left in an
   inconsistent state.  PL/Python offers a solution to this problem in
   the form of explicit subtransactions.
  </para>

  <sect2>
   <title>Subtransaction Context Managers</title>

   <para>
    Consider a function that implements a transfer between two
    accounts:
<programlisting>
CREATE FUNCTION transfer_funds() RETURNS void AS $$
try:
    plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
    plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError as e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
</programlisting>
    If the second <literal>UPDATE</literal> statement results in an
    exception being raised, this function will report the error, but
    the result of the first <literal>UPDATE</literal> will
    nevertheless be committed.  In other words, the funds will be
    withdrawn from Joe's account, but will not be transferred to
    Mary's account.
   </para>

   <para>
    To avoid such issues, you can wrap your
    <literal>plpy.execute</literal> calls in an explicit
    subtransaction.  The <literal>plpy</literal> module provides a
    helper object to manage explicit subtransactions that gets created
    with the <literal>plpy.subtransaction()</literal> function.
    Objects created by this function implement the
    <ulink url="https://docs.python.org/library/stdtypes.html#context-manager-types">
    context manager interface</ulink>.  Using explicit subtransactions
    we can rewrite our function as:
<programlisting>
CREATE FUNCTION transfer_funds2() RETURNS void AS $$
try:
    with plpy.subtransaction():
        plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
        plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
except plpy.SPIError as e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"
plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
</programlisting>
    Note that the use of <literal>try/catch</literal> is still
    required.  Otherwise the exception would propagate to the top of
    the Python stack and would cause the whole function to abort with
    a <productname>PostgreSQL</productname> error, so that the
    <literal>operations</literal> table would not have any row
    inserted into it.  The subtransaction context manager does not
    trap errors, it only assures that all database operations executed
    inside its scope will be atomically committed or rolled back.  A
    rollback of the subtransaction block occurs on any kind of
    exception exit, not only ones caused by errors originating from
    database access.  A regular Python exception raised inside an
    explicit subtransaction block would also cause the subtransaction
    to be rolled back.
   </para>
  </sect2>

  <sect2>
   <title>Older Python Versions</title>

   <para>
    Context managers syntax using the <literal>with</literal> keyword
    is available by default in Python 2.6.  For compatibility with
    older Python versions, you can call the
    subtransaction manager's <literal>__enter__</literal> and
    <literal>__exit__</literal> functions using the
    <literal>enter</literal> and <literal>exit</literal> convenience
    aliases.  The example function that transfers funds could be
    written as:
<programlisting>
CREATE FUNCTION transfer_funds_old() RETURNS void AS $$
try:
    subxact = plpy.subtransaction()
    subxact.enter()
    try:
        plpy.execute("UPDATE accounts SET balance = balance - 100 WHERE account_name = 'joe'")
        plpy.execute("UPDATE accounts SET balance = balance + 100 WHERE account_name = 'mary'")
    except:
        import sys
        subxact.exit(*sys.exc_info())
        raise
    else:
        subxact.exit(None, None, None)
except plpy.SPIError as e:
    result = "error transferring funds: %s" % e.args
else:
    result = "funds transferred correctly"

plan = plpy.prepare("INSERT INTO operations (result) VALUES ($1)", ["text"])
plpy.execute(plan, [result])
$$ LANGUAGE plpythonu;
</programlisting>
   </para>
  </sect2>
 </sect1>

 <sect1 id="plpython-transactions">
  <title>Transaction Management</title>

  <para>
   In a procedure called from the top level or an anonymous code block
   (<command>DO</command> command) called from the top level it is possible to
   control transactions.  To commit the current transaction, call
   <literal>plpy.commit()</literal>.  To roll back the current transaction,
   call <literal>plpy.rollback()</literal>.  (Note that it is not possible to
   run the SQL commands <command>COMMIT</command> or
   <command>ROLLBACK</command> via <function>plpy.execute</function> or
   similar.  It has to be done using these functions.)  After a transaction is
   ended, a new transaction is automatically started, so there is no separate
   function for that.
  </para>

  <para>
   Here is an example:
<programlisting>
CREATE PROCEDURE transaction_test1()
LANGUAGE plpythonu
AS $$
for i in range(0, 10):
    plpy.execute("INSERT INTO test1 (a) VALUES (%d)" % i)
    if i % 2 == 0:
        plpy.commit()
    else:
        plpy.rollback()
$$;

CALL transaction_test1();
</programlisting>
  </para>

  <para>
   Transactions cannot be ended when an explicit subtransaction is active.
  </para>
 </sect1>

 <sect1 id="plpython-util">
  <title>Utility Functions</title>
  <para>
   The <literal>plpy</literal> module also provides the functions
   <simplelist>
    <member><literal>plpy.debug(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.log(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.info(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.notice(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.warning(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.error(<replaceable>msg, **kwargs</replaceable>)</literal></member>
    <member><literal>plpy.fatal(<replaceable>msg, **kwargs</replaceable>)</literal></member>
   </simplelist>
   <indexterm><primary>elog</primary><secondary>in PL/Python</secondary></indexterm>
   <function>plpy.error</function> and <function>plpy.fatal</function>
   actually raise a Python exception which, if uncaught, propagates out to
   the calling query, causing the current transaction or subtransaction to
   be aborted.  <literal>raise plpy.Error(<replaceable>msg</replaceable>)</literal> and
   <literal>raise plpy.Fatal(<replaceable>msg</replaceable>)</literal> are
   equivalent to calling <literal>plpy.error(<replaceable>msg</replaceable>)</literal> and
   <literal>plpy.fatal(<replaceable>msg</replaceable>)</literal>, respectively but
   the <literal>raise</literal> form does not allow passing keyword arguments.
   The other functions only generate messages of different priority levels.
   Whether messages of a particular priority are reported to the client,
   written to the server log, or both is controlled by the
   <xref linkend="guc-log-min-messages"/> and
   <xref linkend="guc-client-min-messages"/> configuration
   variables. See <xref linkend="runtime-config"/> for more information.
  </para>

  <para>
   The <replaceable>msg</replaceable> argument is given as a positional argument.  For
   backward compatibility, more than one positional argument can be given. In
   that case, the string representation of the tuple of positional arguments
   becomes the message reported to the client.
  </para>

  <para>
   The following keyword-only arguments are accepted:
   <simplelist>
    <member><literal>detail</literal></member>
    <member><literal>hint</literal></member>
    <member><literal>sqlstate</literal></member>
    <member><literal>schema_name</literal></member>
    <member><literal>table_name</literal></member>
    <member><literal>column_name</literal></member>
    <member><literal>datatype_name</literal></member>
    <member><literal>constraint_name</literal></member>
   </simplelist>
   The string representation of the objects passed as keyword-only arguments
   is used to enrich the messages reported to the client. For example:

<programlisting>
CREATE FUNCTION raise_custom_exception() RETURNS void AS $$
plpy.error("custom exception message",
           detail="some info about exception",
           hint="hint for users")
$$ LANGUAGE plpythonu;

=# SELECT raise_custom_exception();
ERROR:  plpy.Error: custom exception message
DETAIL:  some info about exception
HINT:  hint for users
CONTEXT:  Traceback (most recent call last):
  PL/Python function "raise_custom_exception", line 4, in &lt;module&gt;
    hint="hint for users")
PL/Python function "raise_custom_exception"
</programlisting>
  </para>

  <para>
   Another set of utility functions are
   <literal>plpy.quote_literal(<replaceable>string</replaceable>)</literal>,
   <literal>plpy.quote_nullable(<replaceable>string</replaceable>)</literal>, and
   <literal>plpy.quote_ident(<replaceable>string</replaceable>)</literal>.  They
   are equivalent to the built-in quoting functions described in <xref
   linkend="functions-string"/>.  They are useful when constructing
   ad-hoc queries.  A PL/Python equivalent of dynamic SQL from <xref
   linkend="plpgsql-quote-literal-example"/> would be:
<programlisting>
plpy.execute("UPDATE tbl SET %s = %s WHERE key = %s" % (
    plpy.quote_ident(colname),
    plpy.quote_nullable(newvalue),
    plpy.quote_literal(keyvalue)))
</programlisting>
  </para>
 </sect1>

 <sect1 id="plpython-envar">
  <title>Environment Variables</title>

  <para>
   Some of the environment variables that are accepted by the Python
   interpreter can also be used to affect PL/Python behavior.  They
   would need to be set in the environment of the main PostgreSQL
   server process, for example in a start script.  The available
   environment variables depend on the version of Python; see the
   Python documentation for details.  At the time of this writing, the
   following environment variables have an affect on PL/Python,
   assuming an adequate Python version:
   <itemizedlist>
    <listitem>
     <para><envar>PYTHONHOME</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONPATH</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONY2K</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONOPTIMIZE</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONDEBUG</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONVERBOSE</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONCASEOK</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONDONTWRITEBYTECODE</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONIOENCODING</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONUSERBASE</envar></para>
    </listitem>

    <listitem>
     <para><envar>PYTHONHASHSEED</envar></para>
    </listitem>
   </itemizedlist>

   (It appears to be a Python implementation detail beyond the control
   of PL/Python that some of the environment variables listed on
   the <command>python</command> man page are only effective in a
   command-line interpreter and not an embedded Python interpreter.)
  </para>
 </sect1>
</chapter>