postgresql/doc/src/sgml/amcheck.sgml

342 lines
15 KiB
Plaintext

<!-- doc/src/sgml/amcheck.sgml -->
<sect1 id="amcheck" xreflabel="amcheck">
<title>amcheck</title>
<indexterm zone="amcheck">
<primary>amcheck</primary>
</indexterm>
<para>
The <filename>amcheck</filename> module provides functions that allow you to
verify the logical consistency of the structure of relations. If the
structure appears to be valid, no error is raised.
</para>
<para>
The functions verify various <emphasis>invariants</emphasis> in the
structure of the representation of particular relations. The
correctness of the access method functions behind index scans and
other important operations relies on these invariants always
holding. For example, certain functions verify, among other things,
that all B-Tree pages have items in <quote>logical</quote> order (e.g.,
for B-Tree indexes on <type>text</type>, index tuples should be in
collated lexical order). If that particular invariant somehow fails
to hold, we can expect binary searches on the affected page to
incorrectly guide index scans, resulting in wrong answers to SQL
queries.
</para>
<para>
Verification is performed using the same procedures as those used by
index scans themselves, which may be user-defined operator class
code. For example, B-Tree index verification relies on comparisons
made with one or more B-Tree support function 1 routines. See <xref
linkend="xindex-support"/> for details of operator class support
functions.
</para>
<para>
<filename>amcheck</filename> functions may only be used by superusers.
</para>
<sect2>
<title>Functions</title>
<variablelist>
<varlistentry>
<term>
<function>bt_index_check(index regclass, heapallindexed boolean) returns void</function>
<indexterm>
<primary>bt_index_check</primary>
</indexterm>
</term>
<listitem>
<para>
<function>bt_index_check</function> tests that its target, a
B-Tree index, respects a variety of invariants. Example usage:
<screen>
test=# SELECT bt_index_check(index =&gt; c.oid, heapallindexed =&gt; i.indisunique),
c.relname,
c.relpages
FROM pg_index i
JOIN pg_opclass op ON i.indclass[0] = op.oid
JOIN pg_am am ON op.opcmethod = am.oid
JOIN pg_class c ON i.indexrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE am.amname = 'btree' AND n.nspname = 'pg_catalog'
-- Don't check temp tables, which may be from another session:
AND c.relpersistence != 't'
-- Function may throw an error when this is omitted:
AND c.relkind = 'i' AND i.indisready AND i.indisvalid
ORDER BY c.relpages DESC LIMIT 10;
bt_index_check | relname | relpages
----------------+---------------------------------+----------
| pg_depend_reference_index | 43
| pg_depend_depender_index | 40
| pg_proc_proname_args_nsp_index | 31
| pg_description_o_c_o_index | 21
| pg_attribute_relid_attnam_index | 14
| pg_proc_oid_index | 10
| pg_attribute_relid_attnum_index | 9
| pg_amproc_fam_proc_index | 5
| pg_amop_opr_fam_index | 5
| pg_amop_fam_strat_index | 5
(10 rows)
</screen>
This example shows a session that performs verification of the
10 largest catalog indexes in the database <quote>test</quote>.
Verification of the presence of heap tuples as index tuples is
requested for the subset that are unique indexes. Since no
error is raised, all indexes tested appear to be logically
consistent. Naturally, this query could easily be changed to
call <function>bt_index_check</function> for every index in the
database where verification is supported.
</para>
<para>
<function>bt_index_check</function> acquires an <literal>AccessShareLock</literal>
on the target index and the heap relation it belongs to. This lock mode
is the same lock mode acquired on relations by simple
<literal>SELECT</literal> statements.
<function>bt_index_check</function> does not verify invariants
that span child/parent relationships, but will verify the
presence of all heap tuples as index tuples within the index
when <parameter>heapallindexed</parameter> is
<literal>true</literal>. When a routine, lightweight test for
corruption is required in a live production environment, using
<function>bt_index_check</function> often provides the best
trade-off between thoroughness of verification and limiting the
impact on application performance and availability.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>
<function>bt_index_parent_check(index regclass, heapallindexed boolean, rootdescend boolean) returns void</function>
<indexterm>
<primary>bt_index_parent_check</primary>
</indexterm>
</term>
<listitem>
<para>
<function>bt_index_parent_check</function> tests that its
target, a B-Tree index, respects a variety of invariants.
Optionally, when the <parameter>heapallindexed</parameter>
argument is <literal>true</literal>, the function verifies the
presence of all heap tuples that should be found within the
index, and that there are no missing downlinks in the index
structure. When the optional <parameter>rootdescend</parameter>
argument is <literal>true</literal>, verification re-finds
tuples on the leaf level by performing a new search from the
root page for each tuple. The checks that can be performed by
<function>bt_index_parent_check</function> are a superset of the
checks that can be performed by <function>bt_index_check</function>.
<function>bt_index_parent_check</function> can be thought of as
a more thorough variant of <function>bt_index_check</function>:
unlike <function>bt_index_check</function>,
<function>bt_index_parent_check</function> also checks
invariants that span parent/child relationships.
<function>bt_index_parent_check</function> follows the general
convention of raising an error if it finds a logical
inconsistency or other problem.
</para>
<para>
A <literal>ShareLock</literal> is required on the target index by
<function>bt_index_parent_check</function> (a
<literal>ShareLock</literal> is also acquired on the heap relation).
These locks prevent concurrent data modification from
<command>INSERT</command>, <command>UPDATE</command>, and <command>DELETE</command>
commands. The locks also prevent the underlying relation from
being concurrently processed by <command>VACUUM</command>, as well as
all other utility commands. Note that the function holds locks
only while running, not for the entire transaction.
</para>
<para>
<function>bt_index_parent_check</function>'s additional
verification is more likely to detect various pathological
cases. These cases may involve an incorrectly implemented
B-Tree operator class used by the index that is checked, or,
hypothetically, undiscovered bugs in the underlying B-Tree index
access method code. Note that
<function>bt_index_parent_check</function> cannot be used when
Hot Standby mode is enabled (i.e., on read-only physical
replicas), unlike <function>bt_index_check</function>.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2>
<title>Optional <parameter>heapallindexed</parameter> Verification</title>
<para>
When the <parameter>heapallindexed</parameter> argument to
verification functions is <literal>true</literal>, an additional
phase of verification is performed against the table associated with
the target index relation. This consists of a <quote>dummy</quote>
<command>CREATE INDEX</command> operation, which checks for the
presence of all hypothetical new index tuples against a temporary,
in-memory summarizing structure (this is built when needed during
the basic first phase of verification). The summarizing structure
<quote>fingerprints</quote> every tuple found within the target
index. The high level principle behind
<parameter>heapallindexed</parameter> verification is that a new
index that is equivalent to the existing, target index must only
have entries that can be found in the existing structure.
</para>
<para>
The additional <parameter>heapallindexed</parameter> phase adds
significant overhead: verification will typically take several times
longer. However, there is no change to the relation-level locks
acquired when <parameter>heapallindexed</parameter> verification is
performed.
</para>
<para>
The summarizing structure is bound in size by
<varname>maintenance_work_mem</varname>. In order to ensure that
there is no more than a 2% probability of failure to detect an
inconsistency for each heap tuple that should be represented in the
index, approximately 2 bytes of memory are needed per tuple. As
less memory is made available per tuple, the probability of missing
an inconsistency slowly increases. This approach limits the
overhead of verification significantly, while only slightly reducing
the probability of detecting a problem, especially for installations
where verification is treated as a routine maintenance task. Any
single absent or malformed tuple has a new opportunity to be
detected with each new verification attempt.
</para>
</sect2>
<sect2>
<title>Using <filename>amcheck</filename> Effectively</title>
<para>
<filename>amcheck</filename> can be effective at detecting various types of
failure modes that <link
linkend="app-initdb-data-checksums"><application>data page
checksums</application></link> will always fail to catch. These include:
<itemizedlist>
<listitem>
<para>
Structural inconsistencies caused by incorrect operator class
implementations.
</para>
<para>
This includes issues caused by the comparison rules of operating
system collations changing. Comparisons of datums of a collatable
type like <type>text</type> must be immutable (just as all
comparisons used for B-Tree index scans must be immutable), which
implies that operating system collation rules must never change.
Though rare, updates to operating system collation rules can
cause these issues. More commonly, an inconsistency in the
collation order between a master server and a standby server is
implicated, possibly because the <emphasis>major</emphasis> operating
system version in use is inconsistent. Such inconsistencies will
generally only arise on standby servers, and so can generally
only be detected on standby servers.
</para>
<para>
If a problem like this arises, it may not affect each individual
index that is ordered using an affected collation, simply because
<emphasis>indexed</emphasis> values might happen to have the same
absolute ordering regardless of the behavioral inconsistency. See
<xref linkend="locale"/> and <xref linkend="collation"/> for
further details about how <productname>PostgreSQL</productname> uses
operating system locales and collations.
</para>
</listitem>
<listitem>
<para>
Structural inconsistencies between indexes and the heap relations
that are indexed (when <parameter>heapallindexed</parameter>
verification is performed).
</para>
<para>
There is no cross-checking of indexes against their heap relation
during normal operation. Symptoms of heap corruption can be subtle.
</para>
</listitem>
<listitem>
<para>
Corruption caused by hypothetical undiscovered bugs in the
underlying <productname>PostgreSQL</productname> access method
code, sort code, or transaction management code.
</para>
<para>
Automatic verification of the structural integrity of indexes
plays a role in the general testing of new or proposed
<productname>PostgreSQL</productname> features that could plausibly allow a
logical inconsistency to be introduced. Verification of table
structure and associated visibility and transaction status
information plays a similar role. One obvious testing strategy
is to call <filename>amcheck</filename> functions continuously
when running the standard regression tests. See <xref
linkend="regress-run"/> for details on running the tests.
</para>
</listitem>
<listitem>
<para>
File system or storage subsystem faults where checksums happen to
simply not be enabled.
</para>
<para>
Note that <filename>amcheck</filename> examines a page as represented in some
shared memory buffer at the time of verification if there is only a
shared buffer hit when accessing the block. Consequently,
<filename>amcheck</filename> does not necessarily examine data read from the
file system at the time of verification. Note that when checksums are
enabled, <filename>amcheck</filename> may raise an error due to a checksum
failure when a corrupt block is read into a buffer.
</para>
</listitem>
<listitem>
<para>
Corruption caused by faulty RAM, or the broader memory subsystem.
</para>
<para>
<productname>PostgreSQL</productname> does not protect against correctable
memory errors and it is assumed you will operate using RAM that
uses industry standard Error Correcting Codes (ECC) or better
protection. However, ECC memory is typically only immune to
single-bit errors, and should not be assumed to provide
<emphasis>absolute</emphasis> protection against failures that
result in memory corruption.
</para>
<para>
When <parameter>heapallindexed</parameter> verification is
performed, there is generally a greatly increased chance of
detecting single-bit errors, since strict binary equality is
tested, and the indexed attributes within the heap are tested.
</para>
</listitem>
</itemizedlist>
In general, <filename>amcheck</filename> can only prove the presence of
corruption; it cannot prove its absence.
</para>
</sect2>
<sect2>
<title>Repairing Corruption</title>
<para>
No error concerning corruption raised by <filename>amcheck</filename> should
ever be a false positive. <filename>amcheck</filename> raises
errors in the event of conditions that, by definition, should never
happen, and so careful analysis of <filename>amcheck</filename>
errors is often required.
</para>
<para>
There is no general method of repairing problems that
<filename>amcheck</filename> detects. An explanation for the root cause of
an invariant violation should be sought. <xref
linkend="pageinspect"/> may play a useful role in diagnosing
corruption that <filename>amcheck</filename> detects. A <command>REINDEX</command>
may not be effective in repairing corruption.
</para>
</sect2>
</sect1>