Commit Graph

7134 Commits

Alvaro Herrera 0cd711271d
Better handle indirect constraint drops
In certain cases it is possible to remove not-null constraints without
keeping attnotnull in its correct state; for example, if you drop a
column that's part of the primary key and the other columns of the PK
don't have not-null constraints of their own, then we should reset the
attnotnull flags for those other columns; up to this commit, we didn't.
Handle those cases
better by doing the attnotnull reset in RemoveConstraintById() instead
of in dropconstraint_internal().

However, there are some cases where we must not do so.  For example,
if those other columns are in replica identity indexes or are generated
identity columns, we must keep attnotnull set, even though that leaves
a catalog inconsistency: the flag is set with no not-null constraint
supporting it.

Because the attnotnull reset now happens in more places than before, for
instance when a column of the primary key changes type, we need an
additional trick to reinstate it as necessary.  Introduce a new
alter-table pass that does this, which simply needs to reschedule some
AT_SetAttNotNull subcommands that were already being generated and
ignored.

Because of the exceptions noted above in which attnotnull is not
reset, we also include a pg_dump hack to emit a not-null constraint
when the
attnotnull flag is set even if no pg_constraint row exists.  This part
is undesirable but necessary, because failing to handle the case can
result in unrestorable dumps.
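
For illustration (table and column names hypothetical), this is the
kind of case the attnotnull reset now handles:

    CREATE TABLE t (a int, b int, PRIMARY KEY (a, b));
    ALTER TABLE t DROP COLUMN a;
    -- Dropping a also drops the PK.  Column b's attnotnull flag was
    -- only set because of the PK, so it must now be reset.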

Reported-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAHewXN=hMbNa3d43NOR=OCgdgpTt18S-1fmueCoEGesyeK4bqw@mail.gmail.com
2024-04-19 12:37:33 +02:00
Daniel Gustafsson 950d4a2cb1 Fix typos and duplicate words
This fixes various typos, duplicated words, and tiny bits of whitespace
mainly in code comments but also in docs.

Author: Daniel Gustafsson <daniel@yesql.se>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
2024-04-18 21:28:07 +02:00
Robert Haas 9e72f6bfae Restrict where INCREMENTAL.${NAME} files are recognized.
Previously, they were recognized anywhere in an incremental backup
directory; now, we restrict this to places where they are expected to
appear. That means this code will need updating if we ever do
incremental backups of files in other places (e.g. SLRU files), but
it lets you create a file called INCREMENTAL.config (or something like
that) at the top level of the data directory and still have things
work.

Patch by me, per request from David Steele, who also reviewed.

Discussion: http://postgr.es/m/5a7817da-6349-4653-8056-470300b6e512@pgmasters.net
2024-04-18 11:00:38 -04:00
Alvaro Herrera d72d32f52d
Don't try to assign smart names to constraints
This part of my previous commit seems to have broken pg_upgrade on
crake, at least from 9.2.  I'll see if there's a better fix, but in the
meantime this should suffice to keep the buildfarm green.
2024-04-18 16:10:53 +02:00
Alvaro Herrera d9f686a72e
Fix restore of not-null constraints with inheritance
For tables with primary keys, pg_dump initially dumps the tables with
throw-away not-null constraints (marked "no inherit" so that they don't
create problems elsewhere), to later drop them once the primary key is
restored.  Because of an unrelated
consideration, on tables with children we add not-null constraints to
all columns of the primary key when it is created.

If both a table and its child have primary keys, and pg_dump happens to
emit the child table first (and its throw-away not-null) and later its
parent table, the creation of the parent's PK will fail because the
throw-away not-null constraint collides with the permanent not-null
constraint that the PK wants to add, so the dump fails to restore.

We can work around this problem by letting the primary key "take over"
the child's not-null.  This requires no changes to pg_dump, just two
changes to ALTER TABLE: first, the ability to convert a no-inherit
not-null constraint into a regular inheritable one (including recursing
down to children, if there are any); second, the ability to "drop" a
constraint that is defined both directly in the table and inherited from
a parent (which simply means to mark it as no longer having a local
definition).

Secondarily, change ATPrepAddPrimaryKey() to acquire locks all the way
down the inheritance hierarchy, in case we need to recurse when
propagating constraints.

These two changes allow pg_dump to reproduce more cases involving
inheritance from versions 16 and older.

Lastly, make two changes to pg_dump: 1) do not try to drop a not-null
constraint that's marked as inherited; this allows a dump to restore
with no errors if a table with a PK inherits from another which also has
a PK; 2) avoid giving inherited constraints throwaway names, for the
rare cases where such a constraint survives after the restore.
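
A minimal shape of the previously-failing scenario (hypothetical
tables):

    CREATE TABLE parent (a int PRIMARY KEY);
    CREATE TABLE child (a int PRIMARY KEY) INHERITS (parent);
    -- If pg_dump emits child (with its throw-away not-null on a)
    -- before parent, restoring parent's PK used to collide with the
    -- throw-away constraint; the PK now takes it over instead.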

Reported-by: Andrew Bille <andrewbille@gmail.com>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/CAJnzarwkfRu76_yi3dqVF_WL-MpvT54zMwAxFwJceXdHB76bOA@mail.gmail.com
Discussion: https://postgr.es/m/Zh0aAH7tbZb-9HbC@pryzbyj2023
2024-04-18 15:35:15 +02:00
Peter Eisentraut 6ff21c0530 psql: Make output of \dD more stable
\dD showed domain check constraints in arbitrary order, which can
cause regression test failures, as was exposed by commit
9895b35cb8.  To fix, order the constraints by conname, which matches
what psql does in other queries listing constraints.
2024-04-15 09:28:48 +02:00
Peter Eisentraut 9895b35cb8 Fix ALTER DOMAIN NOT NULL syntax
This addresses a few problems with commit e5da0fe3c2 ("Catalog domain
not-null constraints").

In CREATE DOMAIN, a NOT NULL constraint looks like

    CREATE DOMAIN d1 AS int [ CONSTRAINT conname ] NOT NULL

(Before e5da0fe3c2, the constraint name was accepted but ignored.)

But in ALTER DOMAIN, a NOT NULL constraint looks like

    ALTER DOMAIN d1 ADD [ CONSTRAINT conname ] NOT NULL VALUE

where VALUE appears in the position where, for a table constraint,
the column name would be.
(This works as of e5da0fe3c2.  Before e5da0fe3c2, this syntax
resulted in an internal error.)

But for domains, this latter syntax is confusing and needlessly
inconsistent between CREATE and ALTER.  So this changes it to just

    ALTER DOMAIN d1 ADD [ CONSTRAINT conname ] NOT NULL

(None of these syntaxes are per SQL standard; we are just living with
the bits of inconsistency that have built up over time.)

In passing, this also changes the psql \dD output to not show not-null
constraints in the column "Check", since it's already shown in the
column "Nullable".  This has also been off since e5da0fe3c2.

Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/9ec24d7b-633d-463a-84c6-7acff769c9e8%40eisentraut.org
2024-04-15 08:34:45 +02:00
Andrew Dunstan 929c05774b Don't allocate large buffer on the stack in pg_verifybackup
Per complaint from Andres Freund. Follow his suggestion to allocate the
buffer once in the calling routine instead.

Also make a tiny indentation improvement.

Discussion: https://postgr.es/m/20240411190147.a3yries632olfcgg@awork3.anarazel.de
2024-04-12 10:52:25 -04:00
Andrew Dunstan 661ab4e185 Fix some memory leaks associated with parsing json and manifests
Coverity complained about not freeing some memory associated with
incrementally parsing backup manifests. To fix that, provide and use a new
shutdown function for the JsonManifestParseIncrementalState object, in
line with a suggestion from Tom Lane.

While analysing the problem, I noticed a buglet in freeing memory for
incremental json lexers. To fix that, remove a bogus condition on
freeing the memory allocated for them.
2024-04-12 10:32:30 -04:00
Masahiko Sawada 810f64a015 Revert indexed and enlargable binary heap implementation.
This reverts commits b840508644 and bcb14f4abc. These commits were made
for commit 5bec1d6bc5 (Improve eviction algorithm in ReorderBuffer
using max-heap for many subtransactions). However, per discussion,
commit efb8acc0d0 replaced binary heap + index with pairing heap, and
made these commits unnecessary.

Reported-by: Jeff Davis
Discussion: https://postgr.es/m/12747c15811d94efcc5cda72d6b35c80d7bf3443.camel%40j-davis.com
2024-04-11 17:18:05 +09:00
Michael Paquier deca6ac136 Add missing set_pglocale_pgservice() for pg_walsummary and pg_combinebackup
These calls are required to make both tools work with NLS.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20240408.162702.183779935636035593.horikyota.ntt@gmail.com
2024-04-09 14:01:33 +09:00
Tom Lane f463de59d9 In psql, avoid leaking a PGresult after a query is cancelled.
After a query cancel, the tail end of ExecQueryAndProcessResults
took care to clear any not-yet-read PGresults; but it forgot about
the one it had already read.  There would only be such a result
when handling a multi-command string made with "\;", so that you'd
have to cancel an earlier command in such a string to reach the
bug at all.  Even then, there would only be leakage of a single
PGresult per cancel, so it's not surprising nobody noticed this.
But a leak is a leak.

Noted while re-reviewing 90f517821, but this is independent of that:
it dates to 7844c9918.  Back-patch to v15 where that came in.
2024-04-08 17:00:07 -04:00
Tom Lane c21d4c416a Further review for re-implementation of psql's FETCH_COUNT feature.
Alexander Lakhin noted an obsolete comment, which led me to revisit
some other important comments in the patch, and that study turned up a
couple of unintended ways in which the chunked-fetch code path didn't
match the normal code path in ExecQueryAndProcessResults.  The only
nontrivial problem is that it didn't call PrintQueryStatus, so that
we'd not print the final status result from DML ... RETURNING
commands.  To avoid code duplication, move the filter for whether a
result is from RETURNING from PrintQueryResult to PrintQueryStatus.

Discussion: https://postgr.es/m/0023bea5-79c0-476e-96c8-dad599cc3ad8@gmail.com
2024-04-08 15:49:10 -04:00
Masahiko Sawada 304b6b1a6b Add more tab completion support for ALTER DEFAULT PRIVILEGES in psql.
This adds tab completion of "GRANT" and "REVOKE [GRANT OPTION FOR]"
for ALTER DEFAULT PRIVILEGES, and adds "WITH GRANT OPTION" for
ALTER DEFAULT PRIVILEGES ... GRANT ... TO role.
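
For example, psql can now complete its way through a statement like
(role and schema names hypothetical):

    ALTER DEFAULT PRIVILEGES IN SCHEMA public
        GRANT SELECT ON TABLES TO reader WITH GRANT OPTION;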

Author: Vignesh C, with cosmetic adjustments by me
Reviewed-by: Shubham Khanna, Masahiko Sawada
Discussion: https://postgr.es/m/CALDaNm1aEdJb-QJi%3DGWStkfj_%2BEDUK_VtDkn%2BTjQ2z7HyU0MBw%40mail.gmail.com
2024-04-08 12:15:10 +09:00
Heikki Linnakangas 91044ae4ba Send ALPN in TLS handshake, require it in direct SSL connections
libpq now always tries to send ALPN. With the traditional negotiated
SSL connections, the server accepts the ALPN, and refuses the
connection if it's not what we expect, but connecting without ALPN is
still OK. With the new direct SSL connections, ALPN is mandatory.

NOTE: This uses "TBD-pgsql" as the protocol ID. We must register a
proper one with IANA before the release!

Author: Greg Stark, Heikki Linnakangas
Reviewed-by: Matthias van de Meent, Jacob Champion
2024-04-08 04:24:51 +03:00
Tom Lane 90f5178211 Re-implement psql's FETCH_COUNT feature atop libpq's chunked mode.
Formerly this was done with a cursor, which is problematic since
not all result-set-returning query types can be put into a cursor.
The new implementation is better integrated into other psql
features, too.
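
Usage is unchanged; a hypothetical large result set fetched in chunks:

    \set FETCH_COUNT 100
    SELECT * FROM big_table;  -- retrieved and printed 100 rows at a time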

Daniel Vérité, reviewed by Laurenz Albe and myself (and whacked
around a bit by me, so any remaining bugs are my fault)

Discussion: https://postgr.es/m/CAKZiRmxsVTkO928CM+-ADvsMyePmU3L9DQCa9NwqjvLPcEe5QA@mail.gmail.com
2024-04-06 20:45:11 -04:00
Tom Lane 4643a2b265 Support retrieval of results in chunks with libpq.
This patch generalizes libpq's existing single-row mode to allow
individual partial-result PGresults to contain up to N rows, rather
than always one row.  This reduces malloc overhead compared to plain
single-row mode, and it is very useful for psql's FETCH_COUNT feature,
since otherwise we'd have to add code (and cycles) to either merge
single-row PGresults into a bigger one or teach psql's
results-printing logic to accept arrays of PGresults.

To avoid API breakage, PQsetSingleRowMode() remains the same, and we
add a new function PQsetChunkedRowsMode() to invoke the more general
case.  Also, PGresults obtained the old way continue to carry the
PGRES_SINGLE_TUPLE status code, while if PQsetChunkedRowsMode() is
used then their status code is PGRES_TUPLES_CHUNK.  The underlying
logic is the same either way, though.

Daniel Vérité, reviewed by Laurenz Albe and myself (and whacked
around a bit by me, so any remaining bugs are my fault)

Discussion: https://postgr.es/m/CAKZiRmxsVTkO928CM+-ADvsMyePmU3L9DQCa9NwqjvLPcEe5QA@mail.gmail.com
2024-04-06 20:45:11 -04:00
John Naylor f956ecd035 Convert uses of hash_string_pointer to fasthash equivalent
Remove duplicate hash_string_pointer() function definitions by creating
a new inline function hash_string() for this purpose.

This has the added advantage of avoiding strlen() calls when doing hash
lookup. It's not clear how many of these are performance-sensitive
enough to benefit from that, but the simplification is worth it on
its own.

Reviewed by Jeff Davis

Discussion: https://postgr.es/m/CANWCAZbg_XeSeY0a_PqWmWqeRATvzTzUNYRLeT%2Bbzs%2BYQdC92g%40mail.gmail.com
2024-04-06 12:20:40 +07:00
Andres Freund 9e7386924e Fix headerscheck violation introduced in f8ce4ed78c
Per ci.
2024-04-05 14:27:45 -07:00
Tomas Vondra 079d94ab34 Check HAVE_COPY_FILE_RANGE before calling copy_file_range
Fix a mistake in ac81101551 - write_reconstructed_file() called
copy_file_range() without properly checking HAVE_COPY_FILE_RANGE.

Reported by several macOS buildfarm machines. Also reported by cfbot,
but I missed
that issue before commit.
2024-04-05 19:38:20 +02:00
Tomas Vondra ac81101551 Allow using copy_file_range in write_reconstructed_file
This commit allows using copy_file_range() for efficient combining of
data from multiple files, instead of simply reading/writing the blocks.
Depending on the filesystem and other factors (size of the increment,
distribution of modified blocks etc.) this may be faster than the
block-by-block copy, but more importantly it enables various features
provided by CoW filesystems.

If a checksum needs to be calculated for the file, the same strategy as
when copying whole files is used - copy_file_range is used to copy the
blocks, but the file is also read for the checksum calculation.

While the checksum calculation is rarely needed when cloning whole
files, when reconstructing the files from multiple backups it needs to
happen almost always (the only exception is when the user specified
--no-manifest).

Author: Tomas Vondra
Reviewed-by: Thomas Munro, Jakub Wartak, Robert Haas
Discussion: https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com
2024-04-05 19:19:36 +02:00
Tomas Vondra 8e392595e5 Remove unused variable in checksum_file()
The 'offset' variable was set but otherwise unused.

Per buildfarm animals with clang, e.g. sifaka and longlin.
2024-04-05 18:13:15 +02:00
Tomas Vondra f8ce4ed78c Allow copying files using clone/copy_file_range
Adds --clone/--copy-file-range options to pg_combinebackup, to allow
copying files using file cloning or copy_file_range(). These methods may
be faster than the standard block-by-block copy, but the main advantage
is that they enable various features provided by CoW filesystems.

This commit only uses these copy methods for files that did not change
and can be copied as a whole from a single backup.

These new copy methods may not be available on all platforms, in which
case the command throws an error (immediately, even if no files would be
copied as a whole). This early failure seems better than failing later
when trying to copy the first file, after performing a lot of work on
earlier files.

If the requested copy method is available, but a checksum needs to be
recalculated (e.g. because of a different checksum type), the file is
still copied using the requested method, but it is also read for the
checksum calculation. Depending on the filesystem this may be more
expensive than just performing the simple copy, but it does enable the
CoW benefits.

Initial patch by Jakub Wartak, various reworks and improvements by me.

Author: Tomas Vondra, Jakub Wartak
Reviewed-by: Thomas Munro, Jakub Wartak, Robert Haas
Discussion: https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com
2024-04-05 18:01:32 +02:00
Tomas Vondra 10e3226ba1 Align blocks in incremental backups to BLCKSZ
Align blocks stored in incremental files to BLCKSZ, so that the
incremental backups work well with CoW filesystems.

The header of the incremental file is padded with \0 to a multiple of
BLCKSZ, so that the block data (also BLCKSZ) is aligned to BLCKSZ. The
padding is added only to files containing block data, so files with just
the header remain small. This adds a bit of extra space, but as the
number of blocks increases the overhead gets negligible very quickly.
And as the padding is \0 bytes, it does compress extremely well.

The alignment is important for CoW filesystems that usually require the
blocks to be aligned to filesystem page size for features like block
sharing, deduplication, etc. to work well. With the variable-sized header
the blocks in the increments were not aligned at all, negating the
benefits of the CoW filesystems.

This matters even for non-CoW filesystems, for example when placed on a
RAID array. If the block is not aligned, it may easily span multiple
devices, causing read and write amplification.

It might be better to align the blocks to the filesystem page, not
BLCKSZ, but we have no good way to determine that. Even if we determine
the page size at the time of taking the backup, the backup may move. For
now BLCKSZ seems sufficient - the filesystem page is usually 4K, so
the default BLCKSZ (8K) is aligned to that.

Author: Tomas Vondra
Reviewed-by: Robert Haas, Jakub Wartak
Discussion: https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com
2024-04-05 16:30:01 +02:00
Jeff Davis e2a2357671 Fix test failures when language environment is not UTF-8.
For tests that depend on UTF-8 encoding, force LC_COLLATE=C and
LC_CTYPE=C to avoid an encoding mismatch.

Reported-by: Thomas Munro
Discussion: https://postgr.es/m/CA+hUKGK-ZqV1njkG_=xcCqXh2fcMkz85FTMnhS2opm4ZerH=xw@mail.gmail.com
2024-04-04 16:10:12 -07:00
Robert Haas 12b964d781 Remove reachable call to pg_unreachable().
The loop just before this uses break, not return, so this line
is reachable.  Commit cafe105655
introduced this issue.

Jelte Fennema-Nio, reviewed by Tristan Partin

Discussion: http://postgr.es/m/CAGECzQTO72jKed5461W8cytV2Msh_e+WUZjOyX_RUQCbjk4LRA@mail.gmail.com
2024-04-04 16:22:11 -04:00
Peter Eisentraut a9d6c38684 pg_upgrade: Fix typo in message 2024-04-04 12:58:57 +02:00
Andrew Dunstan 222e11a10a Use incremental parsing of backup manifests.
This changes the three callers to json_parse_manifest() to use
json_parse_manifest_incremental_chunk() if appropriate. In the case of
the backend caller, since we don't know the size of the manifest in
advance we always call the incremental parser.

Author: Andrew Dunstan
Reviewed-By: Jacob Champion

Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net
2024-04-04 06:46:40 -04:00
Daniel Gustafsson 9301308bd1 Fix indentation from cafe105655
Per buildfarm animal koel
2024-04-03 09:44:47 +02:00
Masahiko Sawada b840508644 Add functions to binaryheap for efficient key removal and update.
Previously, binaryheap didn't support updating a key and removing a
node in an efficient way. For example, in order to remove a node from
the binaryheap, the caller had to pass the node's position within the
binaryheap's internal array. Removing a node from the
binaryheap is done in O(log n) but searching for the key's position is
done in O(n).

This commit adds a hash table to binaryheap in order to track the
position of each node in the binaryheap. That way, by using newly
added functions such as binaryheap_update_up() etc., both updating a
key and removing a node can be done in O(1) on average and O(log n)
in the worst case. This is known as an indexed binary heap. The caller
can specify to use the indexed binaryheap by passing indexed = true.

The current code does not use the new indexing logic, but it will be
used by an upcoming patch.

Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian,
Tomas Vondra, Shubham Khanna
Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com
2024-04-03 10:44:21 +09:00
Robert Haas cafe105655 Allow SIGINT to cancel psql database reconnections.
Previously, once psql installed its SIGINT handler, SIGINT could no
longer cancel database reconnections.  For instance, if the user
started a reconnection and then needed some form of interaction (i.e.,
psql is polling), there was no way to cancel the reconnection process.

Use PQconnectStartParams() in order to insert a cancel_pressed check
into the polling loop.

Tristan Partin, reviewed by Gurjeet Singh, Heikki Linnakangas, Jelte
Fennema-Nio, and me.

Discussion: http://postgr.es/m/D08WWCPVHKHN.3QELIKZJ2D9RZ@neon.tech
2024-04-02 10:26:10 -04:00
Tom Lane 959b38d770 Invent --transaction-size option for pg_restore.
This patch allows pg_restore to wrap its commands into transaction
blocks, somewhat like --single-transaction, except that we commit
and start a new block after every N objects.  Using this mode
with a size limit of 1000 or so objects greatly reduces the number
of transactions consumed by the restore, while preventing any
one transaction from taking enough locks to overrun the receiving
server's shared lock table.

(A value of 1000 works well with the default lock table size of
around 6400 locks.  Higher --transaction-size values can be used
if one has increased the receiving server's lock table size.)
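
Conceptually, the restored stream changes shape roughly as follows (a
sketch with a tiny batch size, not literal pg_restore output):

    -- with --transaction-size=2:
    BEGIN;
    CREATE TABLE t1 (...);
    CREATE TABLE t2 (...);
    COMMIT;
    BEGIN;
    CREATE TABLE t3 (...);
    -- ... and so on, committing after every N objects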

Excessive consumption of XIDs has been reported as a problem for
pg_upgrade in particular, but it could be bad for any restore; and the
change also reduces the number of fsyncs and amount of WAL generated,
so it should provide speed benefits too.

This patch does not try to make parallel workers batch the SQL
commands they issue.  The trouble with doing that is that other
workers may need to see the objects a worker creates right away.
Possibly this can be improved later.

In this patch I have hard-wired pg_upgrade to use a transaction size
of 1000 divided by the number of parallel restore jobs allowed
(without that, we'd still be at risk of overrunning the shared lock
table).  Perhaps there would be value in adding another pg_upgrade
option to allow user control of that, but I'm unsure that it's worth
the trouble; I think few users would use it, and any who did would see
little benefit compared to the default.

Patch by me, but the original idea to batch SQL commands during
restore is due to Robins Tharakan.

Discussion: https://postgr.es/m/a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com
2024-04-01 16:46:24 -04:00
Tom Lane a45c78e328 Rearrange pg_dump's handling of large objects for better efficiency.
Commit c0d5be5d6 caused pg_dump to create a separate BLOB metadata TOC
entry for each large object (blob), but it did not touch the ancient
decision to put all the blobs' data into a single "BLOBS" TOC entry.
This is bad for a few reasons: for databases with millions of blobs,
the TOC becomes unreasonably large, causing performance issues;
selective restore of just some blobs is quite impossible; and we
cannot parallelize either dump or restore of the blob data, since our
architecture for that relies on farming out whole TOC entries to
worker processes.

To improve matters, let's group multiple blobs into each blob metadata
TOC entry, and then make corresponding per-group blob data TOC entries.
Selective restore using pg_restore's -l/-L switches is then possible,
though only at the group level.  (Perhaps we should provide a switch
to allow forcing one-blob-per-group for users who need precise
selective restore and don't have huge numbers of blobs.  This patch
doesn't do that, instead just hard-wiring the maximum number of blobs
per entry at 1000.)

The blobs in a group must all have the same owner, since the TOC entry
format only allows one owner to be named.  In this implementation
we also require them to all share the same ACL (grants); the archive
format wouldn't require that, but pg_dump's representation of
DumpableObjects does.  It seems unlikely that either restriction
will be problematic for databases with huge numbers of blobs.

The metadata TOC entries now have a "desc" string of "BLOB METADATA",
and their "defn" string is just a newline-separated list of blob OIDs.
The restore code has to generate creation commands, ALTER OWNER
commands, and drop commands (for --clean mode) from that.  We would
need special-case code for ALTER OWNER and drop in any case, so the
alternative of keeping the "defn" as directly executable SQL code
for creation wouldn't buy much, and it seems like it'd bloat the
archive to little purpose.

Since we require the blobs of a metadata group to share the same ACL,
we can furthermore store only one copy of that ACL, and then make
pg_restore regenerate the appropriate commands for each blob.  This
saves space in the dump file not only by removing duplicative SQL
command strings, but by not needing a separate TOC entry for each
blob's ACL.  In turn, that reduces client-side memory requirements for
handling many blobs.

ACL TOC entries that need this special processing are labeled as
"ACL"/"LARGE OBJECTS nnn..nnn".  If we have a blob with a unique ACL,
continue to label it as "ACL"/"LARGE OBJECT nnn".  We don't actually
have to make such a distinction, but it saves a few cycles during
restore for the easy case, and it seems like a good idea to not change
the TOC contents unnecessarily.

The data TOC entries ("BLOBS") are exactly the same as before,
except that now there can be more than one, so we'd better give them
identifying tag strings.

Also, commit c0d5be5d6 put the new BLOB metadata TOC entries into
SECTION_PRE_DATA, which perhaps is defensible in some ways, but
it's a rather odd choice considering that we go out of our way to
treat blobs as data.  Moreover, because parallel restore handles
the PRE_DATA section serially, this means we'd only get part of the
parallelism speedup we could hope for.  Move these entries into
SECTION_DATA, letting us parallelize the lo_create calls not just the
data loading when there are many blobs.  Add dependencies to ensure
that we won't try to load data for a blob we've not yet created.

As this stands, we still generate a separate TOC entry for any comment
or security label attached to a blob.  I feel comfortable in believing
that comments and security labels on blobs are rare, so this patch
should be enough to get most of the useful TOC compression for blobs.

We have to bump the archive file format version number, since existing
versions of pg_restore wouldn't know they need to do something special
for BLOB METADATA, plus they aren't going to work correctly with
multiple BLOBS entries or multiple-large-object ACL entries.

The directory and tar-file format handlers need some work
for multiple BLOBS entries: they used to hard-wire the file name
as "blobs.toc", which is replaced here with "blobs_<dumpid>.toc".
The 002_pg_dump.pl test script also knows about that and requires
minor updates.  (I had to drop the test for manually-compressed
blobs.toc files with LZ4, because lz4's obtuse command line
design requires explicit specification of the output file name
which seems impractical here.  I don't think we're losing any
useful test coverage thereby; that test stanza seems completely
duplicative with the gzip and zstd cases anyway.)

In passing, centralize management of the lo_buf used to hold data
while restoring blobs.  The code previously had each format handler
create lo_buf, which seems rather pointless given that the format
handlers all make it the same way.  Moreover, the format handlers
never use lo_buf directly, making this setup a failure from a
separation-of-concerns standpoint.  Let's move the responsibility into
pg_backup_archiver.c, which is the only module concerned with lo_buf.
The reason to do this in this patch is that it allows a centralized
fix for the now-false assumption that we never restore blobs in
parallel.  Also, get rid of dead code in DropLOIfExists: it's been a
long time since we had any need to be able to restore to a pre-9.0
server.

Discussion: https://postgr.es/m/a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com
2024-04-01 16:25:56 -04:00
Tom Lane 91cbb4b492 Fix assorted resource leaks in new pg_createsubscriber code.
Various error paths did not release resources before returning.
While it's likely that the program would just exit shortly later,
none of the functions in question have summary exit(1) calls,
so they should not be assuming that.

Ranier Vilela and Tom Lane, per reports from Coverity

Discussion: https://postgr.es/m/CAEudQAr2_SZFxB4kXJiL4+2UaNZxUk5UBJtj0oXyJYMGZu-03g@mail.gmail.com
2024-04-01 13:47:49 -04:00
Masahiko Sawada f5a227895e Add new COPY option LOG_VERBOSITY.
This commit adds a new COPY option LOG_VERBOSITY, which controls the
number of messages emitted during processing. Valid values are
'default' and 'verbose'.

This is currently used in COPY FROM when ON_ERROR option is set to
ignore. If 'verbose' is specified, a NOTICE message is emitted for
each discarded row, providing additional information such as line
number, column name, and the malformed value. This helps users to
identify problematic rows that failed to load.
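
A minimal example (table and file names hypothetical):

    COPY t FROM '/tmp/data.csv'
        WITH (FORMAT csv, ON_ERROR ignore, LOG_VERBOSITY verbose);
    -- a NOTICE now reports each skipped row's line number, column
    -- name, and malformed value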

Author: Bharath Rupireddy
Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada
Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com
2024-04-01 15:25:25 +09:00
Dean Rasheed 0294df2f1f Add support for MERGE ... WHEN NOT MATCHED BY SOURCE.
This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE
actions, which operate on rows that exist in the target relation, but
not in the data source. These actions can execute UPDATE, DELETE, or
DO NOTHING sub-commands.

This is in contrast to already-supported WHEN NOT MATCHED actions,
which operate on rows that exist in the data source, but not in the
target relation. To make this distinction clearer, such actions may
now be written as WHEN NOT MATCHED BY TARGET.

Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is
equivalent to writing WHEN NOT MATCHED BY TARGET.
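
For example (hypothetical tables), deleting target rows that have
disappeared from the source while inserting and updating the rest:

    MERGE INTO target t
    USING source s ON t.id = s.id
    WHEN MATCHED THEN
        UPDATE SET val = s.val
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (id, val) VALUES (s.id, s.val)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE;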

Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing.

Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com
2024-03-30 10:00:26 +00:00
Masahiko Sawada f1bb9284f7 Improve tab completion for ALTER TABLE ALTER COLUMN SET in psql.
The commit changes the tab completion to add DATA TYPE after
ALTER TABLE ... ALTER COLUMN ... SET.
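
That is, completion now offers DATA TYPE in statements such as (names
hypothetical):

    ALTER TABLE t ALTER COLUMN c SET DATA TYPE bigint;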

Author: Vignesh C
Reviewed-by: Shubham Khanna, Masahiko Sawada
Discussion: https://postgr.es/m/CALDaNm1aEdJb-QJi%3DGWStkfj_%2BEDUK_VtDkn%2BTjQ2z7HyU0MBw%40mail.gmail.com
2024-03-28 16:30:10 +09:00
Amit Kapila 677a45c4ae Fix random failure in 004_subscription.
After the upgrade, the failing test checked that changes made on the
publisher are replicated to the subscriber.  We missed waiting
for one of the subscriptions to catch up.

Per buildfarm

Author: Vignesh C
Reviewed-by: Kuroda Hayato
Discussion: https://postgr.es/m/CALDaNm0z=fLtio1h50K8WossUGXU+gy0H9y9=RYh1DDZiq2EDw@mail.gmail.com
2024-03-27 14:15:03 +05:30
Peter Eisentraut 8c4f2d5475 Message fixes for pg_createsubscriber
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/20240326.140116.1116279856046587865.horikyota.ntt@gmail.com
2024-03-26 08:22:46 +01:00
Nathan Bossart 3ff01b2b6e Adjust pgbench option for debug mode.
Many other utilities use -d to specify the database to use, but
pgbench uses it to enable debug mode.  This is causing some users
to accidentally enable it.  This commit changes -d to accept the
database name and introduces --dbname.  Debug mode can still be
enabled with --debug.  This is a backward-incompatible change, but
it has been judged to be worth the trade-off, i.e., some scripts
that use pgbench will need to be updated.

Author: Greg Sabino Mullane
Reviewed-by: Tomas Vondra, Euler Taveira, Alvaro Herrera, David Christensen
Discussion: https://postgr.es/m/CAKAnmmLjAzwVtb%3DVEaeuCtnmOLpzkJ1uJ_XiQ362YdD9B72HSg%40mail.gmail.com
2024-03-25 11:08:53 -05:00
Alvaro Herrera 374c7a2290
Allow specifying an access method for partitioned tables
It's now possible to specify a table access method via
CREATE TABLE ... USING for a partitioned table, as well as change it with
ALTER TABLE ... SET ACCESS METHOD.  Specifying an AM for a partitioned
table lets the value be used for all future partitions created under it,
closely mirroring the behavior of the TABLESPACE option for partitioned
tables.  Existing partitions are not modified.

For a partitioned table with no AM specified, any new partitions are
created with the default_table_access_method.

Also add ALTER TABLE ... SET ACCESS METHOD DEFAULT, which reverts to the
original state of using the default for new partitions.

The relcache of partitioned tables is not changed: rd_tableam is not
set, even if a partitioned table has a relam set.
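
A hypothetical example of the new behavior:

    CREATE TABLE parted (a int) PARTITION BY LIST (a) USING heap;
    CREATE TABLE part1 PARTITION OF parted FOR VALUES IN (1);
    -- part1 uses heap, taken from the parent
    ALTER TABLE parted SET ACCESS METHOD DEFAULT;
    -- future partitions again use default_table_access_method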

Author: Justin Pryzby <pryzby@telsasoft.com>
Author: Soumyadeep Chakraborty <soumyadeep2007@gmail.com>
Author: Michaël Paquier <michael@paquier.xyz>
Reviewed-by: The authors themselves
Discussion: https://postgr.es/m/CAE-ML+9zM4wJCGCBGv01k96qQ3gFv4WFcFy=zqPHKeaEFwwv6A@mail.gmail.com
Discussion: https://postgr.es/m/20210308010707.GA29832%40telsasoft.com
2024-03-25 16:30:36 +01:00
Peter Eisentraut d44032d014 pg_createsubscriber: creates a new logical replica from a standby server
It must be run on the target server and should be able to connect to
the source server (publisher) and the target server (subscriber).  All
tables in the specified database(s) are included in the logical
replication setup.  A pair of publication and subscription objects is
created for each database.

The main advantage of pg_createsubscriber over the common logical
replication setup is that the initial data copy is skipped, since the
standby already holds the data.  It also reduces the catchup phase.

Some prerequisites must be met to successfully run it; they are
basically the logical replication requirements.  It starts by creating
a publication using FOR ALL TABLES and a replication slot for each
specified database.  It then writes recovery parameters into the target
data directory and starts the target server, specifying the LSN of the
last replication slot (the replication start point) up to which
recovery will proceed.  It waits until the target server is promoted,
creates one subscription per specified database (using the publication
and replication slot created in the previous step) on the target
server, sets the replication progress to the replication start point
for each subscription, and enables the subscription for each specified
database on the target server.  Finally, it changes the system
identifier on the target server.

Author: Euler Taveira <euler.taveira@enterprisedb.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Shubham Khanna <khannashubham1197@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/5ac50071-f2ed-4ace-a8fd-b892cffd33eb@www.fastmail.com
2024-03-25 12:42:47 +01:00
Alexander Korotkov cc0e7ebd30 reindexdb: Fix warning about uninitialized indices_tables_cell
Initialize indices_tables_cell to NULL to silence the warning.  Also,
move the first assignment of indices_tables_cell to a better place.

Reported-by: Thomas Munro, David Rowley, Tom Lane, Richard Guo
Discussion: https://postgr.es/m/2348025.1711332418%40sss.pgh.pa.us
Discussion: https://postgr.es/m/E1roXs4-005UdX-1V%40gemulon.postgresql.org
2024-03-25 11:40:25 +02:00
Alexander Korotkov 47f99a407d reindexdb: Add the index-level REINDEX with multiple jobs
Straightforward index-level REINDEX is not supported with multiple
jobs, as we cannot control the concurrent processing of multiple
indexes depending on the same relation.  Instead, we dedicate the whole
table to a certain reindex job.  Thus, if the indexes in the list
belong to different tables, that gives us a fair level of parallelism.

This commit teaches get_parallel_object_list() to fetch table names for
indexes in the case of index-level REINDEX.  The same tables are grouped
together in the output order, and the list of indexes is also rebuilt to
match that order.  Later, during processing of that list, we push indexes
belonging to the same table into the same job.

Discussion: https://postgr.es/m/CACG%3DezZU_VwDi-1PN8RUSE6mcYG%2BYx1NH_rJO4%2BKe-mKqLp%3DNw%40mail.gmail.com
Author: Maxim Orlov, Svetlana Derevyanko, Alexander Korotkov
Reviewed-by: Michael Paquier
2024-03-25 02:07:15 +02:00
Tom Lane d37e0d0c50 Release PQconninfoOptions array in GetDbnameFromConnectionOptions().
It wasn't getting freed in one code path, which Coverity identified as
a resource leak.  It's probably of little consequence, but re-ordering
the code into the correct sequence is no more work than dismissing the
complaint.  Minor oversight in commit a145f424d.

While here, improve the unreasonably clunky coding of
FindDbnameInConnParams: use of an output parameter is unnecessary
and prone to uninitialized-variable problems.
2024-03-24 12:31:05 -04:00
Tom Lane 225e1dde46 Release temporary array in check_for_data_types_usage().
Coverity identified this as a resource leak.  It's surely of no
consequence given that the function is called only once per run, but
freeing the storage is no more work than dismissing the complaint.
Minor oversight in commit 347758b12.
2024-03-24 12:13:35 -04:00
Amit Kapila 6ae701b437 Track invalidation_reason in pg_replication_slots.
Till now, the reason for replication slot invalidation is not tracked
directly in pg_replication_slots. A recent commit 007693f2a3 added
'conflict_reason' to show the reasons for slot conflict/invalidation, but
only for logical slots.

This commit adds a new column 'invalidation_reason' to show invalidation
reasons for both physical and logical slots. And, this commit also
turns the 'conflict_reason' text column into a 'conflicting' boolean
column (effectively
reverting commit 007693f2a3). The 'conflicting' column is true for
invalidation reasons 'rows_removed' and 'wal_level_insufficient' because
those make the slot conflict with recovery. When 'conflicting' is true,
one can now look at the new 'invalidation_reason' column for the reason
for the logical slot's conflict with recovery.

The new 'invalidation_reason' column will also be useful for tracking
other invalidation reasons in future commits.
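
For example, a monitoring query might now read (hypothetical):

    SELECT slot_name, slot_type, conflicting, invalidation_reason
    FROM pg_replication_slots
    WHERE invalidation_reason IS NOT NULL;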

Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/ZfR7HuzFEswakt/a%40ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
2024-03-22 13:52:05 +05:30
Daniel Gustafsson 7e65ad197f Fix dumping role comments when using --no-role-passwords
Commit 9a83d56b38 added support for allowing pg_dumpall to dump
roles without including passwords, which accidentally made dumps
omit COMMENTs on roles.  This fixes it by using pg_authid to get
the comment.

Backpatch to all supported versions. Patch simultaneously written
independently by Álvaro and myself.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Bartosz Chroł <bartosz.chrol@handen.pl>
Discussion: https://postgr.es/m/AS8P194MB1271CDA0ADCA7B75FCD8E767F7332@AS8P194MB1271.EURP194.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/CAEP4nAz9V4H41_4ESJd1Gf0v%3DdevkqO1%3Dpo91jUw-GJSx8Hxqg%40mail.gmail.com
Backpatch-through: v12
2024-03-21 23:31:57 +01:00
Amit Kapila a145f424d5 Allow dbname to be written as part of connstring via pg_basebackup's -R option.
Commit cca97ce6a6 allowed dbname in the pg_basebackup connection
string, and this commit allows it to be written into
postgresql.auto.conf when the -R option is used. The database name in
the connection string will be used by the
logical replication slot synchronization on standby.

The dbname will be recorded only if specified explicitly in the connection
string or environment variable.

Masahiko Sawada hasn't reviewed the code in detail but endorsed the idea.

Author: Vignesh C, Kuroda Hayato
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAB8KJ=hdKdg+UeXhReeHpHA6N6v3e0qFF+ZsPFHk9_ThWKf=2A@mail.gmail.com
2024-03-21 10:50:33 +05:30
Nathan Bossart 2b520860c0 Revert "Temporary patch to help debug pg_walsummary test failures."
Thanks to commits ea18eb7d62, b6ee30ec08, and 19a829a327, the
002_blocks.pl test now consistently passes, so we can remove this
temporary debugging code.

This reverts commit 5ddf997347.

Discussion: https://postgr.es/m/20240314210010.GA3056455%40nathanxps13
2024-03-20 11:34:00 -05:00