postgresql

Commit Graph

Author	SHA1	Message	Date
Andrew Dunstan	88620824c2	Tidy up after incremental JSON parser patch Remove junk left over from non-vpath builds. Try to remedy gettext error on some platforms.	2024-04-04 12:41:55 -04:00
Andrew Dunstan	1b00fe30a6	Fix warnings re typedef redefinition in `ea7b4e9a2a` and `3311ea86ed` Per gripe from Tom Lane and the buildfarm	2024-04-04 11:36:26 -04:00
Amit Langote	6f4d63e989	Add missing initialization in transformJsonFuncExpr() `de3600452b` added some code for the new JSON_TABLE_OP to that function but missed to initialize the default_format variable. Reported-by: Erik Rijkers <er@xs4all.nl> Discussion: https://postgr.es/m/254b2fa2-2f6b-a30a-20ee-21f8a2c12a50@xs4all.nl	2024-04-04 22:01:13 +09:00
Amit Langote	2f6e78b061	Fix typo introduced in `6185c9737` Reported-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxGHiU0p0usjh5hnR0_ByZn4tq1FC3eKAtrQgJeKU6W9kw@mail.gmail.com	2024-04-04 20:53:23 +09:00
Amit Langote	de3600452b	Add basic JSON_TABLE() functionality JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in `6185c9737c` are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com	2024-04-04 20:20:15 +09:00
Peter Eisentraut	a9d6c38684	pg_upgrade: Fix typo in message	2024-04-04 12:58:57 +02:00
Andrew Dunstan	222e11a10a	Use incremental parsing of backup manifests. This changes the three callers to json_parse_manifest() to use json_parse_manifest_incremental_chunk() if appropriate. In the case of the backend caller, since we don't know the size of the manifest in advance we always call the incremental parser. Author: Andrew Dunstan Reviewed-By: Jacob Champion Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net	2024-04-04 06:46:40 -04:00
Andrew Dunstan	ea7b4e9a2a	Add support for incrementally parsing backup manifests This adds the infrastructure for using the new non-recursive JSON parser in processing manifests. It's important that callers make sure that the last piece of json handed to the incremental manifest parser contains the entire last few lines of the manifest, including the checksum. Author: Andrew Dunstan Reviewed-By: Jacob Champion Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net	2024-04-04 06:46:40 -04:00
Andrew Dunstan	3311ea86ed	Introduce a non-recursive JSON parser This parser uses an explicit prediction stack, unlike the present recursive descent parser where the parser state is represented on the call stack. This difference makes the new parser suitable for use in incremental parsing of huge JSON documents that cannot be conveniently handled piece-wise by the recursive descent parser. One potential use for this will be in parsing large backup manifests associated with incremental backups. Because this parser is somewhat slower than the recursive descent parser, it is not replacing that parser, but is an additional parser available to callers. For testing purposes, if the build is done with -DFORCE_JSON_PSTACK, all JSON parsing is done with the non-recursive parser, in which case only trivial regression differences in error messages should be observed. Author: Andrew Dunstan Reviewed-By: Jacob Champion Discussion: https://postgr.es/m/7b0a51d6-0d9d-7366-3a1a-f74397a02f55@dunslane.net	2024-04-04 06:46:40 -04:00
David Rowley	3a4a3537a9	Secondary refactor of heap scanning functions Similar to `44086b097`, refactor heap scanning functions to be more suitable for the read stream API. Author: Melanie Plageman Discussion: https://postgr.es/m/CAAKRu_YtXJiYKQvb5JsA2SkwrsizYLugs4sSOZh3EAjKUg=gEQ@mail.gmail.com	2024-04-04 19:22:45 +13:00
Michael Paquier	2a217c3717	Coordinate emit_log_hook and all log destinations to share the same timeval This would cause the timestamp values used by emit_log_hook and all the other log destinations to differ, because the timestamps are reset before sending the logs to the server and after calling the hook. This change matters for emit_log_hook when generating log information with 'n' or 'm' in log_line_prefix through log_status_format(), or when doing direct calls to get_formatted_log_time() like in the JSON or CSV log formats. While on it, this commit fixes a couple of comments related to the formatted timestamps where the JSON was not mentioned. Oversight in `dc686681e0`, that I have noticed while reviewing this patch. Author: Kambam Vinay, Michael Paquier Discussion: https://postgr.es/m/CANiRfmsK36A0i8mnQtzaxhSm3CUCimPwJPp4WQNq53OdSNkgWg@mail.gmail.com	2024-04-04 14:15:22 +09:00
David Rowley	44086b0975	Preliminary refactor of heap scanning functions To allow the use of the read stream API added in `b5a9b18cd` for sequential scans on heap tables, here we make some adjustments to make that change less invasive and perhaps make the code easier to follow in the process. Here heapgetpage() gets broken into two functions: 1) The part which reads the block has now been moved into a function named heapfetchbuf(). 2) The part which performed pruning and populated the scan's rs_vistuples[] array is now moved into a new function named heap_prepare_pagescan(). The functionality provided by heap_prepare_pagescan() was only ever required by SO_ALLOW_PAGEMODE scans, so the branching that was previously done in heapgetpage() is no longer needed as we simply just don't call heap_prepare_pagescan() from heapgettup() in the refactored code. Author: Melanie Plageman Discussion: https://postgr.es/m/CAAKRu_YtXJiYKQvb5JsA2SkwrsizYLugs4sSOZh3EAjKUg=gEQ@mail.gmail.com	2024-04-04 16:41:13 +13:00
Michael Paquier	85230a247c	pg_regress: Save errno in emit_tap_output_v() and switch to %m emit_tap_output_v() includes some fprintf() calls for some output related to the TAP protocol, that may clobber errno and break %m. This commit makes the logging of pg_regress smarter by saving errno before restoring it in vfprintf() where the input strings are used, removing the need for strerror(). All logs are switched to %m rather than strerror(), shaving some code. This was not a problem until now as pg_regress.c did not use %m, but the change is simple enough that we have no reason to not support this placeholder, and that will avoid future mistakes if new logs that include %m are added. Author: Dagfinn Ilmari Mannsåker Reviewed-by: Peter Eisentraunt, Michael Paquier Discussion: https://postgr.es/m/87sf13jhuw.fsf@wibble.ilmari.org	2024-04-04 11:33:07 +09:00
Jeff Davis	71b66171d0	CREATE INDEX: do not update stats during binary upgrade. During binary upgrade, indexes are created before the data is moved into place, so it will always be zero. This is not currently a major problem, but will be when we try to preserve statistics during upgrade. Author: Corey Huinker Discussion: https://postgr.es/m/CADkLM=daPdFB8V0tgFxK-dLowFsAEzWRWJHyxij7BG3kBjcouA@mail.gmail.com	2024-04-03 16:12:45 -07:00
Tom Lane	06286709ee	Invent SERIALIZE option for EXPLAIN. EXPLAIN (ANALYZE, SERIALIZE) allows collection of statistics about the volume of data emitted by a query, as well as the time taken to convert the data to the on-the-wire format. Previously there was no way to investigate this without actually sending the data to the client, in which case network transmission costs might swamp what you wanted to see. In particular this feature allows investigating the costs of de-TOASTing compressed or out-of-line data during formatting. Stepan Rutz and Matthias van de Meent, reviewed by Tomas Vondra and myself Discussion: https://postgr.es/m/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1@gmx.de	2024-04-03 17:41:57 -04:00
Alexander Korotkov	97ce821e3e	Fix the parameters order for TableAmRoutine.relation_copy_for_cluster() Specify OldTable first, NewTable second as used by table_relation_copy_for_cluster() and as implemented in heapam_relation_copy_for_cluster(). Backpatch to PostgreSQL 12, where TableAmRoutine was introduced. Discussion: https://postgr.es/m/ME3P282MB3166860D4911AE82F92DF7C5B63F2%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM Author: Japin Li Reviewed-by: Pavel Borisov Backpatch-through: 12	2024-04-04 00:34:28 +03:00
Alvaro Herrera	c9920a9068	Split XLogCtl->LogwrtResult into separate struct members After this change we have XLogCtl->logWriteResult and ->logFlushResult. There's no functional change, other than the fact that the assignment from shared memory to local is no longer done via struct assignment, but instead using a macro that copies each member separately. The current representation is inconvenient going forward; notably, we would like to add a new member "Copy" (to keep track of the last position copied into WAL buffers), so the symmetry between the values in shared memory vs. those in local would be lost. This also gives us freedom to later change the concurrency model for the values in shared memory: we can make them use atomics instead of relying on the info_lck spinlock. Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://postgr.es/m/202404031119.cd2kugjk2vho@alvherre.pgsql	2024-04-03 19:55:11 +02:00
Nathan Bossart	deb1486c7d	Inline pg_popcount() for small buffers. If there aren't many bytes to process, the function call overhead of the optimized implementation isn't worth taking, so instead we inline a loop that consults pg_number_of_ones in that case. If there are many bytes to process, we accept the function call overhead because the optimized versions are likely to be faster. The threshold at which we use the optimized implementation is set to the smallest amount of data required to use special popcount instructions. Reviewed-by: Alvaro Herrera, Tom Lane Discussion: https://postgr.es/m/20240402155301.GA2750455%40nathanxps13	2024-04-03 12:22:02 -05:00
Heikki Linnakangas	6dbb490261	Combine freezing and pruning steps in VACUUM Execute both freezing and pruning of tuples in the same heap_page_prune() function, now called heap_page_prune_and_freeze(), and emit a single WAL record containing all changes. That reduces the overall amount of WAL generated. This moves the freezing logic from vacuumlazy.c to the heap_page_prune_and_freeze() function. The main difference in the coding is that in vacuumlazy.c, we looked at the tuples after the pruning had already happened, but in heap_page_prune_and_freeze() we operate on the tuples before pruning. The heap_prepare_freeze_tuple() function is now invoked after we have determined that a tuple is not going to be pruned away. VACUUM no longer needs to loop through the items on the page after pruning. heap_page_prune_and_freeze() does all the work. It now returns the list of dead offsets, including existing LP_DEAD items, to the caller. Similarly it's now responsible for tracking 'all_visible', 'all_frozen', and 'hastup' on the caller's behalf. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240330055710.kqg6ii2cdojsxgje@liskov	2024-04-03 19:32:28 +03:00
Heikki Linnakangas	26d138f644	Refactor how heap_prune_chain() updates prunable_xid In preparation of freezing and counting tuples which are not candidates for pruning, split heap_prune_record_unchanged() into multiple functions, depending the kind of line pointer. That's not too interesting right now, but makes the next commit smaller. Recording the lowest soon-to-be prunable xid is one of the actions we take for unchanged LP_NORMAL item pointers but not for others, so move that to the new heap_prune_record_unchanged_lp_normal() function. The next commit will add more actions to these functions. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240330055710.kqg6ii2cdojsxgje@liskov	2024-04-03 19:32:21 +03:00
Alvaro Herrera	be2f073100	Fix zeroing of pg_serial page without SLRU bank lock Bug in commit 53c2a97a9266: we failed to acquire the correct SLRU bank lock when iterating to zero-out intermediate pages in predicate.c. Rewrite the code block so that we follow the locking protocol correctly. Also update an outdated comment in the same file -- SerialSLRULock exists no more. Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/2a25eaf4-a3a4-5fd1-6241-9d7c73142085@gmail.com	2024-04-03 17:49:44 +02:00
Alexander Korotkov	bf1e650806	Use the pairing heap instead of a flat array for LSN replay waiters `06c418e163` introduced pg_wal_replay_wait() procedure allowing to wait for the particular LSN to be replayed on standby. The waiters were stored in the flat array. Even though scanning small arrays is fast, that might be a problem at scale (a lot of waiting processes). This commit replaces the flat shared memory array with the pairing heap, which holds the waiter with the least LSN at the top. This gives us O(log N) complexity for both inserting and removing waiters. Reported-by: Alvaro Herrera Discussion: https://postgr.es/m/202404030658.hhj3vfxeyhft%40alvherre.pgsql	2024-04-03 18:15:41 +03:00
Daniel Gustafsson	936e3fa378	Drop global objects after completed test Project policy is to not leave global objects behind after a regress test run. This was found as a result of the development of a patch to make pg_regress detect such leftovers automatically, which in the end was withdrawn due to issues with parallel runs. Discussion: https://postgr.es/m/E1phvk7-000VAH-7k@gemulon.postgresql.org	2024-04-03 13:33:25 +02:00
Amit Kapila	2ec005b4e2	Ensure that the sync slots reach a consistent state after promotion without losing data. We were directly copying the LSN locations while syncing the slots on the standby. Now, it is possible that at some particular restart_lsn there are some running xacts, which means if we start reading the WAL from that location after promotion, we won't reach a consistent snapshot state at that point. However, on the primary, we would have already been in a consistent snapshot state at that restart_lsn so we would have just serialized the existing snapshot. To avoid this problem we will use the advance_slot functionality unless the snapshot already exists at the synced restart_lsn location. This will help us to ensure that snapbuilder/slot statuses are updated properly without generating any changes. Note that the synced slot will remain as RS_TEMPORARY till the decoding from corresponding restart_lsn can reach a consistent snapshot state after which they will be marked as RS_PERSISTENT. Per buildfarm Author: Hou Zhijie Reviewed-by: Bertrand Drouvot, Shveta Malik, Bharath Rupireddy, Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com	2024-04-03 14:04:59 +05:30
Alexander Korotkov	e37662f221	Minor improvements for waitlsn.c * Remove extra includes * Fill 'cur' in addLSNWaiter() before taking the spinlock * Initialize 'endtime' with zero in WaitForLSN() to avoid compiler warning Reported-by: Alvaro Herrera, Masahiko Sawada, Daniel Gustafsson Discussion: https://postgr.es/m/202404030658.hhj3vfxeyhft%40alvherre.pgsql Discussion: https://postgr.es/m/CAD21AoAx7irptnPH1OkkkNh9E0M6X-phfX7sYZfwoMsc1qV1sQ%40mail.gmail.com	2024-04-03 11:32:39 +03:00
Daniel Gustafsson	9301308bd1	Fix indentation from `cafe105655` Per buildfarm animal koel	2024-04-03 09:44:47 +02:00
Daniel Gustafsson	226261f387	Add error codes to some PANIC/FATAL errors reports This adds errcodes to a set of PANIC and FATAL errors in xlog.c and relcache.c, which previously had no errcode at all set, in order to make fleetwide analysis of errorlogs easier. There are many more ereport/elogs left which could benefit from having an errcode but this at least makes a dent in the issue. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/CAN55FZ1k8LgLEqncPGmz_fWnrobV6bjABOTH4tOWta6xNcPQig@mail.gmail.com	2024-04-03 09:19:25 +02:00
Nathan Bossart	c627d944e6	Add built-in ERROR handling for archive callbacks. Presently, the archiver process restarts when an archive callback ERRORs. To avoid this, archive module authors can use sigsetjmp(), manage a memory context, etc., but that requires a lot of extra code that will likely look roughly the same between modules. This commit adds basic archive callback ERROR handling to pgarch.c so that module authors won't ordinarily need to worry about this. While this built-in handler attempts to clean up anything that an archive module could conceivably have left behind, it is possible that some modules are doing unexpected things that require additional cleanup. Module authors should be sure to do any extra required cleanup in a PG_CATCH block within the archiving callback. The archiving callback is now called in a short-lived memory context that the archiver process resets between invocations. If a module requires longer-lived storage, it must maintain its own memory context. Thanks to these changes, the basic_archive module can be greatly simplified. Suggested-by: Andres Freund Reviewed-by: Andres Freund, Yong Li Discussion: https://postgr.es/m/20230217215624.GA3131134%40nathanxps13	2024-04-02 22:28:11 -05:00
Masahiko Sawada	5bec1d6bc5	Improve eviction algorithm in ReorderBuffer using max-heap for many subtransactions. Previously, when selecting the transaction to evict during logical decoding, we check all transactions to find the largest transaction. This could lead to a significant replication lag especially in the case where there are many subtransactions. This commit improves the eviction algorithm in ReorderBuffer using the max-heap with transaction size as the key to efficiently find the largest transaction. The max-heap starts with empty. While the max-heap is empty, we don't do anything for the max-heap when updating the memory counter. Therefore, we get the largest transaction in O(N) time, where N is the number of transactions including top-level transactions and subtransactions. We build the max-heap just before selecting the largest transactions if the number of transactions being decoded is higher than the threshold, MAX_HEAP_TXN_COUNT_THRESHOLD. After building the max-heap, we also update the max-heap when updating the memory counter. The intention is to efficiently find the largest transaction in O(1) time instead of incurring the cost of memory counter updates (O(log N)). Once the number of transactions got lower than the threshold, we reset the max-heap. The performance benchmark results showed significant speed up (more than x30 speed up on my machine) in decoding a transaction with 100k subtransactions, whereas there is no visible overhead in other cases. Reviewed-by: Amit Kapila, Hayato Kuroda, Vignesh C, Ajin Cherian, Tomas Vondra, Shubham Khanna, Peter Smith, Álvaro Herrera, Euler Taveira Discussion: https://postgr.es/m/CAD21AoAfKTgrBrLq96GcTv9d6k97zaQcDM-rxfKEt4GSe0qnaQ%40mail.gmail.com	2024-04-03 11:40:42 +09:00
David Rowley	7487044d6c	Don't adjust ressortgroupref in generate_setop_child_grouplist() This is already done inside assignSortGroupRef(), therefore is redundant. Oversight from `66c0185a3`. Reported-by: Tom Lane Discussion: https://postgr.es/m/3703023.1711654574@sss.pgh.pa.us	2024-04-03 15:39:29 +13:00
Masahiko Sawada	b840508644	Add functions to binaryheap for efficient key removal and update. Previously, binaryheap didn't support updating a key and removing a node in an efficient way. For example, in order to remove a node from the binaryheap, the caller had to pass the node's position within the array that the binaryheap internally has. Removing a node from the binaryheap is done in O(log n) but searching for the key's position is done in O(n). This commit adds a hash table to binaryheap in order to track the position of each nodes in the binaryheap. That way, by using newly added functions such as binaryheap_update_up() etc., both updating a key and removing a node can be done in O(1) on an average and O(log n) in worst case. This is known as the indexed binary heap. The caller can specify to use the indexed binaryheap by passing indexed = true. The current code does not use the new indexing logic, but it will be used by an upcoming patch. Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian, Tomas Vondra, Shubham Khanna Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com	2024-04-03 10:44:21 +09:00
Masahiko Sawada	bcb14f4abc	Make binaryheap enlargeable. The node array space of the binaryheap is doubled when there is no available space. Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian, Tomas Vondra, Shubham Khanna Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com	2024-04-03 10:27:43 +09:00
Alexander Korotkov	2c91e13013	Move WaitLSNShmemInit() to CreateOrAttachShmemStructs() Thanks to Andres Freund, Thomas Munrom and David Rowley for investigating this issue. Discussion: https://postgr.es/m/CAPpHfdvap5mMLikt8CUjA0osAvCJHT0qnYeR3f84EJ_Kvse0mg%40mail.gmail.com	2024-04-03 02:55:03 +03:00
David Rowley	3b1a7eb289	Don't zero tuple_fraction when planning UNIONs with ORDER BYs Since `66c0185a3`, the planner is able to use Merge Append -> Unique to implement UNION queries and each subquery is prompted to produce Paths correctly sorted by the UNION's targetlist. Here we remove some now redundant code which was zeroing the tuple_fraction at the parent level. This will allow the planner to consider cheap startup paths when planning the UNION's subqueries. EXCEPT and INTERSECT set operations still have the tuple_fraction zeroed in generate_nonunion_paths(). These operations currently always read all of their subqueries' tuples. Reported-by: Tom Lane Discussion: https://postgr.es/m/3703023.1711654574@sss.pgh.pa.us	2024-04-03 11:40:33 +13:00
Alexander Korotkov	06c418e163	Implement pg_wal_replay_wait() stored procedure pg_wal_replay_wait() is to be used on standby and specifies waiting for the specific WAL location to be replayed before starting the transaction. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes on standby. The queue of waiters is stored in the shared memory array sorted by LSN. During replay of WAL waiters whose LSNs are already replayed are deleted from the shared memory array and woken up by setting of their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working in a non-atomic context, not a function. Catversion is bumped. Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru Author: Kartyshov Ivan, Alexander Korotkov Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira	2024-04-02 22:48:03 +03:00
Tom Lane	6faca9ae28	Avoid deadlock during orphan temp table removal. If temp tables have dependencies (such as sequences) then it's possible for autovacuum's cleanup of orphan temp tables to deadlock against an incoming backend that's trying to clean out the temp namespace for its own use. That can happen because RemoveTempRelations' performDeletion call can visit objects within the namespace in an order different from the order in which a per-table deletion will visit them. To fix, observe that performDeletion will begin by taking an exclusive lock on the temp namespace (even though it won't actually delete it). So, if we can get a shared lock on the namespace, we can be sure we're not running concurrently with RemoveTempRelations, while also not conflicting with ordinary use of the namespace. This requires introducing a conditional version of LockDatabaseObject, but that's no big deal. (It's surprising we've got along without that this long.) Report and patch by Mikhail Zhilin. Back-patch to all supported branches. Discussion: https://postgr.es/m/c43ce028-2bc2-4865-9b89-3f706246eed5@postgrespro.ru	2024-04-02 14:59:32 -04:00
Nathan Bossart	4133c1f45c	Avoid function call overhead of pg_popcount() in syslogger.c. Instead of calling the pg_popcount() function for a single byte, we can look up the value in the pg_number_of_ones array. Discussion: https://postgr.es/m/20240401221117.GB2362108%40nathanxps13	2024-04-02 10:32:49 -05:00
Nathan Bossart	6687430c98	Refactor code for setting pg_popcount* function pointers. Presently, there are three copies of this code, and a proposed follow-up patch would add more code to each copy. This commit introduces a new inline function for this code and makes use of it in the pg_popcount*_choose functions, thereby reducing code duplication. Author: Paul Amonson Discussion: https://postgr.es/m/BL1PR11MB5304097DF7EA81D04C33F3D1DCA6A%40BL1PR11MB5304.namprd11.prod.outlook.com	2024-04-02 10:16:00 -05:00
Tom Lane	38698dd38e	Unwind #if spaghetti in hmac_openssl.c a bit. Make this code a little less confusing by defining a separate macro that controls whether we'll use ResourceOwner facilities to track the existence of a pg_hmac_ctx context. The proximate reason to touch this is that since `b8bff07da`, we got "unused function" warnings if building with older OpenSSL, because the #if guards around the ResourceOwner wrapper function definitions were different from those around the calls of those functions. Pulling the ResourceOwner machinations outside of the #ifdef HAVE_xxx guards fixes that and makes the code clearer too. Discussion: https://postgr.es/m/1394271.1712016101@sss.pgh.pa.us	2024-04-02 10:41:44 -04:00
Robert Haas	cafe105655	Allow SIGINT to cancel psql database reconnections. After installing the SIGINT handler in psql, SIGINT can no longer cancel database reconnections. For instance, if the user starts a reconnection and then needs to do some form of interaction (ie psql is polling), there is no way to cancel the reconnection process currently. Use PQconnectStartParams() in order to insert a cancel_pressed check into the polling loop. Tristan Partin, reviewed by Gurjeet Singh, Heikki Linnakangas, Jelte Fennema-Nio, and me. Discussion: http://postgr.es/m/D08WWCPVHKHN.3QELIKZJ2D9RZ@neon.tech	2024-04-02 10:26:10 -04:00
Robert Haas	f5e4dedfa8	Expose PQsocketPoll via libpq This is useful when connecting to a database asynchronously via PQconnectStart(), since it handles deciding between poll() and select(), and some of the required boilerplate. Tristan Partin, reviewed by Gurjeet Singh, Heikki Linnakangas, Jelte Fennema-Nio, and me. Discussion: http://postgr.es/m/D08WWCPVHKHN.3QELIKZJ2D9RZ@neon.tech	2024-04-02 10:15:56 -04:00
Thomas Munro	b5a9b18cd0	Provide API for streaming relation data. Introduce an abstraction allowing relation data to be accessed as a stream of buffers, with an implementation that is more efficient than the equivalent sequence of ReadBuffer() calls. Client code supplies a callback that can say which block number it wants next, and then consumes individual buffers one at a time from the stream. This division puts read_stream.c in control of how far ahead it can see and allows it to read clusters of neighboring blocks with StartReadBuffers(). It also issues POSIX_FADV_WILLNEED advice ahead of time when random access is detected. Other variants of I/O stream will be proposed in future work (for example to support recovery, whose LsnReadQueue device in xlogprefetcher.c is a distant cousin of this code and should eventually be replaced by this), but this basic API is sufficient for many common executor usage patterns involving predictable access to a single fork of a single relation. Several patches using this API are proposed separately. This stream concept is loosely based on ideas from Andres Freund on how we should pave the way for later work on asynchronous I/O. Author: Thomas Munro <thomas.munro@gmail.com> Author: Heikki Linnakangas <hlinnaka@iki.fi> (contributions) Author: Melanie Plageman <melanieplageman@gmail.com> (contributions) Suggested-by: Andres Freund <andres@anarazel.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Tested-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com	2024-04-03 00:49:46 +13:00
Thomas Munro	210622c60e	Provide vectored variant of ReadBuffer(). Break ReadBuffer() up into two steps. StartReadBuffers() and WaitReadBuffers() give us two main advantages: 1. Multiple consecutive blocks can be read with one system call. 2. Advice (hints of future reads) can optionally be issued to the kernel ahead of time. The traditional ReadBuffer() function is now implemented in terms of those functions, to avoid duplication. A new GUC io_combine_limit is defined, and the functions for limiting per-backend pin counts are made into public APIs. Those are provided for use by callers of StartReadBuffers(), when deciding how many buffers to read at once. The following commit will add a higher level mechanism for doing that automatically with a practical interface. With some more infrastructure in later work, StartReadBuffers() could be extended to start real asynchronous I/O instead of just issuing advice and leaving WaitReadBuffers() to do the work synchronously. Author: Thomas Munro <thomas.munro@gmail.com> Author: Andres Freund <andres@anarazel.de> (some optimization tweaks) Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Andres Freund <andres@anarazel.de> Tested-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com	2024-04-03 00:23:20 +13:00
Alvaro Herrera	13b3b62746	Don't use the pg_am system catalog in new test This causes deadlocks because it's a highly trafficked catalog. Use a regular table created by the same test instead. Discussion: https://postgr.es/m/f3e61e27-19d0-5e40-3eb2-53282fa0532a@gmail.com	2024-04-02 13:10:16 +02:00
Alexander Korotkov	867cc7b6dd	Revert "Custom reloptions for table AM" This reverts commit `c95c25f9af` due to multiple design issues spotted after commit. Reported-by: Jeff Davis Discussion: https://postgr.es/m/11550b536211d5748bb2865ed6cb3502ff073bf7.camel%40j-davis.com	2024-04-02 11:29:00 +03:00
Masahiko Sawada	667e65aac3	Use TidStore for dead tuple TIDs storage during lazy vacuum. Previously, we used a simple array for storing dead tuple IDs during lazy vacuum, which had a number of problems: * The array used a single allocation and so was limited to 1GB. * The allocation was pessimistically sized according to table size. * Lookup with binary search was slow because of poor CPU cache and branch prediction behavior. This commit replaces that array with the TID store from commit `30e144287a`. Since the backing radix tree makes small allocations as needed, the 1GB limit is now gone. Further, the total memory used is now often smaller by an order of magnitude or more, depending on the distribution of blocks and offsets. These two features should make multiple rounds of heap scanning and index cleanup an extremely rare event. TID lookup during index cleanup is also several times faster, even more so when index order is correlated with heap tuple order. Since there is no longer a predictable relationship between the number of dead tuples vacuumed and the space taken up by their TIDs, the number of tuples no longer provides any meaningful insights for users, nor is the maximum number predictable. For that reason this commit also changes to byte-based progress reporting, with the relevant columns of pg_stat_progress_vacuum renamed accordingly to max_dead_tuple_bytes and dead_tuple_bytes. For parallel vacuum, both the TID store and supplemental information specific to vacuum are shared among the parallel vacuum workers. As with the previous array, we don't take any locks on TidStore during parallel vacuum since writes are still only done by the leader process. Bump catalog version. Reviewed-by: John Naylor, (in an earlier version) Dilip Kumar Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com	2024-04-02 10:15:37 +09:00
David Rowley	d5d2205c8d	Fix assert failure when planning setop subqueries with CTEs `66c0185a3` adjusted the UNION planner to request that union child queries produce Paths correctly ordered to implement the UNION by way of MergeAppend followed by Unique. The code there made a bad assumption that if the root->parent_root->parse had setOperations set that the query must be the child subquery of a set operation. That's not true when it comes to planning a non-inlined CTE which is parented by a set operation. This causes issues as the CTE's targetlist has no requirement to match up to the SetOperationStmt's groupClauses Fix this by adding a new parameter to both subquery_planner() and grouping_planner() to explicitly pass the SetOperationStmt only when planning set operation child subqueries. Thank you to Tom Lane for helping to rationalize the decision on the best function signature for subquery_planner(). Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/242fc7c6-a8aa-2daf-ac4c-0a231e2619c1@gmail.com	2024-04-02 12:15:45 +13:00
Tom Lane	3622c80846	Avoid "unused variable" warning on non-USE_SSL_ENGINE platforms. If we are building with openssl but USE_SSL_ENGINE didn't get set, initialize_SSL's variable "pkey" is declared but used nowhere. Apparently this combination hasn't been exercised in the buildfarm before now, because I've not seen this warning before, even though the code has been like this a long time. Move the declaration to silence the warning (and remove its useless initialization). Per buildfarm member sawshark. Back-patch to all supported branches.	2024-04-01 19:01:36 -04:00
Heikki Linnakangas	3d0f730bf1	Introduce 'options' argument to heap_page_prune() Currently there is only one option, HEAP_PAGE_PRUNE_MARK_UNUSED_NOW which replaces the old boolean argument, but upcoming patches will introduce at least one more. Having a lot of boolean arguments makes it hard to see at the call sites what the arguments mean, so prefer a bitmask of options with human-readable names. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://www.postgresql.org/message-id/20240401172219.fngjosaqdgqqvg4e@liskov	2024-04-02 00:56:05 +03:00
Tom Lane	959b38d770	Invent --transaction-size option for pg_restore. This patch allows pg_restore to wrap its commands into transaction blocks, somewhat like --single-transaction, except that we commit and start a new block after every N objects. Using this mode with a size limit of 1000 or so objects greatly reduces the number of transactions consumed by the restore, while preventing any one transaction from taking enough locks to overrun the receiving server's shared lock table. (A value of 1000 works well with the default lock table size of around 6400 locks. Higher --transaction-size values can be used if one has increased the receiving server's lock table size.) Excessive consumption of XIDs has been reported as a problem for pg_upgrade in particular, but it could be bad for any restore; and the change also reduces the number of fsyncs and amount of WAL generated, so it should provide speed benefits too. This patch does not try to make parallel workers batch the SQL commands they issue. The trouble with doing that is that other workers may need to see the objects a worker creates right away. Possibly this can be improved later. In this patch I have hard-wired pg_upgrade to use a transaction size of 1000 divided by the number of parallel restore jobs allowed (without that, we'd still be at risk of overrunning the shared lock table). Perhaps there would be value in adding another pg_upgrade option to allow user control of that, but I'm unsure that it's worth the trouble; I think few users would use it, and any who did would see not that much benefit compared to the default. Patch by me, but the original idea to batch SQL commands during restore is due to Robins Tharakan. Discussion: https://postgr.es/m/a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com	2024-04-01 16:46:24 -04:00
Tom Lane	a45c78e328	Rearrange pg_dump's handling of large objects for better efficiency. Commit `c0d5be5d6` caused pg_dump to create a separate BLOB metadata TOC entry for each large object (blob), but it did not touch the ancient decision to put all the blobs' data into a single "BLOBS" TOC entry. This is bad for a few reasons: for databases with millions of blobs, the TOC becomes unreasonably large, causing performance issues; selective restore of just some blobs is quite impossible; and we cannot parallelize either dump or restore of the blob data, since our architecture for that relies on farming out whole TOC entries to worker processes. To improve matters, let's group multiple blobs into each blob metadata TOC entry, and then make corresponding per-group blob data TOC entries. Selective restore using pg_restore's -l/-L switches is then possible, though only at the group level. (Perhaps we should provide a switch to allow forcing one-blob-per-group for users who need precise selective restore and don't have huge numbers of blobs. This patch doesn't do that, instead just hard-wiring the maximum number of blobs per entry at 1000.) The blobs in a group must all have the same owner, since the TOC entry format only allows one owner to be named. In this implementation we also require them to all share the same ACL (grants); the archive format wouldn't require that, but pg_dump's representation of DumpableObjects does. It seems unlikely that either restriction will be problematic for databases with huge numbers of blobs. The metadata TOC entries now have a "desc" string of "BLOB METADATA", and their "defn" string is just a newline-separated list of blob OIDs. The restore code has to generate creation commands, ALTER OWNER commands, and drop commands (for --clean mode) from that. We would need special-case code for ALTER OWNER and drop in any case, so the alternative of keeping the "defn" as directly executable SQL code for creation wouldn't buy much, and it seems like it'd bloat the archive to little purpose. Since we require the blobs of a metadata group to share the same ACL, we can furthermore store only one copy of that ACL, and then make pg_restore regenerate the appropriate commands for each blob. This saves space in the dump file not only by removing duplicative SQL command strings, but by not needing a separate TOC entry for each blob's ACL. In turn, that reduces client-side memory requirements for handling many blobs. ACL TOC entries that need this special processing are labeled as "ACL"/"LARGE OBJECTS nnn..nnn". If we have a blob with a unique ACL, continue to label it as "ACL"/"LARGE OBJECT nnn". We don't actually have to make such a distinction, but it saves a few cycles during restore for the easy case, and it seems like a good idea to not change the TOC contents unnecessarily. The data TOC entries ("BLOBS") are exactly the same as before, except that now there can be more than one, so we'd better give them identifying tag strings. Also, commit `c0d5be5d6` put the new BLOB metadata TOC entries into SECTION_PRE_DATA, which perhaps is defensible in some ways, but it's a rather odd choice considering that we go out of our way to treat blobs as data. Moreover, because parallel restore handles the PRE_DATA section serially, this means we'd only get part of the parallelism speedup we could hope for. Move these entries into SECTION_DATA, letting us parallelize the lo_create calls not just the data loading when there are many blobs. Add dependencies to ensure that we won't try to load data for a blob we've not yet created. As this stands, we still generate a separate TOC entry for any comment or security label attached to a blob. I feel comfortable in believing that comments and security labels on blobs are rare, so this patch should be enough to get most of the useful TOC compression for blobs. We have to bump the archive file format version number, since existing versions of pg_restore wouldn't know they need to do something special for BLOB METADATA, plus they aren't going to work correctly with multiple BLOBS entries or multiple-large-object ACL entries. The directory and tar-file format handlers need some work for multiple BLOBS entries: they used to hard-wire the file name as "blobs.toc", which is replaced here with "blobs_<dumpid>.toc". The 002_pg_dump.pl test script also knows about that and requires minor updates. (I had to drop the test for manually-compressed blobs.toc files with LZ4, because lz4's obtuse command line design requires explicit specification of the output file name which seems impractical here. I don't think we're losing any useful test coverage thereby; that test stanza seems completely duplicative with the gzip and zstd cases anyway.) In passing, centralize management of the lo_buf used to hold data while restoring blobs. The code previously had each format handler create lo_buf, which seems rather pointless given that the format handlers all make it the same way. Moreover, the format handlers never use lo_buf directly, making this setup a failure from a separation-of-concerns standpoint. Let's move the responsibility into pg_backup_archiver.c, which is the only module concerned with lo_buf. The reason to do this in this patch is that it allows a centralized fix for the now-false assumption that we never restore blobs in parallel. Also, get rid of dead code in DropLOIfExists: it's been a long time since we had any need to be able to restore to a pre-9.0 server. Discussion: https://postgr.es/m/a9f9376f1c3343a6bb319dce294e20ac@EX13D05UWC001.ant.amazon.com	2024-04-01 16:25:56 -04:00
Tom Lane	5eac8cef24	Avoid possible longjmp-induced logic error in PLy_trigger_build_args. The "pltargs" variable wasn't marked volatile, which makes it unsafe to change its value within the PG_TRY block. It looks like the worst outcome would be to fail to release a refcount on Py_None during an (improbable) error exit, which would likely go unnoticed in the field. Still, it's a bug. A one-liner fix could be to mark pltargs volatile, but on the whole it seems cleaner to arrange things so that we don't change its value within PG_TRY. Per report from Xing Guo. This has been there for quite awhile, so back-patch to all supported branches. Discussion: https://postgr.es/m/CACpMh+DLrk=fDv07MNpBT4J413fDAm+gmMXgi8cjPONE+jvzuw@mail.gmail.com	2024-04-01 15:15:03 -04:00
Tom Lane	91cbb4b492	Fix assorted resource leaks in new pg_createsubscriber code. Various error paths did not release resources before returning. While it's likely that the program would just exit shortly later, none of the functions in question have summary exit(1) calls, so they should not be assuming that. Ranier Vilela and Tom Lane, per reports from Coverity Discussion: https://postgr.es/m/CAEudQAr2_SZFxB4kXJiL4+2UaNZxUk5UBJtj0oXyJYMGZu-03g@mail.gmail.com	2024-04-01 13:47:49 -04:00
Heikki Linnakangas	6f47f68831	Handle non-chain tuples outside of heap_prune_chain() Handle dead branches of aborted HOT chains outside heap_prune_chain() as a separate phase. This simplifies the logic in heap_prune_chain(), as well as allowing us to clean up more RECENTLY_DEAD -> DEAD chains. To accomplish this efficiently, partition tuples into HOT and non-HOT while first collecting visibility information for each tuple in heap_page_prune(). Then call heap_prune_chain() only on potential chain members. Then mop up the leftover HOT tuples afterwards. As part of this, keep track of which items on page have already been processed, in 'processed' array. This replaces the 'marked' array which was only set for tuples marked for removal or redirection. The 'processed' array is updated also for items that are left unchanged, when we conclude that an item can be left unchanged. At the end of pruning, every item on the page should be marked as processed in the array; an assertion is added for that. Author: Melanie Plageman <melanieplageman@gmail.com> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://www.postgresql.org/message-id/20240330055710.kqg6ii2cdojsxgje@liskov	2024-04-01 20:33:50 +03:00
Heikki Linnakangas	7aa00f1360	Refactor heap_prune_chain() Keep track of the number of deleted tuples in PruneState and record this information when recording a tuple dead, unused or redirected. This removes a special case from the traversal and chain processing logic as well as setting a precedent of recording the impact of prune actions in the record functions themselves. This paradigm will be used in future commits which move tracking of additional statistics on pruning actions from lazy_scan_prune() to heap_prune_chain(). Simplify heap_prune_chain()'s chain traversal logic by handling each case explicitly. That is, do not attempt to share code when processing different types of chains. For each category of chain, process it specifically and procedurally: first handling the root, then any intervening tuples, and, finally, the end of the chain. While we are at it, add a few new comments to heap_prune_chain() clarifying some special cases involving RECENTLY_DEAD tuples. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240330055710.kqg6ii2cdojsxgje@liskov	2024-04-01 13:28:44 +03:00
Heikki Linnakangas	9917e79d99	Minor refactoring in heap_page_prune Pass 'page', 'blockno' and 'maxoff' to heap_prune_chain() as arguments, so that it doesn't need to fetch them from the buffer. This saves a few cycles per chain. Remove the "if (off_loc != NULL)" checks, and require the caller to pass a non-NULL 'off_loc'. Pass a pointer to a dummy local variable when it's not needed. Those checks are cheap, but it's still better to avoid them in the per-chain loops when we can do so easily. The CPU time saving from these changes are hardly measurable, but fewer instructions is good anyway, so why not. I spotted the potential for these while reviewing Melanie Plageman's patch set to combine prune and freeze records. Discussion: https://www.postgresql.org/message-id/CAAKRu_abm2tHhrc0QSQa%3D%3DsHe%3DVA1%3Doz1dJMQYUOKuHmu%2B9Xrg%40mail.gmail.com	2024-04-01 12:07:30 +03:00
Masahiko Sawada	f5a227895e	Add new COPY option LOG_VERBOSITY. This commit adds a new COPY option LOG_VERBOSITY, which controls the amount of messages emitted during processing. Valid values are 'default' and 'verbose'. This is currently used in COPY FROM when ON_ERROR option is set to ignore. If 'verbose' is specified, a NOTICE message is emitted for each discarded row, providing additional information such as line number, column name, and the malformed value. This helps users to identify problematic rows that failed to load. Author: Bharath Rupireddy Reviewed-by: Michael Paquier, Atsushi Torikoshi, Masahiko Sawada Discussion: https://www.postgresql.org/message-id/CALj2ACUk700cYhx1ATRQyRw-fBM%2BaRo6auRAitKGff7XNmYfqQ%40mail.gmail.com	2024-04-01 15:25:25 +09:00
John Naylor	f4ad0021af	Revert "Speed up tail processing when hashing aligned C strings" This reverts commit `07f0f6abfc`. This has shown failures on both Valgrind and big-endian machines, per members skink and pike.	2024-03-31 14:18:36 +07:00
John Naylor	07f0f6abfc	Speed up tail processing when hashing aligned C strings After encountering the NUL terminator, the word-at-a-time loop exits and we must hash the remaining bytes. Previously we calculated the terminator's position and re-loaded the remaining bytes from the input string. We already have all the data we need in a register, so let's just mask off the bytes we need and hash them immediately. The mask can be cheaply computed without knowing the terminator's position. We still need that position for the length calculation, but the CPU can now do that in parallel with other work, shortening the dependency chain. Ants Aasma and John Naylor Discussion: https://postgr.es/m/CANwKhkP7pCiW_5fAswLhs71-JKGEz1c1%2BPC0a_w1fwY4iGMqUA%40mail.gmail.com	2024-03-31 12:19:16 +07:00
Alexander Korotkov	b1484a3f19	Let table AM insertion methods control index insertion Previously, the executor did index insert unconditionally after calling table AM interface methods tuple_insert() and multi_insert(). This commit introduces the new parameter insert_indexes for these two methods. Setting '*insert_indexes' to true saves the current logic. Setting it to false indicates that table AM cares about index inserts itself and doesn't want the caller to do that. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Reviewed-by: Pavel Borisov, Matthias van de Meent, Mark Dilger	2024-03-30 22:53:56 +02:00
Alexander Korotkov	c95c25f9af	Custom reloptions for table AM Let table AM define custom reloptions for its tables. This allows to specify AM-specific parameters by WITH clause when creating a table. The code may use some parts from prior work by Hao Wu. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Discussion: https://postgr.es/m/AMUA1wBBBxfc3tKRLLdU64rb.1.1683276279979.Hmail.wuhao%40hashdata.cn Reviewed-by: Reviewed-by: Pavel Borisov, Matthias van de Meent	2024-03-30 22:36:25 +02:00
Alexander Korotkov	27bc1772fc	Generalize relation analyze in table AM interface Currently, there is just one algorithm for sampling tuples from a table written in acquire_sample_rows(). Custom table AM can just redefine the way to get the next block/tuple by implementing scan_analyze_next_block() and scan_analyze_next_tuple() API functions. This approach doesn't seem general enough. For instance, it's unclear how to sample this way index-organized tables. This commit allows table AM to encapsulate the whole sampling algorithm (currently implemented in acquire_sample_rows()) into the relation_analyze() API function. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Reviewed-by: Pavel Borisov, Matthias van de Meent	2024-03-30 22:34:04 +02:00
Tom Lane	b154d8a6d0	Add pg_basetype() function to extract a domain's base type. This SQL-callable function behaves much like our internal utility function getBaseType(), except it returns NULL rather than failing for an invalid type OID. (That behavior is modeled on our experience with other catalog-inquiry functions such as the ACL checking functions.) The key advantage over doing a join to pg_type is that it will loop as needed to find the bottom base type of a nest of domains. Steve Chavez, reviewed by jian he and others Discussion: https://postgr.es/m/CAGRrpzZSX8j=MQcbCSEisFA=ic=K3bknVfnFjAv1diVJxFHJvg@mail.gmail.com	2024-03-30 13:57:19 -04:00
Dean Rasheed	0294df2f1f	Add support for MERGE ... WHEN NOT MATCHED BY SOURCE. This allows MERGE commands to include WHEN NOT MATCHED BY SOURCE actions, which operate on rows that exist in the target relation, but not in the data source. These actions can execute UPDATE, DELETE, or DO NOTHING sub-commands. This is in contrast to already-supported WHEN NOT MATCHED actions, which operate on rows that exist in the data source, but not in the target relation. To make this distinction clearer, such actions may now be written as WHEN NOT MATCHED BY TARGET. Writing WHEN NOT MATCHED without specifying BY SOURCE or BY TARGET is equivalent to writing WHEN NOT MATCHED BY TARGET. Dean Rasheed, reviewed by Alvaro Herrera, Ted Yu and Vik Fearing. Discussion: https://postgr.es/m/CAEZATCWqnKGc57Y_JanUBHQXNKcXd7r=0R4NEZUVwP+syRkWbA@mail.gmail.com	2024-03-30 10:00:26 +00:00
Jeff Davis	46e5441fa5	Add unicode_strtitle() for Unicode Default Case Conversion. This brings the titlecasing implementation for the builtin provider out of formatting.c and into unicode_case.c, along with unicode_strlower() and unicode_strupper(). Accepts an arbitrary word boundary callback. Simple for now, but can be extended to support the Unicode Default Case Conversion algorithm with full case mapping. Discussion: https://postgr.es/m/3bc653b5d562ae9e2838b11cb696816c328a489a.camel@j-davis.com Reviewed-by: Peter Eisentraut	2024-03-29 17:35:07 -07:00
Daniel Gustafsson	a96a8b15fa	Remove superfluous trailing semicolons Two semicolons were accidentally added to rows which were already terminated semicolons. While harmless, fix by removing these. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4_fnJ0+yOgFioswzLE7t6R8P6cqbuacFVeZqbESFAjs1A@mail.gmail.com	2024-03-29 23:51:43 +01:00
Jeff Davis	46a44dc372	Use version for builtin collations. Given that the version field already exists, there's little reason not to use it. Suggestion from Peter Eisentraut. Discussion: https://postgr.es/m/613c120a-5413-4fa7-a501-6590eae558f8@eisentraut.org Reviewed-by: Peter Eisentraut	2024-03-29 10:53:26 -07:00
Tom Lane	c2df2ed90a	Try to stabilize flappy test result. This recently-added test case checks the plan of an inner join between two identical tables. It's just chance which join order the planner will pick, and in the presence of any variation in the underlying statistics, the displayed plan might change. Add a WHERE condition to break the cost symmetry and hopefully stabilize matters. (We're still trying to understand exactly why the underlying statistics aren't as stable as intended, but this seems like a good change anyway, since this test would surely bite us again in future.) While here, clean up assorted comment spelling, grammar, and whitespace problems. Discussion: https://postgr.es/m/4168116.1711720146@sss.pgh.pa.us	2024-03-29 10:40:31 -04:00
Robert Haas	d3ae2a24f2	Add allow_alter_system GUC. This is marked PGC_SIGHUP, so it can only be set in a configuration file, not anywhere else; and it is also marked GUC_DISALLOW_IN_AUTO_FILE, so it can't be set using ALTER SYSTEM. When set to false, the ALTER SYSTEM command is disallowed. There was considerable concern that this would be misinterpreted as a security feature, which it is not, because a determined superuser has various ways of bypassing it. Hence, a lot of work has gone into wordsmithing the documentation, in the hopes of avoiding any such confusion. Jelte Fennemia-Nio and Gabriele Bartolini, with wording suggestions for the documentation from many others. Discussion: http://postgr.es/m/CA%2BVUV5rEKt2%2BCdC_KUaPoihMu%2Bi5ChT4WVNTr4CD5-xXZUfuQw%40mail.gmail.com	2024-03-29 08:45:11 -04:00
Tom Lane	0075d78947	Allow "internal" subtransactions in parallel mode. Allow use of BeginInternalSubTransaction() in parallel mode, so long as the subtransaction doesn't attempt to acquire an XID or increment the command counter. Given those restrictions, the other parallel processes don't need to know about the subtransaction at all, so this should be safe. The benefit is that it allows subtransactions intended for error recovery, such as pl/pgsql exception blocks, to be used in PARALLEL SAFE functions. Another reason for doing this is that the API of BeginInternalSubTransaction() doesn't allow reporting failure. pl/python for one, and perhaps other PLs, copes very poorly with an error longjmp out of BeginInternalSubTransaction(). The headline feature of this patch removes the only easily-triggerable failure case within that function. There remain some resource-exhaustion and similar cases, which we now deal with by promoting them to FATAL errors, so that callers need not try to clean up. (It is likely that such errors would leave us with corrupted transaction state inside xact.c, making recovery difficult if not impossible anyway.) Although this work started because of a report of a pl/python crash, we're not going to do anything about that in the back branches. Back-patching this particular fix is obviously not very wise. While we could contemplate some narrower band-aid, pl/python is already an untrusted language, so it seems okay to classify this as a "so don't do that" case. Patch by me, per report from Hao Zhang. Thanks to Robert Haas for review. Discussion: https://postgr.es/m/CALY6Dr-2yLVeVPhNMhuBnRgOZo1UjoTETgtKBx1B2gUi8yy+3g@mail.gmail.com	2024-03-28 12:43:10 -04:00
Alvaro Herrera	e2395cdbe8	ALTER TABLE: rework determination of access method ID Avoid setting an access method OID for relation kinds that don't take one. Code review for new feature added in `374c7a2290`. Author: Justin Pryzby <pryzby@telsasoft.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://postgr.es/m/e5516ac1-5264-c3c0-d822-9e6f614ea93b@gmail.com	2024-03-28 16:51:20 +01:00
Tom Lane	be98a550cc	Update comment in set_dummy_rel_pathlist(). This comment claimed that set_dummy_rel_pathlist() has callers other than (possibly indirectly) set_rel_size(). It doesn't, so revise the argument to not rely on that. Noted by Richard Guo. Discussion: https://postgr.es/m/CAMbWs4-KFEU_fDuJPNCOkUu3rwvZvKBEytkd9VrM4kH4-2h1CQ@mail.gmail.com	2024-03-28 11:44:00 -04:00
Alvaro Herrera	213c959a29	Remove translation markers from libpq-be-fe-helpers.h Apparently these markers cause the modules to not link correctly in some platforms, at least per buildfarm member indri; moreover, this code is only used in modules that don't have a translation. If we someday add i18n support to contrib/ it might be worth revisiting this.	2024-03-28 13:12:12 +01:00
Alvaro Herrera	2466d6654f	libpq-be-fe-helpers.h: wrap new cancel APIs Commit `61461a300c` introduced new functions to libpq for cancelling queries. This commit introduces a helper function that backend-side libraries and extensions can use to invoke those. This function takes a timeout and can itself be interrupted while it is waiting for a cancel request to be sent and processed, instead of being blocked. This replaces the usage of the old functions in postgres_fdw and dblink. Finally, it also adds some test coverage for the cancel support in postgres_fdw. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQT_VgOWWENUqvUV9xQmbaCyXjtRRAYO8W07oqashk_N+g@mail.gmail.com	2024-03-28 11:31:03 +01:00
Heikki Linnakangas	427005742b	Remove obsolete comment about VACUUM retrying pruning Commit `1ccc1e05ae` removed the retry logic that the comment talked about. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240328015326.x5gnzsohl6j23b42@liskov	2024-03-28 10:18:48 +02:00
Masahiko Sawada	f1bb9284f7	Improve tab completion for ALTER TABLE ALTER COLUMN SET in psql. The commit changes the tab completion to add DATA TYPE after ALTER TABLE ... ALTER COLUMN ... SET. Author: Vignesh C Reviewed-by: Shubham Khanna, Masahiko Sawada Discussion: https://postgr.es/m/CALDaNm1aEdJb-QJi%3DGWStkfj_%2BEDUK_VtDkn%2BTjQ2z7HyU0MBw%40mail.gmail.com	2024-03-28 16:30:10 +09:00
Nathan Bossart	7188a7806d	Improve style of pg_lfind32(). This commit simplifies pg_lfind32() a bit by moving the standard one-by-one linear search code to an inline helper function. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20240327013616.GA3940109%40nathanxps13	2024-03-27 20:26:05 -05:00
Masahiko Sawada	2d8f56dabb	Rethink create and attach APIs of shared TidStore. Previously, the behavior of TidStoreCreate() was inconsistent between local and shared TidStore instances in terms of memory limitation. For local TidStore, a memory context was created with initial and maximum memory block sizes, as well as a minimum memory context size, based on the specified max_bytes values. However, for shared TidStore, the provided DSA area was used for TID storage. Although commit `bb952c8c8b` allowed specifying the initial and maximum DSA segment sizes, callers would have needed to clamp their own limits, which was not consistent and user-friendly. With this commit, when creating a shared TidStore, a dedicated DSA area is created for TID storage instead of using a provided DSA area. The initial and maximum DSA segment sizes are chosen based on the specified max_bytes. Other processes can attach to the shared TidStore using the handle of the created DSA returned by the new TidStoreGetDSA() function and the DSA pointer returned by TidStoreGetHandle(). The created DSA has the same lifetime as the shared TidStore and is deleted when all processes detach from it. To improve clarity, the TidStoreCreate() function has been divided into two separate functions: TidStoreCreateLocal() and TidStoreCreateShared(). Reviewed-by: John Naylor Discussion: https://postgr.es/m/CAD21AoAyc1j%3DBCdUqZfk6qbdjZ68UgRx1Gkpk0oah4K7S0Ri9g%40mail.gmail.com	2024-03-28 10:03:28 +09:00
Jeff Davis	b0be28761e	Run perltidy on generate-unicode_version.pl.	2024-03-27 13:21:29 -07:00
Tom Lane	a767cdc84c	Fix unnecessary use of moving-aggregate mode with non-moving frame. When a plain aggregate is used as a window function, and the window frame start is specified as UNBOUNDED PRECEDING, the frame's head cannot move so we do not need to use moving-aggregate mode. The check for that was put into initialize_peragg(), failing to notice that ExecInitWindowAgg() calls that function before it's filled in winstate->frameOptions. Since makeNode() would have zeroed the field, this didn't provoke uninitialized-value complaints, nor would the erroneous decision have resulted in more than a little inefficiency. Still, it's wrong, so move the initialization of winstate->frameOptions earlier to make it work properly. While here, also fix a thinko in a comment. Both errors crept in in commit `a9d9acbf2` which introduced the moving-aggregate mode. Spotted by Vallimaharajan G. Back-patch to all supported branches. Discussion: https://postgr.es/m/18e7f2a5167.fe36253866818.977923893562469143@zohocorp.com	2024-03-27 13:39:03 -04:00
Robert Haas	de7e96bd0f	Rename COMPAT_OPTIONS_CLIENT to COMPAT_OPTIONS_OTHER. The user-facing name is "Other Platforms and Clients", but the internal name seems too focused on clients specifically, especially given the plan to add a new setting to this session that is about platform or deployment model compatibility rather than client compatibility. Jelte Fennema-Nio Discussion: http://postgr.es/m/CAGECzQTfMbDiM6W3av+3weSnHxJvPmuTEcjxVvSt91sQBdOxuQ@mail.gmail.com	2024-03-27 10:45:28 -04:00
David Rowley	d6a6957d53	Fix unstable aggregate regression test Buildfarm member avocet has shown a plan change by switching the finalize aggregate stage to use a GroupAggregate rather than a HashAggregate. This is consistent with autovacuum having triggered on the table, per analysis by Alexander Lakhin. Fix this by disabling autovacuum on the table. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/d4493a28-589a-5328-fed5-250f2d7d3e2a@gmail.com Backpatch-through: 16, where this test was added.	2024-03-28 00:13:48 +13:00
Dean Rasheed	e6341323a8	Add functions to generate random numbers in a specified range. This adds 3 new variants of the random() function: random(min integer, max integer) returns integer random(min bigint, max bigint) returns bigint random(min numeric, max numeric) returns numeric Each returns a random number x in the range min <= x <= max. For the numeric function, the number of digits after the decimal point is equal to the number of digits that "min" or "max" has after the decimal point, whichever has more. The main entry points for these functions are in a new C source file. The existing random(), random_normal(), and setseed() functions are moved there too, so that they can all share the same PRNG state, which is kept private to that file. Dean Rasheed, reviewed by Jian He, David Zhang, Aleksander Alekseev, and Tomas Vondra. Discussion: https://postgr.es/m/CAEZATCV89Vxuq93xQdmc0t-0Y2zeeNQTdsjbmV7dyFBPykbV4Q@mail.gmail.com	2024-03-27 10:12:39 +00:00
Alexander Korotkov	818861eb57	Fix some typos and grammar issues from commit `87985cc925` Reported-by: Alexander Lakhin	2024-03-27 11:47:41 +02:00
Amit Kapila	677a45c4ae	Fix random failure in 004_subscription. After the upgrade, the failed test was ensuring that the changes made on the publisher should be replicated to the subscriber. We missed waiting for one of the subscriptions to catch up. Per buildfarm Author: Vignesh C Reviewed-by: Kuroda Hayato Discussion: https://postgr.es/m/CALDaNm0z=fLtio1h50K8WossUGXU+gy0H9y9=RYh1DDZiq2EDw@mail.gmail.com	2024-03-27 14:15:03 +05:30
Amit Kapila	6d49c8d4b4	Change last_inactive_time to inactive_since in pg_replication_slots. Commit `a11f330b55` added last_inactive_time to show the last time the slot was inactive. But, it tells the last time that a currently-inactive slot previously WAS active. This could be unclear, so we changed the name to inactive_since. Reported-by: Robert Haas Author: Bharath Rupireddy Reviewed-by: Bertrand Drouvot, Shveta Malik, Amit Kapila Discussion: https://postgr.es/m/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com Discussion: https://postgr.es/m/CALj2ACUXS0SfbHzsX8bqo+7CZhocsV52Kiu7OWGb5HVPAmJqnA@mail.gmail.com	2024-03-27 09:27:44 +05:30
Masahiko Sawada	bb952c8c8b	Allow specifying initial and maximum segment sizes for DSA. Previously, the DSA segment size always started with 1MB and grew up to DSA_MAX_SEGMENT_SIZE. It was inconvenient in certain scenarios, such as when the caller desired a soft constraint on the total DSA segment size, limiting it to less than 1MB. This commit introduces the capability to specify the initial and maximum DSA segment sizes when creating a DSA area, providing more flexibility and control over memory usage. Reviewed-by: John Naylor, Tomas Vondra Discussion: https://postgr.es/m/CAD21AoAYGGC1ePjVX0H%2Bpp9rH%3D9vuPK19nNOiu12NprdV5TVJA%40mail.gmail.com	2024-03-27 11:43:29 +09:00
Nathan Bossart	1f42337be5	Fix compiler warning for pg_lfind32(). The newly-introduced "one_by_one" label produces -Wunused-label warnings when building without SIMD support. To fix, move the label into the SIMD section of this function. Oversight in commit `7644a7340c`. Reported-by: Tom Lane Discussion: https://postgr.es/m/3189995.1711495704%40sss.pgh.pa.us	2024-03-26 20:27:46 -05:00
Tom Lane	9d00cf4772	Remove some redundant set_cheapest() calls. Commit `e2fa76d80` centralized the responsibility for doing set_cheapest() for a baserel, but these functions added later seemingly didn't get the memo. There's no apparent reason why we need the cheapest path for these relation types to be available any sooner than it is for other base relation types, so delete the duplicate calls. Doesn't save much since there's only one path in these cases, but it might improve clarity. Richard Guo Discussion: https://postgr.es/m/CAMbWs4-KFEU_fDuJPNCOkUu3rwvZvKBEytkd9VrM4kH4-2h1CQ@mail.gmail.com	2024-03-26 16:02:44 -04:00
Nathan Bossart	d365ae7054	Optimize roles_is_member_of() with a Bloom filter. When the list of roles gathered by roles_is_member_of() grows very large, a Bloom filter is created to help avoid some linear searches through the list. The threshold for creating the Bloom filter is set arbitrarily high and may require future adjustment. Suggested-by: Tom Lane Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAGvXd3OSMbJQwOSc-Tq-Ro1CAz%3DvggErdSG7pv2s6vmmTOLJSg%40mail.gmail.com	2024-03-26 14:43:37 -05:00
Tom Lane	fad3b5b5ac	Fix failure of ALTER FOREIGN TABLE SET SCHEMA to move sequences. Ordinary ALTER TABLE SET SCHEMA will also move any owned sequences into the new schema. We failed to do likewise for foreign tables, because AlterTableNamespaceInternal believed that only certain relkinds could have indexes, owned sequences, or constraints. We could simply add foreign tables to that relkind list, but it seems likely that the same oversight could be made again in future. Instead let's remove the relkind filter altogether. These functions shouldn't cost much when there are no objects that they need to process, and surely this isn't an especially performance-critical case anyway. Per bug #18407 from Vidushi Gupta. Back-patch to all supported branches. Discussion: https://postgr.es/m/18407-4fd07373d252c6a0@postgresql.org	2024-03-26 15:28:31 -04:00
Nathan Bossart	7644a7340c	Micro-optimize pg_lfind32(). This commit improves the performance of pg_lfind32() in many cases by modifying it to process the remaining "tail" of elements with SIMD instructions instead of processing them one-by-one. Since the SIMD code processes a large block of elements, this means that we will process a subset of elements more than once, but that won't affect the correctness of the result, and testing has shown that this helps more cases than it regresses. With this change, the standard one-by-one linear search code is only used for small arrays and for platforms without SIMD support. Suggested-by: John Naylor Reviewed-by: John Naylor Discussion: https://postgr.es/m/20231129171526.GA857928%40nathanxps13	2024-03-26 14:03:32 -05:00
Tom Lane	a65724dfa7	Propagate pathkeys from CTEs up to the outer query. If we know the sort order of a CTE's output, and it is relevant to the outer query, label the CTE's outer-query access path using those pathkeys. This may enable optimizations such as avoiding a sort in the outer query. The code for hoisting pathkeys into the outer query already exists for regular RTE_SUBQUERY subqueries, but it wasn't getting used for CTEs, possibly out of concern for maintaining an optimization fence between the CTE and the outer query. However, on the same arguments used for commit `f7816aec2`, there seems no harm in letting the outer query know what the inner query decided to do. In support of this, we now remember the best Path as well as Plan for each subquery for the rest of the planner run. There may be future applications for having that at hand, and it surely costs little to build one more List. Richard Guo (minor mods by me) Discussion: https://postgr.es/m/CAMbWs49xYd3f8CrE8-WW3--dV1zH_sDSDn-vs2DzHj81Wcnsew@mail.gmail.com	2024-03-26 13:05:51 -04:00
Bruce Momjian	e648e77e25	C comment: mention no doc for negative start of substring(text) Also add URL to hackers discussion. Backpatch-through: master	2024-03-26 12:27:41 -04:00
Tom Lane	8a92b70c11	Allow "make check"-style testing to work with musl C library. The musl dynamic linker saves a pointer to the process' environment value of LD_LIBRARY_PATH very early in startup. When we move/clobber the environment to make more room for ps status strings, we clobber that value and thereby prevent libraries from being found via LD_LIBRARY_PATH, which breaks the use of a temporary installation for testing purposes. To fix, stop collecting usable space for ps status if we notice that the variable we are about to clobber is LD_LIBRARY_PATH. This will result in some reduction in how long the ps status can be, but it's only likely to occur in temporary test contexts, so it doesn't seem like a big problem. In any case, we don't have to do it if we see we are on glibc, which surely is where the majority of our Linux testing is done. Thomas Munro, Bruce Momjian, and Tom Lane, per report from Wolfgang Walther. Back-patch to all supported branches, with the hope that we'll set up a buildfarm animal to test on this platform. Discussion: https://postgr.es/m/fddd1cd6-dc16-40a2-9eb5-d7fef2101488@technowledgy.de	2024-03-26 11:44:49 -04:00
Peter Eisentraut	89e5ef7e21	Remove ObjectClass type ObjectClass is an enum whose values correspond to catalog OIDs. But the extra layer of redirection, which is used only in small parts of the code, and the similarity to ObjectType, are confusing and cumbersome. One advantage has been that some switches processing the OCLASS enum don't have "default:" cases. This is so that the compiler tells us when we fail to add support for some new object class. But you can also handle that with some assertions and proper test coverage. It's not even clear how strong this benefit is. For example, in AlterObjectNamespace_oid(), you could still put a new OCLASS into the "ignore object types that don't have schema-qualified names" case, and it might or might not be wrong. Also, there are already various OCLASS switches that do have a default case, so it's not even clear what the preferred coding style should be. Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/CAGECzQT3caUbcCcszNewCCmMbCuyP7XNAm60J3ybd6PN5kH2Dw%40mail.gmail.com	2024-03-26 10:08:56 +01:00
Peter Eisentraut	8c4f2d5475	Message fixes for pg_createsubscriber Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/20240326.140116.1116279856046587865.horikyota.ntt@gmail.com	2024-03-26 08:22:46 +01:00
Masahiko Sawada	a0e22ef911	Fix inconsistent function prototypes with function definitions. Introduced by `30e144287a`. Reviewed-by: John Naylor Discussion: https://postgr.es/m/CAD21AoCaDT%2B-ZaVjbtvumms0tyyHPNLELK2UX-MLG9XCgioaNw%40mail.gmail.com	2024-03-26 13:13:26 +09:00
Masahiko Sawada	4edb37e322	Fix a calculation in TidStoreCreate(). Since we expect that the max_bytes is in bytes, not in kilobytes, it should not be multiplied by 1024. Introduced by `30e144287a`. Reported-by: John Naylor, David Rowley Reviewed-by: John Naylor Discussion: https://postgr.es/m/CANWCAZZTE-14ofsucofTuhFsfuDGBNf%3DNZb22TMYT8bxA41oQQ%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvojg82NDaDEpj1WEZSbVTafj%3DDRmW%2BFrkBdW8ScL4OFxA%40mail.gmail.com	2024-03-26 13:06:06 +09:00
Alexander Korotkov	41d3780d3d	Improve error message for tts_(virtual\|minimal)_is_current_xact_tuple Discussion: https://postgr.es/m/CALT9ZEHNeagO5PLb4Nv9J_ZaCtp%2BArdVmbSLc0RHUzx_RPAa4w%40mail.gmail.com Author: Pavel Borisov	2024-03-26 01:55:22 +02:00
Alexander Korotkov	10baee0c95	Add comments on some MinimalTupleSlots methods usage Discussion: https://postgr.es/m/CALT9ZEHNeagO5PLb4Nv9J_ZaCtp%2BArdVmbSLc0RHUzx_RPAa4w%40mail.gmail.com Author: Pavel Borisov	2024-03-26 01:53:34 +02:00
Alexander Korotkov	8ffc2aa720	Add EvalPlanQual delete returning isolation test Author: Andres Freund Reviewed-by: Pavel Borisov Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com	2024-03-26 01:28:05 +02:00
Alexander Korotkov	87985cc925	Allow locking updated tuples in tuple_update() and tuple_delete() Currently, in read committed transaction isolation mode (default), we have the following sequence of actions when tuple_update()/tuple_delete() finds the tuple updated by the concurrent transaction. 1. Attempt to update/delete tuple with tuple_update()/tuple_delete(), which returns TM_Updated. 2. Lock tuple with tuple_lock(). 3. Re-evaluate plan qual (recheck if we still need to update/delete and calculate the new tuple for update). 4. Second attempt to update/delete tuple with tuple_update()/tuple_delete(). This attempt should be successful, since the tuple was previously locked. This commit eliminates step 2 by taking the lock during the first tuple_update()/tuple_delete() call. The heap table access method saves some effort by checking the updated tuple once instead of twice. Future undo-based table access methods, which will start from the latest row version, can immediately place a lock there. Also, this commit makes tuple_update()/tuple_delete() optionally save the old tuple into the dedicated slot. That saves efforts on re-fetching tuples in certain cases. The code in nodeModifyTable.c is simplified by removing the nested switch/case. Discussion: https://postgr.es/m/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg%40mail.gmail.com Reviewed-by: Aleksander Alekseev, Pavel Borisov, Vignesh C, Mason Sharp Reviewed-by: Andres Freund, Chris Travers	2024-03-26 01:27:56 +02:00
Tom Lane	c7076ba6ad	Refactor predicate_{implied,refuted}_by_simple_clause. Put the node-type-dependent operations into switches on nodeTag. This should ease addition of new proof rules for other expression node types. There is no functional change, although some tests are made in a different order than before. Also, add a couple of new cross-checks in test_predtest.c. James Coleman (part of a larger patch series) Discussion: https://postgr.es/m/CAAaqYe8Bo4bf_i6qKj8KBsmHMYXhe3Xt6vOe3OBQnOaf3_XBWg@mail.gmail.com	2024-03-25 17:45:15 -04:00
Jeff Davis	bc5fcaa289	Clarify comment for LogicalTapeSetBlocks(). Discussion: https://postgr.es/m/1229327.1711160246@sss.pgh.pa.us Backpatch-through: 13	2024-03-25 11:51:44 -07:00
Nathan Bossart	3ff01b2b6e	Adjust pgbench option for debug mode. Many other utilities use -d to specify the database to use, but pgbench uses it to enable debug mode. This is causing some users to accidentally enable it. This commit changes -d to accept the database name and introduces --dbname. Debug mode can still be enabled with --debug. This is a backward-incompatible change, but it has been judged to be worth the trade-off, i.e., some scripts that use pgbench will need to be updated. Author: Greg Sabino Mullane Reviewed-by: Tomas Vondra, Euler Taveira, Alvaro Herrera, David Christensen Discussion: https://postgr.es/m/CAKAnmmLjAzwVtb%3DVEaeuCtnmOLpzkJ1uJ_XiQ362YdD9B72HSg%40mail.gmail.com	2024-03-25 11:08:53 -05:00
Alvaro Herrera	374c7a2290	Allow specifying an access method for partitioned tables It's now possible to specify a table access method via CREATE TABLE ... USING for a partitioned table, as well change it with ALTER TABLE ... SET ACCESS METHOD. Specifying an AM for a partitioned table lets the value be used for all future partitions created under it, closely mirroring the behavior of the TABLESPACE option for partitioned tables. Existing partitions are not modified. For a partitioned table with no AM specified, any new partitions are created with the default_table_access_method. Also add ALTER TABLE ... SET ACCESS METHOD DEFAULT, which reverts to the original state of using the default for new partitions. The relcache of partitioned tables is not changed: rd_tableam is not set, even if a partitioned table has a relam set. Author: Justin Pryzby <pryzby@telsasoft.com> Author: Soumyadeep Chakraborty <soumyadeep2007@gmail.com> Author: Michaël Paquier <michael@paquier.xyz> Reviewed-by: The authors themselves Discussion: https://postgr.es/m/CAE-ML+9zM4wJCGCBGv01k96qQ3gFv4WFcFy=zqPHKeaEFwwv6A@mail.gmail.com Discussion: https://postgr.es/m/20210308010707.GA29832%40telsasoft.com	2024-03-25 16:30:36 +01:00
Daniel Gustafsson	b8528fe026	Fix typo in comment Spotted while reviewing a patch changing things around this area.	2024-03-25 14:49:17 +01:00
Daniel Gustafsson	b2d6b4c728	ecpg: Fix return code for overflow in numeric conversion The decimal conversion functions dectoint and dectolong are documented to return ECPG_INFORMIX_NUM_OVERFLOW in case of overflows, but always returned -1 on all errors due to incorrectly checking the returnvalue from the PGTYPES* functions. Author: Aidar Imamov <a.imamov@postgrespro.ru> Discussion: https://postgr.es/m/54d2b53327516d9454daa5fb2f893bdc@postgrespro.ru	2024-03-25 14:18:36 +01:00
Daniel Gustafsson	64e401b62b	Fix indentation from `a11f330b5` Per buildfarm animal koel	2024-03-25 14:18:33 +01:00
Heikki Linnakangas	f83d709760	Merge prune, freeze and vacuum WAL record formats The new combined WAL record is now used for pruning, freezing and 2nd pass of vacuum. This is in preparation for changing VACUUM to write a combined prune+freeze record per page, instead of separate two records. The new WAL record format now supports that, but the code still always writes separate records for pruning and freezing. This reserves separate XLOG_HEAP2_* info codes for when the pruning record is emitted for on-access pruning or VACUUM, per Peter Geoghegan's suggestion. The record format is identical, but having separate info codes makes it easier analyze pruning and vacuuming with pg_waldump. The function to emit the new WAL record, log_heap_prune_and_freeze(), is in pruneheap.c. The existing heap_log_freeze_plan() and its subroutines are moved to pruneheap.c without changes, to keep them together with log_heap_prune_and_freeze(). Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/CAAKRu_azf-zH%3DDgVbquZ3tFWjMY1w5pO8m-TXJaMdri8z3933g@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAAKRu_b2oE4GL%3Dq4g9mcByS9yT7wTQvEH9OLpabj28e%2BWKFi2A@mail.gmail.com	2024-03-25 14:59:58 +02:00
Peter Eisentraut	d44032d014	pg_createsubscriber: creates a new logical replica from a standby server It must be run on the target server and should be able to connect to the source server (publisher) and the target server (subscriber). All tables in the specified database(s) are included in the logical replication setup. A pair of publication and subscription objects are created for each database. The main advantage of pg_createsubscriber over the common logical replication setup is the initial data copy. It also reduces the catchup phase. Some prerequisites must be met to successfully run it. It is basically the logical replication requirements. It starts creating a publication using FOR ALL TABLES and a replication slot for each specified database. Write recovery parameters into the target data directory and start the target server. It specifies the LSN of the last replication slot (replication start point) up to which the recovery will proceed. Wait until the target server is promoted. Create one subscription per specified database (using publication and replication slot created in a previous step) on the target server. Set the replication progress to the replication start point for each subscription. Enable the subscription for each specified database on the target server. And finally, change the system identifier on the target server. Author: Euler Taveira <euler.taveira@enterprisedb.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Vignesh C <vignesh21@gmail.com> Reviewed-by: Shubham Khanna <khannashubham1197@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/5ac50071-f2ed-4ace-a8fd-b892cffd33eb@www.fastmail.com	2024-03-25 12:42:47 +01:00
Amit Kapila	a11f330b55	Track last_inactive_time in pg_replication_slots. This commit adds a new property called last_inactive_time for slots. It is set to 0 whenever a slot is made active/acquired and set to the current timestamp whenever the slot is inactive/released or restored from the disk. Note that we don't set the last_inactive_time for the slots currently being synced from the primary to the standby because such slots are typically inactive as decoding is not allowed on those. The 'last_inactive_time' will be useful on production servers to debug and analyze inactive replication slots. It will also help to know the lifetime of a replication slot - one can know how long a streaming standby, logical subscriber, or replication slot consumer is down. The 'last_inactive_time' will also be useful to implement inactive timeout-based replication slot invalidation in a future commit. Author: Bharath Rupireddy Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com	2024-03-25 16:34:33 +05:30
Amit Langote	0f7863afef	Code review for `6190d828cd` * Fix the comment of init_dummy_sjinfo() to remove references to non-existing parameters 'rel1' and 'rel2'. * Adjust consider_new_or_clause() to call init_dummy_sjinfo() to make up a SpecialJoinInfo for inner joins like other code sites that were adjusted in `6190d828cd` to do so. Author: Richard Guo <guofenglinux@gmail.com> Reported-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAExHW5tHqEf3ASVqvFFcghYGPfpy7o3xnvhHwBGbJFMRH8KjNw@mail.gmail.com	2024-03-25 19:45:27 +09:00
Alexander Korotkov	cc0e7ebd30	reindexdb: Fix warning about uninitialized indices_tables_cell Initialize indices_tables_cell with NULL to silence the warning. Also, refactor the place of the first assignment of indices_tables_cell. Reported-by: Thomas Munro, David Rowley, Tom Lane, Richard Guo Discussion: https://postgr.es/m/2348025.1711332418%40sss.pgh.pa.us Discussion: https://postgr.es/m/E1roXs4-005UdX-1V%40gemulon.postgresql.org	2024-03-25 11:40:25 +02:00
Amit Langote	6190d828cd	Do not translate dummy SpecialJoinInfos for child joins This teaches build_child_join_sjinfo() to create the dummy SpecialJoinInfos (those created for inner joins) directly for a given child join, skipping the unnecessary overhead of translating the parent joinrel's SpecialJoinInfo. To that end, this commit moves the code to initialize the dummy SpecialJoinInfos to a new function named init_dummy_sjinfo() and changes the few existing sites that have this code and build_child_join_sjinfo() to call this new function. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CAExHW5tHqEf3ASVqvFFcghYGPfpy7o3xnvhHwBGbJFMRH8KjNw@mail.gmail.com	2024-03-25 18:06:47 +09:00
Amit Langote	5278d0a2e8	Reduce memory used by partitionwise joins Specifically, this commit reduces the memory consumed by the SpecialJoinInfos that are allocated for child joins in try_partitionwise_join() by freeing them at the end of creating paths for each child join. A SpecialJoinInfo allocated for a given child join is a copy of the parent join's SpecialJoinInfo, which contains the translated copies of the various Relids bitmapsets and semi_rhs_exprs, which is a List of Nodes. The newly added freeing step frees the struct itself and the various bitmapsets, but not semi_rhs_exprs, because there's no handy function to free the memory of Node trees. Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Andrey Lepikhov <a.lepikhov@postgrespro.ru> Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://postgr.es/m/CAExHW5tHqEf3ASVqvFFcghYGPfpy7o3xnvhHwBGbJFMRH8KjNw@mail.gmail.com	2024-03-25 18:06:46 +09:00
Masahiko Sawada	80d5d4937c	Fix potential integer handling issue in radixtree.h. Coverity complained about the integer handling issue; if we start with an arbitrary non-negative shift value, the loop may decrement it down to something less than zero before exiting. This commit adds an assertion to make sure the 'shift' is always 0 after the loop, and uses 0 as the shift to get the key chunk in the following operation. Introduced by `ee1b30f12`. Reported-by: Tom Lane as per coverity Reviewed-by: Tom Lane Discussion: https://postgr.es/m/2089517.1711299216%40sss.pgh.pa.us	2024-03-25 12:06:41 +09:00
David Rowley	66c0185a3d	Allow planner to use Merge Append to efficiently implement UNION Until now, UNION queries have often been suboptimal as the planner has only ever considered using an Append node and making the results unique by either using a Hash Aggregate, or by Sorting the entire Append result and running it through the Unique operator. Both of these methods always require reading all rows from the union subqueries. Here we adjust the union planner so that it can request that each subquery produce results in target list order so that these can be Merge Appended together and made unique with a Unique node. This can improve performance significantly as the union child can make use of the likes of btree indexes and/or Merge Joins to provide the top-level UNION with presorted input. This is especially good if the top-level UNION contains a LIMIT node that limits the output rows to a small subset of the unioned rows as cheap startup plans can be used. Author: David Rowley Reviewed-by: Richard Guo, Andy Fan Discussion: https://postgr.es/m/CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com	2024-03-25 14:31:14 +13:00
Alexander Korotkov	47f99a407d	reindexdb: Add the index-level REINDEX with multiple jobs Straight-forward index-level REINDEX is not supported with multiple jobs as we cannot control the concurrent processing of multiple indexes depending on the same relation. Instead, we dedicate the whole table to certain reindex job. Thus, if indexes in the lists belong to different tables, that gives us a fair level of parallelism. This commit teaches get_parallel_object_list() to fetch table names for indexes in the case of index-level REINDEX. The same tables are grouped together in the output order, and the list of indexes is also rebuilt to match that order. Later during processingof that list, we push indexes belonging to the same table into the same job. Discussion: https://postgr.es/m/CACG%3DezZU_VwDi-1PN8RUSE6mcYG%2BYx1NH_rJO4%2BKe-mKqLp%3DNw%40mail.gmail.com Author: Maxim Orlov, Svetlana Derevyanko, Alexander Korotkov Reviewed-by: Michael Paquier	2024-03-25 02:07:15 +02:00
Jeff Davis	503c0ad976	Fix convert_case(), introduced in `5c40364dd6`. Check source length before checking for NUL terminator to avoid reading one byte past the string end. Also fix unreachable bug when caller does not expect NUL-terminated result. Add unit test coverage of convert_case() in case_test.c, which makes it easier to reproduce the valgrind failure. Discussion: https://postgr.es/m/7a9fd36d-7a38-4dc2-e676-fc939491a95a@gmail.com Reported-by: Alexander Lakhin	2024-03-24 16:28:34 -07:00
Tom Lane	af1d395843	Allow more cases to pass the unsafe-use-of-new-enum-value restriction. Up to now we've rejected cases like BEGIN; CREATE TYPE rainbow AS ENUM (); ALTER TYPE rainbow ADD VALUE 'red'; -- use the value 'red', perhaps in a constraint or index COMMIT; The concern is that the uncommitted enum value 'red' might get into an index and then break the index if we roll back the ALTER ADD. If the ALTER is in the same transaction as the CREATE then it's really perfectly safe, but we weren't taking the trouble to identify that. pg_dump in binary-upgrade mode will emit enum definitions that look like the above, which up to now didn't fall foul of the unsafe-usage check because we processed each restore command as a separate transaction. However an upcoming patch proposes to bundle the restore commands into large transactions to reduce XID consumption during pg_upgrade, and that makes this behavior a problem. To fix, remember the OIDs of enum types created in the current transaction, and allow use of enum values that are added to one later in the same transaction. To do this fully correctly in the presence of subtransactions, we'd have to track subtransaction nesting level of the CREATE and do maintenance work at every subsequent subtransaction exit. That seems expensive, and we don't need it to satisfy pg_dump's usage. Hence, apply the additional optimization only when the CREATE and ALTER are at outermost transaction level. Patch by me, reviewed by Andrew Dunstan Discussion: https://postgr.es/m/1548468.1711220438@sss.pgh.pa.us	2024-03-24 14:30:29 -04:00
Tom Lane	d37e0d0c50	Release PQconninfoOptions array in GetDbnameFromConnectionOptions(). It wasn't getting freed in one code path, which Coverity identified as a resource leak. It's probably of little consequence, but re-ordering the code into the correct sequence is no more work than dismissing the complaint. Minor oversight in commit `a145f424d`. While here, improve the unreasonably clunky coding of FindDbnameInConnParams: use of an output parameter is unnecessary and prone to uninitialized-variable problems.	2024-03-24 12:31:05 -04:00
Tom Lane	225e1dde46	Release temporary array in check_for_data_types_usage(). Coverity identified this as a resource leak. It's surely of no consequence given that the function is called only once per run, but freeing the storage is no more work than dismissing the complaint. Minor oversight in commit `347758b12`.	2024-03-24 12:13:35 -04:00
Peter Eisentraut	fc2d260c7e	ci: freebsd repartition script didn't copy .git directory We need a slightly different "cp" incantation to make sure top-level "dot" files, such as ".git", are also copied. This is relevant for example if a script wants to execute a git command. This currently does not happen, but it has come up while testing other patches. Reviewed-by: Tristan Partin <tristan@neon.tech> Discussion: https://www.postgresql.org/message-id/flat/40e80f77-a294-4f29-a16f-e21bc7bc75fc%40eisentraut.org	2024-03-24 08:41:14 +01:00
Peter Eisentraut	34768ee361	Add temporal FOREIGN KEY contraints Add PERIOD clause to foreign key constraint definitions. This is supported for range and multirange types. Temporal foreign keys check for range containment instead of equality. This feature matches the behavior of the SQL standard temporal foreign keys, but it works on PostgreSQL's native ranges instead of SQL's "periods", which don't exist in PostgreSQL (yet). Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT} are not supported yet. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com	2024-03-24 07:37:13 +01:00
Daniel Gustafsson	697f8d266c	Revert "Add notBefore and notAfter to SSL cert info display" This reverts commit `6acb0a628e` since LibreSSL didn't support ASN1_TIME_diff until OpenBSD 7.1, leaving the older OpenBSD animals in the buildfarm complaining. Per plover in the buildfarm. Discussion: https://postgr.es/m/F0DF7102-192D-4C21-96AE-9A01AE153AD1@yesql.se	2024-03-22 22:58:41 +01:00
Tom Lane	473182c952	Use a hash table for catcache.c's CatCList objects. Up to now, all of the "catcache list" objects within a catalog cache were just chained together on a single dlist, requiring O(N) time to search. Remarkably, we've not had serious performance problems with that so far; but we got a complaint of a bad performance regression from v15 in a case with a large number of roles in the system, which traced down to O(N^2) total time when we probed N catcache lists. Replace that data structure with a hashtable having an enlargeable number of dlists, in an exactly parallel way to the data structure we've used for years for the plain CatCTup cache members. The extra cost of maintaining a hash table seems negligible, since we were already computing a hash value for list searches. Normally this'd be HEAD-only material, but in view of the performance regression it seems advisable to back-patch into v16. In the v16 version of the patch, leave the dead cc_lists field where it is and add the new fields at the end of struct catcache, to avoid possible ABI breakage in case any external code is looking at these structs. (We assume no external code is actually allocating new catcache structs.) Per report from alex work. Discussion: https://postgr.es/m/CAGvXd3OSMbJQwOSc-Tq-Ro1CAz=vggErdSG7pv2s6vmmTOLJSg@mail.gmail.com	2024-03-22 17:13:53 -04:00
Daniel Gustafsson	6acb0a628e	Add notBefore and notAfter to SSL cert info display This adds the X509 attributes notBefore and notAfter to sslinfo as well as pg_stat_ssl to allow verifying and identifying the validity period of the current client certificate. OpenSSL has APIs for extracting notAfter and notBefore, but they are only supported in recent versions so we have to calculate the dates by hand in order to make this work for the older versions of OpenSSL that we still support. Original patch by Cary Huang with additional hacking by Jacob and myself. Author: Cary Huang <cary.huang@highgo.ca> Co-author: Jacob Champion <jacob.champion@enterprisedb.com> Co-author: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/182b8565486.10af1a86f158715.2387262617218380588@highgo.ca	2024-03-22 21:25:25 +01:00
Alexander Korotkov	b670b93a66	Fix an oversight in refactoring in `06b10f80ba`. It was against intended skipping prechecking keys optimization in the first page of range queries to not influence point queries performance. Reported-by: Anton Melnikov Discussion: https://postgr.es/m/30cd7524-b9f1-4cf8-9c4a-223eb2e34441%40postgrespro.ru Author: Pavel Borisov	2024-03-22 15:25:53 +02:00
Peter Eisentraut	d20d8fbd3e	Do not output actual value of location fields in node serialization by default This changes nodeToString() to not output the actual value of location fields in nodes, but instead it writes -1. This mirrors the fact that stringToNode() also does not read location field values but always stores -1. For most uses of nodeToString(), which is to store nodes in catalog fields, this is more useful. We don't store original query texts in catalogs, so any lingering query location values are not meaningful. For debugging purposes, there is a new nodeToStringWithLocations(), which mirrors the existing stringToNodeWithLocations(). This is used for WRITE_READ_PARSE_PLAN_TREES and nodes/print.c functions, which covers all the debugging uses. Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEze2WgrCiR3JZmWyB0YTc8HV7ewRdx13j0CqD6mVkYAW+SFGQ@mail.gmail.com	2024-03-22 09:49:12 +01:00
Amit Kapila	6ae701b437	Track invalidation_reason in pg_replication_slots. Till now, the reason for replication slot invalidation is not tracked directly in pg_replication_slots. A recent commit `007693f2a3` added 'conflict_reason' to show the reasons for slot conflict/invalidation, but only for logical slots. This commit adds a new column 'invalidation_reason' to show invalidation reasons for both physical and logical slots. And, this commit also turns 'conflict_reason' text column to 'conflicting' boolean column (effectively reverting commit `007693f2a3`). The 'conflicting' column is true for invalidation reasons 'rows_removed' and 'wal_level_insufficient' because those make the slot conflict with recovery. When 'conflicting' is true, one can now look at the new 'invalidation_reason' column for the reason for the logical slot's conflict with recovery. The new 'invalidation_reason' column will also be useful to track other invalidation reasons in the future commit. Author: Bharath Rupireddy Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik Discussion: https://www.postgresql.org/message-id/ZfR7HuzFEswakt/a%40ip-10-97-1-34.eu-west-3.compute.internal Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com	2024-03-22 13:52:05 +05:30
Peter Eisentraut	b4080fa3dc	Make RangeTblEntry dump order consistent Put the fields alias and eref earlier in the struct, so that it matches the order in _outRangeTblEntry()/_readRangeTblEntry(). This helps if we ever want to fully automate out/read of RangeTblEntry. Also, it makes dumps in the debugger easier to read in the same way. Internally, this makes no difference. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org	2024-03-22 07:28:33 +01:00
Peter Eisentraut	367c989cd8	Remove custom _jumbleRangeTblEntry() This is part of an effort to reduce the number of special cases in the automatically generated node support functions. This patch removes _jumbleRangeTblEntry() and instead adds per-field query_jumble_ignore annotations to match the behavior of the previous custom code. The pg_stat_statements test suite has some coverage of this. It gets rid of the switch on rtekind; this should be technically correct, since we do the equal and copy functions like this also. The list of fields to jumble has been checked and is considered correct as of `8b29a119fd`. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org	2024-03-22 07:23:47 +01:00
Peter Eisentraut	d575051b9a	Reformat some node comments Reformat some comments in node field definitions to avoid long lines. This makes room for per-field annotations. Similar to `835d476fd2`. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org	2024-03-22 07:21:51 +01:00
Peter Eisentraut	1e1eb12c25	Improve comment Clarify that RangeTblEntry.lateral reflects whether LATERAL was specified in the statement (as opposed to whether lateralness is implicit). Also, the list of applicable entry types was incomplete. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org	2024-03-22 07:21:06 +01:00
Peter Eisentraut	83d8065b1f	Remove obsolete comment The idea to use a union in the definition of RangeTblEntry is clearly not being pursued. Reviewed-by: Andrew Dunstan <andrew@dunslane.net> Discussion: https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org	2024-03-22 07:20:11 +01:00
Amit Langote	085e759e9d	Avoid splitting errmsg string to span multiple lines The error message being fixed was added in `6185c9737c`. While at it, add an "a" to the sentence. Reported-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://postgr.es/m/20240322.095149.895185546948714852.horikyota.ntt%40gmail.com	2024-03-22 12:04:06 +09:00
Daniel Gustafsson	7e65ad197f	Fix dumping role comments when using --no-role-passwords Commit `9a83d56b38` added support for allowing pg_dumpall to dump roles without including passwords, which accidentally made dumps omit COMMENTs on roles. This fixes it by using pg_authid to get the comment. Backpatch to all supported versions. Patch simultaneously written independently by Álvaro and myself. Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Bartosz Chroł <bartosz.chrol@handen.pl> Discussion: https://postgr.es/m/AS8P194MB1271CDA0ADCA7B75FCD8E767F7332@AS8P194MB1271.EURP194.PROD.OUTLOOK.COM Discussion: https://postgr.es/m/CAEP4nAz9V4H41_4ESJd1Gf0v%3DdevkqO1%3Dpo91jUw-GJSx8Hxqg%40mail.gmail.com Backpatch-through: v12	2024-03-21 23:31:57 +01:00
Alexander Korotkov	0997e0af27	Add TupleTableSlotOps.is_current_xact_tuple() method This allows us to abstract how/whether table AM uses transaction identifiers. A custom table AM can use a custom slot, which may not store xmin directly, but determine the tuple belonging to the current transaction in the other way. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Reviewed-by: Matthias van de Meent, Mark Dilger, Pavel Borisov Reviewed-by: Nikita Malakhov, Japin Li	2024-03-21 23:00:43 +02:00
Alexander Korotkov	c35a3fb5e0	Allow table AM tuple_insert() method to return the different slot This allows table AM to return a native tuple slot even if VirtualTupleTableSlot is given as an input. Native tuple slots have knowledge about system attributes, which could be accessed in the future. table_multi_insert() method already can modify the input 'slots' array. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Reviewed-by: Matthias van de Meent, Mark Dilger, Pavel Borisov Reviewed-by: Nikita Malakhov, Japin Li	2024-03-21 23:00:40 +02:00
Alexander Korotkov	02eb07ea89	Allow table AM to store complex data structures in rd_amcache The new table AM method free_rd_amcache is responsible for freeing all the memory related to rd_amcache and setting free_rd_amcache to NULL. If the new method is not specified, we still assume rd_amcache to be a single chunk of memory, which could be just pfree'd. Discussion: https://postgr.es/m/CAPpHfdurb9ycV8udYqM%3Do0sPS66PJ4RCBM1g-bBpvzUfogY0EA%40mail.gmail.com Reviewed-by: Matthias van de Meent, Mark Dilger, Pavel Borisov Reviewed-by: Nikita Malakhov, Japin Li	2024-03-21 23:00:34 +02:00
Daniel Gustafsson	adcdb2c8dd	Explicitly require password for SCRAM exchange This refactors the SASL init flow to set password_needed on the two SCRAM exchanges currently supported. The code already required this but was set up in such a way that all SASL exchanges required using a password, a restriction which may not hold for all exchanges (the example at hand being the proposed OAuthbearer exchange). This was extracted from a larger patchset to introduce OAuthBearer authentication and authorization. Author: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com	2024-03-21 14:45:54 +01:00
Daniel Gustafsson	24178e235e	Refactor SASL exchange to return tri-state status The SASL exchange callback returned state in to output variables: done and success. This refactors that logic by introducing a new return variable of type SASLStatus which makes the code easier to read and understand, and prepares for future SASL exchanges which operate asynchronously. This was extracted from a larger patchset to introduce OAuthBearer authentication and authorization. Author: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://postgr.es/m/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com	2024-03-21 14:45:46 +01:00
David Rowley	1db689715d	Temporarily install debugging in partition_prune test The buildfarm animal parula has been sporadically failing in the partition_prune test for the past week or so. It appears like an auto-vacuum or auto-analyze has run on one of the partitions of the "ab" table, causing the plan to change. This is unexpected as the table is empty. Here we install some telemetry to find out if this is the case. This also joins in pg_index to see if something has gone wrong with the index which could result in the planner being unable to use that index. We can revert this once we've figured out the cause of the plan instability. Reported-by: Tom Lane Investigation-by: Tom Lane Discussion: https://postgr.es/m/4009739.1710878318%40sss.pgh.pa.us	2024-03-21 21:21:05 +13:00
Amit Langote	6185c9737c	Add SQL/JSON query functions This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com	2024-03-21 17:07:03 +09:00
Amit Kapila	a145f424d5	Allow dbname to be written as part of connstring via pg_basebackup's -R option. Commit `cca97ce6a6` allowed dbname in pg_basebackup connstring and in this commit we allow it to be written in postgresql.auto.conf when -R option is used. The database name in the connection string will be used by the logical replication slot synchronization on standby. The dbname will be recorded only if specified explicitly in the connection string or environment variable. Masahiko Sawada hasn't reviewed the code in detail but endorsed the idea. Author: Vignesh C, Kuroda Hayato Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAB8KJ=hdKdg+UeXhReeHpHA6N6v3e0qFF+ZsPFHk9_ThWKf=2A@mail.gmail.com	2024-03-21 10:50:33 +05:30
Masahiko Sawada	30e144287a	Add TIDStore, to store sets of TIDs (ItemPointerData) efficiently. TIDStore is a data structure designed to efficiently store large sets of TIDs. For TID storage, it employs a radix tree, where the key is a block number, and the value is a bitmap representing offset numbers. The TIDStore can be created on a DSA area and used by multiple backend processes simultaneously. There are potential future users such as tidbitmap.c, though it's very likely the interface will need to evolve as we come to understand the needs of different kinds of users. For example, we can support updating the offset bitmap of existing values. Currently, the TIDStore is not used for anything yet, aside from the test code. But an upcoming patch will use it. This includes a unit test module, in src/test/modules/test_tidstore. Co-authored-by: John Naylor Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com	2024-03-21 10:08:42 +09:00
Tom Lane	995e0fbc1c	Un-break genbki.pl's error reporting capabilities. This essentially reverts commit `69eb643b2`, which added a fast path in Catalog::ParseData, but neglected to preserve the behavior of adding a line_number field in each hash. That makes it impossible for genbki.pl to provide any localization of error reports, which is bad enough; but actually the affected error reports failed entirely, producing useless bleats like "use of undefined value in sprintf". `69eb643b2` claimed to get a 15% speedup, but I'm not sure I believe that: the time to rebuild the bki files changes by less than 1% for me. In any case, making debugging of mistakes in .dat files more difficult would not be justified by even an order of magnitude speedup here; it's just not that big a chunk of the total build time. Per report from David Wheeler. Although it's also broken in v16, I don't think this is worth a back-patch, since we're very unlikely to touch the v16 catalog data again. Discussion: https://postgr.es/m/19238.1710953049@sss.pgh.pa.us	2024-03-20 18:02:50 -04:00
Tom Lane	1218ca9956	Add to_regtypemod function to extract typemod from a string type name. In combination with to_regtype, this allows converting a string to the "canonicalized" form emitted by format_type. That usage requires parsing the string twice, which is slightly annoying but not really too expensive. We considered alternatives such as returning a record type, but that way was notationally uglier than this, and possibly less flexible. Like to_regtype(), we'd rather that this return NULL for any bad input, but the underlying type-parsing logic isn't yet capable of not throwing syntax errors. Adjust the documentation for both functions to point that out. In passing, fix up a couple of nearby entries in the System Catalog Information Functions table that had not gotten the word about our since-v13 convention for displaying function usage examples. David Wheeler and Erik Wienhold, reviewed by Pavel Stehule, Jim Jones, and others. Discussion: https://postgr.es/m/DF2324CA-2673-4ABE-B382-26B5770B6AA3@justatheory.com	2024-03-20 17:11:28 -04:00
Nathan Bossart	80686761c4	Avoid overflow in MaybeRemoveOldWalSummaries(). This commit limits the maximum value of wal_summary_keep_time to INT_MAX / SECS_PER_MINUTE to avoid overflow when it is converted to seconds. In passing, use the HOURS_PER_DAY, MINS_PER_HOUR, and SECS_PER_MINUTE macros in the code for this GUC instead of hard- coding those values. Discussion: https://postgr.es/m/20240314210010.GA3056455%40nathanxps13	2024-03-20 13:31:58 -05:00
Jeff Davis	9acae56ce0	Inline basic UTF-8 functions. Shows a measurable speedup when processing UTF-8 data, such as with the new builtin collation provider. Discussion: https://postgr.es/m/163f4e2190cdf67f67016044e503c5004547e5a9.camel@j-davis.com Reviewed-by: Peter Eisentraut	2024-03-20 09:40:57 -07:00
Nathan Bossart	2b520860c0	Revert "Temporary patch to help debug pg_walsummary test failures." Thanks to commits `ea18eb7d62`, `b6ee30ec08`, and `19a829a327`, the 002_blocks.pl test now consistently passes, so we can remove this temporary debugging code. This reverts commit `5ddf997347`. Discussion: https://postgr.es/m/20240314210010.GA3056455%40nathanxps13	2024-03-20 11:34:00 -05:00
Alvaro Herrera	a0390f6ca6	Review wording on tablespaces w.r.t. partitioned tables Remove a redundant comment, and document pg_class.reltablespace properly in catalogs.sgml. After commits `a36c84c3e4`, `87259588d0` and others. Backpatch to 12. Discussion: https://postgr.es/m/202403191013.w2kr7wqlamqz@alvherre.pgsql	2024-03-20 15:28:14 +01:00
Alvaro Herrera	da952b415f	Rework lwlocknames.txt to become lwlocklist.h This way, we can fold the list of lock names to occur in BuiltinTrancheNames instead of having its own separate array. This saves two lines of code in GetLWTrancheName and some space in BuiltinTrancheNames, as foreseen in commit `74a7306310`, as well as removing the need for a separate lwlocknames.c file. We still have to build lwlocknames.h using Perl code, which initially I wanted to avoid, but it gives us the chance to cross-check wait_event_names.txt. Discussion: https://postgr.es/m/202401231025.gbv4nnte5fmm@alvherre.pgsql	2024-03-20 11:55:20 +01:00
Peter Eisentraut	e5da0fe3c2	Catalog domain not-null constraints This applies the explicit catalog representation of not-null constraints introduced by `b0e96f3119` for table constraints also to domain not-null constraints. Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Reviewed-by: jian he <jian.universality@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/9ec24d7b-633d-463a-84c6-7acff769c9e8%40eisentraut.org	2024-03-20 10:05:37 +01:00
Heikki Linnakangas	c9c260decd	Remove unused PruneState member rel PruneState->rel is no longer being used, so just remove it. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240320013602.6sypr4cx6sefpemg@liskov	2024-03-20 10:13:42 +02:00
Heikki Linnakangas	c33084205a	Reorganize heap_page_prune() function comment heap_page_prune()'s function header comment didn't explain the parameters in the same order they appear in the function. Fix that. Author: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://www.postgresql.org/message-id/20240320013602.6sypr4cx6sefpemg@liskov	2024-03-20 10:13:39 +02:00
Dean Rasheed	522ed12f7c	Add "--exclude-extension" to pg_dump's options. This option (or equivalently specifying "exclude extension pattern" in a filter file) allows extensions matching the specified pattern to be excluded from the dump. Ayush Vatsa, reviewed by Junwang Zhao, Dean Rasheed, and Daniel Gustafsson. Discussion: https://postgr.es/m/CACX+KaP=VgVy9h-EUh598DTu+-fNr1jyEmpghC8rRp9s=w33Kg@mail.gmail.com	2024-03-20 08:05:44 +00:00
Heikki Linnakangas	d63d486d6c	Remove assertions that some compiler say are tautological To avoid the compiler warnings: launch_backend.c:211:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] launch_backend.c:233:39: warning: comparison of constant 16 with expression of type 'BackendType' (aka 'enum BackendType') is always true [-Wtautological-constant-out-of-range-compare] The point of the assertions was to fail more explicitly if someone adds a new BackendType to the end of the enum, but forgets to add it to the child_process_kinds array. It was a pretty weak assertion to begin with, because it wouldn't catch if you added a new BackendType in the middle of the enum. So let's just remove it. Per buildfarm member ayu and a few others, spotted by Tom Lane. Discussion: https://www.postgresql.org/message-id/4119680.1710913067@sss.pgh.pa.us	2024-03-20 09:14:51 +02:00
Peter Eisentraut	9578393bc5	Add tests for domain-related information schema views Reviewed-by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/9ec24d7b-633d-463a-84c6-7acff769c9e8%40eisentraut.org	2024-03-20 07:08:01 +01:00
Jeff Davis	f69319f2f1	Support C.UTF-8 locale in the new builtin collation provider. The builtin C.UTF-8 locale has similar semantics to the libc locale of the same name. That is, code point sort order (fast, memcmp-based) combined with Unicode semantics for character operations such as pattern matching, regular expressions, and LOWER()/INITCAP()/UPPER(). The character semantics are based on Unicode simple case mappings. The builtin provider's C.UTF-8 offers several important advantages over libc: * faster sorting -- benefits from additional optimizations such as abbreviated keys and varstrfastcmp_c * faster case conversion, e.g. LOWER(), at least compared with some libc implementations * available on all platforms with identical semantics, and the semantics are stable, testable, and documentable within a given Postgres major version Being based on memcmp, the builtin C.UTF-8 locale does not offer natural language sort order. But it is an improvement for most use cases that might otherwise use libc's "C.UTF-8" locale, as well as many use cases that use libc's "C" locale. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider	2024-03-19 15:24:41 -07:00
Tom Lane	fd0398fcb0	Improve EXPLAIN's display of SubPlan nodes and output parameters. Historically we've printed SubPlan expression nodes as "(SubPlan N)", which is pretty uninformative. Trying to reproduce the original SQL for the subquery is still as impractical as before, and would be mighty verbose as well. However, we can still do better than that. Displaying the "testexpr" when present, and adding a keyword to indicate the SubLinkType, goes a long way toward showing what's really going on. In addition, this patch gets rid of EXPLAIN's use of "$n" to represent subplan and initplan output Params. Instead we now print "(SubPlan N).colX" or "(InitPlan N).colX" to represent the X'th output column of that subplan. This eliminates confusion with the use of "$n" to represent PARAM_EXTERN Params, and it's useful for the first part of this change because it eliminates needing some other indication of which subplan is referenced by a SubPlan that has a testexpr. In passing, this adds simple regression test coverage of the ROWCOMPARE_SUBLINK code paths, which were entirely unburdened by testing before. Tom Lane and Dean Rasheed, reviewed by Aleksander Alekseev. Thanks to Chantal Keller for raising the question of whether this area couldn't be improved. Discussion: https://postgr.es/m/2838538.1705692747@sss.pgh.pa.us	2024-03-19 18:19:24 -04:00
Nathan Bossart	cc4826dd5e	Inline pg_popcount{32,64} into pg_popcount(). On some systems, calls to pg_popcount{32,64} are indirected through a function pointer. This commit converts pg_popcount() to a function pointer on those systems so that we can inline the appropriate pg_popcount{32,64} implementations into each of the pg_popcount() implementations. Since pg_popcount() may call pg_popcount{32,64} several times, this can significantly improve its performance. Suggested-by: David Rowley Reviewed-by: David Rowley Discussion: https://postgr.es/m/CAApHDvrb7MJRB6JuKLDEY2x_LKdFHwVbogpjZBCX547i5%2BrXOQ%40mail.gmail.com	2024-03-19 14:46:16 -05:00
Tom Lane	b7e2121ab7	Postpone reparameterization of paths until create_plan(). When considering nestloop paths for individual partitions within a partitionwise join, if the inner path is parameterized, it is parameterized by the topmost parent of the outer rel, not the corresponding outer rel itself. Therefore, we need to translate the parameterization so that the inner path is parameterized by the corresponding outer rel. Up to now, we did this while generating join paths. However, that's problematic because we must also translate some expressions that are shared across all paths for a relation, such as restriction clauses (kept in the RelOptInfo and/or IndexOptInfo) and TableSampleClauses (kept in the RangeTblEntry). The existing code fails to translate these at all, leading to wrong answers, odd failures such as "variable not found in subplan target list", or executor crashes. But we can't modify them during path generation, because that would break things if we end up choosing some non-partitioned-join path. So this patch postpones reparameterization of the inner path until createplan.c, where it is safe to modify the referenced RangeTblEntry, RelOptInfo or IndexOptInfo, because we have made a final choice of which Path to use. We do still have to check during path generation that the reparameterization will be possible. So we introduce a new function path_is_reparameterizable_by_child() to detect that. The duplication between path_is_reparameterizable_by_child() and reparameterize_path_by_child() is a bit annoying, but there seems no other good answer. A small benefit is that we can avoid building useless reparameterized trees in cases where a non-partitioned join is ultimately chosen. Also, reparameterize_path_by_child() can now be allowed to scribble on the input paths, saving a few cycles. This fix repairs the same problems previously addressed in the back branches by commits `62f120203` et al. Richard Guo, reviewed at various times by Ashutosh Bapat, Andrei Lepikhov, Alena Rybakina, Robert Haas, and myself Discussion: https://postgr.es/m/CAMbWs496+N=UAjOc=rcD3P7B6oJe4rZw08e_TZRUsWbPxZW3Tw@mail.gmail.com	2024-03-19 14:51:58 -04:00
Peter Eisentraut	605721f819	gen_node_support.pl: Mark location fields as type alias ParseLoc Instead of the rather ugly type=int + name ~= location$, we now have a marker type for offset pointers or sizes that are only relevant when a query text is included, which decreases the complexity required in gen_node_support.pl for handling these values. Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEze2WgrCiR3JZmWyB0YTc8HV7ewRdx13j0CqD6mVkYAW+SFGQ@mail.gmail.com	2024-03-19 16:56:44 +01:00
Daniel Gustafsson	347758b120	pg_upgrade: run all data type checks per connection The checks for data type usage were each connecting to all databases in the cluster and running their query. On clusters which have a lot of databases this can become unnecessarily expensive. This moves the checks to run in a single connection instead to minimize setup and teardown overhead. Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Justin Pryzby <pryzby@telsasoft.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/BB4C76F-D416-4F9F-949E-DBE950D37787@yesql.se	2024-03-19 13:32:50 +01:00
Peter Eisentraut	5577a71fb0	Use half-open interval notation in without_overlaps tests This way, the input literals match the output in any error messages. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com	2024-03-19 11:44:14 +01:00
Peter Eisentraut	49b579f92d	Fix misleading comments To match code changes in `229fb58d4f`.	2024-03-19 10:55:51 +01:00
Peter Eisentraut	a88c800deb	Use daterange and YMD in without_overlaps tests instead of tsrange. This makes things a lot easier to read, especially when we get to the FOREIGN KEY tests later. Author: Paul A. Jungwirth <pj@illuminatedcomputing.com> Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com	2024-03-19 10:17:03 +01:00
Peter Eisentraut	794f10f6b9	Add some UUID support functions Add uuid_extract_timestamp() and uuid_extract_version(). Author: Andrey Borodin Reviewed-by: Sergey Prokhorenko, Kirk Wolak, Przemysław Sztoch Reviewed-by: Nikolay Samokhvalov, Jelte Fennema-Nio, Aleksander Alekseev Reviewed-by: Peter Eisentraut, Chris Travers, Lukas Fittl Discussion: https://postgr.es/m/CAAhFRxitJv%3DyoGnXUgeLB_O%2BM7J2BJAmb5jqAT9gZ3bij3uLDA%40mail.gmail.com	2024-03-19 09:32:04 +01:00
Peter Eisentraut	d56cb42b54	Activate perlcritic InputOutput::RequireCheckedSyscalls and fix resulting warnings This checks that certain I/O-related Perl functions properly check their return value. Some parts of the PostgreSQL code had been a bit sloppy about that. The new perlcritic warnings are fixed here. I didn't design any beautiful error messages, mostly just used "or die $!", which mostly matches existing code, and also this is developer-level code, so having the system error plus source code reference should be ok. Initially, we only activate this check for a subset of what the perlcritic check would warn about. The effective list is chmod flock open read rename seek symlink system The initial set of functions is picked because most existing code already checked the return value of those, so any omissions are probably unintended, or because it seems important for test correctness. The actual perlcritic configuration is written as an exclude list. That seems better so that we are clear on what we are currently not checking. Maybe future patches want to investigate checking some of the other functions. (In principle, we might eventually want to check all of them, but since this is test and build support code, not production code, there are probably some reasonable compromises to be made.) Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://www.postgresql.org/message-id/flat/88b7d4f2-46d9-4cc7-b1f7-613c90f9a76a%40eisentraut.org	2024-03-19 07:09:31 +01:00
Jeff Davis	f9f3fb1cb7	Update src/common/unicode/README. Change the test description to include the case mapping test. Oversight in `5c40364dd6`.	2024-03-18 16:39:29 -07:00
Jeff Davis	60769c62dc	Fix another warning, introduced by `846311051e`. Discussion: https://postgr.es/m/3703896.1710799495@sss.pgh.pa.us Reported-by: Tom Lane	2024-03-18 15:34:43 -07:00
Jeff Davis	846311051e	Address more review comments on commit `2d819a08a1`. Based on comments from Peter Eisentraut. * Document CREATE DATABASE ... BUILTIN_LOCALE. * Determine required encoding based on locale name for CREATE COLLATION. Use -1 for "C" (requires catversion bump). * initdb output fixups. * Make ctype_is_c a constant true for now. * Fixups to ICU 010_create_database.pl test. Discussion: https://postgr.es/m/4135cf11-206d-40ed-96c0-9363c1232379@eisentraut.org	2024-03-18 11:58:13 -07:00
Alvaro Herrera	66ab9371a2	dblink/isolationtester/fe_utils: Use new cancel API Commit `61461a300c` introduced new functions to libpq for cancelling queries. This replaces the usage of the old ones in parts of the codebase with these newer ones. This specifically leaves out changes to psql and pgbench, as those would need a much larger refactor to be able to call them due to the new functions not being signal-safe; and also postgres_fdw, because the original code there is not clear to me (Álvaro) and not fully tested. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQT_VgOWWENUqvUV9xQmbaCyXjtRRAYO8W07oqashk_N+g@mail.gmail.com	2024-03-18 19:28:58 +01:00
Jeff Davis	61f352ece9	Fix unreachable code warning from commit `2d819a08a1`. Found by Coverity. Discussion: https://postgr.es/m/3422201.1710711993@sss.pgh.pa.us Reported-by: Tom Lane	2024-03-18 09:15:47 -07:00
Peter Eisentraut	5162f663fc	Add missing source files to nls.mk	2024-03-18 16:46:22 +01:00
Alvaro Herrera	6b3678d347	Put libpq_pipeline cancel test back I disabled it in `cc6e64afda` because it was unstable; hopefully the changes here make it more stable. (Jelte proposed to submit the queries using simple query protocol, but I chose to instead make the monitor connection wait until the PgSleep wait event is seen. This is probably not a terribly meaningful increase in coverage ...) Discussion: https://postgr.es/m/CAGECzQRQh_5tSy+5cndgv08eNJ2O0Zpwn2YddJtSsmC=Wpy1BQ@mail.gmail.com	2024-03-18 13:14:55 +01:00
Heikki Linnakangas	0960ae1967	Fix EXPLAIN Bitmap heap scan to count pages with no visible tuples Previously, bitmap heap scans only counted lossy and exact pages for explain when there was at least one visible tuple on the page. heapam_scan_bitmap_next_block() returned true only if there was a "valid" page with tuples to be processed. However, the lossy and exact page counters in EXPLAIN should count the number of pages represented in a lossy or non-lossy way in the constructed bitmap, regardless of whether or not the pages ultimately contained visible tuples. Backpatch to all supported versions. Author: Melanie Plageman Discussion: https://www.postgresql.org/message-id/CAAKRu_ZwCwWFeL_H3ia26bP2e7HiKLWt0ZmGXPVwPO6uXq0vaA@mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAAKRu_bxrXeZ2rCnY8LyeC2Ls88KpjWrQ%2BopUrXDRXdcfwFZGA@mail.gmail.com	2024-03-18 14:03:58 +02:00
Peter Eisentraut	48018f1d8c	Add some const decorations Discussion: https://www.postgresql.org/message-id/flat/5ac50071-f2ed-4ace-a8fd-b892cffd33eb@www.fastmail.com	2024-03-18 12:07:09 +01:00
Heikki Linnakangas	05c3980e7f	Move code for backend startup to separate file This is code that runs in the backend process after forking, rather than postmaster. Move it out of postmaster.c for clarity. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-18 11:38:10 +02:00
Heikki Linnakangas	aafc05de1b	Refactor postmaster child process launching Introduce new postmaster_child_launch() function that deals with the differences in EXEC_BACKEND mode. Refactor the mechanism of passing information from the parent to child process. Instead of using different command-line arguments when launching the child process in EXEC_BACKEND mode, pass a variable-length blob of startup data along with all the global variables. The contents of that blob depend on the kind of child process being launched. In !EXEC_BACKEND mode, we use the same blob, but it's simply inherited from the parent to child process. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-18 11:35:30 +02:00
Heikki Linnakangas	f1baed18bc	Move some functions from postmaster.c to a new source file This just moves the functions, with no other changes, to make the next commits smaller and easier to review. The moved functions are related to launching postmaster child processes in EXEC_BACKEND mode. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-18 11:35:05 +02:00
Heikki Linnakangas	14cddee9cc	Split registration of Win32 deadchild callback to separate function The next commit will move the internal_forkexec() function to a different source file, but it makes sense to keep all the code related to the win32 waitpid() emulation in postmaster.c. Split it off to a separate function now, to make the next commit more mechanical. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-18 11:35:01 +02:00
Michael Paquier	ca108be72e	Remove references to backup_fs_hot() in Cluster.pm This routine has been removed in `39969e2a1e` with the exclusive backup mode but there were still references to it. Issue noticed while working on `071e3ad59d`.	2024-03-18 14:21:32 +09:00
Nathan Bossart	949300402b	Initialize variables to placate compiler. Since commit `012460ee93`, some compilers have been warning that a couple of variables may be used uninitialized. There doesn't appear to be any actual risk, so let's just initialize these variables to 0 to silence the compiler warnings. Discussion: https://postgr.es/m/20240317192927.GA3978212%40nathanxps13	2024-03-17 20:16:15 -05:00
Daniel Gustafsson	d6607016c7	Support json_errdetail in FRONTEND code Allocate memory for the error message inside memory owned by the JsonLexContext and move responsibility away from the caller for freeing it. This means that we can partially revert `b44669b2ca` as this is now safe to use in FRONTEND code. The motivation for this comes from the OAuth and incremental JSON patchsets but it also adds value on its own. Author: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOYmi+mWdTd6ujtyF7MsvXvk7ToLRVG_tYAcaGbQLvf=N4KrQw@mail.gmail.com	2024-03-17 23:56:15 +01:00
Tom Lane	33f13168cc	Mark hash_corrupted() as pg_attribute_noreturn. Coverity started complaining about this after `cc5ef90ed`. The code's not really different from before, but might as well clarify its intent.	2024-03-17 17:54:45 -04:00
Dean Rasheed	c649fa24a4	Add RETURNING support to MERGE. This allows a RETURNING clause to be appended to a MERGE query, to return values based on each row inserted, updated, or deleted. As with plain INSERT, UPDATE, and DELETE commands, the returned values are based on the new contents of the target table for INSERT and UPDATE actions, and on its old contents for DELETE actions. Values from the source relation may also be returned. As with INSERT/UPDATE/DELETE, the output of MERGE ... RETURNING may be used as the source relation for other operations such as WITH queries and COPY commands. Additionally, a special function merge_action() is provided, which returns 'INSERT', 'UPDATE', or 'DELETE', depending on the action executed for each row. The merge_action() function can be used anywhere in the RETURNING list, including in arbitrary expressions and subqueries, but it is an error to use it anywhere outside of a MERGE query's RETURNING list. Dean Rasheed, reviewed by Isaac Morland, Vik Fearing, Alvaro Herrera, Gurjeet Singh, Jian He, Jeff Davis, Merlin Moncure, Peter Eisentraut, and Wolfgang Walther. Discussion: http://postgr.es/m/CAEZATCWePEGQR5LBn-vD6SfeLZafzEm2Qy_L_Oky2=qw2w3Pzg@mail.gmail.com	2024-03-17 13:58:59 +00:00
Peter Eisentraut	6a004f1be8	Add attstattarget to FormExtraData_pg_attribute This allows setting attstattarget when a relation is created. We make use of this by having index_concurrently_create_copy() copy over the attstattarget values when the new index is created, instead of having index_concurrently_swap() fix it up later. Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org	2024-03-17 12:38:27 +01:00
Peter Eisentraut	d939cb2fd6	Generalize handling of nullable pg_attribute columns in DDL DDL code uses tuple descriptors to pass around pg_attribute values during table and index creation. But tuple descriptors don't include the variable-length/nullable columns of pg_attribute, so they have to be handled separately. Right now, the attoptions field is handled in a one-off way with a separate argument passed to InsertPgAttributeTuples(). The other affected fields of pg_attribute are right now not needed at relation creation time. The goal of this patch is to generalize this to allow handling additional variable-length/nullable columns of pg_attribute in a similar manner. For that, create a new struct FormExtraData_pg_attribute, which is to be passed around in parallel to the tuple descriptor and optionally supplies the additional columns. Right now, this struct only contains one field for attoptions, so no functionality is actually changed by this. Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org	2024-03-17 12:30:51 +01:00
Peter Eisentraut	012460ee93	Make stxstattarget nullable To match attstattarget change (commit `4f622503d6`). The logic inside CreateStatistics() is clarified a bit compared to that previous patch, and so here we also update ATExecSetStatistics() to match. Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184c76@eisentraut.org	2024-03-17 12:26:26 +01:00
Dean Rasheed	33e729c514	Fix EXPLAIN output for subplans in MERGE. Given a subplan in a MERGE query, EXPLAIN would sometimes fail to properly display expressions involving Params referencing variables in other parts of the plan tree. This would affect subplans outside the topmost join plan node, for which expansion of Params would go via the top-level ModifyTable plan node. The problem was that "inner_tlist" for the ModifyTable node's deparse_namespace was set to the join node's targetlist, but "inner_plan" was set to the ModifyTable node itself, rather than the join node, leading to incorrect results when descending to the referenced variable. Fix and backpatch to v15, where MERGE was introduced. Discussion: https://postgr.es/m/CAEZATCWAv-sZuH%2BwG5xJ-%2BGt7qGNGX8wUQd3XYydMFDKgRB9nw%40mail.gmail.com	2024-03-17 10:17:11 +00:00
Peter Eisentraut	20e58105ba	Separate equalRowTypes() from equalTupleDescs() This introduces a new function equalRowTypes() that is effectively a subset of equalTupleDescs() but only compares the number of attributes and attribute name, type, typmod, and collation. This is enough for most existing uses of equalTupleDescs(), which are changed to use the new function. The only remaining callers of equalTupleDescs() are those that really want to check the full tuple descriptor as such, without concern about record or row or record type semantics. The existing function hashTupleDesc() is renamed to hashRowType(), because it now corresponds more to equalRowTypes(). The purpose of this change is to be clearer about the semantics of the equality asked for by each caller. (At least one caller had a comment that questioned whether equalTupleDescs() was too restrictive.) For example, `4f622503d6` removed attstattarget from the tuple descriptor structure. It was not fully clear at the time how this should affect equalTupleDescs(). Now the answer is clear: By their own definitions, equalRowTypes() does not care, and equalTupleDescs() just compares whatever is in the tuple descriptor but does not care why it is in there. Reviewed-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/f656d6d9-6660-4518-a006-2f65cafbebd1%40eisentraut.org	2024-03-17 05:58:04 +01:00
Daniel Gustafsson	b783186515	Add destroyStringInfo function for cleaning up StringInfos destroyStringInfo() is a counterpart to makeStringInfo(), freeing a palloc'd StringInfo and its data. This is a convenience function to align the StringInfo API with the PQExpBuffer API. Originally added in the OAuth patchset, it was extracted and committed separately in order to aid upcoming JSON work. Author: Daniel Gustafsson <daniel@yesql.se> Author: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOYmi+mWdTd6ujtyF7MsvXvk7ToLRVG_tYAcaGbQLvf=N4KrQw@mail.gmail.com	2024-03-16 23:18:28 +01:00
Alexander Korotkov	927332b95e	psql: fix variable existence tab completion psql has the :{?name} syntax for testing for a psql variable existence. This commit implements a tab completion for this syntax. Notably, in order to implement this we have to remove '{' from WORD_BREAKS. It appears that '{' here from the very beginning and it comes from the default value of rl_basic_word_break_characters. And :{?name} is the only psql syntax using the '{' sign. So, removing it from WORD_BREAKS shouldn't break anything. Discussion: https://postgr.es/m/CAGRrpzZU48F2oV3d8eDLr%3D4TU9xFH5Jt9ED%2BqU1%2BX91gMH68Sw%40mail.gmail.com Author: Steve Chavez Reviewed-by: Erik Wienhold	2024-03-16 23:49:10 +02:00
Alexander Korotkov	605062227f	Use locale-aware value for \watch in 005_timeouts.pl Reported-by: Alexander Lakhin	2024-03-15 21:37:17 +02:00
Daniel Gustafsson	196eeb6b2f	Fix handling of expecteddir in pg_regress Commit `c855872074` introduced a new parameter to pg_regress to set the directory where to look for expected files, but accidentally only implemented it for when compiling pg_regress for ECPG tests. Fix by adding support for the parameter to the main regression test compilation of pg_regress as well. Backpatch to v16 where --expecteddir was introduced. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_Xqq5yKJHcJsq__LPcKwSY0XHRqVERNWGxx5ttNXXj7+W=A@mail.gmail.com Backpatch-through: 16	2024-03-15 17:02:07 +01:00
Heikki Linnakangas	d802ff06d0	Fix backstop in gin test if injection point is not reached Per Tom Lane's observation that the test got stuck in infinite loop if the injection_points module was not loaded. It was supposed to give up after 10000 iterations, but the backstop was broken. Discussion: https://www.postgresql.org/message-id/2498595.1710511222%40sss.pgh.pa.us	2024-03-15 17:55:12 +02:00
Heikki Linnakangas	85f65d7a26	Try to unbreak injection-fault tests in the buildfarm The buildfarm script attempts to run all tests marked as NO_INSTALLCHECK under src/test/modules without paying attention to whether they are enabled or disabled in the parent Makefile. That hasn't been a problem so far, because all the tests marked with NO_INSTALLCHECK ran unconditionally in "make check". But commit `e2e3b8ae9e` changed that: the injection fault tests are marked as NO_INSTALLCHECK, and also depend on --enable-injection-points. Try to work around that by ensuring that "make check" does nothing in the those subdirectories. We can hopefully get rid of this hack soon, after fixing the buildfarm client, or by switching to meson. Discussion: https://www.postgresql.org/message-id/8e4cf596-dd70-432e-9068-16466ed596ed%40iki.fi	2024-03-15 15:22:12 +02:00
Alexander Korotkov	7a65cc079e	Fix wordings in timeouts TAP test Reported-by: Kyotaro Horiguchi Discussion: https://postgr.es/m/20240315.104235.1835366724413653745.horikyota.ntt%40gmail.com Author: Andrey Borodin	2024-03-15 14:38:22 +02:00
Alexander Korotkov	4c2eda67f5	Fix race condition in transaction timeout TAP tests The interruption handler within the injection point can get stuck in an infinite loop while handling transaction timeout. To avoid this situation we reset the timeout flag before invoking the injection point. Author: Alexander Korotkov Reviewed-by: Andrey Borodin Discussion: https://postgr.es/m/ZfPchPC6oNN71X2J%40paquier.xyz	2024-03-15 14:38:22 +02:00
Heikki Linnakangas	a3f349c612	Improve log messages referring to background worker processes "Worker" could also mean autovacuum worker or slot sync worker, so let's be more explicit. Per Tristan Partin's suggestion. Discussion: https://www.postgresql.org/message-id/CZM6WDX5H4QI.NZG1YUCKWLA@neon.tech	2024-03-15 13:14:38 +02:00
Heikki Linnakangas	e2e3b8ae9e	Disable tests using injection points in installcheck The 'gin' test injections faults to GIN index build. If another test running concurrently in the same cluster also tries to create a GIN index, it will hit the fault, too. To fix, disable tests using injection points when running against an existing cluster. A better long-term solution would be to make the injection points scoped to the database or process, but this will do for now. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGJYhcG_o2nwSK6r01eOZJwNWUJUbX%3D%3DAVnW84f-%2B8yamQ@mail.gmail.com Discussion: https://www.postgresql.org/message-id/10fd6cdd-c5d9-46fe-9fa1-7e661191309e@iki.fi	2024-03-15 13:06:57 +02:00
Michael Paquier	071e3ad59d	Add basic TAP tests for the low-level backup method, take two There are currently no tests for the low-level backup method where pg_backup_start() and pg_backup_stop() are involved while taking a file-system backup. The tests introduced in this commit rely on a background psql process to make sure that the backup is taken while the session doing the SQL start and stop calls remains alive. Two cases are checked here with the backup taken: - Recovery without a backup_label, leading to a corrupted state. - Recovery with a backup_label, with a consistent state reached. Both cases cross-check some patterns in the logs generated when running recovery. Compared to the first attempt in `99b4a63bef`, this includes a couple of fixes making the CI stable (5 runs succeeded here): - Add the file to the list of tests in meson.build. - Race condition with the first WAL segment that we expect in the primary's archives, by adding a poll on pg_stat_archiver. The second segment with the checkpoint record is archived thanks to pg_backup_stop waiting for it. - Fix failure of test where the backup_label does not exist. The cluster inherits the configuration of the first node; it was attempting to store segments in the first node's archives, triggering failures with copy on Windows. - Fix failure of test on Windows because of incorrect parsing of the backup_file in the success case. The data of the backup_label file is retrieved from the output pg_backup_stop() from a BackgroundPsql written directly to the backup's data folder. This would include CRLFs (\r\n), causing the startup process to fail at the beginning of recovery when parsing the backup_label because only LFs (\n) are allowed. Author: David Steele Discussion: https://postgr.es/m/f20fcc82-dadb-478d-beb4-1e2ffb0ace76@pgmasters.net	2024-03-15 08:29:54 +09:00
Michael Paquier	cc5ef90edd	Refactor initial hash lookup in dynahash.c The same pattern is used three times in dynahash.c to retrieve a bucket number and a hash bucket from a hash value. This has popped up while discussing improvements for the type cache, where this piece of refactoring would become useful. Note that hash_search_with_hash_value() does not need the bucket number, just the hash bucket. Author: Teodor Sigaev Reviewed-by: Aleksander Alekseev, Michael Paquier Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1@sigaev.ru	2024-03-15 07:57:17 +09:00
David Rowley	4169850f0b	Trim ORDER BY/DISTINCT aggregate pathkeys in gather_grouping_paths Similar to `d8a295389`, trim off any PathKeys which are for ORDER BY / DISTINCT aggregate functions from the PathKey List for the Gather Merge paths created by gather_grouping_paths(). These additional PathKeys are not valid to use after grouping has taken place as these PathKeys belong to columns which are inputs to an aggregate function and, therefore are unavailable after aggregation. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/cf63174c-8c89-3953-cb49-48f41f74941a@gmail.com Backpatch-through: 16, where `1349d2790` was added	2024-03-15 11:54:36 +13:00
Daniel Gustafsson	4665cebc8a	Login event trigger documentation wordsmithing Minor wordsmithing on the login trigger documentation and code comments to improve readability, as well as fixing a few small incorrect statements in the comments. Author: Robert Treat <rob@xzilla.net> Discussion: https://postgr.es/m/CAJSLCQ0aMWUh1m6E9YdjeqV61baQ=EhteJX8XOxXg8H_2Lcr0Q@mail.gmail.com	2024-03-14 23:35:35 +01:00
Tom Lane	b4a71cf65d	Make INSERT-from-multiple-VALUES-rows handle domain target columns. Commit `a3c7a993d` fixed some cases involving target columns that are arrays or composites by applying transformAssignedExpr to the VALUES entries, and then stripping off any assignment ArrayRefs or FieldStores that the transformation added. But I forgot about domains over arrays or composites :-(. Such cases would either fail with surprising complaints about mismatched datatypes, or insert unexpected coercions that could lead to odd results. To fix, extend the stripping logic to get rid of CoerceToDomain if it's atop an ArrayRef or FieldStore. While poking at this, I realized that there's a poorly documented and not-at-all-tested behavior nearby: we coerce each VALUES column to the domain type separately, and rely on the rewriter to merge those operations so that the domain constraints are checked only once. If that merging did not happen, it's entirely possible that we'd get unexpected domain constraint failures due to checking a partially-updated container value. There's no bug there, but while we're here let's improve the commentary about it and add some test cases that explicitly exercise that behavior. Per bug #18393 from Pablo Kharo. Back-patch to all supported branches. Discussion: https://postgr.es/m/18393-65fedb1a0de9260d@postgresql.org	2024-03-14 14:57:16 -04:00
Nathan Bossart	d1162cfda8	Add pg_column_toast_chunk_id(). This function returns the chunk_id of an on-disk TOASTed value. If the value is un-TOASTed or not on-disk, it returns NULL. This is useful for identifying which values are actually TOASTed and for investigating "unexpected chunk number" errors. Bumps catversion. Author: Yugo Nagata Reviewed-by: Jian He Discussion: https://postgr.es/m/20230329105507.d764497456eeac1ca491b5bd%40sraoss.co.jp	2024-03-14 10:58:00 -05:00
Heikki Linnakangas	84c18acaf6	Remove redundant snapshot copying from parallel leader to workers The parallel query infrastructure copies the leader backend's active snapshot to the worker processes. But BitmapHeapScan node also had bespoken code to pass the snapshot from leader to the worker. That was redundant, so remove it. The removed code was analogous to the snapshot serialization in table_parallelscan_initialize(), but that was the wrong role model. A parallel bitmap heap scan is more like an independent non-parallel bitmap heap scan in each parallel worker as far as the table AM is concerned, because the coordination is done in nodeBitmapHeapscan.c, and the table AM doesn't need to know anything about it. This relies on the assumption that es_snapshot == GetActiveSnapshot(). That's not a new assumption, things would get weird if you used the QueryDesc's snapshot for visibility checks in the scans, but the active snapshot for evaluating quals, for example. This could use some refactoring and cleanup, but for now, just add some assertions. Reviewed-by: Dilip Kumar, Robert Haas Discussion: https://www.postgresql.org/message-id/5f3b9d59-0f43-419d-80ca-6d04c07cf61a@iki.fi	2024-03-14 15:18:10 +02:00
Robert Haas	2346df6fc3	Allow a no-wait lock acquisition to succeed in more cases. We don't determine the position at which a process waiting for a lock should insert itself into the wait queue until we reach ProcSleep(), and we may at that point discover that we must insert ourselves ahead of everyone who wants a conflicting lock, in which case we obtain the lock immediately. Up until now, a no-wait lock acquisition would fail in such cases, erroneously claiming that the lock couldn't be obtained immediately. Fix that by trying ProcSleep even in the no-wait case. No back-patch for now, because I'm treating this as an improvement to the existing no-wait feature. It could instead be argued that it's a bug fix, on the theory that there should never be any case whatsoever where no-wait fails to obtain a lock that would have been obtained immediately without no-wait, but I'm reluctant to interpret the semantics of no-wait that strictly. Robert Haas and Jingxian Li Discussion: http://postgr.es/m/CA+TgmobCH-kMXGVpb0BB-iNMdtcNkTvcZ4JBxDJows3kYM+GDg@mail.gmail.com	2024-03-14 08:56:06 -04:00
Alexander Korotkov	eeefd4280f	Add TAP tests for timeouts This commit adds new tests to verify that transaction_timeout, idle_session_timeout, and idle_in_transaction_session_timeout work as expected. We introduce new injection points in before throwing a timeout FATAL error and check these injection points are reached. Discussion: https://postgr.es/m/CAAhFRxiQsRs2Eq5kCo9nXE3HTugsAAJdSQSmxncivebAxdmBjQ%40mail.gmail.com Author: Andrey Borodin Reviewed-by: Alexander Korotkov	2024-03-14 13:12:15 +02:00
Alexander Korotkov	e85662df44	Fix false reports in pg_visibility Currently, pg_visibility computes its xid horizon using the GetOldestNonRemovableTransactionId(). The problem is that this horizon can sometimes go backward. That can lead to reporting false errors. In order to fix that, this commit implements a new function GetStrictOldestNonRemovableTransactionId(). This function computes the xid horizon, which would be guaranteed to be newer or equal to any xid horizon computed before. We have to do the following to achieve this. 1. Ignore processes xmin's, because they consider connection to other databases that were ignored before. 2. Ignore KnownAssignedXids, because they are not database-aware. At the same time, the primary could compute its horizons database-aware. 3. Ignore walsender xmin, because it could go backward if some replication connections don't use replication slots. As a result, we're using only currently running xids to compute the horizon. Surely these would significantly sacrifice accuracy. But we have to do so to avoid reporting false errors. Inspired by earlier patch by Daniel Shelepanov and the following discussion with Robert Haas and Tom Lane. Discussion: https://postgr.es/m/1649062270.289865713%40f403.i.mail.ru Reviewed-by: Alexander Lakhin, Dmitry Koval	2024-03-14 13:12:05 +02:00
Alvaro Herrera	cc6e64afda	Comment out noisy libpq_pipeline test libpq_pipeline's new 'cancel' test needs more research; disable it temporarily to prevent measles in the buildfarm.	2024-03-14 10:23:38 +01:00
Daniel Gustafsson	6b41ef0330	Fix documentation comment for pg_md5_hash Commit `b69aba7457` added the errstr parameter to pg_md5_hash but missed updating the synopsis in the documentation comment. The follow-up commit `587de223f0` added the parameter to the list of outputs. The returnvalue had been changed from integer to bool before that but remained in the synopsis. This fixes both. Author: Tatsuro Yamada <tatsuro.yamada@ntt.com> Discussion: https://postgr.es/m/TYYPR01MB82313576150CC86084A122CD9E292@TYYPR01MB8231.jpnprd01.prod.outlook.com	2024-03-14 09:23:37 +01:00
Amit Kapila	9c40db3b02	Fix typos in reorderbuffer.c. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20240314.132817.1496502692848380820.horikyota.ntt@gmail.com	2024-03-14 12:12:55 +05:30
Jeff Davis	2d819a08a1	Introduce "builtin" collation provider. New provider for collations, like "libc" or "icu", but without any external dependency. Initially, the only locale supported by the builtin provider is "C", which is identical to the libc provider's "C" locale. The libc provider's "C" locale has always been treated as a special case that uses an internal implementation, without using libc at all -- so the new builtin provider uses the same implementation. The builtin provider's locale is independent of the server environment variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the database collation locale can be "C" while LC_COLLATE and LC_CTYPE are set to "en_US", which is impossible with the libc provider. By offering a new builtin provider, it clarifies that the semantics of a collation using this provider will never depend on libc, and makes it easier to document the behavior. Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider	2024-03-13 23:33:44 -07:00
Peter Eisentraut	6ab2e8385d	Put genbki.pl output into src/include/catalog/ directly With the makefile rules, the output of genbki.pl was written to src/backend/catalog/, and then the header files were linked to src/include/catalog/. This changes it so that the output files are written directly to src/include/catalog/. This makes the logic simpler, and it also makes the behavior consistent with the meson build system. Also, the list of catalog files is now kept in parallel in src/include/catalog/{meson.build,Makefile}, while before the makefiles had it in src/backend/catalog/Makefile. Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://www.postgresql.org/message-id/flat/21b74bdc-183d-4dd5-9c27-9378d178f459@eisentraut.org	2024-03-14 07:11:21 +01:00
Michael Paquier	6cb1b632b3	Revert "Add basic TAP tests for the low-level backup method" This reverts commit `99b4a63bef`. The test is proving to be unstable, so revert it for now. One of the failures seen involves the cluster started without the backup_label, where the archives of the primary are overwritten, causing recovery failures on Windows. This is simple to fix, but there is another issue that's creeping behind in the form of an "invalid data in file" ERROR when parsing the backup_label for the second recovery case, as an effect of the backup_label data written after retrieving its contents from pg_backup_stop(). _ Per buildfarm member sidewinder.	2024-03-14 13:19:12 +09:00
Michael Paquier	99b4a63bef	Add basic TAP tests for the low-level backup method There are currently no tests for the low-level backup method where pg_backup_start() and pg_backup_stop() are involved while taking a file-system backup. The tests introduced in this commit rely on a background psql process to make sure that the backup is taken while the session doing the SQL start and stop calls remains alive. Two cases are checked here with the backup taken: - Recovery without a backup_label, leading to a corrupted state. - Recovery with a backup_label, with a consistent state reached. Both cases cross-check some patterns in the logs generated when running recovery. Author: David Steele Discussion: https://postgr.es/m/f20fcc82-dadb-478d-beb4-1e2ffb0ace76@pgmasters.net	2024-03-14 10:49:52 +09:00
Alexander Korotkov	e820db5b56	Improve documentation for pg_stat_checkpointer fields pg_stat_checkpointer contains statistics for checkpoints and restartpoints. Before `12915a58ee` documentation said only about checkpoints implying that restartpoint is the variation of checkpoint. `12915a58ee` introduced new separate statistics fields for restartpoints. This commit explicitly documents fields that are relevant for both checkpoints and restartpoints. Reported-by: Magnus Hagander Discussion: https://postgr.es/m/CABUevExav5-SR0x%2BG9kBUMV0G8XsvSUfuyyqmYBBJi6VHns6sw%40mail.gmail.com Reviewed-by: Anton A. Melnikov	2024-03-14 02:17:59 +02:00
Nathan Bossart	ecb0fd3372	Reintroduce MAINTAIN privilege and pg_maintain predefined role. Roles with MAINTAIN on a relation may run VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZE VIEW, CLUSTER, and LOCK TABLE on the relation. Roles with privileges of pg_maintain may run those same commands on all relations. This was previously committed for v16, but it was reverted in commit `151c22deee` due to concerns about search_path tricks that could be used to escalate privileges to the table owner. Commits `2af07e2f74`, `59825d1639`, and `c7ea3f4229` resolved these concerns by restricting search_path when running maintenance commands. Bumps catversion. Reviewed-by: Jeff Davis Discussion: https://postgr.es/m/20240305161235.GA3478007%40nathanxps13	2024-03-13 14:49:26 -05:00
Robert Haas	2041bc4276	Add the system identifier to backup manifests. Before this patch, if you took a full backup on server A and then tried to use the backup manifest to take an incremental backup on server B, it wouldn't know that the manifest was from a different server and so the incremental backup operation could potentially complete without error. When you later tried to run pg_combinebackup, you'd find out that your incremental backup was and always had been invalid. That's poor timing, because nobody likes finding out about backup problems only at restore time. With this patch, you'll get an error when trying to take the (invalid) incremental backup, which seems a lot nicer. Amul Sul, revised by me. Review by Michael Paquier. Discussion: http://postgr.es/m/CA+TgmoYLZzbSAMM3cAjV4Y+iCRZn-bR9H2+Mdz7NdaJFU1Zb5w@mail.gmail.com	2024-03-13 15:12:33 -04:00
Alvaro Herrera	1ee910ce43	Hopefully make libpq_pipeline's new cancel test more reliable The newly introduced cancel test in libpq_pipeline was flaky. It's not completely clear why, but one option is that the check for "active" was actually seeing the active state for the previous query. This change should address any such race condition by first waiting until the connection is reported as idle. Author: Jelte Fennema-Nio <me@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQRvmUK5-d68A+cm+fgmfht9Dv2uZ28-qq3QiaF6EAZqPQ@mail.gmail.com	2024-03-13 19:55:09 +01:00
Robert Haas	dbfc447165	Expose new function get_controlfile_by_exact_path(). This works just like get_controlfile(), but expects the path to the control file rather than the path to the data directory that contains the control file. This makes more sense in cases where the caller has already constructed the path to the control file itself. Amul Sul and Robert Haas, reviewed by Michael Paquier	2024-03-13 12:06:44 -04:00
Peter Eisentraut	97d85be365	Make the order of the header file includes consistent Similar to commit `7e735035f2`. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com	2024-03-13 15:07:00 +01:00
Peter Eisentraut	6612185883	Fix incorrect format placeholders	2024-03-13 06:40:32 +01:00
Michael Paquier	a189ed49d6	Add tests for more row patterns with COPY FROM .. (ON_ERROR ignore) While digging into the code of this feature, I got confused by the fact that a line is skipped when a value cannot be converted to its expected attribute even if the line has fewer attributes than the target relation. The tests had a check for the case of an empty line, this commit a couple more patterns where a line is incomplete, but skipped because of a conversion error. Reviewed-by: Bharath Rupireddy Discussion: https://postgr.es/m/Ze_7kZqexdt0BiyC@paquier.xyz	2024-03-13 14:19:21 +09:00
Amit Kapila	0b84f5c419	Fix a random failure in 038_save_logical_slots_shutdown.pl. The test ensures that all the WAL on the publisher is sent to the subscriber before shutdown by comparing the confirmed_flush_lsn of the associated slot with the shutdown_checkpoint WAL location. But if the shutdown_checkpoint location falls into a new page in the WAL then the check won't work. So, ensure that the shutdown_checkpoint WAL record doesn't fall into a new page. Reported-by: Bharath Rupireddy Author: Bharath Rupireddy Reviewed-by: Vignesh C, Kuroda Hayato, Amit Kapila Discussion: https://postgr.es/m/CALj2ACVLzH5CN-h9=S26mdRHPuJ9yDLUw70yh4JOiPw03WL0CQ@mail.gmail.com	2024-03-13 08:33:26 +05:30
Thomas Munro	0265e5c120	ci: Use a RAM disk and more CPUs on FreeBSD. Run the tests in a RAM disk. It's still a UFS file system and is backed by 20GB of disk, but this avoids a lot of I/O. Even though we disable fsync, our tests do a lot of directory manipulations, some of which force file system meta-data to disk and flush slow device write caches on UFS. This was a bottleneck preventing effective scaling beyond 2 CPUs. Now we can use 4 CPUs like on other OSes, for a huge speedup. Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Discussion: https://postgr.es/m/CA%2BhUKG%2BFXLcEg1dyTqJjDiNQ8pGom4KrJj4wF38C90thti9dVA%40mail.gmail.com	2024-03-13 14:58:27 +13:00
Michael Paquier	77cf6a78de	Add some asserts based on LWLockHeldByMe() for replication slot statistics Two assertions checking that ReplicationSlotAllocationLock is acquired are added to pgstat_create_replslot() and pgstat_drop_replslot(), corresponding to the routines in charge of the creation and the drop of replication slot statistics. The code previously relied on this assumption and documented it in comments, but did not enforce this policy at runtime. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Ze_p-hmD_yFeVYXg@paquier.xyz	2024-03-13 07:45:11 +09:00
Jeff Davis	32dd2c1eff	Fix version check in 002_pg_upgrade.pl. Commit `f696c0cd5f` tried to account for the version in a way that includes development versions, but it was broken. Fix with suggestion from Tom Lane. Discussion: https://postgr.es/m/1553991.1710191312@sss.pgh.pa.us Reported-by: Tom Lane	2024-03-12 15:24:03 -07:00
Tom Lane	6ee3261e9b	Fix confusion about the return rowtype of SQL-language procedures. There is a very ancient hack in check_sql_fn_retval that allows a single SELECT targetlist entry of composite type to be taken as supplying all the output columns of a function returning composite. (This is grotty and fundamentally ambiguous, but it's really hard to do nested composite-returning functions without it.) As far as I know, that doesn't cause any problems in ordinary functions. It's disastrous for procedures however. All procedures that have any output parameters are labeled with prorettype RECORD, and the CALL code expects it will get back a record with one column per output parameter, regardless of whether any of those parameters is composite. Doing something else leads to an assertion failure or core dump. This is simple enough to fix: we just need to not apply that rule when considering procedures. However, that requires adding another argument to check_sql_fn_retval, which at least in principle might be getting called by external callers. Therefore, in the back branches convert check_sql_fn_retval into an ABI-preserving wrapper around a new function check_sql_fn_retval_ext. Per report from Yahor Yuzefovich. This has been broken since we implemented procedures, so back-patch to all supported branches. Discussion: https://postgr.es/m/CABz5gWHSjj2df6uG0NRiDhZ_Uz=Y8t0FJP-_SVSsRsnrQT76Gg@mail.gmail.com	2024-03-12 18:16:25 -04:00
David Rowley	fe4750effd	Fix incorrect filename reference in comment Author: Cary Huang Discussion: https://postgr.es/m/18e34071af0.dbfc9663424635.8571906799773344646@highgo.ca	2024-03-13 09:34:11 +13:00
Alvaro Herrera	61461a300c	libpq: Add encrypted and non-blocking query cancellation routines The existing PQcancel API uses blocking IO, which makes PQcancel impossible to use in an event loop based codebase without blocking the event loop until the call returns. It also doesn't encrypt the connection over which the cancel request is sent, even when the original connection required encryption. This commit adds a PQcancelConn struct and assorted functions, which provide a better mechanism of sending cancel requests; in particular all the encryption used in the original connection are also used in the cancel connection. The main entry points are: - PQcancelCreate creates the PQcancelConn based on the original connection (but does not establish an actual connection). - PQcancelStart can be used to initiate non-blocking cancel requests, using encryption if the original connection did so, which must be pumped using - PQcancelPoll. - PQcancelReset puts a PQcancelConn back in state so that it can be reused to send a new cancel request to the same connection. - PQcancelBlocking is a simpler-to-use blocking API that still uses encryption. Additional functions are - PQcancelStatus, mimicks PQstatus; - PQcancelSocket, mimicks PQcancelSocket; - PQcancelErrorMessage, mimicks PQerrorMessage; - PQcancelFinish, mimicks PQfinish. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Reviewed-by: Denis Laxalde <denis.laxalde@dalibo.com> Discussion: https://postgr.es/m/AM5PR83MB0178D3B31CA1B6EC4A8ECC42F7529@AM5PR83MB0178.EURPRD83.prod.outlook.com	2024-03-12 17:32:25 +01:00
Heikki Linnakangas	cb9663e20d	Fix copying SockAddr struct Valgrind alerted about accessing uninitialized bytes after commit 4945e4ed4a: ==700242== VALGRINDERROR-BEGIN ==700242== Conditional jump or move depends on uninitialised value(s) ==700242== at 0x6D8A2A: getnameinfo_unix (ip.c:253) ==700242== by 0x6D8BD1: pg_getnameinfo_all (ip.c:122) ==700242== by 0x4B3EB6: BackendInitialize (postmaster.c:4266) ==700242== by 0x4B684E: BackendStartup (postmaster.c:4114) ==700242== by 0x4B6986: ServerLoop (postmaster.c:1780) ==700242== by 0x4B80CA: PostmasterMain (postmaster.c:1478) ==700242== by 0x3F7424: main (main.c:197) ==700242== Uninitialised value was created by a stack allocation ==700242== at 0x4B6934: ServerLoop (postmaster.c:1737) ==700242== ==700242== VALGRINDERROR-END That was because the SockAddr struct was not copied correctly. Per buildfarm animal "skink".	2024-03-12 15:31:02 +02:00
Heikki Linnakangas	4945e4ed4a	Move initialization of the Port struct to the child process In postmaster, use a more lightweight ClientSocket struct that encapsulates just the socket itself and the remote endpoint's address that you get from accept() call. ClientSocket is passed to the child process, which initializes the bigger Port struct. This makes it more clear what information postmaster initializes, and what is left to the child process. Rename the StreamServerPort and StreamConnection functions to make it more clear what they do. Remove StreamClose, replacing it with plain closesocket() calls. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-12 13:42:38 +02:00
Heikki Linnakangas	d162c3a73b	Pass CAC as an argument to the backend process We used to smuggle it to the child process in the Port struct, but it seems better to pass it down as a separate argument. This paves the way for the next commit, which moves the initialization of the Port struct to the backend process, after forking. Reviewed-by: Tristan Partin, Andres Freund Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-12 13:42:36 +02:00
Heikki Linnakangas	73f7fb2a4c	Set socket options in child process after forking Try to minimize the work done in the postmaster process for each accepted connection, so that postmaster can quickly proceed with its duties. These function calls are very fast so this doesn't make any measurable performance difference in practice, but it's nice to have all the socket options initialization code in one place for sake of readability too. This also paves the way for an upcoming commit that will move the initialization of the Port struct to the child process. Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi	2024-03-12 13:42:28 +02:00
Heikki Linnakangas	f8c5317d00	Disconnect if socket cannot be put into non-blocking mode Commit `387da18874` moved the code to put socket into non-blocking mode from socket_set_nonblocking() into the one-time initialization function, pq_init(). In socket_set_nonblocking(), there indeed was a risk of recursion on failure like the comment said, but in pq_init(), ERROR or FATAL is fine. There's even another elog(FATAL) just after this, if setting FD_CLOEXEC fails. Note that COMMERROR merely logged the error, it did not close the connection, so if putting the socket to non-blocking mode failed we would use the connection anyway. You might not immediately notice, because most socket operations in a regular backend wait for the socket to become readable/writable anyway. But e.g. replication will be quite broken. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/d40a5cd0-2722-40c5-8755-12e9e811fa3c@iki.fi	2024-03-12 10:18:32 +02:00
Alvaro Herrera	4dec98c2af	libpq: Move pg_cancel to fe-cancel.c No other files need to access this struct, so there is no need to have its definition in a header file. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/202403061822.spfzqbf7dsgg@alvherre.pgsql	2024-03-12 09:11:24 +01:00
Michael Paquier	d6e171fed6	Keep replication slot statistics on invalidation The code path in charge of invalidating a replication slot includes a call to pgstat_drop_replslot(), which would result in removing the statistics of the slot once invalidated. However, there is no need to remove the statistics of an invalidated slot as one could still be interested in looking at them to understand the activity of the slot until its actual removal. The initial design of the feature committed in `be87200efd` used the approach to drop the slots, which is likely why the statistics were still removed during the invalidation. Another problem with this operation is that it was done without holding ReplicationSlotAllocationLock, leaving it unprotected on concurrent activity. This part is arguably a bug, but that's a limited problem in practice so no backpatch is done. In passing, this commit adds a test to check this behavior. The only remaining code path where slot statistics are dropped now related to the slot getting dropped. Author: Bertrand Drouvot Discussion: https://postgr.es/m/ZermH08Eq6YydHpO@ip-10-97-1-34.eu-west-3.compute.internal	2024-03-12 14:22:31 +09:00
Amit Kapila	397cd0b3c7	Remove redundant fetch of the recent flush pointer in WalSndWaitForWal. In WalSndWaitForWal(), we fetch a recent flush pointer both outside the loop and inside the loop. But we start using RecentFlushPtr only after we fetch it inside the loop. So we can remove one outside the loop. Author: Shveta Malik Reviewed-by: Bertrand Drouvot, Matthias van de Meent, Amit Kapila Discussion: https://postgr.es/m/CAJpy0uBSCQz1yMD-WiEthzEe23dti2-Kr_pitVb7vAPFbFKm=A@mail.gmail.com	2024-03-12 10:25:27 +05:30
Michael Paquier	2c8118ee5d	Use printf's %m format instead of strerror(errno) in more places Most callers of strerror() are removed from the backend code. The remaining callers require special handling with a saved errno from a previous system call. The frontend code still needs strerror() where error states need to be handled outside of fprintf. Note that pg_regress is not changed to use %m as the TAP output may clobber errno, since those functions call fprintf() and friends before evaluating the format string. Support for %m in src/port/snprintf.c has been added in `d6c55de1f9`, hence all the stable branches currently supported include it. Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/87sf13jhuw.fsf@wibble.ilmari.org	2024-03-12 10:02:54 +09:00
Peter Geoghegan	3045324214	Update obsolete index scan TID comments. Oversight in commit `c2fe139c20`.	2024-03-11 18:07:10 -04:00
Jeff Davis	bbbf71d9a6	Fix 002_pg_upgrade.pl. Commit `f696c0cd5f` caused a test failure in 002_pg_upgrade.pl, because an earlier s/// operator caused qr// to no longer match the empty string. Use qr/^$/ instead, which is a better test anyway, because we expect the stderr to be empty. Initially this appeared to be a perl bug, but per discussion, it seems that it was a misunderstanding of how perl works: an empty pattern uses the last successful pattern. Given how surprising that behavior is to perl non-experts, we will need to look for similar problems elsewhere and eliminate the use of empty patterns throughout the code. For now, address this one instance to fix the buildfarm. Discussion: https://postgr.es/m/0ef325fa06e7a1605c4e119c4ecb637c67e5fb4e.camel@j-davis.com Reviewed-by: Tom Lane	2024-03-11 14:09:07 -07:00
Alvaro Herrera	319e9e53f3	Add tests for libpq query cancellation APIs This is in preparation of making changes and additions to these APIs. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQRb21spiiykQ48rzz8w+Hcykz+mB2_hxR65D9Qk6nnw=w@mail.gmail.com	2024-03-11 21:54:03 +01:00
Nathan Bossart	24c928ad9a	reindexdb: Allow specifying objects to process in all databases. Presently, reindexdb's --table, --schema, --index, and --system options cannot be used together with --all, i.e., you cannot specify objects to process in all databases. This commit removes this unnecessary restriction. Furthermore, it removes the restriction that --system cannot be used with --table, --schema, and --index. There is no such restriction for the latter options, and there is no technical reason to disallow these combinations. Reviewed-by: Kyotaro Horiguchi, Dean Rasheed Discussion: https://postgr.es/m/20230628232402.GA1954626%40nathanxps13	2024-03-11 15:42:27 -05:00
Heikki Linnakangas	3d8652cd32	Remove unneeded vacuum_delay_point from heap_vac_scan_get_next_block heap_vac_scan_get_next_block() does relatively little work, so there is no need to call vacuum_delay_point(). A future commit will call heap_vac_scan_get_next_block() from a callback, and we would like to avoid calling vacuum_delay_point() in that callback. Author: Melanie Plageman Discussion: https://postgr.es/m/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com	2024-03-11 20:45:33 +02:00
Heikki Linnakangas	4e76f984a7	Confine vacuum skip logic to lazy_scan_skip() Rename lazy_scan_skip() to heap_vac_scan_next_block() and move more code into the function, so that the caller doesn't need to know about ranges or skipping anymore. heap_vac_scan_next_block() returns the next block to process, and the logic for determining that block is all within the function. This makes the skipping logic easier to understand, as it's all in the same function, and makes the calling code easier to understand as it's less cluttered. The state variables needed to manage the skipping logic are moved to LVRelState. heap_vac_scan_next_block() now manages its own VM buffer separately from the caller's vmbuffer variable. The caller's vmbuffer holds the VM page for the current block its processing, while heap_vac_scan_next_block() keeps a pin on the VM page for the next unskippable block. Most of the time they are the same, so we hold two pins on the same buffer, but it's more convenient to manage them separately. For readability inside heap_vac_scan_next_block(), move the logic of finding the next unskippable block to separate function, and add some comments. This refactoring will also help future patches to switch to using a streaming read interface, and eventually AIO (https://postgr.es/m/CA%2BhUKGJkOiOCa%2Bmag4BF%2BzHo7qo%3Do9CFheB8%3Dg6uT5TUm2gkvA%40mail.gmail.com) Author: Melanie Plageman, Heikki Linnakangas Reviewed-by: Andres Freund (older version) Discussion: https://postgr.es/m/CAAKRu_Yf3gvXGcCnqqfoq0Q8LX8UM-e-qbm_B1LeZh60f8WhWA%40mail.gmail.com	2024-03-11 20:43:58 +02:00
Nathan Bossart	1b49d56d35	clusterdb: Allow specifying tables to process in all databases. Presently, clusterdb's --table option cannot be used together with --all, i.e., you cannot specify tables to process in all databases. This commit removes this unnecessary restriction. In passing, change the synopsis in the documentation to use "[option...]" instead of "[--verbose \| -v]". There are other general-purpose options (e.g., --quiet and --echo), but the synopsis currently only lists --verbose. Reviewed-by: Kyotaro Horiguchi, Dean Rasheed Discussion: https://postgr.es/m/20230628232402.GA1954626%40nathanxps13	2024-03-11 13:11:20 -05:00
Alvaro Herrera	095493a377	Add missing connection statuses to docs The list of connection statuses that PQstatus might return during an asynchronous connection attempt was outdated: 1. CONNECTION_SETENV is never returned anymore and is only part of the enum for backwards compatibility, so remove it from the docs. 2. CONNECTION_CHECK_STANDBY and CONNECTION_GSS_STARTUP were not listed, so add them. CONNECTION_NEEDED and CONNECTION_CHECK_TARGET are not listed in the docs on purpose, since these are internal states that can never be observed by a caller of PQstatus. Author: Jelte Fennema-Nio <postgres@jeltef.nl> Discussion: https://postgr.es/m/CAGECzQRb21spiiykQ48rzz8w+Hcykz+mB2_hxR65D9Qk6nnw=w@mail.gmail.com	2024-03-11 17:20:36 +01:00
Nathan Bossart	648928c79b	vacuumdb: Allow specifying objects to process in all databases. Presently, vacuumdb's --table, --schema, and --exclude-schema options cannot be used together with --all, i.e., you cannot specify tables or schemas to process in all databases. This commit removes this unnecessary restriction, thus enabling potentially useful commands like "vacuumdb --all --schema pg_catalog". Reviewed-by: Kyotaro Horiguchi, Dean Rasheed Discussion: https://postgr.es/m/20230628232402.GA1954626%40nathanxps13	2024-03-11 10:33:36 -05:00
Heikki Linnakangas	674e49c73c	Set all_visible_according_to_vm correctly with DISABLE_PAGE_SKIPPING It's important for 'all_visible_according_to_vm' to correctly reflect whether the VM bit is set or not, even when we are not trusting the VM to skip pages, because contrary to what the comment said, lazy_scan_prune() relies on it. If it's incorrectly set to 'false', when the VM bit is in fact set, lazy_scan_prune() will try to set the VM bit again and dirty the page unnecessarily. As a result, if you used DISABLE_PAGE_SKIPPING, all heap pages were dirtied, even if there were no changes. We would also fail to clear any VM bits that were set incorrectly. This was broken in commit `980ae17310`, so backpatch to v16. Backpatch-through: 16 Reviewed-by: Melanie Plageman, Peter Geoghegan Discussion: https://www.postgresql.org/message-id/3df2b582-dc1c-46b6-99b6-38eddd1b2784@iki.fi	2024-03-11 09:28:09 +02:00
Heikki Linnakangas	af0e7deb4a	Don't destroy SMgrRelations at relcache invalidation With commit `21d9c3ee4e`, SMgrRelations remain valid until end of transaction (or longer if they're "pinned"). Relcache invalidation can happen in the middle of a transaction, so we must not destroy them at relcache invalidation anymore. This was revealed by failures in the 'constraints' test in buildfarm animals using -DCLOBBER_CACHE_ALWAYS. That started failing with commit `8af2565248`, which was the first commit that started to rely on an SMgrRelation living until end of transaction. Diagnosed-by: Tomas Vondra, Thomas Munro Reviewed-by: Thomas Munro Discussion: https://www.postgresql.org/message-id/CA%2BhUKGK%2B5DOmLaBp3Z7C4S-Yv6yoROvr1UncjH2S1ZbPT8D%2BZg%40mail.gmail.com	2024-03-11 09:08:02 +02:00
David Rowley	e629846472	Fix incorrect accessing of pfree'd memory in Memoize For pass-by-reference types, the code added in `0b053e78b`, which aimed to resolve a memory leak, was overly aggressive in resetting the per-tuple memory context which could result in pfree'd memory being accessed resulting in failing to find previously cached results in the hash table. What was happening was prepare_probe_slot() was switching to the per-tuple memory context and calling ExecEvalExpr(). ExecEvalExpr() may have required a memory allocation. Both MemoizeHash_hash() and MemoizeHash_equal() were aggressively resetting the per-tuple context and after determining the hash value, the context would have gotten reset before MemoizeHash_equal() was called. This could have resulted in MemoizeHash_equal() looking at pfree'd memory. This is less likely to have caused issues on a production build as some other allocation would have had to have reused the pfree'd memory to overwrite it. Otherwise, the original contents would have been intact. However, this clearly caused issues on MEMORY_CONTEXT_CHECKING builds. Author: Tender Wang, Andrei Lepikhov Reported-by: Tender Wang (using SQLancer) Reviewed-by: Andrei Lepikhov, Richard Guo, David Rowley Discussion: https://postgr.es/m/CAHewXNnT6N6UJkya0z-jLFzVxcwGfeRQSfhiwA+NyLg-x8iGew@mail.gmail.com Backpatch-through: 14, where Memoize was added	2024-03-11 18:19:56 +13:00
Michael Paquier	b36fbd9f8d	Improve consistency of replication slot statistics The replication slot stats stored in shared memory rely on an internal index number. Both pgstat_reset_replslot() and pgstat_fetch_replslot() lacked some LWLock protections with ReplicationSlotControlLock while operating on these index numbers. This issue could cause these two functions to potentially operate on incorrect slots when taken in isolation in the event of slots dropped and/or re-created concurrently. Note that pg_stat_get_replication_slot() is called once per slot when querying pg_stat_replication_slots, meaning that the stats are retrieved across multiple ReplicationSlotControlLock acquisitions. So, while this commit improves more consistency, it may still be possible that statistics are not completely consistent for a single scan of pg_stat_replication_slots under concurrent replication slot drop or creation activity. The issue should unlikely be a problem in practice, causing the report of inconsistent stats or or the stats reset of an incorrect slot, so no backpatch is done. Author: Bertrand Drouvot Reviewed-by: Heikki Linnakangas, Shveta Malik, Michael Paquier Discussion: https://postgr.es/m/ZeGq1HDWFfLkjh4o@ip-10-97-1-34.eu-west-3.compute.internal	2024-03-11 10:25:01 +09:00
Michael Paquier	f500ba07fa	Add some checkpoint and redo LSNs to a couple of recovery errors Two FATALs and one PANIC gain details about the LSNs they fail at: - When restoring from a backup_label, the FATAL log generated when not finding the checkpoint record now reports its LSN. - When restoring from a backup_label, the FATAL log generated when not finding the redo record referenced by a checkpoint record now shows both the redo and checkpoint record LSNs. - When not restoring from a backup_label, the PANIC error generated when not finding the checkpoint record now reports its LSN. This information is useful when debugging corruption issues, and these LSNs may not show up in the logs depending on the level of logging configured in the backend. Author: David Steele Discussion: https://postgr.es/m/0e90da89-77ca-4ccf-872c-9626d755e288@pgmasters.net	2024-03-11 09:08:05 +09:00
Michael Paquier	a04ddd077e	Improve support for ExplainOneQuery() hook There is a hook called ExplainOneQuery_hook that gives modules the possibility to plug into this code path, but, like utility.c for utility statement execution, there is no corresponding "standard" routine in the case of EXPLAIN executed for one Query. This commit adds a new standard_ExplainOneQuery() in explain.c, which is able to run explain on a non-utility Query without calling its hook. Per the feedback received from a couple of hackers, this change gives the possibility to cut a few hundred lines of code in some of the popular out-of-core modules as these maintained a copy of ExplainOneQuery(), adding custom extra information at the beginning or the end of the EXPLAIN output. Author: Mats Kindahl Reviewed-by: Aleksander Alekseev, Jelte Fennema-Nio, Andrei Lepikhov Discussion: https://postgr.es/m/CA+14427V_B4EAoC_o-iYYucRdMSOTfpuH9k-QbexffY1HYJBiA@mail.gmail.com	2024-03-11 08:40:40 +09:00
Peter Eisentraut	7b8e2ae2fd	Combine headerscheck and cpluspluscheck scripts They are mostly the same, and it is tedious to maintain two copies of essentially the same exclude list. headerscheck now has a new option --cplusplus to select the cpluspluscheck functionality. The top-level make targets are still the same. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/4754a5b0-a32b-4036-a99a-6de14cf9fd72@eisentraut.org	2024-03-10 07:56:17 +01:00
Jeff Davis	f696c0cd5f	Catalog changes preparing for builtin collation provider. Rename pg_collation.colliculocale to colllocale, and pg_database.daticulocale to datlocale. These names reflects that the fields will be useful for the upcoming builtin provider as well, not just for ICU. This is purely a rename; no changes to the meaning of the fields. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Peter Eisentraut	2024-03-09 14:48:18 -08:00
Tom Lane	519443162d	Simplify and merge unwanted-module drop logic in AdjustUpgrade.pm. In `be7800674` and followups, we failed to notice that there was already a better way to do it: instead of using DROP DATABASE IF EXISTS, we can check the list of existing DBs. Also, there seems no reason not to merge this into the pre-existing code for getting rid of unwanted module databases. Discussion: https://postgr.es/m/1066872.1710006597@sss.pgh.pa.us	2024-03-09 16:20:44 -05:00
Jeff Davis	b0289574bd	Run perltidy on 002_pg_upgrade.pl.	2024-03-09 11:51:31 -08:00
Jeff Davis	5ba8b70deb	Fix cross-version pg_upgrade test. Pass each statement as a separate '-c' arg, so they don't get combined into a single transaction. Discussion: https://postgr.es/m/bca97aecb50b2026b7dbc26604bf31861c819a64.camel@j-davis.com Reviewed-by: Tom Lane	2024-03-09 11:50:04 -08:00
Michael Paquier	f160bf06f7	Document units of "timeout" in ConditionVariableTimedSleep() The timeout is passed down to WaitLatch() as milliseconds. Author: Shveta Malik Discussion: https://postgr.es/m/CAJpy0uC=xiBQD1WapgYYvOiytap6ULJaakLd867zZXqu9tYc8w@mail.gmail.com	2024-03-09 15:44:41 +09:00
Jeff Davis	33ee2550d3	Fix type signedness error in commit `5c40364dd6`. Use ssize_t instead of size_t. Discussion: https://postgr.es/m/b20d6d97-7338-48ea-ba33-837a1c8ef98e@iki.fi Reported-by: Heikki Linnakangas	2024-03-08 16:00:46 -08:00
Daniel Gustafsson	be41a9b038	Fix errorhandling for reading from a pipe When reading a line from a pipe failed on no data being read, the errorhandling was erroneously logging with %m even thoug no error description is available for %m to print. This flaw accidentally introduced in `5c7038d70b`. Reported-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/baa34329-f431-46af-bf74-1a78fdc90e4f@eisentraut.org	2024-03-08 22:53:06 +01:00
Daniel Gustafsson	6929e133b3	Replace perror with custom postgres logging perror() is not used in postgres anymore out of policy, this replaces the final callsites with the custom postgres logging framework. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/89B00F63-40F7-4D82-8353-DC9CABBAC1D1@yesql.se	2024-03-08 22:50:20 +01:00
Tom Lane	f07a20c8a3	Improve WIN32 waiting logic in psql's \watch command. do_watch had some leftover logic for enabling siglongjmp out of waiting for input. That's never done anything on Windows (cf. psql_cancel_callback), and do_watch no longer relies on it for non-Windows, so let's drop it. Also, when the user cancels \watch by pressing ^C, the Windows code would run the query one more time before exiting. That doesn't seem very desirable, and it's not what happens on other platforms. Use the "done" flag similarly to non-Windows to avoid the extra query execution. Yugo Nagata (with minor fixes by me) Discussion: https://postgr.es/m/20240305220552.85fd4afd6b6b8103bf4fe3d0@sraoss.co.jp	2024-03-08 12:07:35 -05:00
Alvaro Herrera	270af6f0df	Admit deferrable PKs into rd_pkindex, but flag them as such ... and in particular don't return them as replica identity. The motivation for this change is letting the primary keys be seen by code that derives NOT NULL constraints from them, when creating inheritance children; before this change, if you had a deferrable PK, pg_dump would not recreate the attnotnull marking properly, because the column would not be considered as having anything to back said marking after dropping the throwaway NOT NULL constraint. The reason we don't want these PKs as replica identities is that replication can corrupt data, if the uniqueness constraint is transiently broken. Reported-by: Amul Sul <sulamul@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/CAAJ_b94QonkgsbDXofakHDnORQNgafd1y3Oa5QXfpQNJyXyQ7A@mail.gmail.com	2024-03-08 16:32:29 +01:00
Alexander Korotkov	4c1973fcae	Avoid recursion in MemoryContext functions You might run out of stack space with recursion, which is not nice in functions that might be used e.g. at cleanup after transaction abort. MemoryContext contains pointer to parent and siblings, so we can traverse a tree of contexts iteratively, without using stack. Refactor the functions to do that. MemoryContextStats() still recurses, but it now has a limit to how deep it recurses. Once the limit is reached, it prints just a summary of the rest of the hierarchy, similar to how it summarizes contexts with lots of children. That seems good anyway, because a context dump with hundreds of nested contexts isn't very readable. Report by Egor Chindyaskin and Alexander Lakhin. Discussion: https://postgr.es/m/1672760457.940462079%40f306.i.mail.ru Author: Heikki Linnakangas Reviewed-by: Robert Haas, Andres Freund, Alexander Korotkov, Tom Lane	2024-03-08 13:18:30 +02:00
Alexander Korotkov	6f38c43eb1	Avoid stack overflow in ShowTransactionStateRec() The function recurses, but didn't perform stack-depth checks. It's just a debugging aid, so instead of the usual check_stack_depth() call, stop the printing if we'd risk stack overflow. Here's an example of how to test this: (n=1000000; printf "BEGIN;"; for ((i=1;i<=$n;i++)); do printf "SAVEPOINT s$i;"; done; printf "SET log_min_messages = 'DEBUG5'; SAVEPOINT sp;") \| psql >/dev/null In the passing, swap building the list of child XIDs and recursing to parent. That saves memory while recursing, reducing the risk of out of memory errors with lots of subtransactions. The saving is not very significant in practice, but this order seems more logical anyway. Report by Egor Chindyaskin and Alexander Lakhin. Discussion: https://www.postgresql.org/message-id/1672760457.940462079%40f306.i.mail.ru Author: Heikki Linnakangas Reviewed-by: Robert Haas, Andres Freund, Alexander Korotkov	2024-03-08 13:18:30 +02:00
Alexander Korotkov	fefd9a3fed	Turn tail recursion into iteration in CommitTransactionCommand() Usually the compiler will optimize away the tail recursion anyway, but if it doesn't, you can drive the function into stack overflow. For example: (n=1000000; printf "BEGIN;"; for ((i=1;i<=$n;i++)); do printf "SAVEPOINT s$i;"; done; printf "ERROR; COMMIT;") \| psql >/dev/null In order to get better readability and less changes to the existing code the recursion-replacing loop is implemented as a wrapper function. Report by Egor Chindyaskin and Alexander Lakhin. Discussion: https://postgr.es/m/1672760457.940462079%40f306.i.mail.ru Author: Alexander Korotkov, Heikki Linnakangas	2024-03-08 13:18:30 +02:00
John Naylor	6d9751fa8f	Revert "Fix link error for test_radixtree module on Windows" This reverts commit `9552e3ace3`. I (john) forgot to revert this locally when a more principled fix was found, which has the same message title.	2024-03-08 11:09:15 +07:00
John Naylor	ab6ae62603	Fix link error for test_radixtree module on Windows Add PGDLLIMPORT to pg_popcount32/64. In passing, fix a typo. Diagnosis by Masahiko Sawada, patch by David Rowley Per buildfarm members drongo and fairywren Discussion: https://postgr.es/m/CAD21AoAMm1mQd%3Dw4PrfrKK%3DOMP8j8%3D7ntJRPF8%2B%3D10iUuvwiCA%40mail.gmail.com Discussion: https://postgr.es/m/CAApHDvov7724UrD1Ug0D1eV%2B9Pd_x5VEQmw-6HVG9w1WdCxXPA%40mail.gmail.com	2024-03-08 10:57:40 +07:00
John Naylor	9552e3ace3	Fix link error for test_radixtree module on Windows Add back "link_with" directive, similar to the one removed by `1f1d73a8b`, but only for Windows, but use the "_shlib" variation. Diagnosis by Masahiko Sawada, proposed fix adjusted and tested by me Per buildfarm members drongo and fairywren Discussion: https://postgr.es/m/CAD21AoAMm1mQd%3Dw4PrfrKK%3DOMP8j8%3D7ntJRPF8%2B%3D10iUuvwiCA%40mail.gmail.com	2024-03-08 10:25:23 +07:00
Amit Kapila	bf279ddd1c	Introduce a new GUC 'standby_slot_names'. This patch provides a way to ensure that physical standbys that are potential failover candidates have received and flushed changes before the primary server making them visible to subscribers. Doing so guarantees that the promoted standby server is not lagging behind the subscribers when a failover is necessary. The logical walsender now guarantees that all local changes are sent and flushed to the standby servers corresponding to the replication slots specified in 'standby_slot_names' before sending those changes to the subscriber. Additionally, the SQL functions pg_logical_slot_get_changes, pg_logical_slot_peek_changes and pg_replication_slot_advance are modified to ensure that they process changes for failover slots only after physical slots specified in 'standby_slot_names' have confirmed WAL receipt for those. Author: Hou Zhijie and Shveta Malik Reviewed-by: Masahiko Sawada, Peter Smith, Bertrand Drouvot, Ajin Cherian, Nisha Moond, Amit Kapila Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com	2024-03-08 08:10:45 +05:30
Tom Lane	453c468737	Cope with a deficiency in OpenSSL 3.x's error reporting. In OpenSSL 3.0.0 and later, ERR_reason_error_string randomly refuses to provide a string for error codes representing system errno values (e.g., "No such file or directory"). There is a poorly-documented way to extract the errno from the SSL error code in this case, so do that and apply strerror, rather than falling back to reporting the error code's numeric value as we were previously doing. Problem reported by David Zhang, although this is not his proposed patch; it's instead based on a suggestion from Heikki Linnakangas. Back-patch to all supported branches, since any of them are likely to be used with recent OpenSSL. Discussion: https://postgr.es/m/b6fb018b-f05c-4afd-abd3-318c649faf18@highgo.ca	2024-03-07 19:38:17 -05:00
Michael Paquier	d61a6cad64	Add support for DEFAULT in ALTER TABLE .. SET ACCESS METHOD This option can be used to switch a relation to use the access method set by default_table_access_method when running the command. This has come up when discussing the possibility to support setting pg_class.relam for partitioned tables (left out here as future work), while being useful on its own for relations with physical storage as these must have an access method set. Per suggestion from Justin Pryzby. Author: Michael Paquier Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/ZeCZ89xAVFeOmrQC@pryzbyj2023	2024-03-08 09:31:52 +09:00
Michael Paquier	4f8c1e7aaf	Update comment of AlterTableCmd->name in parsenodes.h Since `b0483263dd`, this field can be used to store an access method name for ALTER TABLE, but access methods were not mentioned in the field's description. Issue noticed while working on the area. Discussion: https://postgr.es/m/ZeWKgCtk6xiAsDsc@paquier.xyz	2024-03-08 08:44:13 +09:00
Jeff Davis	5c40364dd6	Unicode case mapping tables and functions. Implements Unicode simple case mapping, in which all code points map to exactly one other code point unconditionally. These tables are generated from UnicodeData.txt, which is already being used by other infrastructure in src/common/unicode. The tables are checked into the source tree, so they only need to be regenerated when we update the Unicode version. In preparation for the builtin collation provider, and possibly useful for other callers. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Peter Eisentraut, Daniel Verite, Jeremy Schneider	2024-03-07 11:15:06 -08:00
Peter Eisentraut	6d470211e5	Fix description and grouping of RangeTblEntry.inh The inh field of RangeTblEntry was doubly confusingly documented. Some parts of the code insisted that it was only valid for RTE_RELATION entries, other parts said the field was valid for all entries. Neither was quite correct. More correctly, the field is valid for RTE_RELATION entries but is also used in the planner for RTE_SUBQUERY entries. So it makes more sense to group it with other fields that are primarily for RTE_RELATION but borrowed by RTE_SUBQUERY. (The exact position was chosen so that it is next to relkind for better struct packing, and next to relid, since relid and inh are sort of the input fields and the others are filled in later.) Also add documentation for the planner's use at the struct definition. Discussion: https://www.postgresql.org/message-id/6c1fbccc-85c8-40d3-b08b-4f47f2093711@eisentraut.org	2024-03-07 12:13:09 +01:00
John Naylor	1f1d73a8b8	Blind attempt to fix ODR violations Remove apparently useless "link_with" directive. Even if this isn't the root cause, it makes the .build file more like the other test modules. Reviewed by Masahiko Sawada Follow-up to `ee1b30f12`, per buildfarm members olingo and grassquit. Discussion: https://postgr.es/m/CANWCAZaJAaO8MimTU%2BY-DZutM6HQLQu%3DK2HyoQULdB3v_6BSCg%40mail.gmail.com	2024-03-07 17:01:07 +07:00
Dean Rasheed	29ef1dd19b	Fix handling of self-modified tuples in MERGE. When an UPDATE or DELETE action in MERGE returns TM_SelfModified, there are 2 possible causes: 1). The target tuple was already updated or deleted by the current command. This can happen if the target row joins to more than one source row, and the SQL standard explicitly says that this must be an error. 2). The target tuple was already updated or deleted by a later command in the current transaction. This can happen if the tuple is modified by a BEFORE trigger or a volatile function used in the query, and should be an error for the same reason that it is in a plain UPDATE or DELETE command. In MERGE's primary error handling block, it failed to check for (2), causing it to return a misleading error message in such cases. In the secondary error handling block, following a concurrent update from another session, it failed to check for (1), causing it to silently ignore target rows joined to more than one source row, instead of reporting an error. Fix this, and add tests for both of these cases. Per report from Wenjiang Zhang. Back-patch to v15, where MERGE was introduced. Discussion: https://postgr.es/m/tencent_41DE0FF443FE14B94A5898D373792109E408%40qq.com	2024-03-07 09:57:02 +00:00
John Naylor	e444ebcb85	Fix incorrect format specifier for int64 Follow-up to `ee1b30f12`, per buildfarm member mamba. Discussion: https://postgr.es/m/CANWCAZYwyRMU%2BOTVOjK%3Dno1hm-W3ZQ5vrSFM1MFAaLtLydvwzA%40mail.gmail.com	2024-03-07 14:26:52 +07:00
John Naylor	ac234e6377	Fix redefinition of typedefs Per buildfarm members sifaka and longfin, clang with -Wtypedef-redefinition warns of duplicate typedefs unless building with C11. Follow-up to `ee1b30f12`. Masahiko Sawada Discussion: https://postgr.es/m/CANWCAZauSg%3DLUbBbXhpeQtBuPifmzQNTYS6O8NsoAPz1zL-Txg%40mail.gmail.com	2024-03-07 14:11:49 +07:00
John Naylor	ee1b30f128	Add template for adaptive radix tree This implements a radix tree data structure based on the design in "The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases" by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main technique that makes it adaptive is using several different node types, each with a different capacity of elements, and a different algorithm for accessing them. The nodes start small and grow/shrink as needed. The main advantage over hash tables is efficient sorted iteration and better memory locality when successive keys are lexicographically close together. The implementation currently assumes 64-bit integer keys, and traversing the tree is in general slower than a linear probing hash table, so this is not a general-purpose associative array. The paper describes two other techniques not implemented here, namely "path compression" and "lazy expansion". These can further reduce memory usage and speed up traversal, but the former would add significant complexity and the latter requires storing the full key with the value. We do trivially compress the path when leading bytes of the key are zeros, however. For value storage, we use "combined pointer/value slots", as recommended in the paper. Values of size equal or smaller than the the platform's pointer type are stored in the array of child pointers in the last level node, while larger values are each stored in a separate allocation. This is for now fixed at compile time, but it would be fairly trivial to allow determining at runtime how variable-length values are stored. One innovation in our implementation compared to the ART paper is decoupling the notion of node "size class" from "kind". The size classes within a given node kind have the same underlying type, but a variable capacity for children, so we can introduce additional node sizes with little additional code. To enable different use cases to specialize for different value types and for shared/local memory, we use macro-templatized code generation in the same manner as simplehash.h and sort_template.h. Future commits will use this infrastructure for storing TIDs. Patch by Masahiko Sawada and John Naylor, but a substantial amount of credit is due to Andres Freund, whose proof-of-concept was a valuable source of coding idioms and awareness of performance pitfalls, and who reviewed earlier versions. Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com	2024-03-07 12:40:11 +07:00
Michael Paquier	65db0cfb4c	Revert "Add recovery TAP test for race condition with slot invalidations" This reverts commit `08a52ab151`, due to some sporadic instability in the test. Getting the test right should require some redesign with a second injection point, but let's revert it for now to avoid these issues in the CI as a lot of patches are under discussion in this last commit fest. Per buildfarm members hachi and gokiburi. Discussion: https://postgr.es/m/ZekQQHCrIqLVpGz5@paquier.xyz	2024-03-07 09:57:52 +09:00
Michael Paquier	099ca50bd4	Revert "Fix parallel-safety check of expressions and predicate for index builds" This reverts commit `eae7be600b`, following a discussion with Tom Lane, due to concerns that this impacts the decisions made by the planner for the number of workers spawned based on the inlining and const-folding of index expressions and predicate for cases that would have worked until this commit. Discussion: https://postgr.es/m/162802.1709746091@sss.pgh.pa.us Backpatch-through: 12	2024-03-07 08:30:35 +09:00
Jeff Davis	ad49994538	Add Unicode property tables. Provide functions to test for Unicode properties, such as Alphabetic or Cased. These functions use tables derived from Unicode data files, similar to the tables for Unicode normalization or general category, and those tables can be updated with the 'update-unicode' build target. Use Unicode properties to provide functions to test for regex character classes, like 'punct' or 'alnum'. Infrastructure in preparation for a builtin collation provider, and may also be useful for other callers. Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com Reviewed-by: Daniel Verite, Peter Eisentraut, Jeremy Schneider	2024-03-06 12:50:01 -08:00
Tom Lane	2ed8f9a01e	Fix type-checking of RECORD-returning functions in FROM. In the corner case where a function returning RECORD has been simplified to a RECORD constant or an inlined ROW() expression, ExecInitFunctionScan failed to cross-check the function's result rowtype against the coldeflist provided by the calling query. That happened because get_expr_result_type is able to extract a tupdesc from such expressions, which led ExecInitFunctionScan to ignore the coldeflist. (Instead, it used the extracted tupdesc to check the function's output, which of course always succeeds.) I have not been able to demonstrate any really serious consequences from this, because if some column of the result is of the wrong type and is directly referenced by a Var of the calling query, CheckVarSlotCompatibility will catch it. However, we definitely do fail to report the case where the function returns more columns than the coldeflist expects, and in the converse case where it returns fewer columns, we get an assert failure (but, seemingly, no worse results in non-assert builds). To fix, always build the expected tupdesc from the coldeflist if there is one, and consult get_expr_result_type only when there isn't one. Also remove the failing Assert, even though it is no longer reached after this fix. It doesn't seem to be adding anything useful, since later checking will deal with cases with the wrong number of columns. The only other place I could find that is doing something similar is inline_set_returning_function. There's no live bug there because we cannot be looking at a Const or RowExpr, but for consistency change that code to agree with ExecInitFunctionScan. Per report from PetSerAl. After some debate I've concluded that this should be back-patched. There is a small risk that somebody has been relying on such a case not throwing an error, but I judge this outweighed by the risk that I've missed some way in which the failure to cross-check has worse consequences than sketched above. Discussion: https://postgr.es/m/CAKygsHSerA1eXsJHR9wft3Gn3wfHQ5RfP8XHBzF70_qcrrRvEg@mail.gmail.com	2024-03-06 14:41:13 -05:00
John Naylor	de7c6fe834	Fix signedness error in `9f225e992` for gcc The first argument of vshrq_n_s8 needs to be a signed vector type, but it was passed unsigned. Clang is more lax with conversion, but gcc needs a cast. Fix by me, tested by Masahiko Sawada Per buildfarm members splitfin, batta, widowbird, snakefly, parula, massasauga Discussion: https://postgr.es/m/20240306074106.mg6w4koohdlworbs%40alap3.anarazel.de	2024-03-06 15:55:55 +07:00
Michael Paquier	eae7be600b	Fix parallel-safety check of expressions and predicate for index builds As coded, the planner logic that calculates the number of parallel workers to use for a parallel index build uses expressions and predicates from the relcache, which are flattened for the planner by eval_const_expressions(). As reported in the bug, an immutable parallel-unsafe function flattened in the relcache would become a Const, which would be considered as parallel-safe, even if the predicate or the expressions including the function are not safe in parallel workers. Depending on the expressions or predicate used, this could cause the parallel build to fail. Tests are included that check parallel index builds with parallel-unsafe predicate and expressions. Two routines are added to lsyscache.h to be able to retrieve expressions and predicate of an index from its pg_index data. Reported-by: Alexander Lakhin Author: Tender Wang Reviewed-by: Jian He, Michael Paquier Discussion: https://postgr.es/m/CAHewXN=UaAaNn9ruHDH3Os8kxLVmtWqbssnf=dZN_s9=evHUFA@mail.gmail.com Backpatch-through: 12	2024-03-06 17:23:56 +09:00
John Naylor	3e76a806cb	Move some bitmap logic out of bitmapset.c Move the logic for selecting appropriate pg_bitutils.h functions based on word size to bitmapset.h for wider visibility. Reviewed (in a previous version) by Tom Lane Discussion: https://postgr.es/m/CAFBsxsFW2JjTo58jtDB%2B3sZhxMx3t-3evew8%3DAcr%2BGGhC%2BkFaA%40mail.gmail.com	2024-03-06 14:30:16 +07:00
John Naylor	9f225e992b	Introduce helper SIMD functions for small byte arrays vector8_min - helper for emulating ">=" semantics vector8_highbit_mask - used to turn the result of a vector comparison into a bitmask Masahiko Sawada Reviewed by Nathan Bossart, with additional adjustments by me Discussion: https://postgr.es/m/CAFBsxsHbBm_M22gLBO%2BAZT4mfMq3L_oX3wdKZxjeNnT7fHsYMQ%40mail.gmail.com	2024-03-06 14:25:20 +07:00
Michael Paquier	08a52ab151	Add recovery TAP test for race condition with slot invalidations This commit adds a recovery test to provide coverage for the bug fixed in `818fefd8fd`, using an injection point to wait just after the process of an active slot is killed. The trick is to give enough time for effective_xmin and effective_catalog_xmin to advance so as the slot invalidation robustness can be checked since the active process is killed without holding its slot's mutex for a short time. Author: Bertrand Drouvot Discussion: https://postgr.es/m/ZdyZya4YrNapWKqz@ip-10-97-1-34.eu-west-3.compute.internal	2024-03-06 14:39:40 +09:00
Thomas Munro	d93627bcbe	Add --copy-file-range option to pg_upgrade. The copy_file_range() system call is available on at least Linux and FreeBSD, and asks the kernel to use efficient ways to copy ranges of a file. Options available to the kernel include sharing block ranges (similar to --clone mode), and pushing down block copies to the storage layer. For automated testing, see PG_TEST_PG_UPGRADE_MODE. (Perhaps in a later commit we could consider setting this mode for one of the CI targets.) Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/CA%2BhUKGKe7Hb0-UNih8VD5UNZy5-ojxFb3Pr3xSBBL8qj2M2%3DdQ%40mail.gmail.com	2024-03-06 12:01:01 +13:00
David Rowley	2bce0ad67f	Remove surplus trailing semicolon Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4-qjotfa7G=5PEOw4LDDDX58MmTwDdpdoU3Quse_BKv1Q@mail.gmail.com	2024-03-06 10:57:31 +13:00

... 4 5 6 7 8 ...

43120 Commits