postgresql

Commit Graph

Author	SHA1	Message	Date
Tom Lane	a45bc8a4f6	Fix handling of -d "connection string" in pg_dump/pg_restore. Parallel pg_dump failed if its -d parameter was a connection string containing any essential information other than host, port, or username. The same was true for pg_restore with --create. The reason is that these scenarios failed to preserve the connection string from the command line; the code felt free to replace that with just the database name when reconnecting from a pg_dump parallel worker or after creating the target database. By chance, parallel pg_restore did not suffer this defect, as long as you didn't say --create. In practice it seems that the error would be obvious only if the connstring included essential, non-default SSL or GSS parameters. This may explain why it took us so long to notice. (It also makes it very difficult to craft a regression test case illustrating the problem, since the test would fail in builds without those options.) Fix by refactoring so that ConnectDatabase always receives all the relevant options directly from the command line, rather than reconstructed values. Inject a different database name, when necessary, by relying on libpq's rules for handling multiple "dbname" parameters. While here, let's get rid of the essentially duplicate _connectDB function, as well as some obsolete nearby cruft. Per bug #16604 from Zsolt Ero. Back-patch to all supported branches. Discussion: https://postgr.es/m/16604-933f4b8791227b15@postgresql.org	2020-09-24 18:19:38 -04:00
Tom Lane	2e3c19462d	Simplify SortTocFromFile() by removing fixed buffer-size limit. pg_restore previously coped with overlength TOC-file lines using some complicated logic to ignore additional bufferloads. While this isn't wrong, since we don't expect that the interesting part of a line would run to more than a dozen or so bytes, it's more complex than it needs to be. Use a StringInfo instead of a fixed-size buffer so that we can process long lines as single entities and thus not need the extra logic. Daniel Gustafsson Discussion: https://postgr.es/m/48A4FA71-524E-41B9-953A-FD04EF36E2E7@yesql.se	2020-09-22 16:03:32 -04:00
Peter Eisentraut	8354e7b27e	Remove unused parameters Remove various unused parameters in pg_dump code. These have all become unused over time or were never used. Discussion: https://www.postgresql.org/message-id/flat/511bb100-f829-ba21-2f10-9f952ec06ead%402ndquadrant.com	2020-09-19 13:29:54 +02:00
Tom Lane	99175141c9	Improve common/logging.c's support for multiple verbosity levels. Instead of hard-wiring specific verbosity levels into the option processing of client applications, invent pg_logging_increase_verbosity() and encourage clients to implement --verbose by calling that. Then, the common convention that more -v's gets you more verbosity just works. In particular, this allows resurrection of the debug-grade messages that have long existed in pg_dump and its siblings. They were unreachable before this commit due to lack of a way to select PG_LOG_DEBUG logging level. (It appears that they may have been unreachable for some time before common/logging.c was introduced, too, so I'm not specifically blaming `cc8d41511` for the oversight. One reason for thinking that is that it's now apparent that _allocAH()'s message needs a null-pointer guard. Testing might have failed to reveal that before 96bf88d52.) Discussion: https://postgr.es/m/1173106.1600116625@sss.pgh.pa.us	2020-09-17 12:52:18 -04:00
Andres Freund	e07633646a	code: replace 'master' with 'leader' where appropriate. Leader already is the more widely used terminology, but a few places didn't get the message. Author: Andres Freund Reviewed-By: David Steele Discussion: https://postgr.es/m/20200615182235.x7lch5n6kcjq4aue@alap3.anarazel.de	2020-07-08 12:58:32 -07:00
Michael Paquier	aaf8c99050	Fix typos and some format mistakes in comments Author: Justin Pryzby Discussion: https://postgr.es/m/20200612023709.GC14879@telsasoft.com	2020-06-12 21:05:10 +09:00
Tom Lane	a9d70c1087	Fix pg_dump/pg_restore to restore event trigger comments later. Repair an oversight in commit 8728b2c70: if we're postponing restore of event triggers to the end, we must also postpone restoring any comments on them, since of course we cannot create the comments first. (This opens yet another opportunity for an event trigger to bollix the restore, but there's no help for that.) Per bug #16346 from Alexander Lakhin. Like the previous commit, back-patch to all supported branches. Hamid Akhtar and Tom Lane Discussion: https://postgr.es/m/16346-6210ad7a0ea81be1@postgresql.org	2020-04-08 11:23:39 -04:00
Tom Lane	8728b2c703	Fix pg_dump/pg_restore to restore event triggers later. Previously, event triggers were restored just after regular triggers (and FK constraints, which are basically triggers). This is risky since an event trigger, once installed, could interfere with subsequent restore commands. Worse, because event triggers don't have any particular dependencies on any post-data objects, a parallel restore would consider them eligible to be restored the moment the post-data phase starts, allowing them to also interfere with restoration of a whole bunch of objects that would have been restored before them in a serial restore. There's no way to completely remove the risk of a misguided event trigger breaking the restore, since if nothing else it could break other event triggers. But we can certainly push them to later in the process to minimize the hazard. To fix, tweak the RestorePass mechanism introduced by commit `3eb9a5e7c` so that event triggers are handled as part of the post-ACL processing pass (renaming the "REFRESH" pass to "POST_ACL" to reflect its more general use). This will cause them to restore after everything except matview refreshes, which seems OK since matview refreshes really ought to run in the post-restore state of the database. In a parallel restore, event triggers and matview refreshes might be intermixed, but that seems all right as well. Also update the code and comments in pg_dump_sort.c so that its idea of how things are sorted agrees with what actually happens due to the RestorePass mechanism. This is mostly cosmetic: it'll affect the order of objects in a dump's TOC, but not the actual restore order. But not changing that would be quite confusing to somebody reading the code. Back-patch to all supported branches. Fabrízio de Royes Mello, tweaked a bit by me Discussion: https://postgr.es/m/CAFcNs+ow1hmFox8P--3GSdtwz-S3Binb6ZmoP6Vk+Xg=K6eZNA@mail.gmail.com	2020-03-09 14:58:26 -04:00
Tom Lane	799d22461a	Assume that we have functional, 64-bit fseeko()/ftello(). Windows has this, and so do all other live platforms according to the buildfarm, so remove the configure probe and src/port/ substitution. Keep the probe that detects whether _LARGEFILE_SOURCE has to be defined to get that, though ... that seems to be still relevant in some places. This is part of a series of commits to get rid of no-longer-relevant configure checks and dead src/port/ code. I'm committing them separately to make it easier to back out individual changes if they prove less portable than I expect. Discussion: https://postgr.es/m/15379.1582221614@sss.pgh.pa.us	2020-02-21 14:30:47 -05:00
Alvaro Herrera	3974c4a724	Remove useless "return;" lines Discussion: https://postgr.es/m/20191128144653.GA27883@alvherre.pgsql	2019-11-28 16:48:37 -03:00
Amit Kapila	dddf4cdc33	Make the order of the header file includes consistent in non-backend modules. Similar to commit `7e735035f2`, this commit makes the order of header file inclusion consistent for non-backend modules. In passing, fix the case where we were using angle brackets (<>) for the local module includes instead of quotes (""). Author: Vignesh C Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com	2019-10-25 07:41:52 +05:30
Michael Paquier	66bde49d96	Fix inconsistencies and typos in the tree, take 10 This addresses some issues with unnecessary code comments, fixes various typos in docs and comments, and removes some orphaned structures and definitions. Author: Alexander Lakhin Discussion: https://postgr.es/m/9aabc775-5494-b372-8bcb-4dfc0bd37c68@gmail.com	2019-08-13 13:53:41 +09:00
David Rowley	8abc13a889	Use appendStringInfoString and appendPQExpBufferStr where possible This changes various places where appendPQExpBuffer was used in places where it was possible to use appendPQExpBufferStr, and likewise for appendStringInfo and appendStringInfoString. This is really just a stylistic improvement, but there are also small performance gains to be had from doing this. Discussion: http://postgr.es/m/CAKJS1f9P=M-3ULmPvr8iCno8yvfDViHibJjpriHU8+SXUgeZ=w@mail.gmail.com	2019-07-04 13:01:13 +12:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Tom Lane	be76af171c	Initial pgindent run for v12. This is still using the 2.0 version of pg_bsd_indent. I thought it would be good to commit this separately, so as to document the differences between 2.0 and 2.1 behavior. Discussion: https://postgr.es/m/16296.1558103386@sss.pgh.pa.us	2019-05-22 12:55:34 -04:00
Tom Lane	fc9a62af3f	Move logging.h and logging.c from src/fe_utils/ to src/common/. The original placement of this module in src/fe_utils/ is ill-considered, because several src/common/ modules have dependencies on it, meaning that libpgcommon and libpgfeutils now have mutual dependencies. That makes it pointless to have distinct libraries at all. The intended design is that libpgcommon is lower-level than libpgfeutils, so only dependencies from the latter to the former are acceptable. We already have the precedent that fe_memutils and a couple of other modules in src/common/ are frontend-only, so it's not stretching anything out of whack to treat logging.c as a frontend-only module in src/common/. To the extent that such modules help provide a common frontend/backend environment for the rest of common/ to use, it's a reasonable design. (logging.c does not yet provide an ereport() emulation, but one can dream.) Hence, move these files over, and revert basically all of the build-system changes made by commit `cc8d41511`. There are no places that need to grow new dependencies on libpgcommon, further reinforcing the idea that this is the right solution. Discussion: https://postgr.es/m/a912ffff-f6e4-778a-c86a-cf5c47a12933@2ndquadrant.com	2019-05-14 14:20:10 -04:00
Alvaro Herrera	7fcdb5e002	pg_dump: store unused attribs as NULL instead of '\0' Commit `f831d4accd` changed pg_dump to emit (and pg_restore to understand) NULLs for unused members in ArchiveEntry structs, as a side effect of some code beautification. That broke pg_restore of dumps generated with older pg_dump, however, so it was reverted in `19455c9f56`. Since the archiver version number has been bumped in `3b925e905d`, we can put it back. Author: Dmitry Dolgov Discussion: https://postgr.es/m/CA+q6zcXx0XHqLsFJLaUU2j5BDiBAHig=YRoBC_YVq7VJGvzBEA@mail.gmail.com	2019-04-26 12:05:17 -04:00
Alvaro Herrera	413ccaa74d	pg_restore: Require "-f -" to mean stdout The previous convention that stdout was selected by default when nothing is specified was just too error-prone. After a suggestion from Andrew Gierth. Author: Euler Taveira Reviewed-by: Yoshikazu Imai, José Arthur Benetasso Villanova Discussion: https://postgr.es/m/87sgwrmhdv.fsf@news-spur.riddles.org.uk	2019-04-04 16:54:15 -03:00
Peter Eisentraut	cc8d415117	Unified logging system for command-line programs This unifies the various ad hoc logging (message printing, error printing) systems used throughout the command-line programs. Features: - Program name is automatically prefixed. - Message string does not end with newline. This removes a common source of inconsistencies and omissions. - Additionally, a final newline is automatically stripped, simplifying use of PQerrorMessage() etc., another common source of mistakes. - I converted error message strings to use %m where possible. - As a result of the above several points, more translatable message strings can be shared between different components and between frontends and backend, without gratuitous punctuation or whitespace differences. - There is support for setting a "log level". This is not meant to be user-facing, but can be used internally to implement debug or verbose modes. - Lazy argument evaluation, so no significant overhead if logging at some level is disabled. - Some color in the messages, similar to gcc and clang. Set PG_COLOR=auto to try it out. Some colors are predefined, but can be customized by setting PG_COLORS. - Common files (common/, fe_utils/, etc.) can handle logging much more simply by just using one API without worrying too much about the context of the calling program, requiring callbacks, or having to pass "progname" around everywhere. - Some programs called setvbuf() to make sure that stderr is unbuffered, even on Windows. But not all programs did that. This is now done centrally. Soft goals: - Reduces vertical space use and visual complexity of error reporting in the source code. - Encourages more deliberate classification of messages. For example, in some cases it wasn't clear without analyzing the surrounding code whether a message was meant as an error or just an info. - Concepts and terms are vaguely aligned with popular logging frameworks such as log4j and Python logging. This is all just about printing stuff out. Nothing affects program flow (e.g., fatal exits). The uses are just too varied to do that. Some existing code had wrappers that do some kind of print-and-exit, and I adapted those. I tried to keep the output mostly the same, but there is a lot of historical baggage to unwind and special cases to consider, and I might not always have succeeded. One significant change is that pg_rewind used to write all error messages to stdout. That is now changed to stderr. Reviewed-by: Donald Dong <xdong@csumb.edu> Reviewed-by: Arthur Zakirov <a.zakirov@postgrespro.ru> Discussion: https://www.postgresql.org/message-id/flat/6a609b43-4f57-7348-6480-bd022f924310@2ndquadrant.com	2019-04-01 20:01:35 +02:00
Michael Paquier	1983af8e89	Switch some palloc/memset calls to palloc0 Some code paths have been doing some allocations followed by an immediate memset() to initialize the allocated area with zeros, this is a bit overkill as there are already interfaces to do both things in one call. Author: Daniel Gustafsson Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/vN0OodBPkKs7g2Z1uyk3CUEmhdtspHgYCImhlmSxv1Xn6nY1ZnaaGHL8EWUIQ-NEv36tyc4G5-uA3UXUF2l4sFXtK_EQgLN1hcgunlFVKhA=@yesql.se	2019-03-27 12:02:50 +09:00
Tom Lane	4870dce37f	Ensure xmloption = content while restoring pg_dump output. In combination with the previous commit, this ensures that valid XML data can always be dumped and reloaded, whether it is "document" or "content". Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com	2019-03-23 16:51:37 -04:00
Andres Freund	3b925e905d	tableam: Add pg_dump support. This adds pg_dump support for table AMs in a similar manner to how tablespaces are handled. That is, instead of specifying the AM for every CREATE TABLE etc, emit SET default_table_access_method statements. That makes it easier to change the AM for all/most tables in a dump, and allows restore to succeed even if some AM is not available. This increases the dump archive version, as a tables/matview's AM needs to be tracked therein. Author: Dimitri Dolgov, Andres Freund Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de	2019-03-06 09:54:38 -08:00
Alvaro Herrera	19455c9f56	pg_dump: Fix ArchiveEntry handling of some empty values Commit `f831d4acc` changed what pg_dump emits for some empty fields: they were output as empty strings before, NULL pointer afterwards. That makes old pg_restore unable to work (crash) with such files, which is unacceptable. Return to the original representation by explicitly setting those struct members to "" where needed; remove some no longer needed checks for NULL input. We can declutter the code a little by returning to NULLs when we next update the archive version, so add a note to remind us later. Discussion: https://postgr.es/m/20190225074539.az6j3u464cvsoxh6@depesz.com Reported-by: hubert depesz lubaczewski Author: Dmitry Dolgov	2019-02-28 17:19:44 -03:00
Tom Lane	4dbe196907	Repair unsafe/unportable snprintf usage in pg_restore. warn_or_exit_horribly() was blithely passing a potentially-NULL string pointer to a %s format specifier. That works (at least to the extent of not crashing) on some platforms, but not all, and since we switched to our own snprintf.c it doesn't work for us anywhere. Of the three string fields being handled this way here, I think that only "owner" is supposed to be nullable ... but considering that this is error-reporting code, it has very little business assuming anything, so put in defenses for all three. Per a crash observed on buildfarm member crake and then reproduced here. Because of the portability aspect, back-patch to all supported versions.	2019-02-09 19:45:38 -05:00
Alvaro Herrera	f831d4accd	Add ArchiveOpts to pass options to ArchiveEntry The ArchiveEntry function has a number of arguments that can be considered optional. Split them out into a separate struct, to make the API more flexible for changes. Author: Dmitry Dolgov Discussion: https://postgr.es/m/CA+q6zcXRxPE+qp6oerQWJ3zS061WPOhdxeMrdc-Yf-2V5vsrEw@mail.gmail.com	2019-02-01 11:29:42 -03:00
Peter Eisentraut	5c82830797	pg_dump: Add missing newline to error message	2018-12-27 10:03:28 +01:00
Andres Freund	578b229718	Remove WITH OIDS support, change oid catalog column visibility. Previously tables declared WITH OIDS, including a significant fraction of the catalog tables, stored the oid column not as a normal column, but as part of the tuple header. This special column was not shown by default, which was somewhat odd, as it's often (consider e.g. pg_class.oid) one of the more important parts of a row. Neither pg_dump nor COPY included the contents of the oid column by default. The fact that the oid column was not an ordinary column necessitated a significant amount of special case code to support oid columns. That already was painful for the existing, but upcoming work aiming to make table storage pluggable, would have required expanding and duplicating that "specialness" significantly. WITH OIDS has been deprecated since 2005 (commit ff02d0a05280e0). Remove it. Removing includes: - CREATE TABLE and ALTER TABLE syntax for declaring the table to be WITH OIDS has been removed (WITH (oids[ = true]) will error out) - pg_dump does not support dumping tables declared WITH OIDS and will issue a warning when dumping one (and ignore the oid column). - restoring an pg_dump archive with pg_restore will warn when restoring a table with oid contents (and ignore the oid column) - COPY will refuse to load binary dump that includes oids. - pg_upgrade will error out when encountering tables declared WITH OIDS, they have to be altered to remove the oid column first. - Functionality to access the oid of the last inserted row (like plpgsql's RESULT_OID, spi's SPI_lastoid, ...) has been removed. The syntax for declaring a table WITHOUT OIDS (or WITH (oids = false) for CREATE TABLE) is still supported. While that requires a bit of support code, it seems unnecessary to break applications / dumps that do not use oids, and are explicit about not using them. The biggest user of WITH OID columns was postgres' catalog. This commit changes all 'magic' oid columns to be columns that are normally declared and stored. To reduce unnecessary query breakage all the newly added columns are still named 'oid', even if a table's column naming scheme would indicate 'reloid' or such. This obviously requires adapting a lot code, mostly replacing oid access via HeapTupleGetOid() with access to the underlying Form_pg_->oid column. The bootstrap process now assigns oids for all oid columns in genbki.pl that do not have an explicit value (starting at the largest oid previously used), only oids assigned later by oids will be above FirstBootstrapObjectId. As the oid column now is a normal column the special bootstrap syntax for oids has been removed. Oids are not automatically assigned during insertion anymore, all backend code explicitly assigns oids with GetNewOidWithIndex(). For the rare case that insertions into the catalog via SQL are called for the new pg_nextoid() function can be used (which only works on catalog tables). The fact that oid columns on system tables are now normal columns means that they will be included in the set of columns expanded by (i.e. SELECT * FROM pg_class will now include the table's oid, previously it did not). It'd not technically be hard to hide oid column by default, but that'd mean confusing behavior would either have to be carried forward forever, or it'd cause breakage down the line. While it's not unlikely that further adjustments are needed, the scope/invasiveness of the patch makes it worthwhile to get merge this now. It's painful to maintain externally, too complicated to commit after the code code freeze, and a dependency of a number of other patches. Catversion bump, for obvious reasons. Author: Andres Freund, with contributions by John Naylor Discussion: https://postgr.es/m/20180930034810.ywp2c7awz7opzcfr@alap3.anarazel.de	2018-11-20 16:00:17 -08:00
Michael Paquier	c34bca9ea5	Consolidate cross-option checks in pg_restore This moves one check for conflicting options from the archive restore code to the main function where other similar checks are performed. Also reword the error message to be consistent with other messages. The only option combination impacted is --create specified with --single-transaction, and informing the caller at an early step saves from opening the archive worked on. A TAP test is added for this combination. Author: Daniel Gustafsson Reviewed-by: Fabien Coelho Discussion: https://postgr.es/m/616808BD-4B59-4E6C-97A9-7317F62D5570@yesql.se	2018-10-30 11:38:35 +09:00
Tom Lane	d6c55de1f9	Implement %m in src/port/snprintf.c, and teach elog.c to rely on that. I started out with the idea that we needed to detect use of %m format specs in contexts other than elog/ereport calls, because we couldn't rely on that working in *printf calls. But a better answer is to fix things so that it does work. Now that we're using snprintf.c all the time, we can implement %m in that and we've fixed the problem. This requires also adjusting our various printf-wrapping functions so that they ensure "errno" is preserved when they call snprintf.c. Remove elog.c's handmade implementation of %m, and let it rely on snprintf to support the feature. That should provide some performance gain, though I've not attempted to measure it. There are a lot of places where we could now simplify 'printf("%s", strerror(errno))' into 'printf("%m")', but I'm not in any big hurry to make that happen. Patch by me, reviewed by Michael Paquier Discussion: https://postgr.es/m/2975.1526862605@sss.pgh.pa.us	2018-09-26 13:31:56 -04:00
Michael Paquier	08c9917e24	Ignore publication tables when --no-publications is used `96e1cb4` has added support for --no-publications in pg_dump, pg_dumpall and pg_restore, but forgot the fact that publication tables also need to be ignored when this option is used. Author: Gilles Darold Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/3f48e812-b0fa-388e-2043-9a176bdee27e@dalibo.com Backpatch-through: 10, where publications have been added.	2018-09-25 11:03:56 +09:00
Tom Lane	789ba5029a	Remove dead code from pop_next_work_item(). The pref_non_data heuristic has been dead code for nearly ten years, and as far as I can tell was dead code even when it was first committed. I'm tired of silencing Coverity complaints about it, so get rid of it. If anyone is ever interested in pursuing the concept, they can get the code out of our git history.	2018-09-17 12:43:07 -04:00
Tom Lane	548e50976c	Improve parallel scheduling logic in pg_dump/pg_restore. Previously, the way this worked was that a parallel pg_dump would re-order the TABLE_DATA items in the dump's TOC into decreasing size order, and separately re-order (some of) the INDEX items into decreasing size order. Then pg_dump would dump the items in that order. Later, parallel pg_restore just followed the TOC order. This method had lots of deficiencies: * TOC ordering randomly differed between parallel and non-parallel dumps, and was hard to predict in the former case, causing problems for building stable pg_dump test cases. * Parallel restore only followed a well-chosen order if the dump had been done in parallel; in particular, this never happened for restore from custom-format dumps. * The best order for restore isn't necessarily the same as for dump, and it's not really static either because of locking considerations. * TABLE_DATA and INDEX items aren't the only things that might take a lot of work during restore. Scheduling was particularly stupid for the BLOBS item, which might require lots of work during dump as well as restore, but was left to the end in either case. This patch removes the logic that changed the TOC order, fixing the test instability problem. Instead, we sort the parallelizable items just before processing them during a parallel dump. Independently of that, parallel restore prioritizes the ready-to-execute tasks based on the size of the underlying table. In the case of dependent tasks such as index, constraint, or foreign key creation, the largest relevant table is used as the metric for estimating the task length. (This is pretty crude, but it should be enough to avoid the case we want to avoid, which is ending the run with just a few large tasks such that we can't make use of all N workers.) Patch by me, responding to a complaint from Peter Eisentraut, who also reviewed the patch. Discussion: https://postgr.es/m/5137fe12-d0a2-4971-61b6-eb4e7e8875f8@2ndquadrant.com	2018-09-14 17:31:51 -04:00
Tom Lane	e0a0cc28d0	Make pg_restore's identify_locking_dependencies() more bulletproof. This function had a blacklist of dump object types that it believed needed exclusive lock ... but we hadn't maintained that, so that it was missing ROW SECURITY, POLICY, and INDEX ATTACH items, all of which need (or should be treated as needing) exclusive lock. Since the same oversight seems likely in future, let's reverse the sense of the test so that the code has a whitelist of safe object types; better to wrongly assume a command can't be run in parallel than the opposite. Currently the only POST_DATA object type that's safe is CREATE INDEX ... and that list hasn't changed in a long time. Back-patch to 9.5 where RLS came in. Discussion: https://postgr.es/m/11450.1535483506@sss.pgh.pa.us	2018-08-28 19:46:59 -04:00
Tom Lane	6771c932cf	Ensure schema qualification in pg_restore DISABLE/ENABLE TRIGGER commands. Previously, this code blindly followed the common coding pattern of passing PQserverVersion(AH->connection) as the server-version parameter of fmtQualifiedId. That works as long as we have a connection; but in pg_restore with text output, we don't. Instead we got a zero from PQserverVersion, which fmtQualifiedId interpreted as "server is too old to have schemas", and so the name went unqualified. That still accidentally managed to work in many cases, which is probably why this ancient bug went undetected for so long. It only became obvious in the wake of the changes to force dump/restore to execute with restricted search_path. In HEAD/v11, let's deal with this by ripping out fmtQualifiedId's server- version behavioral dependency, and just making it schema-qualify all the time. We no longer support pg_dump from servers old enough to need the ability to omit schema name, let alone restoring to them. (Also, the few callers outside pg_dump already didn't work with pre-schema servers.) In older branches, that's not an acceptable solution, so instead just tweak the DISABLE/ENABLE TRIGGER logic to ensure it will schema-qualify its output regardless of server version. Per bug #15338 from Oleg somebody. Back-patch to all supported branches. Discussion: https://postgr.es/m/153452458706.1316.5328079417086507743@wrigleys.postgresql.org	2018-08-17 17:12:33 -04:00
Peter Eisentraut	3a4b891964	Fix more format truncation issues Fix the warnings created by the compiler warning options -Wformat-overflow=2 -Wformat-truncation=2, supported since GCC 7. This is a more aggressive variant of the fixes in `6275f5d28a`, which GCC 7 warned about by default. The issues are all harmless, but some dubious coding patterns are cleaned up. One issue that is of external interest is that BGW_MAXLEN is increased from 64 to 96. Apparently, the old value would cause the bgw_name of logical replication workers to be truncated in some circumstances. But this doesn't actually add those warning options. It appears that the warnings depend a bit on compilation and optimization options, so it would be annoying to have to keep up with that. This is more of a once-in-a-while cleanup. Reviewed-by: Michael Paquier <michael@paquier.xyz>	2018-03-15 11:41:42 -04:00
Tom Lane	3d2aed664e	Avoid using unsafe search_path settings during dump and restore. Historically, pg_dump has "set search_path = foo, pg_catalog" when dumping an object in schema "foo", and has also caused that setting to be used while restoring the object. This is problematic because functions and operators in schema "foo" could capture references meant to refer to pg_catalog entries, both in the queries issued by pg_dump and those issued during the subsequent restore run. That could result in dump/restore misbehavior, or in privilege escalation if a nefarious user installs trojan-horse functions or operators. This patch changes pg_dump so that it does not change the search_path dynamically. The emitted restore script sets the search_path to what was used at dump time, and then leaves it alone thereafter. Created objects are placed in the correct schema, regardless of the active search_path, by dint of schema-qualifying their names in the CREATE commands, as well as in subsequent ALTER and ALTER-like commands. Since this change requires a change in the behavior of pg_restore when processing an archive file made according to this new convention, bump the archive file version number; old versions of pg_restore will therefore refuse to process files made with new versions of pg_dump. Security: CVE-2018-1058	2018-02-26 10:18:21 -05:00
Tom Lane	5c9f2564fa	Fix assorted errors in pg_dump's handling of extended statistics objects. pg_dump supposed that a stats object necessarily shares the same schema as its underlying table, and that it doesn't have a separate owner. These things may have been true during early development of the feature, but they are not true as of v10 release. Failure to track the object's schema separately turns out to have only limited consequences, because pg_get_statisticsobjdef() always schema- qualifies the target object name in the generated CREATE STATISTICS command (a decision out of step with the rest of ruleutils.c, but I digress). Therefore the restored object would be in the right schema, so that the only problem is that the TOC entry would be mislabeled as to schema. That could lead to wrong decisions for schema-selective restores, for example. The ownership issue is a bit more serious: not only was the TOC entry potentially mislabeled as to owner, but pg_dump didn't bother to issue an ALTER OWNER command at all, so that after restore the stats object would continue to be owned by the restoring superuser. A final point is that decisions as to whether to dump a stats object or not were driven by whether the underlying table was dumped or not. While that's not wrong on its face, it won't scale nicely to the planned future extension to cross-table statistics. Moreover, that design decision comes out of the view of stats objects as being auxiliary to a particular table, like a rule or trigger, which is exactly where the above problems came from. Since we're now treating stats objects more like independent objects in their own right, they ought to behave like standalone objects for this purpose too. So change to using the generic selectDumpableObject() logic for them (which presently amounts to "dump if containing schema is to be dumped"). Along the way to fixing this, restructure so that getExtendedStatistics collects the identity info (only) for all extended stats objects in one query, and then for each object actually being dumped, we retrieve the definition in dumpStatisticsExt. This is necessary to ensure that schema-qualification in the generated CREATE STATISTICS command happens with respect to the search path that pg_dump will now be using at restore time (ie, the schema the stats object is in, not that of the underlying table). It's probably also significantly faster in the typical scenario where only a minority of tables have extended stats. Back-patch to v10 where extended stats were introduced. Discussion: https://postgr.es/m/18272.1518328606@sss.pgh.pa.us	2018-02-11 13:24:15 -05:00
Tom Lane	1368e92e16	Support --no-comments in pg_dump, pg_dumpall, pg_restore. We have switches already to suppress other subsidiary object properties, such as ACLs, security labels, ownership, and tablespaces, so just on the grounds of symmetry we should allow suppressing comments as well. Also, commit `0d4e6ed30` added a positive reason to have this feature, i.e. to allow obtaining the old behavior of selective pg_restore should anyone desire that. Recent commits have removed the cases where pg_dump emitted comments on built-in objects that the restoring user might not have privileges to comment on, so the original primary motivation for this feature is gone, but it still seems at least somewhat useful in its own right. Robins Tharakan, reviewed by Fabrízio Mello Discussion: https://postgr.es/m/CAEP4nAx22Z4ch74oJGzr5RyyjcyUSbpiFLyeYXX8pehfou92ug@mail.gmail.com	2018-01-25 15:27:24 -05:00
Tom Lane	0d4e6ed308	Clean up some aspects of pg_dump/pg_restore item-selection logic. Ensure that CREATE DATABASE and related commands are issued when, and only when, --create is specified. Previously there were scenarios where using selective-dump switches would prevent --create from having any effect. For example, it would fail to do anything in pg_restore if the archive file had been made by a selective dump, because there would be no TOC entry for the database. Since we don't issue \connect either if we don't issue CREATE DATABASE, this could result in unexpectedly restoring objects into the wrong database. Also fix pg_restore's selective restore logic so that when an object is selected to be restored, we also restore its ACL, comment, and security label if any. Previously there was no way to get the latter properties except through tedious mucking about with a -L file. If, for some reason, you don't want these properties, you can match the old behavior by adding --no-acl etc. While at it, try to make _tocEntryRequired() a little better organized and better documented. Discussion: https://postgr.es/m/32668.1516848577@sss.pgh.pa.us	2018-01-25 14:26:15 -05:00
Tom Lane	5955d93419	Improve pg_dump's handling of "special" built-in objects. We had some pretty ad-hoc handling of the public schema and the plpgsql extension, which are both presumed to exist in template0 but might be modified or deleted in the database being dumped. Up to now, by default pg_dump would emit a CREATE EXTENSION IF NOT EXISTS command as well as a COMMENT command for plpgsql. The usefulness of the former is questionable, and the latter caused annoying errors in non-superuser dump/restore scenarios. Let's instead install a rule that built-in extensions (identified by having low-numbered OIDs) are not to be dumped. We were doing it that way already in binary-upgrade mode, so this just makes regular mode behave the same. It remains true that if someone has installed a non-default ACL on the plpgsql language, that will get dumped thanks to the pg_init_privs mechanism. This is more consistent with the handling of built-in objects of other kinds. Also, change the very ad-hoc mechanism that was used to avoid dumping creation and comment commands for the public schema. Instead of hardwiring a test in _printTocEntry(), make use of the DUMP_COMPONENT_ infrastructure to mark that schema up-front about what we want to do with it. This has the visible effect that the public schema won't be mentioned in the output at all, except for updating its ACL if it has a non-default ACL. Previously, while it was normally not mentioned, --clean mode would drop and recreate it, again causing headaches for non-superuser usage. This change likewise makes the public schema less special and more like other built-in objects. If plpgsql, or the public schema, has been removed entirely in the source DB, that situation won't be reproduced in the destination ... but that was true before. Discussion: https://postgr.es/m/29048.1516812451@sss.pgh.pa.us	2018-01-25 13:54:42 -05:00
Tom Lane	160a4f62ee	In pg_dump, force reconnection after issuing ALTER DATABASE SET command(s). The folly of not doing this was exposed by the buildfarm: in some cases, the GUC settings applied through ALTER DATABASE SET may be essential to interpreting the reloaded data correctly. Another argument why we can't really get away with the scheme proposed in commit `b3f840120` is that it cannot work for parallel restore: even if the parent process manages to hang onto the previous GUC state, worker processes would see the state post-ALTER-DATABASE. (Perhaps we could have dodged that bullet by delaying DATABASE PROPERTIES restoration to the end of the run, but that does nothing for the data semantics problem.) This leaves us with no solution for the default_transaction_read_only issue that commit `4bd371f6f` intended to work around, other than "you gotta remove such settings before dumping/upgrading". However, in view of the fact that parallel restore broke that hack years ago and no one has noticed, it's fair to question how many people care. I'm unexcited about adding a large dollop of new complexity to handle that corner case. This would be a one-liner fix, except it turns out that ReconnectToServer tries to optimize away "redundant" reconnections. While that may have been valuable when coded, a quick survey of current callers shows that there are no cases where that's actually useful, so just remove that check. While at it, remove the function's useless return value. Discussion: https://postgr.es/m/12453.1516655001@sss.pgh.pa.us	2018-01-23 10:55:16 -05:00
Tom Lane	b3f8401205	Move handling of database properties from pg_dumpall into pg_dump. This patch rearranges the division of labor between pg_dump and pg_dumpall so that pg_dump itself handles all properties attached to a single database. Notably, a database's ACL (GRANT/REVOKE status) and local GUC settings established by ALTER DATABASE SET and ALTER ROLE IN DATABASE SET can be dumped and restored by pg_dump. This is a long-requested improvement. "pg_dumpall -g" will now produce only role- and tablespace-related output, nothing about individual databases. The total output of a regular pg_dumpall run remains the same. pg_dump (or pg_restore) will restore database-level properties only when creating the target database with --create. This applies not only to ACLs and GUCs but to the other database properties it already handled, that is database comments and security labels. This is more consistent and useful, but does represent an incompatibility in the behavior seen without --create. (This change makes the proposed patch to have pg_dump use "COMMENT ON DATABASE CURRENT_DATABASE" unnecessary, since there is no case where the command is issued that we won't know the true name of the database. We might still want that patch as a feature in its own right, but pg_dump no longer needs it.) pg_dumpall with --clean will now drop and recreate the "postgres" and "template1" databases in the target cluster, allowing their locale and encoding settings to be changed if necessary, and providing a cleaner way to set nondefault tablespaces for them than we had before. This means that such a script must now always be started in the "postgres" database; the order of drops and reconnects will not work otherwise. Without --clean, the script will not adjust any database-level properties of those two databases (including their comments, ACLs, and security labels, which it formerly would try to set). Another minor incompatibility is that the CREATE DATABASE commands in a pg_dumpall script will now always specify locale and encoding settings. Formerly those would be omitted if they matched the cluster's default. While that behavior had some usefulness in some migration scenarios, it also posed a significant hazard of unwanted locale/encoding changes. To migrate to another locale/encoding, it's now necessary to use pg_dump without --create to restore into a database with the desired settings. Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off" is gone: we now dodge that problem by the expedient of not issuing ALTER DATABASE SET commands until after reconnecting to the target database. Therefore, such settings won't apply during the restore session. In passing, improve some shaky grammar in the docs, and add a note pointing out that pg_dumpall's output can't be expected to load without any errors. (Someday we might want to fix that, but this is not that patch.) Haribabu Kommi, reviewed at various times by Andreas Karlsson, Vaishnavi Prabakaran, and Robert Haas; further hacking by me. Discussion: https://postgr.es/m/CAJrrPGcUurV0eWTeXODwsOYFN=Ekq36t1s0YnFYUNzsmRfdAyA@mail.gmail.com	2018-01-22 14:09:09 -05:00
Tom Lane	2b792ab094	Make pg_dump's ACL, sec label, and comment entries reliably identifiable. _tocEntryRequired() expects that it can identify ACL, SECURITY LABEL, and COMMENT TOC entries that are for large objects by seeing whether the tag for them starts with "LARGE OBJECT ". While that works fine for actual large objects, which are indeed tagged that way, it's subject to false positives unless every such entry's tag starts with an appropriate type ID. And in fact it does not work for ACLs, because up to now we customarily tagged those entries with just the bare name of the object. This means that an ACL for an object named "LARGE OBJECT something" would be misclassified as data not schema, with undesirable results in a schema-only or data-only dump --- although pg_upgrade seems unaffected, due to the special case for binary-upgrade mode further down in _tocEntryRequired(). We can fix this by changing all the dumpACL calls to use the label strings already in use for comments and security labels, which do follow the convention of starting with an object type indicator. Well, mostly they follow it. dumpDatabase() got it wrong, using just the bare database name for those purposes, so that a database named "LARGE OBJECT something" would similarly be subject to having its comment or security label dropped or included when not wanted. Bring that into line too. (Note that up to now, database ACLs have not been processed by pg_dump, so that this issue doesn't affect them.) _tocEntryRequired() itself is not free of fault: it was overly liberal about matching object tags to "LARGE OBJECT " in binary-upgrade mode. This looks like it is probably harmless because there would be no data component to strip anyway in that mode, but at best it's trouble waiting to happen, so tighten that up too. The possible misclassification of SECURITY LABEL entries for databases is in principle a security problem, but the opportunities for actual exploits seem too narrow to be interesting. The other cases seem like just bugs, since an object owner can change its ACL or comment for himself, he needn't try to trick someone else into doing it by choosing a strange name. This has been broken since per-large-object TOC entries were introduced in 9.0, so back-patch to all supported branches. Discussion: https://postgr.es/m/21714.1516553459@sss.pgh.pa.us	2018-01-22 12:06:18 -05:00
Peter Eisentraut	e4128ee767	SQL procedures This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations. This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution. While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type. Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>	2017-11-30 11:03:20 -05:00
Peter Eisentraut	1356f78ea9	Reduce excessive dereferencing of function pointers It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read. Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com	2017-09-07 13:56:09 -04:00
Tom Lane	b1c2d76a2f	Fix possible core dump in parallel restore when using a TOC list. Commit `3eb9a5e7c` unintentionally introduced an ordering dependency into restore_toc_entries_prefork(). The existing coding of reduce_dependencies() contains a check to skip moving a TOC entry to the ready_list if it wasn't initially in the pending_list. This used to suffice to prevent reduce_dependencies() from trying to move anything into the ready_list during restore_toc_entries_prefork(), because the pending_list stayed empty throughout that phase; but it no longer does. The problem doesn't manifest unless the TOC has been reordered by SortTocFromFile, which is how I missed it in testing. To fix, just add a test for ready_list == NULL, converting the call with NULL from a poor man's sanity check into an explicit command not to touch TOC items' list membership. Clarify some of the comments around this; in particular, note the primary purpose of the check for pending_list membership, which is to ensure that we can't try to restore the same item twice, in case a TOC list forces it to be restored before its dependency count goes to zero. Per report from Fabrízio de Royes Mello. Back-patch to 9.3, like the previous commit. Discussion: https://postgr.es/m/CAFcNs+pjuv0JL_x4+=71TPUPjdLHOXA4YfT32myj_OrrZb4ohA@mail.gmail.com	2017-08-19 13:39:51 -04:00
Tom Lane	3eb9a5e7c4	Fix pg_dump/pg_restore to emit REFRESH MATERIALIZED VIEW commands last. Because we push all ACL (i.e. GRANT/REVOKE) restore steps to the end, materialized view refreshes were occurring while the permissions on referenced objects were still at defaults. This led to failures if, say, an MV owned by user A reads from a table owned by user B, even if B had granted the necessary privileges to A. We've had multiple complaints about that type of restore failure, most recently from Jordan Gigov. The ideal fix for this would be to start treating ACLs as dependency- sortable objects, rather than hard-wiring anything about their dump order (the existing approach is a messy kluge dating to commit `dc0e76ca3`). But that's going to be a rather major change, and it certainly wouldn't lead to a back-patchable fix. As a short-term solution, convert the existing two-pass hack (ie, normal objects then ACLs) to a three-pass hack, ie, normal objects then ACLs then matview refreshes. Because this happens in RestoreArchive(), it will also fix the problem when restoring from an existing archive-format dump. (Note this means that if a matview refresh would have failed under the permissions prevailing at dump time, it'll fail during restore as well. We'll define that as user error rather than something we should try to work around.) To avoid performance loss in parallel restore, we need the matview refreshes to still be parallelizable. Hence, clean things up enough so that both ACLs and matviews are handled by the parallel restore infrastructure, instead of reverting back to serial restore for ACLs. There is still a final serial step, but it shouldn't normally have to do anything; it's only there to try to recover if we get stuck due to some problem like unresolved circular dependencies. Patch by me, but it owes something to an earlier attempt by Kevin Grittner. Back-patch to 9.3 where materialized views were introduced. Discussion: https://postgr.es/m/28572.1500912583@sss.pgh.pa.us	2017-08-03 17:36:39 -04:00
Tom Lane	93f039b494	Fix pg_dump's handling of event triggers. pg_dump with the --clean option failed to emit DROP EVENT TRIGGER commands for event triggers. In a closely related oversight, it also did not emit ALTER OWNER commands for event triggers. Since only superusers can create event triggers, the latter oversight is of little practical consequence ... but if we're going to record an owner for event triggers, then surely pg_dump should preserve it. Per complaint from Greg Atkins. Back-patch to 9.3 where event triggers were introduced. Discussion: https://postgr.es/m/20170722191142.yi4e7tzcg3iacclg@gmail.com	2017-07-22 20:20:09 -04:00
Heikki Linnakangas	8046465c2e	Fix pg_basebackup output to stdout on Windows. When writing a backup to stdout with pg_basebackup on Windows, put stdout to binary mode. Any CR bytes in the output will otherwise be output incorrectly as CR+LF. In the passing, standardize on using "_setmode" instead of "setmode", for the sake of consistency. They both do the same thing, but according to MSDN documentation, setmode is deprecated. Fixes bug #14634, reported by Henry Boehlert. Patch by Haribabu Kommi. Backpatch to all supported versions. Discussion: https://www.postgresql.org/message-id/20170428082818.24366.13134@wrigleys.postgresql.org	2017-07-14 16:02:53 +03:00
Tom Lane	382ceffdf7	Phase 3 of pgindent updates. Don't move parenthesized lines to the left, even if that means they flow past the right margin. By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren. This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects. Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us	2017-06-21 15:35:54 -04:00

1 2 3 4 5 ...

352 Commits