Previously, the core scanner's yy_transition[] array had 37045 elements.
Since that number is larger than INT16_MAX, Flex generated the array to
contain 32-bit integers. By reimplementing some of the bulkier scanner
rules, this patch reduces the array to 20495 elements. The much smaller
total length, combined with the consequent use of 16-bit integers for
the array elements reduces the binary size by over 200kB. This was
accomplished in two ways:
1. Consolidate handling of quote continuations into a new start condition,
rather than duplicating that logic for five different string types.
2. Treat Unicode strings and identifiers followed by a UESCAPE sequence
as three separate tokens, rather than one. The logic to de-escape
Unicode strings is moved to the filter code in parser.c, which already
had the ability to provide special processing for token sequences.
While we could have implemented the conversion in the grammar, that
approach was rejected for performance and maintainability reasons.
Performance in microbenchmarks of raw parsing seems equal or slightly
faster in most cases, and it's reasonable to expect that in real-world
usage (with more competition for the CPU cache) there will be a larger
win. The exception is UESCAPE sequences; lexing those is about 10%
slower, primarily because the scanner now has to be called three times
rather than one. This seems acceptable since that feature is very
rarely used.
The psql and epcg lexers are likewise modified, primarily because we
want to keep them all in sync. Since those lexers don't use the
space-hogging -CF option, the space savings is much less, but it's
still good for perhaps 10kB apiece.
While at it, merge the ecpg lexer's handling of C-style comments used
in SQL and in C. Those have different rules regarding nested comments,
but since we already have the ability to keep track of the previous
start condition, we can use that to handle both cases within a single
start condition. This matches the core scanner more closely.
John Naylor
Discussion: https://postgr.es/m/CACPNZCvaoa3EgVWm5yZhcSTX6RAtaLgniCPcBVOCwm8h3xpWkw@mail.gmail.com
This reverts commit bd7c95f0c1,
along with assorted follow-on fixes. There are some questions
about the definition and implementation of that statement, and
we don't have time to resolve them before v13 release. Rather
than ship the feature and then have backwards-compatibility
concerns constraining any redesign, let's remove it for now
and try again later.
Discussion: https://postgr.es/m/TY2PR01MB2443EC8286995378AEB7D9F8F5B10@TY2PR01MB2443.jpnprd01.prod.outlook.com
So far ECPG programs had to treat binary data for bytea column as 'char' type.
But this meant converting from/to escaped format with PQunescapeBytea/
PQescapeBytea() and therefore forcing users to add unnecessary code and cost
for the conversion in runtime. By adding a dedicated datatype for bytea most of
this special handling is no longer needed.
Author: Matsumura-san ("Matsumura, Ryo" <matsumura.ryo@jp.fujitsu.com>)
Discussion: https://postgr.es/m/flat/03040DFF97E6E54E88D3BFEE5F5480F737A141F9@G01JPEXMBYT04
DECLARE STATEMENT is a statement that lets users declare an identifier
pointing at a connection. This identifier will be used in other embedded
dynamic SQL statement such as PREPARE, EXECUTE, DECLARE CURSOR and so on.
When connecting to a non-default connection, the AT clause can be used in
a DECLARE STATEMENT once and is no longer needed in every dynamic SQL
statement. This makes ECPG applications easier and more efficient. Moreover,
writing code without designating connection explicitly improves portability.
Authors: Ideriha-san ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Kuroda-san ("Kuroda, Hayato" <kuroda.hayato@jp.fujitsu.com>)
Discussion: https://postgr.es/m4E72940DA2BF16479384A86D54D0988A565669DF@G01JPEXMBKW04
The function called can result in an out of memory error that subsequently was
disregarded. Instead it should set the appropriate SQL error variables and be
checked by whatever whenever statement is defined.
This adds a new object type "procedure" that is similar to a function
but does not have a return type and is invoked by the new CALL statement
instead of SELECT or similar. This implementation is aligned with the
SQL standard and compatible with or similar to other SQL implementations.
This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well
as ALTER/DROP ROUTINE that can refer to either a function or a
procedure (or an aggregate function, as an extension to SQL). There is
also support for procedures in various utility commands such as COMMENT
and GRANT, as well as support in pg_dump and psql. Support for defining
procedures is available in all the languages supplied by the core
distribution.
While this commit is mainly syntax sugar around existing functionality,
future features will rely on having procedures as a separate object
type.
Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
Clean up some technical debt left behind by commit 72b1e3a21: instead of
quickly hacking the name of base_yylex() with a #define, set it properly
with "%option prefix". This causes the names of pgc.l's other exported
symbols to change as well, so run around and modify the outside references
to them as needed. Similarly, make pgc.l's external references to
base_yylval use that variable's true name instead of a macro.
The reason for doing this now is that the quick-hack solution will fail
with future versions of flex, as reported by Дилян Палаузов.
Hence, back-patch into 9.6 where the previous commit appeared, since
it's likely people will build 9.6 with newer flex versions during
its lifetime.
Discussion: https://postgr.es/m/d845c1af-e18d-6651-178f-9f08cdf37e10@aegee.org
Now that we know about the %top{} trick, we can revert to building flex
lexers as separate .o files. This is worth doing for a couple of reasons
besides sheer cleanliness. We can narrow the scope of the -Wno-error flag
that's forced on scan.c. Also, since these grammar and lexer files are
so large, splitting them into separate build targets should have some
advantages in build speed, particularly in parallel or ccache'd builds.
We have quite a few other .l files that could be changed likewise, but the
above arguments don't apply to them, so the benefit of fixing them seems
pretty minimal. Leave the rest for some other day.
This provides a mechanism for specifying conversions between SQL data
types and procedural languages. As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.
reviews by Pavel Stěhule and Andres Freund
variables is varchar. This fixes this test case:
int main(void)
{
exec sql begin declare section;
varchar a[50], b[50];
exec sql end declare section;
return 0;
}
Since varchars are internally turned into custom structs and
the type name is emitted for these variable declarations,
the preprocessed code previously had:
struct varchar_1 { ... } a _,_ struct varchar_2 { ... } b ;
The comma in the generated C file was a syntax error.
There are no regression test changes since it's not exercised.
Patch by Boszormenyi Zoltan <zb@cybertec.at>
Use bool as type for booleans instead of int.
Do not implicitely cast size_t to int.
Make the compiler stop complaining about unused variables by adding an empty statement.
Informix allows variables as argument to the embedded SQL command FREE. Given
that we only allow freeing cursors via FREE for compatibility reasons only we
should do the same.
list, minus a few specific words that have to be treated specially. This
replaces a hard-wired list of keywords that would have needed manual
maintenance, and was not getting it. The 8.4 coding was already missing
these words, causing ecpg to incorrectly treat them as reserved words:
CALLED, CATALOG, DEFINER, ENUM, FOLLOWING, INVOKER, OPTIONS, PARTITION,
PRECEDING, RANGE, SECURITY, SERVER, UNBOUNDED, WRAPPER. In HEAD we were
additionally missing COMMENTS, FUNCTIONS, SEQUENCES, TABLES.
Per gripe from Bosco Rama.
it works just as well to have them be ordinary identifiers, and this gets rid
of a number of ugly special cases. Plus we aren't interfering with non-rule
usage of these names.
catversion bump because the names change internally in stored rules.
to create a function for it.
Procedural languages now have an additional entry point, namely a function
to execute an inline code block. This seemed a better design than trying
to hide the transient-ness of the code from the PL. As of this patch, only
plpgsql has an inline handler, but probably people will soon write handlers
for the other standard PLs.
In passing, remove the long-dead LANCOMPILER option of CREATE LANGUAGE.
Petr Jelinek