I've sent 3 mails to pgsql-patches. There are two files, one for doc
and for src/data directories, and one minor patch for doc/README.locale. Please apply. Oleg.
1999-08-16 20:27:19 +00:00 · 1999-08-16 20:27:19 +00:00 · 972124091d
parent c5d0a1bc42
commit 972124091d
3 changed files with 138 additions and 1 deletions
--- a/doc/README.Charsets
+++ b/doc/README.Charsets
@@ -0,0 +1,113 @@
+  
+  PostgreSQL Charsets README
+  Josef Balatka, <balatka@email.cz>
+  Draft v0.1, Tue Jul 20 15:49:07 CEST 1999
+  
+  This document is a brief overview of the national charset support
+  implemented in PostgreSQL ver. 6.5. It describes the relevant compilation
+  options and gives setup tips for particular use cases.
+  
+  ---------------------------------------------------------------------------
+  
+  Table of Contents
+  
+  1. Locale awareness
+  
+  2. Single-byte charsets recoding
+  
+  3. Multi-byte support/recoding
+  
+  4. Credits
+  
+  ---------------------------------------------------------------------------
+  
+  1. Locale awareness
+  
+     The PostgreSQL server supports both locale-aware and non-locale-aware
+     (default) operational modes. You select the mode during the
+     configuration stage of the installation with the --enable-locale option.
+  
+     If you don't use --enable-locale, the multi-language code will not be
+     compiled and PostgreSQL will behave as an ASCII-compliant application.
+     This mode is faster, but it is only suitable if you don't have to
+     handle nationally specific characters.
+
+     With --enable-locale you get a locale-aware server that uses the LC_*
+     environment variables to determine how to process national specifics.
+     In this case strcoll(3) and similar functions are used internally,
+     so speed is somewhat lower.
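
To illustrate the difference between the two modes, here is a minimal Python
sketch (not PostgreSQL code) that contrasts plain code-point ordering with
ordering through the C library's collation. The function names and the
locale_name parameter are assumptions made for this illustration; a national
locale must be installed on the host before the two orderings can differ.

```python
import locale


def sort_bytewise(words):
    """Order strings by code point, roughly what a non-locale-aware build does.

    For a single-byte charset, code-point order matches byte order.
    """
    return sorted(words)


def sort_locale_aware(words, locale_name="C"):
    """Order strings using the C library collation rules for locale_name.

    locale_name is an assumption of this sketch: it must name a locale that
    is installed on the host, otherwise locale.Error is raised.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm transforms a string so that an ordinary comparison of the
        # results gives the same order as strcoll(3) on the originals.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

With the "C" locale both functions return the same order; with a national
locale, accented letters are typically placed next to their base letters
instead of after all ASCII letters, at the cost of the extra transformation.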
+  
+     Note that --enable-locale is sufficient when all your clients
+     use the same single-byte encoding as the database server.
+  
+     When your clients use an encoding different from the server's, you
+     must additionally use either the --enable-recode or --with-mb=<encoding>
+     option on the server side, or a client that does the recoding itself
+     (e.g. there is a PostgreSQL ODBC driver for Win32 that can handle
+     various Cyrillic encodings). The --with-mb=<encoding> option is
+     necessary for multi-byte charset support.
+  
+  
+  2. Single-byte charsets recoding
+  
+     You can set up this feature with the --enable-recode option. This option
+     is described as 'enable Cyrillic recode support', which understates
+     its power: it can be used for *any* single-byte charset recoding.
+  
+     This method uses the charset.conf file located in the $PGDATA directory.
+     It's a typical configuration text file where spaces and newlines
+     separate items and records and # starts a comment. Three keywords
+     with the following syntax are recognized:
+  
+       BaseCharset	<server_charset>
+       RecodeTable	<from_charset>     <to_charset>    <file_name>
+       HostCharset	<host_spec>	   <host_charset>
+  
+     BaseCharset defines the encoding of the database server. Charset
+     names are only used for mapping inside charset.conf, so you can
+     freely choose typing-friendly names.
+     
+     RecodeTable records specify a translation table between server and client.
+     The file name is relative to the $PGDATA directory. The table file format
+     is very simple. There are no keywords, and each line holds one pair of
+     decimal or hexadecimal (0x-prefixed) character values:
+  
+       <char_value>  <translated_char_value>
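
The table format described above can be sketched in Python as follows. This is
an illustration only, not the server's parser: the function name is
hypothetical, and the sketch assumes that bytes not listed in the table map
to themselves and that # starts a comment. The sample data is the
isocz-wincz.tab table added by this commit.

```python
ISOCZ_WINCZ = """\
#
# Czech ISO-8859-2 -> WIN-1250 translation table
#
165 188
169 138
171 141
174 142
181 190
185 154
187 157
190 158
"""


def parse_value(token):
    """Parse one decimal or 0x-prefixed hexadecimal byte value."""
    base = 16 if token.lower().startswith("0x") else 10
    value = int(token, base)
    if not 0 <= value <= 255:
        raise ValueError(f"not a single-byte value: {token!r}")
    return value


def parse_recode_table(text):
    """Build a 256-entry translation table from recode table text."""
    table = bytearray(range(256))  # assumption: unlisted bytes stay unchanged
    for raw_line in text.splitlines():
        fields = raw_line.split("#", 1)[0].split()  # drop comments, tokenize
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"expected a pair of values: {raw_line!r}")
        source, target = (parse_value(token) for token in fields)
        table[source] = target
    return bytes(table)
```

A table built this way can be applied to a byte string with
bytes.translate(), for example data.translate(parse_recode_table(ISOCZ_WINCZ)).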
+  
+     HostCharset records map an IP address to a charset. You can use a single
+     IP address, an IP mask range starting from the given address, or an IP
+     interval (e.g. 127.0.0.1, 192.168.1.100/24, 192.168.1.20-192.168.1.40).
+  
+     The charset.conf file is always processed up to the end, so you can easily
+     specify exceptions to earlier rules. In src/data you will find an
+     example charset.conf and a few recoding tables.
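
The following Python sketch shows how such a file could be read and how the
"processed up to the end" rule lets later records override earlier ones. It is
not the server's implementation. The sample configuration, the charset names
and all function names are hypothetical, keywords are assumed to be
case-sensitive, and the /24 form is assumed to behave like a CIDR prefix,
which may differ from the server's exact mask semantics.

```python
import ipaddress

# Hypothetical example, not the file shipped in src/data.
SAMPLE_CONF = """\
BaseCharset  iso
RecodeTable  iso  win  isocz-wincz.tab
HostCharset  192.168.1.100/24           win
HostCharset  192.168.1.20-192.168.1.40  iso   # exception to the rule above
HostCharset  127.0.0.1                  iso
"""


def parse_charset_conf(text):
    """Return (base_charset, recode_tables, host_rules) parsed from text."""
    base = None
    tables = {}  # (from_charset, to_charset) -> table file name
    hosts = []   # (host_spec, charset) in file order
    for raw_line in text.splitlines():
        fields = raw_line.split("#", 1)[0].split()  # drop comments, tokenize
        if not fields:
            continue
        keyword, args = fields[0], fields[1:]
        if keyword == "BaseCharset" and len(args) == 1:
            base = args[0]
        elif keyword == "RecodeTable" and len(args) == 3:
            tables[(args[0], args[1])] = args[2]
        elif keyword == "HostCharset" and len(args) == 2:
            hosts.append((args[0], args[1]))
        else:
            raise ValueError(f"unrecognized record: {raw_line!r}")
    return base, tables, hosts


def host_matches(spec, address):
    """Check an address against a single IP, a masked range or an interval."""
    addr = ipaddress.ip_address(address)
    if "-" in spec:
        low, high = (ipaddress.ip_address(part) for part in spec.split("-", 1))
        return low <= addr <= high
    if "/" in spec:
        # Assumption: interpret the mask as a CIDR prefix length.
        return addr in ipaddress.ip_network(spec, strict=False)
    return addr == ipaddress.ip_address(spec)


def charset_for_host(hosts, address, default=None):
    """Scan every rule; the last matching record wins."""
    result = default
    for spec, charset in hosts:
        if host_matches(spec, address):
            result = charset
    return result
```

With the sample above, a client at 192.168.1.5 resolves to "win", while
192.168.1.30 matches both the masked range and the later interval record and
therefore resolves to "iso".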
+  
+     As this solution is based on mapping the client's IP address to a
+     charset, there are obviously some restrictions. You can't use different
+     encodings on the same host at the same time. It's also inconvenient when
+     your client hosts boot into more than one operating system.
+     Nevertheless, when these restrictions are not limiting and you don't
+     need multi-byte chars, it's a simple and effective solution.
+  
+  
+  3. Multi-byte support/recoding
+  
+     This is the new generation of charset encoding support in PostgreSQL,
+     designed as a more comprehensive solution supporting both single-byte
+     and multi-byte chars. You can set up this feature with the
+     --with-mb=<encoding> option.
+  
+     There is no IP mapping file, and recoding is controlled through new
+     SQL statements. Recoding tables are included in the code. Many national
+     charsets are already supported and more will follow.
+  
+     See doc/README.mb and doc/README.mb.jp for detailed instructions on how
+     to use the multibyte support. The file doc/README.locale contains
+     specific instructions on using the multibyte support with Cyrillic.
+  
+  
+  4. Credits
+  
+     I'd like to thank the PostgreSQL development team and all contributors
+     for creating PostgreSQL. Thanks to Oleg Bartunov, Oleg Broytmann and
+     Tatsuo Ishii for opening the door into the multi-language world.
+  
--- a/doc/README.locale
+++ b/doc/README.locale
@@ -1,5 +1,17 @@
 ===========
-14 Apr 1999
+1999 Jul 21
+===========
+
+   Josef Balatka, <balatka@email.cz> asked us to remove RECODE and sent me
+a Czech ISO-8859-2 -> WIN-1250 translation table.
+   RECODE is no longer just Cyrillic RECODE and will stay in PostgreSQL.
+
+   He also created some bits of documentation, mostly concerning RECODE -
+see README.Charsets.
+
+
+===========
+1999 Apr 14
 ===========

   Tatsuo Ishii <t-ishii@sra.co.jp> updated Multibyte support extending it
--- a/src/data/isocz-wincz.tab
+++ b/src/data/isocz-wincz.tab
@@ -0,0 +1,12 @@
+#
+# Czech ISO-8859-2 -> WIN-1250 translation table
+#
+165 188
+169 138
+171 141
+174 142
+181 190
+185 154
+187 157
+190 158
+