Differences between revisions 8 and 17 (spanning 9 versions)
Revision 8 as of 2007-05-28 14:43:29
Size: 1819
Editor: dslb-084-058-237-229
Comment:
Revision 17 as of 2007-08-10 23:47:23
Size: 2042
Editor: vpn-113
Comment:

[wiki:Installation Installation Guide]

About character encoding (28May07 Markus)

CompletionSearch supports ISO-8859-1 and the multibyte character encoding UTF-8. UTF-8 is the default encoding with the following consequences:

  • The $AC->settings->encoding is 'utf-8' unless overridden by $config->encoding in autocomplete_config.php

  • The texts in text.php are saved as UTF-8
  • The css file uses '@charset "utf-8";'
  • We use mb_strtolower (instead of strtolower) with parameter $AC->settings->encoding to enable UTF-8
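The last point can be illustrated with a minimal sketch. It assumes the mbstring extension is loaded. The helper name ac_lowercase is hypothetical (it is not part of CompletionSearch), and its $encoding parameter stands in for $AC->settings->encoding.

```php
<?php
// Hypothetical helper, not taken from the CompletionSearch sources:
// lowercase a string according to the configured encoding.
function ac_lowercase(string $text, string $encoding = 'utf-8'): string
{
    // mb_strtolower interprets $text in the given encoding, so a multibyte
    // character such as "Ä" (two bytes in UTF-8) is mapped as one character.
    return mb_strtolower($text, $encoding);
}

$utf8   = "\xC3\x84RGER"; // "ÄRGER" encoded as UTF-8
$latin1 = "\xC4RGER";     // "ÄRGER" encoded as ISO-8859-1

$lowerUtf8   = ac_lowercase($utf8, 'utf-8');        // bytes of "ärger" in UTF-8
$lowerLatin1 = ac_lowercase($latin1, 'ISO-8859-1'); // bytes of "ärger" in ISO-8859-1
```

The plain strtolower function works on single bytes, which is why it is not suitable for UTF-8 text containing non-ASCII letters.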

We do the following depending on the defined encoding:

  • We UTF-8 encode $AC->settings->capitals if $AC->settings->encoding is UTF-8

  • In ajax.php we UTF-8 encode the query string if $AC->settings->encoding is UTF-8 and the charset of content_type is not UTF-8 (that is, the request was sent in a non-UTF-8 encoding)

  • We set the page encoding of index.php, options.php and change_options.php according to $AC->settings->encoding (<meta http-equiv="content-type" content="text/html;charset=<?php echo $AC->settings->encoding; ?>">)

  • Texts from text.php are UTF-8 decoded by $AC->get_text() if $AC->settings->encoding is ISO-8859-1

  • We URL-encode the JavaScript code in function javascript_rhs (in generate_javascript.php) if $AC->settings->encoding is not UTF-8 (this is not necessary if UTF-8 is used)
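Two of the steps above, the query string handling in ajax.php and the decoding done by $AC->get_text(), might look roughly like the following sketch. All function names are hypothetical and not taken from the CompletionSearch sources. The sketch also assumes that a request that does not declare UTF-8 is ISO-8859-1, and that a missing charset counts as "not UTF-8"; the actual code may decide differently.

```php
<?php
// Hypothetical helpers sketching the encoding-dependent steps.

// True if the encoding name denotes UTF-8 ("utf-8", "UTF-8", "utf8").
function ac_is_utf8(string $encoding): bool
{
    return in_array(strtolower(trim($encoding)), ['utf-8', 'utf8'], true);
}

// Extract the charset parameter from a Content-Type value, or null if absent.
function ac_request_charset(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._-]+)"?/i', $contentType, $m)) {
        return $m[1];
    }
    return null;
}

// ajax.php step: convert the query string to UTF-8 when the application
// encoding is UTF-8 but the request did not declare UTF-8.
// Assumption: such a request is ISO-8859-1.
function ac_encode_query(string $query, string $appEncoding, string $contentType): string
{
    $charset = ac_request_charset($contentType);
    if (ac_is_utf8($appEncoding) && ($charset === null || !ac_is_utf8($charset))) {
        return mb_convert_encoding($query, 'UTF-8', 'ISO-8859-1');
    }
    return $query;
}

// get_text() step: text.php stores UTF-8, so convert for ISO-8859-1 pages.
function ac_decode_text(string $text, string $appEncoding): string
{
    if (strcasecmp($appEncoding, 'ISO-8859-1') === 0) {
        return mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    }
    return $text;
}
```

mb_convert_encoding is used here instead of utf8_encode and utf8_decode because those two functions are deprecated in recent PHP versions; for ISO-8859-1 input the result is the same.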

Note: The form attribute accept-charset

If the form attribute accept-charset is set to "UTF-8", the form variables are UTF-8 encoded before they are sent to the server (even if the page encoding is not UTF-8).
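As a small illustration, a form carrying this attribute could be generated as follows. The helper name ac_render_form and the field name "query" are hypothetical and only serve the example; they are not taken from the CompletionSearch sources.

```php
<?php
// Hypothetical template helper: render a search form that submits its
// fields as UTF-8, independent of the page encoding set in the meta tag.
function ac_render_form(string $action): string
{
    // Escape the action URL before placing it into an HTML attribute.
    $action = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');
    return '<form action="' . $action . '" method="get" accept-charset="UTF-8">'
         . '<input type="text" name="query">'
         . '</form>';
}

$formHtml = ac_render_form('index.php');
```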

The PHP Apache extension php_mbstring

The mb_strtolower function (and the other mb_ functions) require the php_mbstring extension to be enabled in php.ini:

On Windows:
extension=php_mbstring.dll

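or on Linux (the module file is usually named mbstring.so there; use php_mbstring.so only if your build names it that way):
extension=mbstring.so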

(On geek, the mb_... functions were available by default; on Markus' laptop the line above had to be added.)

If this is the first extension you enable, make sure that the extension_dir directive in php.ini points to the directory that contains the extension files.
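To check whether the setup worked, a short script along the following lines can be run on the server. It is only a sketch; the function name ac_mbstring_available is hypothetical, while extension_loaded, function_exists, php_ini_loaded_file and ini_get are standard PHP functions.

```php
<?php
// Returns true if the mbstring functions used by the application are available.
function ac_mbstring_available(): bool
{
    return extension_loaded('mbstring') && function_exists('mb_strtolower');
}

if (!ac_mbstring_available()) {
    // php_ini_loaded_file() returns the path of the loaded php.ini, or false.
    $ini = php_ini_loaded_file();
    echo "mbstring is not loaded; check the extension= line in ",
         ($ini !== false ? $ini : "php.ini"), "\n";
    // Show where PHP currently looks for extension files.
    echo "extension_dir is currently: ", ini_get('extension_dir'), "\n";
}
```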

CompleteSearch: FrontPage (last edited 2017-03-19 13:30:19 by Hannah Bast)