About character encoding (28May07 Markus)
CompletionSearch supports ISO-8859-1 and the multibyte character encoding UTF-8. UTF-8 is the default encoding with the following consequences:
- $AC->settings->encoding is 'utf-8' unless overridden in autocomplete_config.php
- The texts in text.php are saved as UTF-8
- The css file uses '@charset "utf-8";'
- We use mb_strtolower (instead of strtolower) with $AC->settings->encoding as the encoding parameter, so that UTF-8 text is lowercased correctly
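A minimal sketch of the last point, assuming the mbstring extension is loaded. The function name ac_lowercase is hypothetical and only stands in for wherever CompletionSearch lowercases text; the $encoding argument plays the role of $AC->settings->encoding.

```php
<?php
// Hypothetical helper (not part of CompletionSearch): lowercase a string
// using the configured encoding instead of the byte-oriented strtolower.
function ac_lowercase($text, $encoding)
{
    // mb_strtolower decodes $text according to $encoding, so multibyte
    // characters such as "Ä" (two bytes in UTF-8) are handled as one character.
    return mb_strtolower($text, $encoding);
}

// Usage: $encoding stands in for $AC->settings->encoding.
$encoding = 'UTF-8';
echo ac_lowercase("ÄRGER", $encoding), "\n";

// strtolower works byte by byte, so it does not reliably lowercase
// multibyte characters; this is why the mb_ variant is used.
```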
Depending on the configured encoding, we do the following:
- We UTF-8 encode $AC->settings->capitals if $AC->settings->encoding is UTF-8
- In ajax.php we UTF-8 encode the query string if $AC->settings->encoding is UTF-8 and the charset of content_type is not UTF-8 (that is, the request was sent in a non-UTF-8 encoding)
- We set the page encoding of index.php, options.php and change_options.php according to $AC->settings->encoding (<meta http-equiv="content-type" content="text/html;charset=<?php echo $AC->settings->encoding; ?>">)
- Texts from text.php are UTF-8 decoded by $AC->get_text() if $AC->settings->encoding is ISO-8859-1
- We URL-encode the JavaScript code in function javascript_rhs (in generate_javascript.php) if $AC->settings->encoding is not UTF-8 (this is not necessary if UTF-8 is used)
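Two of the rules above (the query string in ajax.php and the texts from text.php) can be sketched roughly as follows. This is not the actual CompletionSearch code: the function names are hypothetical, the sketch assumes that a non-UTF-8 request is ISO-8859-1, and it uses mb_convert_encoding where the original code may well use utf8_encode and utf8_decode.

```php
<?php
// Hypothetical helper: extract the charset from a Content-Type value,
// lowercased, or null if no charset parameter is present.
function charset_of_content_type($content_type)
{
    if (preg_match('/charset\s*=\s*"?([A-Za-z0-9._-]+)"?/i', $content_type, $m)) {
        return strtolower($m[1]);
    }
    return null;
}

// Sketch of the ajax.php rule: UTF-8 encode the query if the configured
// encoding is UTF-8 but the request was not sent as UTF-8.
// Assumption: such a request is ISO-8859-1.
function normalize_query($query, $page_encoding, $content_type)
{
    $wants_utf8 = strcasecmp($page_encoding, 'utf-8') === 0;
    if ($wants_utf8 && charset_of_content_type($content_type) !== 'utf-8') {
        return mb_convert_encoding($query, 'UTF-8', 'ISO-8859-1');
    }
    return $query;
}

// Sketch of the get_text rule: texts are stored as UTF-8 and decoded
// only when the configured encoding is ISO-8859-1.
function decode_text_for_page($text, $page_encoding)
{
    if (strcasecmp($page_encoding, 'iso-8859-1') === 0) {
        return mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    }
    return $text;
}

// Usage: "caf\xE9" is "café" in ISO-8859-1; the result is the UTF-8 form.
$utf8_query = normalize_query("caf\xE9", 'utf-8', 'text/html; charset=ISO-8859-1');
```

Note that a request without any charset in its content type is treated as non-UTF-8 here, which follows the rule as stated above.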
Note: The form attribute accept-charset
If the form attribute accept-charset is set to "UTF-8", the form variables are UTF-8 encoded before being sent to the server (even if the page encoding is not UTF-8).
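As an illustration of this note, a form with accept-charset could be emitted like this. The function name search_form_html and the field name "query" are made up for the example and are not taken from the CompletionSearch sources.

```php
<?php
// Hypothetical helper: build a search form whose variables are submitted
// as UTF-8, independent of the encoding of the page containing it.
function search_form_html($action)
{
    // Escape the action URL before placing it in an attribute.
    $action = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');
    return '<form action="' . $action . '" method="get" accept-charset="UTF-8">'
         . '<input type="text" name="query">'
         . '</form>';
}

// Usage
echo search_form_html('index.php'), "\n";
```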
The PHP Apache extension php_mbstring
The mb_strtolower function (and the other mb_ functions) requires the extension php_mbstring to be enabled in php.ini:
On Windows: extension=php_mbstring.dll
On Linux: extension=php_mbstring.so
(On geek, the mb_... functions were available by default, on Markus' laptop the line above had to be added.)
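Since the mb_ functions are available on some machines and not on others, a quick runtime check can help when setting up a new installation. The following is only a sketch; the function names are hypothetical, and the strtolower fallback is only reliable for ASCII text.

```php
<?php
// Hypothetical check: is the mbstring extension usable on this machine?
function mbstring_available()
{
    return extension_loaded('mbstring') && function_exists('mb_strtolower');
}

// Hypothetical wrapper: use mb_strtolower when available, otherwise fall
// back to strtolower (which does not handle multibyte characters).
function lowercase_with_fallback($text, $encoding)
{
    if (mbstring_available()) {
        return mb_strtolower($text, $encoding);
    }
    return strtolower($text);
}

// Usage
if (!mbstring_available()) {
    echo "mbstring is not loaded; check the extension line in php.ini\n";
}
echo lowercase_with_fallback('ABC', 'UTF-8'), "\n";
```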