Differences between revisions 8 and 9
Revision 8 as of 2007-05-28 14:43:29
Size: 1819
Editor: dslb-084-058-237-229
Comment:
Revision 9 as of 2007-05-28 15:02:22
Size: 1703
Editor: dslb-084-058-237-229
Comment:

About character encoding

CompletionSearch supports ISO-8859-1 and the multibyte character encoding UTF-8. UTF-8 is the default encoding, which has the following consequences:

  • $AC->settings->encoding is 'utf-8' unless overridden in autocomplete_config.php
  • The texts in text.php are saved as UTF-8
  • The CSS file uses '@charset "utf-8";'
  • We use mb_strtolower (instead of strtolower) with parameter $AC->settings->encoding so that UTF-8 text is lowercased correctly
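
The reason for preferring mb_strtolower can be illustrated with a minimal sketch. The variable $encoding is a stand-in for $AC->settings->encoding and is an assumption of this example, as is the sample word; the mbstring extension must be loaded.

```php
<?php
// Sketch only: $encoding stands in for $AC->settings->encoding.
$encoding = 'UTF-8';

// "ÄRGER" spelled with explicit UTF-8 bytes, so the result does not depend
// on the encoding this file is saved in. "Ä" is the two bytes C3 84.
$word = "\xC3\x84RGER";

// mb_strtolower is told the encoding, so it maps "Ä" (C3 84) to "ä" (C3 A4).
$lower = mb_strtolower($word, $encoding);

echo bin2hex($lower), "\n"; // prints c3a472676572, the bytes of "ärger"
```

Plain strtolower works byte by byte, so it cannot lowercase a multibyte character such as "Ä" correctly, and depending on the PHP version and locale it may leave those bytes untouched or alter them.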

Depending on the configured encoding, we do the following:

  • We UTF-8 encode $AC->settings->capitals if $AC->settings->encoding is UTF-8

  • In ajax.php we UTF-8 encode the query string if $AC->settings->encoding is UTF-8 and the charset of content_type is not UTF-8 (that is, the request was sent in a non-UTF-8 encoding)

  • We set the page encoding of index.php, options.php and change_options.php according to $AC->settings->encoding

(<meta http-equiv="content-type" content="text/html;charset=<?php echo $AC->settings->encoding; ?>">)

  • Texts from text.php are UTF-8 decoded by $AC->get_text() if $AC->settings->encoding is ISO-8859-1

  • We URL-encode the JavaScript code in function javascript_rhs (in generate_javascript.php) if $AC->settings->encoding is not UTF-8 (this is not necessary if UTF-8 is used)
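
The two conversion rules in the list above (encoding the query string and decoding the texts) might look roughly like the following sketch. The function names query_to_utf8 and text_for_page are hypothetical and do not come from ajax.php or $AC->get_text(). Treating every non-UTF-8 request as ISO-8859-1 is an assumption, and mb_convert_encoding is used here as one possible conversion call, not necessarily the one the project uses.

```php
<?php
// Hypothetical helper, not taken from ajax.php: return the query as UTF-8.
// $requestCharset is the charset found in the request's content type.
// Assumption of this sketch: a non-UTF-8 request is ISO-8859-1.
function query_to_utf8(string $query, string $settingsEncoding, string $requestCharset): string
{
    $wantsUtf8 = strcasecmp($settingsEncoding, 'utf-8') === 0;
    $isUtf8 = strcasecmp($requestCharset, 'utf-8') === 0;
    if ($wantsUtf8 && !$isUtf8) {
        // Arguments are: string, target encoding, source encoding.
        return mb_convert_encoding($query, 'UTF-8', 'ISO-8859-1');
    }
    return $query;
}

// Hypothetical counterpart for texts stored as UTF-8 in text.php:
// decode them only when the page itself is served as ISO-8859-1.
function text_for_page(string $utf8Text, string $settingsEncoding): string
{
    if (strcasecmp($settingsEncoding, 'iso-8859-1') === 0) {
        return mb_convert_encoding($utf8Text, 'ISO-8859-1', 'UTF-8');
    }
    return $utf8Text;
}

// "café" as ISO-8859-1 bytes: the "é" is the single byte E9.
$latin1 = "caf\xE9";
$utf8 = query_to_utf8($latin1, 'utf-8', 'iso-8859-1');
echo bin2hex($utf8), "\n"; // prints 636166c3a9 ("é" became C3 A9)
```

Converting the text back with text_for_page($utf8, 'iso-8859-1') yields the original ISO-8859-1 bytes again, which is the round trip the ISO-8859-1 setting relies on.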

UTF-8 lowercase in PHP (23May07 Markus)

Requires the mbstring extension (for functions like mb_strtolower). The following line is required in php.ini.

On Windows:
extension=php_mbstring.dll

or on Linux:
extension=php_mbstring.so

(On geek, the mb_... functions were available by default; on Markus's laptop the line above had to be added.)
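
Whether a given machine already provides the mb_... functions can be checked with a short script like the one below. The function name mbstring_available is hypothetical; extension_loaded and function_exists are standard PHP functions.

```php
<?php
// Hypothetical helper: true if the mbstring extension is loaded and
// mb_strtolower can be called in this PHP installation.
function mbstring_available(): bool
{
    return extension_loaded('mbstring') && function_exists('mb_strtolower');
}

if (mbstring_available()) {
    echo "mbstring is available\n";
} else {
    echo "mbstring is missing: enable the extension in php.ini\n";
}
```

Running this from the command line and through the web server can give different answers, because the two may read different php.ini files.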

Texts in text.php are now UTF-8 encoded (23May07 Markus)

CompleteSearch: FrontPage (last edited 2017-03-19 13:30:19 by Hannah Bast)