UTF-8 is a character encoding that can represent all the characters in the Unicode character set. With UTF-8, you don't need to worry about which character set to use for different languages; you can use the one encoding to write any character.
This page was created to help users discuss and solve problems with converting to UTF-8.
There's a document on UTF-8 and PHP with specific reference to DokuWiki happening here
I've been using DokuWiki since before Aug/2004, and it's been a most useful and valuable tool. Thus far, most of the “glitches” in using this excellent WikiEngine have been overcome with minor effort. And Andreas has been doing an amazing job in coping with the Wish-List, Feature Requests and the “personal” To-Do list.
Before release 2005-02-06, the engine used ISO8859 encoding (correct me if I'm wrong, please), both for content and for pagenames. After this release, the engine moved to the better UTF-8 encoding. Will the update/upgrade process be easy? The next section aims to detail/describe some user experiences.
Is there an easy way to edit the language files in a regular text editor?
This used to be so easy: a contributor could use a normal text editor to post a new language (set of files). But now the UTF-8 encoding uses “strange” characters, and writing content with accented characters is no longer a trivial task.
Unicode and its representation UTF-8 have been around for some years now. There are lots of editors for all known operating systems that support UTF-8. A Google search for unicode editor should give you an idea of the available products. Some Linux distributions are a bit behind regarding Unicode support out of the box; I recommend reading the Unicode HOWTO for a start.
The editor I prefer is JEdit. Performs beautifully with different character encodings and is available for most operating systems. But if the files contain the proper UTF-8 signature, you can edit them with many editors, for example vi (in Unix/Linux), or even with Notepad or Wordpad on Windows. —Tero 2005-02-16 08:51
“or even with Notepad or Wordpad on Windows”: Do correct me if I'm wrong, but I think this only applies to 2K and XP, maybe NT. Earlier versions might handle UTF-8 with updates, but I'm not sure about this.
I added links to two simple editors to the UTF8 page — Andreas Gohr 2005-05-13 00:15
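For anyone unsure whether a language file already carries the UTF-8 signature mentioned above (the byte order mark, bytes EF BB BF) or is valid UTF-8 at all, a quick check can be sketched in Python. The function name `describe_utf8` and the sample strings are made up for illustration; this is not part of DokuWiki.

```python
import codecs

def describe_utf8(data: bytes) -> dict:
    """Report whether raw bytes start with the UTF-8 signature and decode as UTF-8."""
    # codecs.BOM_UTF8 is the three-byte signature EF BB BF
    has_signature = data.startswith(codecs.BOM_UTF8)
    try:
        # "utf-8-sig" strips a leading signature if present, then decodes strictly
        data.decode("utf-8-sig")
        valid = True
    except UnicodeDecodeError:
        valid = False
    return {"has_signature": has_signature, "valid_utf8": valid}

if __name__ == "__main__":
    with_signature = codecs.BOM_UTF8 + "Grüße".encode("utf-8")
    latin1_bytes = "Grüße".encode("iso-8859-1")
    print(describe_utf8(with_signature))
    print(describe_utf8(latin1_bytes))
```

To check a real file, pass it the result of reading the file in binary mode. Plain ASCII files count as valid UTF-8 without a signature, so a missing signature alone does not mean the file is in the wrong encoding.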
Is it possible to de-activate UTF-8 support?
This sounds strange, after probably so many support tickets asking for UTF-8, but my running WikiSites will take quite some effort to be converted. Can this feature be disabled through a configuration setting? If so, what are the implications of it, meaning what will be lost or may not work correctly?
No - deactivating UTF-8 is not planned. It may work by converting your language files back to iso8859-15 and changing $lang['encoding'], but this may break things in various places and is not recommended.
However, converting your running Wiki sites to UTF-8 should be easy using the converter script; no special software is required. It is worth the effort in my opinion. Don't be afraid - it's the future.
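To give an idea of what such a conversion involves, here is a minimal Python sketch. It is not the converter script mentioned above and does not reproduce its behaviour. It assumes the old content is ISO-8859-15, and the function name `convert_file_to_utf8` is made up for illustration. It only re-encodes file contents; page names stored as file names would need separate handling.

```python
import tempfile
from pathlib import Path

def convert_file_to_utf8(path: Path, source_encoding: str = "iso-8859-15") -> bool:
    """Re-encode one file to UTF-8 in place. Returns True if the file was rewritten."""
    raw = path.read_bytes()
    try:
        raw.decode("utf-8")
        return False  # already valid UTF-8 (pure ASCII also lands here), leave it alone
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is in source_encoding
    text = raw.decode(source_encoding)
    path.write_bytes(text.encode("utf-8"))
    return True

if __name__ == "__main__":
    # Demonstration on a throwaway file rather than real wiki data
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "start.txt"
        page.write_bytes("Über uns".encode("iso-8859-15"))
        print(convert_file_to_utf8(page))   # first run rewrites the file
        print(convert_file_to_utf8(page))   # second run finds valid UTF-8 and skips it
```

Skipping files that already decode as UTF-8 makes the sketch safe to run twice. Work on a backup copy in any case, since a wrong guess about the source encoding silently produces wrong characters.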
I want to disable the UTF-8 feature too, because wiki pages in this encoding take a lot more space than in CP1251 (Windows Cyrillic). Is it possible to save the wiki in CP1251?
For Russian it's particularly important to keep pages in UTF-8, because it's the right way to avoid encoding collisions (with 1251, users on some platforms can see either Russian letters or special characters, but not both). Text files are relatively small in any encoding, I guess. I'm running a wiki in Russian and am happy with UTF-8.
This is so wrong and even if it wasn't, it wouldn't matter. — Johannes Buchner 2006-04-11 12:46
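On the size question raised above: in UTF-8 each Cyrillic letter takes two bytes instead of the one byte used by CP1251, while ASCII characters (spaces, punctuation, wiki markup) stay at one byte. A small Python sketch makes the difference measurable; the function name `encoded_sizes` and the sample text are made up for illustration.

```python
def encoded_sizes(text: str) -> dict:
    """Return the number of bytes the text needs in CP1251 and in UTF-8."""
    return {
        "cp1251": len(text.encode("cp1251")),
        "utf-8": len(text.encode("utf-8")),
    }

if __name__ == "__main__":
    # 9 Cyrillic letters plus a comma and a space
    print(encoded_sizes("Привет, мир"))
    # Pure ASCII text is the same size in both encodings
    print(encoded_sizes("====== wiki ======"))
```

So a purely Russian page at most doubles in size, and pages with a lot of markup or Latin text grow by less.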
I updated the script to 2006-03-09 and search stopped working completely. UTF-8 and Russian text are okay; only search doesn't work. Any reasons, please… The old version was 2005-02-18. — Andrei Kravchenko
I had a lot of problems with umlaut characters (öäüöä) and section edit. The length of a section wasn't calculated properly. After I removed default_charset = "iso-8859-1" from /etc/php.ini it worked. Roland Spatzenegger
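Section lengths can go wrong with umlauts because in UTF-8 a character such as ö occupies two bytes, so an offset counted in characters does not match an offset counted in bytes. Whether this is exactly what happened in the setup described above is an assumption; the Python sketch below only illustrates the general kind of error, with made-up function names and sample text.

```python
def cut_by_chars(page: str, n: int) -> str:
    """Take the first n characters of the page."""
    return page[:n]

def cut_by_bytes(page: str, n: int) -> str:
    """Take the first n bytes of the UTF-8 encoded page, as a byte-oriented tool would."""
    raw = page.encode("utf-8")
    # errors="replace" keeps the demo from failing if the cut lands inside a character
    return raw[:n].decode("utf-8", errors="replace")

if __name__ == "__main__":
    page = "Äpfel und Öl"
    section_length = len("Äpfel")  # 5 characters, but 6 bytes in UTF-8
    print(cut_by_chars(page, section_length))
    print(cut_by_bytes(page, section_length))
    print(len("öäüöä"), len("öäüöä".encode("utf-8")))
```

The byte-based cut comes out one letter short for every umlaut that precedes the cut point, which matches the kind of symptom described: sections that contain umlauts end up with the wrong length.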