====== Romanize filenames ======

**Keywords: UTF-8, romanize, cyrillic, latin, convert, filename**

When upgrading from a previous version that did not yet have the "romanize" function, you will encounter a completely unreadable directory structure. For example:

  %D0%BA%D1%8B%D1%80%D0%B3%D1%8B%D0%B7%D1%81%D1%82%D0%B0%D0%BD.txt

is the same as

  кыргызстан.txt

This is because UTF-8 filenames were urlencoded. Later versions added the "romanization" option to circumvent this problem ((see [[config:deaccent]] and [[:romanization]] for more info)).

The script below copies this unreadable directory structure to a new directory with "romanized" filenames. You will have to include the UTF8.php file, which is part of the DokuWiki installation.

Please note: this script is not error free. For example, some Cyrillic characters will leave a "'" at the end of the filename. Please check your page structure for invalid filenames after the conversion.

I hope this will help someone. Any improvements are welcome.

<code php>
<?php
// Assumed path and filename of the UTF-8 helper file; adjust to your installation.
require_once '/dokuwiki/inc/utf8.php';

/**
 * Romanize a path: urldecode it, lowercase it, then romanize and deaccent it.
 */
function cleanID($id, $ascii = false) {
    $id = trim(urldecode($id));
    $id = utf8_strtolower($id);
    $id = utf8_romanize($id);
    $id = utf8_deaccent($id, -1); // the result must be assigned back
    // Replace apostrophes produced by romanization with underscores
    $id = preg_replace('#\'+#', '_', $id);
    return $id;
}

/**
 * Copy a file, or recursively copy a folder and its contents,
 * romanizing the destination path on the way.
 *
 * @link http://aidanlister.com/repos/v/function.copyr.php
 * @param string $source Source path
 * @param string $dest Destination path
 * @return bool Returns TRUE on success, FALSE on failure
 */
function copyr($source, $dest) {
    $dest2 = cleanID($dest);
    echo $source . " -> " . $dest . " -> " . $dest2 . "\n";

    // Simple copy for a file
    if (is_file($source)) {
        return copy($source, $dest2);
    }

    // Make destination directory (check the romanized path, which is the one created)
    if (!is_dir($dest2)) {
        mkdir($dest2);
    }

    // Loop through the folder
    $dir = dir($source);
    while (false !== ($entry = $dir->read())) {
        // Skip pointers
        if ($entry == '.' || $entry == '..') {
            continue;
        }

        // Deep copy directories
        if ($dest !== "$source/$entry") {
            copyr("$source/$entry", "$dest/$entry");
        }
    }

    // Clean up
    $dir->close();
    return true;
}

copyr("/dokuwiki/data/pages/", "/dokuwiki/data/pagesnew/");
?>
</code>
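
Before copying anything, it can help to preview what the urlencoded names decode to. The following is only a minimal sketch of the decoding step, using nothing but PHP's built-in ''urldecode()''. The helper name ''previewDecodedNames'' is hypothetical and not part of DokuWiki, and the sketch does not romanize anything. Romanization still needs the UTF-8 functions from the DokuWiki installation.

<code php>
<?php
// Hypothetical helper (not part of DokuWiki): map each urlencoded
// filename to its decoded UTF-8 form without touching the disk.
function previewDecodedNames(array $names) {
    $preview = array();
    foreach ($names as $name) {
        $preview[$name] = urldecode($name);
    }
    return $preview;
}

// The example filename from this page
$names = array('%D0%BA%D1%8B%D1%80%D0%B3%D1%8B%D0%B7%D1%81%D1%82%D0%B0%D0%BD.txt');
foreach (previewDecodedNames($names) as $encoded => $decoded) {
    echo $encoded, ' -> ', $decoded, "\n";
}
</code>

Names that contain no percent escapes pass through unchanged, so the preview can be run over a whole directory listing, for example the result of ''scandir()''.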