
Migration from MoinMoin to DokuWiki

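Below you will find scripts in PHP and Python to facilitate the conversion process. The DokuWiki code tags inside the scripts may appear escaped (for example <>/code> instead of </code>) so that this wiki page renders correctly. Before running a script, replace any such escaped tag with the plain <code> or </code>.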

FIXME What exactly does that mean? There is no <>/code> in the php script. Also, are there any parameters that need to be passed to the script and how is that to be done? According to the code there should be three parameters. Passed through the URL? Syntax? Can anyone help?

Another document on switching appears at http://www.emilsit.net/blog/archives/migrating-from-moinmoin-to-dokuwiki/

PHP

I have written a small PHP script to convert wiki pages from MoinMoin (http://moinmoin.wikiwikiweb.de/) to DokuWiki syntax. It does not handle every difference between the two syntaxes, but it worked for me.

#!/usr/bin/php
<?php
 
//check command line parameters
if ($argc != 3 || in_array($argv[1], array('--help', '-help', '-h', '-?'))) {
  echo "\n  Converts all files from given directory\n";
  echo "  from MoinMoin to DokuWiki syntax. NOT RECURSIVE\n\n";
  echo "  Usage:\n";
  echo "  ".$argv[0]." <input dir> <output dir>\n\n";
} 
else {
  //get input and output directories
  $inDir = realpath($argv[1]) or die("input dir error");
  $outDir = realpath($argv[2]) or die("output dir error");
  //just print information
  echo "\nInput Directory: ".$inDir."\n";
  echo "Output Directory: ".$outDir."\n\n";
 
  //get all files from directory
  if (!is_dir($inDir)) {
    die("input dir is not a directory\n");
  }
  $files = filesFromDir($inDir);
 
  //migrate each file
  foreach ($files As $file) {
    //convert filename
    $ofile = convFileNames($file);
    //just print information
    echo "Migrating from ".$inDir."/".$file." to ".$outDir."/".$ofile."\n";
 
    //read input file
    $text = readFl($inDir."/".$file);
 
    //convert content
    $text = moin2doku($text);
 
    //encode in utf8 (assumes the input is ISO-8859-1; utf8_encode() is deprecated as of PHP 8.2)
    $text = utf8_encode($text);
 
    //write output file
    writeFl($outDir."/".$ofile, $text);
  }
}
 
 
function moin2doku($text) {
  /* like convFileNames and more
  *   ToDo: [[Datestamp]] delete?
  *         bold and italic, what goes wrong?
  *         images
  *         Problems with newline and [[BR]]
  *         CamelCase in Heading: it will be converted
  *         Moin handles code sections without closing }}} right, DokuWiki does not     
  */
 
  //line by line
  $lines = explode("\n", $text);
  $ret = '';
  foreach($lines As $line) {
    //headings first, in one pass: MoinMoin level n (1-5 equal signs) becomes
    //a DokuWiki heading with 7-n equal signs. Done separately so that an
    //already converted heading cannot be matched again by another heading rule.
    $line = preg_replace_callback('/^(={1,5})(\s.*\s)\1\s*$/', function ($m) {
      $marks = str_repeat('=', 7 - strlen($m[1]));
      return $marks.$m[2].$marks;
    }, $line);

    //start converting
    $find = Array(
                  '/\[\[TableOfContents\]\]/',      //remove
                  '/\[\[BR\]\]$/',                  //newline at end of line - remove
                  '/\[\[BR\]\]/',                   //newline
                  '/#pragma section-numbers off/',  //remove
                  '/\["(.*)"\]/',                   //internal link
                  '/(\[http.*\])/',                 //web link
                  '/\{{3}/',                        //code open
                  '/\}{3}/',                        //code close
                  '/^\s\*/',                        //lists need two spaces before *, not just one
                  '/\|{2}/',                        //table separator
                  '/\'{5}(.*)\'{5}/',               //bold and italic
                  '/\'{3}(.*)\'{3}/',               //bold
                  '/\'{2}(.*)\'{2}/',               //italic
                  '/(?<!\[)(\b[A-Z]+[a-z]+[A-Z][A-Za-z]*\b)/',  //CamelCase, don't change if CamelCase is in InternalLink
                  '/\[\[Date\(([\d]{4}-[\d]{2}-[\d]{2}T[\d]{2}:[\d]{2}:[\d]{2}Z)\)\]\]/'  //Date value
                  );
    $replace = Array(
                     '',                            //remove
                     '',                            //newline remove
                     '\\\\\ ',                      //newline
                     '',                            //remove
                     '[[${1}]]',                    //internal link
                     '[${1}]',                      //web link
                     '<code>',                      //code open
                     '</code>',                     //code close
                     '  *',                         //lists must have 2 whitespaces before *
                     '|',                           //table separator
                     '**//${1}//**',                //bold and italic
                     '**${1}**',                    //bold
                     '//${1}//',                    //italic
                     '[[${1}]]',                    //CamelCase
                     '${1}'                         //Date value
                     );
    $line = preg_replace($find,$replace,$line);

    $ret = $ret.$line."\r\n";
  }
  return $ret;
}
 
 
function convFileNames($name) {
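  /* replaces the escaped characters in MoinMoin file names:
  *  space, underscore, dot, umlauts (Ä, ö, ü), ampersand and hyphen
  */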
  $find = Array('/_20/',
                '/_5f/',
                '/_2e/',
                '/_c4/',
                '/_f6/',
                '/_fc/',
                '/_26/',
                '/_2d/'
                );
  $replace = Array('_',
                   '_',
                   '_',
                   'Ae',
                   'oe',
                   'ue',
                   '_',
                   '-'
                   );
  $name = preg_replace($find,$replace,$name);
  $name = strtolower($name);
  return $name.".txt";
}
 
 
function filesFromDir($dir) {
  $files = Array();
  $handle=opendir($dir);
  while ($file = readdir ($handle)) {
     if ($file != "." && $file != ".." && !is_dir($dir."/".$file)) {
         array_push($files, $file);
     }
  }
  closedir($handle); 
  return $files;
}
 
function readFl($file) {
  $text = '';
  $fr = fopen($file,"r");
  if ($fr) {
    while(!feof($fr)) {
      $text = $text.fgets($fr);
    }
    fclose($fr);
  }
  return $text;
}
 
function writeFl($file, $text) {
  $fw = fopen($file, "w");
  if ($fw) {
    fwrite($fw, $text);
    fclose($fw);
  }
}
 
?>
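
To make the file name handling easier to follow, here is a minimal Python sketch of what convFileNames() in the PHP script does. The function name conv_file_name and the table name REPLACEMENTS are made up for this illustration. The escape sequences and their replacements are copied from the PHP arrays above, and the comments naming each character assume the codes are ISO-8859-1 byte values.

```python
# Hypothetical Python port of convFileNames() from the PHP script above.
# Each pair is (escaped sequence in the MoinMoin file name, replacement).
REPLACEMENTS = (
    ('_20', '_'),   # space
    ('_5f', '_'),   # underscore
    ('_2e', '_'),   # dot
    ('_c4', 'Ae'),  # A umlaut (lowercased later)
    ('_f6', 'oe'),  # o umlaut
    ('_fc', 'ue'),  # u umlaut
    ('_26', '_'),   # ampersand
    ('_2d', '-'),   # hyphen
)


def conv_file_name(name):
    """Return a DokuWiki page file name for a MoinMoin file name."""
    # Apply the replacements one after another, in table order,
    # like preg_replace() does with its pattern array.
    for escaped, plain in REPLACEMENTS:
        name = name.replace(escaped, plain)
    # DokuWiki page files are lowercase and end in .txt
    return name.lower() + '.txt'


if __name__ == '__main__':
    for sample in ('My_20Page', '_c4rger_2dList', 'FrontPage'):
        print(sample + ' -> ' + conv_file_name(sample))
```

Escape sequences that are not in the table, such as the one for ß, are left untouched, so the resulting names may still need manual cleanup.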

Python

Based on the above, I have written a Python script that automates the file renaming, copying and conversion. It worked for me on Windows.

import sys, os, os.path
import re
from os import listdir
from os.path import isdir, basename
 
def check_dirs(moin_pages_dir, output_dir):
    if not isdir(moin_pages_dir):
        sys.stderr.write("MoinMoin pages directory doesn't exist!\n")
        sys.exit(1)
    if not isdir(output_dir):
        sys.stderr.write("Output directory doesn't exist!\n")
        sys.exit(1)
 
 
def get_page_names(moin_pages_dir):
    items = listdir(moin_pages_dir)
    pages = []
    for item in items:
        item = os.path.join(moin_pages_dir, item)
        if isdir(item):
            pages.append(item)
    return pages
 
 
def get_current_revision(page_dir):
    rev_dir = os.path.join(page_dir, 'revisions')
    revisions = listdir(rev_dir)
    revisions.sort()
    return os.path.join(rev_dir, revisions[len(revisions)-1])
 
 
def convert_heading(match):
    # MoinMoin level n (1-5 equal signs) becomes a DokuWiki heading
    # with 7-n equal signs
    marks = '=' * (7 - len(match.group(1)))
    return marks + match.group(2) + marks


def convert_page(page):
    # raw strings, so that \b is a word boundary and not a backspace character
    regexp = (
        (r'\[\[TableOfContents\]\]', ''),            # remove
        (r'\[\[BR\]\]$', ''),                        # newline at end of line - remove
        (r'\[\[BR\]\]', '\n'),                       # newline
        (r'#pragma section-numbers off', ''),        # remove
        (r'^##.*?\n', ''),                           # remove comment lines
        (r'\["(.*)"\]', r'[[\1]]'),                  # internal link
        (r'(\[http.*\])', r'[\1]'),                  # web link
        (r'\{{3}', '<code>'),                        # code open
        (r'\}{3}', '</code>'),                       # code close
        (r'^\s\*', '  *'),                           # lists need two spaces before *, not just one
        (r'^(={1,5})(\s.*\s)\1\s*$', convert_heading),  # headings 1-5, converted in one pass
        (r'\|{2}', '|'),                             # table separator
        (r"'{5}(.*)'{5}", r'**//\1//**'),            # bold and italic
        (r"'{3}(.*)'{3}", r'**\1**'),                # bold
        (r"'{2}(.*)'{2}", r'//\1//'),                # italic
        (r'(?<!\[)(\b[A-Z]+[a-z]+[A-Z][A-Za-z]*\b)', r'[[\1]]'),  # CamelCase, don't change if CamelCase is in InternalLink
        (r'\[\[Date\(([\d]{4}-[\d]{2}-[\d]{2}T[\d]{2}:[\d]{2}:[\d]{2}Z)\)\]\]', r'\1')  # Date value
    )
    for i in range(len(page)):
        line = page[i]
        for item in regexp:
            line = re.sub(item[0], item[1], line)
        page[i] = line
    return page
 
 
def print_help():
    print("Usage: moinconv.py <moinmoin pages directory> <output directory>")
    print("Convert moinmoin pages to dokuwiki.")
    sys.exit(0)
 
 
def print_parameter_error():
    sys.stderr.write('Incorrect parameters! Use --help switch to learn more.\n')
    sys.exit(1)
 
 
if __name__ == '__main__':
    if len(sys.argv) > 1:
        if sys.argv[1] in ('-h', '--help'):
            print_help()
        elif len(sys.argv) > 2:
            moin_pages_dir = sys.argv[1]
            output_dir = sys.argv[2]
        else:
            print_parameter_error()
    else:
        print_parameter_error()
 
    check_dirs(moin_pages_dir, output_dir)
    print('Input dir is: %s.' % moin_pages_dir)
    print('Output dir is: %s.' % output_dir)
    print('')
 
    pages = get_page_names(moin_pages_dir)
    for page in pages:
        curr_rev = get_current_revision(page)
        with open(curr_rev, 'r') as curr_rev_desc:
            curr_rev_content = curr_rev_desc.readlines()

        curr_rev_content = convert_page(curr_rev_content)

        page_name = basename(page).lower()
        out_file = os.path.join(output_dir, page_name + '.txt')
        with open(out_file, 'w') as out_desc:
            out_desc.writelines([it.rstrip() + '\n' for it in curr_rev_content if it])

        print('Migrated %s to %s.' % (basename(page), basename(out_file)))
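
Both scripts convert headings by inverting the number of equal signs: the replacement tables turn a MoinMoin heading with one equal sign into a DokuWiki heading with six, two into five, and so on down to five into two. The following is only a sketch of that mapping done in a single pass, so that a converted heading is never fed into a second heading rule. The names HEADING and convert_heading_line are hypothetical, and the sketch assumes that headings start at the beginning of the line.

```python
import re

# Hypothetical helper for illustration.
# Group 1: the opening equal signs (1 to 5), group 2: the title with its
# surrounding whitespace, \1: the same number of closing equal signs.
HEADING = re.compile(r'^(={1,5})(\s.*\s)\1\s*$')


def convert_heading_line(line):
    """Convert a MoinMoin heading of level n into DokuWiki syntax (7-n equal signs)."""
    def swap(match):
        marks = '=' * (7 - len(match.group(1)))
        return marks + match.group(2) + marks
    # Lines that are not headings are returned unchanged.
    return HEADING.sub(swap, line)


if __name__ == '__main__':
    for sample in ('= Top =', '== Section ==', '===== Deep =====', 'plain text'):
        print(convert_heading_line(sample))
```

A single pattern with a replacement function avoids an ordering problem with five separate rules: the output of the level 5 rule (two equal signs) looks exactly like the input of the level 2 rule.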

Discussion

Why did you switch from MoinMoin to DokuWiki? Just curious: I'm deciding between the two, Moin's WYSIWYG editor is very nice, and big sites like fedoraproject.org and ubuntu.com are using Moin. - posted on 1/16/2006
Because MoinMoin is not as stable as it looks? Do you know the Ubuntuusers wiki case? - posted on 04/26/2007
 
tips/moinmoin2doku.txt · Last modified: 2008/09/16 15:27 by 89.176.32.186
 

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported
