====== BOMfix Syntax PlugIn ======

---- plugin ----
description: Suppress UTF-8 Byte-Order-Mark
author     : Matthias Watermann
email      : support@mwat.de
type       : syntax
lastupdate : 2007-08-15
compatible : 2005-07-13+
depends    :
conflicts  :
similar    :
tags       : bom, utf-8
----

If you //always// and //exclusively// edit your wiki pages with DokuWiki's built-in editor (i.e. the HTML form based edit option), you //won't need// this plugin and may just return to wherever you came from when browsing to this page.

External editors (i.e. separate standalone programs like word-processing software) usually mark a file as UTF-8 by prepending a "magic" byte sequence((often called ''BOM'': //Byte Order Mark//; in UTF-8 these are the three bytes ''EF BB BF'')) to its content, at the very start of the file. While this does no harm as far as DokuWiki is concerned, those "magic" bytes //do// appear in the page presented to the user. Depending on a page's actual content and the CSS rules in effect, this may lead to undesired results.

One way to get rid of this problem would be to open the affected page(s) with DokuWiki's built-in edit feature and simply remove those bytes. However, the word processor would then open the file as plain text, assuming it is in ASCII or, say, ISO-8859-1 format (whatever is configured as the default text format). That, in turn, would invalidate (or at least render strangely) all UTF-8 character sequences. Actually, removing the bytes is the recommended approach //if// (i.e. ''if'') you never intend to edit the wiki pages with an external editor.

As it happens, I personally prefer to edit the pages (of a local DokuWiki installation) with editors like [[http://www.kate-editor.org/|Kate]] or [[http://www.openoffice.org/|OpenOffice.org]] for various reasons((and, yes, I know that I bypass DokuWiki's locking and changes system this way; but I know what, when and how I'm doing it...)). Therefore I((i.e.
the editor)) need those "magic" bytes //but// I don't want them to show up in the pages presented to the end user (reader). Enter ''syntax_plugin_bomfix''.

===== Usage =====

The whole purpose of this plugin is to suppress the output of that "magic" byte sequence. And nothing more.

:!: There are //no new wiki language features// introduced by this plugin. Nor is there anything special you have to remember when editing one of your already existing or newly created pages. Hence, besides installing this plugin there is nothing to do or respect.

===== Installation =====

It's quite easy to integrate this [[#plugin_source|plugin]] with your DokuWiki:

  - Download the [[http://dev.mwat.de/dw/syntax_plugin_bomfix.zip|source archive]] (~3KB) and unpack it in your DokuWiki plugin directory ''{dokuwiki}/lib/plugins'' (make sure included subdirectories are unpacked correctly); this will create the directory ''{dokuwiki}/lib/plugins/bomfix''.
  - Make sure both the new directory and the files therein are readable by the web server, e.g. (the user and group names depend on your system) <code>chown -Rc apache:apache dokuwiki/lib/plugins/bomfix</code>

You might as well use the [[http://www.dokuwiki.org/plugin:plugin_manager|plugin manager]] for installing or updating this plugin.

===== Plugin Source =====

Here comes the [[http://www.gnu.org/licenses/gpl.html|GPLed]] PHP source((The comments within the [[#Plugin Source|source]] file are suitable for the OSS [[http://www.stack.nl/~dimitri/doxygen/index.html|doxygen]] tool, a documentation system for C++, C, Java, Objective-C, Python, IDL and to some extent PHP, C#, and D. Since I'm working with different programming languages, it's a great help to have one tool that handles the docs for all of them.)) for those who'd like to scan it before actually installing it:

<code php>
<?php
// NOTE: the opening lines of this listing did not survive on this page.
// The guard and the include below are assumptions, inferred from the
// closing "} // if" and from the usual DokuWiki plugin boilerplate.
if (! class_exists('syntax_plugin_bomfix')) {
  // include parent class (DOKU_PLUGIN is expected to be set by DokuWiki)
  require_once(DOKU_PLUGIN . 'syntax.php');

/**
 * <tt>syntax_plugin_bomfix.php</tt> - A PHP4 class that implements
 * a <tt>DokuWiki</tt> plugin for UTF8 "magic" bytes.
 *
 * <p>
 * External editors (i.e. separate standalone programs like wordprocessing
 * software) usually mark a file in UTF8 format by prepending its content
 * with a "magic" byte sequence at the very start of file. While there is
 * no harm in it as far as DokuWiki is concerned those "magic" bytes
 * (Byte Order Mark) do appear in the page presented to the user.
 * </p><p>
 * Depending on a page's actual content and the respective CSS rules in
 * effect this may lead to undesired results. One way to get rid of this
 * problem would be to open the affected page(s) with DokuWiki's builtin
 * edit feature and simply remove those bytes. However, such an approach
 * would cause the wordprocessor to open the file as plain text assuming
 * it's in ASCII or, say, ISO-8859-1 format - whatever may be configured
 * as the default text format. That, in consequence would invalidate (or
 * at least render strangely) all UTF8 character sequences.
 * </p><p>
 * Actually that is the recommended approach <em>if</em> (<tt>if</tt>)
 * you never intend to edit the wiki pages by an external editor.
 * </p><p>
 * As it happens, personally I prefer to edit the pages (of a local Dokuwiki
 * installation) by OpenOffice.org for various reasons. (And, yes, I know
 * that I bypass DokuWiki's changes-system this way.) Therefore I need those
 * "magic" bytes but I don't want them to show in the pages
 * presented to the end user (reader). Enter <tt>syntax_plugin_bomfix</tt>.
 * The whole purpose of this plugin is to suppress the output of that
 * "magic" byte sequence. And nothing more. There are no new wiki language
 * features introduced by this plugin. Nor is there anything special you
 * have to remember when editing one of your already existing or newly
 * created pages.
 * </p><p>
 * To use it just install the plugin in your DokuWiki's plugin folder.
 * That's all.
 * </p>
 * <pre>
 *  Copyright (C) 2006, 2007  DFG/M.Watermann, D-10247 Berlin, FRG
 *      All rights reserved
 *    EMail : <support@mwat.de>
 * </pre>
 * <p>
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 3 of the
 * License, or (at your option) any later version.
 * </p><p>
 * This software is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 * </p>
 * @author Matthias Watermann
 * @version $Id: syntax_plugin_bomfix.php,v 1.3 2007/08/15 12:36:20 matthias Exp $
 * @since created 24-Dec-2006
 */
class syntax_plugin_bomfix extends DokuWiki_Syntax_Plugin {

  /**
   * @publicsection
   */
  //@{

  /**
   * Tell the parser whether the plugin accepts syntax mode
   * <tt>$aMode</tt> within its own markup.
   *
   * @param $aMode String The requested syntaxmode.
   * @return Boolean <tt>FALSE</tt> always since no nested markup
   * is possible with this plug.
   * @public
   */
  function accepts($aMode) {
    return FALSE;
  } // accepts()

  /**
   * Connect lookup pattern to lexer.
   *
   * @param $aMode String The desired rendermode.
   * @public
   * @see render()
   */
  function connectTo($aMode) {
    // match the three BOM bytes (EF BB BF) at the start of the text
    $this->Lexer->addSpecialPattern('^\xEF\xBB\xBF',
      $aMode, 'plugin_bomfix');
  } // connectTo()

  /**
   * Get an associative array with plugin info.
   *
   * <p>
   * The returned array holds the following fields:
   * </p>
   * <dl>
   * <dt>author</dt><dd>Author of the plugin</dd>
   * <dt>email</dt><dd>Email address to contact the author</dd>
   * <dt>date</dt><dd>Last modified date of the plugin in
   * <tt>YYYY-MM-DD</tt> format</dd>
   * <dt>name</dt><dd>Name of the plugin</dd>
   * <dt>desc</dt><dd>Short description of the plugin (Text only)</dd>
   * <dt>url</dt><dd>Website with more information on the plugin
   * (eg. syntax description)</dd>
   * </dl>
   * @return Array Information about this plugin class.
   * @public
   * @static
   */
  function getInfo() {
    return array(
      'author' => 'Matthias Watermann',
      'email' => 'support@mwat.de',
      'date' => '2007-08-15',
      'name' => 'BOMfix Syntax Plugin',
      'desc' => 'Ignore UTF8 "magic" bytes at start of page',
      'url' => 'http://www.dokuwiki.org/plugin:bomfix');
  } // getInfo()

  /**
   * Where to sort in?
   *
   * @return Integer <tt>380</tt> (doesn't really matter).
   * @static
   * @public
   */
  function getSort() {
    return 380;
  } // getSort()

  /**
   * Get the type of syntax this plugin defines.
   *
   * @return String <tt>'substition'</tt> (i.e. 'substitution').
   * @static
   * @public
   */
  function getType() {
    return 'substition';  // sic! should be "substitution"
  } // getType()

  /**
   * Handler to prepare matched data for the rendering process.
   *
   * <p>
   * The <tt>$aState</tt> parameter gives the type of pattern
   * which triggered the call to this method:
   * </p>
   * <dl>
   * <dt>DOKU_LEXER_ENTER</dt>
   * <dd>a pattern set by <tt>addEntryPattern()</tt></dd>
   * <dt>DOKU_LEXER_MATCHED</dt>
   * <dd>a pattern set by <tt>addPattern()</tt></dd>
   * <dt>DOKU_LEXER_EXIT</dt>
   * <dd>a pattern set by <tt>addExitPattern()</tt></dd>
   * <dt>DOKU_LEXER_SPECIAL</dt>
   * <dd>a pattern set by <tt>addSpecialPattern()</tt></dd>
   * <dt>DOKU_LEXER_UNMATCHED</dt>
   * <dd>ordinary text encountered within the plugin's syntax mode
   * which doesn't match any pattern.</dd>
   * </dl>
   * @param $aMatch String The text matched by the patterns.
   * @param $aState Integer The lexer state for the match.
   * @param $aPos Integer The character position of the matched text.
   * @param $aHandler Object Reference to the Doku_Handler object.
   * @return Integer The current lexer state.
   * @public
   * @see render()
   * @static
   */
  function handle($aMatch, $aState, $aPos, &$aHandler) {
    return $aState;  // doesn't really matter as it's ignored anyway ...
  } // handle()

  /**
   * Handle the actual output creation.
   *
   * <p>
   * The method checks for the given <tt>$aFormat</tt> and returns
   * <tt>FALSE</tt> when a format isn't supported. <tt>$aRenderer</tt>
   * contains a reference to the renderer object which is currently
   * handling the rendering. The contents of <tt>$aData</tt> is the
   * return value of the <tt>handle()</tt> method.
   * </p>
   * @param $aFormat String The output format to generate.
   * @param $aRenderer Object A reference to the renderer object.
   * @param $aData Integer The data created/returned by the
   * <tt>handle()</tt> method.
   * @return Boolean <tt>TRUE</tt> if rendered successfully, or
   * <tt>FALSE</tt> otherwise.
   * @public
   * @see handle()
   * @static
   */
  function render($aFormat, &$aRenderer, $aData) {
    // nothing to do here - just 'eat' the BOM
    return TRUE;
  } // render()

  //@}
} // class syntax_plugin_bomfix
} // if
//Setup VIM: ex: et ts=2 enc=utf-8 :
?>
</code>

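
For readers who want to see the "magic" bytes in isolation, here is a minimal, standalone sketch. It is //not// part of the plugin: the names ''BOMFIX_DEMO_BOM'', ''bomfix_demo_has_bom()'' and ''bomfix_demo_strip_bom()'' are made up for illustration. The sketch applies the same pattern string as ''connectTo()'' to a plain PHP string, whereas the plugin itself never changes the stored page and merely swallows the bytes while rendering.

<code php>
<?php
// Illustration only; all names below are hypothetical, not plugin API.

// The UTF-8 encoded Byte Order Mark: the three bytes EF BB BF.
define('BOMFIX_DEMO_BOM', "\xEF\xBB\xBF");

// Tell whether $aText starts with the UTF-8 BOM.
function bomfix_demo_has_bom($aText) {
  return (0 === strncmp($aText, BOMFIX_DEMO_BOM, 3));
} // bomfix_demo_has_bom()

// Return $aText without a leading BOM. The pattern is the one used in
// connectTo(); without the "m" modifier the caret only matches at the
// very start of the text, so BOM bytes further down are left alone.
function bomfix_demo_strip_bom($aText) {
  return preg_replace('/^\xEF\xBB\xBF/', '', $aText);
} // bomfix_demo_strip_bom()

// A page as an external editor might have saved it:
$page = BOMFIX_DEMO_BOM . "====== Start ======\n";

echo bin2hex(substr($page, 0, 3)), "\n";  // prints "efbbbf"
var_dump(bomfix_demo_has_bom($page));     // bool(true)
var_dump(bomfix_demo_has_bom(bomfix_demo_strip_bom($page)));  // bool(false)
</code>

Stripping the bytes from the file like this is the "remove those bytes" approach described at the top of this page, so it is only appropriate if no external editor needs the mark.
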
==== Changes ====

__2007-08-15__:\\
* added GPL link and fixed some doc problems;

__2007-12-26__:\\
+ initial release;

//[[support@mwat.de|Matthias Watermann]] 2007-08-15//

===== See also =====

==== Plugins by the same author ====

  * [[bomfix|BOMfix Plugin]] -- ignore Byte-Order-Mark characters in your pages
  * [[code2|Code Syntax Plugin]] -- use syntax highlighting of code fragments in your pages
  * [[deflist|Definition List Syntax Plugin]] -- use the only complete definition lists in your pages
  * [[diff|Diff Syntax Plugin]] -- use highlighting of diff files (aka "patches") in your pages((obsoleted by incorporating its ability into the [[code2|Code]] plugin))
  * [[hr|HR Syntax Plugin]] -- use horizontal rules in nested block elements of your pages
  * [[lang|LANGuage Syntax Plugin]] -- markup different languages in your pages
  * [[lists|Lists Syntax Plugin]] -- use the only complete un-/ordered lists in your pages
  * [[nbsp|NBSP Syntax Plugin]] -- use Non-Breakable-Spaces in your pages
  * [[nstoc|NsToC Syntax Plugin]] -- use automatically generated namespace indices
  * [[shy|Shy Syntax Plugin]] -- use soft hyphens in your pages
  * [[tip|Tip Syntax Plugin]] -- add hint areas to your pages

===== Discussion =====

Hints, comments, suggestions ... \\