~~ORPHANSWANTED:orphans~~
~~ORPHANSWANTED:wanted~~
~~ORPHANSWANTED:valid~~
~~ORPHANSWANTED:all~~ << makes all three tables
Anything other than these 4 words will generate a syntax error message.
**Enhanced** usage adds optional namespaces, each prefixed with an exclamation point '!' (think 'not')
~~ORPHANSWANTED:orphans|wanted|valid|all[!namespace!another!one:with:subspaces]~~
Example -- exclude one namespace:
~~ORPHANSWANTED:wanted!wiki~~ Shows wanted pages, but none that are under the wiki: namespace
Example -- exclude multiple namespaces:
~~ORPHANSWANTED:orphans!wiki!sys:personal~~ Shows orphan pages, but none in the wiki: or in the sys:personal: namespaces
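The exclusion rule can be sketched in a few lines of PHP. This is a minimal, self-contained illustration rather than the plugin's own code: the function names ''demo_parse_directive'' and ''demo_is_excluded'' are made up for this example. It assumes an exclusion matches when it is a prefix of the page's namespace, compared with a trailing colon so that ''wiki'' does not also exclude ''wikipedia''.

```php
<?php
// Hypothetical helpers for illustration only; not part of the plugin.

// Split "~~ORPHANSWANTED:orphans!wiki!sys:personal~~" into
// the report type and the list of excluded namespaces.
function demo_parse_directive($match) {
    // strip "~~ORPHANSWANTED:" (16 characters) from the start and "~~" from the end
    $parts = explode('!', strtolower(substr($match, 16, -2)));
    return array('type' => $parts[0], 'exclude' => array_slice($parts, 1));
}

// True when the page ID lies in (or below) one of the excluded namespaces.
function demo_is_excluded($id, $excludes) {
    $pos = strrpos($id, ':');
    // namespace of the page with a trailing colon, e.g. "sys:personal:"
    $ns = ($pos === false) ? ':' : substr($id, 0, $pos) . ':';
    foreach ($excludes as $exclude) {
        if (strpos($ns, $exclude . ':') === 0) {
            return true;
        }
    }
    return false;
}

$directive = demo_parse_directive('~~ORPHANSWANTED:orphans!wiki!sys:personal~~');
var_dump(demo_is_excluded('wiki:syntax', $directive['exclude']));       // bool(true)
var_dump(demo_is_excluded('sys:personal:todo', $directive['exclude'])); // bool(true)
var_dump(demo_is_excluded('wikipedia:start', $directive['exclude']));   // bool(false)
```

Comparing with the trailing colon is what keeps an exclusion of ''wiki'' from hiding pages in an unrelated namespace that merely starts with the same letters.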
===== Version/Requirements =====
:!: Seems to work fine with Dokuwiki v[[changes#release_2007-06-26|2007-06-26]]. (Added this note in case the note below scares you off.) //-- brett, 2008-03-03//
:!: There appear to be some problems with using this plugin with the latest release of DokuWiki (2006-11-06). It doesn't appear to work correctly, and it breaks the Configuration Manager. --- //[[david.mcneill@ge.com|DGM2]] 2007-01-05 00:19//
:!: The plugin does not seem to support default namespace linking (as in %%x:y:%% tries to link to %%x:y:home%% and, if that doesn't exist, points to %%x:y%%). We use that a lot in our (very large) wiki, where we would obviously require a list of orphaned/wanted pages to detect false links etc. - Bernhard
Proposed fix for above issue...add the following code after the ''namespace fix'' around line 94:
if( $link[strlen($link)-1] == ":" ) {
$link .= $conf["start"];
}
-- Nathan
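Taken in isolation, the proposed fix amounts to the following sketch. ''demo_add_start_page'' is a hypothetical name, and its ''$start'' parameter stands in for DokuWiki's ''$conf['start']'' setting (''home'' in Bernhard's example). It only covers the first half of the behaviour Bernhard describes: it does not fall back to %%x:y%% when the start page is missing.

```php
<?php
// Hypothetical helper; $start stands in for DokuWiki's $conf['start'] setting.
function demo_add_start_page($link, $start) {
    // a link ending in ":" names a namespace, so point it at that namespace's start page
    if (substr($link, -1) === ':') {
        $link .= $start;
    }
    return $link;
}

echo demo_add_start_page('x:y:', 'home'), "\n"; // x:y:home
echo demo_add_start_page('x:y', 'home'), "\n";  // x:y
```

Using ''substr($link, -1)'' instead of indexing the last character avoids a warning if the link happens to be an empty string.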
----
* Version 2.4 released 2008-11-13. Andy Webber incorporated most of the fixes in comments below; in particular the security fix.
* Version 2.3 released 2006-06-07. Modified call-by-reference (moved from function call to function args). Added file: as an excluded link type. Allow spaces after opening
<?php
/**
* syntax ~~ORPHANSWANTED:<choice>[!<excluded namespaces>]~~ <choice> :: orphans | wanted | valid | all
* [!<excluded namespaces>] :: optional. prefix each with ! e.g., !wiki!comments:currentyear
* @license GPL 2 (http://www.gnu.org/licenses/gpl.html)
* @author Doug Edmunds <dae@douglasedmunds.com>
* Updated by Andy Webber to include comments from the DokuWiki plugin page up to 2008-11-10
*/
if(!defined('DOKU_INC')) define('DOKU_INC',realpath(dirname(__FILE__).'/../../').'/');
if(!defined('DOKU_PLUGIN')) define('DOKU_PLUGIN',DOKU_INC.'lib/plugins/');
require_once(DOKU_PLUGIN.'syntax.php');
require_once(DOKU_INC.'inc/search.php');
//-------------------------------------
//mod dae
function orph_handle_link(&$data, $link) {
$item = &$data["$link"];
if(isset($item)) {
// This item already has a member in the array
// Note that the file search found it
$item['links'] = 1 + $item['links']; // count the link
// echo " \n";
} else {
// Create a new entry
$data["$link"]=array('exists' => false, // Only found a link, not the file
'links' => 1);
// echo " \n";
}
}
function orph_fileNS($file) {
$path_probs = array("!^/!", "!/$!", "!/!", "!::!");
$replacements = array( "", "", ":", ":");
$x = strrpos($file, '/');
switch($x) {
case 0:
$result = "";
break;
default:
// replace all the / with : after dropping the file off the filename
$result = preg_replace($path_probs, $replacements, substr($file, 1, $x -1 ));
}
//echo "\n";
return $result;
}
function orph_Check_InternalLinks(&$data,$base,$file,$type,$lvl,$opts) {
if(!defined("LINK_PATTERN")) define("LINK_PATTERN", "%\[\[([^\]|#]*)(#[^\]|]*)?\|?([^\]]*)]]%"); // define once; this function runs for every page
if(preg_match("/.*\.txt$/", $file)) {
global $conf;
// echo " \n";
$body = @file_get_contents($conf['datadir'] . $file);
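// ignores entries in <nowiki>, %%, <code>, and emails with @
// the block patterns are non-greedy, so text between two separate blocks is kept
foreach ( array("/<nowiki>[\W\w]*?<\/nowiki>/",
"/%%.*%%/",
"/<code>[\W\w]*?<\/code>/" ,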
"/\[\[\\ *\\\\.*\]\]/" , //windows shares
"/\[\[\ *[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\..*\ *\]\]/" //email address with tags
//"/\[\[\ *[a-zA-Z0-9._-]+@[a-zA-Z0-9-]+\.[a-zA-Z.]+\ *\]\]/" //email address
//source http://www.sitepoint.com/article/regular-expressions-php
) as $ignored){
$body = preg_replace($ignored, "", $body);
}
$links = array();
preg_match_all(LINK_PATTERN, $body, $links);
foreach($links[1] as $link) {
if( (0 < strlen(ltrim($link)))
and ( $link[0] <> "/" )
and (!preg_match("/^\ *(https?|mailto|ftp|file):/", $link)) //mod 7 june 06: allow spaces before http, etc.
and (!preg_match("/^(.*)>/", $link))
and (strpos($link, "@") === false) ) { // strpos(haystack, needle): skip links containing @
// Try fixing the link...
//$link = preg_replace("![ ]!", "_", strtolower($link));
// need to fix the namespace?
if( $link[0] == ":" ) { // forced root namespace
$link = substr($link, 1);
//echo "\t\t\n";
} else {
if($link[0] == ".") { // forced relative namespace
// $link = preg_replace("!::!", ":",orph_fileNS($file) . ":" . substr($link, 1));
$link = resolve_id(orph_fileNS($file),$link);
//echo "\t\t\n";
} else if(strpos($link,':') === false) {
$link = preg_replace("!::!", ":",orph_fileNS($file) . ":" . $link);
//echo "\t\t\n";
}
} // namespace fix
if( $link[strlen($link)-1] == ":" ) {
$link .= $conf["start"];
}
// looks like an ID?
$link = cleanID($link);
if((strlen(ltrim($link)) > 0) // there IS an id?
and (auth_quickaclcheck($link) >= AUTH_READ)) { // should be visible to user
// and (!preg_match("/^(http|mailto):/", $link)) // URL
// and (!preg_match("/^(.*)>/", $link))) { // interwiki
//check ACL
//echo " \n";
//dae mod
//orph_handle_link(&$data, $link);
orph_handle_link($data, $link); }
} // link is not empty?
} // end of foreach link
}
}
function orph_report_table($data, $page_exists, $has_links, $params_array) {
global $conf;
$show_heading = ($page_exists && $conf['useheading']); // always defined, so no notice when false
//take off $params_array[0];
$exclude_array = array_slice($params_array,1);
$count = 1;
$output = '';
// for valid html - need to close the <p> that is fed before this
$output .= '
';
$output .= " # ID " .
($show_heading ? "Title " : "" ) . "Links \n";
arsort($data);
foreach($data as $id=>$item) {
if(($item["exists"] == $page_exists) and (($item["links"] <> 0)== $has_links)) {
// $id is a string, looks like this: page, namespace:page, or namespace::page
$match_array = explode(":", $id);
//remove last item in array, the page identifier
$match_array = array_slice($match_array, 0, -1);
//put it back together
$page_namespace = implode (":", $match_array);
//add a trailing :
$page_namespace = $page_namespace . ':';
//set it to show, unless blocked by exclusion list
$show_it = true;
foreach ($exclude_array as $exclude_item){
//add a trailing : to each $item too
$exclude_item = $exclude_item . ":";
// need === to avoid boolean false
// strpos(haystack, needle)
// if exclusion is beginning of page's namespace , block it
if (strpos($page_namespace, $exclude_item) === 0){
//there is a match, so block it
$show_it = false;
}
}
if ($show_it) {
$output .= "$count " .
$id ." " .
($show_heading ? "" . hsc(p_get_first_heading($id)) ." " : "" ) .
"" . $item["links"] .
($has_links ? " : Show backlinks" : "" ) .
" \n";
$count++;
}
}
}
//close the html table
$output .= "
\n";
//for valid html - need to reopen a <p>
$output .= "
";
return $output;
}
function orph_search_wanted(&$data,$base,$file,$type,$lvl,$opts) {
if($type == 'd'){
return true; // recurse all directories, but we don't store namespaces
}
if(!preg_match("/.*\.txt$/", $file)) { // Ignore everything but TXT
return true;
}
// search the body of the file for links
// dae mod
// orph_Check_InternalLinks(&$data,$base,$file,$type,$lvl,$opts);
orph_Check_InternalLinks($data,$base,$file,$type,$lvl,$opts);
// get id of this file
$id = pathID($file);
//check ACL
if(auth_quickaclcheck($id) < AUTH_READ) {
return false;
}
// try to avoid making duplicate entries for forms and pages
$item = &$data["$id"];
if(isset($item)) {
// This item already has a member in the array
// Note that the file search found it
$item['exists'] = true;
} else {
// Create a new entry
$data["$id"]=array('exists' => true,
'links' => 0);
}
return true;
}
// --------------------
/**
* All DokuWiki plugins to extend the parser/rendering mechanism
* need to inherit from this class
*/
class syntax_plugin_orphanswanted extends DokuWiki_Syntax_Plugin {
/**
* return some info
*/
function getInfo(){
return array(
'author' => 'Doug Edmunds',
'email' => 'dae@douglasedmunds.com',
'date' => '2006-06-07',
'name' => 'OrphansWanted Plugin ver 2.3',
'desc' => 'Find orphan pages and wanted pages .
syntax ~~ORPHANSWANTED:<choice>[!<excluded namespaces>]~~ .
<choice> :: orphans|wanted|valid|all .
<excluded namespaces> are optional, start each namespace with !' ,
'url' => 'http://wiki.splitbrain.org/plugin:orphanswanted',
);
}
/**
* What kind of syntax are we?
*/
function getType(){
return 'substition';
}
/**
* What about paragraphs?
*/
function getPType(){
return 'normal';
}
/**
* Where to sort in?
*/
function getSort(){
return 990; //was 990
}
/**
* Connect pattern to lexer
*/
function connectTo($mode) {
$this->Lexer->addSpecialPattern('~~ORPHANSWANTED:[0-9a-zA-Z:!]+~~',$mode,'plugin_orphanswanted');
}
/**
* Handle the match
*/
function handle($match, $state, $pos, &$handler){
$match_array = array();
$match = substr($match,16,-2); //strip ~~ORPHANSWANTED: from start and ~~ from end
// Wolfgang 2007-08-29 suggests commenting out the next line
$match = strtolower($match);
//create array, using ! as separator
$match_array = explode("!", $match);
// $match_array[0] will be orphan, wanted, valid, all, or syntax error
// if there are excluded namespaces, they will be in $match_array[1] .. [x]
// this return value appears in render() as the $data param there
return $match_array;
}
/**
* Create output
*/
function render($format, &$renderer, $data) {
global $INFO, $conf;
if($format == 'xhtml'){
// user needs to add ~~NOCACHE~~ manually to page, to assure ACL rules are followed
// coding here is too late, it doesn't get parsed
// $renderer->doc .= "~~NOCACHE~~";
// $data is an array
// $data[1]..[x] are excluded namespaces, $data[0] is the report type
//handle choices
switch ($data[0]){
case 'orphans':
$renderer->doc .= $this->orphan_pages($data);
break;
case 'wanted':
$renderer->doc .= $this->wanted_pages($data);
break;
case 'valid':
$renderer->doc .= $this->valid_pages($data);
break;
case 'all':
$renderer->doc .= $this->all_pages($data);
break;
default:
$renderer->doc .= "ORPHANSWANTED syntax error";
// $renderer->doc .= "syntax ~~ORPHANSWANTED:~~ :: orphans|wanted|valid|all Ex: ~~ORPHANSWANTED:valid~~";
}
return true;
}
return false;
}
// three choices
// $params_array used to extract excluded namespaces for report
// orphans = orph_report_table($data, true, false, $params_array);
// wanted = orph_report_table($data, false, true, $params_array);
// valid = orph_report_table($data, true, true, $params_array);
function orphan_pages($params_array) {
global $conf;
$result = '';
$data = array();
search($data,$conf['datadir'],'orph_search_wanted',array()); // the callback ignores its $opts argument
$result .= orph_report_table($data, true, false,$params_array);
return $result;
}
function wanted_pages($params_array) {
global $conf;
$result = '';
$data = array();
search($data,$conf['datadir'],'orph_search_wanted',array()); // the callback ignores its $opts argument
$result .= orph_report_table($data, false, true,$params_array);
return $result;
}
function valid_pages($params_array) {
global $conf;
$result = '';
$data = array();
search($data,$conf['datadir'],'orph_search_wanted',array()); // the callback ignores its $opts argument
$result .= orph_report_table($data, true, true, $params_array);
return $result;
}
function all_pages($params_array) {
global $conf;
$result = '';
$data = array();
search($data,$conf['datadir'],'orph_search_wanted',array()); // the callback ignores its $opts argument
$result .= "
Orphans
";
$result .= orph_report_table($data, true, false,$params_array);
$result .= "
Wanted
";
$result .= orph_report_table($data, false, true,$params_array);
$result .= "
Valid
";
$result .= orph_report_table($data, true, true, $params_array);
return $result;
}
}
?>
===== Discussion =====
Being able to find empty or nearly empty pages by size (say, all pages smaller than 150 bytes) would be good too, so that pages that are blank or have only a few characters can be cleaned up. I suppose with the [[plugin:blog:new | blog plugin]] people can tag pages as needing work, but I like your solution of automatically finding them.
> This plugin is about links and the lack of them, not file size. Use an FTP program, sort by size. The page files don't have any metadata, so the file size will give you all the info you need about near empty pages. [dae 2006-01-30]
----
Really nice plugin, but the wanted-list also lists file://-links (on Windows)//
> I have posted Version 2, which should prevent MSWindows shares from appearing in the results tables. [dae 2006-02-09]
----
I like this plugin and have been using it for administration quite a bit over the past few days. There are a couple of things that are keeping me from putting public links to the page that has the function, however.
- I have quite a few example links that I created on my page using % signs to prevent wiki formatting so I could show users how to make a link in a format that keeps things organized. The find wanted part of this plugin finds all of those examples that aren't really links.
- I am using mediawiki style discussion pages on my site based on the code here: [[tips:discussion]]. All of the discussion pages are seen as orphans, which they technically are since there are no wiki links to them, but they clutter the list. Is there a way to prevent the function from searching a particular namespace for orphans?
--- 2006-01-30 14:50cdt\\
> First issue: The code has been fixed to ignore pagenames that appear in nowiki, double %'s, and code section. Also, email addresses that appear in double square brackets are ignored. \\ 2006-02-03 \\
>Second issue: I made a small modification on my orphanswanted script to drop anything in the discussion namespace. In the function **orph_report_table** you have to add the stuff indicated by the two comments: \\
foreach($data as $id=>$item) {
$idknack = explode(':', $id); // Add this line
if(($item["exists"] == $page_exists) and (($item["links"] <> 0)== $has_links) and !($idknack[0] == 'discussion')) { // Add the third conditional
$output .= "$count " .
$id ." " .
($page_exists ? "" . p_get_first_heading($id) ." " : "" ) .
"" . $item["links"] . " \n";
$count++;
}
}
Now my discussion pages don't show up in the orphans list. -- Christopher Wellons (mosquitopsu -- gmail)
>SECOND ISSUE UPDATE: \\
I have posted Version 2 of the plugin. It now allows ANY namespace or namespace:subspace to be excluded from the results. \\
The code changes suggested above (which hardcode the 'discussion' namespace) are no longer necessary. \\
To exclude orphans appearing in the discussion namespace, use this line: \\
~~ORPHANSWANTED:orphans!discussion~~ \\
--- Doug Edmunds 2006-02-09 \\
Wow! This is perfect! I actually have two namespaces I want to eliminate and it works perfectly for that as well. Not only do I get rid of discussion pages, but also anything in the wiki namespace in order to eliminate user pages (wiki:users:*).
~~ORPHANSWANTED:orphans!discussion!wiki~~
--- Chris Wellons (mosquitopsu -- gmail)
>**BUG FIXED: Valid Dokuwiki mailto: links appear in the Wanted Pages list.**\\
>I have a link on one page: example Please email [[blah@blah.com?subject=Blah|John Blah]] . In the Wanted pages, it shows up like this: **blah_blah.com_subject_blah**. What's more, you can click on that page, and then when you click its title you don't see the backlink for it. -- Dean-Ryan Stone (dhry@dhryland.com) - 20060322 \\
I have updated the email regex to avoid this kind of email entry from appearing as a wanted page\\
--- Doug Edmunds 2006-04-02 \\
----
Bugs reported
1) I have following external link which generates an entry in the orphans list:
Link
[[ http://en.wikipedia.org/wiki/Repetitive_strain_injury | Repetitive Strain Injury ]]
Entry
http:en.wikipedia.org_wiki_repetitive_strain_injury
> **FIXED** version 2.3 \\
--- Doug Edmunds 2006-06-07 \\
2) the same problem is with file links like
[[file:///user/exampls.pdf]]
>**FIXED** version 2.3 (file: has been added to the types of links excluded) \\
--- Doug Edmunds 2006-06-07 \\
--- Markus 2006-04-03 \\
----
Current development version of Dokuwiki
I can't get it to work with the current development version. I installed it via the plugin manager. All that appears is the source code. Any idea what could be the problem? ---Martin 2006-05-9 \\
>I can't help you here. I am not working with the dev version. Try doing a manual install (download the zip, then upload it to the lib/plugins directory). If anyone else has a suggestion, pls post it. [dae 2006-05-30] \\
----
Info provided by email on 1 Sept 2006 from Reinhold Kainhofer:
I'm using the OrphansWanted DokuWiki plugin, and I just realized that you
hardcode the link URL rather than using DokuWiki's wl(..) method to generate
the Link URL (which also works correctly with the various URL rewrite
settings). Attached is a patch which fixes this "problem".
if ($show_it) {
- $output .= "$count " .
+ $output .= "$count " .
$id ." " .
($page_exists ? "" . p_get_first_heading($id) ." " : "" ) .
"" . $item["links"] . " \n";
> Thanks. I will look into this, and will work it into the next revision. If anyone else needs this fix now, make these changes to the orph_report_table function, around lines 152-153. --- Doug Edmunds 2006-09-01 \\
>>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
----
I have a problem preventing namespaces with an underscore in their name from showing in the list. Is this a bug, or must I do something special with the underscored namespace?
--- Matthias Pitzl 2006-11-03 \\
----
How about not only excluding namespaces, but also limiting the orphans generation to a certain namespace? That way, in a large wiki (such as ours) where different people are responsible for different namespaces, each person could find only their own orphans and wanted pages. - bernhard
----
Using the internal links plugin that resolves **bracket bracket @ something** into an internal link outside the wiki - those links also show up as wanted. Would be great if you could check for links to 'pages' starting with '@' - great plugin!
Nils
----
First: Great and very useful plugin. The only thing I considered annoying was that the entries weren't sorted by number of links. It makes it hard to find out which pages are really wanted. After reading through the source and DokuWiki Reference for two hours and trying to implement some mystical special sorting functions on my own (being a PHP newbie...), I found out that the simple command
arsort($data);
just before the foreach-loop in function ''orph_report_table'' did the trick. Just in case someone is interested ;-) --- //[[unixprog[a]gmail.com|kokstitan]] 2006-12-05 01:23//
>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
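''arsort()'' compares each whole entry (the ''exists'' flag as well as the count). To order on the link count alone, an explicit comparison can be used. The following is only a sketch: ''demo_sort_by_links'' is a hypothetical name and the sample data is made up in the shape the plugin collects.

```php
<?php
// Hypothetical helper: order entries by 'links', highest first, keeping the page IDs as keys.
function demo_sort_by_links(array $data) {
    uasort($data, function ($a, $b) {
        return $b['links'] - $a['links'];
    });
    return $data;
}

// Made-up sample data in the shape the plugin collects.
$data = array(
    'wiki:syntax' => array('exists' => true,  'links' => 2),
    'todo'        => array('exists' => false, 'links' => 5),
    'start'       => array('exists' => true,  'links' => 0),
);
print_r(array_keys(demo_sort_by_links($data))); // todo, wiki:syntax, start
```

''uasort()'' is used rather than ''usort()'' because the page IDs are the array keys and must survive the sort.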
----
Another idea for improvement: ''orph_report_table'' does not check whether ''$conf['useheading']'' is set; it just calls ''p_get_first_heading'' whenever the page exists. A simple check on whether both ''$page_exists'' and ''$conf['useheading']'' are true makes the empty Title column go away.
For example, one could add
global $conf;
if ($page_exists && $conf['useheading']) {
$show_heading = true;
}
to the beginning of ''orph_report_table'' and replace
($page_exists ? "Title " : "" )
with
($show_heading ? "Title " : "" )
and
($page_exists ? "" . p_get_first_heading($id) ." " : "" )
with
($show_heading ? "" . p_get_first_heading($id) ." " : "" )
Like this, the Title column is only shown, when useheading is set true. --- //[[unixprog[a]gmail.com|kokstitan]] 2006-12-05 02:10//
>>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
> Note that the JavaScript function "svchk()" no longer exists (and is now unnecessary) in Dokuwiki, and does cause JavaScript errors, so it should be removed.
>-- [[todd@rollerorgans.com]] 2007-02-26
>>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
----
This plugin works fine if you use lowercase namespaces, but if you use namespaces that start with a capital letter (in my case "Hilfe" etc.), it is impossible to exclude them with %%~~ORPHANSWANTED:valid!wiki!Diskussion!Hilfe~~%%. It is possible to change this behaviour: edit line 268 (third line in function handle):
$match = strtolower($match);
to
//$match = strtolower($match);
or just delete this line
-- Wolfgang 2007-08-29
----
**An exploit can be executed!**
Replace
($page_exists ? "" . p_get_first_heading($id) ." " : "" ) .
with:
($page_exists ? "" . htmlspecialchars(p_get_first_heading($id)) ." " : "" ) .
>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
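A short illustration of why the escaping matters, using a made-up heading. The listing above calls ''hsc()'', DokuWiki's shorthand for ''htmlspecialchars()''; this sketch calls the PHP function directly so that it runs on its own.

```php
<?php
// Made-up heading text such as a page author could type as a first heading.
$heading = '<script>alert("x")</script>';

// Unescaped, the markup would be sent to the browser as-is.
// Escaped, it is displayed as plain text instead of being executed.
$safe = htmlspecialchars($heading);
echo $safe, "\n"; // &lt;script&gt;alert(&quot;x&quot;)&lt;/script&gt;
```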
----
you wrote:
>False links: Paragraphs that start with two or more spaces (but not followed by an *) create an unformatted text box. The plugin does not exclude these.
>How or where the program parses two or more leading spaces remains a mystery to me. If you know, let me know.
I believe this happens inside inc/parser/parser.php at that place:
'monospace'=> array (
'entry'=>'\x27\x27(?=.*\x27\x27)',
'exit'=>'\x27\x27',
'sort'=>100
),
I'd be very pleased if those false links would no longer occur.
----
One bug annoys me: I get false entries in the orphaned list when using a relative path such as %%[[..:test:foobar]]%%
Any idea ?
--- Wally 2008-06-10
----
I've got the same bug, relative namespaces are not resolved correctly. I've made the following bugfix:\\
In syntax.php replace this line:
$link = preg_replace("!::!", ":",orph_fileNS($file) . ":" . substr($link, 1));
with this one:
$link = resolve_id(orph_fileNS($file),$link);
resolve_id is a standard dokuwiki call to resolve relative namespaces!
--- Luitzen 2008-09-10
>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
----
I've added some code to your orphanswanted plugin for DokuWiki. It would
be nice if you could include it into your release. I attached a diff of
the old and my code below. Furthermore you should change the author in
the code to yourself.
diff -ur orphanswanted-old/syntax.php orphanswanted/syntax.php
--- orphanswanted-old/syntax.php 2006-06-07 10:55:00.000000000 +0200
+++ orphanswanted/syntax.php 2008-09-17 15:14:24.000000000 +0200
@@ -119,7 +119,7 @@
// for valid html - need to close the that is feed before this
$output .= '
';
$output .= " # ID " .
- ($page_exists ? "Title " : "" ) .
"Links \n";
+ ($page_exists ? "Title " : "" ) .
"Links " . ($has_links ? " " : "" )."\n";
foreach($data as $id=>$item) {
@@ -152,8 +152,11 @@
$output .= "$count " .
$id ." " .
- ($page_exists ? "" .
p_get_first_heading($id) ." " : "" ) .
- "" . $item["links"] .
" \n";
+ ($page_exists ? "" .
htmlspecialchars(p_get_first_heading($id)) ." " : "" ) .
+ "" . $item["links"] . " " .
+ ($has_links ? "Backlinks " : "" ) .
+ "\n";
$count++;
}
Regards,
---Sascha Bendix (sent by email 2008-09-17)
>**FIXED** version 2.4 --- //Andy Webber 2008-11-13//
----
=== table gets out of design ===
Hello,
I'm using a sidebar design which leaves less space for the table generated by orphanswanted. The result is that the table runs off the page. Can I fix that somewhere, in CSS or in the plugin? (see [[http://wiki.survie.org/bug_orphanswanted.png|this screenshot]] to get the point) --- //NewMorning 2008/11/01 19:58//