====== Structured Data Plugin ======

---- plugin ----
description: Add and query structured data in your wiki
author     : Andreas Gohr
email      : andi@splitbrain.org
type       : syntax, action, helper
lastupdate : 2008-02-08
compatible : devel > 2007-12-21
depends    :
conflicts  :
similar    :
tags       : sqlite, data, tags, database, experimental
----

:!: This plugin is still under development. Currently it can only be pulled from my [[http://dev.splitbrain.org/darcsweb/darcsweb.cgi?r=dwplugins/data;a=summary|darcs repository]]. Patches are wanted and welcome!

  darcs get http://dev.splitbrain.org/darcs/dwplugins/data/

This plugin allows you to add structured data to any DokuWiki page. Think of this data as additional named attributes. Those attributes can then be queried and aggregated. The plugin is similar to what was done here for the [[plugins|repository plugin]], but its internals are very different from those of the [[plugins:repository]] plugin.

**This plugin requires the SQLite extension for PHP!** (It should be included with PHP5.)

===== Plugin Syntax =====

This plugin is made up of multiple parts, each having a similar syntax. The syntax defines a block with various key/value pairs that configure the behaviour of that part.

==== Data Entry (Input) ====

This part is used to add structured data to a page. All data entered here is tied to the page. Let's start with an example:

  ---- dataentry ----
  type          : web development
  volume        : 1 Mrd # how much do they pay?
  employees     : Joe, Jane, Jim
  customer_page : customers:microsoft
  deadline_dt   : 2009-08-17
  server_pages  : servers:devel01, extern:microsoft
  customer_url  : http://www.microsoft.com
  task_tags     : programming, coding, design, html
  ----

As you can see, the block is delimited by hyphens and the word ''dataentry''. You may add additional words after the ''dataentry'' keyword. Those will be added as additional CSS classes in the final HTML output. You can use this to style how different entry types are displayed later.
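To make the block layout concrete, here is a minimal sketch in Python of how such a block could be read. It is not the plugin's own code (the plugin is written in PHP). The function name ''parse_dataentry'', the extra class ''project'' and the returned structure are made up for this illustration, and the handling of the ''#'' comment is a simplification.

```python
EXAMPLE = """
---- dataentry project ----
type          : web development
volume        : 1 Mrd # how much do they pay?
employees     : Joe, Jane, Jim
customer_url  : http://www.microsoft.com
----
"""


def parse_dataentry(block: str) -> dict:
    """Split a dataentry block into CSS classes and raw key/value pairs.

    Hypothetical helper for illustration only, not the plugin's parser.
    """
    lines = [ln.strip() for ln in block.strip().splitlines()]
    # The first line looks like "---- dataentry extra words ----".
    header = lines[0].strip("- ").split()
    if not header or header[0] != "dataentry":
        raise ValueError("not a dataentry block")
    classes = header[1:]  # extra words become CSS classes
    data = {}
    for ln in lines[1:]:
        if set(ln) <= {"-"}:  # closing "----" line or an empty line
            continue
        # Simplification: everything after "#" is dropped as a comment,
        # which would also cut a value that legitimately contains "#".
        ln = ln.split("#", 1)[0]
        if ":" not in ln:
            continue
        # Split on the first colon only, so URLs keep their "http:" part.
        key, value = ln.split(":", 1)
        data[key.strip()] = value.strip()
    return {"classes": classes, "data": data}


result = parse_dataentry(EXAMPLE)
```

Here ''result["classes"]'' holds the extra words from the opening line and ''result["data"]'' maps each column name to its raw value string. Multi-value splitting and type handling are left out on purpose.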
You may use the ''#'' character to add comments to the block. Those will be ignored and will neither be displayed nor saved.

Inside the block you see //column names// and their //values//. There are a few rules for the column names:

  * Use any name you like
  * If the name ends with the ''s'' character, you may add multiple values separated by commas (like in the ''employees'' row)
  * Special //types// can be added to the name to have the output formatted accordingly. Use an underscore to separate //identifier// and //type//. The following //types// are currently available:
    * ''dt'' -- a date in the form YYYY-MM-DD, formatted as simple text, but the input is checked for correct format
    * ''page'' -- the entry is treated as Wiki [[pagename]] and will be linked in output
    * ''title'' -- like page, but an additional display title can be given, separated by a pipe
    * ''nspage'' -- like page, but the column name is treated as namespace for the link
    * ''url'' -- the value will be treated as external link
    * ''tag'' -- the values are linked to a page named after the column name, using the value as control filter for a data table
    * when no type is given, the value is treated as a simple string
  * When using a type, add the ''s'' for multi-values at the very end (like in the ''server_pages'' row)

==== Data Table (Output) ====

This syntax is used to aggregate the structured data attached to various pages in your wiki. It displays a configurable table with the data you want. The table can be sorted and filtered. Paging is supported as well. Let's start with an example again:

  ---- datatable ----
  cols    : %pageid%, employees, deadline_dt, volume
  headers : Details, Assigned Employees, Deadline, $$$
  max     : 10
  filter  : type=web development
  sort    : ^volume
  ----

The above config will display a table with all web development projects, the employees assigned to each project, the deadline and the volume. The table will be sorted by volume and will display a maximum of 10 projects.
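The names listed in ''cols'' follow the same column-name rules as in the Data Entry part. As a rough illustration of those rules, the Python sketch below splits a column name into identifier, type and a multi-value flag. It is not taken from the plugin. The name ''split_column'' is made up, and two details are assumptions: an unrecognised suffix after the underscore is treated as part of the name, and an untyped name keeps its trailing ''s''.

```python
# Types listed in the Data Entry section.
KNOWN_TYPES = {"dt", "page", "title", "nspage", "url", "tag"}


def split_column(name: str):
    """Return (identifier, type, multi) for a column name.

    Hypothetical helper; type is "" when the value is a simple string.
    """
    ident, sep, suffix = name.rpartition("_")
    if sep:
        if suffix in KNOWN_TYPES:
            return ident, suffix, False
        # With a type, the multi-value "s" comes at the very end.
        if suffix.endswith("s") and suffix[:-1] in KNOWN_TYPES:
            return ident, suffix[:-1], True
    # Assumption: anything else is an untyped name, kept as written.
    return name, "", name.endswith("s")


parsed = {col: split_column(col) for col in
          ["employees", "deadline_dt", "server_pages", "volume"]}
```

With the names from the examples, ''server_pages'' comes out as a multi-value ''page'' column with the identifier ''server'', while ''volume'' is a single plain string.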
So the keyword before the colon is a configuration option, and the value after it is the actual setting. To make the syntax more fault tolerant, several alternative names are often accepted for the same option. Here is a list of all available options:

^ Option(s) ^ Required? ^ Description ^
| cols\\ select | yes | These are the attributes you want to display. These are the same //names// you used in the Data Entry part |
| title\\ titles\\ head\\ header\\ headers | no | If specified, these names will be used in the table headers instead of the column names |
| max\\ limit | no | How many rows should be displayed. If more rows are available, the table will be made browsable. If not given, all matching rows are shown |
| sort\\ order | no | By what column should the table be sorted initially? Prepend a ''%%^%%'' to reverse the sorting |
| filter\\ where\\ filterand\\ and | no | Filter by a column value. You may specify this more than once; multiple filters will be ANDed. |
| filtror\\ or | no | Like filter, but multiple instances will be ORed |

For filtering, multiple comparators are possible:

^ Comparator ^ Meaning ^
| ''='' | Exact match |
| ''!='' or ''<>'' | Does not exactly match |
| ''<'' | Less than |
| ''<='' | Less than or equal to |
| ''>'' | Greater than |
| ''>='' | Greater than or equal to |
| ''~'' | Wildcard match. Use a ''*'' as wildcard, e.g. ''Apple*'' to match ''Apple Pie'' and ''Apple Computer'' |

This syntax will disable all caching for the current page!

==== Related Pages (Output) ====

This mode allows you to display a list of pages which are similar to the current page because they share some of the structured data. The columns used for the similarity comparison have to be given in the ''cols'' option. Additional filters and sorting options can be set. Here is an example:

  ---- datarelated ----
  cols  : task_tags, type
  title : Similar projects
  max   : 5
  sort  : ^volume
  ----

The shown config will look for pages which share values in the columns ''task'' and ''type''.
A maximum of 5 pages is shown, sorted by volume.

So the keyword before the colon is a configuration option, and the value after it is the actual setting. To make the syntax more fault tolerant, several alternative names are often accepted for the same option. Here is a list of all available options:

^ Option(s) ^ Required? ^ Description ^
| cols\\ select | yes | These are the attributes used to compare similarity. These are the same //names// you used in the Data Entry part |
| title | no | A descriptive title shown above the list. Defaults to "Related pages" |
| max\\ limit | no | The maximum number of matches to show |
| sort\\ order | no | By what column should the list be sorted? Prepend a ''%%^%%'' to reverse the sorting. This is a secondary sort. The list is always sorted by relevancy first. |
| filter\\ where\\ filterand\\ and | no | Filter by a column value. You may specify this more than once; multiple filters will be ANDed. |
| filtror\\ or | no | Like filter, but multiple instances will be ORed |

For filtering, multiple comparators are possible:

^ Comparator ^ Meaning ^
| ''='' | Exact match |
| ''!='' or ''<>'' | Does not exactly match |
| ''<'' | Less than |
| ''<='' | Less than or equal to |
| ''>'' | Greater than |
| ''>='' | Greater than or equal to |
| ''~'' | Wildcard match. Use a ''*'' as wildcard, e.g. ''Apple*'' to match ''Apple Pie'' and ''Apple Computer'' |

This mode will **not** disable caching for the page, so the list might not always be up to date.

==== Tag Cloud (Control) ====

This syntax will display the values of a given data name as a tag cloud. Each value will link back to the current page (unless configured otherwise). The page should also contain a Data Table. This table will then be filtered for all entries matching the selected tag.

Example:

  ---- datacloud ----
  field: employees
  min: 2
  limit: 20
  ----

The above code would display a cloud of employees assigned to at least two different projects. At most the 20 busiest employees are shown.
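The effect of ''min'' and ''limit'' in this example can be sketched in a few lines of Python. This is only an illustration of the counting idea, not the plugin's SQLite query. The page names, the sample data and the function ''build_cloud'' are invented for the sketch, and counting each value once per page is an assumption.

```python
from collections import Counter

# Hypothetical sample data: page id -> column name -> list of values.
PAGES = {
    "projects:alpha": {"employees": ["Joe", "Jane", "Jim"]},
    "projects:beta": {"employees": ["Joe", "Jane"]},
    "projects:gamma": {"employees": ["Joe"]},
}


def build_cloud(pages, field, min_count=0, limit=None):
    """Return (value, count) pairs for the cloud, most frequent first."""
    counts = Counter()
    for columns in pages.values():
        # Assumption: a value counts once per page, even if repeated.
        counts.update(set(columns.get(field, [])))
    # Drop values below the minimum count, keep the frequency ordering.
    tags = [(value, n) for value, n in counts.most_common() if n >= min_count]
    return tags if limit is None else tags[:limit]


cloud = build_cloud(PAGES, "employees", min_count=2, limit=20)
```

With this sample data only Joe (three projects) and Jane (two projects) remain in ''cloud'', because Jim is assigned to a single project and falls below ''min''.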
These are the possible options for the cloud:

^ Option(s) ^ Required? ^ Description ^
| field\\ select\\ col | yes | What attribute is used to build the cloud? |
| limit\\ max | no | Maximum number of tags to display. If not given, all will be displayed |
| min | no | Minimum count a tag must have. If not given, all will be shown |
| page\\ target | no | Give a page which contains the Data Table to control. If not given, the current page is used |

===== Missing Features =====

  * more control options:
    * attribute list (similar to tag cloud but as a simple list)
    * search field
  * better documentation, examples
  * more data types:
    * email
    * images?
  * links in the aggregate data page don't generate backlinks to the source data page

===== Discussion =====

Does the user have any control over the placement of the data box? If I wanted to position it to the right, what can I do? Also, I'd like more control over the plural data types. Right now you can control it only via the letter s. This doesn't fly in most other languages and should be changed IMHO. Oh, and one more thing. Why the dash-dash-dash-dash form? What about the following form that's more consistent with the other plugins, while still readable:

  foo : some data
  bar : some more data

Nevertheless, I love this plugin, I hope you continue developing it. ;) --- //[[primozDOTverdnikATgmail.com|Drye Kindrew]] 2008/05/13 09:21//

How would one go about refreshing the entire sqlite table? Simply deleting it from the cache directory is not enough. Do I have to click every page that uses the structured data in order to rebuild the table, or is there a better way to do this? --- //[[primozDOTverdnikATgmail.com|Drye Kindrew]] 2008/05/15 06:43//

> I'm wondering the same thing. Just renamed/moved some items and I need to refresh. Drye, did you figure it out? I was able to use a Mac SQLite [[http://www.sqlabs.net/download.php|app]] to tidy it up really quick, but it'd be easier to not have to do it manually.
> --- //Brett F 2008/05/30 09:57//

>> I wrote a little shell script in Linux, which first deletes the entire cache, and then systematically loads each and every wiki page. This approach is very dirty and I even had to modify some dokuwiki code in order to achieve it. A slightly nicer way would be to write a PHP script that does the same thing. It would be even nicer if the plugin itself were able to refresh the database. :) --- //[[primoz.verdnik@gmail.com|Drye Kindrew]] 2008/07/09 13:41//