====== MediaWiki ======

[[http://www.mediawiki.org/wiki/MediaWiki|MediaWiki]] is free and open source software for building and maintaining a wiki website. Using the MediaWiki Download function and the [[en:plugin:mediawikiplugin|MediaWikiPlugin]], you can mirror a MediaWiki website in a Greenstone collection.

===== MediaWiki Download =====

See the [[en:user:download|download]] page for general information on downloading records through Greenstone.

Greenstone can download HTML pages and associated files, such as stylesheets, from a given MediaWiki website, either from the GLI (in the Download panel) or from the [[en:user:download#downloading_from_the_command_line|command line]] (using the ''downloadfrom.pl'' script). Either way, you are presented with the following options:

^Option^Description^
|Source URL (''-url'')|(REQUIRED) Source URL. In case of HTTP redirects, this value may change.|
|Download Depth (''-depth'')|How many hyperlinks deep to go when downloading (Default: 0)|
|Only files below URL (''-below'')|Only mirror files below this URL|
|Only files within site (''-within'')|Only mirror files within the same site|
|Ignore URL patterns (''-reject_files'')|Comma-separated list of URL patterns to ignore, e.g. *cgi-bin*,*.ppt ignores hyperlinks that contain either 'cgi-bin' or '.ppt' (//Default: *action=*,*diff=*,*oldid=*,*printable*,*Recentchangeslinked*, Userlogin*,*Whatlinkshere*, *redirect*, *Special:*,Talk:*,Image:*,*.ppt,*.pdf,*.zip,*.doc//)|
|Exclude directories (''-exclude_directories'')|Comma-separated list of directories to exclude (each must be an absolute path), e.g. /people,/documentation excludes the 'people' and 'documentation' subdirectories of the site being crawled. (//Default: /wiki/index.php/Special:Recentchangeslinked, /wiki/index.php/Special:Whatlinkshere, /wiki/index.php/Talk:Creating_CD//)|

If downloading via the GLI, you can view the downloaded files on the Gather panel.
On the left-hand side of the panel, double-click the Downloaded Files folder to expand its contents. The subfolders are named after the URLs they were downloaded from. These files are physically stored in a temporary cache directory. You can build a collection from the downloaded files by dragging them across to the Collection section on the right-hand side of the Gather panel.

An example MediaWiki download on the command line would be:

  perl -S downloadfrom.pl -document_mode MediaWiki -url http://en.wikipedia.org/ -depth 1 -reject_files "*Recentchangeslinked*,Userlogin*,*Whatlinkshere*"

The pattern list is quoted and written without spaces so that the shell passes it to the script as a single argument and does not expand the asterisks itself. This would download files starting from the URL ''http://en.wikipedia.org/'' to a depth of one hyperlink, rejecting files with //Recentchangeslinked//, //Userlogin//, or //Whatlinkshere// in their URL, and excluding the default directories.

Files downloaded from MediaWiki sites are processed by the [[en:plugin:MediaWikiPlugin|MediaWikiPlugin]].
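
If you script downloads rather than typing the command by hand, one way to avoid shell quoting problems with the ''-reject_files'' patterns is to assemble the arguments as a list. The following is only a minimal sketch: the function name ''build_download_command'' and its parameter names are hypothetical and not part of Greenstone, and the flags are copied from the options table and the example command above rather than checked against ''downloadfrom.pl'' itself.

```python
import shlex


def build_download_command(url, depth=0, reject_files=(),
                           exclude_directories=(), below=False, within=False):
    """Assemble a downloadfrom.pl argument list for a MediaWiki download.

    Hypothetical helper: the flag names are taken from the options table
    and example command on this page, not from the script's source.
    """
    if not url:
        raise ValueError("a source URL is required (-url)")

    cmd = ["perl", "-S", "downloadfrom.pl",
           "-document_mode", "MediaWiki",
           "-url", url,
           "-depth", str(depth)]
    if below:
        cmd.append("-below")
    if within:
        cmd.append("-within")
    if reject_files:
        # Patterns are joined with commas and no spaces, so they form one argument.
        cmd += ["-reject_files", ",".join(p.strip() for p in reject_files)]
    if exclude_directories:
        cmd += ["-exclude_directories",
                ",".join(d.strip() for d in exclude_directories)]
    return cmd


if __name__ == "__main__":
    cmd = build_download_command(
        "http://en.wikipedia.org/",
        depth=1,
        reject_files=["*Recentchangeslinked*", "Userlogin*", "*Whatlinkshere*"],
    )
    # Print a shell-safe rendering of the command instead of running it.
    print(shlex.join(cmd))
```

Because the arguments are kept as a list, the list could be handed to ''subprocess.run'' without going through a shell, so the asterisks in the patterns would reach the script unchanged. Omitting ''reject_files'' or ''exclude_directories'' leaves those flags out, in which case the defaults listed in the table should apply.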