nzdl:projects
Differences
This shows you the differences between two versions of the page.
nzdl:projects [2017/11/05 22:43] – [Others] kjdon | nzdl:projects [2017/12/11 20:58] – [WikipediaMiner] kjdon | ||
---|---|---|---|
Line 16: | Line 16: | ||
- | =====Extracting data and metadata===== | + | ===== Extracting |
====Sequitur==== | ====Sequitur==== | ||
Line 26: | Line 26: | ||
[[http:// | [[http:// | ||
- | =====Text Mining===== | + | ==== Maui ==== |
- | See our Text Mining Webpage. ?? what link? http://www.cs.waikato.ac.nz/~nzdl/textmining/ | + | [[https://code.google.com/archive/p/maui-indexer/ |
+ | ==== Wikipedia Miner ==== | ||
+ | |||
+ | [[http:// | ||
=====Browsing interfaces===== | =====Browsing interfaces===== | ||
Line 66: | Line 69: | ||
===== Chinese Text Segmentation===== | ===== Chinese Text Segmentation===== | ||
- | |||
- | [[http:// | ||
- | |||
- | [[http:// | ||
Word segmentation finds word boundaries in languages such as Chinese and Japanese, which (unlike English) are written without spaces or other word delimiters apart from punctuation marks. It matters for any application that uses the word as its basic unit, because machine-readable Chinese text is invariably stored in unsegmented form. | Word segmentation finds word boundaries in languages such as Chinese and Japanese, which (unlike English) are written without spaces or other word delimiters apart from punctuation marks. It matters for any application that uses the word as its basic unit, because machine-readable Chinese text is invariably stored in unsegmented form. | ||
- | We have implemented a WWW interface for segmenting Chinese text. | + | We have implemented a WWW interface for segmenting Chinese text. A demo used to be available at www.nzdl.org/ |
- | If your web browser does not support Chinese text, [[http:// | + | (Note, the code can be found on community, in the chinese-text-segmenter |
- | Currently at [[http:// | + | |
More information can be found in the paper: [[https:// | More information can be found in the paper: [[https:// |
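
To make the idea of finding word boundaries in unsegmented text concrete, here is a minimal sketch of greedy forward maximum matching against a word list. This is a generic textbook technique chosen for illustration; it is not claimed to be the method behind the NZDL segmenter or the one described in the paper. The function name `segment`, the `max_len` parameter, and the toy `LEXICON` are all hypothetical.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching (illustrative only).

    At each position, take the longest substring (up to max_len
    characters) that appears in the lexicon; if none matches,
    emit a single character and move on.
    """
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in lexicon:
                words.append(candidate)
                i += size
                break
    return words


# Hypothetical toy lexicon, for illustration only.
LEXICON = {"我们", "喜欢", "图书馆", "图书", "数字"}

if __name__ == "__main__":
    # Print the segmented words separated by spaces.
    print(" ".join(segment("我们喜欢数字图书馆", LEXICON)))
```

Because the match is greedy, "图书馆" is preferred over the shorter entry "图书" when both are in the lexicon. Real segmenters need a far larger lexicon or a statistical model to handle ambiguous and unknown words.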
nzdl/projects.txt · Last modified: 2023/03/13 01:46 by 127.0.0.1