Inference Group

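The Dasher concept is compatible with almost any language. It currently supports several European languages as well as Japanese. To use Dasher in a non-English European language, you need a text file of natural writing in your language to train Dasher; put this file at input/sourceinput/source.txt. Make sure the "Word" option is switched off, or replace the input/dict file with a dictionary for your language.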

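Once version 3 is released, we intend, with the help of the Open Source community, to work on increasing the number of languages Dasher can handle. [Version 3 will use Unicode.] With version 3, as with version 1.6, each language needs a text file of natural writing (about 300K or larger).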
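Before handing a candidate training text to Dasher, it can help to check its size and which symbols it contains. The following is a minimal sketch, not part of Dasher: the function name `summarize_training_text` is hypothetical, the file is assumed to be UTF-8, and "300K" is read as roughly 300,000 bytes.

```python
import os
from collections import Counter

# Assumption: "about 300K" is interpreted as roughly 300,000 bytes.
MIN_BYTES = 300_000


def summarize_training_text(path, min_bytes=MIN_BYTES, encoding="utf-8"):
    """Report the size and symbol statistics of a candidate training file.

    Hypothetical helper for illustration; Dasher itself does not provide it.
    """
    size = os.path.getsize(path)
    with open(path, encoding=encoding) as f:
        text = f.read()
    counts = Counter(text)
    return {
        "bytes": size,
        "large_enough": size >= min_bytes,
        "distinct_symbols": len(counts),
        # The most frequent symbols give a quick sanity check that the
        # text really is in the intended language and alphabet.
        "most_common": counts.most_common(5),
    }
```

A file that is large enough but contains many unexpected symbols (markup, digits, another script) is probably not natural writing in the target language and would train the language model poorly.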

More advice on building a training set


JDasher

daishoya - JDasher - Japanese Dasher - DAISHOYA

The Japanese name for Dasher is Daishoya, which means 'scribe'.

Demonstration
A movie describing Daishoya in Japanese.

As a first step towards a full Japanese version of Dasher handling both Kana and Kanji, David Ward has written a Hiragana version, available in version 1.6.3 of Dasher. (NB: later Windows versions of Dasher, such as 1.6.8, do not support Hiragana because of Tcl font problems; the Linux version of 1.6.8 works fine in Hiragana.)

The conversion of Dasher to Daishoya is simple: we replace the English alphabet a..z by the Hiragana alphabet, aiueo... (a,i,u,e,o, ka,ki,ku,ke,ko,...); and we replace the English training text by a Hiragana document. [Unfortunately, we have not been able to find a large pure-Hiragana document, so our language model is not as well-trained as we would like.]
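One rough way to obtain Hiragana-only training text from ordinary Japanese text is to keep Hiragana, fold Katakana onto Hiragana, and drop everything else. The sketch below is an illustration under stated assumptions, not the procedure used for Daishoya: the function name `to_pure_hiragana` is hypothetical, and the approach simply discards Kanji instead of converting them to their readings, so the result is only an approximation of natural Hiragana writing.

```python
# Unicode ranges for the basic kana letters.
HIRAGANA_START, HIRAGANA_END = 0x3041, 0x3096
KATAKANA_START, KATAKANA_END = 0x30A1, 0x30F6
# Each Katakana letter in this range sits 0x60 above its Hiragana twin.
KANA_OFFSET = 0x60


def to_pure_hiragana(text):
    """Keep Hiragana, fold Katakana to Hiragana, drop all other symbols.

    Hypothetical helper. Newlines are kept so line structure survives;
    Kanji, Latin letters, digits and punctuation are removed.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if HIRAGANA_START <= cp <= HIRAGANA_END:
            out.append(ch)
        elif KATAKANA_START <= cp <= KATAKANA_END:
            out.append(chr(cp - KANA_OFFSET))
        elif ch == "\n":
            out.append(ch)
    return "".join(out)
```

For example, `to_pure_hiragana("カタカナ")` gives `"かたかな"`. A better training text would replace each Kanji with its reading, which needs a dictionary or morphological analyser and is outside the scope of this sketch.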

Two orderings of the Hiragana alphabet are available (options "japan1" and "japan2"). In "japan2" the diacritical marks (dakuten ゛ and handakuten ゜) are included as separate characters; in "japan1" they are integrated by including the marked characters such as pa and ba (ぱ, ば) in the alphabet.
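The difference between the two treatments of diacritical marks can be illustrated with Unicode normalization. This is a sketch only: it does not reproduce the actual "japan1" and "japan2" alphabet files, the function names `split_marks` and `join_marks` are hypothetical, and the choice of the spacing marks U+309B and U+309C for the separate symbols is an assumption.

```python
import unicodedata

# Spacing (standalone) forms of the two marks.
DAKUTEN, HANDAKUTEN = "\u309B", "\u309C"
# Combining forms produced by canonical decomposition (NFD).
COMBINING_TO_SPACING = {"\u3099": DAKUTEN, "\u309A": HANDAKUTEN}
SPACING_TO_COMBINING = {v: k for k, v in COMBINING_TO_SPACING.items()}


def split_marks(text):
    """Marks as separate symbols: each marked kana becomes base + mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(COMBINING_TO_SPACING.get(ch, ch) for ch in decomposed)


def join_marks(text):
    """Marks integrated: base + mark is recombined into one kana."""
    combining = "".join(SPACING_TO_COMBINING.get(ch, ch) for ch in text)
    return unicodedata.normalize("NFC", combining)
```

With separate marks the alphabet is smaller, but a marked syllable costs two symbols; with integrated marks each syllable is one symbol from a larger alphabet. Training text must be written in whichever convention the chosen alphabet uses, so a conversion of this kind would be applied to the training file as well.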

We would welcome collaborators to help test Daishoya and introduce it to a large population of users.

We also need Hiragana data, in text form, for training the language model.


The Inference Group is supported by the Gatsby Foundation
and by a partnership award from IBM Zurich Research Laboratory
David MacKay
Site last modified Fri Oct 1 10:33:25 BST 2010