Chainer implementation of next-word estimation from the current words.
(http://qiita.com/ixixi/items/a3d56b2db6e09249a519)
This model is trained on niconico-douga comments and estimates a complete comment from a comment fragment. For example:
| Input | Output |
|---|---|
| ┗(^ | ┗(^o^ )┓三 |
| 日本語 | 日本語でおk |
| /hi | /hidden |
| おっく | おっくせんまん!おっくせんまん! |
| らんら | らんらんるー |
| ξ*・ | ξ*・ヮ・* |
| わっふ | わっふるわっふる |
| かわい | かわいいww |
A more detailed description is available at http://www.monthly-hack.com/
- Python 2.7 or 3.4
- Chainer 1.16.0
Download the data from NII:
Niconico Douga comment data (http://www.nii.ac.jp/dsc/idr/nico/nicocomm-apply.html)
Extract the latest 10 comments from each niconico-douga video.
This outputs 'last10comments.pkl'.
$ python maesyori1.py /path/to/data/thread/
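The extraction step can be pictured roughly as follows. This is a minimal sketch, not the code in maesyori1.py: the `(video_id, timestamp, text)` record layout and the `latest_comments` helper are assumptions made for illustration, and the real NII thread files would need their own parser.

```python
import pickle
from collections import defaultdict, deque


def latest_comments(records, n=10):
    """Keep the n most recent comments per video.

    `records` is an iterable of (video_id, timestamp, text) tuples. This
    layout is an assumption for illustration, not the NII file format.
    """
    per_video = defaultdict(lambda: deque(maxlen=n))
    # Process in timestamp order so each deque ends up with the newest n.
    for video_id, _, text in sorted(records, key=lambda r: r[1]):
        per_video[video_id].append(text)
    # Convert to plain dicts and lists so the result can be pickled.
    return {vid: list(q) for vid, q in per_video.items()}


records = [("sm1", t, "comment %d" % t) for t in range(15)]
records.append(("sm2", 0, "only one"))
result = latest_comments(records)

# Serialize as a pickle, matching the '.pkl' output described above.
payload = pickle.dumps(result)
```

A bounded `deque` discards the oldest entry automatically, so memory stays proportional to the number of videos rather than the number of comments.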
Randomly sample comments from the extracted data.
This outputs 'sample_texts.pkl' and 'sample_vocab.pkl'.
$ python maesyori2.py
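As a rough illustration of what sampling and vocabulary building might look like, here is a minimal sketch. It is not the code in maesyori2.py: the character-level vocabulary, the `<eos>` marker, and the `sample_and_build_vocab` helper are all assumptions.

```python
import random


def sample_and_build_vocab(comments, k, seed=0):
    """Pick k comments at random and map each character to an integer ID."""
    rng = random.Random(seed)  # fixed seed keeps the sample repeatable
    texts = rng.sample(comments, k)
    vocab = {"<eos>": 0}  # hypothetical end-of-comment marker
    for text in texts:
        for ch in text:
            # A new character gets the next unused ID.
            vocab.setdefault(ch, len(vocab))
    return texts, vocab


comments = ["日本語でおk", "らんらんるー", "わっふるわっふる", "かわいいww"]
texts, vocab = sample_and_build_vocab(comments, k=2)
```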
Train an LSTM on the comment data.
This outputs '/result/lstm_model.npz'.
$ python nico_lstm.py -g <gpu_id>
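A next-token model of this kind is usually trained on (current token, next token) pairs. The sketch below shows one way such pairs could be formed; it assumes character-level tokens and a hypothetical `<eos>` marker, and it does not reproduce how nico_lstm.py actually encodes its input.

```python
def next_char_pairs(text, vocab, eos="<eos>"):
    """Return (input_id, target_id) pairs for next-character prediction."""
    ids = [vocab[ch] for ch in text] + [vocab[eos]]
    # Each position is paired with the ID that follows it.
    return list(zip(ids[:-1], ids[1:]))


vocab = {"<eos>": 0, "a": 1, "b": 2}  # toy vocabulary for illustration
pairs = next_char_pairs("aba", vocab)
```

The final pair targets `<eos>`, which is what lets a trained model learn where a comment ends.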
Generate comments interactively.
Enter a comment fragment at the prompt.
$ python play_lstm.py
>> ┗(^
┗(^o^ )┓三
>> 日本語
日本語でおk
>>
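The completion behaviour shown in the session can be sketched as a greedy loop that keeps appending the predicted next character until an end marker appears. This is an illustration only: `complete` and `toy_predict_next` are hypothetical names, and a lookup table stands in for the trained LSTM so that the example runs without Chainer.

```python
def complete(prefix, predict_next, eos="<eos>", max_len=30):
    """Greedily extend prefix until predict_next returns eos or max_len is hit."""
    out = prefix
    while len(out) < max_len:
        nxt = predict_next(out)
        if nxt == eos:
            break
        out += nxt
    return out


# Toy stand-in for the trained model: a lookup table of known comments.
KNOWN = ["日本語でおk", "/hidden"]


def toy_predict_next(current):
    """Return the next character of the first known comment that extends current."""
    for comment in KNOWN:
        if comment.startswith(current) and len(comment) > len(current):
            return comment[len(current)]
    return "<eos>"


print(complete("日本語", toy_predict_next))
print(complete("/hi", toy_predict_next))
```

With a real model, `predict_next` would feed the current characters through the network and return the most likely next character; the `max_len` cap guards against a model that never emits the end marker.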