Hello,
I hit an error while running
python .\train.py --config .\config\doc_ner_best.yaml --batch_size 1 --parse --target_dir .\datasets\mytest --keep_order
on Windows 10, Python 3.7, no GPU.
Here is the error message:
2022-07-28 14:35:50,789 Reading data from datasets\mytest
2022-07-28 14:35:50,789 Train: datasets\mytest\doc_train.txt
2022-07-28 14:35:50,789 Dev: None
2022-07-28 14:35:50,791 Test: None
Traceback (most recent call last):
File ".\train.py", line 345, in <module>
train_eval_result, train_loss = student.evaluate(loader,out_path=Path('outputs/train.'+config.config['model_name']+'.'+tar_file_name+'.conllu'),embeddings_storage_mode="none",prediction_mode=True)
File "C:\Users\ebb\ACE\flair\models\sequence_tagger_model.py", line 2218, in evaluate
features = self.forward(batch,prediction_mode=prediction_mode)
File "C:\Users\ebb\ACE\flair\models\sequence_tagger_model.py", line 818, in forward
self.embeddings.embed(sentences,embedding_mask=self.selection)
File "C:\Users\ebb\ACE\flair\embeddings.py", line 184, in embed
embedding.embed(sentences)
File "C:\Users\ebb\ACE\flair\embeddings.py", line 97, in embed
self._add_embeddings_internal(sentences)
File "C:\Users\ebb\ACE\flair\embeddings.py", line 2962, in _add_embeddings_internal
self._add_embeddings_to_sentences(sentences)
File "C:\Users\ebb\ACE\flair\embeddings.py", line 3051, in _add_embeddings_to_sentences
subtokenized_sentence = self.tokenizer.tokenize(tokenized_string)
AttributeError: 'NoneType' object has no attribute 'tokenize'
The error is triggered by this line: https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L3041
because self.tokenizer is None.
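To show what I think is happening, here is a minimal sketch that reproduces the same AttributeError outside of ACE. It is not ACE or flair code: `FakeEmbedding` and `check_tokenizer` are hypothetical names I made up for illustration, and I am only assuming that the real embedding class ends up with `tokenizer` left as None when its tokenizer is never loaded.

```python
class FakeEmbedding:
    """Hypothetical stand-in for an embedding class; NOT the real ACE/flair code."""

    def __init__(self, tokenizer=None):
        # Assumption: if the tokenizer is never loaded, the attribute stays None.
        self.tokenizer = tokenizer

    def embed(self, text):
        # Same call shape as the failing line in the traceback.
        return self.tokenizer.tokenize(text)


def check_tokenizer(embedding):
    """Hypothetical early check: fail with a readable message before embed() runs."""
    if getattr(embedding, "tokenizer", None) is None:
        raise RuntimeError(
            f"{type(embedding).__name__} has no tokenizer; it was probably never loaded"
        )


emb = FakeEmbedding()  # tokenizer deliberately left as None
try:
    emb.embed("Amazon predict Paypal")
except AttributeError as err:
    error_message = str(err)
    print(error_message)
```

A check like `check_tokenizer` placed right after the embeddings are constructed would probably point at the place where the tokenizer fails to load, rather than at the first call to `tokenize`.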
Any suggestions on how to debug this issue? Thanks.
Btw, doc_train.txt contains the following gibberish:
-DOCSTART- O
Amazon O
predict O
Paypal O
and O
do O
7-11 O
for O
Canada O
and O
Hongkong O