I think avgdl should be saved as an attribute after fitting so it's not estimated again if transform is called for one document instead of the 'training' corpus.
So
fit(X).transform(X)
makes sense because all documents in X use the same avgdl, but
transform(other_document)
then estimates avgdl again for this document alone.
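Roughly what I have in mind, as a minimal sketch rather than the actual implementation: the class name BM25Sketch, the attribute name avgdl_, the k1/b defaults and the list-of-term-counts input are all assumptions made for illustration, and the IDF part is left out to keep the focus on avgdl.

```python
class BM25Sketch:
    """Hypothetical BM25-style transformer (term-frequency part only)."""

    def __init__(self, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b

    def fit(self, X):
        # X: list of documents, each a list of raw term counts.
        lengths = [sum(row) for row in X]
        # Estimate the average document length once, on the fitting corpus,
        # and keep it as an attribute (assumed name: avgdl_).
        self.avgdl_ = sum(lengths) / len(lengths)
        return self

    def transform(self, X):
        # Reuse the stored avgdl_ instead of re-estimating it from X.
        out = []
        for row in X:
            dl = sum(row)
            norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl_)
            out.append([tf * (self.k1 + 1) / (tf + norm) for tf in row])
        return out


corpus = [[2, 0, 1], [1, 1, 5], [0, 3, 0]]
other_document = [[1, 0, 0]]

model = BM25Sketch().fit(corpus)
weights_corpus = model.transform(corpus)        # uses the corpus avgdl
weights_other = model.transform(other_document)  # still uses the corpus avgdl
```

With this, transform(other_document) normalizes the single document's length against the avgdl of the fitting corpus, not against its own length.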