Right now, FlockWatch identifies trending terms by comparing raw frequency counts from two time windows, t1 and t2. If t2 contains many more messages than t1, FlockWatch will flag a lot of terms as trending simply because more messages mean more opportunities for each term to appear.
Should FlockWatch compare frequency rates (counts normalized by the number of messages in each time window) instead of raw frequency counts?
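
To make the proposal concrete, here is a minimal Python sketch of rate-based comparison. It is not FlockWatch's actual code: the function names (`term_counts`, `term_rates`, `trending_by_rate`), whitespace tokenization, and the `min_ratio` threshold are all assumptions made up for illustration.

```python
from collections import Counter


def term_counts(messages):
    """Raw term counts over a list of message strings (naive whitespace tokens)."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


def term_rates(messages):
    """Occurrences per message: raw count divided by the number of messages."""
    n = len(messages)
    if n == 0:
        return {}
    return {term: count / n for term, count in term_counts(messages).items()}


def trending_by_rate(t1_messages, t2_messages, min_ratio=2.0):
    """Terms whose per-message rate in t2 is at least min_ratio times the t1 rate.

    Assumption for this sketch: terms that never appear in t1 are skipped,
    since their rate ratio is undefined without some form of smoothing.
    """
    rates_t1 = term_rates(t1_messages)
    rates_t2 = term_rates(t2_messages)
    return sorted(
        term
        for term, rate in rates_t2.items()
        if term in rates_t1 and rate / rates_t1[term] >= min_ratio
    )


# t2 is just t1 repeated five times: raw counts grow 5x, rates do not change.
t1 = ["storm warning", "nice day"]
t2 = t1 * 5
print(term_counts(t1)["storm"], term_counts(t2)["storm"])  # raw counts: 1 vs 5
print(trending_by_rate(t1, t2))  # no term's rate changed, so nothing is flagged
```

With raw counts, every term in this example would look five times more frequent in t2; with rates, a term is flagged only if it makes up a larger share of the window's messages.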