Option 1: websockets
Option 2: built-in streaming using server-sent events
http://flask.pocoo.org/docs/0.12/patterns/streaming/
https://stackoverflow.com/a/13388915
In both cases, we probably want to decode the whole context so far at each yield (i.e. enc.decode(context)) rather than just the most recently predicted token, to avoid accidentally splitting a multi-byte Unicode character across tokens
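Rough sketch of the decode-the-whole-context idea for Option 2, standard library only. Assumptions: decode() is a hypothetical stand-in for enc.decode that treats each token as a chunk of raw UTF-8 bytes, and sse_events() plus the JSON payload shape are made up for illustration. Each event carries the full text so far, so a character split across two tokens shows up briefly as a replacement character and is corrected by the next event.

```python
import json


def decode(tokens):
    """Hypothetical stand-in for enc.decode: tokens are raw UTF-8 byte chunks."""
    return b"".join(tokens).decode("utf-8", errors="replace")


def sse_events(token_iter):
    """Yield one server-sent event per token, carrying the full text so far."""
    context = []
    for token in token_iter:
        context.append(token)
        # Re-decode the whole context rather than only the newest token, so
        # bytes of a split character are joined once the rest arrives.
        payload = json.dumps({"text": decode(context)})
        # SSE framing: a "data:" line terminated by a blank line. JSON
        # encoding keeps raw newlines out of the data field.
        yield f"data: {payload}\n\n"


# "é" is two bytes in UTF-8 (0xC3 0xA9), split here across two tokens.
tokens = [b"caf", b"\xc3", b"\xa9"]
events = list(sse_events(tokens))
```

With Flask, a generator like this could presumably be wrapped as Response(sse_events(...), mimetype="text/event-stream"), following the streaming pattern in the Flask docs linked above. The client would then replace its displayed text with each event's "text" field instead of appending to it.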