2025-03-28

prerequisite

execution

python3 kings.py -i path/to/input.wav

or

python kings.py -i path/to/input.wav

output.mp4 is placed in data/ once it has been generated.

workflow

kings.py is the entry point of the application. here's the workflow:

  1. finds the input file and converts it to two audio files: one at 16000 hz (for the google api) and one at 44100 hz (for processing).
  2. converts the 16000 hz file to mono, sends it to google, and saves the response as json (data/speech.json). i don't think it uses a vtt file anymore.
  3. calls kings.pde, which uses the 44100 hz audio file and the transcript json to generate the video. at this point, the video has no audio.
  4. back in kings.py, merges the muted video with the 44100 hz audio file to produce output.mp4.
  5. removes all the intermediate files.
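
the audio steps of this workflow (1, 2, 4 and 5) could be sketched roughly as below. this is not the actual kings.py code: it assumes ffmpeg does the conversion and merging, which this readme does not state, and the intermediate file names (audio_16k.wav, audio_44k.wav, muted.mp4) and all function names are made up for illustration. only data/speech.json and output.mp4 come from the description above. the functions only build the command lists, they do not run anything.

```python
from pathlib import Path

DATA_DIR = Path("data")

# hypothetical intermediate file names; the real ones in kings.py may differ
AUDIO_16K = DATA_DIR / "audio_16k.wav"
AUDIO_44K = DATA_DIR / "audio_44k.wav"
MUTED_VIDEO = DATA_DIR / "muted.mp4"
OUTPUT = DATA_DIR / "output.mp4"


def resample_cmd(src, dst, rate, mono=False):
    """build an ffmpeg command that resamples src to `rate` hz (assumes ffmpeg)."""
    cmd = ["ffmpeg", "-y", "-i", str(src), "-ar", str(rate)]
    if mono:
        cmd += ["-ac", "1"]  # downmix to a single channel
    cmd.append(str(dst))
    return cmd


def merge_cmd(video, audio, dst):
    """build an ffmpeg command that adds the audio track to the muted video."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-i", str(audio),
        "-c:v", "copy",   # keep the video stream as-is
        "-c:a", "aac",    # encode the audio for the mp4 container
        "-shortest",      # stop at the shorter of the two streams
        str(dst),
    ]


def plan(input_wav):
    """return the ffmpeg commands for steps 1, 2 and 4, in order."""
    return [
        resample_cmd(input_wav, AUDIO_16K, 16000, mono=True),  # copy for the google api
        resample_cmd(input_wav, AUDIO_44K, 44100),             # copy for processing
        # step 3 (kings.pde renders the muted video) happens here; not sketched
        merge_cmd(MUTED_VIDEO, AUDIO_44K, OUTPUT),
    ]


def cleanup(paths=(AUDIO_16K, AUDIO_44K, MUTED_VIDEO)):
    """step 5: delete intermediate files, ignoring ones that do not exist.

    whether data/speech.json also counts as intermediate is not stated,
    so it is left out here.
    """
    for path in paths:
        Path(path).unlink(missing_ok=True)


if __name__ == "__main__":
    for cmd in plan("path/to/input.wav"):
        print(" ".join(cmd))
```

to actually run a command from the plan, pass it to subprocess.run(cmd, check=True). building the commands as lists avoids shell quoting problems with paths that contain spaces.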