DRAFT - Bootstrap Accelerator - Untested #115
RuntimeRacer wants to merge 3 commits into lifeiteng:main from
Conversation
@RuntimeRacer Very nice work, can you verify it on multi-GPUs?
@lifeiteng Yes, I will do a multi-GPU run once I've finished training the current epoch on a single GPU. I had to fix some issues around #113 and #110, which caused epoch 1 to never finish until I stripped languages with non-Latin charsets from my training data. I expect epoch 1 to finish tonight or tomorrow; then I will test the accelerator code.
@lifeiteng I did a first couple of tests over the last two hours; however, I am hitting a wall when it comes to splitting the dataloaders across the GPUs. In its preparation step, Accelerate assumes fixed batch sizes and a known number of elements in the dataset; Lhotse, however, uses its own custom implementation to feed in data dynamically. I can't continue looking into this right now, but let me know in case you have any suggestions on what we could do here.
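One possible direction (a sketch of an assumed workaround, not code from this PR): keep the Lhotse dataloader out of `accelerator.prepare()` and shard the dynamically sized batches by process rank instead, e.g. round-robin. In Accelerate, `accelerator.process_index` and `accelerator.num_processes` would supply the `rank` and `world_size` values; the helper below only illustrates the sharding itself on plain Python lists.

```python
from typing import Iterable, Iterator, List


def shard_batches(
    batches: Iterable[List[int]], rank: int, world_size: int
) -> Iterator[List[int]]:
    """Round-robin shard: batch i goes to rank i % world_size.

    Works without knowing the dataset length up front, which is exactly
    what breaks accelerator.prepare() for dynamic samplers.
    """
    for i, batch in enumerate(batches):
        if i % world_size == rank:
            yield batch


# Dynamic batch sizes, total length unknown to the consumer:
dynamic_batches = [[0, 1], [2, 3, 4], [5], [6, 7, 8, 9]]

# What each of two "GPUs" would see:
print(list(shard_batches(dynamic_batches, rank=0, world_size=2)))  # [[0, 1], [5]]
print(list(shard_batches(dynamic_batches, rank=1, world_size=2)))  # [[2, 3, 4], [6, 7, 8, 9]]
```

If I recall correctly, Lhotse's cut samplers also accept `world_size`/`rank` arguments themselves, which may make a manual shard like this unnecessary; in that case only the model and optimizer would go through `accelerator.prepare()`.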
I was able to fix the existing DDP implementation: #116
Replaces the DDP implementation with the Hugging Face Accelerator to allow for simpler multi-GPU handling (https://huggingface.co/docs/accelerate/index)