Thank you very much for your outstanding work. While training with your `stage2.sh` script, I ran into two questions:

- Is the llava-158k data you used the combination of `complex_reasoning_77k.json`, `conversation_58k.json`, and `detail_23k.json`, or is it the single file `llava_instruct_150k.json`?
- Your script sets `dataset_name VisualGenomeDataset@InstructCaptionDataset@VQAv2Dataset@AOKVQADataset@FlickrEntityDataset`, but this fails with the following error:

  **module 'pink.datasets' has no attribute 'InstructCaptionDataset'. Did you mean: 'PretrainCaptionDataset'?**

  Can I replace it directly with LLaVA?
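
In case it helps narrow this down, here is a minimal sketch of how the unresolved name can be spotted before training starts. It assumes the training code splits `dataset_name` on `@` and looks each name up as an attribute of `pink.datasets`, which is what the error message suggests but which I have not confirmed in the source. `missing_datasets` is a hypothetical helper, and `fake_datasets` is a stand-in module with an assumed set of class names, used only so the snippet runs without the repository installed.

```python
import types


def missing_datasets(module, spec):
    """Return the names in an '@'-separated spec that `module` does not define."""
    return [name for name in spec.split("@") if not hasattr(module, name)]


# Stand-in for pink.datasets (assumption: the real module's contents may differ).
fake_datasets = types.ModuleType("fake_datasets")
for name in (
    "VisualGenomeDataset",
    "PretrainCaptionDataset",
    "VQAv2Dataset",
    "AOKVQADataset",
    "FlickrEntityDataset",
):
    setattr(fake_datasets, name, type(name, (), {}))

spec = (
    "VisualGenomeDataset@InstructCaptionDataset@VQAv2Dataset"
    "@AOKVQADataset@FlickrEntityDataset"
)
print(missing_datasets(fake_datasets, spec))
```

With the real package available, passing the imported `pink.datasets` module in place of `fake_datasets` should list whichever names in the script do not exist in the installed version.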