yunseok624/Frame_Selection_methods

Frame Selection Methods for Video LLMs

Welcome! This repository explores and benchmarks frame selection strategies for video large language models (Video LLMs), focusing on video reasoning and video question answering (VQA).

Abstract

Results

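| MLLM | Method | Frames | LLM params | LVB | V-MME | MLVU |
|---|---|---|---|---|---|---|
| Qwen2-VL | Uniform | 32 | 7B | TBD | TBD | TBD |
| Qwen2-VL | AKS | 32 | 7B | TBD | TBD | TBD |
| Qwen2-VL | FOCUS | 32 | 7B | TBD | TBD | TBD |
| Qwen2-VL | Q-Frame | 32 | 7B | TBD | TBD | TBD |
| Qwen2-VL | MDP3 | 32 | 7B | TBD | TBD | TBD |
| Qwen2-VL | FRAG | 32 | 7B | TBD | TBD | TBD |
| LLaVA-Video | Uniform | 32 | 7B | 57.59 | TBD | TBD |
| LLaVA-Video | AKS | 32 | 7B | 60.21 | TBD | TBD |
| LLaVA-Video | FOCUS | 32 | 7B | TBD | TBD | TBD |
| LLaVA-Video | Q-Frame | 32 | 7B | TBD | TBD | TBD |
| LLaVA-Video | MDP3 | 32 | 7B | TBD | TBD | TBD |
| LLaVA-Video | FRAG | 32 | 7B | TBD | TBD | TBD |
| LLaVA-OneVision | Uniform | 32 | 7B | 55.50 | TBD | TBD |
| LLaVA-OneVision | AKS | 32 | 7B | 59.09 | TBD | TBD |
| LLaVA-OneVision | FOCUS | 32 | 7B | TBD | TBD | TBD |
| LLaVA-OneVision | Q-Frame | 32 | 7B | TBD | TBD | TBD |
| LLaVA-OneVision | MDP3 | 32 | 7B | TBD | TBD | TBD |
| LLaVA-OneVision | FRAG | 32 | 7B | TBD | TBD | TBD |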
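
For orientation, the Uniform rows are the baseline that the other selection methods are compared against. The following is a minimal sketch of how such a baseline might choose 32 frame indices. The function name `uniform_frame_indices`, the endpoint-inclusive spacing, and the handling of short videos are illustrative assumptions, not code taken from this repository or from any of the listed methods.

```python
def uniform_frame_indices(total_frames: int, num_frames: int = 32) -> list[int]:
    """Return up to num_frames indices spread evenly over a video.

    Hypothetical helper for illustration only; the actual baseline
    in this repository may space or round frames differently.
    """
    if total_frames <= 0 or num_frames <= 0:
        raise ValueError("total_frames and num_frames must be positive")
    # Short videos: use every frame instead of repeating indices.
    if total_frames <= num_frames:
        return list(range(total_frames))
    # A single requested frame: take the middle of the video.
    if num_frames == 1:
        return [total_frames // 2]
    # Evenly spaced positions that include the first and last frame.
    step = (total_frames - 1) / (num_frames - 1)
    return [round(i * step) for i in range(num_frames)]


if __name__ == "__main__":
    # Example: a 30-second clip at 30 fps has 900 frames.
    print(uniform_frame_indices(900))
```

Query-aware methods would replace this fixed spacing with a scoring step, but the interface (video length in, a list of frame indices out) can stay the same, which makes the baseline easy to swap.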

Acknowledgment

This project builds on AKS (paper, code), FOCUS (paper, code), Q-Frame (paper, code), MDP3 (paper, code), FRAG (paper, code), LLaVA-NeXT (paper, code), and lmms_eval (paper, code).
