Espresso: High Compression For Rich Extraction From Videos for Your Vision-Language Model
Published in arXiv, 2024
Recommended citation: Keunwoo Yu, Achal Dave, Rares Ambrus, Jean Mercat, "Espresso: High Compression For Rich Extraction From Videos for Your Vision-Language Model." arXiv, 2024.