Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models

Publication
In arXiv