VLM-Study

Participant

- MrBananaHuman
- HwangJohn
- chagmgang

Milestone

Materials

| Date | Presenter | Paper |
| --- | --- | --- |
| 2024-12-10 | KevinCha | Adiveintovision-language-model.pdf |
| 2024-12-18 | HwangJohn | phi-series_241218.pdf |
| 2024-12-26 | MrBananaHuman | Unifying_Vision_Text_and_Layout_for_Universal_Document_Processing.pdf |
| 2025-01-08 | KevinCha | VisionLLM_VisionLLMv2.pdf |
| 2025-01-15 | HwangJohn | an_introduction_of_vision_language_model.pdf |
| 2025-01-22 | MrBananaHuman | Enhancing Visual Document Undertanding with Contrastive Learning in Large Visual-Language Models.pdf |
| 2025-02-05 | chagmgang | Janus-Pro : Unified Multimodal Understanding and Generation with Data and Model Scaling.pdf |
| 2025-02-14 | HwangJohn | Vision_language_models_are_blind.pdf |
| 2025-02-19 | MrBananaHuman | Hierarchical Vision Feature Aggregation for OCR-Free Document Understanding.pdf |
| 2025-03-05 | chagmgang | RLAIF-V: Open-Source AI FeedbackLeads to Super GPT-4V Trustworthiness.pdf |
| 2025-03-19 | chagmgang | LLMs_can_see_and_hear_without_any_training |
| 2025-04-09 | HwangJohn | Phi4-multimodal |
| 2025-04-09 | MrBananaHuman | Qwen2.5Omni |
| 2025-04-23 | chagmgang | SmolVLM |
| 2025-04-30 | HwangJohn | InternVL3 |
| 2025-05-07 | MrBananaHuman | vlm |