VLM-Study

Participant

Milestone Materials

Date	Presenter	Paper
2024-12-10	KevinCha	Adiveintovision-language-model.pdf
2024-12-18	HwangJohn	phi-series_241218.pdf
2024-12-26	MrBananaHuman	Unifying_Vision_Text_and_Layout_for_Universal_Document_Processing.pdf
2025-01-08	KevinCha	VisionLLM_VisionLLMv2.pdf
2025-01-15	HwangJohn	an_introduction_of_vision_language_model.pdf
2025-01-22	MrBananaHuman	Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models.pdf
2025-02-05	chagmgang	Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling.pdf
2025-02-14	HwangJohn	Vision_language_models_are_blind.pdf
2025-02-19	MrBananaHuman	Hierarchical Vision Feature Aggregation for OCR-Free Document Understanding.pdf
2025-03-05	chagmgang	RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness.pdf
2025-03-19	chagmgang	LLMs_can_see_and_hear_without_any_training
2025-04-09	HwangJohn	Phi4-multimodal
2025-04-09	MrBananaHuman	Qwen2.5Omni
2025-04-23	chagmgang	SmolVLM
2025-04-30	HwangJohn	InternVL3
2025-05-07	MrBananaHuman	vlm
