H-EmbodVis

Embodied Vision · World Models · Autonomous Driving · 3D Scene Understanding

H-EmbodVis (Huazhong University of Science and Technology Embodied Vision Projects) is a research initiative focused primarily on Embodied AI, with ongoing work in Autonomous Driving and Generative Models.


🔬 Research Areas

We focus on building intelligent systems that can perceive, understand, and interact with the physical world. Key directions include:

  • Embodied AI & Agents: Integrating vision, language, and action planning.
  • World Models for Autonomous Driving: Developing end-to-end driving frameworks and simulators.
  • 3D Vision & Point Cloud Analysis: Efficient architectures for 3D representation learning.
  • Multimodal Foundation Models: Large-scale models for diverse data modalities.

🌟 Featured Projects

Autonomous Driving & World Models

  • HERMES (ICCV 2025): A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation.
  • Orion (ICCV 2025): Holistic End-to-End Autonomous Driving via Vision-Language Instructed Action Generation.
  • Awesome-World-Model: A curated collection of papers on World Models for Autonomous Driving and Robotics (a minimal world-model interface sketch follows this list).
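At a high level, a world model predicts how a scene evolves given the ego vehicle's actions. The snippet below shows only that generic interface as a minimal, hypothetical PyTorch sketch; the class name, dimensions, and MLP are assumptions for exposition, not the HERMES or ORION architecture.

```python
import torch
import torch.nn as nn

class ToyWorldModel(nn.Module):
    """Generic world-model interface: predict the next latent state from the
    current latent state and an action. Hypothetical sketch for exposition;
    not the HERMES or ORION architecture."""

    def __init__(self, state_dim=32, action_dim=4, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Condition the next-state prediction on the chosen action.
        return self.net(torch.cat([state, action], dim=-1))

# Rolling the model forward yields an imagined trajectory for planning.
wm = ToyWorldModel()
state = torch.zeros(1, 32)
for _ in range(5):
    state = wm(state, torch.zeros(1, 4))
```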

3D Vision & Efficient Computing

  • PointMamba (NeurIPS 2024): State Space Models (Mamba) applied to Point Cloud Analysis.
  • UniSeg3D (NeurIPS 2024): A Unified Framework for 3D Scene Understanding.
  • PointGST (IEEE TPAMI): Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning.
  • EasyCache: Training-Free Video Diffusion Acceleration (a conceptual caching sketch follows this list).
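Training-free acceleration of this kind typically reuses expensive network outputs across adjacent denoising steps when the latent has barely changed. The loop below is a generic, hypothetical sketch of that caching idea: the `model(x, t)` callable, the relative-change threshold, and the Euler-style update are all assumptions, not EasyCache's actual reuse criterion or API.

```python
import torch

@torch.no_grad()
def cached_sampler(model, x, timesteps, threshold=0.05):
    # Toy denoising loop with training-free feature reuse: if the latent has
    # barely changed since the last full forward pass, reuse the cached model
    # output instead of re-running the network. Generic illustration only.
    ref_x, cached_out = None, None
    for t in timesteps:
        if ref_x is not None and (x - ref_x).norm() / (ref_x.norm() + 1e-8) < threshold:
            out = cached_out                  # cache hit: skip the network
        else:
            out = model(x, t)                 # cache miss: recompute and refresh
            ref_x, cached_out = x.clone(), out
        x = x - 0.1 * out                     # placeholder Euler-style update
    return x
```

Raising the threshold trades fidelity for fewer forward passes; the project's repository defines the actual reuse criterion.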

Multimodal & Embodied Agents

  • NAUTILUS (NeurIPS 2025): A Large Multimodal Model for Underwater Scene Understanding.
  • GRANT (AAAI 2026 Oral): Cook and Clean Together: Teaching Embodied Agents for Parallel Task Execution (a toy scheduling sketch follows this list).
  • MERGE (NeurIPS 2025): Unifying Generation and Depth Estimation via Text-to-Image Diffusion Models.
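For intuition, "parallel task execution" means an agent interleaves steps of several tasks instead of finishing one before starting the next. The round-robin scheduler below is a toy, hypothetical illustration of that idea (every name in it is made up); it is not GRANT's method.

```python
from collections import deque

def interleave_tasks(task_plans):
    """Toy round-robin scheduler over per-task action queues.

    Advances one action at a time from each unfinished task, mimicking an
    agent that makes progress on several tasks in parallel. Hypothetical
    illustration only; not GRANT's algorithm."""
    queues = [deque(plan) for plan in task_plans]
    schedule = []
    while any(queues):
        for q in queues:
            if q:
                schedule.append(q.popleft())
    return schedule

# Example: cooking and cleaning steps get interleaved rather than serialized.
cook = ["wash rice", "boil rice", "plate rice"]
clean = ["clear table", "wipe table"]
print(interleave_tasks([cook, clean]))
# ['wash rice', 'clear table', 'boil rice', 'wipe table', 'plate rice']
```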

Collaboration

We are always looking for passionate collaborators and students.

  • Connect: Reach out via email (dkliang@hust.edu.cn).
  • Reuse: Creating impactful open-source software is a core value. Please cite our papers if you use our code.

🌐 Website | 🎓 Google Scholar | 📂 Repositories
