Publications

(2025). VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning. NeurIPS 2025.

(2024). VisionZip: Longer is Better but Not Necessary in Vision Language Models. CVPR 2025.

(2024). Unified Language-driven Zero-shot Domain Adaptation. CVPR 2024.

(2023). LISA++: An Improved Baseline for Reasoning Segmentation with Large Language Model. Tech Report.

(2023). LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding. AAAI 2025.

(2023). Exploring Sparse Visual Prompt for Cross-domain Semantic Segmentation. AAAI 2024.
