I am a fifth-year student in the master’s-doctoral combined program at the University of Chinese Academy of Sciences (2020–present). My research interests include multimodal large model reasoning, multimodal agent reasoning, and knowledge distillation.

Feel free to contact me by email for academic discussions or internship opportunities! I expect to graduate in June 2026 and am looking for job opportunities.

🔥 News

2026: 🎉🎉 One paper on GenProve (Fine-Grained Provenance Generation) has been accepted by ACL 2026.
2026: 🔥 Our new paper, The Trinity of Consistency as a Defining Principle for General World Models, has been released! We are grateful for the inspiring collaboration with researchers from the University of Chinese Academy of Sciences, Shanghai AI Lab, Westlake University, Shanghai Jiao Tong University, and the National University of Singapore, among others.
2026: 🎉🎉 One paper on Geoint-R1 (Formal Multimodal Geometric Reasoning) has been accepted by CVPR 2026.
2026: 🎉🎉 One paper on GGBench (Geometric Generative Reasoning Benchmark for Unified Multimodal Models) has been accepted by CVPR 2026.
2026: 🎉🎉 One paper on DoGe (Data-Scarce Vision-Language Reasoning) has been accepted by CVPR 2026.
2025.08: 🎉🎉 One paper on mChartQA (Chart Question Answering) has been accepted by Pattern Recognition.
2025.08: 🎉🎉 One paper on ChartMind (Chart Question Answering) has been accepted by EMNLP 2025.
2025.08: 🎉🎉 Our collaborative technical report with the Seed Team, StructVRM, has been released! This work focuses on deep thinking in large models and achieves remarkable performance in mathematical and scientific reasoning. Paper link, Results link.
2025.07: 🎉🎉 One paper on Multimodal Scientific Reasoning has been accepted by ACM Multimedia 2025.
2025.05: 🎉🎉 One paper on Multimodal Chain-of-Thought Verification has been accepted by ACL 2025.
2025.04: 🎉🎉 One paper on Sketch-to-Diagram has been accepted by IJCAI 2025.
2025.04: 🎉🎉 One paper on Chinese Multimodal Attribute Extraction has been accepted by ICIC 2025.
2025.04: 🎉🎉 One paper on Document Retrieval has been accepted by ICIC 2025.
2025.04: 🎉🎉 One paper on EEG-to-Text has been accepted by ICIC 2025.
2025.02: 🎉🎉 One paper on Text-to-Diagram has been accepted by CVPR 2025 (highlight).
2024.07: 🎉🎉 One paper on Multimodal Chain-of-Thought has been accepted by ECCV 2024.
2024.07: 🎉🎉 One paper on Multimodal Chain-of-Thought has been accepted by NCAA.
2024.06: 🎉🎉 One survey paper on Multimodal Large Model Applications has been accepted by CIBM.
2024.05: 🎉🎉 One paper on Interpretable and Generalizable Spatiotemporal Learning has been accepted by ECML-PKDD 2024.
2024.04: 🎉🎉 One paper on Knowledge Distillation has been accepted by IJCAI 2024.

📝 Publications

CVPR 2025

From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing

Jingxuan Wei, Cheng Tan, Qi Chen, Gaowei Wu, Siyuan Li, Zhangyang Gao, Linzhuang Sun, Bihui Yu, Ruifeng Guo

[PDF] | [GitHub] | [BibTeX]

@inproceedings{wei2024words,
  title={From Words to Structured Visuals: A Benchmark and Framework for
         Text-to-Diagram Generation and Editing},
  author={Wei, Jingxuan and Tan, Cheng and Chen, Qi and Wu, Gaowei and
          Li, Siyuan and Gao, Zhangyang and Sun, Linzhuang and Yu, Bihui and
          Guo, Ruifeng},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}

ECCV 2024

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Ruifeng Guo, Bihui Yu, Stan Z. Li

[PDF] | [GitHub] | [BibTeX]

@inproceedings{tan2024boosting,
  title={Boosting the power of small multimodal reasoning models to match larger models with self-consistency training},
  author={Tan, Cheng and Wei, Jingxuan and Gao, Zhangyang and Sun, Linzhuang and 
          Li, Siyuan and Guo, Ruifeng and Yu, Bihui and Li, Stan Z},
  booktitle={European Conference on Computer Vision},
  pages={305--322},
  year={2024},
  organization={Springer}
}

Neural Computing and Applications, 2024

Enhancing Human-like Multimodal Reasoning: A New Challenging Dataset and Comprehensive Framework

Jingxuan Wei, Cheng Tan, Zhangyang Gao, Linzhuang Sun, Siyuan Li, Bihui Yu, Ruifeng Guo, Stan Z. Li

[PDF] | [GitHub] | [BibTeX]

@article{wei2024enhancing,
  title={Enhancing human-like multimodal reasoning: a new challenging dataset and comprehensive framework},
  author={Wei, Jingxuan and Tan, Cheng and Gao, Zhangyang and Sun, Linzhuang and 
          Li, Siyuan and Yu, Bihui and Guo, Ruifeng and Li, Stan Z},
  journal={Neural Computing and Applications},
  volume={36},
  number={33},
  pages={20849--20861},
  year={2024},
  publisher={Springer}
}

Computers in Biology and Medicine, 2024

A Survey on Advancements in Image-Text Multimodal Models: From General Techniques to Biomedical Implementations

Ruifeng Guo, Jingxuan Wei, Linzhuang Sun, Bihui Yu, Guiyong Chang, Dawei Liu, Sibo Zhang, Zhengbing Yao, Mingjun Xu, Liping Bu

[PDF] | [BibTeX]

@article{guo2024survey,
  title={A survey on advancements in image-text multimodal models: From general techniques to biomedical implementations},
  author={Guo, Ruifeng and Wei, Jingxuan and Sun, Linzhuang and Yu, Bihui and 
          Chang, Guiyong and Liu, Dawei and Zhang, Sibo and Yao, Zhengbing and 
          Xu, Mingjun and Bu, Liping},
  journal={Computers in Biology and Medicine},
  pages={108709},
  year={2024},
  publisher={Elsevier}
}

ECML-PKDD 2024

Interpretable and Generalizable Spatiotemporal Predictive Learning with Disentangled Consistency

Jingxuan Wei, Cheng Tan, Zhangyang Gao, Linzhuang Sun, Bihui Yu, Ruifeng Guo, Stan Z. Li

[PDF] | [BibTeX]

@inproceedings{wei2024interpretable,
  title={Interpretable and Generalizable Spatiotemporal Predictive Learning with Disentangled Consistency},
  author={Wei, Jingxuan and Tan, Cheng and Gao, Zhangyang and Sun, Linzhuang and 
          Yu, Bihui and Guo, Ruifeng and Li, Stan},
  booktitle={Joint European Conference on Machine Learning and Knowledge Discovery in Databases},
  pages={3--20},
  year={2024},
  organization={Springer}
}

IJCAI 2024

Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation

Jingxuan Wei, Linzhuang Sun, Yichong Leng, Xu Tan, Bihui Yu, Ruifeng Guo

[PDF] | [BibTeX]

@inproceedings{ijcai2024p722,
  title     = {Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation},
  author    = {Wei, Jingxuan and Sun, Linzhuang and Leng, Yichong and Tan, Xu and 
               Yu, Bihui and Guo, Ruifeng},
  booktitle = {Proceedings of the Thirty-Third International Joint Conference on
               Artificial Intelligence, {IJCAI-24}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Kate Larson},
  pages     = {6531--6540},
  year      = {2024},
  month     = {8},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2024/722},
  url       = {https://doi.org/10.24963/ijcai.2024/722},
}

Others

arXiv: MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification. Sun, Linzhuang and Liang, Hao and Wei, Jingxuan et al.
SMC: Faster and More Efficient Subject Image Generation for Text-to-Image Diffusion Models. Yu, Bihui and Yao, Zhengbing and Wei, Jingxuan et al.
SMC: SAM-Wav2lip++: Enhancing Behavioral Realism in Synthetic Agents Through Audio-Driven Speech and Action Refinement. Yu, Bihui and Liu, Dawei and Wei, Jingxuan et al.
ADMA: TED-CS: Textual Enhanced Sensitive Video Detection with Common Sense Knowledge. Yu, Bihui and Sun, Linzhuang and Wei, Jingxuan et al.
Computers & Electrical Engineering: Feature-guided Multimodal Sentiment Analysis Towards Industry 4.0. Yu, Bihui and Wei, Jingxuan et al.

🎖 Honors and Awards

2018.09 Inner Mongolia Autonomous Region Merit Student
2019.09 National Scholarship
2020.09 Inner Mongolia Autonomous Region Merit Student
2021.09 University of Chinese Academy of Sciences Merit Student
2022.09 University of Chinese Academy of Sciences Merit Student
2022.09 National Scholarship

📖 Education

2023.03-present Ph.D. at the University of Chinese Academy of Sciences. Supervisors: Prof. Ruifeng Guo and Prof. Bihui Yu.
2020.09-2022.12 M.S. at the University of Chinese Academy of Sciences. Supervisor: Prof. Bihui Yu.
2016.09-2020.06 B.S. at Inner Mongolia University of Science and Technology. Ranked first in the major and college.

🛠 Services

Program committee member | Reviewer

Annual Meeting of the Association for Computational Linguistics (ACL)
Conference on Empirical Methods in Natural Language Processing (EMNLP)
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
International Conference on Learning Representations (ICLR)
International Conference on Machine Learning (ICML)
Conference and Workshop on Neural Information Processing Systems (NeurIPS)
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)
International Journal of Computer Vision (IJCV)