About Me

Yidi Li is an Associate Professor in the College of Computer Science and Technology at Taiyuan University of Technology (太原理工大学, 计算机科学与技术学院) and the director of the Multimodal Intelligent Human-Robot Interaction Laboratory (多模态智能人机交互实验室, MIHRI Lab). Since October 2025, she has been a CSC-funded Visiting Scholar at the Robotic Manipulation Lab (Harada Lab), the University of Osaka, Japan. She received her Ph.D. degree in Computer Science and Technology from Peking University in 2023 under the supervision of Prof. Hong Liu.

Her research focuses on multimodal embodied intelligence, audio-visual perception, and human-robot interaction. Her work aims to enable robots and intelligent agents to perceive, understand, and interact with complex real-world environments by integrating complementary sensory modalities such as vision and audition.

Her recent research covers audio-visual speaker tracking, multimodal robust representation learning, incomplete-modality perception, event-RGB fusion, and embodied perception. She has published more than 40 papers in international conferences and SCI journals in artificial intelligence, computer vision, and multimodal learning.

📣📣 Call for Members 📣📣

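The Multimodal Intelligent Human-Robot Interaction Laboratory (MIHRI Lab) is now recruiting graduate students for 2027 and 2028 admission, as well as outstanding first- and second-year undergraduates!

We are looking for outstanding students, entering through either recommendation-based admission or the graduate entrance examination, who are passionate about artificial intelligence, robotics, computer vision, and related fields. Students with good programming skills, hands-on deep learning experience, or a background in programming competitions or research, who aspire to pursue a master's or Ph.D. degree or to study abroad, are welcome to contact me (send your CV to liyidi@tyut.edu.cn). Outstanding first- and second-year undergraduates are also welcome to join the group.

MIHRI Lab offers its members:

Frontier research: take part in a wide range of cutting-edge research topics.
Academic exchange: attend domestic and international conferences to broaden your academic horizons.
International collaboration: joint supervision by collaborating experts from leading overseas universities.
Study visits: outstanding students may be recommended for study or academic visits at well-known universities in China and abroad.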

🥳 News

2026.05 📄 One paper accepted by Pattern Recognition! (SCI Q1-top)
2026.05 📄 One paper early accepted by MICCAI 2026!
2026.01 📄 Three papers accepted by ICASSP 2026!
2025.12 🏅 Dr. Li was selected for the 2025 Sanjin Talent Program: Young Top Talent in Scientific and Technological Innovation.
2025.12 📄 One paper published in CAAI Transactions on Intelligence Technology! (SCI Q1-top)
2025.09 🏆 MIHRI Lab won the Best Student Paper Award at ACAIT 2025!
2025.04 📄 Two papers accepted by Expert Systems with Applications! (SCI Q1-top)
2024.12 📄 One paper accepted by ICASSP 2025!
2024.12 📄 Two papers accepted by CAAI Transactions on Intelligence Technology! (SCI Q1-top)
2024.09 📄 One paper accepted by IEEE Transactions on Multimedia! (SCI Q1-top)
2024.07 🏆 Dr. Li received the ACM Rising Star Award from ACM China Council Taiyuan Chapter.

📜 Research Area

Multimodal Embodied Intelligence
Robust perception, spatial understanding, and human-robot interaction.

Audio-Visual Learning
Audio-visual speaker tracking, sound source localization, event localization, and incomplete-modality learning.

Robotic Perception and Multimodal Sensing
Event-based vision, event-RGB fusion, object tracking, action recognition, and industrial vision.

💻 Research Experiences

2025.10 - Present: Postdoctoral Researcher, the University of Osaka, Japan
2023.07 - Present: Associate Professor, Taiyuan University of Technology, China
2017.09 - 2023.07: Ph.D. in Computer Science, Peking University, China

📝 Publications

Yidi Li, Yihan Li, et al. Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities. Pattern Recognition, 2026. (SCI Q1-top)
Yihan Li, Yidi Li*, et al. AVCLNet: Multimodal Multi-Speaker Tracking Network Using Audio-Visual Contrastive Learning. CAAI Transactions on Intelligence Technology, 2026, 11(1): 238–255. (SCI Q1-top)
Yidi Li, Jiahao Wen, et al. PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection. Expert Systems with Applications, 2025, 281: 127608. (SCI Q1-top)
Yidi Li, Hong Liu*, Bing Yang. STNet: Deep Audio-Visual Fusion Network for Robust Speaker Tracking. IEEE Transactions on Multimedia, 2025, 27: 1835–1847. (SCI Q1-top)
Yidi Li, Guoquan Wang*, et al. On-Device Audio-Visual Multi-Person Wake Word Spotting. CAAI Transactions on Intelligence Technology, 2023, 8(4): 1578–1589. (SCI Q1-top)
Yidi Li, Jiale Ren*, et al. Audio-Visual Keyword Transformer for Unconstrained Sentence-Level Keyword Spotting. CAAI Transactions on Intelligence Technology, 2023, 9(1): 142–152. (SCI Q1-top)
Yidi Li, Hong Liu*, Hao Tang. Multi-Modal Perception Attention Network with Self-Supervised Learning for Audio-Visual Speaker Tracking. Proceedings of the AAAI Conference on Artificial Intelligence, 2022, 36(2): 1456–1463. CCF A Oral
Yihan Li, Hao Guo, Zhenhuan Xu, Yidi Li*, Weiwei Wan. A Multi-View Fusion Framework for Audio-Visual Multi-Speaker Tracking. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026: 22727–22731. CCF B
Yan Niu, Yiling Wang, Jingyan Li, Wenxi Wang, Mengni Zhou, Jie Xiang, Xin Wen, Yidi Li*. Adaptively Weighted Multi-Modal Joint Entropy with Dynamic Allocation and Fault-Tolerant Fusion for Industrial Diagnostics. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026: 5476–5480. CCF B
Yan Niu, Wenxi Wang, Yiling Wang, Jianyu Zhi, Luqi Wang, Xin Wen, Yidi Li*. A Robust Method for Gear Failure Detection and Severity Estimation Based on Multi-Sensor Physical Feature Fusion and Domain Adaptation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026: 19877–19881. CCF B
Yidi Li, Wenkai Zhao, et al. Multi-Stage Multimodal Distillation for Audio-Visual Speaker Tracking. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2025. CCF B
Yihan Li, Yi Shi, Chongwei Yan, Hao Guo, Bin Ren, Yidi Li*. Robust Multi-View Audio-Visual Speaker Tracking via Bird’s Eye View Representation. Asian Conference on Artificial Intelligence Technology (ACAIT), 2025. Best Student Paper Award
Yidi Li, Kairan Zhang, et al. Vision-Guided Acoustic Localization with Decoupled Inference for Moving Speakers. International Conference on Intelligent Computing (ICIC), 2025: 135–146. CCF C Oral
Zihao Mi, Jianan Zhang, Xueyu Liu, Guanghui Yue, Junhong Yue, Mingqiang Wei, Yidi Li, Yongfei Wu*. Multi-Instance Curriculum Learning for Histopathology Image Classification with Bias Reduction. Medical Image Analysis, 2025, 105: 103647. (SCI Q1-top)
Xubin Wu, Yan Niu, Xia Li, Jie Xiang*, Yidi Li*. A Prior Causality-Guided Multi-View Diffusion Network for Brain Disorder Classification. CAAI Transactions on Intelligence Technology, 2025, 10(6): 1731–1744. (SCI Q1-top)
Jie Xiang, Ang Zhao, Xia Li, Xubin Wu, Yanqing Dong, Yan Niu, Xin Wen*, Yidi Li*. Enhancing Brain MRI Super-Resolution through Multi-Slice Aware Matching and Fusion. CAAI Transactions on Intelligence Technology, 2025: 1411–1421. (SCI Q1-top)
Hangbei Cheng, Xueyu Liu, Jun Zhang, Xiaorong Dong, Xuetao Ma, Yansong Zhang, Hao Meng, Xing Chen, Guanghui Yue, Yidi Li*, Yongfei Wu*. GLMKD: Joint Global and Local Mutual Knowledge Distillation for Weakly Supervised Lesion Segmentation in Histopathology Images. Expert Systems with Applications, 2025. (SCI Q1-top)
Tao Wang, Mengyuan Liu*, Hong Liu*, Wenhao Li, Miaoju Ban, Tianyu Guo, Yidi Li. Feature Completion Transformer for Occluded Person Re-Identification. IEEE Transactions on Multimedia, 2024, 26: 8529–8542. (SCI Q1-top)
Zhenhuan Xu, Yongfei Wu, Liming Zhang, Yidi Li*. Adaptive Fourier Decomposition Based Signal Extraction on Weak Electromagnetic Field. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024: 9446–9450. CCF B

🌟 Projects

[1] Young Scientists Fund of the National Natural Science Foundation of China, 2024.
[2] Scientific and Technological Innovation Programs of Higher Education Institutions in Shanxi, 2024.
[3] Shanxi Provincial Department of Science and Technology Basic Research Project, 2024.
[4] Open Research Project of Guangdong Provincial Key Laboratory, 2025.

🏅 Selected Honors and Services

Selected for the 2025 Sanjin Talent Program: Young Top Talent in Scientific and Technological Innovation.
Recipient of the 2023 ACM Rising Star Award from ACM China Council Taiyuan Chapter.
Recipient of the IEEE Outstanding Service Award.
Supervised the student paper that received the Best Student Paper Award at the 2025 Asian Conference on Artificial Intelligence Technology (ACAIT 2025).
Invited Speaker at the 2026 International Conference on Control and Robotics Engineering (ICCRE 2026).
Workshop Chair of the 2024 IEEE Smart World Congress (IEEE SWC 2024).
Publicity Chair of Cyberworlds 2025.
Session Chair for the 2024 China Internet of Things Conference, the 2025 Asian Conference on Artificial Intelligence Technology, and the 2025 International Conference on Intelligent Computing.

Yidi Li (李一迪)