
Xiao-Ming Wu (伍晓鸣)

M.S. student

School of Computer Science and Engineering

Sun Yat-sen University

wuxm65@mail2.sysu.edu.cn

Scholar  /  GitHub

Biography

I'm currently a second-year master's student at Sun Yat-sen University, advised by Prof. Wei-Shi Zheng, where I am cultivating my interest in research and developing my scientific ability and taste. Previously, I obtained my B.E. degree from Shandong University, where I made my first attempt at scientific research, advised by Xin Luo and Xin-Shun Xu. I am now looking for a Ph.D. position starting in Fall 2025.

Research Interests

Research is for curiosity and fun.

I am now interested in how to build a robotic system that unifies learning and control. The aim is to fuse representation learning into each part of robotic modeling and control, so that robot control is able to perceive and understand. Thus, my current research interests include:

  • Robotics and Embodied AI
  • Deep Representation Learning

Besides, I have worked on several other topics, including diffusion models, low-level vision, and hashing retrieval. Participating in diverse research areas has broadened my view of deep learning.

Publications

Below are my publications (sorted by date). My first-author works are highlighted.

Dexterous Grasp Transformer
Guo-Hao Xu&, Yi-Lin Wei&, Dian Zheng, Xiao-Ming Wu, Wei-Shi Zheng*
Computer Vision and Pattern Recognition (CVPR), 2024.

Paper and code will be ready soon.

A new transformer-based framework for dexterous grasp generation, capable of predicting a diverse set of feasible grasp poses in a single pass.

Single-View Scene Point Cloud Human Grasp Generation
Yan-Kang Wang, Chengyi Xing, Yi-Lin Wei, Xiao-Ming Wu, Wei-Shi Zheng*
Computer Vision and Pattern Recognition (CVPR), 2024.
paper / code

Exploring a new task of generating human grasps based on single-view scene point clouds. A new baseline and a new dataset are also proposed for this novel task.

Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model
Dian Zheng, Xiao-Ming Wu, Shuzhou Yang, Jian Zhang, Jian-Fang Hu, Wei-Shi Zheng*
Computer Vision and Pattern Recognition (CVPR), 2024.
paper / code

A diffusion-based universal image restoration model, with an assemble-then-separate (hourglass-like) mapping for multi-task training.

DiffuVolume: Diffusion Model for Volume based Stereo Matching
Dian Zheng, Xiao-Ming Wu, Zuhao Liu, Jingke Meng, Wei-Shi Zheng*
arXiv preprint arXiv:2308.15989, submitted to International Journal of Computer Vision (IJCV).
paper / code

A diffusion-based module that gradually removes the redundancy in the cost volume for stereo matching, and that is plug-and-play for volume-based methods.

Estimator Meets Equilibrium Perspective: A Rectified Straight Through Estimator for Binary Neural Networks Training
Xiao-Ming Wu, Dian Zheng, Zuhao Liu, Wei-Shi Zheng*
International Conference on Computer Vision (ICCV), 2023
paper / code

A new perspective on binary neural network training: an equilibrium between estimating error and gradient stability. A simple and effective gradient estimator is proposed to balance the two.

Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping
Zuhao Liu, Xiao-Ming Wu, Dian Zheng, Kun-Yu Lin, Wei-Shi Zheng*
Computer Vision and Pattern Recognition (CVPR), 2023
paper

A prompt-based feature mapping that generates more real-domain anomalies to supervise anomaly detection training.

Weakly-Supervised Online Hashing with Refined Pseudo Tags
Chen-Lu Ding, Xin Luo*, Xiao-Ming Wu, Yu-Wei Zhan, Rui Li, Hui Zhang, and Xin-Shun Xu
Conference on Information and Knowledge Management (CIKM), 2022
paper / code

Generating pseudo tags based on the co-occurrence similarity between tags to handle weakly-supervised online hashing.

Online Enhanced Semantic Hashing Towards Effective and Efficient Retrieval for Streaming Multi-Modal Data
Xiao-Ming Wu, Xin Luo*, Yu-Wei Zhan, Chen-Lu Ding, Zhen-Duo Chen, Xin-Shun Xu
AAAI Conference on Artificial Intelligence (AAAI), 2022
paper / code

Introducing semantics into online hashing, handling the dimension mismatch and mitigating the inconsistency problem in incremental online hashing.

Discrete Online Cross-Modal Hashing
Yu-Wei Zhan, Yong-Xin Wang, Yu Sun, Xiao-Ming Wu, Xin Luo*, and Xin-Shun Xu
Pattern Recognition (PR), 2022
paper / code

Inspired by Discrete Latent Factor Hashing (DLFH), we propose an online version of DLFH, with many novel designs to handle the online problem in cross-modal hashing.

Academic Service

Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Pattern Recognition (PR).

Conference Reviewer: Computer Vision and Pattern Recognition (CVPR) 2024, ACM Multimedia (MM) 2024.

Awards

National Scholarship of China for Graduate Students (研究生国家奖学金), 2023
First Prize, Academic Scholarship of Sun Yat-sen University (中山大学学业一等奖学金), 2022, 2023
Honors Bachelor's Degree of Shandong University (山东大学荣誉学士学位), 2022
National Scholarship of China for Undergraduate Students (本科生国家奖学金), 2019, 2021
First Prize, Academic Scholarship of Shandong University (山东大学学业一等奖学金), 2019, 2020, 2021