Xiaoming Zhao

I am a Ph.D. student working with Prof. Alexander Schwing at the Department of Computer Science, University of Illinois Urbana–Champaign (UIUC).

Earlier, I received my B.S. degree in Statistics from the University of Science and Technology of China (USTC).

My research focuses on developing methods that effectively harness various sources of data, with applications in 2D/3D/4D computer vision, generative models, and neural rendering.

My recent work aims at scalable 3D representation learning and 3D/4D view synthesis without requiring large-scale real-world multi-view datasets, utilizing:

  1. Generic data priors
  2. Generative Adversarial Networks (GANs)
  3. (2D) Diffusion models trained with large-scale data

During my graduate studies, I interned at Apple, Meta Reality Labs, and Google, conducting research related to the above topics and gaining experience in large-scale training for GANs and diffusion models.

I am on the job market, looking for research scientist / engineer positions in industry or post-doc positions in academia starting from summer 2024.
Happy to discuss potential fits.


Email    /    Google Scholar    /    GitHub    /    CV

Publications

   [11]
IllumiNeRF: 3D Relighting without Inverse Rendering.
Xiaoming Zhao, Pratul P. Srinivasan, Dor Verbin, Keunhong Park, Ricardo Martin Brualla, Philipp Henzler
ArXiv, 2024   
[Paper] [Website] [bibtex]   

IllumiNeRF provides a simpler approach than traditional inverse rendering for 3D relighting: relight the source views, then run a latent NeRF.
   [10]
GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh.
Jing Wen, Xiaoming Zhao, Zhongzheng Ren, Alexander G. Schwing, and Shenlong Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024   
[Paper] [Code] [Website] [bibtex]   

GoMAvatar introduces the Gaussians-on-Mesh (GoM) representation for real-time, memory-efficient, and high-quality animatable human modeling.
      [9]
NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows.
Zhenggang Tang, Zhongzheng Ren, Xiaoming Zhao, Bowen Wen, Jonathan Tremblay, Stan Birchfield, Alexander G. Schwing
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024   
[Paper] [Code] [Website] [bibtex]   

NeRFDeformer automatically modifies a NeRF representation based on a single RGB-D observation of a non-rigidly transformed version of the original scene.
      [8]
Pseudo-Generalized Dynamic View Synthesis from a Video.
Xiaoming Zhao, Alex Colburn, Fangchang Ma, Miguel Angel Bautista, Joshua M. Susskind, Alexander G. Schwing
International Conference on Learning Representations (ICLR), 2024   
[Paper] [Code] [Website] [bibtex]   

PGDVS provides an analysis framework for generalized dynamic view synthesis and finds that, with consistent depth estimates, scene-specific appearance optimization is NOT required.
      [7]
Occupancy Planes for Single-view RGB-D Human Reconstruction.
Xiaoming Zhao, Yuan-Ting Hu, Zhongzheng Ren, and Alexander G. Schwing
AAAI Conference on Artificial Intelligence (AAAI), 2023   
[Paper] [Code] [bibtex]   

OPlanes provides more flexibility than voxel grids and makes it possible to leverage correlations better than per-point classification.
      [6]
Generative Multiplane Images: Making a 2D GAN 3D-Aware.
Xiaoming Zhao, Fangchang Ma, David Güera, Zhile Ren, Alexander G. Schwing, and Alex Colburn
European Conference on Computer Vision (ECCV), 2022 (Oral Presentation)   
[Paper] [Code] [Website] [bibtex]   
  
GMPI is guaranteed to be view-consistent and enables fast training (less than half a day at a resolution of 1024²) and high FPS during inference.
      [5]
Initialization and Alignment for Adversarial Texture Optimization.
Xiaoming Zhao, Zhizhen Zhao, and Alexander G. Schwing
European Conference on Computer Vision (ECCV), 2022   
[Paper] [Code] [Website] [bibtex]   

Carefully designed initialization and alignment procedures make it possible to benefit from both classical and recent learning-based texture optimization techniques.
      [4]
Class-agnostic Reconstruction of Dynamic Objects from Videos.
Zhongzheng Ren*, Xiaoming Zhao*, and Alexander G. Schwing
(* denotes equal contribution)
Neural Information Processing Systems (NeurIPS), 2021   
[Paper] [Website] [bibtex]   

REDO enables class-agnostic geometry reconstruction for dynamic objects from RGB-D videos.
      [3]
The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation.
Xiaoming Zhao, Harsh Agrawal, Dhruv Batra, and Alexander G. Schwing
International Conference on Computer Vision (ICCV), 2021   
[Paper] [Code] [Website] [bibtex]   

A well-trained visual odometry module can be a drop-in replacement for the GPS and compass sensors in PointGoal navigation.
      [2]
Mitigating Data Scarcity in Protein Binding Prediction Using Meta-Learning.
Yunan Luo*, Jianzhu Ma*, Xiaoming Zhao, Yufeng Su, Yang Liu, Trey Ideker, and Jian Peng
(* denotes equal contribution)
Research in Computational Molecular Biology (RECOMB), 2019   
[Paper] [bibtex]   

Meta-learning and few-shot learning strategies can be used to mitigate data scarcity when characterizing the specificity of less-studied kinases for protein-peptide binding prediction.
      [1]
Integrating Thermodynamic and Sequence Contexts Improves Protein-RNA Binding Prediction.
Yufeng Su, Yunan Luo, Xiaoming Zhao, Yang Liu, and Jian Peng
PLOS Computational Biology, 2019   
[Paper] [Code] [bibtex]   

A deep learning-based thermodynamic model is introduced for protein-RNA binding prediction.

Workshops

      [2]
Learning from Synthesized Demonstrations.
Xiaoming Zhao, Yang Liu, and Jian Peng
International Conference on Machine Learning (ICML) Workshop on Learning in Artificial Open Worlds, 2020   
[Poster] [bibtex]
      [1]
Approximation Gradient Error Variance Reduced Optimization.
Wei-Ye Zhao, Yang Liu, Xiaoming Zhao, Jie-Lin Qiu, and Jian Peng
AAAI Conference on Artificial Intelligence (AAAI) Workshop on Reinforcement Learning in Games, 2019   
[Paper] [bibtex]