Xue-qian Li

I am currently a research intern at Argo AI, working with Dr. Simon Lucey and Dr. Jhony Kaesemodel Pontes.


Research Experience

Trading Positional Complexity vs. Deepness in Coordinate Networks
We use non-Fourier positional encodings (e.g., shifted Gaussian functions) to show that the quality of signal reconstruction is governed by a trade-off between the stable rank of the embedding and the preservation of distances between embedded coordinates. Furthermore, a sufficiently complex positional encoding requires only a linear (rather than deep) network to achieve comparable performance, while being orders of magnitude faster than the current state of the art.
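The idea can be seen in a toy 1-D setting: embed coordinates with shifted Gaussian bumps and fit the signal with a purely linear readout, no deep network. This is a minimal NumPy sketch with illustrative parameters, not the paper's code:

```python
import numpy as np

def gaussian_embedding(x, centers, sigma):
    """Embed scalar coordinates with shifted Gaussian bumps.

    x:       (N,) coordinates in [0, 1]
    centers: (M,) Gaussian centers
    sigma:   bump width; narrower bumps raise the stable rank of the
             embedding but preserve coordinate distances less faithfully
    """
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * sigma ** 2))

# Reconstruct a 1-D signal with a linear layer on top of the embedding.
x = np.linspace(0.0, 1.0, 256)
signal = np.sin(2 * np.pi * 4 * x)                # toy target signal

Phi = gaussian_embedding(x, centers=np.linspace(0, 1, 64), sigma=0.02)
w, *_ = np.linalg.lstsq(Phi, signal, rcond=None)  # linear readout, no depth
recon = Phi @ w

print(np.abs(recon - signal).max())               # small reconstruction error
```

The single least-squares solve stands in for training: with a rich enough embedding, the remaining map really is linear.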
Neural Prior for Trajectory Estimation
Traditionally, trajectories have been represented by sets of handcrafted bases with limited expressiveness. Here, we propose a neural trajectory prior that captures continuous spatio-temporal information without the need for offline data. We demonstrate how the proposed objective is optimized at runtime to estimate trajectories for two important tasks: Non-Rigid Structure from Motion (NRSfM) and lidar scene flow integration for self-driving scenes.
Neural Scene Flow Prior
We propose a neural scene flow prior that uses the architecture of a neural network (an MLP) as an implicit regularizer. Unlike learning-based scene flow methods, ours optimizes at runtime and needs no offline datasets -- making it ideal for deployment in new environments such as autonomous driving. Moreover, the implicit, continuous scene flow representation allows us to estimate dense long-term correspondences across a sequence of point clouds.
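The runtime-optimization idea can be sketched in a few lines. The paper optimizes an MLP flow field; for brevity this toy replaces it with a single global translation and a one-sided nearest-neighbor (Chamfer-style) loss, so the names, shapes, and step size are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
src = rng.standard_normal((100, 3))           # source point cloud
true_flow = np.array([0.05, -0.03, 0.02])
dst = src + true_flow                         # target point cloud

def nn_loss_grad(flow):
    """Mean squared distance from warped source to nearest target point,
    plus its gradient w.r.t. the (global) flow vector."""
    warped = src + flow
    d2 = ((warped[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    nn = d2.argmin(axis=1)                    # current soft correspondences
    resid = warped - dst[nn]                  # (N, 3)
    return (resid ** 2).sum(-1).mean(), 2 * resid.mean(axis=0)

flow = np.zeros(3)
for _ in range(200):                          # optimize at runtime: no training data
    loss, grad = nn_loss_grad(flow)
    flow -= 0.5 * grad

print(np.round(flow, 4))                      # recovers true_flow
```

Swapping the translation vector for an MLP mapping each point to its own flow gives the per-point version; the network's structure then acts as the regularizer.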
PointNetLK Revisited
We revisit a recent innovation -- PointNetLK -- and show that the inclusion of an analytical Jacobian can exhibit remarkable generalization properties while reaping the inherent fidelity benefits of a learning framework. Our approach not only outperforms the state-of-the-art in mismatched conditions but also produces results competitive with current learning methods when operating on real-world test data close to the training set.
One Framework to Register Them All: PointNet Encoding for Point Cloud Alignment
We propose a unified framework encompassing a suite of PointNet-based registration techniques that differ in (i) computational complexity, (ii) object specificity, (iii) robustness to noise, (iv) choice of loss function, and (v) pose parametrization. The best technique depends on the user's requirements; unifying them in a single framework provides a starting point for researchers in this field.
PCRNet: Point Cloud Registration Network using PointNet Encoding
We provide a novel network for aligning point clouds that uses PointNet encoding. Our PCRNet exploits data-specific information to make accurate registration estimates and is robust to noise and to large initial misalignment between the source and template point clouds. With a single pass, PCRNet is computationally faster than existing methods, while an iterative version refines the registration to produce highly accurate predictions. PCRNet can further be applied to 3D tracking and model replacement.
A Multi-Domain Feature Learning Method for Visual Place Recognition
We propose a multi-domain feature-learning method for visual place recognition with two modules: a feature extraction module, which uses a CapsuleNet to extract features under varying conditions, and a feature separation module, which separates features into environmental-condition-invariant and condition-related domains. Combining the two modules through sequential matching, we can find potential matches.
MRS-VPR: A Multi-resolution Sampling-based Global Visual Place Recognition Method
To match a short-term testing sequence against a long-term reference sequence in visual place recognition, we develop a multi-resolution sampling-based global search method that combines coarse-to-fine search with a particle filter, balancing matching accuracy against search efficiency. By iteratively updating candidate trajectories and frame sequences from low to high resolution, we find the best match at the highest resolution level.
Medical Tool Segmentation
Using k-nearest-neighbor search and several morphological operations on images, we develop an efficient method to collect data from a da Vinci surgical robot through a user interface. We implement several deep learning models for medical tool segmentation on augmented data collected in our lab. After comparing these methods, we arrive at an accurate, real-time segmentation pipeline running at 15-20 frames per second.
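The morphological cleanup step can be sketched with nothing but NumPy: opening removes speckle noise from a binary tool mask, closing fills pin-holes. The mask, kernel size, and helper names below are illustrative, not the lab's code:

```python
import numpy as np

def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element (NumPy only)."""
    pad = k // 2
    padded = np.pad(mask, pad)
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def erode(mask, k=3):
    """Binary erosion via duality: the complement of dilating the complement."""
    return ~dilate(~mask, k)

def clean(mask, k=3):
    """Morphological opening then closing: drop speckle, fill pin-holes."""
    opened = dilate(erode(mask, k), k)
    return ~dilate(erode(~opened, k), k)   # closing = complement of opening the complement

# A noisy toy "tool" mask: a bar, an isolated speckle, and a pin-hole.
mask = np.zeros((20, 20), dtype=bool)
mask[8:12, 2:18] = True      # tool region
mask[1, 1] = True            # speckle noise
mask[10, 10] = False         # pin-hole inside the tool

cleaned = clean(mask)
print(cleaned[1, 1], cleaned[10, 10])   # speckle removed, pin-hole filled
```

In practice `scipy.ndimage` provides the same operations (`binary_opening`, `binary_closing`) with far better performance; the loops above just make the mechanics explicit.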

Academic Projects

Using Deep Kernels in Time Varying Networks for Reverse-Engineering of Gene Interaction
(Course: Probabilistic Graphical Models)
We explore the performance of deep kernels on graphical models, which combine the structural properties of deep learning architectures with the non-parametric flexibility of kernels. In our project, we aim to reverse-engineer gene interactions. Inspired by KELLER, a pairwise MRF is used to model interactions between genes, and deep kernels constructed from a neural network and an LSTM are used to learn the parameters of the pairwise MRF.
Purcue Playing with Polar Bear (Course: Computer Vision)
Instead of using raw images for Lucas-Kanade tracking, I use Difference-of-Gaussian pyramid features to efficiently track Purcue across two images. I estimate the camera intrinsics and perform a 3D reconstruction of Purcue. By combining the Lucas-Kanade algorithm with random affine warps of the point cloud, my Purcue is now playing with a polar bear in the video.
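The Difference-of-Gaussian pyramid used as the tracking feature can be sketched in plain NumPy: blur, subtract to get a band-pass residual, downsample, repeat. The toy image and parameters are illustrative, not the course code:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflect padding (NumPy only)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    # Blur rows, then columns, with the same 1-D kernel.
    padded = np.pad(img, ((0, 0), (radius, radius)), mode="reflect")
    img = np.stack([np.convolve(row, kernel, mode="valid") for row in padded])
    padded = np.pad(img, ((radius, radius), (0, 0)), mode="reflect")
    return np.stack([np.convolve(col, kernel, mode="valid")
                     for col in padded.T]).T

def dog_pyramid(img, levels=3, sigma=1.0):
    """Difference-of-Gaussian pyramid: blur, subtract, downsample by 2."""
    dogs = []
    for _ in range(levels):
        blurred = gaussian_blur(img, sigma)
        dogs.append(img - blurred)          # band-pass residual at this scale
        img = blurred[::2, ::2]             # next octave
    return dogs

img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0                     # toy image: a bright square
pyr = dog_pyramid(img)
print([d.shape for d in pyr])               # [(64, 64), (32, 32), (16, 16)]
```

Running Lucas-Kanade on these band-pass levels, coarse to fine, keeps the alignment robust to lighting changes and large motions that would defeat raw-intensity tracking.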
Navigation in Wild World (Course: Deep RL and Control)
We apply Double DQN, Dueling DQN, A2C, A3C, and DDPG to a grid-world navigation problem with various reward-shaping techniques and different inputs. Although scent perception alone yields good performance for Double DQN and Dueling DQN, adding visual perception refines the prediction. A2C, A3C, and DDPG are not well suited to this discrete grid-world setting.


Conference & arXiv
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey. Trading Positional Complexity vs. Deepness in Coordinate Networks. ECCV 2022
Chaoyang Wang, Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey. Neural Prior for Trajectory Estimation. CVPR 2022
Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey. Neural Scene Flow Prior. NeurIPS 2021 (spotlight)
Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey. PointNetLK Revisited. CVPR 2021 (oral)
Vinit Sarode*, Xueqian Li*, Hunter Goforth, Yasuhiro Aoki, Animesh Dhagat, Arun R. Srivatsan, Simon Lucey, Howie Choset. One Framework to Register Them All: PointNet Encoding for Point Cloud Alignment. arXiv preprint arXiv:1912.05766. (* equal contribution)
Vinit Sarode*, Xueqian Li*, Hunter Goforth, Yasuhiro Aoki, Arun R. Srivatsan, Simon Lucey, Howie Choset. PCRNet: Point Cloud Registration Network using PointNet Encoding. arXiv preprint arXiv:1908.07906. (* equal contribution)
Peng Yin, Lingyun Xu, Xueqian Li, Yin Chen, Yingli Li, R. Arun Srivatsan, Lu Li, Jianmin Ji, and Yuqing He. A Multi-Domain Feature Learning Method for Visual Place Recognition. ICRA 2019
Peng Yin, R. Arun Srivatsan, Yin Chen, Xueqian Li, Hongda Zhang, Lingyun Xu, Lu Li, Zhenzhong Jia, Jianmin Ji, and Yuqing He. MRS-VPR: A Multi-resolution Sampling-based Global Visual Place Recognition Method. ICRA 2019