Allen Tu

atu1@umd.edu



About Me

I am a PhD student in Computer Science at the University of Maryland, College Park (UMD), advised by Professor Tom Goldstein. My research spans deep learning, computer vision, and graphics, with an emphasis on 3D scene reconstruction, multimodal biometric recognition, and generative priors for robust vision systems. I have published in top machine learning conferences, contributing compression and explainability methods that enable large-scale recognition and 3D reconstruction in real-world settings. I also collaborate with academic and industry partners on problems spanning vision, language, and generative modeling. I completed my BS/MS in Computer Science at UMD in 2024.

I was a Computer Vision Research Intern at Systems & Technology Research (STR), focusing on multimodal biometric recognition, and a Peer Research Mentor in the FIRE: Capital One Machine Learning program at UMD, where I mentored over 80 undergraduates in their first research experiences. These roles deepened my interest in building machine learning systems that are both efficient and trustworthy. A recurring theme in my work is designing methods that not only achieve high accuracy, but also scale effectively and provide interpretable signals for decision-making. My goal is to advance AI that can be deployed in real-world applications as systems that are safe, reliable, and beneficial to society.

News


Research Highlights


Boost DeformableGS rendering speed from 20 to 276 FPS using temporal sensitivity pruning and groupwise SE(3) motion distillation, all while preserving the superior image quality of per-Gaussian neural motion.




Enhance 3D Gaussian Splatting by selectively injecting super-resolution only where high-frequency detail is missing, yielding sharper results and improved perceptual quality without introducing multi-view inconsistencies.




Redefine template-based recognition through encoder-grounded recognizability prediction that learns directly from embedding geometry via class-center similarity and angular separation. This enables principled filtering, calibrated weighting, and cross-modal explainability that surpass prior face image quality assessment (FIQA) methods in accuracy, interpretability, and generality.




Accelerate 3D Gaussian Splatting rendering by over 6× and reduce model size by over 90% by accurately localizing primitives during rasterization and pruning the scene during training, providing a significantly higher speedup than existing techniques while maintaining competitive image quality.




Prune 90% of primitives from any pretrained 3D Gaussian Splatting model using a mathematically principled sensitivity score, more than tripling rendering speed while retaining more salient foreground information and higher visual fidelity than previous techniques at a substantially higher compression ratio.




Unconstrained 3D reconstruction and novel view synthesis in challenging real-world environments.
  • Efficient rendering, compression, and training for 3D Gaussian Splatting (3DGS) [1, 2, 3]
  • Multi-view consistent super-resolution for training 3DGS with low-resolution imagery [4]
  • Image, video, and multiview diffusion model priors for sparse-view 3D reconstruction
  • Uncertainty quantification algorithms for 3DGS and Neural Radiance Fields (NeRF)



Multimodal fusion of incomplete face, body, and gait information in severe operational conditions.

Experience


University of Maryland Institute for Advanced Computer Studies
Graduate Research Assistant: August 2023 — Present
Tom Lab

Systems & Technology Research
Computer Vision Research Intern: May 2022 — Present
Video and Image Understanding Group

University of Maryland Department of Computer Science
Undergraduate Researcher: January 2021 — December 2022

The First-Year Innovation and Research Experience
Peer Research Mentor: January 2021 — December 2022
Capital One Machine Learning

nCino, Inc.
Software Engineering Intern: June 2021 — August 2021
Data Integrations