Allen Tu

atu1@umd.edu



About Me

I am a PhD student in Computer Science at the University of Maryland, College Park (UMD), advised by Professor Tom Goldstein. My research spans deep learning, computer vision, and graphics, with emphasis on 3D scene reconstruction, multimodal biometric recognition, and generative priors for robust vision systems. I have published in top machine learning conferences, contributing compression and explainability methods that enable large-scale recognition and 3D reconstruction in real-world settings. I also collaborate with academic and industry partners on problems in vision, language, and generative modeling. I completed my BS/MS in Computer Science at UMD in 2024.

I was a Computer Vision Research Intern at Systems & Technology Research (STR), focusing on multimodal biometric recognition, and a Peer Research Mentor in the Capital One Machine Learning FIRE program at UMD, where I mentored over 80 undergraduates in their first research experiences. These roles deepened my interest in building machine learning systems that are both efficient and trustworthy. A recurring theme in my work is designing methods that not only achieve high accuracy but also scale effectively and provide interpretable signals for decision making. My goal is to advance AI that can be deployed in real-world applications as safe, reliable systems that benefit society.

Research Highlights

Redefine template-based recognition through encoder-grounded recognizability prediction that learns directly from embedding geometry via class-center similarity and angular separation. This enables principled filtering, calibrated weighting, and cross-modal explainability that surpass prior face image quality assessment (FIQA) methods in accuracy, interpretability, and generality.




Achieve 10.38× faster rendering, 7.71× smaller model size, and 2.71× shorter training time for dynamic 3D Gaussian Splatting representations through temporal sensitivity pruning and flow-based Gaussian grouping while maintaining visual quality.




Accelerate 3D Gaussian Splatting rendering by over 6× and reduce model size by over 90% by accurately localizing primitives during rasterization and pruning the scene during training, providing a significantly higher speedup than existing techniques while maintaining competitive image quality.




Prune 90% of primitives from any pretrained 3D Gaussian Splatting model using a mathematically principled sensitivity score, more than tripling rendering speed while retaining more salient foreground information and higher visual fidelity than previous techniques at a substantially higher compression ratio.



Unconstrained 3D reconstruction and rendering of real-world environments.
  • Fast rendering, extreme compression, and rapid training for 3D Gaussian Splatting (3DGS).
  • Image, video, and multiview diffusion model priors for sparse-view 3D reconstruction.
  • Uncertainty quantification algorithms for 3DGS and Neural Radiance Fields (NeRF).


Multimodal fusion of face, body, and gait recognition in severe real-world scenarios.
  • Transfer learning for face image recognizability assessment. (2025)
  • Feature clustering and aggregation for face recognition from videos. (2024)
  • Operating condition-invariant encoders and multimodal ensembling. (2023)
  • Data augmentation via garment transfer for training clothing-invariant body encoders. (2022)

Experience


University of Maryland Institute for Advanced Computer Studies (UMIACS)
Graduate Research Assistant: August 2023 — Present
Tom Lab

Systems & Technology Research (STR)
Computer Vision Research Intern: May 2022 — Present
Video and Image Understanding Group (Intelligence Division)

The First-Year Innovation and Research Experience (FIRE)
Peer Research Mentor: January 2021 — December 2022
Capital One Machine Learning

nCino, Inc.
Software Engineering Intern: June 2021 — August 2021
Data Integrations