My research focuses on developing deep learning methods for understanding, modeling, and generating digital humans, spanning self-supervised learning, generative modeling, large language models, multimodal learning, and video diffusion.
Panoramic Video Generation using Diffusion Models for 360° Immersive Content
Mentors: Konrad Tollmar, Han Liu, Hau Nghiep Phan, Ray Phan (SEED Electronic Arts) — Summer 2025
- Built a scalable panoramic video data collection pipeline providing panoramic depth, multi-view video captions, and LLM-based scene understanding.
- Fine-tuned state-of-the-art diffusion models with text, reference-image style, and depth conditioning, using circular padding for seamless 360° content generation.
Example-Based Motion Synthesis and Style Transfer Using Skeleton-Aware Dynamic Motion Segmentation and Matching
Mentors: Paul Debevec, Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU) — Fall 2023 - Spring 2025
- Collected motion capture data from beginner, amateur, and professional performers in boxing and salsa dancing scenarios.
- Proposed a training-free Skeleton-Aware Dynamic Motion Segmentation and Matching method that synthesizes unlimited similar samples for crowds and transfers style between performance levels.
Stylistic Co-speech Gesture Generation from Speech
Mentors: Paul Debevec, Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU) — Fall 2023 - Spring 2025
- Collected a high-quality dataset of facial and body gestures, along with audio, across various emotional scenarios.
- Developed a novel method to generate co-speech gestures aligned with the speaker's style, speech content, and emotional state.
MotionScript: Natural Language Descriptions for Expressive 3D Human Motions
Mentors: Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU), Li Cheng (UAlberta) — Summer 2023 - Winter 2024
- Developed an algorithm that converts 3D motions into detailed natural language descriptions, bridging the gap between motion and large language models (LLMs) for fine-grained motion generation and editing.
Human Navigational Intent Inference using a Multimodal Framework with Quantized Latent Diffusion Model
Mentors: Angelica Lim, Mo Chen (SFU) — Spring 2023
- Inferred human navigational intent by modeling navigational behaviour as a function of position, heading, and movement speed within a multimodal framework using a quantized latent diffusion model.
Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation
Mentors: Angelica Lim, Mo Chen (SFU) — Summer 2021 - Spring 2022
- Clustered multi-dimensional gesture time series using a Vector Quantized Variational Autoencoder (VQ-VAE) to improve human-robot interaction based on speech context, speaker style, task, and emotional state.
Benchmarking Co-speech Gesture Generation Models
Mentors: Angelica Lim, Mo Chen, Angel Chang (SFU) — Summer 2020 - Summer 2021
- Implemented several sequential generative models (e.g., LSTM, GRU, TCN, and GAN) for co-speech gesture generation.
- Clustered gesture sequences using various methods, including KNN, normalizing flows, and VQ-VAE.
- Performed data-driven co-speech gesture generation using a body-language dataset collected in the wild from the YouTube TED channel.
Facial Expression Analysis for Confusion Detection
Mentors: Mo Chen, Angelica Lim (SFU) — Fall 2019
- Developed a facial expression-based method for detecting confusion during human-robot interactions on Pepper, a humanoid robot.
Voice Understanding Performance in Patients with Parkinson's Disease (PD)
Mentors: Zahra Zargol Moradi (Oxford University), Hadi Moradi (University of Tehran) — Fall 2018
- Developed a mobile application to collect data on emotional voice understanding in patients with PD among the retired community in Oxford.
Automatically Deriving User Mood Model Based on Visual Emotional Features
Mentors: Javad Hatami, Hadi Moradi (University of Tehran, Iran) — Fall 2014 - Summer 2017
- Designed a mood-detection interface based on facial expressions, embedded in Telegram Messenger.
- Detected and reported a person's mood state to their partner to prepare them for upcoming encounters.