Research

My research focuses on developing deep learning methods for understanding, modeling, and generating digital humans, spanning self-supervised learning, generative modeling, large language models, multimodal learning, and video diffusion.

Panoramic Video Generation using Diffusion Models for 360° Immersive Content

Mentors: Konrad Tollmar, Han Liu, Hau Nghiep Phan, Ray Phan (SEED Electronic Arts) — Summer 2025

  • Built a scalable panoramic video data collection pipeline with panoramic depth, multi-view video captions, and LLM-based scene understanding. Fine-tuned state-of-the-art video diffusion models with text, reference-image style, and depth conditioning, using circular padding for seamless 360° content generation.
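
As an illustration of the circular-padding idea, the sketch below (mine, not the project's actual model code) wraps an equirectangular frame horizontally before a convolution so the left and right borders stay continuous:

```python
import torch
import torch.nn.functional as F

def circular_pad_conv(x, conv, pad=1):
    """Apply a padding-free conv with circular padding along the panorama width.

    x: (B, C, H, W) frame or latent tensor. Wrapping the width dimension makes
    the left and right edges of an equirectangular frame continuous, so the
    generated 360° content stitches without a visible seam.
    """
    x = F.pad(x, (pad, pad, 0, 0), mode="circular")   # wrap horizontally
    x = F.pad(x, (0, 0, pad, pad), mode="constant")   # zero-pad vertically
    return conv(x)

# Hypothetical usage with a conv layer that does no padding of its own.
conv = torch.nn.Conv2d(4, 4, kernel_size=3, padding=0)
latents = torch.randn(1, 4, 64, 128)
out = circular_pad_conv(latents, conv)   # same spatial size as the input
```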

Example-Based Motion Synthesis and Style Transfer Using Skeleton-Aware Dynamic Motion Segmentation and Matching

Mentors: Paul Debevec, Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU) — Fall 2023 - Spring 2025

  • Collected motion capture data at beginner, amateur, and professional levels in boxing and salsa dancing scenarios, and proposed a training-free Skeleton-Aware Dynamic Motion Segmentation and Matching method that synthesizes unlimited similar samples for crowd animation and transfers style between performance levels.
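
A minimal sketch of the segment-and-match idea (illustrative only; the velocity-minimum segmentation heuristic and mean-pose matching below are my assumptions, not the exact published algorithm):

```python
import numpy as np

def segment_motion(joints, min_len=10):
    """Split a motion clip (T, J, 3) at low-velocity frames (rough pose pauses)."""
    vel = np.linalg.norm(np.diff(joints, axis=0), axis=(1, 2))  # per-frame speed
    cuts = [0]
    for t in range(1, len(vel) - 1):
        if vel[t] < vel[t - 1] and vel[t] < vel[t + 1] and t - cuts[-1] >= min_len:
            cuts.append(t)
    cuts.append(len(joints))
    return [joints[a:b] for a, b in zip(cuts[:-1], cuts[1:])]

def match_segments(src_segs, ref_segs):
    """Pair each source segment with the reference segment of closest mean pose."""
    ref_means = [s.mean(axis=0) for s in ref_segs]
    pairs = []
    for seg in src_segs:
        dists = [np.linalg.norm(seg.mean(axis=0) - m) for m in ref_means]
        pairs.append((seg, ref_segs[int(np.argmin(dists))]))
    return pairs
```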

Stylistic Co-speech Gesture Generation from Speech

Mentors: Paul Debevec, Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU) — Fall 2023 - Spring 2025

  • Collected a high-quality dataset of facial and body gestures, along with audio, across various emotional scenarios, and developed a novel method to generate co-speech gestures aligned with the speaker's style, speech content, and emotional state.

MotionScript: Natural Language Descriptions for Expressive 3D Human Motions

Mentors: Jim Su, Ahmet Tasel (Netflix Eyeline Studios), Angelica Lim (SFU), Li Cheng (UAlberta) — Summer 2023 - Winter 2024

  • Developed an algorithm converting 3D motions into detailed natural language descriptions, bridging the gap between motion and Large Language Models (LLM) for fine-grained motion generation and editing.
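
A toy illustration of the motion-to-text idea (a hand-rolled heuristic with made-up joint indices and thresholds, not the released MotionScript rules):

```python
import numpy as np

def describe_motion(joints, fps=30):
    """Emit a coarse natural-language description from joint positions (T, J, 3)."""
    PELVIS, HEAD, L_WRIST, R_WRIST = 0, 15, 20, 21   # illustrative joint indices
    phrases = []

    # Overall translation speed of the body root.
    speed = np.linalg.norm(np.diff(joints[:, PELVIS], axis=0), axis=1).mean() * fps
    phrases.append("the person moves quickly" if speed > 1.0 else "the person moves slowly")

    # Templated phrases triggered by simple joint-height statistics.
    if (joints[:, R_WRIST, 1] > joints[:, HEAD, 1]).mean() > 0.5:
        phrases.append("raising the right hand above the head")
    if (joints[:, L_WRIST, 1] > joints[:, HEAD, 1]).mean() > 0.5:
        phrases.append("raising the left hand above the head")

    return ", ".join(phrases) + "."
```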

Human Navigational Intent Inference using a Multimodal Framework with a Quantized Latent Diffusion Model

Mentors: Angelica Lim, Mo Chen (SFU) — Spring 2023

  • Inferred human navigational intent by modeling pedestrians' navigational behaviour as a function of their position, heading, and movement speed.

Gesture2Vec: Clustering Gestures using Representation Learning Methods for Co-speech Gesture Generation

Mentors: Angelica Lim, Mo Chen (SFU) — Summer 2021 - Spring 2022

  • Clustered multi-dimensional gesture time series using a Vector Quantized Variational Autoencoder (VQ-VAE) to improve human-robot interaction conditioned on speech context, speaker style, task, and emotional state.
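
The core of this approach is the vector-quantization bottleneck below, which turns gesture latents into discrete cluster IDs (a minimal sketch; the actual Gesture2Vec encoder, decoder, and training setup are omitted):

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Map encoder outputs to their nearest codebook entries (VQ-VAE bottleneck)."""

    def __init__(self, num_codes=512, dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z):                                 # z: (B, T, dim) gesture latents
        flat = z.reshape(-1, z.shape[-1])
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)   # cluster id per latent
        z_q = self.codebook(idx).view_as(z)
        # Codebook + commitment losses; straight-through estimator for gradients.
        loss = ((z_q - z.detach()) ** 2).mean() + self.beta * ((z_q.detach() - z) ** 2).mean()
        z_q = z + (z_q - z).detach()
        return z_q, idx.view(z.shape[:-1]), loss
```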

Benchmarking Co-speech Gesture Generation Models

Mentors: Angelica Lim, Mo Chen, Angel Chang (SFU) — Summer 2020 - Summer 2021

  • Implemented several sequential generative models (e.g., LSTM, GRU, TCN, and GAN) for co-speech gesture generation; a minimal baseline sketch follows this list.
  • Clustered gesture sequences using various methods, including k-NN, normalizing flows, and VQ-VAE.
  • Trained data-driven co-speech gesture generation models on a body-language dataset collected in the wild from the YouTube TED channel.
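
As a flavor of the benchmarked sequence models, a minimal recurrent baseline mapping per-frame audio features to poses might look like the following (feature sizes and hyperparameters are placeholders, not the benchmark's settings):

```python
import torch
import torch.nn as nn

class GestureGRU(nn.Module):
    """Minimal recurrent baseline: audio features in, per-frame joint poses out."""

    def __init__(self, audio_dim=26, pose_dim=96, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(audio_dim, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, pose_dim)

    def forward(self, audio):                  # audio: (B, T, audio_dim)
        h, _ = self.rnn(audio)
        return self.head(h)                    # (B, T, pose_dim)

model = GestureGRU()
poses = model(torch.randn(2, 120, 26))         # 2 clips, 120 frames of MFCC-like features
```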

Facial Expression Analysis for Confusion Detection

Mentors: Mo Chen, Angelica Lim (SFU) — Fall 2019

  • Developed a facial expression-based method for detecting confusion during human-robot interactions on Pepper, a humanoid robot.

Voice Understanding Performance in Patients with Parkinson's Disease (PD)

Mentors: Zahra Zargol Moradi (Oxford University), Hadi Moradi (University of Tehran) — Fall 2018

  • Developed a mobile application to collect data on emotional voice understanding in patients with PD within a retirement community in Oxford.

Automatically Deriving User Mood Model Based on Visual Emotional Features

Mentors: Javad Hatami, Hadi Moradi (University of Tehran, Iran) — Fall 2014 - Summer 2017

  • Designed a facial expression-based mood detection interface embedded in Telegram Messenger.
  • Detected and reported a user's mood state to their partner, preparing them for upcoming encounters.