News!

2026.02.24: Our work has been accepted by the Conference on Computer Vision and Pattern Recognition (CVPR 2026).

2025.10.24: Our work has been presented at the International Conference on Advanced Data Mining and Applications (ADMA 2025) in Kyoto, Japan.

2024.11.01: Our work has been presented at the ACM International Conference on Multimedia (ACMMM 2024) in Melbourne, Australia.

2024.09.04: Our work has been presented at the International Conference on Artificial Intelligence in Healthcare (AIiH 2024) in Swansea, UK.

2023.02.10: Our work has been presented at the AAAI Conference on Artificial Intelligence (AAAI 2023) in Washington, D.C., USA.

“Discover Excellence.” -The University of Tokyo Slogan

Resume

Project Assistant Professor

Department of Information and Communication Engineering, The University of Tokyo

04/2025 – Present

Postdoctoral Researcher

Department of Information and Communication Engineering, The University of Tokyo

04/2023 – 03/2025

Ph.D. Degree in Information Science

Department of Information and Communication Engineering, The University of Tokyo

Thesis title: Presentation Skill Assessment Systems using Deep Neural Networks

04/2020 – 03/2023

Master’s Degree in Information Science

Department of Information and Communication Engineering, The University of Tokyo

Thesis title: Impression Analysis on Presentations using Attention-based LSTM

04/2018 – 03/2020

Japanese Language School

Fuji International Language Institute

07/2016 – 03/2018

Bachelor’s Degree in Engineering

Software Engineering, Sun Yat-sen University

Thesis title: Plant Recognition Based on GoogleNet

10/2012 – 07/2016

Research

Research experience in large language models, multimodal learning, and deep learning for human communication analysis.

@IEICE TRANSACTIONS on Information and Systems 2022
Assessment System of Presentation Slide Design using Visual and Structural Features

@BigMM19
Impression Prediction of Oral Presentation using LSTM and Dot-product Attention Mechanism

@MIRU2021
Class-Balanced Contrastive Pre-Training for Improving Long-Tailed Recognition

Experiences

Experienced in collaborative research with companies and research institutes, and in working with other engineers.

Collaborative research with Talent and Assessment Inc.

Research Engineer

02/2021 - Present

Evaluation system for online interviews using a multimodal Transformer.
Accepted by international conference: the International Conference on Advanced Data Mining and Applications (ADMA 2025).

Collaborative research with Rubato Co., Ltd.

Research Engineer

04/2019 - Present

Design assessment system of presentation slides using visual and structural features.
Published in journal: IEICE TRANSACTIONS on Information and Systems.

Value Exchange Engineering RA.

Research assistant

04/2021 - 03/2023

“Value Exchange Engineering”, a joint research project launched by R4D (the research and development organization of Mercari Inc.) and RIISE (Research Institute for an Inclusive Society through Engineering).

IST-RA.

Research assistant

04/2021 - 03/2023

The University of Tokyo Graduate School of Information Science and Technology Doctoral Student Special Incentives Program (IST-RA).

Collaborative research with Keio University.

Research Engineer

11/2021 - 03/2022

Topic and user satisfaction analysis of a remote dialogue system for mental health interventions.
Presented at international conference: the International Conference on Artificial Intelligence in Healthcare (AIiH 2024).

Collaborative research with PRAP Japan, Inc.

Research Engineer

10/2018 - 09/2019

Evaluation system for conference speakers using an attention-based LSTM.
Published in journal: International Journal of Multimedia Data Engineering and Management.

Publications

(1) Journals

1. Shengzhou Yi, Junichiro Matsugami, and Toshihiko Yamasaki. Assessment System of Presentation Slide Design using Visual and Structural Features. IEICE TRANSACTIONS on Information and Systems, vol. E105-D, no. 3, pp. 587-596, 2022. [URL]

2. Shengzhou Yi, Koshiro Mochitomi, Isao Suzuki, Xueting Wang, and Toshihiko Yamasaki. Attention-based Multimodal Neural Network for Automatic Evaluation of Press Conferences. International Journal of Multimedia Data Engineering and Management, vol. 11, issue 3, pp. 1-19, 2020. [URL]

(2) International Conferences

1. Chen Fu, Shengzhou Yi, Ling Xiao, and Toshihiko Yamasaki. LLM Guided Multi Style Typography and Layout Generation via Dynamic Direct Preference Optimization. The Conference on Computer Vision and Pattern Recognition (CVPR 2026 Findings), Jun. 3-7, 2026, Denver, USA. (accepted)

2. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. Adaptive Multimodal Transformer for Personality Trait Assessment in Online Job Interviews. International Conference on Advanced Data Mining and Applications (ADMA 2025), Oct. 22-24, 2025, Kyoto, Japan. [URL]

3. Shengzhou Yi, Junichiro Matsugami, Takuya Yamamoto, and Toshihiko Yamasaki. Enhancing Speaking and Slide Design Skills with Deep Learning: An Online Presentation Assessment System. The 32nd ACM International Conference on Multimedia (ACMMM 2024), Demo. Oct. 28-Nov. 1, 2024, Melbourne, Australia. [URL]

4. Shengzhou Yi, Toshiaki Kikuchi, and Toshihiko Yamasaki. Conversation Analysis of Remote Dialogue System for Mental Health Interventions. In International Conference on Artificial Intelligence in Healthcare (AIiH 2024), Sep. 4-6, 2024, Swansea, the U.K. [URL]

5. Shengzhou Yi, Junichiro Matsugami, Hiroshi Yumoto, and Toshihiko Yamasaki. An Online Presentation Slide Assessment System Using Visual and Semantic Segmentation Features. In 37th AAAI Conference on Artificial Intelligence (AAAI 2023), Demo, pp. 16494-16496. Feb. 7-14, 2023, Washington D.C., USA. [URL]

6. Shengzhou Yi, Koshiro Mochitomi, Isao Suzuki, Xueting Wang, and Toshihiko Yamasaki. Attention-based LSTM for Automatic Evaluation of Press Conferences. In IEEE 3rd International Conference on Multimedia Information Processing and Retrieval (MIPR 2020), pp. 187-192, Aug. 6-8, 2020, Shenzhen, China. [URL]

7. Shengzhou Yi, Hiroshi Yumoto, Xueting Wang, and Toshihiko Yamasaki. PresentationTrainer: Oral Presentation Support System for Impression-related Feedback. In 34th AAAI Conference on Artificial Intelligence (AAAI 2020), Demo, pp. 13644-13645. Feb. 7-12, 2020, New York, USA. [URL]

8. Shengzhou Yi, Xueting Wang, and Toshihiko Yamasaki. Emotion and Theme Recognition of Music Using Convolutional Neural Networks. In MediaEval Benchmark Workshop (MediaEval 2019), Oct. 27-29, 2019, Sophia Antipolis, France.

9. Shengzhou Yi, Xueting Wang, and Toshihiko Yamasaki. Impression Prediction of Oral Presentation using LSTM and Dot-product Attention Mechanism. In IEEE 5th International Conference on Multimedia Big Data (BigMM 2019), pp. 242-246, Sep. 11-13, 2019, Singapore. [URL]

(3) Domestic Conferences and Symposia

1. Yuta Takatsuji, Tatsuya Iwanari, Ryosuke Katsuta, Shuntaro Masuda, Shengzhou Yi, and Toshihiko Yamasaki. Automated Improvement of Real Estate Rent Prediction Methods via Multi-Faceted Error Analysis Using Large Language Models. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 8-12, 2026, Takasaki. (submitted)

2. Yuta Takatsuji, Tatsuya Iwanari, Ryosuke Katsuta, Shuntaro Masuda, Shengzhou Yi, and Toshihiko Yamasaki. Iterative Evaluation and Improvement Framework for Real Estate Rent Prediction Models Using Large Language Models. Media Experience Virtual Environment (MVE), Mar. 16-18, 2026, Okinawa. (submitted)

3. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. Multistage Rebalancing Adaptive Multimodal Transformer for Personality Trait Assessment in Online Job Interviews. Image Engineering Technical Group (IE), IEICE Technical Report, Feb. 19-20, 2026, Sapporo.

4. Zhihao Shao, Ryo Sekizaki, Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. A Rationale-Guided and Rule-Based LLM Framework for Analyzing Psychological Counseling Dialogues. Image Engineering Technical Group (IE), IEICE Technical Report, Feb. 19-20, 2026, Sapporo.

5. Chen Fu, Naoto Tanji, Gakumatsu Ryu, Hiroyuki Seshime, Shengzhou Yi, Ling Xiao, and Toshihiko Yamasaki. Content-Aware Layout Generation with Large Language Models. Meeting on Image Recognition and Understanding (MIRU), Jul. 29-Aug. 1, 2025, Kyoto.

6. Chi Zhang, Ryo Sekizaki, Shengzhou Yi, and Toshihiko Yamasaki. Automatic Psychological Counseling Classification Using BERT-Based Models and LLM-Driven Data Augmentation. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), May 27-30, 2025, Kumamoto.

7. Chen Fu, Yuesong Liu, Naoto Tanji, Hiroyuki Seshime, Shengzhou Yi, Ling Xiao, and Toshihiko Yamasaki. Constrained Advertisement Layout Generation based on Graph Neural Networks. Meeting on Image Recognition and Understanding (MIRU), Aug. 6-9, 2024, Kumamoto.

8. Tomoya Sugihara, Shuntaro Masuda, Shengzhou Yi, and Toshihiko Yamasaki. Price Prediction of Handmade Items using Multimodal Data. Image Engineering Technical Group (IE), IEICE Technical Report, Jun. 6-7, 2024, Niigata.

9. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. Enhancing Online Structured Job Interviews: A Comprehensive Personality Assessment Using Multimodal Neural Networks. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), May 28-31, 2024, Hamamatsu.

10. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. Online Structured Job Interview Assessment Using Multimodal Transformer and Prompt Learning. Media Experience Virtual Environment (MVE), vol. 123, no. 228, pp. 34-39, Oct. 26-27, 2023, Muroran.

11. Shengzhou Yi, Junichiro Matsugami, Takuya Yamamoto, Yukiyoshi Katsumizu, and Toshihiko Yamasaki. Online Presentation Skill Training Systems Using Multi-Modal Neural Network. Meeting on Image Recognition and Understanding (MIRU), Demo-10, Jul. 25-28, 2023, Hamamatsu.

12. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. An Assessment System of Online Structured Job Interviews Supported by Multi-Modal Deep Learning. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 6-9, 2023, Kumamoto.

13. Shengzhou Yi, Junichiro Matsugami, Takuya Yamamoto, Yukiyoshi Katsumizu, and Toshihiko Yamasaki. A Presentation Training System Based on Multi-modal Neural Networks. Annual Conference of the Japanese Society for Artificial Intelligence (JSAI), Jun. 6-9, 2023, Kumamoto.

14. Shengzhou Yi, Toshiaki Yamasaki, and Toshihiko Yamasaki. Assessment System of Remote Structured Interview using Bimodal Neural Network. Image Engineering Technical Group (IE), IEICE Technical Report, Feb. 21-22, 2023, Sapporo.

15. Shengzhou Yi, Junichiro Matsugami, and Toshihiko Yamasaki. Presentation Slide Assessment System using Visual and Semantic Segmentation Features. Media Experience Virtual Environment (MVE), vol. 122, no. 175, pp. 16-21, Sep. 8-9, 2022, Tokyo.

16. Shengzhou Yi, Junichiro Matsugami, and Toshihiko Yamasaki. Presentation Slide Design Evaluation using Two-Level Vision Transformer. Meeting on Image Recognition and Understanding (MIRU), IS2-88, Jul. 25-28, 2022, Himeji.

17. Shengzhou Yi, Toshiaki Kikuchi, and Toshihiko Yamasaki. User Satisfaction Prediction for Dialogue System in Mental Health Interventions. Image Engineering Technical Group (IE), IEICE Technical Report, vol. 121, no. 374, pp. 1-6, Feb. 21-22, 2022, Sapporo.

18. Shengzhou Yi, Junichiro Matsugami, and Toshihiko Yamasaki. Identifying Design Problems of Presentation Slides using a Bimodal Neural Network. Media Experience Virtual Environment (MVE), IEICE Technical Report, vol. 121, no. 179, pp. 21-26, Sep. 17-28, 2021, Tokyo.

19. Shengzhou Yi, Li Tao, Xueting Wang, and Toshihiko Yamasaki. Class-Balanced Contrastive Pre-Training for Improving Long-Tailed Recognition. Meeting on Image Recognition and Understanding (MIRU). Jul. 27-30, 2021, Nagoya.

20. Shengzhou Yi, Junichiro Matsugami, Xueting Wang, and Toshihiko Yamasaki. Slide Design Assessment Featuring Visual and Structural Analysis. Artificial Intelligence and Knowledge-Based Processing (AI), IEICE Technical Report, vol. 120, no. 281, pp. 13-18, Dec. 10-11, 2020, Hamamatsu.

21. Shengzhou Yi, Takuya Yamamoto, Osamu Yamamoto, Yukiyoshi Katsumizu, Hiroshi Yumoto, Xueting Wang, and Toshihiko Yamasaki. Make Your Presentation Better: Oral Presentation Support System using Linguistic and Acoustic Features. Image Engineering Technical Group (IE), IEICE Technical Report, vol. 119, no. 421, pp. 317-322, Feb. 27-28, 2020, Sapporo.

22. Shengzhou Yi, Xueting Wang, and Toshihiko Yamasaki. CNN-based Music Emotion and Theme Recognition Featuring Shallow Architecture. Media Experience Virtual Environment (MVE), IEICE Technical Report, vol. 119, no. 386, pp. 99-100, Jan. 23-24, 2020, Nara.

23. Shengzhou Yi, Koshiro Mochitomi, Isao Suzuki, Xueting Wang, and Toshihiko Yamasaki. Automatic Evaluation of Press Conferences Using LSTM with Self-Attention Mechanism. Human Communication Group (HCG) Symposium, Dec. 11-13, 2019, Hiroshima.

24. Shengzhou Yi, Xueting Wang, and Toshihiko Yamasaki. Impression Prediction of Oral Presentation Using LSTM with Dot-product Attention Mechanism. Media Experience Virtual Environment (MVE), IEICE Technical Report, vol. 119, no. 75, pp. 1-6, Jun. 10-11, 2019, Tokyo.

25. Shengzhou Yi, Xueting Wang, and Toshihiko Yamasaki. Impression Prediction of Oral Presentation using LSTM and Dot-product Attention Mechanism. Third International Workshop on Symbolic-Neural Learning (SNL-2019), P-16, Jul. 11-12, 2019, Tokyo, Japan.

26. Shengzhou Yi, Toshihiko Yamasaki, Izumi Masumura, Yoshinori Yasui, Takako Misaki, and Nobuhiko Okabe. Prediction of the National Epidemiological Surveillance of Infectious Diseases Using LSTM. Image Media Processing Symposium (IMPS), P-1-11, Nov. 19-21, 2018, Gotemba.

Awards

IDR User Platform Excellence Award and Enterprise Award (IDRユーザフォーラム企業賞、優秀賞), co-author, 2023. [URL]

JSAI Annual Conference Award (人工知能学会全国大会優秀賞), 2023. [URL]

MVE Award (メディアエクスペリエンス・バーチャル環境基礎研究会 MVE賞), 2023. [URL]

Funding

Grant-in-Aid for Early-Career Scientists (若手研究), Japan Society for the Promotion of Science, Principal Investigator, total direct costs ¥3,600K, FY2024 - FY2026 (planned).

Hello, I'm Shengzhou Yi

Activities

ACM Transactions on Multimedia Computing, Communications, and Applications, Reviewer, 2025.

Multimedia Modeling 2025, Reviewer.

ICME 2023/2024, Program Committee Member.

IEICE Transactions on Information and Systems, Reviewer, 2023/2024.

The Journal of Supercomputing, Reviewer, 2024.

ACMMM 2024, Program Committee Member.

IEEE VCIP 2024, Reviewer.

Scientific Reports, Reviewer, 2023.

AAAI 2021/2022, Program Committee Member.

MM Asia 2022, Program Committee Member.

Lecture: Visual Media (映像メディア学) at IST, UTokyo, Spring 2024.

Research Assistant, IST, UTokyo, 04/2021-03/2023.

Research Assistant, RIISE, UTokyo, 04/2021-03/2023.
