Multi-TPC

A Multimodal Dataset for Three-Party Conversations with Speech, Motion, and Gaze


Table of contents

  • Installation
  • Tools
  • Data Capture
  • Data Pre-processing
  • Data Processing
  • Analysis
  • Visualization
  • Author(s)
  • License and copyright
  • Acknowledgements
  • Publication and citation

Installation

  • Requires Python >= 3.8.0
  • Install the dependencies:

pip install -r requirements.txt

Tools

  • ViconIQ — Motion capture and motion data processing
  • D-Lab — Gaze tracking and audio capture
  • Audacity — Audio trimming and channel-level processing
  • Praat — Prosodic feature extraction (pitch and intensity); a scripted sketch follows this list
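
For the prosodic features, a minimal scripted sketch using the praat-parselmouth Python bindings is shown below; the original workflow may instead use the Praat GUI or Praat scripts, and the filename is hypothetical.

import parselmouth

# Load one participant's audio track ("speaker1.wav" is a hypothetical name).
snd = parselmouth.Sound("speaker1.wav")

# Praat's standard pitch and intensity analyses.
pitch = snd.to_pitch()          # fundamental frequency track
intensity = snd.to_intensity()  # intensity track in dB

print(pitch.selected_array["frequency"][:10])  # first ten F0 values (Hz)
print(intensity.values[0][:10])                # first ten intensity values (dB)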

Data Capture

Layout

Synchronization

All modalities are synchronized using a physical clapboard instrumented with motion-capture markers.
The clap event provides a shared temporal reference across motion, gaze, and audio streams.
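
A minimal sketch of clap-based alignment is shown below; it is not the repository's code, and it assumes each stream's clap timestamp has already been located (e.g., from the marker spike in the motion capture or the transient in the audio waveform).

import numpy as np

def align_to_clap(timestamps, clap_time):
    """Shift a stream's timestamps so the clap event lands at t = 0."""
    return np.asarray(timestamps, dtype=float) - clap_time

# Synthetic timestamps for illustration: 120 Hz motion, 60 Hz gaze, 48 kHz audio.
motion_t = align_to_clap(np.arange(0.0, 10.0, 1 / 120), clap_time=2.500)
gaze_t = align_to_clap(np.arange(0.0, 10.0, 1 / 60), clap_time=2.517)
audio_t = align_to_clap(np.arange(0.0, 10.0, 1 / 48000), clap_time=2.481)

# After shifting, t = 0 refers to the same physical clap in every stream.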


Data Pre-processing

ViconIQ

Motion data are processed using ViconIQ, including:

  • Gap interpolation
  • Temporal smoothing

A step-by-step demonstration is available in this video tutorial.
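
ViconIQ performs these operations in its GUI; the following is only a minimal Python sketch of the same two steps on a single marker coordinate, assuming missing frames appear as NaN. It uses linear interpolation for gaps and a Savitzky-Golay filter for smoothing.

import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

# Synthetic 1-D marker coordinate with an occlusion gap, for illustration.
x = pd.Series(np.sin(np.linspace(0, 4 * np.pi, 200)))
x.iloc[80:95] = np.nan

x_filled = x.interpolate(method="linear")  # gap interpolation
x_smooth = savgol_filter(x_filled, window_length=11, polyorder=3)  # temporal smoothing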

D-Lab

Gaze and audio data are exported from D-Lab.
See this video tutorial (release coming soon) for the export workflow.

Audacity

Audio files are processed using Audacity to:

  • Trim recordings
  • Mute other participants’ voices in each individual audio track

A demonstration is available in this video tutorial.
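
Audacity handles these steps interactively; a minimal programmatic sketch of the same operations with the soundfile library is shown below. The filename and segment boundaries are hypothetical.

import soundfile as sf

# Load one participant's track ("participant1.wav" is a hypothetical name).
data, sr = sf.read("participant1.wav")

# Trim: keep only the portion between 5 s and 65 s.
trimmed = data[int(5 * sr):int(65 * sr)]

# Mute: zero out a span where another participant's voice bleeds in.
trimmed[int(10 * sr):int(12 * sr)] = 0

sf.write("participant1_clean.wav", trimmed, sr)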


Data Processing

Motion

For details on motion processing, please refer to this document.

Gaze

For details on gaze processing, please refer to this document.

Audio and Text

For details on audio processing, please refer to this document.

Analysis

  1. Download the dataset and place it inside the Data folder.
  2. Run the example Jupyter notebook.
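
As a hypothetical sketch, one modality could then be loaded from the Data folder as below; the actual file names and formats are defined by the dataset and demonstrated in the example notebook.

import pandas as pd

# "Data/session01_motion.csv" is a hypothetical path used for illustration.
motion = pd.read_csv("Data/session01_motion.csv")
print(motion.head())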

Visualization

Clone the demo repository:

git clone https://github.com/MCMartinLee/Conversation_Demo

Author(s)

Meng-Chen Lee, mlee45 (at) uh.edu

Zhigang Deng

License and copyright

The scripts are licensed under the MIT license.

The related C++ module repository is likewise licensed under the MIT license provided in that repository.

Acknowledgements

This work was supported in part by NSF IIS-2005430. We would like to thank Mai Trinh for her help with data capture in this work. We also want to thank the volunteers who participated in the data collection experiments.

Publication and citation

If you use this work, please cite the data paper available here.
