SigmediaApp

Welcome to the home of the TCD-TIMIT database and the RoomReader Corpus. TCD-TIMIT was released in 2015 to further work on audio-visual speech recognition. The RoomReader Corpus was released in 2022 to explore multimodal, multiparty online conversational interactions and automatic engagement detection.

TCD-TIMIT consists of high-quality audio and video footage of 62 speakers reading a total of 6913 phonetically rich sentences. Three of the speakers are professionally trained lipspeakers, recorded to test the hypothesis that lipspeakers may have an advantage over regular speakers in automatic visual speech recognition systems. Video footage was recorded from two angles: straight on, and at 30 degrees. The paper outlines the recording of the footage and the post-processing required to yield video and audio clips for each sentence. Audio, visual, and joint audio-visual baseline experiments are reported. Separate experiments were run on the lipspeaker and non-lipspeaker data, and the results compared. Visual and audio-visual baseline results on the non-lipspeakers were low overall. Results on the lipspeakers were found to be significantly higher. It is hoped that, as a publicly available database, TCD-TIMIT will help further the state of the art in audio-visual speech recognition research.

RoomReader is a corpus of multimodal, multiparty conversational interactions in which participants followed a collaborative student-tutor scenario designed to elicit spontaneous speech. The corpus was developed within the wider RoomReader Project to explore multimodal cues of conversational engagement and behavioural aspects of collaborative interaction in online environments. However, the corpus can be used to study a wide range of phenomena in online multimodal interaction. The corpus consists of over 8 hours of video and audio recordings from 118 participants in 30 gender-balanced sessions, in the “in-the-wild” online environment of Zoom. The recordings have been edited, synchronised, and fully transcribed. Student participants have been continuously annotated for engagement with a novel continuous scale. We provide questionnaires measuring engagement and group cohesion collected from the annotators, tutors, and participants themselves. We also make a range of accompanying data available such as personality tests and behavioural assessments. The dataset and accompanying psychometrics present a rich resource enabling the exploration of a range of downstream tasks across diverse fields including linguistics and artificial intelligence.

These datasets are freely available for research use. To access the RoomReader corpus, you must electronically sign a non-commercial license agreement. Please reference the following papers in any publication where you use the data:

Harte, N.; Gillen, E., "TCD-TIMIT: An Audio-Visual Corpus of Continuous Speech," IEEE Transactions on Multimedia, vol. 17, no. 5, pp. 603-615, May 2015. doi: 10.1109/TMM.2015.2407694
Reverdy, Justine; O'Connor Russell, Sam; Duquenne, Louise; Garaialde, Diego; Cowan, Benjamin R.; and Harte, Naomi (2022). RoomReader: A Multimodal Corpus of Online Multiparty Conversational Interactions. In Proceedings of the 13th International Conference on Language Resources and Evaluation (LREC 2022), Marseille, France, pp. 2517-2527.

To access the data, you need to create an account. Your account will only be enabled if you supply a valid university email address and properly identify yourself.

Trinity College Dublin, The University of Dublin
