In conversational analysis, humans manually weave multimodal information into transcripts, a process that is highly time-consuming. We introduce a system that automatically expands verbatim transcripts of video-recorded conversations using multimodal data streams. The system applies a set of preprocessing rules to weave multimodal annotations into the verbatim transcripts and to promote interpretability. Our feature-engineering contributions are two-fold: first, we identify the range of multimodal features relevant to detecting rapport-building; second, we expand the range of multimodal annotations and show that this expansion leads to statistically significant improvements in detecting rapport-building.
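To make the weaving step concrete, the sketch below shows one way a preprocessing rule could merge time-aligned multimodal annotations into a verbatim transcript. It is a minimal illustration under assumed structures: the class names, modality labels, and the simple time-overlap rule are hypothetical and do not come from the described system.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    start: float  # seconds
    end: float
    text: str

@dataclass
class Annotation:
    modality: str  # hypothetical modality label, e.g. "smile" or "nod"
    start: float
    end: float
    label: str

def weave(utterances, annotations):
    """Insert bracketed multimodal annotations into the transcript wherever
    an annotation's time span overlaps an utterance (assumed overlap rule)."""
    lines = []
    for utt in utterances:
        overlapping = [a for a in annotations
                       if a.start < utt.end and a.end > utt.start]
        tags = " ".join(f"[{a.modality}:{a.label}]" for a in overlapping)
        lines.append(f"{utt.speaker}: {utt.text} {tags}".strip())
    return "\n".join(lines)

if __name__ == "__main__":
    utts = [Utterance("A", 0.0, 2.1, "That sounds great."),
            Utterance("B", 2.2, 4.0, "I agree, let's do it.")]
    anns = [Annotation("smile", 0.5, 1.8, "high"),
            Annotation("nod", 2.5, 3.0, "yes")]
    print(weave(utts, anns))
```

Running the example prints each utterance followed by the annotations that co-occur with it, which is the kind of expanded transcript a human annotator would otherwise assemble by hand.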