Feature Analysis of MoCap Data for Optimised Sign Language Processing

Proceedings of the LREC 2026 12th Workshop on the Representation and Processing of Sign Languages: Language in Motion

Abstract

Despite rapid advances in AI and their impact on machine translation (MT), sign language (SL) processing and sign language MT (SLMT) face a major bottleneck: the lack of substantial quantities of high-quality signed data suitable for developing SLMT models. Marker-based motion capture (MoCap) is a technique for tracing and recording body movements (including hands and fingers) in 3D space with high precision, and it has been widely used in SL research. MoCap data offers high representational accuracy, making it very suitable for analysing movement patterns and articulatory features. However, it is also very complex: a recording of a single sign may contain more than 240 entries over 156 features, making it difficult to process. In this paper we analyse MoCap data to understand which captured features are of high importance. Subsequently, we optimise the MoCap data representation by reducing the number of features, and assess how this feature-reduced data impacts a sign classification task. We organise MoCap features based on their importance and show that models trained on feature-reduced representations outperform those developed on the complete feature set.
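To make the scale of the data and the idea of feature reduction concrete, the following is a minimal sketch rather than the method used in this paper. It assumes a recording is a table of 240 entries by 156 features (the figures quoted above), fills it with synthetic values, and uses per-feature variance over time as a stand-in importance score. The function names, the synthetic data, the variance criterion, and the cut-off of 32 features are all hypothetical choices made for illustration.

```python
import random
import statistics

N_FRAMES = 240     # entries per recording (figure quoted in the abstract)
N_FEATURES = 156   # captured features per entry (figure quoted in the abstract)


def make_synthetic_recording(n_frames=N_FRAMES, n_features=N_FEATURES, seed=0):
    """Build a fake recording: a list of frames, each a list of feature values.

    Hypothetical data: feature j is Gaussian noise with standard deviation
    (j + 1), so higher-index features vary more by construction.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, j + 1) for j in range(n_features)]
            for _ in range(n_frames)]


def rank_features_by_variance(recording):
    """Return feature indices sorted from highest to lowest variance over time.

    Variance is only a stand-in importance score (an assumption of this
    sketch), not the importance measure used in the paper.
    """
    columns = list(zip(*recording))  # transpose: one tuple per feature
    scores = [statistics.pvariance(col) for col in columns]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def reduce_features(recording, keep):
    """Keep only the selected feature columns, in their original column order."""
    keep_sorted = sorted(keep)
    return [[frame[j] for j in keep_sorted] for frame in recording]


recording = make_synthetic_recording()
ranking = rank_features_by_variance(recording)
reduced = reduce_features(recording, ranking[:32])  # 32 is an arbitrary cut-off
print(len(reduced), len(reduced[0]))
```

In a real setting the importance score would come from the analysis itself, and the reduced table would then be passed to the sign classifier in place of the full 156-feature representation.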