Paper Information
A Video-Based Reverse Dictionary for Sign Language Using Gesture Similarity
Sign language recognition systems are usually modeled as classifiers that map gesture videos to pre-defined glosses. Such systems do not support similarity search, in which a user looks for similar gestures without knowing the corresponding gloss. This paper presents a pose-based video-to-video search framework for isolated signs, which acts as a reverse gesture dictionary. The system operates on skeletal keypoints rather than RGB images. Two architectures are proposed for modeling temporal information: a Transformer encoder with self-attention and a Spatial-Temporal Graph Convolutional Network (ST-GCN). The embedding space is optimized with metric learning objectives, including supervised contrastive learning and the ArcFace angular margin loss. Retrieval performance is evaluated on the WLASL dataset using the ranking metrics Recall@K and mean Average Precision (mAP). Experiments show that Transformer-based temporal modeling outperforms the graph-based approach in the low-shot learning scenario. Attention-based temporal pooling further improves ranking quality, with the best-performing model achieving an mAP of 0.237 on the WLASL validation set. Cross-dataset evaluation on the 226-label AUTSL dataset shows non-trivial generalization to this unseen dataset, despite training only on WLASL.
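To make the ranking metrics named in the abstract concrete, here is a minimal sketch of how Recall@K and mAP could be computed for an embedding-based retrieval system. It is not the authors' evaluation code. The function names, the use of cosine similarity, and the toy embeddings and labels are all assumptions made for illustration; a real evaluation would also need to decide whether a query video is excluded from its own gallery.

```python
import numpy as np

def cosine_similarity_matrix(queries, gallery):
    """Cosine similarity between every query and every gallery embedding."""
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return q @ g.T

def recall_at_k(sim, query_labels, gallery_labels, k):
    """Fraction of queries with at least one same-label item in the top k."""
    topk = np.argsort(-sim, axis=1)[:, :k]
    hits = (gallery_labels[topk] == query_labels[:, None]).any(axis=1)
    return float(hits.mean())

def mean_average_precision(sim, query_labels, gallery_labels):
    """Mean over queries of the average precision of the ranked gallery."""
    order = np.argsort(-sim, axis=1)
    aps = []
    for i in range(sim.shape[0]):
        relevant = gallery_labels[order[i]] == query_labels[i]
        n_rel = int(relevant.sum())
        if n_rel == 0:
            continue  # skip queries with no relevant gallery item
        ranks = np.flatnonzero(relevant) + 1          # 1-based ranks of hits
        precisions = np.arange(1, n_rel + 1) / ranks  # precision at each hit
        aps.append(precisions.mean())
    return float(np.mean(aps)) if aps else 0.0

# Toy 2-D embeddings (hypothetical); labels stand in for glosses.
gallery = np.array([[1.0, 0.0], [0.6, 0.8], [0.0, 1.0], [0.8, 0.6]])
gallery_labels = np.array([0, 0, 1, 1])
queries = np.array([[1.0, 0.0], [0.6, 0.8]])
query_labels = np.array([0, 1])

sim = cosine_similarity_matrix(queries, gallery)
print("Recall@1:", recall_at_k(sim, query_labels, gallery_labels, k=1))
print("Recall@2:", recall_at_k(sim, query_labels, gallery_labels, k=2))
print("mAP:", mean_average_precision(sim, query_labels, gallery_labels))
```

In this toy setup the second query's nearest gallery item carries a different label, so it counts as a miss at K=1 but a hit at K=2. mAP additionally rewards placing every same-label item near the top of the ranking, which is why it is usually the stricter of the two metrics.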