Evaluating Multilingual Language Models for Cross-Lingual ESG Issue Identification

Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing

DOI:10.63317/4ify52wiiomk

Abstract

The automation of information extraction from ESG reports has recently become a topic of increasing interest in the Natural Language Processing community. While such information is highly relevant for socially responsible investments, identifying the specific issues discussed in a corporate social responsibility report is one of the first steps in an information extraction pipeline. In this paper, we evaluate methods for tackling the Multilingual Environmental, Social and Governance (ESG) Issue Identification Task. Our experiments use existing datasets in English, French and Chinese with a unified label set. Leveraging multilingual language models, we compare two approaches commonly adopted for this task: using the models off the shelf and fine-tuning them. We show that fine-tuning models end-to-end is more robust than off-the-shelf methods. Additionally, translating all texts into a single language yields negligible performance benefits.

Details

Paper ID

lrec2024-ws-finnlp-06

Pages

pp. 50-58

BibKey

li-etal-2024-evaluating-multilingual

Editors

Chung-Chi Chen, Xiaomo Liu, Udo Hahn, Armineh Nourbakhsh, Zhiqiang Ma, Charese Smiley, Veronique Hoste, Sanjiv Ranjan Das, Manling Li, Mohammad Ghassemi, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen

Publisher

European Language Resources Association (ELRA) and ICCL

Workshop

Location

Turin, Italy

Date

20 - 25 May 2024

Authors

Wing Yan Li
Emmanuele Chersoni
Cindy Sing Bik Ngai
