Article

Monolingual and Cross-Lingual Intent Detection without Training Data in Target Languages

Journal

ELECTRONICS
Volume 10, Issue 12, Pages: -

Publisher

MDPI
DOI: 10.3390/electronics10121412

Keywords

BERT; word and sentence transformers; monolingual and cross-lingual experiments; EN; DE; FR; LT; LV; PT languages

Funding

  1. European Regional Development Fund within the joint project of SIA TILDE
  2. University of Latvia Multilingual Artificial Intelligence Based Human Computer Interaction [1.1.1.1/18/A/148]
  3. European Union [825081]


This research experimentally addresses the intent detection problem in five target languages using an English dataset. By employing various models and methods, the authors overcame the data scarcity issue and demonstrated the robustness of sentence transformers under different cross-lingual conditions.
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages. This research is based on the assumptions that (1) training data can be obtained by machine-translating it from another language; and (2) there are cross-lingual solutions that work without training data in the target language. Consequently, in this research, we use an English dataset and solve the intent detection problem for five target languages (German, French, Lithuanian, Latvian, and Portuguese). When seeking the most accurate solutions, we investigate BERT-based word and sentence transformers together with eager learning classifiers (CNN, BERT fine-tuning, FFNN) and a lazy learning approach (cosine similarity as the memory-based method). We offer and evaluate several strategies to overcome the data scarcity problem with machine translation, cross-lingual models, and a combination of the two. The experimental investigation revealed the robustness of sentence transformers under various cross-lingual conditions. The accuracy of ~0.842, achieved on the English dataset with completely monolingual models, is considered our top-line. However, cross-lingual approaches demonstrate similar accuracy levels, reaching ~0.831, ~0.829, ~0.853, ~0.831, and ~0.813 for German, French, Lithuanian, Latvian, and Portuguese, respectively.
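The lazy learning approach mentioned in the abstract (cosine similarity as a memory-based method) can be sketched as follows. This is a minimal illustration, not the authors' implementation: in the paper's setting the embeddings would come from a multilingual sentence transformer, whereas the short 4-dimensional vectors, intent labels, and the `memory` dictionary below are invented stand-ins for illustration only.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "memory": intent label -> stored sentence embedding.
# Real embeddings would be produced by a cross-lingual sentence
# encoder; these 4-dim vectors are illustrative placeholders.
memory = {
    "greeting":     [0.9, 0.1, 0.0, 0.1],
    "open_account": [0.1, 0.8, 0.3, 0.0],
    "card_blocked": [0.0, 0.2, 0.9, 0.2],
}

def predict_intent(query_embedding, memory):
    # Lazy learning: no training step at all; the stored example
    # with the highest cosine similarity decides the label.
    return max(memory, key=lambda label: cosine(query_embedding, memory[label]))

# A query embedding close to "card_blocked" (e.g. a German utterance
# mapped into the same cross-lingual embedding space as the English memory).
query = [0.05, 0.25, 0.85, 0.15]
print(predict_intent(query, memory))  # card_blocked
```

Because the classifier only compares vectors, the same English-built memory serves any target language, provided the encoder places semantically equivalent sentences near each other across languages.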

