OPI at SemEval 2023 Task 1: Image-Text Embeddings and Multimodal Information Retrieval for Visual Word Sense Disambiguation

Sławomir Dadas

The goal of visual word sense disambiguation is to find the image that best matches the provided description of a word's meaning. It is a challenging problem that requires approaches combining language and image understanding. In this paper, we present our submission to the SemEval 2023 visual word sense disambiguation shared task. The proposed system integrates multimodal embeddings, learning-to-rank methods, and knowledge-based approaches. We build a classifier based on the CLIP model, whose results are enriched with additional information retrieved from Wikipedia and lexical databases. Our solution ranked third in the multilingual task and won the Persian track, one of the three language subtasks.
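
The core idea of matching a textual description against candidate images can be illustrated with a minimal sketch. This is not the authors' system: it assumes that the text and each image have already been mapped into a shared embedding space (as a CLIP-style model would do), and it uses hand-written toy vectors in place of real embeddings. The function names `cosine` and `rank_images` and the file names are hypothetical, chosen only for illustration; the learning-to-rank and knowledge-based components described in the abstract are not modeled here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_images(
    text_vec: Sequence[float],
    image_vecs: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Sort candidate images by similarity to the text, best match first."""
    scored = [(name, cosine(text_vec, vec)) for name, vec in image_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real text and image embeddings
# (assumption: a real system would obtain these from a multimodal encoder).
context = [0.9, 0.1, 0.0]
candidates = {
    "image_a.jpg": [0.8, 0.2, 0.1],
    "image_b.jpg": [0.0, 1.0, 0.0],
    "image_c.jpg": [0.0, 0.1, 0.9],
}

ranking = rank_images(context, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the image whose vector points in nearly the same direction as the description is ranked first. A fuller system along the lines of the abstract would presumably combine such similarity scores with further signals, for example text retrieved from Wikipedia or lexical databases, before producing the final ranking.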
