Paper title
An Analysis of Word2Vec for the Italian Language
Paper authors
Paper abstract
Word representation is fundamental to NLP tasks, because encoding the semantic closeness between words is precisely what makes it possible to teach a machine to understand text. Although word embeddings are now widely used, few results exist for languages other than English. In this work, we analyse the semantic capacity of the Word2Vec algorithm and produce an embedding for the Italian language. We explore parameter settings such as the number of epochs, the size of the context window and the number of negative samples used during backpropagation.
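As a rough illustration of what exploring these three settings can look like, the sketch below enumerates every combination in a small search space. The values are hypothetical and not taken from the paper. The key names `epochs`, `window` and `negative` were chosen to match the corresponding arguments of gensim's `Word2Vec` class, but the sketch itself uses only the Python standard library and trains nothing.

```python
from itertools import product

# Hypothetical search space: these values are illustrative assumptions,
# not the settings evaluated in the paper.
SEARCH_SPACE = {
    "epochs": [5, 10, 20],    # number of passes over the corpus
    "window": [5, 10],        # context window size
    "negative": [5, 10, 20],  # number of negative samples per positive pair
}


def parameter_grid(space):
    """Yield one dict per combination of the values in `space`."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(parameter_grid(SEARCH_SPACE))
print(len(configs))  # 3 * 2 * 3 combinations
print(configs[0])
```

Each resulting dictionary could then be passed to a training routine of one's choice and the resulting embedding scored on a semantic evaluation task, so that the effect of each setting can be compared.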