Paper Title

Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks

Paper Authors

R. Thomas McCoy, Robert Frank, Tal Linzen

Paper Abstract

Learners that are exposed to the same training data might generalize differently due to differing inductive biases. In neural network models, inductive biases could in theory arise from any aspect of the model architecture. We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection. For both tasks, the training set is consistent with a generalization based on hierarchical structure and a generalization based on linear order. All architectural factors that we investigated qualitatively affected how models generalized, including factors with no clear connection to hierarchical structure. For example, LSTMs and GRUs displayed qualitatively different inductive biases. However, the only factor that consistently contributed a hierarchical bias across tasks was the use of a tree-structured model rather than a model with sequential recurrence, suggesting that human-like syntactic generalization requires architectural syntactic structure.
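The abstract's central contrast is easiest to see with a concrete case from the question-formation task: on simple training sentences, a linear rule ("front the first auxiliary") and a hierarchical rule ("front the auxiliary of the main clause") yield the same output, but they diverge once a relative clause places another auxiliary before the main-clause one. The sketch below is purely illustrative; the example sentences and the front_first_aux helper are assumptions for exposition, not the paper's dataset or code.

```python
# Minimal sketch (not the paper's pipeline) of why the question-formation
# training data is ambiguous between a linear and a hierarchical rule.
# Vocabulary and helper names here are hypothetical.

def front_first_aux(words, auxiliaries=("can", "does", "will")):
    """Linear rule: move the linearly first auxiliary to the front."""
    for i, w in enumerate(words):
        if w in auxiliaries:
            return [w] + words[:i] + words[i + 1:]
    return words

# Training-style input: no relative clause, so both rules agree.
train = "my walrus can giggle".split()
print(front_first_aux(train))  # ['can', 'my', 'walrus', 'giggle']

# Generalization-style input: a relative clause intervenes, so the rules diverge.
test = "my walrus that can swim will giggle".split()
print(front_first_aux(test))
# Linear rule       -> ['can', 'my', 'walrus', 'that', 'swim', 'will', 'giggle']  (ungrammatical)
# Hierarchical rule -> ['will', 'my', 'walrus', 'that', 'can', 'swim', 'giggle']  (correct English)
```

Because only the second kind of input distinguishes the two rules, a model's behavior on such held-out cases reveals which inductive bias its architecture supplies.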
