Computational models of distributional semantics (a.k.a. word embeddings) represent a word's meaning in terms of its relationships with all other words. We examine what grammatical information is encoded in distributional models and investigate the role of indirect associations. Distributional models are sensitive to associations between words at one degree of separation, such as ‘tiger’ and ‘stripes’, or two degrees of separation, such as ‘soar’ and ‘fly’. By recursively adding higher levels of representation to a computational, holographic model of semantic memory, we construct a distributional model sensitive to associations between words at arbitrary degrees of separation. We find that word associations at four degrees of separation increase the similarity assigned by the model to English words that share a part of speech or syntactic type. Word associations at four degrees of separation also improve the ability of the model to construct grammatical English sentences. Our model proposes that human memory uses indirect associations to learn part of speech and that the basic associative mechanisms of memory and learning support knowledge of both semantics and grammatical structure.
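The recursive construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified, context-only scheme in which each word starts with a random vector, a word's vector at each level is the normalized sum of the previous-level vectors of the words it co-occurs with, and the holographic binding operations that a full model would use for word order are omitted. The names `build_levels` and `cosine`, the sentence-wide co-occurrence window, and the toy corpus are all hypothetical choices made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def build_levels(sentences, dim=64, levels=4, seed=0):
    """Return a list of {word: vector} dicts, one per level of representation.

    Assumption: co-occurrence means "appears in the same sentence".
    """
    rng = random.Random(seed)
    vocab = sorted({w for s in sentences for w in s})
    # Level 0: random vectors that carry no information about usage.
    current = {w: [rng.gauss(0.0, 1.0) for _ in range(dim)] for w in vocab}
    history = []
    for _ in range(levels):
        memory = {w: [0.0] * dim for w in vocab}
        for sentence in sentences:
            for i, target in enumerate(sentence):
                for j, other in enumerate(sentence):
                    if i == j:
                        continue
                    # Add the neighbor's previous-level vector to the target.
                    mem, vec = memory[target], current[other]
                    for k in range(dim):
                        mem[k] += vec[k]
        # Normalize to unit length so the next level is not dominated
        # by high-frequency words.
        normalized = {}
        for w, vec in memory.items():
            norm = math.sqrt(sum(x * x for x in vec))
            normalized[w] = [x / norm for x in vec] if norm else vec
        current = normalized
        history.append(current)
    return history


if __name__ == "__main__":
    # Toy corpus (hypothetical): 'soar' and 'fly' never co-occur,
    # but they share the neighbor 'eagles'.
    corpus = [["eagles", "soar"], ["eagles", "fly"], ["tiger", "stripes"]]
    for level, vectors in enumerate(build_levels(corpus), start=1):
        print(level, round(cosine(vectors["soar"], vectors["fly"]), 3))
```

In this sketch, each pass reuses the previous level's output as its input, so similarity at higher levels can reflect longer chains of association than similarity at the first level. How the levels map onto the degrees of separation reported in the abstract depends on details of the full model that the sketch leaves out.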
All Science Journal Classification (ASJC) codes
- Neuropsychology and Physiological Psychology
- Language and Linguistics
- Experimental and Cognitive Psychology
- Linguistics and Language
- Artificial Intelligence