Developing a Biological Computational Model of Natural Language: An Interdisciplinary Approach to Linguistics
Cristopher Font-Santiago
Ambiguity, the possibility of multiple mental representations or interpretations for one utterance, is a necessary, intrinsic, and pervasive feature of natural language. Previous approaches have focused on capturing individual language modules, such as phonetic-phonological parsers (Church, 1987; Meng, 1995), semantic parsers (Damonte & Monti, 2021; Liang, 2016), and syntactic parsers (Choe & Charniak, 2016; Schuster & Manning, 2016). Among these, syntactic parsers are our principal concern, and we aim to implement a hybrid neural-symbolic system.
To begin developing the components of this system, we designed synthetic toy corpora of 2,020 sentences each in English and in Spanish. The corpora distinguish among (1) grammatical and ungrammatical strings, (2) three combinations of lexical ambiguity, and (3) three combinations of structural ambiguity. In constructing the corpora, we also account for challenges in natural language morphosyntax that the system's current architecture cannot handle, particularly different kinds of lexical ambiguity. We hypothesize that a universal parser can be developed with a suitable combination of Minimalist syntax and an appropriate system architecture. This hybrid-architecture system exemplifies the potential of interdisciplinary approaches to the study of language.
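The distinctions the corpora draw (ungrammatical, unambiguous, and structurally ambiguous strings) can be illustrated with a minimal sketch. It is not the hybrid Minimalist system described above: a small context-free grammar and a CKY chart stand in for it, and the grammar, lexicon, example sentences, and the name `count_parses` are all hypothetical, chosen only for illustration. The function counts how many distinct trees the toy grammar assigns to a string.

```python
# Hypothetical toy grammar in Chomsky normal form (an assumption for
# illustration; it is not taken from the corpora described above).
BINARY = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],   # PP attaches to the verb phrase
    ("NP", "PP"): ["NP"],   # PP attaches to the noun phrase
    ("P", "NP"): ["PP"],
    ("Det", "N"): ["NP"],
}
LEXICON = {
    "she": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(tokens, root="S"):
    """Count the distinct parse trees for `tokens` with a CKY chart."""
    n = len(tokens)
    chart = {}  # (start, end, label) -> number of trees over that span

    # Width-1 spans come straight from the lexicon.
    for i, tok in enumerate(tokens):
        for label in LEXICON.get(tok, []):
            chart[(i, i + 1, label)] = chart.get((i, i + 1, label), 0) + 1

    # Wider spans combine two adjacent, already-filled narrower spans.
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for (left, right), parents in BINARY.items():
                    n_left = chart.get((start, mid, left), 0)
                    n_right = chart.get((mid, end, right), 0)
                    if n_left and n_right:
                        for parent in parents:
                            key = (start, end, parent)
                            chart[key] = chart.get(key, 0) + n_left * n_right

    return chart.get((0, n, root), 0)


if __name__ == "__main__":
    for sentence in (
        "she saw the man",
        "she saw the man with the telescope",
        "saw she the man",
    ):
        print(sentence, "->", count_parses(sentence.split()))
```

Under this toy grammar, a count of zero marks a string as ungrammatical, a count of one as unambiguous, and a count above one as structurally ambiguous. Lexical ambiguity could be sketched in the same way by listing more than one category for a word in `LEXICON`.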
Partial Reference List
Choe, D. K., & E. Charniak. (2016). Parsing as Language Modeling. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2331–2336. Association for Computational Linguistics.
Church, K. W. (1987). Phonological Parsing and Lexical Retrieval. Cognition, 25(1-2), 53–69.
Damonte, M., & E. Monti. (2021). One Semantic Parser to Parse Them All: Sequence to Sequence Multi-Task Learning on Semantic Parsing Datasets. In Proceedings of *SEM 2021: The Tenth Joint Conference on Lexical and Computational Semantics, 173–184. Association for Computational Linguistics.
Liang, P. (2016). Learning Executable Semantic Parsers for Natural Language Understanding. Communications of the ACM, 59(9), 68–76.
Meng, H. M. L. (1995). Phonological Parsing for Bi-directional Letter-to-Sound/Sound-to-Letter Generation. [Doctoral dissertation, Massachusetts Institute of Technology]. Spoken Language Systems Publication.
Schuster, S., & C. D. Manning. (2016). Enhanced English Universal Dependencies: An Improved Representation for Natural Language Understanding Tasks. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), 2371–2378. European Language Resources Association (ELRA).