Branch | Computational linguistics |
Founders | |
Focus areas | Grammar • syntax • vocabulary |
Field of study | |
Era of founding | Mid-20th century |
Initial emphasis | Preservation of endangered languages |
Associated fields | Historical linguistics • Cognitive psychology • Anthropology |
Subsequent applications | Intercultural communication • Language education • Nurturing of new languages (e.g., pidgins and creoles) |
Natural language processing (NLP) is a branch of computational linguistics that combines linguistics, statistics, and computer science to model human language and enable computers to process and analyze it. In this timeline, it was developed by linguists, rather than computer scientists, in the mid-20th century as a tool to aid in the preservation and study of endangered languages.
Early NLP focused on understanding and generating grammatically correct sentences and on building large vocabularies to recognize and translate words. Machine learning techniques such as statistical modeling and neural networks were less emphasized than in our timeline. Instead, researchers established the theoretical and mathematical foundations of formal grammar and lexical analysis, which gave NLP formal languages with a defined syntax and semantics.
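The grammar-first approach described above can be illustrated with a minimal sketch: a lexical analyzer that tags words from a fixed vocabulary, followed by a recognizer for a tiny formal grammar. Everything here is an assumption made for illustration, including the `LEXICON` entries, the two rules `S -> NP V NP` and `NP -> DET N`, and the function names; none of it is drawn from an actual system in the article.

```python
import re

# Hypothetical toy vocabulary: maps each word to a part-of-speech tag.
LEXICON = {
    "the": "DET", "a": "DET",
    "linguist": "N", "language": "N", "text": "N",
    "records": "V", "translates": "V",
}

def tokenize(sentence):
    """Lexical analysis: split into lowercase words and tag each one."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return [(word, LEXICON.get(word, "UNK")) for word in words]

def parse_np(tags, i):
    """Rule NP -> DET N. Return the index after the phrase, or None."""
    if tags[i:i + 2] == ["DET", "N"]:
        return i + 2
    return None

def is_grammatical(sentence):
    """Rule S -> NP V NP. True only if the whole sentence matches."""
    tags = [tag for _, tag in tokenize(sentence)]
    i = parse_np(tags, 0)
    if i is None or i >= len(tags) or tags[i] != "V":
        return False
    j = parse_np(tags, i + 1)
    return j is not None and j == len(tags)

if __name__ == "__main__":
    print(is_grammatical("The linguist records a language."))
    print(is_grammatical("Linguist the records."))
```

A rule-based recognizer like this accepts or rejects a sentence outright, which is the kind of behavior a grammar-centered tradition would favor over statistical scoring.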
The primary application of NLP in this timeline was the preservation of endangered and dying languages. Linguists used the tools of NLP to create language databases and machine translation systems, enabling them to translate documents from one language to another for archival purposes and to create language learning aids for communities seeking to preserve their ancestral language.
NLP techniques in this timeline were developed predominantly for written rather than spoken language. This focus enabled linguists to transcribe and digitize rare written works and documents from a variety of languages and contexts. The digitized texts could then be stored in computer databases and corpora for preservation and analysis, whereas spoken language is more transient and harder to record faithfully.
One distinctive application of NLP in this timeline was fostering the development of pidgins and creoles into fully realized languages. Linguists used NLP tools to analyze the grammar and lexicon of existing pidgins and creoles, and then created dictionaries, grammar books, and teaching materials to standardize and promote them as languages in their own right.
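As a rough sketch of the lexicon analysis described above, the following builds a draft dictionary from a small corpus: an alphabetical word list with a frequency count and a first attested example for each entry. The function name `build_lexicon` and the sample lines are hypothetical; the sample words are invented placeholders, not data from any real pidgin or creole.

```python
import re
from collections import Counter

def build_lexicon(lines):
    """Return (word, count, first example line) tuples in alphabetical order."""
    counts = Counter()
    examples = {}
    for line in lines:
        # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
        for word in re.findall(r"[^\W\d_]+", line.lower()):
            counts[word] += 1
            examples.setdefault(word, line)  # keep only the first attestation
    return [(word, counts[word], examples[word]) for word in sorted(counts)]

if __name__ == "__main__":
    # Invented placeholder sentences standing in for a transcribed corpus.
    corpus = ["bala kiru", "Bala tomi kiru"]
    for word, count, example in build_lexicon(corpus):
        print(f"{word}\t{count}\t{example}")
```

Keeping an attested example beside each headword is one plausible design choice, since it gives the compilers of a dictionary or teaching material a usage context to check against.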
NLP in this timeline has been closely connected to other subfields of linguistics and to neighboring disciplines, including historical linguistics, sociolinguistics, cognitive science, philosophy of language, and anthropology. NLP researchers have used its techniques to address questions in these fields and to bring the study of language closer to how it is actually used in human societies.
NLP has also had wide-ranging applications in education and diplomacy. In education, it has been used to create language learning software and immersive digital language courses that incorporate machine translation, speech recognition, and other NLP techniques. In diplomacy, it has been used to aid cross-cultural communication and interpretation, enabling diplomats to communicate more effectively in the languages of the people they interact with.