
Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval
Uri Alon · Frank Xu · Junxian He · Sudipta Sengupta · Dan Roth · Graham Neubig

Thu Jul 21 07:40 AM -- 07:45 AM (PDT) @ Hall F
Retrieval-based language models (R-LM) model the probability of natural language text by combining a standard language model (LM) with examples retrieved from an external datastore at test time. While effective, a major bottleneck of using these models in practice is the computationally costly datastore search, which can be performed as frequently as every time step. In this paper, we present RetoMaton - retrieval automaton - which approximates the datastore search, based on (1) saving pointers between consecutive datastore entries, and (2) clustering of entries into "states". This effectively results in a weighted finite automaton built on top of the datastore, instead of representing the datastore as a flat list. The creation of the automaton is unsupervised, and a RetoMaton can be constructed from any text collection: either the original training corpus or from another domain. Traversing this automaton at inference time, in parallel to the LM inference, reduces its perplexity by up to 1.85, or alternatively saves up to 83% of the nearest neighbor searches over $k$NN-LM (Khandelwal et al., 2020) without hurting perplexity. Our code and trained models are available at https://github.com/neulab/retomaton .
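
The pointer-following idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (that is in the repository linked above); the function names, the 1-D keys, and the `min_candidates` threshold are all hypothetical choices made for illustration, and the clustering of entries into "states" is omitted. The sketch only shows how saved pointers between consecutive datastore entries can replace some full nearest-neighbor searches.

```python
def build_datastore(keys, values):
    """Entry i holds (keys[i], values[i]); its pointer leads to entry i + 1.

    Assumption: entries are stored in corpus order, so "next entry" means
    "next position in the original text".
    """
    last = len(values) - 1
    pointers = [i + 1 if i < last else None for i in range(len(values))]
    return list(keys), list(values), pointers


def knn_search(keys, query, k):
    """Exact nearest-neighbor search by squared L2 distance (the costly step)."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(keys[i], query))
    return sorted(range(len(keys)), key=dist)[:k]


def traverse(keys, values, pointers, queries, tokens, k=2, min_candidates=1):
    """Follow pointers while candidates remain; otherwise run a full search.

    queries[t] is the LM context vector at step t and tokens[t] is the token
    emitted at step t. Returns the candidate entries used at each step and
    the number of full searches performed.
    """
    candidates, searches, history = [], 0, []
    for query, token in zip(queries, tokens):
        if len(candidates) < min_candidates:
            # Too few candidates survived: fall back to a full search.
            candidates = knn_search(keys, query, k)
            searches += 1
        history.append(list(candidates))
        # Keep entries whose stored token matches the emitted token,
        # then advance each survivor through its pointer.
        candidates = [
            pointers[i]
            for i in candidates
            if values[i] == token and pointers[i] is not None
        ]
    return history, searches


# Toy usage: four entries with 1-D keys (hypothetical data).
keys, values, pointers = build_datastore(
    [[0.0], [1.0], [2.0], [3.0]], ["a", "b", "c", "d"]
)
history, searches = traverse(
    keys, values, pointers,
    queries=[[0.1], [1.1], [2.1]],
    tokens=["a", "b", "c"],
)
print(history, searches)
```

In this toy run only the first step needs a full search; the next two steps reuse the entries reached through pointers. When the emitted token matches none of the current candidates, the candidate list empties and the next step falls back to a full search.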

Author Information

Uri Alon (Carnegie Mellon University)
Frank Xu (Carnegie Mellon University)
Junxian He (Carnegie Mellon University)
Sudipta Sengupta (Amazon Web Services)
Dan Roth (University of Pennsylvania and AWS AI Labs)
Dan Roth

Dan Roth is the Eduardo D. Glandt Distinguished Professor at the Department of Computer and Information Science, University of Pennsylvania, lead of NLP Science at AWS AI Labs, and a Fellow of the AAAS, the ACM, AAAI, and the ACL. In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers. Roth was recognized “for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning.” Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine-learning-based tools for natural language applications that are being used widely. Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR) and a program chair of AAAI, ACL, and CoNLL. Roth has been involved in several startups; most recently he was a co-founder and chief scientist of NexLP, a startup that leverages the latest advances in Natural Language Processing (NLP), Cognitive Analytics, and Machine Learning in the legal and compliance domains. NexLP was acquired by Reveal in 2020. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.

Graham Neubig (Carnegie Mellon University)
