A Markov Categorical Framework for Language Modeling
Yifan Zhang
Abstract
Auto-regressive (AR) language models factorize sequence probabilities as $P_\theta(\mathbf{w}) = \prod_t P_\theta(w_t \mid \mathbf{w}_{<t})$.
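As a rough illustration of the autoregressive factorization, in which each token is scored conditionally on the tokens before it, the following minimal sketch computes a sequence log-probability as a sum of conditional log-probabilities. The toy table `TOY_TABLE` and the helpers `cond_prob` and `sequence_log_prob` are hypothetical names introduced only for this example. The toy model conditions on the last token alone, which is a simplifying assumption and not the model studied in the paper.

```python
import math

# Hypothetical toy conditional distribution: for simplicity it looks only at
# the last token of the prefix (None marks the empty prefix).
TOY_TABLE = {
    None: {"a": 0.5, "b": 0.5},
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.4, "b": 0.6},
}


def cond_prob(token, prefix):
    """Return P(token | prefix) under the toy table."""
    last = prefix[-1] if prefix else None
    return TOY_TABLE[last][token]


def sequence_log_prob(tokens):
    """Chain rule: log P(w) = sum over t of log P(w_t | w_<t)."""
    return sum(
        math.log(cond_prob(tok, tokens[:t])) for t, tok in enumerate(tokens)
    )


if __name__ == "__main__":
    seq = ["a", "b", "b"]
    print(math.exp(sequence_log_prob(seq)))
```

Because every conditional distribution in the table sums to one, the product of conditionals defines a valid distribution over sequences of any fixed length.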