Poster with Prerecorded Video in Workshop: Tokenization Workshop (TokShop)

Tokenisation is NP-Complete
Philip Whittington · Gregor Bachmann · Tiago Pimentel
Keywords: [ Compression ] [ Computational Complexity ] [ Tokenisation ]
Abstract:

In this work, we prove the NP-completeness of two variants of tokenisation, defined as the problem of compressing a dataset to at most $\delta$ symbols by either finding a vocabulary directly (_direct_ tokenisation) or selecting a sequence of merge operations (_bottom-up_ tokenisation).
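To make the bottom-up variant concrete: a merge operation replaces every occurrence of an adjacent symbol pair with a single new symbol, and the decision problem asks whether some ordered sequence of merges compresses the dataset to at most $\delta$ symbols. The sketch below is illustrative only (it is not the paper's construction, and all names are assumed); it shows how applying a merge sequence shrinks a toy dataset.

```python
def apply_merge(seq, pair):
    """Replace every non-overlapping occurrence of `pair`, scanning left to right."""
    merged = pair[0] + pair[1]
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(merged)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def bottom_up_tokenise(dataset, merges):
    """Apply an ordered list of merge operations to each sequence in the dataset."""
    for pair in merges:
        dataset = [apply_merge(seq, pair) for seq in dataset]
    return dataset

# Toy dataset of two character sequences; one merge operation: ("a", "b") -> "ab".
dataset = [list("aabab"), list("abab")]
merges = [("a", "b")]
result = bottom_up_tokenise(dataset, merges)
total = sum(len(seq) for seq in result)
# "aabab" -> ["a", "ab", "ab"] and "abab" -> ["ab", "ab"], so total = 5 symbols,
# down from 9; the decision problem asks whether total <= delta is achievable.
```

The hardness result concerns choosing the merge sequence: applying a fixed sequence, as above, is easy, but finding one that reaches a target compression $\delta$ is NP-complete.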