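This is obviously not a good language model, but it is a fun demonstration that compression and prediction are equivalent: a model that predicts the next symbol well can encode text in fewer bits, and a compressor that encodes text in fewer bits is implicitly predicting it.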
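To make the compression-prediction link concrete, here is a minimal sketch that turns an off-the-shelf compressor into a next-character predictor. It is not necessarily the method described here; it assumes `zlib` as the compressor, and the helper names `compressed_bits` and `next_char_distribution` are hypothetical, chosen for illustration. The idea is that a continuation costing `b` extra bits to encode is treated as having probability proportional to `2 ** -b`.

```python
import zlib


def compressed_bits(data: bytes) -> int:
    """Size of the zlib-compressed data, in bits."""
    return 8 * len(zlib.compress(data, 9))


def next_char_distribution(context: str, alphabet: str) -> dict:
    """Score each candidate next character by the extra bits it costs.

    Assumption for illustration: extra code length b maps to a weight
    of 2 ** -b, which is then normalized into a probability.
    """
    base = compressed_bits(context.encode("utf-8"))
    # Extra bits needed to encode the context plus one candidate character.
    costs = {
        ch: compressed_bits((context + ch).encode("utf-8")) - base
        for ch in alphabet
    }
    # Shift by the cheapest cost so the exponents stay small and non-negative.
    lowest = min(costs.values())
    weights = {ch: 2.0 ** -(cost - lowest) for ch, cost in costs.items()}
    total = sum(weights.values())
    return {ch: w / total for ch, w in weights.items()}


if __name__ == "__main__":
    text = "the cat sat on the mat. the cat sat on the ma"
    dist = next_char_distribution(text, "abt ")
    for ch, p in sorted(dist.items(), key=lambda item: -item[1]):
        print(repr(ch), round(p, 3))
```

Because zlib output is measured in whole bytes, many candidates tie on cost, so the resulting distribution is coarse. That coarseness is one reason a predictor built this way is weak, even though the principle holds.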