Compression is limited by the pigeonhole principle. You can't get any compression for free.
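A quick way to see the counting argument, as a minimal sketch (the function names are just for illustration): there are 2^n bit strings of length n, but only 2^n - 1 strings that are strictly shorter, so no lossless scheme can shrink every n-bit input.

```python
def count_strings(n_bits: int) -> int:
    """Number of distinct bit strings of exactly n_bits bits."""
    return 2 ** n_bits


def count_shorter_strings(n_bits: int) -> int:
    """Number of distinct bit strings with fewer than n_bits bits,
    counting the empty string. Equals 2**n_bits - 1."""
    return sum(2 ** k for k in range(n_bits))


n = 16
inputs = count_strings(n)
outputs = count_shorter_strings(n)
# More inputs than shorter outputs: at least one input cannot shrink.
print(inputs, outputs, inputs - outputs)  # 65536 65535 1
```

The gap is only one string, but it is there for every n, and a decoder also needs the outputs to be unambiguous, which makes the real situation worse than this count suggests.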

Pi may well contain every possible text (this is conjectured, not proven), but on average it's going to cost the same or more to encode the location of the text than the text itself.
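A rough experiment for the Pi point, as a sketch under stated assumptions: it computes 3000 decimal digits with Machin's formula, then checks how many decimal digits it takes to write the position of the first occurrence of each 2-digit string. The helper names, the 3000-digit cutoff, and the choice of 2-digit targets are all mine, picked only to keep the run short.

```python
def arctan_inv(x: int, unity: int) -> int:
    """arctan(1/x) scaled by `unity`, using the Taylor series
    in plain integer arithmetic."""
    term = unity // x
    total = term
    x2 = x * x
    n, sign = 3, -1
    while term:
        term //= x2
        total += sign * (term // n)
        sign = -sign
        n += 2
    return total


def pi_digits(n: int) -> str:
    """'3' followed by the first n decimal digits of pi, via Machin's
    formula: pi = 4 * (4*arctan(1/5) - arctan(1/239))."""
    guard = 10  # extra digits to absorb truncation error
    unity = 10 ** (n + guard)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled // 10 ** guard)


def average_index_length(digits: str, k: int) -> float:
    """Mean number of decimal digits needed to write the position of the
    first occurrence of each k-digit string that appears in `digits`."""
    lengths = []
    for value in range(10 ** k):
        pos = digits.find(str(value).zfill(k))
        if pos >= 0:
            lengths.append(len(str(pos)))
    return sum(lengths) / len(lengths)


DIGITS = pi_digits(3000)
# Compare the printed average against k = 2, the length of the data itself.
print(average_index_length(DIGITS, 2))
```

Note that only 10 positions can be written with one digit and only 90 more with two, so the average cannot drop much below 2 no matter how the digits fall. A bare position is also not self-delimiting, so a real encoding of the index would cost more than this count.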

To get compression, you can only shift costs around, by making some things take fewer bits to represent, at the cost of making everything else take more bits to disambiguate (e.g. instead of all bytes taking 8 bits, you can make a specific byte take 1 bit, but all other bytes will need 9 bits).
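The 1-bit/9-bit example can be sketched directly (a toy code using strings of "0"/"1" for readability, not a real bitstream format): the special byte becomes a single `0`, and every other byte becomes `1` followed by its usual 8 bits.

```python
def encode(data: bytes, special: int) -> str:
    """Encode `special` as '0' and every other byte as '1' + 8 bits."""
    return "".join(
        "0" if b == special else "1" + format(b, "08b") for b in data
    )


def decode(bits: str, special: int) -> bytes:
    """Inverse of encode(): read a flag bit, then 0 or 8 payload bits."""
    out = bytearray()
    i = 0
    while i < len(bits):
        if bits[i] == "0":
            out.append(special)
            i += 1
        else:
            out.append(int(bits[i + 1:i + 9], 2))
            i += 9
    return bytes(out)


def average_bits(p_special: float) -> float:
    """Expected bits per byte when the special byte has probability p."""
    return p_special * 1 + (1 - p_special) * 9


print(average_bits(1 / 256))  # 8.96875: worse than 8 on uniform data
print(average_bits(1 / 8))    # 8.0: break-even point
print(average_bits(1 / 2))    # 5.0: a win only if the byte is common
```

The average is 9 - 8p bits per byte, so this only pays off when the special byte makes up more than 1/8 of the input. On anything else the cost you shifted onto the other 255 bytes dominates.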

To be able to reference words from an English dictionary, you will have to dedicate some sequences of bits to them in the compressed stream.

If you use your best and shortest sequences for this, you're wasting them on picking from an inflexible fixed dictionary, instead of representing data in some more sophisticated way that is more frequently useful (which compressors already do by building adaptive dictionaries on the fly and using other dynamic techniques).

If you try to avoid hurting normal compression and assign less valuable longer sequences of bits to the dictionary words instead, these sequences will likely end up being longer than the words themselves.
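A back-of-the-envelope version of the dictionary trade-off, as a sketch only. Every number here is an assumption of mine for illustration: a 100,000-word dictionary, a 4-bit escape sequence that says "a dictionary reference follows", and 2.5 bits per character for what an ordinary compressor spends on English text.

```python
def index_bits(dictionary_size: int) -> int:
    """Bits for a fixed-width index into a dictionary of this many words."""
    return max(1, (dictionary_size - 1).bit_length())


def reference_bits(dictionary_size: int, escape_bits: int) -> int:
    """Total cost of one dictionary reference: escape sequence + index."""
    return escape_bits + index_bits(dictionary_size)


def ordinary_bits(word: str, bits_per_char: float) -> float:
    """Crude estimate of what the word costs without the dictionary."""
    return len(word) * bits_per_char


DICTIONARY_SIZE = 100_000  # assumed, not measured
ESCAPE_BITS = 4            # assumed length of the "less valuable" prefix
BITS_PER_CHAR = 2.5        # assumed rate of an ordinary compressor

for word in ["the", "which", "compression"]:
    print(
        word,
        reference_bits(DICTIONARY_SIZE, ESCAPE_BITS),
        ordinary_bits(word, BITS_PER_CHAR),
    )
```

Under these assumed numbers a reference costs 21 bits, which is more than the estimate for short common words like "the" or "which". It only comes out ahead for long words, which occur far less often, and a longer escape sequence pushes the break-even point further out.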