---------------------------------------------------------------------------
Bytewise coding method

- We can't use plain alphabet encoding, because it's incompatible with SSE
  and most other modelling techniques, and it doesn't really provide a
  solution for decoding.

- Plain bitwise coding shows good results with "practical" data, and a
  binary huffman decomposition of the symbols can significantly improve the
  speed. But this blocks proper combinatoric probability estimation
  (important for text) and SSE, and practically prohibits symbol masking.
  Also, it's hardly possible to keep complete binary trees for all contexts
  unless a hashtable is used... which introduces redundancy - and there's
  already some redundancy on texts because of bitwise probability
  estimation.

- Then there's unary coding, which can be reasonably fast, is compatible
  with SSE and masking, and allows for compact storage in memory. But unary
  coding has problems with symbol reordering, unless plain frequency
  counters are used (which are relatively redundant compared to an fsm, and
  slow - because of the implied division by the freq sum). And then there
  are some really bad "worst cases", because contexts with >8 symbols (or a
  huffman code different from the unary tree) can't be efficiently
  processed.

Q. So what's the solution for good text compression?

A. It just seems that there's no clear universal solution. For texts even
plain unary coding without any tricks is efficient enough, and for binary
files a completely different probability estimation is required anyway.
So the answer might be in implementing separate models for texts (and
similar types of data) and binaries. The models could even be partly merged
(eg. bitwise distributions used for low orders and linked with the prefix
tree). In fact, it even makes sense to consider three kinds of
models/statistics - unary, binary, and string. And then it even seems
possible to consider a larger alphabet than bytes - like unicode or whole
words.
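As an illustration of the bitwise decomposition mentioned above, here is a
minimal sketch (the names and the shift-based counter update rule are
hypothetical, not taken from any specific coder): a byte is modelled as 8
binary decisions walking a complete binary tree of 255 counters per
context, using the usual node indexing trick node = node*2 + bit. A real
coder would pass each counter's probability to a binary rangecoder; the
point here is that every context needs its own 255-counter tree, which is
exactly what makes "complete binary trees for all contexts" impractical
without hashing.

```c
#include <assert.h>

/* Probability counter: p = P(bit==1) in units of 1/4096. */
typedef struct { unsigned short p; } Counter;

/* Simple shift-based adaptation with rate 1/32 (an assumed rule,
   standing in for whatever fsm/counter the real model would use). */
static void counter_update(Counter *c, int bit) {
    if (bit) c->p += (4096 - c->p) >> 5;
    else     c->p -= c->p >> 5;
}

/* Model one byte: walk the complete binary tree from the root.
   Internal nodes get indices 1..255, so tree[256] is enough;
   a real encoder would code each bit with tree[node].p before updating. */
static void model_byte(Counter tree[256], unsigned byte) {
    unsigned node = 1;
    for (int i = 7; i >= 0; i--) {
        int bit = (byte >> i) & 1;
        counter_update(&tree[node], bit);
        node = node * 2 + bit;
    }
}
```

Note that this is the per-context cost: one such tree per context is what a
bitwise order-N model has to store, hence the hashtable (and its
redundancy) once the context space gets large.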