---------------------------------------------------------------------------
Match finder implementation

1. Task spec

Find matching byte strings within a sliding window.
- match length can be limited;
- it's possible to put the "future" data into the window and ensure that string comparison won't wrap.
Ideally, for each byte we need to find the nearest offsets of all matching strings, including overlapped ones.

2. Dumb search / hash chains

The simplest possible method is to compare the current string with the strings at all distances. It requires no extra memory, but its complexity is O(N^2). This can be improved at the cost of 4N+O(1) memory - we can keep linked lists of all strings with the same prefix. However, it's still potentially O(N^2), unless the chain length is strictly limited, which hurts compression.

3. Adding anchor hashing

It's possible to manage short strings (2-4 or 2-5 bytes or so) via a simple hashtable (mapping a string to its last location, instead of all locations). Then the window can be indexed with anchored hashing, which would allow treating longer strings as symbols. Unfortunately it seems hard to see how to implement standard LZ using anchored fragments - follow the chain of the first full fragment within the target length?..
Instead, something similar can be done with usual hash chains, but in a more uniform fashion - a direct hashtable for short matches, then chains for 4+ symbols. Either way, with anchor hashing or without, the potential chain length remains a problem.

4. Ternary tree

It's possible to build a binary or ternary tree with O(N) memory - we add a string at each offset, thus one tree node per offset. But that way we'd only have access to the offsets of unique strings - otherwise the only option would be to recursively process the whole tree branch. Sure, it's possible to maintain linked lists for the nodes of a given prefix length... but that would no longer be O(N) memory.
There are interesting possible optimizations (instead of looking up each string from the tree root), like suffix pointers and hash comparisons (to make binary trees more balanced), but first it's necessary to find a way to map a string to its last offset. Well, it's possible to add a last-prefix-offset pointer to nodes (or use the "equal" pointer for that), but then the n'th prefix occurrence would store that pointer for the n-byte prefix. Which means that when there are fewer than n strings with the same prefix, we'd have to compare them all to find the best match.

5. BWT

Still, it's known to be possible to index all strings with 4N or 5N extra memory, using BWT. The main problem with it is its blockwise nature (nothing can be done until the window is full), and having to skip "future" strings during lookup. There may be a workaround for future-string skipping (e.g. by adding extra pointers), but it's unknown at the moment. It's also possible to redo the sort after reading every few MB of data (to avoid waiting for a 1G read with a 1G window), but that would add significant delays.
Overall, BWT for string indexing really seems interesting, but a sequential method seems preferable, even if it's slower.

6. PPM tree

In the end, it seems that a tree similar to ppmd's would be a "fast" solution - simply because I know how to implement it, and it can provide all the necessary info. But there are issues:
- it's not O(N), so depending on the data, full indexing of an N-byte window may be impossible with a fixed k*N-byte tree buffer;
- with variable-size nodes there's a "garbage collection" problem (which I generally avoid in Ash and the like). At the very least it's necessary to be able to "reformat" memory blocks (from one node size to another), or otherwise the effective window size can deteriorate to nearly 0. For that it might make sense to move tree nodes to new locations even if their size didn't change (otherwise it's impossible to free a block without a full tree traversal, even when there are enough free nodes);
- the main problem is parsing complexity. An order-272 (or so) parser can be pretty slow on redundant data (with lots of such matches), especially with sliding-window support.