<FunkyBob> | !last | 2019-05-12 15:01:25 |
<Shelwien> | :) | 2019-05-12 15:45:13 |
<FunkyBob> | why do you keep writing that, anyway? | 2019-05-12 15:46:46 |
| is it a signal to the bot? | 2019-05-12 15:46:50 |
<Shelwien> | yes, session split for log archive | 2019-05-12 15:52:48 |
| at first i tried doing that by time between posts, but it didn't work right | 2019-05-12 15:53:13 |
| well, and there're some other commands | 2019-05-12 15:53:34 |
| !grep enwik8 | 2019-05-12 15:53:37 |
<FunkyBob> | nice | 2019-05-12 15:56:02 |
*** TheWolf_ has joined the channel | 2019-05-12 17:37:35 |
*** Jibz has left the channel | 2019-05-12 18:11:22 |
<Shelwien> | hi | 2019-05-12 18:33:35 |
<TheWolf_> | hi | 2019-05-12 19:38:23 |
<FunkyBob> | hi | 2019-05-12 19:39:39 |
<Shelwien> | :) | 2019-05-12 19:52:06 |
<TheWolf_> | bad Shelwien you broke the chain xD | 2019-05-12 20:10:33 |
<FunkyBob> | :P | 2019-05-12 21:10:46 |
<unic0rn> | there can be uncompressed Shelwien, compressed Shelwien, AVX-accelerated Shelwien, but bad? that's illogical | 2019-05-12 22:02:07 |
| also, let's not forget template-based Shelwien++ | 2019-05-12 22:06:23 |
<Shelwien> | templates are compression | 2019-05-12 22:14:54 |
<FunkyBob> | so, I tried a 30/d0 break to indicate lit/match ... it improved things a little... but clearly not as much as breaking the nibble barrier would | 2019-05-13 00:32:46 |
<Shelwien> | did you see my result with secondary compression? | 2019-05-13 00:48:26 |
<FunkyBob> | no? | 2019-05-13 05:32:55 |
| maybe? | 2019-05-13 05:32:59 |
<Shelwien> | this: | 2019-05-13 11:00:39 |
| <Shelwien> lzfb output for book1 without literals and match distances is 131498 | 2019-05-13 11:00:40 |
| <Shelwien> it can be then compressed with lzfb to 98397 | 2019-05-13 11:00:40 |
| it means that you can use it as is | 2019-05-13 11:01:04 |
*** TheWolf_ has left the channel | 2019-05-13 11:29:09 |
*** TheWolf has joined the channel | 2019-05-13 11:29:52 |
| so, there's a coder like this: https://pastebin.com/xgJM5J5R | 2019-05-13 14:27:06 |
| order1 CM, could be a BWT postcoder | 2019-05-13 14:27:19 |
| the counter LUT is static | 2019-05-13 14:27:49 |
| so the question is, how to optimize it for best compression? | 2019-05-13 14:28:04 |
<FunkyBob> | with a brick? | 2019-05-13 16:08:26 |
<Shelwien> | !grep brick | 2019-05-13 16:52:09 |
| btw, here's an improved version of LZ4 | 2019-05-13 19:25:08 |
| https://github.com/inikep/lizard/blob/lizard/lib/lizard_decompress_liz.h | 2019-05-13 19:25:10 |
| LZ5 aka Lizard | 2019-05-13 19:25:18 |
<FunkyBob> | have run across that, in the forum and out | 2019-05-13 19:27:15 |
<Shelwien> | there's the token layout at the start of that .h file | 2019-05-13 19:30:13 |
<FunkyBob> | my half-awake brain this morning was contemplating using elias or similar encoding of lengths/distances... see how that went | 2019-05-13 19:54:32 |
| (rounded to nearest byte) | 2019-05-13 19:54:39 |
<Shelwien> | ? | 2019-05-13 19:55:31 |
<FunkyBob> | but absolutely need to bundle (lit run len, match len, match distance) into a single chunk | 2019-05-13 19:55:44 |
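A minimal sketch of the idea FunkyBob describes, assuming "elias" means the Elias gamma code and "rounded to nearest byte" means padding each bundled token to a whole number of bytes. The names `elias_gamma_encode`, `elias_gamma_decode`, `pack_token` and `unpack_token` are hypothetical, and the +1 offset (so that zero-length fields are codable) is an assumption, not something stated in the chat.

```python
def elias_gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma codes only positive integers")
    binary = bin(n)[2:]                      # binary digits, leading 1 included
    return "0" * (len(binary) - 1) + binary  # unary length prefix, then the value


def elias_gamma_decode(bits: str, pos: int = 0):
    """Decode one value starting at bits[pos]; return (value, next_pos)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2), end


def pack_token(lit_len: int, match_len: int, distance: int) -> bytes:
    """Bundle one (lit run len, match len, match distance) triple into bytes."""
    # Assumption: each field is stored as value + 1 so that 0 is representable.
    bits = "".join(elias_gamma_encode(v + 1) for v in (lit_len, match_len, distance))
    bits += "0" * (-len(bits) % 8)           # pad the chunk to a byte boundary
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def unpack_token(chunk: bytes):
    """Inverse of pack_token for a single chunk."""
    bits = "".join(f"{b:08b}" for b in chunk)
    pos, fields = 0, []
    for _ in range(3):
        value, pos = elias_gamma_decode(bits, pos)
        fields.append(value - 1)
    return tuple(fields)
```

Small values cost few bits, so a short token such as `pack_token(0, 4, 1)` fits in two bytes, while the byte padding keeps the decoder free of cross-token bit state.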
<Shelwien> | btw, how about asciiz lit runs? | 2019-05-13 19:58:11 |
| or something along that line | 2019-05-13 19:58:28 |
<FunkyBob> | as in null terminated strings? | 2019-05-13 19:58:40 |
| works fine for compressing text... | 2019-05-13 19:58:44 |
<Shelwien> | yes | 2019-05-13 19:58:47 |
| maybe not specifically \x00, but \x1A or something | 2019-05-13 19:59:56 |
<FunkyBob> | either way, requires some way to escape if that value turns up | 2019-05-13 20:00:39 |
<Shelwien> | just presume that literal runlen can't be shorter than 1? | 2019-05-13 20:01:30 |
| well, here's another idea though, from bsdiff | 2019-05-13 20:02:12 |
| rather than deleting matches, replace them with zeroes | 2019-05-13 20:02:36 |
| so overall layout of literals remains unchanged, just matches are zeroed out | 2019-05-13 20:03:02 |
| then this gets compressed with RLE | 2019-05-13 20:03:20 |
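A rough sketch of the layout Shelwien describes here: matched bytes are overwritten with zeroes so the literals keep their original positions, and the zero runs are then collapsed with RLE. The function names and the RLE format (a `0x00` marker followed by a run length of 1..255) are assumptions made for illustration; the chat does not specify either, and the match list itself would still have to be stored separately.

```python
def zero_matches(data: bytes, matches) -> bytes:
    """Replace every matched region with zero bytes, keeping the length.

    matches is a list of (position, length) pairs (hypothetical format).
    """
    out = bytearray(data)
    for pos, length in matches:
        out[pos:pos + length] = bytes(length)   # bytes(n) is n zero bytes
    return bytes(out)


def rle_zeros(data: bytes) -> bytes:
    """Encode each run of zeroes as (0x00, run_length), other bytes verbatim."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            run = 1
            while i + run < len(data) and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def unrle_zeros(data: bytes) -> bytes:
    """Inverse of rle_zeros."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0:
            out += bytes(data[i + 1])           # expand the zero run
            i += 2
        else:
            out.append(data[i])
            i += 1
    return bytes(out)
```

With this format a genuine literal zero also becomes a run of length 1, so no separate escape is needed; the decoder relies on the match list to tell zeroed matches from real zero literals.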
<FunkyBob> | interesting | 2019-05-13 20:03:26 |
<Shelwien> | yes | 2019-05-13 20:03:33 |
| bsdiff actually subtracts its matches, rather than deleting | 2019-05-13 20:03:52 |
| so imprecise matches are possible | 2019-05-13 20:04:02 |
| and it's really helpful for exes | 2019-05-13 20:04:13 |
| since these have lots of inlined code which mostly matches except for some addrs | 2019-05-13 20:04:38 |
<FunkyBob> | yeah, I've considered that sort of thing before... | 2019-05-13 20:04:39 |
<Shelwien> | which end up having the same difference | 2019-05-13 20:04:50 |
| so that literal stream can be actually compressed again | 2019-05-13 20:05:15 |
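A minimal sketch of the subtraction idea, transplanted to matches inside a single buffer (the "port to a normal LZ" direction, not bsdiff's actual old-file/new-file setup). The names `subtract_matches` and `add_matches` and the `(position, source, length)` match format are hypothetical, and it assumes every source lies before its target and that target regions do not overlap each other.

```python
def subtract_matches(data: bytes, matches) -> bytes:
    """Replace each matched byte with (byte - source byte) mod 256.

    matches is a list of (position, source, length) with source < position.
    An exact match leaves zeroes; an imprecise one leaves small differences.
    """
    out = bytearray(data)
    for pos, src, length in matches:
        for i in range(length):
            out[pos + i] = (data[pos + i] - data[src + i]) & 0xFF
    return bytes(out)


def add_matches(residual: bytes, matches) -> bytes:
    """Rebuild the original buffer from the residual and the match list."""
    out = bytearray(residual)
    # Process in increasing position so every source byte is already restored,
    # including the case where a match overlaps its own source.
    for pos, src, length in sorted(matches):
        for i in range(length):
            out[pos + i] = (out[pos + i] + out[src + i]) & 0xFF
    return bytes(out)
```

For example, with `data = bytes([1, 2, 3, 4, 1, 2, 7, 4])` and the match `(4, 0, 4)`, the residual is `bytes([1, 2, 3, 4, 0, 0, 4, 0])`: the mismatching byte survives as a difference of 4 instead of breaking the match.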
| i'd like to port bsdiff to a normal LZ | 2019-05-13 20:08:28 |
| but it uses BWT for matchfinding | 2019-05-13 20:08:46 |
| so it can be pretty hard to port it | 2019-05-13 20:09:01 |
| !next | 2019-05-13 22:30:37 |