*** Krugz has left the channel | 2010-01-15 23:21:09 |
*** Krugz has joined the channel | 2010-01-15 23:26:40 |
*** mike_____ has left the channel | 2010-01-15 23:33:51 |
*** scott___ has joined the channel | 2010-01-16 02:15:06 |
<scott___> | who is here | 2010-01-16 02:15:24 |
*** scott___ has left the channel | 2010-01-16 03:59:30 |
*** STalKer-Y has joined the channel | 2010-01-16 04:26:10 |
*** STalKer-X has left the channel | 2010-01-16 04:29:42 |
*** maniscalco has joined the channel | 2010-01-16 04:52:04 |
<maniscalco> | just seeing what is going on | 2010-01-16 04:52:32 |
| anyone home ? | 2010-01-16 04:52:38 |
| shelwien, i did some simple tests and bwmonstr does in fact use the output file before creating the final output file | 2010-01-16 04:53:58 |
| the intermediate file size is roughly equal to the final compressed size | 2010-01-16 04:54:43 |
| good night | 2010-01-16 04:54:53 |
*** maniscalco has left the channel | 2010-01-16 04:55:03 |
*** Krugz has left the channel | 2010-01-16 05:46:13 |
*** Krugz has joined the channel | 2010-01-16 05:48:24 |
* ChanServ This channel has been registered with ChanServ. | 2010-01-16 06:14:11 |
*** pmcontext has joined the channel | 2010-01-16 07:53:34 |
<pmcontext> | hi | 2010-01-16 07:54:56 |
| anyone here ? | 2010-01-16 08:16:31 |
*** pmcontext has left the channel | 2010-01-16 08:24:24 |
*** mike_____ has joined the channel | 2010-01-16 08:33:25 |
*** pinc has joined the channel | 2010-01-16 11:37:28 |
<Shelwien> | ... | 2010-01-16 11:37:58 |
| To be specific, its possible that bwmonstr does the usual | 2010-01-16 12:05:52 |
| BWT+postcoding first, with file output, then discards | 2010-01-16 12:05:53 |
| the compressed input and loads the compressed BWT output instead. | 2010-01-16 12:05:53 |
| Anyway, here's the original thread about intermediate | 2010-01-16 12:05:53 |
| compression: | 2010-01-16 12:05:53 |
| http://encode.dreamhosters.com/showthread.php?t=379 | 2010-01-16 12:05:53 |
| However, my implementation compares _compressed_ data strings | 2010-01-16 12:05:56 |
| to produce BWT, and for that employs a special bitcode (dunno whether it | 2010-01-16 12:05:57 |
| has a name, its just a direct application of dynamic programming) | 2010-01-16 12:05:59 |
| which preserves the lexicographic order of symbols. | 2010-01-16 12:06:01 |
| And as to bwmonstr, its known that it uses plain huffman coding | 2010-01-16 12:06:03 |
| (probably order2), and unpacks the symbols for string comparisons, | 2010-01-16 12:06:05 |
| which is likely the main reason for its speed. | 2010-01-16 12:06:07 |
| Anyway, it seems fairly obvious how to perform the forward BWT | 2010-01-16 12:06:09 |
| in 0.5N memory, but the backward transform (iBWT) with the | 2010-01-16 12:06:11 |
| same memory restriction seems much more tricky. | 2010-01-16 12:06:13 |
*** Krugz|Sleep has left the channel | 2010-01-16 12:15:46 |
| ah, scary? ;) | 2010-01-16 12:24:40 |
<STalKer-Y> | hmmm | 2010-01-16 12:30:44 |
*** toffer has joined the channel | 2010-01-16 13:05:31 |
<toffer> | hi | 2010-01-16 13:05:40 |
<STalKer-Y> | hi | 2010-01-16 13:05:53 |
<mike_____> | hmm, which was the original paper on arithmetic coding? | 2010-01-16 13:06:48 |
| rissanen? | 2010-01-16 13:07:19 |
<toffer> | don't know which one is the original | 2010-01-16 13:08:42 |
| quote g.n.n. martin - range encoding. it's easy to understand and readable | 2010-01-16 13:08:57 |
| has anybody got some free cpu power? | 2010-01-16 13:10:10 |
| i got some optimization work | 2010-01-16 13:10:20 |
*** pmcontext has joined the channel | 2010-01-16 13:11:14 |
| ideally some high end quad core ^^ | 2010-01-16 13:11:16 |
<pmcontext> | hi everyone | 2010-01-16 13:11:22 |
<Shelwien> | hi | 2010-01-16 13:11:30 |
<pmcontext> | :D | 2010-01-16 13:11:41 |
<toffer> | hi | 2010-01-16 13:11:48 |
<pmcontext> | hi toffer | 2010-01-16 13:12:00 |
<Shelwien> | toffer: i guess i can run your optimizer | 2010-01-16 13:12:33 |
<toffer> | ok | 2010-01-16 13:12:55 |
| yesterday i ran book1 as a sanity test | 2010-01-16 13:13:03 |
| 142 f* = 2.189916 | 2010-01-16 13:13:06 |
| that's bpc | 2010-01-16 13:13:09 |
| which is 2104xxx | 2010-01-16 13:13:16 |
| still not below 210k | 2010-01-16 13:13:26 |
| 2104xx | 2010-01-16 13:13:32 |
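The bpc figure converts to a byte count by multiplying by the input size and dividing by 8. A minimal sketch, assuming book1 is the 768771-byte Calgary corpus file (the size is not stated in the log) and using a hypothetical helper name:

```cpp
#include <cassert>

// Convert a rate in bits per character (bpc) into a compressed size in bytes.
// `input_bytes` is the uncompressed size; one character is assumed to be one byte.
inline double compressed_bytes(double bpc, double input_bytes) {
    assert(bpc >= 0.0 && input_bytes >= 0.0);
    return bpc * input_bytes / 8.0;
}
```

Under that assumption, `compressed_bytes(2.189916, 768771)` lands between 210400 and 210500, which agrees with the "2104xx" quoted above.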
<Shelwien> | yeah, but that's still good enough | 2010-01-16 13:13:45 |
<toffer> | it's around 0.5% better than the previous version w/o model switching | 2010-01-16 13:14:26 |
| anyway | 2010-01-16 13:14:31 |
| i got huffman decomposition working | 2010-01-16 13:14:37 |
| it was ~10% faster | 2010-01-16 13:14:41 |
| and i discovered that it actually is 30% faster after forcing gcc to inline everything | 2010-01-16 13:14:56 |
<Shelwien> | i'd expect much more gain actually | 2010-01-16 13:14:59 |
| ah | 2010-01-16 13:15:04 |
<toffer> | but at that level gcc refused to inline even bit prediction functions, etc | 2010-01-16 13:15:20 |
| well with order1 decomposition it gonna be nearly 50% faster on text files | 2010-01-16 13:15:36 |
<Shelwien> | ;) | 2010-01-16 13:15:48 |
<toffer> | as i said it's still unoptimized | 2010-01-16 13:16:13 |
| anyway i first want to get the match model running | 2010-01-16 13:16:20 |
| properly | 2010-01-16 13:16:25 |
| maybe i'd add one or two more models and use o1 decomposition | 2010-01-16 13:16:44 |
<Shelwien> | anyway, that speed is good, as long as compression is not compromised that much | 2010-01-16 13:17:05 |
<toffer> | on enwik it's unchanged | 2010-01-16 13:17:22 |
| the sfc gets 0.2% worse compression | 2010-01-16 13:17:36 |
| both unoptimized | 2010-01-16 13:17:40 |
<Shelwien> | well, unoptimized is not very relevant ;) | 2010-01-16 13:18:36 |
<toffer> | but that already shows that optimization might resolve that alphabet permutation loss | 2010-01-16 13:19:31 |
<Shelwien> | well, its unknown to me whether its a permulation loss or gain ;) | 2010-01-16 13:21:06 |
| *permutation | 2010-01-16 13:21:14 |
<toffer> | the decomp. scheme tries to group similar ascii codes together | 2010-01-16 13:21:36 |
<Shelwien> | well, my optimized permutation doesn't cluster them together that much ;) | 2010-01-16 13:22:17 |
<toffer> | since huffman deco. is dynamically constructed it gonna be different for distinct files | 2010-01-16 13:27:39 |
<Shelwien> | this is kind of bad though | 2010-01-16 13:28:54 |
<toffer> | but that's how it works | 2010-01-16 13:29:11 |
| and that's why i try to make the lexical order as similar as possible to the ascii ordering | 2010-01-16 13:29:37 |
<Shelwien> | what about doing some contextual segmentation first? | 2010-01-16 13:30:04 |
| like, parse the context history and find a split point, where it becomes better to switch the code | 2010-01-16 13:30:59 |
<toffer> | that gonna be too much effort | 2010-01-16 13:31:12 |
<Shelwien> | ;) | 2010-01-16 13:31:19 |
| mike: btw, as to arithmetic coding, i heard that this is the first publication: http://www.richpasco.org/scaffdc.pdf | 2010-01-16 13:33:09 |
| anyway, if huffman decomposition is bad, because concept of "file" is not always applicable in compression | 2010-01-16 14:03:58 |
| and implementing a segmentation is too much of a bother | 2010-01-16 14:04:22 |
| then what about some universal model for redundancy removal? | 2010-01-16 14:04:57 |
| like my msb model? | 2010-01-16 14:05:14 |
<toffer> | that xoring? | 2010-01-16 14:05:22 |
<Shelwien> | its not xoring, its matching msb counting | 2010-01-16 14:05:38 |
<toffer> | well but xor and find the most significant bit set | 2010-01-16 14:05:54 |
| it's what i meant | 2010-01-16 14:06:04 |
| did you ever try to use it for e.g. preprocessing? | 2010-01-16 14:06:19 |
| i mean in a cm codec | 2010-01-16 14:06:22 |
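The two descriptions above ("matching msb counting" versus "xor and find the most significant bit set") compute the same quantity. A minimal sketch for single bytes follows; the function name is hypothetical and this is not taken from the msb model's actual source:

```cpp
#include <cassert>
#include <cstdint>

// Number of leading (most significant) bits on which two bytes agree, 0..8.
// XOR marks the disagreeing bits; the first set bit from the top ends the match.
inline int matching_msbs(std::uint8_t predicted, std::uint8_t actual) {
    unsigned diff = static_cast<unsigned>(predicted ^ actual);
    int n = 0;
    for (unsigned mask = 0x80; mask != 0 && (diff & mask) == 0; mask >>= 1) {
        ++n;  // this bit position still matches
    }
    return n;
}
```

For example, 'a' (0x61) and 'b' (0x62) differ only in their two lowest bits, so six leading bits match.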
<Shelwien> | !grep msb_ | 2010-01-16 14:06:32 |
| damn | 2010-01-16 14:06:48 |
| anyway, its there ;) | 2010-01-16 14:07:04 |
<toffer> | ^^ | 2010-01-16 14:07:11 |
| gonna see when your script stops | 2010-01-16 14:07:16 |
<Shelwien> | it did already | 2010-01-16 14:07:32 |
| it compresses book1 to 574624 there | 2010-01-16 14:07:42 |
<toffer> | o1 huffman will do better ^^ | 2010-01-16 14:07:57 |
<Shelwien> | but that's with arithmetic coding | 2010-01-16 14:07:57 |
| sure | 2010-01-16 14:08:02 |
<toffer> | maybe it's still useful as a first preprocessing step | 2010-01-16 14:08:17 |
<Shelwien> | so i tried to start a discussion about improving this method | 2010-01-16 14:08:18 |
<toffer> | btw | 2010-01-16 14:08:19 |
| i already tried it | 2010-01-16 14:08:21 |
| it comes to my mind again | 2010-01-16 14:08:30 |
| it was clearing msbs | 2010-01-16 14:08:38 |
<Shelwien> | ? | 2010-01-16 14:08:50 |
<toffer> | something like c1^c | 2010-01-16 14:09:14 |
| where c1 is a prediction | 2010-01-16 14:09:19 |
| and coding the residual | 2010-01-16 14:09:23 |
| s | 2010-01-16 14:09:24 |
| just limiting it to a few msbs didn't hurt but improved context clustering | 2010-01-16 14:09:42 |
| and speed | 2010-01-16 14:09:46 |
| but i haven't got numbers | 2010-01-16 14:09:51 |
| since i abandoned it - seemed too much micky | 2010-01-16 14:10:03 |
<Shelwien> | ok | 2010-01-16 14:10:05 |
| anyway, now that i think about it again | 2010-01-16 14:10:12 |
| the msb model won't be applicable in your case | 2010-01-16 14:10:24 |
| because to get some compression with it | 2010-01-16 14:10:44 |
| you'd need to use a high-order context for it | 2010-01-16 14:11:05 |
| (even my implementation uses order-4 already) | 2010-01-16 14:11:14 |
<toffer> | well i used order 6 afair | 2010-01-16 14:11:43 |
<Shelwien> | so it won't work if you have a lower order model based on such decomposition | 2010-01-16 14:11:45 |
<toffer> | i totally forgot about uploading the optimizer | 2010-01-16 14:12:27 |
| ^^ | 2010-01-16 14:12:29 |
<Shelwien> | i noticed ;) | 2010-01-16 14:12:55 |
<toffer> | !grep ftp | 2010-01-16 14:13:02 |
| >.< | 2010-01-16 14:13:11 |
| !grep dcc | 2010-01-16 14:13:18 |
<Shelwien> | ;) | 2010-01-16 14:13:28 |
| now, that looks like a bug %) | 2010-01-16 14:13:42 |
<toffer> | yeah | 2010-01-16 14:13:46 |
| !grep apocalypse | 2010-01-16 14:14:04 |
| ah! | 2010-01-16 14:14:09 |
<Shelwien> | ftp://toffer.dreamhosters.com/dcc | 2010-01-16 14:14:16 |
| ? | 2010-01-16 14:14:18 |
<toffer> | there's your bug again | 2010-01-16 14:14:19 |
<Shelwien> | i think it finds matches in grep files | 2010-01-16 14:14:38 |
| guess it'd take time, only 4 iterations for now | 2010-01-16 14:38:15 |
<toffer> | yeah | 2010-01-16 14:39:12 |
| the match model is quite computation intensive | 2010-01-16 14:39:22 |
| got from 2mb/s to 1.5mb/s | 2010-01-16 14:39:31 |
<Shelwien> | btw, changed it to 4 threads | 2010-01-16 14:39:34 |
<toffer> | and with debug code it's like 1.2mb/s | 2010-01-16 14:39:42 |
| there's no speed gain | 2010-01-16 14:40:02 |
| i tested that on my q6600 at university | 2010-01-16 14:40:09 |
<Shelwien> | somehow, it doesn't use all 4 threads evenly, it seems | 2010-01-16 14:40:48 |
<toffer> | no wonder | 2010-01-16 14:42:48 |
| 50/4 = 12.5 | 2010-01-16 14:42:56 |
<Shelwien> | might be, but cpu consumption gets down to 50% and less for ~30s at the end of each iteration | 2010-01-16 14:44:21 |
<toffer> | guess i'd redo thread scheduling | 2010-01-16 14:45:55 |
| it gonna have three threads with 12 evaluations | 2010-01-16 14:46:10 |
<Shelwien> | in fact, its nearly 0% for like 10s | 2010-01-16 14:46:13 |
<toffer> | well i never observed that | 2010-01-16 14:46:24 |
<Shelwien> | look at task managed/performance | 2010-01-16 14:46:37 |
| *manager | 2010-01-16 14:46:41 |
<toffer> | i do | 2010-01-16 14:46:42 |
| completely used all the time | 2010-01-16 14:46:50 |
<Shelwien> | ah | 2010-01-16 14:46:55 |
| guess ramdrive matters ;) | 2010-01-16 14:47:05 |
| does it reread the file or write something? | 2010-01-16 14:47:35 |
| afaiu such optimizer should keep the file in memory? | 2010-01-16 14:47:56 |
<toffer> | it's plugged in a way that it just calls the coding routine | 2010-01-16 14:48:23 |
| but that can be done, too | 2010-01-16 14:48:27 |
| is there so much i/o overhead?! | 2010-01-16 14:48:38 |
<Shelwien> | also, don't optimize it for 3 threads, as i'm perfectly ok with using 4 | 2010-01-16 14:48:42 |
| i guess there is | 2010-01-16 14:48:50 |
<toffer> | well it's not "optimized" for any number of threads | 2010-01-16 14:49:04 |
| the thread allocation was just typed in a long time ago and stayed unchanged | 2010-01-16 14:49:26 |
| but i never observed something like that | 2010-01-16 14:49:31 |
<Shelwien> | http://nishi.dreamhosters.com/u/m1a.png | 2010-01-16 14:52:13 |
<toffer> | that's weird indeed | 2010-01-16 14:53:38 |
<Shelwien> | i run it on ramdrive along with enwik7 | 2010-01-16 14:54:13 |
<toffer> | ftp://toffer.dreamhosters.com/dcc/z.png | 2010-01-16 14:55:47 |
| just got two threads | 2010-01-16 14:56:33 |
| ftp://toffer.dreamhosters.com/dcc/z.pngz | 2010-01-16 14:56:39 |
| gr | 2010-01-16 14:56:42 |
| ftp://toffer.dreamhosters.com/dcc/zz.png | 2010-01-16 14:56:45 |
| damn keyboard | 2010-01-16 14:56:48 |
<Shelwien> | = http://toffer.dreamhosters.com/zz.png | 2010-01-16 14:57:20 |
<toffer> | ^^ | 2010-01-16 14:57:41 |
<Shelwien> | btw, it reached 2.00 just now | 2010-01-16 14:58:12 |
<toffer> | 14 f* = 1.836894 at iteration 14+20 | 2010-01-16 14:58:18 |
| well the unoptimized version | 2010-01-16 14:58:34 |
| got around 1.801 | 2010-01-16 14:58:40 |
<Shelwien> | will see | 2010-01-16 14:58:49 |
<toffer> | guess i need to make some stuff more generic, i.e. to include the possibility of a decomposition. and apply some speed optimizations from good old cmm4 | 2010-01-16 14:59:55 |
| and i wanted to add some incompressible data bypassing | 2010-01-16 15:00:07 |
| e.g. turn off all but one m1 model | 2010-01-16 15:00:18 |
<Shelwien> | turning off really all is important too | 2010-01-16 15:00:48 |
| actually compression of barely compressible data requires all different models | 2010-01-16 15:01:34 |
<toffer> | but i still need some model running to detect an increasing compression | 2010-01-16 15:01:42 |
| take a10 | 2010-01-16 15:01:58 |
<Shelwien> | better to make it async imho | 2010-01-16 15:02:00 |
<toffer> | i don't got a jpeg model | 2010-01-16 15:02:03 |
<Shelwien> | not jpeg | 2010-01-16 15:02:11 |
| recompression of known formats is one thing | 2010-01-16 15:02:24 |
| but i'm talking about compression of unknown random-looking data | 2010-01-16 15:02:42 |
| ...hmm, suddenly switched to 2nd pass | 2010-01-16 15:03:19 |
<toffer> | yes | 2010-01-16 15:03:24 |
| it runs 20 iterations with a mutation rate of 0.1 | 2010-01-16 15:03:34 |
| then 30 with 0.09 | 2010-01-16 15:03:39 |
| and finally it runs at a rate which allows faster convergence | 2010-01-16 15:03:53 |
| 0.07 to be specific | 2010-01-16 15:03:59 |
| afterwards it starts the hill climber to run into a local minimum | 2010-01-16 15:04:45 |
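The mutation-rate schedule described above can be written as a small lookup. This is only a sketch: the function name and 0-based iteration numbering are assumptions, and the log does not say how long the final 0.07 phase lasts before the hill climber takes over.

```cpp
#include <cassert>

// Mutation rate for a given 0-based iteration of the optimizer:
// 20 iterations at 0.1, then 30 at 0.09, then 0.07 for the remaining ones.
inline double mutation_rate(int iteration) {
    assert(iteration >= 0);
    if (iteration < 20) return 0.10;
    if (iteration < 50) return 0.09;
    return 0.07;
}
```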
<Shelwien> | 1.917 for now | 2010-01-16 15:05:12 |
<toffer> | maybe the multithreading overhead is acceptable with something like 6 or 8 models | 2010-01-16 15:06:18 |
| gonna have a shower and a cup of coffee now | 2010-01-16 15:08:37 |
* Shelwien goes to get some tea too | 2010-01-16 15:09:10 |
| and gonna pull off that 90kg bar from my bed ^^ | 2010-01-16 15:09:13 |
<Shelwien> | err... who? ;) | 2010-01-16 15:09:26 |
<toffer> | weights for back training ^^ | 2010-01-16 15:09:54 |
| my calluses look like a builder's | 2010-01-16 15:10:31 |
* Shelwien envies | 2010-01-16 15:11:32 |
| not really | 2010-01-16 15:11:52 |
| my girlfriend complains about it feeling like sand paper | 2010-01-16 15:12:22 |
* Shelwien thinks about possible relations between weight lifting and context weighting | 2010-01-16 15:24:28 |
<pmcontext> | what do u think => are body builders smarter or dumber than the average person ? | 2010-01-16 15:30:01 |
<Shelwien> | dunno, but we have at least 3 here ;) | 2010-01-16 15:30:34 |
<pmcontext> | so they're smarter then :D | 2010-01-16 15:30:52 |
| then weight lifting is directly proportional to context weighting | 2010-01-16 15:31:28 |
<Shelwien> | not necessarily; it only makes them interested in compression ;) | 2010-01-16 15:31:36 |
<pmcontext> | :P | 2010-01-16 15:31:59 |
| im working on changes that toffer said after seeing my code | 2010-01-16 15:32:58 |
<Shelwien> | i don't know what he said ;) | 2010-01-16 15:33:17 |
<pmcontext> | yes its in other window | 2010-01-16 15:33:27 |
| u were discussing here , so i thought to talk in the other window | 2010-01-16 15:34:16 |
| would u like to see code ? | 2010-01-16 15:35:12 |
<Shelwien> | can't say that, but i still can look at it ;) | 2010-01-16 15:37:28 |
*** encode has joined the channel | 2010-01-16 15:38:12 |
<encode> | just tested RKUC - PPM with sparse contexts | 2010-01-16 15:38:28 |
| will integrate that into ppmx | 2010-01-16 15:38:41 |
<Shelwien> | ... | 2010-01-16 15:38:45 |
<STalKer-Y> | maybe that is the reason why i am not good ;D | 2010-01-16 15:38:46 |
| i should start training ;) | 2010-01-16 15:39:11 |
<Shelwien> | yeah ;) | 2010-01-16 15:39:14 |
<encode> | tricky part is where to integrate a sparse model | 2010-01-16 15:39:18 |
<pmcontext> | http://pastebin.com/d5dec9447 | 2010-01-16 15:39:21 |
<Shelwien> | in ppmonstr Shkarin integrated it via SSE context | 2010-01-16 15:41:27 |
| he made a sparse match model there | 2010-01-16 15:41:55 |
| and used a flag like (c==sparse_guess) in SSE context for unary symbol coding | 2010-01-16 15:42:30 |
<encode> | RKUC did it differently for sure | 2010-01-16 15:44:51 |
*** mike_____ has left the channel | 2010-01-16 15:45:01 |
| I guess I may replace one of the model with Sparse one | 2010-01-16 15:45:11 |
| AxCD context | 2010-01-16 15:45:37 |
| instead of ABC | 2010-01-16 15:45:50 |
| will do some experiments | 2010-01-16 15:46:07 |
| just currently I'm installing Windows 7 onto my laptop | 2010-01-16 15:46:26 |
| WinXP is too old | 2010-01-16 15:46:32 |
<Shelwien> | pmcontext: well, it looks ok actually, except for 4095*ct[1] thing | 2010-01-16 15:46:34 |
<encode> | BTW, we may run DOS compressor via DOSbox | 2010-01-16 15:47:10 |
<Shelwien> | but vista+ on netbook is a waste | 2010-01-16 15:47:15 |
<encode> | still RKUC runs quite buggy | 2010-01-16 15:47:18 |
| nope | 2010-01-16 15:47:25 |
| I guess | 2010-01-16 15:47:29 |
| W7 is fast enough | 2010-01-16 15:47:47 |
<pmcontext> | is it bad | 2010-01-16 15:47:52 |
<Shelwien> | ? | 2010-01-16 15:48:00 |
<pmcontext> | shelwien: 4095*ct[1] thing <<-- bad ? | 2010-01-16 15:49:00 |
<Shelwien> | well, it should be 4096* normally | 2010-01-16 15:49:22 |
| or, better ct[1]<<12 | 2010-01-16 15:49:26 |
<encode> | (ct[1]<<12)/(ct[0]+ct[1]) | 2010-01-16 15:49:45 |
<Shelwien> | 1.0 probability maps t 4096 there, not 4095 | 2010-01-16 15:49:49 |
| *maps to | 2010-01-16 15:49:59 |
<pmcontext> | ah i see | 2010-01-16 15:50:30 |
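The point about 4096 versus 4095 can be sketched as follows, using the `(ct[1]<<12)/(ct[0]+ct[1])` form quoted above. The function and constant names are hypothetical, not taken from the pasted code:

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kProbBits = 12;  // probability 1.0 maps to 1 << 12 = 4096

// Estimate P(bit == 1) in 12-bit fixed point from the two bit counts.
inline std::uint32_t prob_of_one(std::uint32_t n0, std::uint32_t n1) {
    assert(n0 + n1 > 0);  // at least one observation, to avoid dividing by zero
    return (n1 << kProbBits) / (n0 + n1);
}
```

With equal counts this gives exactly 2048, the midpoint of the scale. A real coder would usually also keep the result away from 0 and 4096 before coding with it; that clamp is omitted here.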
<Shelwien> | also, it'd be really nice of you if you could get rid of STL there ;) | 2010-01-16 15:51:16 |
*** encode has left the channel | 2010-01-16 15:51:53 |
<pmcontext> | shelwien | 2010-01-16 16:00:04 |
| why is (c0 * 31 / 32 +1 ) , (c1 * 32 / 31 ) better than rescaling | 2010-01-16 16:00:05 |
| how does this work | 2010-01-16 16:00:07 |
*** Shelwien has left the channel | 2010-01-16 16:03:37 |
*** Shelwien has joined the channel | 2010-01-16 16:04:04 |
<Shelwien> | damn, somehow this computer started hanging when i walk around | 2010-01-16 16:04:29 |
<pmcontext> | why is (c0 * 31 / 32 +1 ) , (c1 * 32 / 31 ) better than rescaling | 2010-01-16 16:05:13 |
| how does this work , and wb | 2010-01-16 16:05:15 |
<Shelwien> | anyway, as to p*=(1-wr) update | 2010-01-16 16:05:32 |
| its the same as rescaling | 2010-01-16 16:05:40 |
| like, dividing the counts by 2 after 256 iterations | 2010-01-16 16:06:26 |
| is equivalent to dividing the counts by 2^(1/256) on each iteration ;) | 2010-01-16 16:06:42 |
<pmcontext> | yes i am doing rescaling after count > 1023 | 2010-01-16 16:06:43 |
| if(ct[b] > 1023) { | 2010-01-16 16:06:55 |
| ct[b]= (ct[b]/2) + 1; | 2010-01-16 16:06:57 |
| ct[1-b]= (ct[1-b]/2) + 1; | 2010-01-16 16:06:58 |
| } | 2010-01-16 16:07:00 |
<Shelwien> | http://www.wolframalpha.com/input/?i=1./2^(1/256) | 2010-01-16 16:08:39 |
| 1/2^(1/256) = ~0.997296 | 2010-01-16 16:09:19 |
| 1/(1-1./2^(1/256)) = ~369.830 | 2010-01-16 16:09:45 |
| so dividing by 2 each 256 iterations would be roughly equivalent to | 2010-01-16 16:10:21 |
| p *= 369/370 | 2010-01-16 16:10:55 |
<pmcontext> | oh its like counter = counter * 0.9 ? | 2010-01-16 16:11:07 |
<Shelwien> | and 31/32 is just a faster rate | 2010-01-16 16:11:11 |
| sure | 2010-01-16 16:11:21 |
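The arithmetic above can be checked numerically. A small sketch with hypothetical helper names: halving once per `period` updates corresponds to multiplying by 2^(-1/period) on every update, and writing that factor as (n-1)/n gives the "369/370" figure.

```cpp
#include <cassert>
#include <cmath>

// Per-update decay factor equivalent to halving once every `period` updates.
inline double per_step_decay(double period) {
    assert(period > 0.0);
    return std::pow(0.5, 1.0 / period);
}

// Express the decay factor as (n - 1) / n and return n.
inline double equivalent_denominator(double period) {
    return 1.0 / (1.0 - per_step_decay(period));
}
```

For `period = 256` these come out at about 0.997296 and 369.83, matching the values quoted in the log. By the same reasoning, a 31/32 factor halves the counts roughly every 22 updates, which is why it is called a faster rate.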
<pmcontext> | im not sure how i can use this to replace the rescaling code i use | 2010-01-16 16:12:25 |
<Shelwien> | you can add more precision bits to your counters | 2010-01-16 16:12:59 |
<pmcontext> | now it is 10 bits i think , since i rescale > 1023 | 2010-01-16 16:13:42 |
<Shelwien> | yes, but its integer 10 bits as you increment it by 1 | 2010-01-16 16:14:07 |
| now, what if you make it ct[b] = (ct[b]*31/32) + 1024 | 2010-01-16 16:14:33 |
| ct[1-b] = ct[1-b]*31/32 ? | 2010-01-16 16:14:40 |
| and remove the rescaling | 2010-01-16 16:14:54 |
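A minimal sketch of the suggested counter pair, with the decay applied on every update instead of an occasional halving. The struct name, the initial counts, and the `p1()` accessor are assumptions for illustration; only the two update formulas come from the lines above.

```cpp
#include <cassert>
#include <cstdint>

struct DecayCounter {
    std::uint32_t ct[2] = {1024, 1024};  // assumed start: one scaled hit per symbol

    void update(int b) {
        assert(b == 0 || b == 1);
        ct[b]     = ct[b] * 31 / 32 + 1024;  // decay, then add one hit scaled by 1024
        ct[1 - b] = ct[1 - b] * 31 / 32;     // decay only
    }

    // P(bit == 1) in 12-bit fixed point; the sum is nonzero because
    // the updated side is always at least 1024.
    std::uint32_t p1() const {
        return (ct[1] << 12) / (ct[0] + ct[1]);
    }
};
```

Changing 31/32 to another ratio changes how quickly old statistics are forgotten, which is the tuning knob the fixed halving threshold does not offer.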
<pmcontext> | hmm i will try | 2010-01-16 16:15:06 |
| it seems to compress almost same | 2010-01-16 16:16:22 |
| with above change and rescaling removed | 2010-01-16 16:16:35 |
<Shelwien> | sure | 2010-01-16 16:16:55 |
| but this allows for some fine-tuning | 2010-01-16 16:17:38 |
<pmcontext> | with rescaling book1 346858 | 2010-01-16 16:17:54 |
<Shelwien> | well, book1 is pretty stationary, so no wonder | 2010-01-16 16:18:25 |
<pmcontext> | with 31/32 change 349347 | 2010-01-16 16:18:33 |
<Shelwien> | why don't you try some binary instead | 2010-01-16 16:18:35 |
<pmcontext> | ok | 2010-01-16 16:18:39 |
| with rescaling obj1 12321 | 2010-01-16 16:20:05 |
| without rescale and using 31/32 13927 | 2010-01-16 16:20:07 |
<toffer> | 25 f* = 1.809810, iteration 20+20+25 | 2010-01-16 16:39:42 |
<Shelwien> | 1.83 on my side | 2010-01-16 16:40:09 |
<toffer> | contexts | 2010-01-16 16:40:17 |
<Shelwien> | MDL_MASKS or what? | 2010-01-16 16:40:43 |
<toffer> | o1,2,3,5 and match is o8 | 2010-01-16 16:40:58 |
| all except o8 and o8 with leading 0x40 | 2010-01-16 16:41:08 |
| MDL_MASKS and MDL_0x40 | 2010-01-16 16:41:14 |
| and MM_ORDER | 2010-01-16 16:41:17 |
| i meant all except o5 and o8 with 0x40 | 2010-01-16 16:41:57 |
<Shelwien> | masks = 3F0F0300 | 2010-01-16 16:42:01 |
| 0x40 = 1100 | 2010-01-16 16:42:16 |
| order = 100 | 2010-01-16 16:42:20 |
<toffer> | that's o0, o2, o4, o7 and match is o5 | 2010-01-16 16:43:15 |
| i don't think that the model assignment is stable atm | 2010-01-16 16:44:17 |
| it gonna be after roughly 100 iterations | 2010-01-16 16:44:24 |
| which iteration are you at? | 2010-01-16 16:46:08 |
<Shelwien> | who knows, 44 of some pass ;) | 2010-01-16 16:46:47 |
<toffer> | ok | 2010-01-16 16:48:41 |
| so let's see what happens | 2010-01-16 16:48:45 |
| it should get around 1.800bpc | 2010-01-16 16:48:55 |
| don't know how good the initial configuration which was hard coded is | 2010-01-16 16:49:25 |
| and as i said it got around 1.800bpc at that memory level | 2010-01-16 16:49:41 |
<Shelwien> | well, 1.814 atm | 2010-01-16 17:12:20 |
<toffer> | yours gonna take longer to converge, since the mutation rate was higher, initially | 2010-01-16 17:12:48 |
| i've lowered it here, since i only got a dual core amd at home | 2010-01-16 17:13:00 |
<Shelwien> | hm... | 2010-01-16 17:13:44 |
| meanwhile, i made newbrc into a real o0 coder | 2010-01-16 17:14:05 |
| and somehow its twice faster than fpaq0pv4b on enwik7 %) | 2010-01-16 17:14:17 |
<toffer> | ? | 2010-01-16 17:14:31 |
<Shelwien> | http://www.ctxmodel.net/files/newbrc/newbrc_1.rar | 2010-01-16 17:14:44 |
<toffer> | sorry, i have to leave now | 2010-01-16 17:17:50 |
<Shelwien> | ;) | 2010-01-16 17:18:09 |
<toffer> | to get the next train | 2010-01-16 17:18:12 |
| bye | 2010-01-16 17:18:13 |
| till tomorrow | 2010-01-16 17:18:16 |
*** toffer has left the channel | 2010-01-16 17:18:22 |
<pmcontext> | ;) << figuring out which emoticon that is | 2010-01-16 17:18:35 |
| ok got it | 2010-01-16 17:18:45 |
| shelwien in o1rc mix_test_v2 | 2010-01-16 17:20:20 |
| do u use static mixing or dynamic ? | 2010-01-16 17:20:21 |
<Shelwien> | mostly dynamic | 2010-01-16 17:21:21 |
<pmcontext> | o1rc0.exe | 2010-01-16 17:21:26 |
| ok | 2010-01-16 17:21:41 |
<Shelwien> | there was a static mixing branch in some early version, for comparison | 2010-01-16 17:22:17 |
| but i don't remember which ;) | 2010-01-16 17:22:21 |
<pmcontext> | ok and how do i use it ? o1rc0 input output ? | 2010-01-16 17:22:41 |
<Shelwien> | maybe o1rc c input output | 2010-01-16 17:23:22 |
| and d | 2010-01-16 17:23:25 |
<pmcontext> | strange, it doesn't do anything , it just exits | 2010-01-16 17:25:05 |
| K:\exe>o1rc0 c test.bmp ttt | 2010-01-16 17:25:07 |
| K:\exe> | 2010-01-16 17:25:08 |
<Shelwien> | try running it via Start/Run , maybe it'd say some error | 2010-01-16 17:25:44 |
| though i don't know which exe do you mean anyway | 2010-01-16 17:26:00 |
<pmcontext> | o1rc0.exe from mix_test_v2 | 2010-01-16 17:26:38 |
| and same, nothing happens from start/run | 2010-01-16 17:26:40 |
| tried o1rc1 and o1rc2 also same | 2010-01-16 17:27:13 |
<Shelwien> | ah | 2010-01-16 17:27:31 |
| you just had to look at o1rc.cpp there :) | 2010-01-16 17:27:52 |
| FILE* f = fopen( "book1bwt", "rb" ); | 2010-01-16 17:28:07 |
| if( f==0 ) return 1; | 2010-01-16 17:28:08 |
| FILE* g = fopen( "book1bwt.ari", "wb" ); | 2010-01-16 17:28:08 |
| if( g==0 ) return 2; | 2010-01-16 17:28:08 |
| of course it quits ;) | 2010-01-16 17:28:13 |
<pmcontext> | Oh silly me :P | 2010-01-16 17:28:35 |
<Shelwien> | well, later versions use commandline, and i totally forgot ;) | 2010-01-16 17:29:20 |
<pmcontext> | since i have 2 models and static mix , i notice | 2010-01-16 17:39:49 |
| if i choose w0 = 0 and w1 = 64 , it gives me best result | 2010-01-16 17:40:14 |
| if i use w0 it results in slightly worse compression | 2010-01-16 17:40:54 |
<Shelwien> | well, for normal data its natural | 2010-01-16 17:40:58 |
| but try book1bwt | 2010-01-16 17:41:06 |
<pmcontext> | ok one sec | 2010-01-16 17:41:14 |
| w0 = 1 , w1 = 63 , book1bwt 271536 | 2010-01-16 17:42:24 |
| dynamic mix will improve this ? | 2010-01-16 17:44:20 |
<Shelwien> | well, counters first i guess | 2010-01-16 17:45:24 |
<pmcontext> | i notice in ur readme file 216816 - Static order0+order1 mix . this seems too much compressed | 2010-01-16 17:45:44 |
<Shelwien> | you're supposed to get around 230k there | 2010-01-16 17:45:54 |
| even with a plain order0 coder | 2010-01-16 17:46:00 |
<pmcontext> | im not sure what is making it bad | 2010-01-16 17:46:36 |
<Shelwien> | counters | 2010-01-16 17:47:23 |
<pmcontext> | http://pastebin.com/d4083530d | 2010-01-16 17:47:40 |
| counters o.o | 2010-01-16 17:51:08 |
<Shelwien> | struct Predict{ | 2010-01-16 17:56:56 |
| unsigned P; | 2010-01-16 17:56:57 |
| Predict(){ P=2048;} | 2010-01-16 17:56:57 |
| unsigned int p(){return P;} | 2010-01-16 17:56:57 |
| void update( int b ) { | 2010-01-16 17:56:57 |
| b ? | 2010-01-16 17:56:57 |
| P += (4096-P)>>5 | 2010-01-16 17:56:59 |
| : P -= P>>5; | 2010-01-16 17:57:01 |
| } | 2010-01-16 17:57:03 |
| }; | 2010-01-16 17:57:05 |
| makes it 240k | 2010-01-16 17:57:07 |
| but compiling your program is troublesome for me because of STL | 2010-01-16 17:57:25 |
<pmcontext> | ok i will remove stl i think u mean cout ? | 2010-01-16 17:57:56 |
| i use visual studio | 2010-01-16 17:58:08 |
<Shelwien> | well, i use various compilers (MSC,IntelC,gcc) but in console, without studio | 2010-01-16 17:58:40 |
| and for MSC/IntelC I use old VC6 libraries | 2010-01-16 17:59:11 |
<pmcontext> | i didnt use any stl classes im sure | 2010-01-16 17:59:17 |
<Shelwien> | because they're much more convenient | 2010-01-16 17:59:28 |
| yeah, that's why it doesn't make any sense to link STL there at all | 2010-01-16 17:59:42 |
<pmcontext> | o.o do u mean the stdafx.h ? | 2010-01-16 18:00:01 |
<Shelwien> | no, i mean using namespace std | 2010-01-16 18:00:19 |
<pmcontext> | oh that is problem with visual studio . i have to put that to use printf and other functions | 2010-01-16 18:00:41 |
| or it says not found | 2010-01-16 18:00:50 |
<Shelwien> | you can put #include <stdio.h> instead | 2010-01-16 18:00:58 |
<pmcontext> | cant use | 2010-01-16 18:01:06 |
| visual studio 2005 , | 2010-01-16 18:01:16 |
| it say printf is not declared , without using namespace std | 2010-01-16 18:01:40 |
| same for cout | 2010-01-16 18:02:05 |
<Shelwien> | sure for cout | 2010-01-16 18:02:14 |
| but for printf that can't be | 2010-01-16 18:02:22 |
<pmcontext> | i will try | 2010-01-16 18:02:22 |
| ok got it ur right | 2010-01-16 18:06:01 |
| it only ask for cout | 2010-01-16 18:06:13 |
| i will change counter | 2010-01-16 18:06:51 |
| oh it gives 240232 , why is this counter better ? | 2010-01-16 18:12:36 |
<Shelwien> | adapts faster probably | 2010-01-16 18:13:07 |
<pmcontext> | so it's better than c1 / ( c1 + c0 ) o.o | 2010-01-16 18:14:48 |
<Shelwien> | not necessarily | 2010-01-16 18:16:47 |
| but in this specific case it is | 2010-01-16 18:16:57 |
| otherwise, you can try changing the parameters in the old counter | 2010-01-16 18:17:34 |
| like number of iterations before rescale | 2010-01-16 18:17:44 |
| or some different rescale coef instead of 31/32 | 2010-01-16 18:18:13 |
<pmcontext> | i see | 2010-01-16 18:22:16 |
| with prev counter 233650 | 2010-01-16 18:25:34 |
| what is simplest way to adapt weights | 2010-01-16 18:30:32 |
*** pinc has left the channel | 2010-01-16 18:56:57 |
<Shelwien> | simplest... | 2010-01-16 19:03:15 |
| well, you can adapt it like a counter | 2010-01-16 19:03:23 |
| like you have bit b and probs p1,p2 | 2010-01-16 19:03:37 |
| and weight w | 2010-01-16 19:03:43 |
| (in the same scale as a probability) | 2010-01-16 19:03:55 |
| then you calculate a usefulness flag | 2010-01-16 19:04:47 |
*** scott___ has joined the channel | 2010-01-16 19:05:44 |
| flag = (b ? (p1-p2) : (SCALE-p1)-(SCALE-p2))>=0; | 2010-01-16 19:05:50 |
| and update the w counter with this flag | 2010-01-16 19:05:59 |
| hi | 2010-01-16 19:06:05 |
<scott___> | hi | 2010-01-16 19:07:21 |
| what was the page with the results of e.coli I thought I bookmarked it but can't find it? | 2010-01-16 19:08:39 |
<Shelwien> | dist = (b ? (p1-p2) : (SCALE-p1)-(SCALE-p2)); | 2010-01-16 19:09:42 |
| f_update = (dist>Limit) || (-dist>Limit); | 2010-01-16 19:09:42 |
| f_dir = (dist>0); | 2010-01-16 19:09:42 |
| if( f_update ) w.Update( f_dir ); | 2010-01-16 19:09:42 |
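The weight adaptation above can be sketched as a self-contained unit. Assumptions to note: `SCALE` is taken as 4096 to match the `Predict` counter earlier in the log, `LIMIT` has no value in the log so 64 is a placeholder, the `Weight` struct reuses the same shift-by-5 update as a bit counter, and the linear `mix` formula is an illustration rather than anything stated in the chat.

```cpp
#include <cassert>

constexpr int SCALE = 4096;  // 12-bit probability scale (assumed)
constexpr int LIMIT = 64;    // hypothetical dead zone; no value is given in the log

struct Weight {
    int w = SCALE / 2;                     // weight of model 1, in probability scale
    void update(bool model1_was_better) {  // updated like a bit counter
        if (model1_was_better) w += (SCALE - w) >> 5;
        else                   w -= w >> 5;
    }
};

// p1, p2: each model's estimate of P(bit == 1); b: the bit that was coded.
inline void adapt_weight(Weight& w, int p1, int p2, int b) {
    assert(b == 0 || b == 1);
    // Difference between the probabilities the two models assigned to bit b.
    int dist = b ? (p1 - p2) : (SCALE - p1) - (SCALE - p2);
    bool f_update = (dist > LIMIT) || (-dist > LIMIT);  // skip near-ties
    bool f_dir = dist > 0;                              // true if model 1 did better
    if (f_update) w.update(f_dir);
}

// Assumed linear mix of the two predictions using the adapted weight.
inline int mix(const Weight& w, int p1, int p2) {
    return (p1 * w.w + p2 * (SCALE - w.w)) / SCALE;
}
```

The dead zone means the weight only moves when one model was clearly better for the coded bit, so small disagreements do not add noise to it.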
| !grep ratings*.coli | 2010-01-16 19:10:07 |
| !grep ratings.*coli | 2010-01-16 19:10:14 |
| http://compressionratings.com/s_dna.full.html | 2010-01-16 19:10:40 |
<scott___> | thanks | 2010-01-16 19:11:19 |
| booked marked it and wrote done yours and paq value | 2010-01-16 19:14:23 |
| not done down | 2010-01-16 19:14:37 |
| those 2 other things ...nishi.dream are empty | 2010-01-16 19:31:17 |
<Shelwien> | what? | 2010-01-16 19:31:31 |
| ah, of course they're empty as search didn't find anything | 2010-01-16 19:31:50 |
<scott___> | why did you send the links then? | 2010-01-16 19:32:30 |
<Shelwien> | i didn't | 2010-01-16 19:32:39 |
| i tried to find the link in the chat log | 2010-01-16 19:32:52 |
<scott___> | thats ok the page you sent has everything about the compressed lengths doesn't it | 2010-01-16 19:33:41 |
<Shelwien> | dunno what you mean ;) | 2010-01-16 19:34:14 |
<pmcontext> | i reading | 2010-01-16 19:34:30 |
<scott___> | the link to dna.full.htm have all the data | 2010-01-16 19:34:44 |
| had | 2010-01-16 19:34:49 |
| yours is 1108006 | 2010-01-16 19:36:12 |
<Shelwien> | http://encode.dreamhosters.com/showpost.php?p=9527&postcount=10 | 2010-01-16 19:37:55 |
<scott___> | so 1101678 paq8p should be the thing to beat | 2010-01-16 19:40:16 |
<Shelwien> | well, it'd be already cool if you make it under 1159672 ;) | 2010-01-16 19:41:32 |
<scott___> | I think I will try 2 other families of compressors. I notice that when I increase from 8 bits to 16 and even 19 which is weird I get better compression on big files | 2010-01-16 19:42:48 |
| since the basis unit is 2 bits i think I will try 2 4 6 8 10 12 14 16 18 and just see what they do in pure arithmetic | 2010-01-16 19:44:11 |
| basis meant basic | 2010-01-16 19:44:27 |
| I do everything in binary cells. arb255 style does it in 8 bit bytes, but I can wire the cells so there is no boundary and use it to always get the next byte. with most files this sucks, but with files of two bit units it might work nice | 2010-01-16 19:47:30 |
| 8 bits per byte | 2010-01-16 19:48:00 |
<Shelwien> | whatever, just post some result already ;) | 2010-01-16 19:49:34 |
<scott___> | I hope my bwt does the best of what I am trying but feel I have to test several methods. I will pick the best and show it here. Then I may try to play the cell-tuning game you're good at | 2010-01-16 19:49:50 |
| ok no more on it till I have results fair enough | 2010-01-16 19:50:13 |
| later guys I am in a coding mood bye | 2010-01-16 19:53:14 |
*** scott___ has left the channel | 2010-01-16 19:53:29 |
*** Krugz has joined the channel | 2010-01-16 20:14:23 |
*** pmcontext has left the channel | 2010-01-16 20:46:13 |
*** encode has joined the channel | 2010-01-16 21:00:06 |
<encode> | windows 7 works like a charm | 2010-01-16 21:00:23 |
| also i saved a hidden 4 gb on my laptop | 2010-01-16 21:01:32 |
| making a system disk larger | 2010-01-16 21:02:03 |
*** encode has left the channel | 2010-01-16 21:03:08 |
<Shelwien> | toffer, seems the opt already quit (with 1.790804) - http://nishi.dreamhosters.com/u/m1_20100116.rar | 2010-01-16 21:28:19 |
| !next | 2010-01-16 23:09:04 |