*** toffer has joined the channel | 2009-09-30 21:44:05 |
<toffer> | somehow my browser crashed | 2009-09-30 21:48:20 |
| somehow i think i really need to grab a standalone irc client ^^ | 2009-09-30 21:48:57 |
<Shelwien> | mirc? | 2009-09-30 21:49:10 |
| ah, linux... | 2009-09-30 21:49:17 |
| well, there're many too | 2009-09-30 21:49:21 |
<toffer> | as i said it needs to be cross-platform | 2009-09-30 21:50:57 |
| since i got all important software for windows and linux | 2009-09-30 21:51:11 |
<Shelwien> | well, firefox should have a built-in irc client | 2009-09-30 21:51:32 |
<toffer> | mostly openoffice, latex, inkscape, octave, codeblocks, gcc/mingw, firefox, thunderbird | 2009-09-30 21:51:35 |
| really | 2009-09-30 21:51:40 |
<Shelwien> | yeah, try opening irc://irc.irchighway.org/compression in firefox | 2009-09-30 21:52:01 |
*** toffer has left the channel | 2009-09-30 21:53:47 |
*** cm has joined the channel | 2009-09-30 21:54:47 |
<cm> | well i had to install it | 2009-09-30 21:55:00 |
| but the webchat looks far more convenient to me | 2009-09-30 21:55:15 |
<Shelwien> | ;) | 2009-09-30 21:55:45 |
<toffer> | at least here's copy and paste functionality | 2009-09-30 21:57:13 |
| looks a bit ugly, but ok - guess i'll keep it | 2009-09-30 21:57:23 |
<Shelwien> | yeah, also webchat won't be able to access bots | 2009-09-30 21:57:49 |
| ...though i guess DCC doesn't work there either... | 2009-09-30 21:58:13 |
<toffer> | [ERROR] Internal error dispatching command “dcc-accept”. | 2009-09-30 21:58:41 |
| [ERROR] Must be in REQUESTED state and direction GET. | 2009-09-30 21:58:42 |
<Shelwien> | internal errors are fun | 2009-09-30 21:59:04 |
| i guess it would accept a file though | 2009-09-30 21:59:38 |
<toffer> | an internal error report is still better than a segfault or a failed assertion :) | 2009-09-30 22:00:26 |
<Shelwien> | ...doesn't accept a file either, it seems | 2009-09-30 22:00:54 |
<toffer> | well i did accept | 2009-09-30 22:01:18 |
| but it doesn't work due to university firewall i guess | 2009-09-30 22:01:32 |
| maybe if you change the port | 2009-09-30 22:01:45 |
| <1000 | 2009-09-30 22:01:48 |
<Shelwien> | nah, its direct p2p | 2009-09-30 22:02:17 |
<toffer> | direct connections are blocked for ports > 1000 | 2009-09-30 22:02:38 |
<Shelwien> | well, whatever | 2009-09-30 22:02:38 |
<toffer> | ] Got DCC File Transfer offer from “Shelwien” (91.124.210.54:1024) | 2009-09-30 22:02:47 |
<Shelwien> | well, i can try... | 2009-09-30 22:03:08 |
<toffer> | you don't need to | 2009-09-30 22:03:22 |
| we got ftp and stuff so why bother | 2009-09-30 22:03:30 |
| ^^ | 2009-09-30 22:03:32 |
<Shelwien> | hmm... maybe works... | 2009-09-30 22:04:42 |
| blocking ports >1000 is weird though... | 2009-09-30 22:04:58 |
*** toffer has left the channel | 2009-09-30 22:05:37 |
*** toffer has joined the channel | 2009-09-30 22:06:37 |
<toffer> | weird | 2009-09-30 22:06:43 |
<Shelwien> | on my side it said "transfer complete" ;) | 2009-09-30 22:09:57 |
<toffer> | -rwxrwxr-x 1 cm cm 0 2009-10-01 00:04 /mnt/shared_extern/temp/arith-speedups.pdf | 2009-09-30 22:10:50 |
| well here it just froze | 2009-09-30 22:10:56 |
<Shelwien> | ... | 2009-09-30 22:11:03 |
| well, i wasn't going to rely on that for sending you files anyway ;) | 2009-09-30 22:11:31 |
<toffer> | yep. me neither | 2009-09-30 22:11:45 |
<Shelwien> | but actually irc is pretty convenient for filesharing | 2009-09-30 22:11:50 |
| more than icq at least | 2009-09-30 22:12:08 |
| btw, that pdf reminded me | 2009-09-30 22:13:43 |
| did you check whether you have (or can have) access to DCC articles on ieee site? | 2009-09-30 22:14:03 |
<toffer> | i guess yes | 2009-09-30 22:15:49 |
| at least i once tried | 2009-09-30 22:15:58 |
| but not everything was accessible though | 2009-09-30 22:16:25 |
<Shelwien> | ? | 2009-09-30 22:16:42 |
| also, i wonder if i'd be able to access that if i'd pay that membership free.. | 2009-09-30 22:17:44 |
| *fee | 2009-09-30 22:17:47 |
<toffer> | dunno | 2009-09-30 22:20:53 |
| it's some university license | 2009-09-30 22:20:59 |
| they sell some kind of access packages | 2009-09-30 22:21:09 |
| and the university got some kind of access all flatrate | 2009-09-30 22:21:23 |
<Shelwien> | well, it'd be really nice of you | 2009-09-30 22:23:24 |
| if you could download all the DCC articles since 2005 ;) | 2009-09-30 22:23:47 |
| or explain how else i can get them without paying $25 per article ;) | 2009-09-30 22:24:06 |
<toffer> | no problem | 2009-09-30 22:24:15 |
| but it'll take some time | 2009-09-30 22:24:34 |
<Shelwien> | i'm not in a hurry at all | 2009-09-30 22:24:58 |
| i'm just trying to collect them ;) | 2009-09-30 22:25:04 |
<toffer> | is there any easy way to get all the urls | 2009-09-30 22:25:06 |
| cause i could use wget then | 2009-09-30 22:25:16 |
<Shelwien> | that's probably unlikely, because you can't download them without logging in | 2009-09-30 22:25:52 |
| hell, they sell them ;) | 2009-09-30 22:25:56 |
| i'd write some perl parsers in such cases though | 2009-09-30 22:26:31 |
| but dunno whether you'd bother with that | 2009-09-30 22:26:41 |
<toffer> | i don't need to log in | 2009-09-30 22:26:54 |
| it's detected via ip ranges | 2009-09-30 22:27:08 |
<Shelwien> | you have to check it yourself then ;) | 2009-09-30 22:27:33 |
| its not like that for me unfortunately ;) | 2009-09-30 22:27:45 |
| or i guess you can setup a proxy for me there ;) | 2009-09-30 22:27:57 |
| eg. i like this one - http://www.3proxy.ru/download/ | 2009-09-30 22:29:16 |
| btw http://91.124.210.54/list.txt | 2009-09-30 22:29:30 |
| list of compression-related OCRed books i have | 2009-09-30 22:30:00 |
*** pinc has joined the channel | 2009-09-30 22:37:38 |
| btw, any progress with m1? ;) | 2009-09-30 22:52:49 |
| like multiple submodels? ;) | 2009-09-30 22:52:57 |
| btw, i guess you're not going to try anything funny with rangecoders? | 2009-09-30 22:53:45 |
<toffer> | i played around with wget | 2009-09-30 23:00:49 |
| and these ursl | 2009-09-30 23:00:52 |
| urls | 2009-09-30 23:00:53 |
| i can automatically download everything now ^^ | 2009-09-30 23:01:03 |
| well i've made a mixing scheme which combines static estimations only via additions | 2009-09-30 23:01:49 |
| i can now implement a mixing tree via that | 2009-09-30 23:02:45 |
<Shelwien> | i'm not sure why you would need static mixing | 2009-09-30 23:04:08 |
| can't you just make a fsm to work like that? | 2009-09-30 23:04:21 |
<toffer> | downloading dcc 2005 | 2009-09-30 23:10:17 |
| it doesn't require any auxiliary memory | 2009-09-30 23:10:40 |
<Shelwien> | and what's good in that and static mixing anyway? ;) | 2009-09-30 23:11:38 |
<toffer> | it's used as a context for sse | 2009-09-30 23:13:47 |
| after a few papers wget failed "All online seats are currently occupied." | 2009-09-30 23:14:03 |
<Shelwien> | %) | 2009-09-30 23:14:18 |
| i guess you can setup a retry there... | 2009-09-30 23:14:33 |
<toffer> | yep | 2009-09-30 23:14:46 |
| anyway you can be sure to get that stuff next week | 2009-09-30 23:15:57 |
| all of it | 2009-09-30 23:15:58 |
| well maybe they block such script downloads (quickly accessing urls) | 2009-09-30 23:16:59 |
<Shelwien> | try adding some sleep between wget calls maybe? | 2009-09-30 23:19:21 |
<toffer> | it's a cookie issue | 2009-09-30 23:23:34 |
| but i can fix that i guess | 2009-09-30 23:23:39 |
| seems to work now | 2009-09-30 23:33:47 |
| fetching everything | 2009-09-30 23:33:52 |
| ^^ | 2009-09-30 23:33:54 |
<Shelwien> | %) | 2009-09-30 23:36:14 |
<toffer> | mh | 2009-09-30 23:41:49 |
| somehow it still doesnt | 2009-09-30 23:41:53 |
<Shelwien> | delays and retries? | 2009-09-30 23:42:34 |
<toffer> | retries don't work | 2009-09-30 23:43:29 |
| but i can run the script with something like 15min delay | 2009-09-30 23:43:39 |
| that'd be a normal usage pattern | 2009-09-30 23:43:44 |
| that's the session expire limit | 2009-09-30 23:43:57 |
<Shelwien> | no, i mean | 2009-09-30 23:44:10 |
| you can check if its saved a pdf | 2009-09-30 23:44:19 |
*** pinc|mirror has joined the channel | 2009-09-30 23:44:23 |
<toffer> | i know | 2009-09-30 23:44:25 |
<Shelwien> | and retry if it didn't | 2009-09-30 23:44:29 |
<toffer> | that's what i did | 2009-09-30 23:44:30 |
| but the html page it generated | 2009-09-30 23:44:37 |
| said that the seat limit is reached | 2009-09-30 23:44:50 |
| and seats expire after 15 minutes of inactivity | 2009-09-30 23:45:00 |
<Shelwien> | ok, though i still think that there's no sense to wait for 15 min | 2009-09-30 23:45:28 |
| just add some delay and check what it stored | 2009-09-30 23:46:18 |
<toffer> | the cookies store some session id | 2009-09-30 23:46:44 |
<Shelwien> | yeah | 2009-09-30 23:46:57 |
| you can get it with wget i thin | 2009-09-30 23:47:05 |
<toffer> | but re-using the cookies doesn't work somehow | 2009-09-30 23:47:07 |
<Shelwien> | *think | 2009-09-30 23:47:07 |
<toffer> | i used --load-cookie and --save-cookie | 2009-09-30 23:47:30 |
<Shelwien> | you can open some html page which gives you a cookie first | 2009-09-30 23:47:32 |
| yeah | 2009-09-30 23:47:34 |
| and then get the pdf | 2009-09-30 23:47:39 |
| but there might be an additional option necessary | 2009-09-30 23:47:58 |
| like | 2009-09-30 23:48:13 |
| --keep-session-cookies load and save session (non-permanent) cookies. | 2009-09-30 23:48:13 |
<toffer> | that's what grep told me too ^^ | 2009-09-30 23:49:07 |
*** pinc has left the channel | 2009-09-30 23:49:45 |
| it seems to work with a 30sec delay | 2009-09-30 23:51:55 |
| guess tomorrow everything will be here | 2009-09-30 23:54:09 |
| but i gonna go to bed now | 2009-09-30 23:54:13 |
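Editor's sketch of the download loop discussed above: keep the session cookie across requests (the effect `--keep-session-cookies` has in wget), wait between requests, and retry when the response is not a PDF. This is not toffer's script. The function names, the `%PDF` check, and the retry count are assumptions made for illustration; only the 30 second delay comes from the log.

```python
import time
from http.cookiejar import CookieJar
from urllib.request import build_opener, HTTPCookieProcessor


def make_session_fetcher():
    """Return fetch(url) -> bytes that reuses one in-memory cookie jar,
    so a session cookie set by the first page is sent with later requests."""
    opener = build_opener(HTTPCookieProcessor(CookieJar()))

    def fetch(url):
        with opener.open(url) as resp:
            return resp.read()

    return fetch


def looks_like_pdf(data):
    # PDF files start with the magic bytes "%PDF"; an HTML error page
    # such as a "seats occupied" notice does not.
    return data.startswith(b"%PDF")


def fetch_all(urls, fetch, delay=30.0, max_tries=3, sleep=time.sleep):
    """Fetch every URL, pausing `delay` seconds after each request and
    retrying up to `max_tries` times when the body is not a PDF.
    Returns {url: bytes or None}."""
    results = {}
    for url in urls:
        results[url] = None
        for _ in range(max_tries):
            data = fetch(url)
            sleep(delay)  # pause after every request, successful or not
            if looks_like_pdf(data):
                results[url] = data
                break
    return results
```

With a real site one would call `fetch_all(url_list, make_session_fetcher())`; passing `fetch` and `sleep` as parameters keeps the retry logic testable without network access.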
<Shelwien> | i should too, i guess | 2009-09-30 23:54:33 |
| bye ;) | 2009-09-30 23:54:35 |
<toffer> | gn8 | 2009-09-30 23:55:01 |
*** toffer has left the channel | 2009-09-30 23:55:09 |
*** pinc|mirror has left the channel | 2009-10-01 00:18:22 |
*** Shelwien has left the channel | 2009-10-01 02:06:42 |
*** pinc has joined the channel | 2009-10-01 07:19:17 |
*** Shelwien has joined the channel | 2009-10-01 07:32:53 |
*** pinc has left the channel | 2009-10-01 08:03:19 |
*** pinc has joined the channel | 2009-10-01 08:30:09 |
*** pinc|mirror has joined the channel | 2009-10-01 08:40:26 |
*** pinc has left the channel | 2009-10-01 08:41:12 |
*** toffer has joined the channel | 2009-10-01 08:58:56 |
<Shelwien> | hi | 2009-10-01 08:59:37 |
<toffer> | slept already? | 2009-10-01 08:59:45 |
<Shelwien> | not sure | 2009-10-01 09:00:07 |
<toffer> | well you have to know | 2009-10-01 09:01:49 |
| otherwise i guess you have serious trouble -.- | 2009-10-01 09:02:00 |
<Shelwien> | not sure whether it counts ;) | 2009-10-01 09:02:10 |
<toffer> | yesterday i downloaded the whole dcc 2005 | 2009-10-01 09:02:23 |
| that additional cookie option helped | 2009-10-01 09:02:31 |
<Shelwien> | ;) | 2009-10-01 09:02:36 |
<toffer> | the rest will be here today | 2009-10-01 09:02:43 |
| but currently i'm under windows | 2009-10-01 09:02:54 |
<Shelwien> | there's wget for windows too ;) | 2009-10-01 09:03:07 |
| anyway, do you have a place where to upload it? | 2009-10-01 09:05:02 |
<toffer> | yeah | 2009-10-01 09:06:32 |
| i'll upload when i got everything | 2009-10-01 09:08:45 |
<Shelwien> | sure | 2009-10-01 09:08:52 |
<toffer> | but there're a lot of "papers" which are just a bit more than an abstract | 2009-10-01 09:09:01 |
<Shelwien> | weird | 2009-10-01 09:09:22 |
<toffer> | and what else happened at your side? | 2009-10-01 09:12:28 |
<Shelwien> | nothing probably | 2009-10-01 09:12:54 |
*** Shelwien has left the channel | 2009-10-01 09:13:55 |
*** Guest9968193 has joined the channel | 2009-10-01 09:13:59 |
| <Shelwien> thinking how to make that damned LZ thing to work in one pass | 2009-10-01 09:17:25 |
| there's a data window and hashtable window | 2009-10-01 09:20:10 |
| and now i have to undo changes in these when a match is found | 2009-10-01 09:21:00 |
<toffer> | if it's just arithmetic ops | 2009-10-01 09:23:16 |
| you could simply undo it | 2009-10-01 09:23:24 |
<Shelwien> | not really | 2009-10-01 09:24:14 |
<toffer> | dunno about your internal structure | 2009-10-01 09:24:21 |
| a+b-a=b | 2009-10-01 09:24:29 |
<Shelwien> | there's another window for rolling hashes | 2009-10-01 09:24:30 |
| and that's not the main problem | 2009-10-01 09:24:56 |
| just that its too complicated and annoying | 2009-10-01 09:25:15 |
| for a task like this | 2009-10-01 09:25:22 |
| well, i guess it could be much simpler if i didn't try to speed-optimize it ;) | 2009-10-01 09:26:14 |
| atm processes enwik9 in 7s btw | 2009-10-01 09:26:31 |
| finding matches at any distance | 2009-10-01 09:27:14 |
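Editor's sketch of the "a+b-a=b" idea applied to a rolling hash: a polynomial hash modulo a prime can be slid forward by one byte and then restored exactly by the inverse step. This only illustrates why arithmetic updates are undoable. It is not Shelwien's implementation, and as he notes the hash window is not the main problem. `BASE`, `MOD`, and all names are assumptions.

```python
MOD = (1 << 61) - 1   # a Mersenne prime, so BASE has a modular inverse
BASE = 257            # hypothetical multiplier


def hash_of(data):
    """Polynomial hash of a byte string; the first byte gets the highest power."""
    v = 0
    for b in data:
        v = (v * BASE + b) % MOD
    return v


class RollingHash:
    def __init__(self, window, value=0):
        self.top = pow(BASE, window - 1, MOD)   # weight of the oldest byte
        self.inv_base = pow(BASE, -1, MOD)      # needs Python 3.8+
        self.value = value

    def slide(self, old, new):
        """Drop byte `old` from the front of the window and append `new`."""
        self.value = ((self.value - old * self.top) * BASE + new) % MOD

    def unslide(self, old, new):
        """Exact inverse of slide(old, new): restores the previous hash."""
        self.value = ((self.value - new) * self.inv_base + old * self.top) % MOD
```

The caller has to remember which bytes were dropped and appended to undo a step, which is the bookkeeping that makes rollback across several windows tedious.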
<toffer> | that's pretty good | 2009-10-01 09:28:02 |
| btw how are the statistics | 2009-10-01 09:28:11 |
<Shelwien> | ? | 2009-10-01 09:28:18 |
<toffer> | e.g., a match length distribution | 2009-10-01 09:28:24 |
<Shelwien> | well, it finds only longer matches | 2009-10-01 09:28:51 |
| like 100+ bytes | 2009-10-01 09:28:54 |
<toffer> | still acceptable as a preprocessor | 2009-10-01 09:29:23 |
| could you lower that limit to say 20 | 2009-10-01 09:29:37 |
<Shelwien> | well, there're algorithm-specific quirks | 2009-10-01 09:30:12 |
| but yes, basically | 2009-10-01 09:30:21 |
| "100" is just a random setting | 2009-10-01 09:30:38 |
| in fact, i think it should still support changing it as a commandline option %) | 2009-10-01 09:30:57 |
<toffer> | well how much compression do you get when replacing 100 byte strings with 1 byte tokens | 2009-10-01 09:31:22 |
<Shelwien> | there're 16500 matches with >100 len | 2009-10-01 09:33:18 |
<toffer> | and altogether | 2009-10-01 09:34:14 |
| just 1.6% | 2009-10-01 09:34:20 |
| whoops | 2009-10-01 09:34:58 |
| 0.16 | 2009-10-01 09:35:02 |
| guess to use it as a preprocessor one needs to drastically lower the length | 2009-10-01 09:35:28 |
| the match length distribution i've seen looked like a laplacian distribution | 2009-10-01 09:36:10 |
| can you easily lower the length limit? | 2009-10-01 09:36:50 |
<Shelwien> | 7747637 bytes in matched strings | 2009-10-01 09:38:50 |
<toffer> | for e9? | 2009-10-01 09:39:30 |
<Shelwien> | yeah | 2009-10-01 09:39:39 |
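Editor's sketch of the arithmetic behind these figures. toffer's 0.16% appears to come from 16500 matches times the 100 byte minimum length, relative to enwik9 (10^9 bytes). Using Shelwien's total of 7747637 matched bytes instead gives roughly 0.77%. The 1 byte token cost follows toffer's question; the function name is an assumption, and real savings would also depend on how offsets and lengths are coded.

```python
ENWIK9_SIZE = 10**9  # enwik9 is the first 10^9 bytes of the Wikipedia dump


def saving_fraction(matched_bytes, n_matches, token_bytes=1, total=ENWIK9_SIZE):
    """Fraction of the input removed if every match is replaced
    by a token of `token_bytes` bytes."""
    return (matched_bytes - n_matches * token_bytes) / total


# Lower bound: every match is exactly 100 bytes long.
lower = saving_fraction(16500 * 100, 16500)
# Using the reported total of bytes inside matched strings.
reported = saving_fraction(7747637, 16500)
print(f"{lower:.2%} {reported:.2%}")
```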
<toffer> | what about a match length of 20 | 2009-10-01 09:39:58 |
<Shelwien> | its kinda troublesome, but i'd try | 2009-10-01 09:41:19 |
| 29s like that | 2009-10-01 09:41:49 |
| ...and a 500M hashtable... figures... | 2009-10-01 09:42:15 |
| 4.6M matches like that | 2009-10-01 09:43:55 |
| ...but i guess i won't be able to sum it up with console utils | 2009-10-01 09:44:40 |
<toffer> | ok | 2009-10-01 09:46:21 |
| ^^ | 2009-10-01 09:46:23 |
<Shelwien> | actually there's not much sense in LZ for enwik anyway | 2009-10-01 09:47:47 |
| even in template-based articles there aren't that many complete matches | 2009-10-01 09:48:19 |
<toffer> | i guess there're long matches (>15 bytes) due to repeating phrases from time to time. and xml stuff | 2009-10-01 09:48:51 |
<Shelwien> | well, maybe some kind of more advanced analysis would help | 2009-10-01 09:48:53 |
| well, dunno about 20-byte matches | 2009-10-01 09:49:25 |
| but i looked at longer ones | 2009-10-01 09:49:32 |
| and there was stuff like reference links in some soccer articles %) | 2009-10-01 09:49:55 |
*** pinc has joined the channel | 2009-10-01 10:22:34 |
*** pinc|mirror has left the channel | 2009-10-01 10:26:30 |
<toffer> | changing m1 to 3 models just required to add 2 lines of code to the main model :) | 2009-10-01 11:49:05 |
<Shelwien> | ;) | 2009-10-01 11:49:27 |
<toffer> | a prediction is found via | 2009-10-01 11:49:37 |
| sse2d( quant[model0.state], mix2( quant[model1], quant[model2] ) ) | 2009-10-01 11:50:22 |
| mix2 is static | 2009-10-01 11:50:24 |
| can be made dynamic... | 2009-10-01 11:50:52 |
| but let's see how that performs | 2009-10-01 11:50:57 |
<Shelwien> | i'd not call that 3 models with static mix | 2009-10-01 11:51:06 |
<toffer> | the reason for mixing quantised predictions is to take advantage of the nonlinearity contained in the fsm probability quantisation. in fact it's like mixing stretched predictions | 2009-10-01 11:53:19 |
| and how would you call that? | 2009-10-01 11:53:28 |
| i really like the simplicity in that scheme. the prediction function just is sse( q[s0] + (q[s1]+q[s2]>>N) ) | 2009-10-01 11:57:31 |
| the whole complexity is moved into initialization | 2009-10-01 11:57:47 |
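Editor's sketch of a prediction function shaped like `sse( q[s0] + (q[s1]+q[s2]>>N) )`, with the work moved into table initialisation. This is a guess at the general idea, not toffer's m1 code. The table sizes, the clamp, the assumed per-state probabilities, and all names are assumptions. Note that `+` binds tighter than `>>`, so the sum is shifted as a whole.

```python
import math

ROWS, COLS, N = 16, 16, 1   # hypothetical table sizes; N = 1 averages two inputs
LIM = 8.0                   # hypothetical clamp in the stretched domain


def stretch(p):
    return math.log(p / (1.0 - p))


def squash(x):
    return 1.0 / (1.0 + math.exp(-x))


def bucket(p, levels):
    """Quantise stretch(p) into an integer in 0..levels-1."""
    x = max(-LIM, min(LIM, stretch(p)))
    return min(levels - 1, int((x + LIM) / (2 * LIM) * levels))


def build_tables(state_probs):
    """Initialisation: map each fsm state to a quantised stretched probability.
    q0 is premultiplied by COLS so it directly selects an SSE row."""
    q0 = [bucket(p, ROWS) * COLS for p in state_probs]
    q12 = [bucket(p, COLS) for p in state_probs]
    return q0, q12


def init_sse():
    """Each cell starts at the probability of its column midpoint."""
    row = [squash(((c + 0.5) / COLS * 2 - 1) * LIM) for c in range(COLS)]
    return row * ROWS


def predict(sse, q0, q12, s0, s1, s2):
    # Row from model 0; column is the average of the two quantised inputs,
    # i.e. a static w = 0.5 mix in the stretched domain.
    return sse[q0[s0] + ((q12[s1] + q12[s2]) >> N)]
```

A real coder would also update the selected SSE cell after each coded bit; that step is left out here.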
<Shelwien> | "encode" likes that too ;) | 2009-10-01 11:58:05 |
<toffer> | just look at distribution or fs classes | 2009-10-01 11:58:07 |
| fsm | 2009-10-01 11:58:12 |
| hey - it's no lz | 2009-10-01 11:58:26 |
| ^^ | 2009-10-01 11:58:27 |
<Shelwien> | no i meant his bcm etc | 2009-10-01 11:58:43 |
<toffer> | so how would you call that scheme? | 2009-10-01 11:59:07 |
<Shelwien> | that's still 2 models i think, one of these is just more complex than a simple counter | 2009-10-01 12:00:08 |
| imho its a matter of "degree of freedom" | 2009-10-01 12:00:46 |
| and adaptivity | 2009-10-01 12:00:55 |
<toffer> | well there're 3 context mask | 2009-10-01 12:01:30 |
| s | 2009-10-01 12:01:32 |
<Shelwien> | but not that i care if it'd really improve the compression ;) | 2009-10-01 12:01:39 |
<toffer> | i came up with two layouts | 2009-10-01 12:02:10 |
| this one | 2009-10-01 12:02:19 |
| and the obvious solution of mixing 2xm1 | 2009-10-01 12:02:40 |
| well for 3 models i'd fix the other input | 2009-10-01 12:03:09 |
<Shelwien> | it'd be interesting though, if you could compare static mix vs adaptive mix vs sse2 there | 2009-10-01 12:06:18 |
<toffer> | i will do that | 2009-10-01 12:09:23 |
| but that'll take some time | 2009-10-01 12:09:28 |
| and "some" should be months, since my spare time shrinks more and more | 2009-10-01 12:09:44 |
| at least now there's a flexible framework to "plug together" and quantise distributions represented as an array of counters | 2009-10-01 12:12:10 |
<Shelwien> | err... why would it take months to replace that static mix with another sse2? | 2009-10-01 12:13:42 |
<toffer> | no more time | 2009-10-01 12:18:02 |
| and in future work | 2009-10-01 12:18:09 |
| i guess the best overall variant will be mix(sse2,sse2) | 2009-10-01 12:22:37 |
<Shelwien> | yeah, probably | 2009-10-01 12:23:01 |
| sse2(sse2()) might work too, but its unstable | 2009-10-01 12:23:35 |
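Editor's sketch contrasting a static mix with an adaptive one, using the commonly described logistic mixing scheme (weights updated in proportion to the coding error times the stretched input). It is not taken from m1; the learning rate, initial weights, and class name are assumptions.

```python
import math


def stretch(p):
    return math.log(p / (1.0 - p))


def squash(x):
    return 1.0 / (1.0 + math.exp(-x))


def static_mix(p1, p2, w=0.5):
    """Fixed-weight mix in the stretched domain."""
    return squash(w * stretch(p1) + (1.0 - w) * stretch(p2))


class AdaptiveMix2:
    """Two-input logistic mixer whose weights follow the data."""

    def __init__(self, lr=0.02):
        self.w = [0.5, 0.5]
        self.lr = lr
        self.x = [0.0, 0.0]
        self.p = 0.5

    def predict(self, p1, p2):
        self.x = [stretch(p1), stretch(p2)]
        self.p = squash(self.w[0] * self.x[0] + self.w[1] * self.x[1])
        return self.p

    def update(self, bit):
        # Gradient step on coding cost: inputs that pointed toward the
        # actual bit gain weight, inputs that pointed away lose it.
        err = bit - self.p
        for i in range(2):
            self.w[i] += self.lr * err * self.x[i]
```

Because the weights move with the data, an input that stops helping is down-weighted gradually rather than switched off by a single optimised parameter.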
<toffer> | wow there're the first failed assertions :) | 2009-10-01 12:28:33 |
<Shelwien> | %) | 2009-10-01 12:28:46 |
<toffer> | it was just a typo in the source | 2009-10-01 12:42:21 |
| optimization is running now | 2009-10-01 12:42:25 |
| the optimizer blocked the statically mixed models... | 2009-10-01 13:06:25 |
<Shelwien> | huh... | 2009-10-01 13:06:51 |
<toffer> | that's because the parameter which controls blocking is just 7 bit, while the two models are configured via 2x80 bit | 2009-10-01 13:06:54 |
<Shelwien> | well, why don't you start with it then? | 2009-10-01 13:07:03 |
| i mean, optimize that static mix first | 2009-10-01 13:07:13 |
| and then add sse2 and another model | 2009-10-01 13:07:23 |
<toffer> | so to quickly improve compression it's more likely to just drop that 8 bit | 2009-10-01 13:07:30 |
| erm 7 | 2009-10-01 13:08:12 |
| well i fixed that parameter now | 2009-10-01 13:08:17 |
| and guess what | 2009-10-01 13:11:23 |
| it's blocked again | 2009-10-01 13:11:28 |
| via another parameter | 2009-10-01 13:11:31 |
| ^^ | 2009-10-01 13:11:33 |
<Shelwien> | ;) | 2009-10-01 13:11:39 |
<toffer> | adaptive mixing won't do that | 2009-10-01 13:11:42 |
| i'll now fix the static mix to w=0.5 to see what happens | 2009-10-01 13:11:58 |
| now there's another problem with collisions | 2009-10-01 13:27:28 |
| ^^ | 2009-10-01 13:27:41 |
| it doesn't stop | 2009-10-01 13:27:46 |
| guess i'll have to try layout 2 this weekend taking this lesson into account | 2009-10-01 13:28:34 |
*** toffer has left the channel | 2009-10-01 13:42:54 |
*** toffer has joined the channel | 2009-10-01 14:10:42 |
*** pinc has left the channel | 2009-10-01 15:31:08 |
*** toffer has left the channel | 2009-10-01 18:22:12 |
<Shelwien> | !next | 2009-10-01 19:56:41 |