*** Shelwien has left the channel2009-09-09 02:53:59
*** pinc has joined the channel2009-09-09 07:08:27
*** pinc has left the channel2009-09-09 08:22:31
*** pinc has joined the channel2009-09-09 08:45:45
*** Shelwien has joined the channel2009-09-09 11:01:08
<osman> here is a really lazy programmer: http://imafrogg.com/blog/jpeg-text-compression/2009-09-09 11:15:47
 :)2009-09-09 11:15:49
<Shelwien> ...2009-09-09 11:17:08
 as funny as it may sound, there's some sense in using the visual text representation for compression2009-09-09 11:21:17
 (and other tasks too)2009-09-09 11:21:50
<osman> it's another topic i think2009-09-09 11:22:09
 i know exactly what you mean2009-09-09 11:22:18
<Shelwien> like, how spammers write some keywords in their mails?2009-09-09 11:22:19
 also the same applies to the audio version ;)2009-09-09 11:23:18
 but both are only usable as contexts, not as the main stream 2009-09-09 11:23:41
<osman> yep. that's the point IMO2009-09-09 11:29:52
<Shelwien> ...2009-09-09 11:34:32
 btw, what do you think about my static compression idea?2009-09-09 11:35:14
 i mean, the one with log2(c[i]^c) contexts?2009-09-09 11:35:42
 i was thinking about fma-delta2009-09-09 11:36:47
 and, well, if there's a data window, and larger window is better2009-09-09 11:37:23
 then it might be reasonable to compress the data in there ;)2009-09-09 11:37:38
 and then, for hashing there's also a sense to use some compression2009-09-09 11:39:03
 so it seems more practical to use the same coding for both2009-09-09 11:39:37
 but hashing requires that coding to be completely static2009-09-09 11:39:53
 because otherwise hashes for different files won't match ;)2009-09-09 11:40:14
<osman> looks interesting at least :)2009-09-09 11:44:06
<Shelwien> do you understand the idea?2009-09-09 11:44:23
 basically its like extended RLE2009-09-09 11:44:31
<osman> why do you use log2(c[i]^c) as context?2009-09-09 11:44:42
<Shelwien> number of matching MSBs actually2009-09-09 11:44:58
 in context byte and next byte2009-09-09 11:45:12
<osman> ah..ok2009-09-09 11:45:32
<Shelwien> and of course i mean to use multiple such contexts2009-09-09 11:45:35
 like 4-5-62009-09-09 11:45:41
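A minimal sketch of the "number of matching MSBs" context described above, assuming 8-bit symbols. The name `msb_match` is hypothetical and does not come from the log; the coder discussed here may compute the value differently.

```python
def msb_match(a: int, b: int) -> int:
    """Return how many leading bits (0..8) of bytes a and b agree."""
    # a ^ b has its highest set bit at the first mismatching position.
    # bit_length() gives floor(log2(x)) + 1 for x > 0, and 0 for x == 0.
    diff = (a ^ b) & 0xFF
    return 8 - diff.bit_length()


if __name__ == "__main__":
    print(msb_match(ord("A"), ord("a")))  # bytes differ at bit 5 (0x20)
    print(msb_match(0x41, 0x41))          # identical bytes
```

Equal bytes give 8, bytes that differ in the top bit give 0, so each context takes one of 9 values.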
<osman> so, it's somehow an extended REP like coder?2009-09-09 11:45:53
<Shelwien> err... what is?2009-09-09 11:46:05
<osman> i mean whole idea2009-09-09 11:46:24
 with over greater distance than actual window size2009-09-09 11:46:39
<Shelwien> in a way, more or less2009-09-09 11:46:48
 its an engine for fast finding of long matches2009-09-09 11:47:10
 i already posted the remote diff-patch kit based on that2009-09-09 11:47:35
<osman> yeah. i remember2009-09-09 11:47:49
<Shelwien> and next i'm thinking to write a tool similar to xdelta2009-09-09 11:48:08
 but that requires to keep a data window for better efficiency2009-09-09 11:48:30
 and i'm thinking that it might be cool to compress the data in window ;)2009-09-09 11:48:53
 btw, here's that game of mine - http://shelwien.googlepages.com/hopters.com2009-09-09 11:55:42
 seems to be ok at 50k in dosbox too2009-09-09 11:57:47
 arrows/ASWD and left/right shifts2009-09-09 11:58:11
<osman> hehe...2009-09-09 11:59:23
 there is a funny bug2009-09-09 11:59:28
 even after "exploding" i can still shoot :)2009-09-09 11:59:40
<Shelwien> its not a bug ;)2009-09-09 11:59:47
 its justice ;)2009-09-09 11:59:50
<osman> i have a "ultra-futuristic" helicopter now. i can move exploded helicopter with almost no effort 8-)2009-09-09 12:00:47
<Shelwien> ;)2009-09-09 12:01:05
<osman> i think it's really good. i can't see any technical differences between dangerous dave (afair) or yours2009-09-09 12:01:57
 and at that time i really like "dave" :)2009-09-09 12:02:09
<Shelwien> yeah, its actually even playable2009-09-09 12:02:20
 and there was even some networking support %)2009-09-09 12:02:49
 very weird though2009-09-09 12:02:57
<osman> it could be good if i could play against the machine2009-09-09 12:03:14
<Shelwien> i made a 2nd keyboard emulator TSR ;)2009-09-09 12:03:20
<osman> playing with "myself" made thinking2009-09-09 12:03:35
 cool %)2009-09-09 12:03:44
<Shelwien> it was transmitting the keypresses from a different machine2009-09-09 12:03:48
 and pushing them into local keyboard controller2009-09-09 12:04:06
 btw, its undocumented2009-09-09 12:04:12
 but there was a way to store your own value into port 602009-09-09 12:04:35
 and generate IRQ1 even2009-09-09 12:04:39
 it was originally made for MK3 fights though ;)2009-09-09 12:05:18
 to avoid keyboard blocking ;)2009-09-09 12:05:23
<osman> at win9x time, i have tried to read keyboard, comport, mouse etc2009-09-09 12:05:40
 and thought as like that "what if i try to read all ports in a specific range" %)2009-09-09 12:06:03
 voila! i had got a "guarantee" computer freezer :)2009-09-09 12:06:23
<Shelwien> well, dunno how to get that with reading2009-09-09 12:06:46
 but writing would certainly work2009-09-09 12:06:58
<osman> with "in" instruction2009-09-09 12:06:58
<Shelwien> for example, there was that 8253 timer2009-09-09 12:07:08
 one channel of which controlled the memory refresh2009-09-09 12:07:18
 so it was possible to make programs to run a little faster2009-09-09 12:07:51
 with the risk of memory loss2009-09-09 12:08:03
<osman> %)2009-09-09 12:08:57
*** pinc has left the channel2009-09-09 15:07:47
<Shelwien> btw2009-09-09 16:23:57
 how to print exactly what i want on my printer still remains the question2009-09-09 16:24:15
 but i've got another idea2009-09-09 16:24:23
 instead, i can show a picture on the screen ;)2009-09-09 16:24:45
 and take a photo2009-09-09 16:24:56
 and then recover the information out of it2009-09-09 16:25:09
 its a considerably different task2009-09-09 16:25:38
 but would be still a good application for my error-correction ideas ;)2009-09-09 16:26:02
*** Simon|B has joined the channel2009-09-09 17:44:58
*** toffer has joined the channel2009-09-09 17:45:24
<toffer> hi2009-09-09 17:46:45
<Shelwien> hi2009-09-09 17:46:53
* Shelwien is writing the log2(c^c[i]) static coder2009-09-09 17:47:16
<toffer> sorry that i hardly participate - the deadline for my thesis is approaching ^^2009-09-09 17:47:17
 ?2009-09-09 17:48:03
<Shelwien> i told you before2009-09-09 17:48:18
 that i'd like to use some compression before hashing etc2009-09-09 17:48:41
 to improve randomness etc2009-09-09 17:48:47
 but it has to be a static model, same for all files2009-09-09 17:49:14
<toffer> "before" was some time ago2009-09-09 17:49:36
<Shelwien> so i'd invented something like extended RLE2009-09-09 17:49:37
<toffer> and how does it work?2009-09-09 17:50:58
 "in short"2009-09-09 17:51:03
 since i'd leave in ~20 minutes2009-09-09 17:51:12
<Shelwien> as i said... c[i] are previous symbols, and c is current one2009-09-09 17:51:37
 and context is something like2009-09-09 17:51:53
<toffer> well i read the expression differently :)2009-09-09 17:52:10
<Shelwien> log2(c^c[0])2009-09-09 17:52:11
<toffer> ok2009-09-09 17:52:19
<Shelwien> log2(c[0]^c[1])2009-09-09 17:52:20
 etc2009-09-09 17:52:20
<toffer> i do remember that2009-09-09 17:52:21
<Shelwien> basically the number of matching MSBs in symbols2009-09-09 17:52:39
 well, it works more or less2009-09-09 17:52:58
<toffer> do you have any results already?2009-09-09 17:52:59
<Shelwien> with order-4 like that2009-09-09 17:53:07
 9*9*9*9 contexts2009-09-09 17:53:13
<toffer> just 9 ?2009-09-09 17:53:33
<Shelwien> 3.1M->2.1M calgary.tar compression2009-09-09 17:53:39
 matching bits2009-09-09 17:53:49
 0..82009-09-09 17:53:52
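One way to picture the 9*9*9*9 context count: four match counts, each in 0..8, packed into a single base-9 index. This is only a sketch under assumptions. The log does not say exactly which byte pairs feed the four counts, so the layout below (successive pairs of the five most recent bytes) and the names `msb_match` and `context_index` are hypothetical.

```python
def msb_match(a: int, b: int) -> int:
    """Number of leading bits (0..8) on which bytes a and b agree."""
    return 8 - ((a ^ b) & 0xFF).bit_length()


def context_index(history: bytes) -> int:
    """Pack four match counts into one index in range(9**4).

    history: five consecutive bytes, oldest first (assumed layout).
    """
    if len(history) != 5:
        raise ValueError("expected exactly 5 bytes of history")
    idx = 0
    # Four adjacent pairs -> four base-9 digits.
    for prev, cur in zip(history, history[1:]):
        idx = idx * 9 + msb_match(prev, cur)
    return idx


if __name__ == "__main__":
    print(context_index(b"aaaaa"), 9 ** 4)
```

A run of identical bytes lands in the highest index, which is where an RLE-like model would expect the strongest prediction.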
<toffer> ok2009-09-09 17:54:39
<Shelwien> have to do more tests2009-09-09 17:54:53
 and maybe extend the context2009-09-09 17:54:58
<toffer> i guess that kind of context quantisation is well suited for redundant data2009-09-09 17:55:03
<Shelwien> but i think this would be usable2009-09-09 17:55:08
<toffer> i mean directly2009-09-09 17:55:25
<Shelwien> the whole point is2009-09-09 17:55:26
 to compress redundant data2009-09-09 17:55:31
<toffer> not as a generator2009-09-09 17:55:37
<Shelwien> and to not expand anything2009-09-09 17:55:39
 and it has to be a static model2009-09-09 17:55:48
 and that's the idea i've got2009-09-09 17:56:02
 maybe you can suggest something else to apply in this case? 2009-09-09 17:56:54
<toffer> well i cannot imagine anything which would be that fast2009-09-09 17:57:34
 since it's just a lookupp2009-09-09 17:57:39
 lookup2009-09-09 17:57:41
<Shelwien> yeah, but i'm talking about the model2009-09-09 17:57:55
 do you have any alternative ideas for a model2009-09-09 17:58:12
 which would be static2009-09-09 17:58:20
 would allow some compression sometimes2009-09-09 17:58:31
 and would not significantly expand anything2009-09-09 17:58:39
 despite being static2009-09-09 17:58:43
<toffer> some alphabet decomposition based on prefix codes (e.g. huffman)2009-09-09 17:59:27
 would hardly expand anything2009-09-09 17:59:38
 and provide some compression2009-09-09 17:59:47
<Shelwien> well, obviously i plan to use huffman with this coding2009-09-09 17:59:54
 but plain static huffman won't work2009-09-09 18:00:05
<toffer> not static2009-09-09 18:00:09
 dynamic2009-09-09 18:00:14
<Shelwien> not static can't be used in this case2009-09-09 18:00:21
<toffer> but that's still a two pass process2009-09-09 18:00:23
 you can store the tree2009-09-09 18:00:36
<Shelwien> as i need encoded block hashes in different files2009-09-09 18:00:39
 to match2009-09-09 18:00:41
 (for equal substrings)2009-09-09 18:01:06
<toffer> it's for your diff?2009-09-09 18:01:15
<Shelwien> for all of it2009-09-09 18:01:24
 i've started writing it now2009-09-09 18:01:33
<toffer> well storing a huffman tree would be bad for a diff2009-09-09 18:01:40
<Shelwien> because fma-delta needs a data window2009-09-09 18:01:45
<toffer> but still be acceptable for compression2009-09-09 18:01:47
<Shelwien> and more data would fit into the window in compressed form2009-09-09 18:02:01
 and i need this for better hashing of redundant data anyway2009-09-09 18:02:23
<toffer> what about reusing unused symbols?2009-09-09 18:02:51
<Shelwien> diff just won't work with a stored huffman tree ;)2009-09-09 18:02:52
<toffer> that's why i asked for the application2009-09-09 18:03:05
<Shelwien> and LZ-like algos won't work either ;)2009-09-09 18:03:12
<toffer> or extending the alphabet to 9 bit and do some ngram replacement2009-09-09 18:03:18
<Shelwien> won't work2009-09-09 18:03:30
<toffer> why?2009-09-09 18:03:35
 mh well ok for diffing it won't 2009-09-09 18:03:59
<Shelwien> ngram replacement might help, but there're plans to use such filters separately from FMA engine anyway2009-09-09 18:05:09
 (FMA = far match analysis)2009-09-09 18:05:19
 and shrinking the alphabet2009-09-09 18:05:44
 won't work because some files would have full alphabet2009-09-09 18:05:56
 and the same substrings2009-09-09 18:06:00
<toffer> i cannot imagine anything atm2009-09-09 18:07:55
<Shelwien> why, there's a lot2009-09-09 18:08:10
<toffer> at least nothing which isn't adaptive2009-09-09 18:08:11
<Shelwien> for example, MTF can be applicable2009-09-09 18:08:16
 with some restrictions2009-09-09 18:08:29
<toffer> but mtf is adaptive2009-09-09 18:08:46
<Shelwien> yeah, but its adaptivity can be contained in a small window2009-09-09 18:09:03
 and i only need such a coding2009-09-09 18:09:36
 that in the equal 512-byte blocks in different files2009-09-09 18:09:52
 hashes of at least one 256-byte substring would match2009-09-09 18:10:05
 but mtf has a different problem2009-09-09 18:10:44
 i don't know how to prevent it from being redundant on random data ;)2009-09-09 18:11:05
<toffer> maybe you should restate the exact requirements2009-09-09 18:12:11
<Shelwien> i need a model, which would provide some compression for redundant data2009-09-09 18:13:10
 and won't expand random etc data2009-09-09 18:13:18
 and codes of substrings in different files encodings2009-09-09 18:14:02
 have to still match if strings match2009-09-09 18:14:11
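The requirement can be illustrated with a toy comparison. MTF codes depend on the whole history of a file, so the same substring is coded differently in different files. A code that looks only at a bounded context gives identical codes for identical substrings once the context is filled. The XOR-with-previous-byte coder below is just a stand-in for "static model with finite context"; it is not the coder from the log, and all names are hypothetical.

```python
def mtf_encode(data: bytes) -> list:
    """Plain move-to-front: output depends on everything seen so far."""
    table = list(range(256))
    out = []
    for b in data:
        i = table.index(b)
        out.append(i)
        table.insert(0, table.pop(i))
    return out


def static_delta_encode(data: bytes) -> list:
    """Stand-in static code: each symbol XORed with the previous byte only."""
    out = []
    prev = 0
    for b in data:
        out.append(b ^ prev)
        prev = b
    return out


if __name__ == "__main__":
    shared = b"compression"
    f1 = b"xyz" + shared
    f2 = b"0123456" + shared
    # Skip one symbol of the shared part so the order-1 context is filled.
    print(static_delta_encode(f1)[4:] == static_delta_encode(f2)[8:])
    print(mtf_encode(f1)[3:] == mtf_encode(f2)[7:])
```

With the bounded-context code, hashes over the coded shared region can match across files; with unrestricted MTF they generally cannot.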
*** pinc has joined the channel2009-09-09 18:15:48
<toffer> gonna leave now. back again later on2009-09-09 18:23:00
 bye2009-09-09 18:23:02
*** toffer has left the channel2009-09-09 18:23:06
<Shelwien> ;)2009-09-09 18:23:11
*** asmodean has left the channel2009-09-09 18:48:00
*** pinc has left the channel2009-09-09 18:48:00
*** Simon|B has left the channel2009-09-09 18:48:00
*** Shelwien has left the channel2009-09-09 18:48:00
*** osman has left the channel2009-09-09 18:48:00
*** Shelwien has joined the channel2009-09-09 18:48:01
*** pinc has joined the channel2009-09-09 18:48:01
*** Simon|B has joined the channel2009-09-09 18:48:01
*** osman has joined the channel2009-09-09 18:48:01
*** asmodean has joined the channel2009-09-09 18:48:01
* ChanServ This channel has been registered with ChanServ.2009-09-09 18:48:01
<osman> hi shelwien2009-09-09 19:11:17
 seems i have found something weird again :)2009-09-09 19:11:27
 you know pattern matching is an important part of an archiver2009-09-09 19:11:56
 so, i've worked on it.2009-09-09 19:12:03
 but, at a time, i realized that actually we can't easily do it. because, unicode coding is variable and so, we can't work on arrays2009-09-09 19:12:43
 for ensuring my idea, i've looked at sami's fnmatch and 7-zip wildcards source2009-09-09 19:13:21
 they all "assume" that strings are basically arrays and that each array element represents a single character2009-09-09 19:14:15
 so, at the end, both 7zip and sami's work should fail on asian languages with "?" wildcards %)2009-09-09 19:14:59
 what do you think about it?2009-09-09 19:29:56
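A small sketch of the failure mode described above, assuming a matcher that treats each UTF-16 code unit as one character. Both matchers below only handle patterns made entirely of "?"; the function names are hypothetical and are not taken from fnmatch or 7-zip.

```python
def utf16_units(s: str) -> int:
    """Number of 16-bit code units the string occupies in UTF-16."""
    return len(s.encode("utf-16-le")) // 2


def match_by_code_units(pattern: str, name: str) -> bool:
    """Naive '?' matching: one '?' per UTF-16 code unit."""
    return set(pattern) <= {"?"} and len(pattern) == utf16_units(name)


def match_by_code_points(pattern: str, name: str) -> bool:
    """'?' matching that counts whole code points (Python str indexes these)."""
    return set(pattern) <= {"?"} and len(pattern) == len(name)


if __name__ == "__main__":
    name = "\U00020000\U00020001\U00020002"  # three characters above 0xFFFF
    print(match_by_code_units("???", name))
    print(match_by_code_points("???", name))
```

Characters above 0xFFFF need a surrogate pair in UTF-16, so a three-character name occupies six code units and "???" fails in the naive version.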
<Shelwien> there's probably a lot of other problems anyway2009-09-09 19:39:28
 like sami's works imho don't support filename shortcuts like PROGRA~1 for "Program Files"2009-09-09 19:40:04
 don't remember about nz, but "archiver template" doesn't for sure2009-09-09 19:40:29
 also, i don't think that console archivers actually need anything more complex than *.exe2009-09-09 19:41:37
<osman> imagine if someone tries to only "archive" with 3 letters and they will surely use "???" as pattern2009-09-09 19:46:46
<Shelwien> yeah, you can imagine anything, but did you ever use something like that? ;)2009-09-09 19:47:31
<osman> but, in asian languages unicode codepoints are sometimes > 0xFFFF, so, both "archiver template" and 7zip will fail to match correctly2009-09-09 19:47:52
 you are right. i didn't use. but what if some use? ;)2009-09-09 19:48:16
 i wouldn't call that as "unicode" support2009-09-09 19:48:28
<Shelwien> there're GUIs etc anyway2009-09-09 19:48:29
 which normally don't have such features at all ;)2009-09-09 19:48:45
<osman> even winrar can fail in that area ;)2009-09-09 19:48:49
<Shelwien> whatever2009-09-09 19:49:11
 i'm just trying to say that building a perfect pattern matcher2009-09-09 19:49:20
 might be not practical2009-09-09 19:49:25
<osman> because i didn't see any special handling of string in unrar source. afair, filename stored as UTF-16 in archiver2009-09-09 19:49:50
<Shelwien> at least, if it'd slow down the file enumeration for more common patterns2009-09-09 19:49:59
 but well2009-09-09 19:50:38
 if we're gonna work with utf8 anyway2009-09-09 19:50:46
 then supporting this makes sense ;)2009-09-09 19:51:03
<osman> yeah. don't forget. i'm working on both linux and windows simultaneously now.2009-09-09 19:51:25
 so, i'm considering both utf-8 and utf-162009-09-09 19:51:38
<Shelwien> why?2009-09-09 19:51:49
<osman> for taking some ideas, i have just downloaded linux kernel %)2009-09-09 19:51:53
<Shelwien> just convert utf-16 to utf-82009-09-09 19:51:56
<osman> i realized that working with utf-8 can be a high overload2009-09-09 19:52:29
 so, i'll use utf-16 under windows and utf-8 under posix compliant OSes2009-09-09 19:52:46
 for only internal representation2009-09-09 19:53:13
 but, in archive data etc, i'll always use utf-82009-09-09 19:53:33
 "my heart will go on utf-8" :)2009-09-09 19:53:50
<Shelwien> what kind of "overload"?2009-09-09 19:54:05
 i don't think that utf8-utf16 conversion would be any slower than wstrcpy (or how its called)2009-09-09 19:54:54
<osman> conversion on API calls and checking surrogates for ensuring character length2009-09-09 19:54:59
<Shelwien> dunno2009-09-09 19:55:18
 i think that utf8 would be actually faster as it would be more compact2009-09-09 19:55:32
<osman> ahhh...actually even my str length function is wrong now %)2009-09-09 19:55:44
 seems using two different handling could cause a real "headache" %)2009-09-09 19:56:13
*** pinc|mirror has joined the channel2009-09-09 19:56:13
<Shelwien> its very easy to count symbols in utf8 strings2009-09-09 19:56:19
 as you can just ignore some codes2009-09-09 19:56:35
<osman> do you know a "shortcut"?2009-09-09 19:56:36
<Shelwien> ?2009-09-09 19:56:48
<osman> i mean an easy way2009-09-09 19:57:02
 without handling surrogates2009-09-09 19:57:10
<Shelwien> as i said... in utf8 it seems simple2009-09-09 19:57:22
<osman> more precisely, fewer branches2009-09-09 19:57:26
<Shelwien> just ignore the 10xxxxxx codes2009-09-09 19:57:41
*** pinc has left the channel2009-09-09 19:59:55
<osman> len += ((c & 128) != 0) or something like that?2009-09-09 20:00:28
*** pinc|mirror has left the channel2009-09-09 20:00:53
<Shelwien> not exactly2009-09-09 20:01:29
 (c & 0xC0) != 0x802009-09-09 20:01:41
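UTF-8 continuation bytes have the form 10xxxxxx, i.e. `(c & 0xC0) == 0x80`, so counting every byte that is not a continuation byte gives the number of code points. A minimal sketch, assuming the input is valid UTF-8; the function name is hypothetical.

```python
def utf8_length(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping 10xxxxxx bytes."""
    # Each code point has exactly one lead byte (0xxxxxxx or 11xxxxxx);
    # all other bytes are continuation bytes and are not counted.
    return sum((b & 0xC0) != 0x80 for b in data)


if __name__ == "__main__":
    for text in ("hello", "héllo", "日本語", "\U00020000"):
        print(repr(text), utf8_length(text.encode("utf-8")))
```

There is no branch per character class and no surrogate handling, which is the "easy way" asked about.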
<osman> 7zip has been frozen while extracting linux kernel %)2009-09-09 20:03:27
<Shelwien> ?2009-09-09 20:04:02
<osman> i mean did not respond for a long time2009-09-09 20:04:36
 btw, why do almost all archivers first extract files to temp and then move the actual extraction target?2009-09-09 20:32:20
<Shelwien> "all"?2009-09-09 20:32:39
 freearc maybe, as its weird2009-09-09 20:32:50
 though as to reasons2009-09-09 20:33:22
<osman> 7zip and rar do that too2009-09-09 20:33:38
<Shelwien> the destination file might exist2009-09-09 20:33:39
 and if extracted file has the same name2009-09-09 20:33:54
 but, for example, is broken2009-09-09 20:34:04
 they make sure that it won't overwrite anything2009-09-09 20:34:15
 or something2009-09-09 20:34:21
<osman> they can ask at least2009-09-09 20:34:29
<Shelwien> anyway, they extract stuff to tempfiles, yeah2009-09-09 20:34:35
<osman> this both doubles required time and disk space2009-09-09 20:34:46
<Shelwien> but i think they should create these tempfiles on the target drive2009-09-09 20:34:49
 otherwise it takes too long to move the data2009-09-09 20:35:11
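A sketch of the "tempfile on the target drive" idea: write to a temporary file in the destination directory, then rename it into place, so no data is copied across drives and an existing file is only replaced after extraction finished. This is an illustration, not how any of the archivers mentioned actually work; `extract_file` is a hypothetical name.

```python
import os
import tempfile


def extract_file(target_path: str, data: bytes) -> None:
    """Write data next to target_path, then rename it into place."""
    target_dir = os.path.dirname(os.path.abspath(target_path))
    # Temp file lives in the destination directory, i.e. on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Same-volume rename: no second copy of the data is made.
        os.replace(tmp_path, target_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "out.bin")
        extract_file(path, b"payload")
        print(os.listdir(d))
```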
<osman> all of them create them in the temp directory, which is unrelated to the target drive. so, i always have to "clean" my C: drive2009-09-09 20:35:48
<Shelwien> dunno really2009-09-09 20:36:35
<osman> it's really annoying for me2009-09-09 20:36:49
 i sometimes could not extract some iso files or dvd movies2009-09-09 20:37:05
<Shelwien> i still don't think that console rar works like that2009-09-09 20:37:23
<osman> it might not be2009-09-09 20:37:40
<Shelwien> ...huh?! %)2009-09-09 20:42:52
 seems that my msb coder compresses archives ;)2009-09-09 20:43:25
 a little ;)2009-09-09 20:43:28
<osman> i mean console rar might not fit "extract to temp" rule2009-09-09 20:43:43
 you mean even compressed data?2009-09-09 20:43:53
<Shelwien> well, original rar 269456 bytes2009-09-09 20:44:10
 compressed 2690032009-09-09 20:44:15
<osman> for a static coder, it's very good IMO2009-09-09 20:44:41
<Shelwien> well, i suspect that's because of statistics2009-09-09 20:45:09
<osman> http://www.koders.com/c/fid856C2F4B1D04931B2005712C658E2DC3D181154E.aspx2009-09-09 20:57:09
 seems everyone is not perfect :/2009-09-09 20:57:21
 this source also does not take utf-8's variable length into account2009-09-09 20:58:03
<Shelwien> ...and nobody cares ;)2009-09-09 20:58:28
*** Simon|B has left the channel2009-09-09 20:59:21
<osman> are you sure?2009-09-09 21:03:23
 asian people are really angry with who developed unicode set. because most of their characters are in range > 0xFFFF2009-09-09 21:04:04
<Shelwien> not japanese i think ;)2009-09-09 21:04:44
<osman> if we consider that there are ~3 billion chinese. and considering whole world population is around ~5-6 billion. we should take care IMO :)2009-09-09 21:04:47
<Shelwien> its not that bad actually ;)2009-09-09 21:05:29
<osman> you know that most spoken language is actually chinese not english :)2009-09-09 21:05:31
<Shelwien> sure2009-09-09 21:06:08
 english is not even second apparently ;)2009-09-09 21:06:18
*** toffer has joined the channel2009-09-09 21:12:40
 toffer: i made the coder and it compresses book1 to ~570k2009-09-09 21:17:25
 and what's more funny, it compresses archives %)2009-09-09 21:17:37
<toffer> hi2009-09-09 21:19:07
 archives still have a header and stuff like this2009-09-09 21:19:15
<Shelwien> yeah2009-09-09 21:19:26
 <osman> you mean even compressed data?2009-09-09 21:19:35
 <Shelwien> well, original rar 269456 bytes2009-09-09 21:19:35
 <Shelwien> compressed 2690032009-09-09 21:19:35
<toffer> that's just 400 bytes2009-09-09 21:19:51
<Shelwien> yeah, but its not expanded ;)2009-09-09 21:20:07
 which is good ;)2009-09-09 21:20:10
<osman> then try to compress a 7zip or winrk archive :) afair, their headers are also compressed2009-09-09 21:20:14
<Shelwien> some m1*.7z2009-09-09 21:21:06
 78510 -> 78159 ;)2009-09-09 21:21:17
<osman> hehe2009-09-09 21:21:27
<toffer> i'd only count that if it scales on large archives2009-09-09 21:21:42
<Shelwien> probably does, if there're lots of files2009-09-09 21:22:01
 there's probably some small redundancy2009-09-09 21:22:16
<toffer> (if there're not lots of files in the header)2009-09-09 21:22:19
<Shelwien> like an rc stream start/end etc2009-09-09 21:22:23
<toffer> file names and stuff like that2009-09-09 21:22:26
<Shelwien> scales2009-09-09 21:23:10
<osman> what about your mkv video test? it's really hard to compress IMO2009-09-09 21:23:30
<Shelwien> 3k difference on 10M zip archive2009-09-09 21:23:31
 wow...2009-09-09 21:24:25
<toffer> and how much kb does zip save if you zip the zipfile again ... zip! :)2009-09-09 21:24:31
<Shelwien> 23k on that mkv2009-09-09 21:25:01
<osman> hehe. it might outperform at least BIT :)2009-09-09 21:25:23
<Shelwien> well, some of that is certainly due to statistics volume2009-09-09 21:26:24
 its not perfectly static yet2009-09-09 21:26:36
 but things like 3k and 23k are certainly much larger than stats2009-09-09 21:27:04
 i think that's because its able to detect compressible substrings2009-09-09 21:28:02
 i mean, if there're not much msb matches in context, it just leaves it alone2009-09-09 21:28:55
 seems like not quite bad algo for detection and maybe segmentation2009-09-09 21:29:53
<osman> do you use trunc(log2(c[i]^c) * k) or just trunc(log2(c[i]^c)) ?2009-09-09 21:31:02
 i mean 9 contexts or more?2009-09-09 21:31:22
<Shelwien> "just" and i don't really use log2 at all ;)2009-09-09 21:31:36
 there 9^4 contexts2009-09-09 21:31:47
<osman> yep. last one is actually a bsr instruction :)2009-09-09 21:31:55
<Shelwien> LUT in my case2009-09-09 21:32:05
<osman> try bsr. it might help.... but maybe not. because, you have a single LUT and it can be highly cached2009-09-09 21:32:46
<Shelwien> actually i'd have a single LUT per whole context index2009-09-09 21:33:11
 well, maybe2009-09-09 21:33:27
 i mean, these *9 are not really good ;)2009-09-09 21:34:02
 even if they're done via LEA's actually ;)2009-09-09 21:34:20
<osman> :)2009-09-09 21:34:41
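The two options discussed above (a bit-scan such as bsr versus a lookup table) compute the same value. The sketch below uses Python's `int.bit_length()` as a stand-in for bsr and a 256-entry table indexed by the XOR of the two bytes. Names and table layout are assumptions, not the actual implementation from the log.

```python
# 256-entry table: entry x holds the match count for a ^ b == x.
MATCH_LUT = bytes(8 - x.bit_length() for x in range(256))


def msb_match_lut(a: int, b: int) -> int:
    """Match count via a single table lookup."""
    return MATCH_LUT[(a ^ b) & 0xFF]


def msb_match_bsr(a: int, b: int) -> int:
    """Match count via a bit scan; for x != 0, bsr(x) == x.bit_length() - 1."""
    x = (a ^ b) & 0xFF
    return 8 - x.bit_length()


if __name__ == "__main__":
    same = all(
        msb_match_lut(a, b) == msb_match_bsr(a, b)
        for a in range(256)
        for b in range(256)
    )
    print(same)
```

A table covering the whole context index, as mentioned above, would fold the base-9 multiplications into the lookup as well.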
<Shelwien> wonder if i should move the case bit to lsb or something %)2009-09-09 21:36:41
<osman> it might scale like before :)2009-09-09 21:37:04
 because lsbs are mostly noisy2009-09-09 21:37:15
<Shelwien> i mean, A/a case2009-09-09 21:37:24
<osman> aa...ok. got it2009-09-09 21:37:51
 it can help :)2009-09-09 21:37:55
 just optimize your reorder for that :)2009-09-09 21:38:16
<Shelwien> i thought that too2009-09-09 21:38:26
<osman> it may be more helpful2009-09-09 21:38:27
<Shelwien> not reorder, just bit order in the byte ;)2009-09-09 21:38:41
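A sketch of what moving the case bit to the LSB could look like, assuming ASCII where upper and lower case differ only in bit 5 (0x20). The permutation below keeps bits 7..6, shifts bits 4..0 up by one and puts bit 5 at position 0. It is one possible bit order, chosen for illustration; the function names are hypothetical.

```python
def move_case_bit(b: int) -> int:
    """Permute bits so the ASCII case bit (0x20) becomes the LSB."""
    return (b & 0xC0) | ((b & 0x1F) << 1) | ((b >> 5) & 1)


def restore_case_bit(y: int) -> int:
    """Inverse permutation of move_case_bit."""
    return (y & 0xC0) | ((y >> 1) & 0x1F) | ((y & 1) << 5)


def msb_match(a: int, b: int) -> int:
    """Number of leading bits (0..8) on which bytes a and b agree."""
    return 8 - ((a ^ b) & 0xFF).bit_length()


if __name__ == "__main__":
    A, a = ord("A"), ord("a")
    print(msb_match(A, a))                                # original bit order
    print(msb_match(move_case_bit(A), move_case_bit(a)))  # case bit at LSB
```

After the permutation, "A" and "a" differ only in the lowest bit, so they share 7 leading bits instead of 2.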
<osman> if you are not lazy as me, then why not? :)2009-09-09 21:39:09
 i would probably start reorder optimization and sleep after that :)2009-09-09 21:39:26
<Shelwien> well, i'd do that2009-09-09 21:39:29
 i'd have to convert it to huffman anyway2009-09-09 21:40:00
<osman> btw, i realized that actually GCC comes from another dimension of the space %) it won't compile most of my sources %)2009-09-09 21:42:14
 *it doesn't compile2009-09-09 21:42:24
<Shelwien> yeah2009-09-09 21:42:35
 the main problem is that it not only has a whole different runtime library2009-09-09 21:42:54
 but also has some annoying C++ syntax incompatibilities2009-09-09 21:43:10
<osman> yep. definitely2009-09-09 21:43:24
 probably i'll use intelc for posix platforms in the end %)2009-09-09 21:43:57
<Shelwien> yeah, might be a good idea2009-09-09 21:44:12
 though i didn't hear about IC for freebsd2009-09-09 21:44:27
<osman> freebsd is posix compliant too. if i could even "execute" some simple command in freebsd, i would test my linux compile in there2009-09-09 21:45:09
 freebsd is really a nightmare2009-09-09 21:45:19
 it eventually crashes after starting GUI2009-09-09 21:45:39
 i can't use it in vmware2009-09-09 21:46:15
 just i can only see prompt2009-09-09 21:46:27
<Shelwien> well, its vmware problem, not freebsd's2009-09-09 21:46:41
<osman> most of commands are incompatible with linux distros'2009-09-09 21:46:45
 if i could not run it, then i can't test it right? :) so, it doesn't matter it's about vmware or not2009-09-09 21:47:30
 :)2009-09-09 21:47:33
 seems i'll start to test macos x :)2009-09-09 21:48:22
 it's posix compliant too :)2009-09-09 21:48:34
 "In UTF-8, characters outside the basic multilingual plane are not a special case. UTF-16 is often mistaken to be the obsolete constant-length UCS-2 encoding, leading to code that works for most text but suddenly fails for non-BMP characters. It's better to implement support for the entire range of Unicode from the start."2009-09-09 22:12:10
 from Wikipedia :)2009-09-09 22:12:15
*** toffer has left the channel2009-09-09 22:13:33
 "...Japanese and the Korean UTF-8 article on Wikipedia take more space if saved as UTF-16 than the original UTF-8 version" i think this is a really good reason to use utf8 :)2009-09-09 22:15:18
<Shelwien> err... i think many things take more spaces in utf-16 than in utf-8 ;)2009-09-09 22:26:15
<osman> but, considering asian languages...it is a bit of a surprise to see utf-8 is more compact2009-09-09 22:26:54
<Shelwien> you know, there're spaces and stuff too2009-09-09 22:27:44
<osman> yep. that's the point in here actually :)2009-09-09 22:28:14
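The size effect is easy to check: a BMP character outside ASCII takes 3 bytes in UTF-8 and 2 in UTF-16, while spaces and markup take 1 byte in UTF-8 and 2 in UTF-16. The sample strings below are made up for illustration and `encoded_sizes` is a hypothetical helper.

```python
def encoded_sizes(text: str) -> tuple:
    """Return (UTF-8 size, UTF-16 size) in bytes, without a BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


if __name__ == "__main__":
    for sample in ("日本語", "<p>日本語</p>"):
        utf8, utf16 = encoded_sizes(sample)
        print(repr(sample), "utf-8:", utf8, "utf-16:", utf16)
```

Pure Japanese text is smaller in UTF-16, but once enough ASCII markup and spacing is mixed in, UTF-8 wins.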
*** toffer has joined the channel2009-09-09 22:41:37
*** toffer has left the channel2009-09-09 23:45:13
<Shelwien> !next2009-09-09 23:55:00