*** complogger has joined the channel | 2009-10-08 20:27:03 |
<toffer> | finally hi | 2009-10-08 20:29:18 |
<Shelwien> | ;) | 2009-10-08 20:29:23 |
<toffer> | i sometimes need to write my thesis, too | 2009-10-08 20:29:26 |
<Shelwien> | sure, i just wanted to ask whether you've seen the sfc benchmark | 2009-10-08 20:30:07 |
<toffer> | i looked at it this morning | 2009-10-08 20:30:38 |
| but i lost the link | 2009-10-08 20:30:43 |
<Shelwien> | http://shelwien.googlepages.com/m1x2-sfc.htm | 2009-10-08 20:30:49 |
<toffer> | thanks again | 2009-10-08 20:31:09 |
| you've written it via icq? | 2009-10-08 20:31:15 |
| initially | 2009-10-08 20:31:17 |
<Shelwien> | irc too | 2009-10-08 20:31:23 |
<toffer> | my conclusion is: | 2009-10-08 20:32:10 |
| - it works well on text and better than ccmx on lower memory levels | 2009-10-08 20:32:22 |
| - i need a match model | 2009-10-08 20:32:26 |
<Shelwien> | btw the other benchmarks are | 2009-10-08 20:32:29 |
| http://shelwien.googlepages.com/m1x2.htm | 2009-10-08 20:32:30 |
<toffer> | confusing | 2009-10-08 20:32:35 |
<Shelwien> | http://ctxmodel.net/files/MIX/mix_v6.htm | 2009-10-08 20:32:38 |
| well, mine are: | 2009-10-08 20:33:02 |
| - filtering doesn't matter that much | 2009-10-08 20:33:18 |
<toffer> | ah | 2009-10-08 20:33:37 |
| and i forgot | 2009-10-08 20:33:38 |
| i did some experiments with the main coding loop | 2009-10-08 20:33:46 |
| there's almost no speed difference under linux when (not) unrolling the main coding loop | 2009-10-08 20:34:03 |
<Shelwien> | - you'd better properly compile it ;) | 2009-10-08 20:34:05 |
<toffer> | but on windows the tighter code is faster, which is pretty odd | 2009-10-08 20:34:20 |
| and for speed gain i can still double the nibble cache | 2009-10-08 20:34:36 |
| that gave me ~12% higher speed | 2009-10-08 20:34:47 |
| and the linux compiles somehow are 20% faster | 2009-10-08 20:34:54 |
<Shelwien> | well, i think you need to beat ccm in compression first | 2009-10-08 20:34:59 |
<toffer> | that's easy | 2009-10-08 20:35:05 |
| as i said i initially only tested on text | 2009-10-08 20:35:16 |
| and there it does the job | 2009-10-08 20:35:20 |
<Shelwien> | match model only won't change it imho | 2009-10-08 20:35:26 |
<toffer> | sure it will | 2009-10-08 20:35:33 |
<Shelwien> | don't forget these sparse submodels | 2009-10-08 20:35:44 |
<toffer> | i don't count that | 2009-10-08 20:35:53 |
<Shelwien> | it seems they do work on some files | 2009-10-08 20:35:58 |
<toffer> | since i can use other parameter profiles for other files | 2009-10-08 20:36:01 |
<Shelwien> | it doesn't count | 2009-10-08 20:36:20 |
<toffer> | i tuned an exe profile on an e8 filtered mso97.dll | 2009-10-08 20:36:22 |
| erm | 2009-10-08 20:36:32 |
<Shelwien> | until you add a detector or something | 2009-10-08 20:36:33 |
<toffer> | that's a key feature | 2009-10-08 20:36:37 |
| nah | 2009-10-08 20:36:43 |
| filters just are accessories | 2009-10-08 20:37:07 |
<Shelwien> | well, you need that to beat ccm in benchmarks | 2009-10-08 20:37:09 |
*** TT has joined the channel | 2009-10-08 20:37:21 |
| ? | 2009-10-08 20:37:28 |
<toffer> | ? | 2009-10-08 20:37:55 |
<Shelwien> | ah | 2009-10-08 20:38:09 |
<thometal> | hi | 2009-10-08 20:38:22 |
<Shelwien> | hi ;) | 2009-10-08 20:38:25 |
| thometal: http://nishi.dreamhosters.com/chantail.pl?32 | 2009-10-08 20:38:54 |
| well, more like http://nishi.dreamhosters.com/chantail.pl?50 i guess | 2009-10-08 20:39:29 |
<toffer> | hi | 2009-10-08 20:39:44 |
<Shelwien> | toffer: anyway, sami etc | 2009-10-08 20:40:06 |
| won't run m1 with new manual config per file | 2009-10-08 20:40:39 |
| and if you don't want people to notice it | 2009-10-08 20:41:29 |
| then i guess i don't understand your goal at all ;) | 2009-10-08 20:41:38 |
<toffer> | well i can store profiles | 2009-10-08 20:44:21 |
<Shelwien> | sure | 2009-10-08 20:44:40 |
<toffer> | my goal is to write some context mixing compressor | 2009-10-08 20:44:47 |
| mainly as a fun project | 2009-10-08 20:44:52 |
<Shelwien> | well, i do such stuff too, mainly to test various ideas | 2009-10-08 20:46:24 |
| but its basically a day per compressor | 2009-10-08 20:46:34 |
| and my programs don't usually have that much in common | 2009-10-08 20:46:55 |
<thometal> | Shelwien: whats that link? | 2009-10-08 20:47:01 |
<Shelwien> | channel log | 2009-10-08 20:47:10 |
| we're discussing new toffer's compressor - http://shelwien.googlepages.com/m1x2.htm | 2009-10-08 20:48:23 |
<thometal> | ok dont need it =P | 2009-10-08 20:48:26 |
| experimental compressor? | 2009-10-08 20:48:56 |
<Shelwien> | sure | 2009-10-08 20:49:03 |
<thometal> | how can i help? | 2009-10-08 20:49:18 |
<Shelwien> | i guess you can do your own benchmark ;) | 2009-10-08 20:49:42 |
<toffer> | some testing | 2009-10-08 20:49:49 |
| but i'll add more changes in the future | 2009-10-08 20:50:00 |
<thometal> | then i need a linux x64 compile | 2009-10-08 20:50:11 |
<toffer> | nice | 2009-10-08 20:50:15 |
| that's my development platform :) | 2009-10-08 20:50:26 |
<Shelwien> | what about ccm? | 2009-10-08 20:50:36 |
<toffer> | i cannot compile it here due to your weird casts from pointers to uint (64 <> 32 bit) | 2009-10-08 20:50:58 |
<Shelwien> | sure you can't compile it as 32-bit? | 2009-10-08 20:51:22 |
<toffer> | i looked into it to fix but after the first few i got really annoyed | 2009-10-08 20:51:26 |
| that'd not be good for speed comparison | 2009-10-08 20:51:41 |
<Shelwien> | also what about osman's version? | 2009-10-08 20:51:42 |
| though its much slower | 2009-10-08 20:51:45 |
<toffer> | it's due to decompilation | 2009-10-08 20:51:50 |
<Shelwien> | my rc optimization too | 2009-10-08 20:52:09 |
<toffer> | i'll reboot and produce some more compiles | 2009-10-08 20:53:25 |
<Shelwien> | i really think you should provide some alternate compressors too | 2009-10-08 20:53:49 |
*** toffer has left the channel | 2009-10-08 20:53:49 |
*** toffer has joined the channel | 2009-10-08 20:56:12 |
| <Shelwien> i really think you should provide some alternate compressors too | 2009-10-08 21:01:45 |
| at least ccm_sh and ppmd | 2009-10-08 21:01:53 |
| or there won't be much sense in benchmarks imho | 2009-10-08 21:02:16 |
| what else is there for x64 anyway? | 2009-10-08 21:02:48 |
| bzip/gzip? ;) | 2009-10-08 21:02:51 |
<toffer> | i uploaded 4 different builds to the tmp folder again | 2009-10-08 21:11:36 |
*** pinc has left the channel | 2009-10-08 21:15:28 |
| somehow only the login to apocalypse works | 2009-10-08 21:17:24 |
| i cannot provide linux builds now since i'm under win32 again | 2009-10-08 21:17:46 |
| guess i really need a cross compiler | 2009-10-08 21:17:55 |
| ^^ | 2009-10-08 21:17:57 |
<Shelwien> | ;) | 2009-10-08 21:18:00 |
| btw, can you compile that parameter profile? | 2009-10-08 21:20:40 |
| even if its just variables instead of constants, they can affect the speed too | 2009-10-08 21:21:25 |
| and do you have any profile-dependent code in the loop? | 2009-10-08 21:22:03 |
| like branches? | 2009-10-08 21:22:15 |
<toffer> | guess | 2009-10-08 21:22:44 |
| well i don't | 2009-10-08 21:23:29 |
<Shelwien> | even Matt finally decided to compile zpaq configs now ;) | 2009-10-08 21:23:40 |
<toffer> | yeah | 2009-10-08 21:24:32 |
| i've uploaded a profile for e8 filtered files,too | 2009-10-08 21:27:18 |
<Shelwien> | and? | 2009-10-08 21:28:26 |
| well, i can run a test using external filter(s) | 2009-10-08 21:29:23 |
| but you won't like the speed at all probably ;) | 2009-10-08 21:29:30 |
<toffer> | you could just test the provided exes | 2009-10-08 21:33:31 |
| with the best4 profile | 2009-10-08 21:33:45 |
| and use the other profile for x86 data | 2009-10-08 21:36:56 |
| do you think the speed gain in the boost1 builds justifies the compression loss? | 2009-10-08 21:37:23 |
<Shelwien> | added a link | 2009-10-08 21:39:04 |
| so http://toffer.dreamhosters.com/tmp/ now points to that directory | 2009-10-08 21:39:21 |
| and i guess i can test these 4 unroll executables with best4 profile if you meant that | 2009-10-08 21:40:04 |
| but "other profile for x86 data" is a troublesome request | 2009-10-08 21:40:34 |
| i can test it on the whole fileset though ;) | 2009-10-08 21:41:02 |
| also i'd prefer not to test all the memory options | 2009-10-08 21:41:58 |
| can you suggest a single one? | 2009-10-08 21:42:11 |
| ok, guess i'd test -3 | 2009-10-08 21:44:20 |
| testing | 2009-10-08 21:50:39 |
*** thometal has left the channel | 2009-10-08 22:05:42 |
<toffer> | sorry, was away | 2009-10-08 22:06:43 |
| 100 or 200 mb would be fine i guess | 2009-10-08 22:06:53 |
<Shelwien> | i'm running the same sh1 benchmark | 2009-10-08 22:07:12 |
| these 4 new exes with best4 at -3 | 2009-10-08 22:08:26 |
| and last of them with that new config | 2009-10-08 22:08:40 |
<toffer> | the last? | 2009-10-08 22:12:25 |
<Shelwien> | didn't work somehow... | 2009-10-08 22:12:53 |
| http://shelwien.googlepages.com/m1x2.htm | 2009-10-08 22:15:29 |
| reuploaded | 2009-10-08 22:54:28 |
| somehow that new profile has a totally negative effect | 2009-10-08 22:55:09 |
| even on executables | 2009-10-08 22:55:18 |
<toffer> | guess mso97 was the wrong tuning target then | 2009-10-08 23:00:23 |
| thanks for testing | 2009-10-08 23:00:36 |
| gonna go to bed now | 2009-10-08 23:00:40 |
| it looks like the versions with a larger nibble cache greatly improve the speed/compression tradeoff | 2009-10-08 23:01:46 |
| gn8 | 2009-10-08 23:02:29 |
*** toffer has left the channel | 2009-10-08 23:02:35 |
*** pinc has joined the channel | 2009-10-09 07:04:48 |
*** pinc has left the channel | 2009-10-09 14:55:25 |
*** TT has joined the channel | 2009-10-09 15:57:08 |
<Guest4655039> | hi | 2009-10-09 15:58:47 |
*** Guest4655039 has left the channel | 2009-10-09 15:59:04 |
*** thometal has joined the channel | 2009-10-09 15:59:30 |
<thometal> | re | 2009-10-09 15:59:39 |
| is bulat sometimes here? | 2009-10-09 15:59:47 |
*** thometal has left the channel | 2009-10-09 16:19:05 |
*** thometal has joined the channel | 2009-10-09 16:19:34 |
*** chornobyl has joined the channel | 2009-10-09 16:40:03 |
<chornobyl> | goodevening | 2009-10-09 16:40:25 |
<thometal> | hi | 2009-10-09 16:42:59 |
*** thometal has left the channel | 2009-10-09 17:05:17 |
<Shelwien> | hi | 2009-10-09 18:37:41 |
*** Shelwien has left the channel | 2009-10-09 19:43:33 |
<chornobyl> | not a crowded place | 2009-10-09 20:18:28 |
*** thometal has joined the channel | 2009-10-09 21:33:27 |
*** Shelwien has joined the channel | 2009-10-09 21:42:49 |
<thometal> | hey is bulat sometimes here? | 2009-10-09 22:04:06 |
<Shelwien> | bulat's never been here | 2009-10-09 22:05:48 |
| !grep bulat | 2009-10-09 22:06:08 |
| he's available on icq sometimes, though | 2009-10-09 22:07:09 |
<thometal> | oh never seen this | 2009-10-09 22:08:22 |
| thanks | 2009-10-09 22:08:25 |
| i thought he advertises this channel too | 2009-10-09 22:08:42 |
<Shelwien> | well, it was his idea | 2009-10-09 22:09:21 |
| ...which i used to advertise my channel ;) | 2009-10-09 22:09:32 |
<thometal> | hehe | 2009-10-09 22:09:41 |
<Shelwien> | sami appears here though | 2009-10-09 22:09:51 |
<thometal> | yeah but why dont visit the forum anymore (bad english) | 2009-10-09 22:10:49 |
| =P | 2009-10-09 22:10:51 |
<Shelwien> | sami? | 2009-10-09 22:11:09 |
<thometal> | yeah | 2009-10-09 22:11:13 |
<Shelwien> | encode threatened to ban him | 2009-10-09 22:11:32 |
<thometal> | why | 2009-10-09 22:11:39 |
<Shelwien> | because he was trolling too much | 2009-10-09 22:11:49 |
<thometal> | hehe | 2009-10-09 22:12:06 |
<Shelwien> | he didn't really ban him though | 2009-10-09 22:12:21 |
| but sami said he's quitting himself ;) | 2009-10-09 22:12:31 |
<thometal> | it looks better than being banned | 2009-10-09 22:12:59 |
<Shelwien> | dunno really | 2009-10-09 22:13:13 |
| his post was quite annoying anyway | 2009-10-09 22:13:27 |
<thometal> | do you have a link to samis hp | 2009-10-09 22:13:58 |
| ? | 2009-10-09 22:14:01 |
<Shelwien> | somehow sami seems like a much more reasonable person when not on the forum though | 2009-10-09 22:14:27 |
* thometal is asking if igor visits this channel sometimes | 2009-10-09 22:14:40 |
| no, but that might be an idea | 2009-10-09 22:15:07 |
| didn't talk to him lately | 2009-10-09 22:15:14 |
| as to sami | 2009-10-09 22:15:28 |
| http://nanozip.net/ | 2009-10-09 22:15:40 |
* thometal is asking himself why 7z doesn't have such a cute matchfinder as freearc | 2009-10-09 22:17:15 |
| i've seen some complaints | 2009-10-09 22:19:55 |
| that "rep is useless" because it only supports a window up to like 2G | 2009-10-09 22:20:21 |
| while 7-zip itself supports at least 1G | 2009-10-09 22:20:44 |
| but with much better precision | 2009-10-09 22:20:50 |
<thometal> | i thought rep and matchfinder are different things | 2009-10-09 22:21:15 |
<Shelwien> | somewhat different algorithms | 2009-10-09 22:21:43 |
| but basically the same purpose | 2009-10-09 22:21:53 |
<thometal> | and freearc uses rep first and then the matchfinder, right? | 2009-10-09 22:22:02 |
<Shelwien> | yeah | 2009-10-09 22:22:07 |
<thometal> | i meant the ht4 matchfinder | 2009-10-09 22:22:29 |
<Shelwien> | rep is necessary to find long and far matches | 2009-10-09 22:22:34 |
<thometal> | yeah | 2009-10-09 22:22:47 |
<Shelwien> | and precise matchfinders, which can find any match | 2009-10-09 22:23:03 |
| have much greater memory requirements | 2009-10-09 22:23:25 |
| so they can only find matches in some small window | 2009-10-09 22:23:41 |
| like only 4M in rar case | 2009-10-09 22:23:50 |
| and rep is a different algorithm, intended to handle such cases | 2009-10-09 22:24:26 |
<thometal> | so is ht4 less precise like hc4 | 2009-10-09 22:24:33 |
<Shelwien> | so that eg. two versions of the same 200M file won't be encoded twice | 2009-10-09 22:25:15 |
<thometal> | yeah | 2009-10-09 22:25:39 |
| i wonder why identical files or file parts aren't stored once and checked via checksum | 2009-10-09 22:26:15 |
| instead of going through rep and matchfinders | 2009-10-09 22:26:29 |
<Shelwien> | well, there're cases like remuxed versions of the same video | 2009-10-09 22:27:40 |
| rep is very helpful there | 2009-10-09 22:27:53 |
| imho its still better to check for whole file matches separately in archivers though | 2009-10-09 22:29:14 |
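The "long and far matches" idea above can be sketched roughly as follows. This is only an illustration and not FreeArc's actual rep code: it assumes the already-seen data is indexed in aligned blocks of `block` bytes kept in a plain dictionary (a real tool would store hashes, not block contents), and `find_long_matches` is a hypothetical name.

```python
def find_long_matches(data: bytes, block: int = 32) -> list[tuple[int, int, int]]:
    """Return (position, source, length) for repeats of at least `block` bytes.

    Only aligned blocks of earlier data are indexed, so the index stays small
    even for a huge window; repeats shorter than about 2*block can be missed.
    """
    index = {}         # block contents -> earliest aligned offset (assumption)
    matches = []
    indexed_upto = 0   # next aligned offset that has not been indexed yet
    n = len(data)
    i = 0
    while i + block <= n:
        # Index every aligned block that lies completely before position i.
        while indexed_upto + block <= i:
            index.setdefault(data[indexed_upto:indexed_upto + block], indexed_upto)
            indexed_upto += block
        src = index.get(data[i:i + block])
        if src is None:
            i += 1
            continue
        # Extend the match forward byte by byte.
        length = block
        while i + length < n and data[src + length] == data[i + length]:
            length += 1
        matches.append((i, src, length))
        i += length
    return matches


if __name__ == "__main__":
    original = bytes(range(256))
    # Second copy of the same "file", 5 bytes later than a plain duplicate.
    print(find_long_matches(original + b"#" * 5 + original))
```

The memory cost is one index entry per `block` bytes of input rather than one per byte, which is the tradeoff against a precise matchfinder mentioned above.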
<thometal> | if i become a millionaire i'll develop my own algorithm and archiver =P but as it is i have no time... | 2009-10-09 22:30:11 |
| :-( | 2009-10-09 22:30:16 |
<Shelwien> | at least algorithms don't require that much time | 2009-10-09 22:30:40 |
<thometal> | but understanding | 2009-10-09 22:30:54 |
<Shelwien> | like all the stuff encode posts | 2009-10-09 22:31:03 |
| can be written in a day basically ;) | 2009-10-09 22:31:09 |
<thometal> | hehe | 2009-10-09 22:31:27 |
<Shelwien> | and its the same for me too, usually | 2009-10-09 22:32:03 |
| though hopefully i now have a paid job to write an archiver ;) | 2009-10-09 22:32:26 |
<thometal> | but you must know how, and i have no experience | 2009-10-09 22:32:27 |
| hehe | 2009-10-09 22:32:35 |
| why not ask igor | 2009-10-09 22:32:43 |
<Shelwien> | well, its a backup solution actually | 2009-10-09 22:32:48 |
| but there're many similarities | 2009-10-09 22:33:12 |
| so i fully intend to make it both a remote backup, and a local archiver ;) | 2009-10-09 22:33:28 |
<thometal> | yeah i feel the job offers in this area are very limited | 2009-10-09 22:33:28 |
<Shelwien> | i don't think so | 2009-10-09 22:33:37 |
<thometal> | i never saw a position offer | 2009-10-09 22:33:56 |
| only video maybe image compression | 2009-10-09 22:34:16 |
<Shelwien> | ocarina hires people (where Matt works) | 2009-10-09 22:34:17 |
| not really | 2009-10-09 22:34:24 |
<thometal> | its too far away | 2009-10-09 22:34:35 |
<Shelwien> | well, i work remotely | 2009-10-09 22:34:51 |
<thometal> | and they only hire experts, not people like me | 2009-10-09 22:34:54 |
| =P | 2009-10-09 22:35:02 |
<Shelwien> | well, experts... you don't really have to know anything about compression i think | 2009-10-09 22:35:50 |
| for example, Matt basically reinvented the whole theory by himself | 2009-10-09 22:36:20 |
| before like paq6, the leading algorithms were based on a completely different theory | 2009-10-09 22:37:06 |
<thometal> | yeah but to do this you need time | 2009-10-09 22:37:12 |
<Shelwien> | mostly PPM and bytewise | 2009-10-09 22:37:15 |
| that's not what i meant anyway | 2009-10-09 22:37:28 |
| i tried to say that compression tasks are not that complex to solve on your own | 2009-10-09 22:38:04 |
| its just common sense basically | 2009-10-09 22:38:13 |
| not that much math even | 2009-10-09 22:38:22 |
<chornobyl> | and math | 2009-10-09 22:38:27 |
| and programming skills | 2009-10-09 22:38:38 |
<thometal> | hehe | 2009-10-09 22:38:51 |
<Shelwien> | i think areas like game design etc would require much more special expertise | 2009-10-09 22:38:52 |
| and math ;) | 2009-10-09 22:39:04 |
<thometal> | math | 2009-10-09 22:39:09 |
<Shelwien> | but programming, yeah | 2009-10-09 22:39:15 |
<thometal> | slowly i hate math | 2009-10-09 22:39:21 |
| =P | 2009-10-09 22:39:23 |
| because of my diploma thesis...... too much math | 2009-10-09 22:39:45 |
<Shelwien> | all the modern fashion | 2009-10-09 22:39:51 |
<thometal> | yeah.... | 2009-10-09 22:39:57 |
<Shelwien> | like C#,java, and functional languages | 2009-10-09 22:40:05 |
| are completely useless for compression ;) | 2009-10-09 22:40:17 |
<thometal> | i begin to hate c++ too | 2009-10-09 22:40:32 |
| =p | 2009-10-09 22:40:33 |
<Shelwien> | that's one area where they can't compete in either speed or resource usage | 2009-10-09 22:40:42 |
<thometal> | yeah i know | 2009-10-09 22:40:54 |
<chornobyl> | >functional tell it bulat | 2009-10-09 22:41:06 |
| *to | 2009-10-09 22:41:19 |
<thometal> | java is sooooooooooo much more programmer friendly, but this is another topic | 2009-10-09 22:41:21 |
| yeah i know haskell | 2009-10-09 22:41:39 |
*** Shelwien has left the channel | 2009-10-09 22:41:53 |
*** Guest9968193 has joined the channel | 2009-10-09 22:41:58 |
| we had to write some programs at the university | 2009-10-09 22:42:04 |
<Shelwien> | haskell might be really good for stuff like filelist processing | 2009-10-09 22:42:13 |
<thometal> | it wasn't funny | 2009-10-09 22:42:16 |
<Shelwien> | sorting etc | 2009-10-09 22:42:24 |
<thometal> | yeah low cpu usage stuff | 2009-10-09 22:42:38 |
<Shelwien> | but i think its different in bulat's case | 2009-10-09 22:42:42 |
| i think he actually started fa because he wanted some project to try haskell on | 2009-10-09 22:43:05 |
| rather than using haskell because it seemed good for a specific project | 2009-10-09 22:43:25 |
<thometal> | ok | 2009-10-09 22:43:46 |
| hm | 2009-10-09 22:44:10 |
| i work with cuda | 2009-10-09 22:44:16 |
| do you have an idea which algorithm i can port | 2009-10-09 22:44:33 |
| ? | 2009-10-09 22:44:34 |
<Shelwien> | also http://haskell.org/bz is funny. most of code there is unrelated to haskell ;) | 2009-10-09 22:44:46 |
<thometal> | yeah i know | 2009-10-09 22:44:57 |
| =P | 2009-10-09 22:44:58 |
<Shelwien> | well, its most useful for password cracking | 2009-10-09 22:45:00 |
<thometal> | heheh | 2009-10-09 22:45:08 |
| no no, i mean a compression algorithm | 2009-10-09 22:45:23 |
<Shelwien> | i'd like to have a cuda md5crypt cracker, if you want to write one ;) | 2009-10-09 22:45:34 |
<thometal> | there exists at least one | 2009-10-09 22:45:54 |
<Shelwien> | there're a lot of implementations for plain md5 | 2009-10-09 22:45:55 |
| but not for "freebsd md5" unix password hashes | 2009-10-09 22:46:08 |
<thometal> | i thought md5 is md5 | 2009-10-09 22:46:33 |
<Shelwien> | not really ;) | 2009-10-09 22:46:38 |
<thometal> | like lz and lz** | 2009-10-09 22:46:52 |
| hehe | 2009-10-09 22:46:55 |
<Shelwien> | md5crypt bruteforce speed is like 12000p/s | 2009-10-09 22:47:21 |
| on q9450 3.6ghz | 2009-10-09 22:47:27 |
<thometal> | maybe i should write one once i've finished my diploma thesis | 2009-10-09 22:47:34 |
<Shelwien> | well, that's most simple actually | 2009-10-09 22:47:55 |
| you can just write a fitting implementation for x86 | 2009-10-09 22:48:21 |
| and then port it | 2009-10-09 22:48:26 |
| cuda is convenient in that sense | 2009-10-09 22:48:40 |
<thometal> | yeah i know | 2009-10-09 22:48:47 |
<Shelwien> | but as to compression... | 2009-10-09 22:48:51 |
| unfortunately that's not very likely | 2009-10-09 22:49:05 |
<thometal> | i'm working with cuda for my diploma thesis | 2009-10-09 22:49:11 |
<Shelwien> | i made a special version of bwt with that intention though | 2009-10-09 22:49:25 |
| maybe i'd port it to cuda once | 2009-10-09 22:49:38 |
| (rcbwt) | 2009-10-09 22:49:40 |
<thometal> | hmhm | 2009-10-09 22:50:52 |
<Shelwien> | unfortunately the bottleneck with compression algorithms is usually the memory | 2009-10-09 22:51:00 |
<thometal> | yeah | 2009-10-09 22:51:06 |
| so i thought a simple cm algorithm would be the best | 2009-10-09 22:51:22 |
<Shelwien> | how? | 2009-10-09 22:51:39 |
<thometal> | for cuda | 2009-10-09 22:51:53 |
| but i don't know which simple cm algo is the right one | 2009-10-09 22:52:30 |
<Shelwien> | i mean, how? | 2009-10-09 22:54:04 |
| CMs, especially simple CMs | 2009-10-09 22:54:16 |
| mostly consist of memory lookups | 2009-10-09 22:54:25 |
<thometal> | hm | 2009-10-09 22:54:51 |
<Shelwien> | and memory read costs like 100 clocks on GPU | 2009-10-09 22:54:52 |
| and is not even really parallel | 2009-10-09 22:55:22 |
<thometal> | hm on this point you could split you file | 2009-10-09 22:55:51 |
| your file | 2009-10-09 22:55:55 |
| as far as i know | 2009-10-09 22:56:01 |
<Shelwien> | i think it'd be much worse with random reads in a pile of even synchronized threads | 2009-10-09 22:56:04 |
| and even worse if they can get out of sync | 2009-10-09 22:56:31 |
| basically the GPU would stop being parallel then ;) | 2009-10-09 22:56:46 |
<thometal> | do you have some simple cm algo source code i could look into? | 2009-10-09 22:57:00 |
<Shelwien> | ccm source in the topic | 2009-10-09 22:57:12 |
| also something like http://ctxmodel.net/files/mix_test/mix_test_vC.rar maybe | 2009-10-09 22:57:56 |
<thometal> | i didn't look into all of it, but there are many loops and constants, which is good | 2009-10-09 23:01:26 |
<Shelwien> | sure | 2009-10-09 23:01:37 |
| actually the compression part is easy enough to perform in parallel | 2009-10-09 23:01:51 |
| there're some easily threaded stages | 2009-10-09 23:02:06 |
| even the rangecoding can be split into 2 or 3 | 2009-10-09 23:02:30 |
| but that doesn't apply to decoding at all | 2009-10-09 23:02:52 |
| all the possible decoding speed optimizations would cost a considerable compression loss | 2009-10-09 23:03:42 |
| and then, even that easily threaded compression | 2009-10-09 23:04:17 |
| might not become any faster from processing on GPU | 2009-10-09 23:04:49 |
| because its still mostly about memory access | 2009-10-09 23:05:02 |
| and GPU has a lower clock | 2009-10-09 23:05:12 |
| and no memory cache | 2009-10-09 23:05:25 |
<thometal> | but bandwidth is much higher | 2009-10-09 23:05:49 |
<Shelwien> | who needs that? | 2009-10-09 23:06:00 |
| compression requires random memory access | 2009-10-09 23:06:11 |
| especially all these modern CMs based on hashtables | 2009-10-09 23:06:39 |
<thometal> | ok thats bad | 2009-10-09 23:07:19 |
| :-/ | 2009-10-09 23:07:29 |
<Shelwien> | well, as i said, i have an idea for BWT though | 2009-10-09 23:08:58 |
| BWT can be implemented by simple parallel data scan, without a large global index | 2009-10-09 23:09:54 |
| and I suggested to use in-memory data compression | 2009-10-09 23:10:12 |
| to reduce memory access | 2009-10-09 23:10:25 |
| and make use of computing resources | 2009-10-09 23:10:33 |
<thometal> | yeah | 2009-10-09 23:12:10 |
<Shelwien> | http://ctxmodel.net/files/rcbwt_v2.rar | 2009-10-09 23:14:32 |
<thometal> | i will have a look into it | 2009-10-09 23:18:12 |
<Shelwien> | well, i have some other ideas too | 2009-10-09 23:18:56 |
| like that ccm_sh in topic | 2009-10-09 23:19:07 |
| has a vectorized rc implementation in it | 2009-10-09 23:19:16 |
| uses 8 independent rangecoders | 2009-10-09 23:19:36 |
| which is helpful for paralleling too | 2009-10-09 23:19:50 |
<thometal> | yeah but first i will look into this code over the next days and estimate the time i'd need to understand it | 2009-10-09 23:22:32 |
| =P | 2009-10-09 23:22:37 |
<Shelwien> | there're m1 sources too | 2009-10-09 23:23:06 |
| and Matt's of course ;) | 2009-10-09 23:23:18 |
*** asmodean has joined the channel | 2009-10-09 23:24:31 |
| welcome back ;) | 2009-10-09 23:25:22 |
<asmodean> | forgot what the channel was called and where it was ;p | 2009-10-09 23:25:54 |
<Shelwien> | there're irc searches for that ;) | 2009-10-09 23:26:10 |
| like netsplit ;) | 2009-10-09 23:26:13 |
<asmodean> | what would i search for. "hm i was idling ...on some server in ... some channel... with ... some guys..." | 2009-10-09 23:26:44 |
| heh | 2009-10-09 23:26:54 |
| oops oven is beeping | 2009-10-09 23:27:01 |
<Shelwien> | %) | 2009-10-09 23:27:06 |
| btw, as to cuda, here's some fun post | 2009-10-09 23:28:04 |
| http://www.semiaccurate.com/2009/10/06/nvidia-kills-gtx285-gtx275-gtx260-abandons-mid-and-high-end-market/ | 2009-10-09 23:28:06 |
<thometal> | hehe | 2009-10-09 23:32:24 |
<asmodean> | is that true? ;p | 2009-10-09 23:32:33 |
| i haven't paid attention to enthusiast graphics cards in many years | 2009-10-09 23:32:43 |
<Shelwien> | well, partly maybe | 2009-10-09 23:32:55 |
| it seems, nvidia announced a new chip recently | 2009-10-09 23:33:10 |
| to compete with AMD/ATI ones | 2009-10-09 23:33:24 |
<thometal> | yeah | 2009-10-09 23:33:35 |
<Shelwien> | and then, the reference card they showed turned out to be a fake ;) | 2009-10-09 23:33:49 |
<thometal> | but what they showed was a paper dummy | 2009-10-09 23:34:00 |
| ^^ | 2009-10-09 23:34:07 |
<Shelwien> | also i've seen some forum posts criticizing that article | 2009-10-09 23:34:51 |
| but they didn't say anything about specific facts being true or not | 2009-10-09 23:35:18 |
| like pulling off these mentioned cards | 2009-10-09 23:35:29 |
*** chornobyl has left the channel | 2009-10-09 23:37:23 |
*** thometal has left the channel | 2009-10-10 00:04:21 |
*** Shelwien has left the channel | 2009-10-10 03:04:24 |
*** Guest9968193 has joined the channel | 2009-10-10 03:04:27 |
*** pinc has joined the channel | 2009-10-10 07:46:49 |
*** pinc has left the channel | 2009-10-10 07:48:12 |
*** thometal has joined the channel | 2009-10-10 09:14:01 |
*** pinc has joined the channel | 2009-10-10 12:14:31 |
*** pinc has left the channel | 2009-10-10 12:30:20 |
*** thometal has left the channel | 2009-10-10 13:43:20 |
*** pinc has joined the channel | 2009-10-10 14:52:24 |
*** pinc has left the channel | 2009-10-10 15:36:14 |
*** Shelwien has left the channel | 2009-10-10 17:56:52 |
*** compbooks has left the channel | 2009-10-10 17:56:52 |
*** asmodean has left the channel | 2009-10-10 17:56:52 |
*** Shelwien has joined the channel | 2009-10-10 18:01:54 |
*** asmodean has joined the channel | 2009-10-10 18:01:54 |
*** compbooks has joined the channel | 2009-10-10 18:01:54 |
* ChanServ This channel has been registered with ChanServ. | 2009-10-10 18:01:55 |
*** Shelwien has left the channel | 2009-10-10 21:46:40 |
*** Shelwien has joined the channel | 2009-10-10 22:26:42 |
*** Skymmer has joined the channel | 2009-10-11 04:41:35 |
*** Skymmer has left the channel | 2009-10-11 05:07:03 |
*** pinc has joined the channel | 2009-10-11 07:45:46 |
*** pinc has left the channel | 2009-10-11 08:55:53 |
*** sami2 has joined the channel | 2009-10-11 09:58:20 |
<sami2> | hi! any news? | 2009-10-11 09:58:51 |
| michael released m03 btw. that's tested at comp.ratings. although it doesn't work for most files, but the bwt page has results for some files | 2009-10-11 11:13:49 |
| on guy submitted one compressor 3 times yesterday. one of the versions didn't compress anything losslessly. not the greatest day for me maintaining the benchmark | 2009-10-11 11:16:30 |
| ...one guy | 2009-10-11 11:16:39 |
| matt's generic compression also got ungenericized. somebody made a filter, which was expected. I really hope people now see what is the point of choosing real data, like in comp.ratings | 2009-10-11 11:27:40 |
<Shelwien> | hi | 2009-10-11 11:28:15 |
| i just posted another version of my rep-like tool | 2009-10-11 11:28:52 |
| http://shelwien.googlepages.com/fma_09.rar | 2009-10-11 11:28:55 |
| see readme.txt | 2009-10-11 11:29:03 |
| usage: | 2009-10-11 11:29:13 |
| fma-rep source literals matches | 2009-10-11 11:29:21 |
| fma-dec literals matches unpacked | 2009-10-11 11:29:35 |
| basically its a rep alternative / incremental remote diff about which i was talking | 2009-10-11 11:31:24 |
| still without that incremental functionality though | 2009-10-11 11:31:45 |
| but hopefully i fixed up enough stuff today to make it work | 2009-10-11 11:32:01 |
| ... | 2009-10-11 11:32:21 |
| as to m03... i found that it compressed obj2 better than bwtmix, which is kinda unpleasant ;) | 2009-10-11 11:33:30 |
| guess i'd have to write that contextual bwt coding of my own too, next time when i'd mess with bwt again ;) | 2009-10-11 11:34:30 |
| ... | 2009-10-11 11:34:57 |
| as to the forum, matt also finally got to compiling his zpaq configs ;) | 2009-10-11 11:35:32 |
| ... | 2009-10-11 11:36:36 |
| in other news, toffer posted a new version of m1, now with 4 models | 2009-10-11 11:36:53 |
<sami2> | hmm. how do i set minmatch for fma-rep? | 2009-10-11 11:36:58 |
<Shelwien> | and i tested it | 2009-10-11 11:37:03 |
| http://shelwien.googlepages.com/m1x2.htm | 2009-10-11 11:37:05 |
| http://shelwien.googlepages.com/m1x2-sfc.htm | 2009-10-11 11:37:11 |
| but unfortunately he didn't beat ccm in the end, it seems | 2009-10-11 11:37:24 |
| and as to minmatch - there's no commandline atm, just these executables | 2009-10-11 11:37:53 |
| and some commented out constants in crc32.inc | 2009-10-11 11:38:07 |
<sami2> | I compiled it, so what's the minmatch I got here? | 2009-10-11 11:38:28 |
<Shelwien> | 250 probably | 2009-10-11 11:38:45 |
| anchormask is the main parameter though | 2009-10-11 11:39:37 |
| its what actually determines the average fragment length | 2009-10-11 11:40:02 |
| while minmatch is just a way to set some upper limits on performance and hashfile size | 2009-10-11 11:41:05 |
| (no hashfile atm though ;) | 2009-10-11 11:41:12 |
<sami2> | fma outputs 99917554 literal file for enwik8. only 99998952 for my txt1 test. good. :-) | 2009-10-11 11:41:16 |
<Shelwien> | try uncommenting the first enum in crc32.inc | 2009-10-11 11:42:02 |
| or better 4th actually | 2009-10-11 11:42:12 |
<sami2> | I tried generating the txt1 to have as few matches as possible. but I got bored pretty quickly and didn't have time to analyze the matches in the end very carefully | 2009-10-11 11:42:26 |
<Shelwien> | (with winsize=32) | 2009-10-11 11:42:26 |
| well, i tested it with remuxed versions of some video | 2009-10-11 11:43:24 |
| and it generated 3-5M extra for the second file | 2009-10-11 11:44:10 |
| so it should be ok for its main purpose | 2009-10-11 11:44:37 |
| though i guess i should try attaching it to ppmd_sh or something | 2009-10-11 11:45:12 |
<sami2> | with the winsize=32 uncommented: enwik8 97874287, text1 99957470 | 2009-10-11 11:45:48 |
<Shelwien> | there're probably some considerable matches in some of your tests | 2009-10-11 11:45:50 |
| yeah, something like that | 2009-10-11 11:46:11 |
<sami2> | in comp.ratings there are lots of matches in the app2 at least | 2009-10-11 11:46:56 |
| so the winsize=32 means 32 is min matchlen? | 2009-10-11 11:47:23 |
<Shelwien> | not really | 2009-10-11 11:47:31 |
| winsize is a window for the rolling crc32 used for fragment anchors | 2009-10-11 11:47:57 |
| (for remote LZ ;) | 2009-10-11 11:48:07 |
<sami2> | ok, but kind of minmatch then | 2009-10-11 11:48:19 |
<Shelwien> | not really, though somewhat related | 2009-10-11 11:48:30 |
| i guess you can set it to like 8 | 2009-10-11 11:48:43 |
| and minblklen to 8 | 2009-10-11 11:49:06 |
| and anchormask to (1<<4)-1 or (1<<3)-1 | 2009-10-11 11:49:31 |
<sami2> | now it's 83... and 75... , but with winsize:16 it's 944... 995.... | 2009-10-11 11:51:59 |
<Shelwien> | actually there's no sense in having minmatch smaller than 13, because 12 is the size of match record ;) | 2009-10-11 11:52:24 |
<sami2> | :-) | 2009-10-11 11:52:33 |
<Shelwien> | and its not quite reasonable for plain local matchfinding anyway | 2009-10-11 11:54:12 |
| but its able to replace fragments with references to remote data | 2009-10-11 11:55:09 |
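The anchormask / winsize discussion above can be sketched in a few lines. This is a minimal illustration and not fma's actual code: it assumes an anchor is declared wherever the CRC32 of the last `winsize` bytes has all `anchormask` bits equal to zero, and it recomputes the CRC for every window with `zlib.crc32` instead of rolling it. `find_anchors` and `split_fragments` are hypothetical names.

```python
import zlib


def find_anchors(data: bytes, winsize: int = 32,
                 anchormask: int = (1 << 4) - 1) -> list[int]:
    """Positions i where CRC32(data[i-winsize:i]) has all masked bits zero.

    Assumption: "anchor" means (crc & anchormask) == 0. The CRC is recomputed
    per window here; a real tool would update it incrementally (rolling).
    """
    anchors = []
    for i in range(winsize, len(data) + 1):
        if zlib.crc32(data[i - winsize:i]) & anchormask == 0:
            anchors.append(i)
    return anchors


def split_fragments(data: bytes, anchors: list[int]) -> list[bytes]:
    """Cut data at the anchor positions, dropping empty pieces."""
    cuts = [0] + anchors + [len(data)]
    return [data[a:b] for a, b in zip(cuts, cuts[1:]) if b > a]


if __name__ == "__main__":
    sample = bytes(((i * 2654435761) >> 7) & 0xFF for i in range(4000))
    frags = split_fragments(sample, find_anchors(sample))
    print(len(frags), "fragments")
```

Since an anchor depends only on the contents of its window, inserting bytes at the front of the data shifts the later anchors without changing them, which is what makes the fragments usable for remote matching. For hash-like data roughly one position in `anchormask + 1` qualifies, so the mask, more than minmatch, sets the average fragment length.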
<sami2> | but the numbers look ok for text1, which is good. so now we have a large text file which has few matches | 2009-10-11 11:55:42 |
| almost all the large files used in papers (manzini for example) have huge (1-10mb long) matches | 2009-10-11 11:56:37 |
<Shelwien> | to kill off the dumb BWT implementations maybe? ;) | 2009-10-11 11:57:09 |
<sami2> | the manzini file esp has really been an evil thing for me in the past. my bwt sort really suits it well and for some time I was evaluating it as "text" performance | 2009-10-11 11:59:17 |
| and really the long matches make it very unlike text | 2009-10-11 11:59:57 |
| my sort is almost 2x faster on that file than other sorts | 2009-10-11 12:01:14 |
<Shelwien> | I didn't spend much time on BWT actually | 2009-10-11 12:02:43 |
| but still believe that i can get some good results if i try | 2009-10-11 12:03:17 |
| yet, inverse transformation is much more obscure for me | 2009-10-11 12:04:16 |
| i don't even have a clear idea on how to make it in <1N memory | 2009-10-11 12:05:20 |
<sami2> | yeah :-) it's tricky | 2009-10-11 12:07:03 |
| I'm not sure my implementation is the best, I mean there might be better ones | 2009-10-11 12:07:49 |
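For reference, the textbook inverse transformation looks roughly like this. It is not either participant's implementation: it keeps a full index array `lf` with one integer per input byte, so it needs far more than the <1N extra memory discussed above. `bwt` and `inverse_bwt` are illustrative names, and the forward transform sorts whole rotations, which is only practical for tiny inputs.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive forward BWT: last column of the sorted rotations plus the row
    index of the original string (the primary index)."""
    n = len(data)
    rots = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in rots)
    primary = rots.index(0) if n else 0
    return last, primary


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Inverse BWT via the LF mapping, using O(N) integers of extra memory."""
    n = len(last)
    # starts[c] = row of the first rotation that begins with byte c.
    counts = [0] * 256
    for c in last:
        counts[c] += 1
    starts, total = [0] * 256, 0
    for c in range(256):
        starts[c] = total
        total += counts[c]
    # lf[i] = row whose rotation begins with the symbol stored at last[i].
    lf, seen = [0] * n, [0] * 256
    for i, c in enumerate(last):
        lf[i] = starts[c] + seen[c]
        seen[c] += 1
    # Walk backwards from the original row, emitting symbols right to left.
    out = bytearray(n)
    row = primary
    for k in range(n - 1, -1, -1):
        out[k] = last[row]
        row = lf[row]
    return bytes(out)


if __name__ == "__main__":
    last, primary = bwt(b"banana")
    print(last, primary, inverse_bwt(last, primary))
```

The `lf` array is the "uncompressed index" in this picture; getting below 1N means replacing it with something more compact, which this sketch does not attempt.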
<Shelwien> | ah | 2009-10-11 12:10:03 |
| also, toffer downloaded the newer dcc papers (2005-2009) | 2009-10-11 12:10:31 |
| want a link? | 2009-10-11 12:10:53 |
<sami2> | basically mine is pretty dumb, i just use some additional memory to convert the uncompressed index into compressed index | 2009-10-11 12:14:10 |
| sure! | 2009-10-11 12:14:19 |
<Shelwien> | here're the papers: http://toffer.dreamhosters.com/dcc.7z | 2009-10-11 12:14:45 |
| and here're my indexes for these: http://toffer.dreamhosters.com/dcc_idx.rar | 2009-10-11 12:14:59 |
<sami2> | can I give those links to a couple of friends? | 2009-10-11 12:15:49 |
<Shelwien> | i guess | 2009-10-11 12:16:01 |
| apparently they're not signed or anything | 2009-10-11 12:16:09 |
<sami2> | about benchmark machine upgrade. i7-860 is the cheapest 4-core with ht right? I don't see any movement in the price. is there a "roadmap" for price reductions on intel? | 2009-10-11 12:53:33 |
*** sami2 has left the channel | 2009-10-11 13:38:27 |
*** chornobyl has joined the channel | 2009-10-11 15:50:02 |
*** Simon|B has joined the channel | 2009-10-11 17:06:11 |
*** Simon|B has left the channel | 2009-10-11 18:09:02 |
*** chornobyl has left the channel | 2009-10-11 21:25:52 |
*** Shelwien has left the channel | 2009-10-11 22:28:43 |
*** Shelwien has joined the channel | 2009-10-11 23:04:22 |
<Shelwien> | !next | 2009-10-11 23:04:32 |