*** jack has left the channel		2009-12-01 16:36:38
*** Guest4704955 has left the channel		2009-12-01 16:44:47
*** Guest4704955 has joined the channel		2009-12-01 16:59:25
*** pinc has left the channel		2009-12-01 17:13:11
*** Krugz has joined the channel		2009-12-01 17:45:58
*** toffer has joined the channel		2009-12-01 18:16:18
*** Guest4704955 has left the channel		2009-12-01 18:54:08
*** Guest4704955 has joined the channel		2009-12-01 19:05:29
*** pinc has joined the channel		2009-12-01 19:06:56
*** schnaader has joined the channel		2009-12-01 20:23:15
<Shelwien>	people are gathering for some reason, but nobody talks %)	2009-12-01 20:33:09
<schnaader_afk>	Will be talking in a few minutes :P	2009-12-01 20:33:54
<schnaader>	Tada :)	2009-12-01 20:40:43
	Well, it's the same for most IRC channels - even if 200 people are in, you may wait for half an hour without anyone talking and everything is filled with join/quit messages.	2009-12-01 20:43:13
	And I think if you'd compare view/post numbers in the forum, you'd come to similar results :)	2009-12-01 20:44:03
<Shelwien>	sure	2009-12-01 20:44:32
	anyway, this channel's log is much more readable than any other which I know of ;)	2009-12-01 20:45:05
<schnaader>	Yes, I think that's due to the fact that people know each other quite a bit already and it's not a bunch of random people.	2009-12-01 20:45:39
<Shelwien>	...though now I'm going afk - food calls ;)	2009-12-01 20:46:43
<schnaader>	:) OK, see ya	2009-12-01 20:47:18
	Have a nice meal :)	2009-12-01 20:47:52
*** Guest4704955 has left the channel		2009-12-01 20:50:46
*** pinc has left the channel		2009-12-01 20:51:07
*** Guest4704955 has joined the channel		2009-12-01 21:05:26
*** Shelwien has left the channel		2009-12-01 21:15:41
*** Guest9968193 has joined the channel		2009-12-01 21:15:45
<Shelwien>	btw, schnaader	2009-12-01 21:18:35
	what happens when precomp encounters a broken deflate stream?	2009-12-01 21:18:49
	like a file with remapped cluster in that VM image?	2009-12-01 21:19:11
<schnaader>	There can be different behaviours.	2009-12-01 21:24:44
<Shelwien>	i mean, can it extract just a single block for deflate stream?	2009-12-01 21:25:17
<schnaader>	Worst behaviour would be a deflate (or other) stream that stops somewhere and is followed by a big bunch of same bytes which could lead to a very big output stream, recompression would detect failure in that case.	2009-12-01 21:25:57
<Shelwien>	%)	2009-12-01 21:26:25
<schnaader>	Did this experiment once with a torrent that hadn't finished downloading, not recommended ;)	2009-12-01 21:27:01
	After the decompressed stream growing to several GB, Precomp stopped with "disk full"	2009-12-01 21:27:28
<Shelwien>	well, its not a problem with my approach - soundslimmer can losslessly process anything, even not an mp3 file	2009-12-01 21:28:09
<schnaader>	But streams don't have to be complete, indeed. In most cases, decompression will just stop because the stream is invalid at some point, and recompression will see how long the match is.	2009-12-01 21:28:37
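A minimal sketch of the behaviour described above, assuming a zlib-wrapped deflate stream and Python's standard `zlib` module. It is not Precomp's code. The helper names `decode_valid_prefix` and `common_prefix_len` are made up for illustration. It decodes whatever a truncated stream still yields, recompresses that output, and measures how many leading bytes agree with the broken stream.

```python
import zlib

def decode_valid_prefix(stream: bytes) -> tuple[bytes, bool]:
    """Decompress as much of a zlib stream as possible.

    Returns (output, complete). complete is False when the stream was
    cut off or turned invalid before its end-of-stream marker.
    """
    d = zlib.decompressobj()
    out = bytearray()
    try:
        # Feed small pieces so output produced before a corrupt spot is kept.
        for i in range(0, len(stream), 64):
            out += d.decompress(stream[i:i + 64])
            if d.eof:
                break
    except zlib.error:
        return bytes(out), False
    return bytes(out), d.eof

def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that are identical in a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Hypothetical test data: a complete stream and a copy cut in half.
data = b" ".join(str(i * i).encode() for i in range(3000))
full = zlib.compress(data, 9)
broken = full[:len(full) // 2]

partial, complete = decode_valid_prefix(broken)
recompressed = zlib.compress(partial, 9)
match_len = common_prefix_len(recompressed, broken)
```

How long the match runs depends on the encoder and its settings; the sketch only shows where the number comes from.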
<Shelwien>	ah. its better than i thought, then ;)	2009-12-01 21:29:15
<schnaader>	There are almost always some rare "attack" cases you can construct, but it's like with hash collisions - they're not that likely to happen :)	2009-12-01 21:29:28
	I even have these "penalty bytes" when some bytes of the compressed stream are different, but afterwards it's the same again.	2009-12-01 21:30:02
<Shelwien>	like patches?	2009-12-01 21:30:19
<schnaader>	Yes, but not that good, only works if they synchronize again, so it will not work with "00 01 02 04 05", "00 01 02 03 04 05", but with "00 01 02 FF 04 05"	2009-12-01 21:31:11
	Could be improved, but as I plan a complete rewrite that doesn't need brute force, it's not necessary :)	2009-12-01 21:31:39
	These patches work on bytes, the rewrite will be able to directly correct matches and always re-synchronize successfully that way.	2009-12-01 21:33:49
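A small sketch of the "penalty bytes" idea as described in the chat, not taken from Precomp's source. The function names are hypothetical. It records (position, original byte) pairs where two equally long streams differ, which is why it handles a substituted byte but not an inserted or deleted one.

```python
def make_penalty_bytes(original: bytes, recompressed: bytes):
    """Return a list of (position, original_byte) patches, or None.

    None means the lengths differ, i.e. a byte was inserted or deleted
    and a byte-wise patch list cannot bring the streams back in sync.
    """
    if len(original) != len(recompressed):
        return None
    return [(i, o) for i, (o, r) in enumerate(zip(original, recompressed)) if o != r]

def apply_penalty_bytes(recompressed: bytes, patches) -> bytes:
    """Overwrite the patched positions to recover the original stream."""
    out = bytearray(recompressed)
    for pos, value in patches:
        out[pos] = value
    return bytes(out)

# The examples from the chat:
ok = make_penalty_bytes(bytes.fromhex("000102030405"), bytes.fromhex("000102ff0405"))
bad = make_penalty_bytes(bytes.fromhex("000102030405"), bytes.fromhex("0001020405"))
```

Here `ok` holds a single patch for position 3, while `bad` is `None` because the shorter stream never lines up again.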
<Shelwien>	like Levenshtein distance on bits? ;)	2009-12-01 21:35:18
<schnaader>	Yes, kind of, only that insertion is missing at the moment.	2009-12-01 21:36:04
	The rewrite will basically be an own deflate implementation instead of using zLib, so I can check the recompressed result parallel to decompression and put the deflate differences in a structure that can be appended to the decompressed stream.	2009-12-01 21:37:15
<Shelwien>	yeah	2009-12-01 21:37:31
	i've got a puff.c clean for that too, but didn't start it still ;)	2009-12-01 21:37:51
	*cleaned	2009-12-01 21:37:59
<schnaader>	Like "01230123", "that dumb encoder didn't get the match, encode literals instead" :)	2009-12-01 21:37:59
	I've got the decompression and most of the recompression now, but I have to add some ringbuffers to avoid using temporary files again :)	2009-12-01 21:38:39
<Shelwien>	%)	2009-12-01 21:38:51
	btw, why don't you add some other preprocessing too?	2009-12-01 21:39:07
	like that record/delta filter in ccm?	2009-12-01 21:39:21
<schnaader>	I thought about this, especially the 7-Zip + srep results brought that to my mind again.	2009-12-01 21:39:44
<Shelwien>	well, rep is separate stuff, it takes a lot of memory	2009-12-01 21:40:18
<schnaader>	It's also getting important with upcoming bZip2 compression-on-the-fly where we might want to reorder the data because we have 900 KB blocks.	2009-12-01 21:40:19
<Shelwien>	btw, did you see my explanation of what ccm does?	2009-12-01 21:40:59
<schnaader>	In the forum? Was a while ago, wasn't it?	2009-12-01 21:41:47
<Shelwien>	i don't quite remember myself ;)	2009-12-01 21:42:04
	anyway, its fairly simple, but has a very nice effect	2009-12-01 21:42:23
	ccm processes data in 64k blocks	2009-12-01 21:42:37
	and reorders them by bytes if it finds any records	2009-12-01 21:43:04
*** pinc has joined the channel		2009-12-01 21:43:10
	so, 16bit stereo wav	2009-12-01 21:43:16
	64k block turns into 4 x 16k byte blocks	2009-12-01 21:43:37
	and there's delta too	2009-12-01 21:43:45
*** pinc has left the channel		2009-12-01 21:44:13
<schnaader>	Ah, I see, like abcdabcdabcd => aaabbbcccddd	2009-12-01 21:44:32
<Shelwien>	yeah, but also with subtractions if necessary	2009-12-01 21:45:19
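The reordering and the optional subtraction step can be sketched as follows. This is an illustration of the general record/delta idea only, not ccm's actual filter. The function names are made up, and the block length is assumed to be a multiple of the record size.

```python
def split_columns(block: bytes, record_size: int) -> bytes:
    """abcdabcdabcd -> aaabbbcccddd: group byte k of every record together."""
    return b"".join(block[i::record_size] for i in range(record_size))

def join_columns(columns: bytes, record_size: int) -> bytes:
    """Inverse of split_columns."""
    n = len(columns) // record_size  # number of records
    return bytes(columns[(i % record_size) * n + i // record_size]
                 for i in range(len(columns)))

def delta_encode(data: bytes) -> bytes:
    """Replace each byte by its difference to the previous one (mod 256)."""
    prev, out = 0, bytearray()
    for b in data:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)

def delta_decode(data: bytes) -> bytes:
    """Inverse of delta_encode."""
    prev, out = 0, bytearray()
    for d in data:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return bytes(out)
```

For 16-bit stereo audio the record size would be 4, so a 64k block splits into four 16k columns, matching the description above.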
<schnaader>	How does it detect this? Does it know for some stream types only or does it use a general attempt by checking some stats about the bytes?	2009-12-01 21:45:45
<Shelwien>	general afaik	2009-12-01 21:46:54
	its a record filter	2009-12-01 21:46:58
	it not only supports wavs	2009-12-01 21:47:12
	but also images and tables with fixed records	2009-12-01 21:47:21
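How ccm detects records is not spelled out in the chat, so the following is only one plausible "general" approach and purely an assumption: score every candidate distance by how close each byte is to the byte that many positions earlier, and pick the distance with the smallest average difference. The name `guess_record_size` and the limit of 16 are hypothetical.

```python
def guess_record_size(block: bytes, max_size: int = 16) -> int:
    """Guess a fixed record size from byte statistics (heuristic sketch).

    Assumes len(block) > max_size. A result of 1 suggests there is
    no useful record structure.
    """
    def score(d: int) -> float:
        total = 0
        for i in range(d, len(block)):
            diff = abs(block[i] - block[i - d])
            total += min(diff, 256 - diff)  # distance on the byte circle
        return total / (len(block) - d)

    return min(range(1, max_size + 1), key=score)

# Hypothetical table with 3-byte records: a counter, a constant, a down-counter.
sample = b"".join(bytes([i, 0x80, 255 - i]) for i in range(200))
size = guess_record_size(sample)
```

A real filter would likely add a threshold so that unstructured data is left alone.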
<schnaader>	Especially helpful if not 2 or 4 bytes record size	2009-12-01 21:47:44
<Shelwien>	yeah	2009-12-01 21:47:52
*** STalKer-X has joined the channel		2009-12-01 21:47:56
	?	2009-12-01 21:48:02
<STalKer-X>	pow	2009-12-01 21:48:09
<schnaader>	Could even be generalised to bits, but this would be harder to detect	2009-12-01 21:48:12
<Shelwien>	not much sense too, imho	2009-12-01 21:48:35
<schnaader>	Although if bit record size isn't prime, it's almost the same.	2009-12-01 21:48:36
*** Guest4704955 has left the channel		2009-12-01 21:48:52
<Shelwien>	there's another problem though, with database records	2009-12-01 21:48:53
	like a record can contain a string and a few numbers	2009-12-01 21:49:24
	and encoding the string part by columns might not be a good idea	2009-12-01 21:49:53
	as it could otherwise match something else	2009-12-01 21:50:08
<schnaader>	Well, in the bZip2 case, I can always apply different preprocessing and choose the best result for a given block.	2009-12-01 21:51:18
	Of course, most of the worst cases can be detected before, anyway and not be preprocessed.	2009-12-01 21:52:33
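A sketch of the "apply different preprocessing and choose the best result" idea, using Python's standard `bz2` module. The filter set and the names `FILTERS` and `best_filter` are assumptions for illustration. A real implementation would also store which filter was chosen so the decoder can undo it.

```python
import bz2

def identity(b: bytes) -> bytes:
    return b

def split2(b: bytes) -> bytes:
    """Group even-position and odd-position bytes (2-byte records)."""
    return b[0::2] + b[1::2]

def split4(b: bytes) -> bytes:
    """Group bytes by position within 4-byte records."""
    return b"".join(b[i::4] for i in range(4))

def delta(b: bytes) -> bytes:
    """Byte-wise difference to the previous byte (mod 256)."""
    return bytes((b[i] - (b[i - 1] if i else 0)) & 0xFF for i in range(len(b)))

FILTERS = {"identity": identity, "split2": split2, "split4": split4, "delta": delta}

def best_filter(block: bytes):
    """Compress the block once per filter and return (best name, all sizes)."""
    sizes = {name: len(bz2.compress(f(block))) for name, f in FILTERS.items()}
    return min(sizes, key=sizes.get), sizes

# Hypothetical block of 16-bit little-endian values.
block = b"".join((i * 37 % 65536).to_bytes(2, "little") for i in range(4096))
name, sizes = best_filter(block)
```

Trying every filter costs one compression pass each, which is why detecting the hopeless cases up front, as mentioned above, saves time.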
<Shelwien>	btw, intel's bzip is weird	2009-12-01 21:54:16
	produces files of different size at the same modes ;)	2009-12-01 21:54:38
<schnaader>	One of the first filter ideas was for PDF data, do you know these "(word )<ASCII float numbers and PDF commands>(other )(words and perhaps some)(l)<...>(etters)" crap they're doing in there? Splitting up to text, commands and encoding the floats binary would reeeeally help there :)	2009-12-01 21:55:17
	Don't combine RNGs and compression ;)	2009-12-01 21:56:27
	At least bZip2 isn't as bad as deflate - you know which mode was used and output will be the same most of the time, although still not 100% reliable.	2009-12-01 21:57:45
<Shelwien>	i guess, there's just much less implementations ;)	2009-12-01 21:58:11
	and pdf also has random stuff beside deflate... like that ascii85 etc	2009-12-01 21:59:25
<schnaader>	Yes, that's the main factor. Not that easy to implement BWT things as with huffman codes and literal/match decisions.	2009-12-01 21:59:26
	ascii85 is on my todo list, should have done this already :( Welcome laziness ;)	2009-12-01 21:59:51
	By the way, do you know anything about encrypted PDFs? I'm pretty sure decrypting could be done, but I'm not sure if encrypting it back with same results would be possible. Not to mention that Adobe pretty sure wouldn't like such things...	2009-12-01 22:01:45
<Shelwien>	well, i can recommend a decrypting utility if you want ;)	2009-12-01 22:03:10
<schnaader>	I have decryption sources as well, thanks, but nobody cares about re-encryption ;)	2009-12-01 22:03:33
<Shelwien>	well, i think it should be possible to reconstruct	2009-12-01 22:04:52
	if you decrypt it yourself	2009-12-01 22:05:09
<schnaader>	Depends, as some (or most) of the algorithms seem to be asymmetrical, so you might have a public key, but perhaps would need the private key to re-encrypt, don't know...	2009-12-01 22:05:55
<Shelwien>	and as to adobe... maybe messing up something to avoid getting a usable decrypted pdf would be a good idea ;)	2009-12-01 22:05:59
<schnaader>	Output would be pretty messed up by the PCF format already, but I also thought about this, yes :)	2009-12-01 22:07:06
<Shelwien>	and keys should be available anyway, as software which does the encryption is available ;)	2009-12-01 22:07:36
<schnaader>	Right :)	2009-12-01 22:07:53
<Shelwien>	btw, what about text filters?	2009-12-01 22:08:14
	including LIPT etc?	2009-12-01 22:08:21
	like WRT?	2009-12-01 22:08:33
<schnaader>	LIPT? Google found "Leymann Inventory of Psychological Terror", lol	2009-12-01 22:08:51
	Yes, WRT and especially HTML/XML filters also came to my mind, same thing as with most Precomp ideas - would take too long for now, other things have higher priority :)	2009-12-01 22:10:04
	Although I have several code branches with such experiments at least using scripts and made-up examples.	2009-12-01 22:10:26
<Shelwien>	Length Index Preserving Transform	2009-12-01 22:11:30
	your version was better though, at least makes some sense ;)	2009-12-01 22:12:07
<schnaader>	There's always some sort of psychological terror involved when it comes to compression :)	2009-12-01 22:12:57
<Shelwien>	still, there're simpler text filters too, which still help	2009-12-01 22:15:11
	like "capital conversion"	2009-12-01 22:15:17
	and punctuation padding	2009-12-01 22:15:54
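A minimal sketch of "capital conversion" only (punctuation padding is not covered). It assumes the flag character `\x01` never occurs in the input. The names `FLAG`, `capital_encode` and `capital_decode` are made up. A word-initial capital is replaced by a flag plus the lowercase letter, so "The" and "the" share the same letters for the model.

```python
import re

FLAG = "\x01"  # assumption: this control character does not occur in the text

def capital_encode(text: str) -> str:
    """Turn a word-initial capital into FLAG + lowercase letter.

    All-caps words such as "NASA" are left untouched.
    """
    assert FLAG not in text
    return re.sub(r"\b[A-Z](?=[a-z]|\b)",
                  lambda m: FLAG + m.group().lower(), text)

def capital_decode(text: str) -> str:
    """Inverse of capital_encode."""
    return re.sub(FLAG + r"([a-z])", lambda m: m.group(1).upper(), text)

encoded = capital_encode("Hello World, NASA rocks. A b")
```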
<toffer>	such stuff is more efficient when incorporated into the context generation :D	2009-12-01 22:17:52
<schnaader>	I think I should go for a cleaned up object oriented version of Precomp in beta phase (which will start soon, supporting multiple files and directories is the only big todo left for that), so generalising pre-/postprocessing would be easy and external DLLs could be used for quick tests.	2009-12-01 22:18:14
<toffer>	the compressor manually applies these transforms for already processed data to improve context clustering	2009-12-01 22:18:18
<Shelwien>	toffer: not quite, it also affects symbol decomposition	2009-12-01 22:18:43
<toffer>	not the decomposition itself, but the processed symbols	2009-12-01 22:19:24
<Shelwien>	ah	2009-12-01 22:19:36
	btw, considering dlls	2009-12-01 22:19:46
	did you see my precomp merged into a single exe? with packjpg?	2009-12-01 22:19:58
<toffer>	something i always wondered... how large is your source?	2009-12-01 22:20:07
	@shelwien did you ever try to optimize such transforms?	2009-12-01 22:21:37
<Shelwien>	there's not much to optimize kinda	2009-12-01 22:22:13
<toffer>	a set of flags	2009-12-01 22:22:28
<Shelwien>	you either use it, or not	2009-12-01 22:22:29
<schnaader>	@Shelwien: Was this one of the posts in "How small could we get a Precomp SFX"? Something like this would be useful, although I thought about disabling PackJPG by default in the next version because the 2.4WIP version is too unstable.	2009-12-01 22:22:31
<toffer>	what to apply when	2009-12-01 22:22:33
	for every model, of course - assuming cm	2009-12-01 22:23:40
<Shelwien>	schnaader: i made a tool called dllmerge, which resolves exe imports/exports with a statically bound dll and merges them	2009-12-01 22:23:45
<toffer>	well it worked with pthread+m1	2009-12-01 22:24:05
<Shelwien>	worked with precomp too	2009-12-01 22:24:12
<schnaader>	@toffer: At the moment it's about 9000 LOC, 300 KB source size (excluding external GIF routines and zLib). It could be smaller, though, as it's pretty messed up, for example there are try_decompression_(pdf/zip/...) routines that could be merged into one with some branches.	2009-12-01 22:25:08
<toffer>	@eugene: i see you did "hand"-tuning to your mtf ? that ranking function only used an enum as a constant	2009-12-01 22:25:08
	ouch	2009-12-01 22:25:21
<Shelwien>	yeah	2009-12-01 22:25:24
<toffer>	300kb	2009-12-01 22:25:26
	i mean i got a few 1000 loc, but it's just	2009-12-01 22:25:44
	80kb	2009-12-01 22:25:47
	you should really consider c++	2009-12-01 22:26:19
	afaik it was c?	2009-12-01 22:26:24
	i mean templates are pretty useful for code generation	2009-12-01 22:26:40
<schnaader>	This is C++, but you're right, not using OO as I should :)	2009-12-01 22:26:42
<toffer>	^^	2009-12-01 22:26:53
<schnaader>	You also see that routine merge laziness in the EXE - 400 KB -> 130 KB with UPX.	2009-12-01 22:26:59
<Shelwien>	;)	2009-12-01 22:27:20
<toffer>	usually such code attracts errors quite a bit	2009-12-01 22:27:30
<schnaader>	So I guess LOC could get down to about 3000 LOC easily, but it just wouldn't change much, so I didn't bother yet.	2009-12-01 22:27:31
	@toffer: Yes, this is indeed the best argument for a rewrite.	2009-12-01 22:28:01
	It also isn't helpful with most of the new features like compression-on-the-fly where you have to replace all the fread/fwrite's you didn't generalize although you knew you should have done it :)	2009-12-01 22:29:06
<Shelwien>	http://en.wikipedia.org/wiki/Coroutine	2009-12-01 22:29:51
<toffer>	on the other hand i have serious trouble from time to time with c++ stl with vector of vector of vector and some other rather basic stuff. checked the assembly and the code was wrong causing random memory poking, etc.	2009-12-01 22:30:26
<schnaader>	And recursion would have been a lot easier without all those BAAAD global variables I have to push/pop now :(	2009-12-01 22:31:03
<toffer>	i mean an excessive usage of such c++ features reveals bugs quite often.	2009-12-01 22:31:04
	that's really ugly	2009-12-01 22:31:17
	i got no global vars in my code at all	2009-12-01 22:31:32
	^^	2009-12-01 22:31:33
<Shelwien>	you have them in fact	2009-12-01 22:31:49
	like _errno	2009-12-01 22:31:52
<schnaader>	Linux version will be an interesting thing because I'll do some valgrind experiments, could reveal some memory leaks/errors that are there quite sure.	2009-12-01 22:31:55
<toffer>	well that's the c library	2009-12-01 22:32:03
	but not the stuff i've written	2009-12-01 22:32:09
<Shelwien>	;)	2009-12-01 22:32:15
<toffer>	@eugene: i made some experiments for possible speedups. there'	2009-12-01 22:32:44
<Shelwien>	?	2009-12-01 22:32:58
<toffer>	there's some potential in replacing hashing with direct lookups	2009-12-01 22:33:00
	in m1	2009-12-01 22:33:01
	but that requires to detect, e.g. order1 and 2 context mask	2009-12-01 22:33:16
<Shelwien>	ah. like what i did in mix_test?	2009-12-01 22:33:21
<toffer>	and special code to handle.	2009-12-01 22:33:26
	short contexts only	2009-12-01 22:33:35
	o1,2	2009-12-01 22:33:39
	you always used lookup tables afaik	2009-12-01 22:33:46
<Shelwien>	they don't need any hashing obviously ;)	2009-12-01 22:33:54
<toffer>	but it's 5-8% faster	2009-12-01 22:34:21
	even with dumb code	2009-12-01 22:34:26
<Shelwien>	should be ;)	2009-12-01 22:34:35
	that might be useful for you then - http://encode.dreamhosters.com/showthread.php?t=396	2009-12-01 22:35:02
	you can check whether its a constant or variable	2009-12-01 22:35:19
	and select direct lookups if its constant and mask fits into 64k	2009-12-01 22:35:43
<toffer>	well more or less	2009-12-01 22:36:22
	but that'd require to make loadable parameters constant	2009-12-01 22:36:41
<Shelwien>	you can generate multiple versions in compile-time	2009-12-01 22:37:15
	btw, the trick which i did in ccm_sh should be usable with gcc too, i think	2009-12-01 22:37:33
<toffer>	that would bloat the exe size multiple times	2009-12-01 22:37:33
<Shelwien>	yeah, so what?	2009-12-01 22:37:42
	upx etc...	2009-12-01 22:37:48
<toffer>	that just sounds ill to me	2009-12-01 22:38:12
	if i can simply have a single more if	2009-12-01 22:38:21
<Shelwien>	why not if its faster	2009-12-01 22:38:21
	runtime if is bad	2009-12-01 22:38:33
<toffer>	one more if per model per byte	2009-12-01 22:38:35
<Shelwien>	worse than division	2009-12-01 22:38:37
<toffer>	if i'd do that per bit yes	2009-12-01 22:38:53
	but that way it's acceptable	2009-12-01 22:38:59
<Shelwien>	whatever, if the code would be still there	2009-12-01 22:39:07
	it also fragments code cache etc	2009-12-01 22:39:25
	anyway, i was talking about the idea	2009-12-01 22:39:40
	with compiling the same source multiple times with different macro parameters	2009-12-01 22:39:59
	and linking it all together after all	2009-12-01 22:40:15
	as i found, it gave me a considerable speedup	2009-12-01 22:40:35
	because i was able to use separate PGO for decoder and encoder	2009-12-01 22:40:55
	and different compiter options	2009-12-01 22:41:05
	*compiler	2009-12-01 22:41:09
	and their code ranges didn't overlap	2009-12-01 22:41:21
<toffer>	as you know all parameters must be run-time loadable	2009-12-01 22:42:02
	i simply cannot use such an approach	2009-12-01 22:42:07
<Shelwien>	well, you can	2009-12-01 22:42:35
	like, check the masks and select a codec version based on that	2009-12-01 22:42:56
	with direct or hashed lookups	2009-12-01 22:43:02
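A sketch of selecting a lookup variant from the context mask at load time. It reads "mask fits into 64k" literally as "the masked context is below 65536", which is an assumption, as are the name `make_index_fn` and the multiplicative hash constant. Neither M1 nor ccm_sh is claimed to work this way.

```python
def make_index_fn(mask: int, table_bits: int = 16):
    """Return (kind, fn), where fn maps a context to a slot in a
    table of 2**table_bits entries."""
    if mask < (1 << table_bits):
        # Masked context already fits the table: index it directly.
        return "direct", lambda ctx: ctx & mask
    # Otherwise fall back to hashing the masked context (32-bit multiply).
    return "hashed", lambda ctx: (((ctx & mask) * 0x9E3779B1) & 0xFFFFFFFF) >> (32 - table_bits)

kind1, index1 = make_index_fn(0xFF)        # order-1 style mask
kind2, index2 = make_index_fn(0xFFFFFF)    # too wide for a 64k table
```

The check runs once per model when parameters are loaded, so the per-byte or per-bit path contains no extra branch beyond the call itself.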
<toffer>	yes, but i won't do that for all possible combinations	2009-12-01 22:44:17
	since the number grows exponentially	2009-12-01 22:44:23
	it still requires to inject some code	2009-12-01 22:44:38
<Shelwien>	well, runtime code generation is the best	2009-12-01 22:44:51
	but damned C++ doesn't have such a feature	2009-12-01 22:45:01
<toffer>	not inject in that sense	2009-12-01 22:45:01
	but it would be very nice, indeed	2009-12-01 22:45:12
	as most parameters are just machine words	2009-12-01 22:45:36
<Shelwien>	...afk, sorry	2009-12-01 22:48:52
<schnaader>	Better afk than your chair getting wet :P	2009-12-01 22:49:46
<toffer>	i just tested the code	2009-12-01 22:50:37
	and hard coded that lookups	2009-12-01 22:50:43
	it's 1% faster	2009-12-01 22:50:47
	not worth the effort	2009-12-01 22:50:52
	5.66s -> 5.61s	2009-12-01 22:51:09
<schnaader>	Is that 1% constant or would it grow with more complex settings?	2009-12-01 22:51:23
<toffer>	constant	2009-12-01 22:53:45
	it would grow, of course	2009-12-01 22:53:54
<schnaader>	OK, just thought about it because you said combinations could grow exponentially.	2009-12-01 22:54:02
<toffer>	but the number of possible different combinations i'd need to compile grows exponentially	2009-12-01 22:54:13
<schnaader>	Yay, 2 dev/null/nethack trophies this year :) http://nethack.kahrens.com/playertrophies.php?id=424&year=2009&place=First&size=Large	2009-12-01 23:18:09
<toffer>	dunnot know about that	2009-12-01 23:25:07
	gonna watch family guy now	2009-12-01 23:25:13
<schnaader>	Family guy is so funny :) So have fun ;)	2009-12-01 23:26:01
<toffer>	really?	2009-12-01 23:32:02
	well i like it a lot	2009-12-01 23:32:05
<schnaader>	I like the kind of strong, but still somewhat critical humor in it, like in American Dad or Drawn Together (or in the Simpsons, although not that extreme).	2009-12-01 23:33:15
<toffer>	well the simpsons are really great. for kids and for adults. i mean when i was a child i didn't understand all of the stuff in it reflecting something real	2009-12-01 23:34:57
<schnaader>	Yes, though it's a good mix so you still like it as a child :)	2009-12-01 23:35:26
<toffer>	cheers	2009-12-01 23:38:34
	gn8	2009-12-02 00:55:59
*** toffer has left the channel		2009-12-02 00:56:06
*** schnaader has left the channel		2009-12-02 00:57:47
*** STalKer-Y has joined the channel		2009-12-02 04:06:57
*** STalKer-X has left the channel		2009-12-02 04:10:02
*** Krugz has left the channel		2009-12-02 07:02:48
*** pinc has joined the channel		2009-12-02 09:15:10
*** schnaader has joined the channel		2009-12-02 14:59:45
*** schnaader has left the channel		2009-12-02 15:15:06
*** toffer has joined the channel		2009-12-02 15:28:12
	hi guys	2009-12-02 15:28:50
<Shelwien>	hi toffer, they're all bots	2009-12-02 15:29:16
<toffer>	erm?	2009-12-02 15:29:27
	did you finally write some?	2009-12-02 15:29:59
<Shelwien>	no, they somehow appear even without me ;)	2009-12-02 15:30:41
	though i did write complogger ;)	2009-12-02 15:31:02
<toffer>	well, yes	2009-12-02 15:31:09
	but i thought pinc and asmodean are real	2009-12-02 15:31:21
<Shelwien>	well, sometimes, very rarely ;)	2009-12-02 15:31:46
<pinc>	yepp, sometimes I'm real ))	2009-12-02 15:32:03
<toffer>	you must be kidding - they're just idle	2009-12-02 15:33:31
<Shelwien>	of course, but in a sense, mirc without user is no different from complogger ;)	2009-12-02 15:34:48
<toffer>	so you're kidding ^^	2009-12-02 15:35:51
<Shelwien>	...	2009-12-02 15:36:11
	i'm writing a coroutine demo here	2009-12-02 15:36:29
	rewritten that mtf utility using setjmp/longjmp	2009-12-02 15:37:02
	and gcc is annoying me as usual	2009-12-02 15:37:14
	i mean, it works with MSC/Intel, but not gcc	2009-12-02 15:37:44
<toffer>	how do you want to parallelize it?	2009-12-02 15:48:02
<Shelwien>	its not about paralleling	2009-12-02 15:48:29
	its about building a data processing pipeline with readable syntax	2009-12-02 15:49:13
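The pipeline idea can be illustrated with Python generators, which behave like simple coroutines. This is only an analogy to the setjmp/longjmp demo mentioned above, not a port of it. The stage names are made up. Each stage pulls symbols from the previous one, so the processing chain reads top to bottom.

```python
def read_bytes(data: bytes):
    """Source stage: yield the input one byte at a time."""
    for b in data:
        yield b

def mtf_encode(symbols):
    """Move-to-front: yield the rank of each symbol, then move it to the front."""
    table = list(range(256))
    for s in symbols:
        rank = table.index(s)
        table.insert(0, table.pop(rank))
        yield rank

def mtf_decode(ranks):
    """Inverse move-to-front."""
    table = list(range(256))
    for r in ranks:
        s = table.pop(r)
        table.insert(0, s)
        yield s

# The pipeline is written as nested stages; nothing runs until bytes() pulls.
encoded = bytes(mtf_encode(read_bytes(b"aaabbbab")))
decoded = bytes(mtf_decode(read_bytes(encoded)))
```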
<toffer>	erm but...?	2009-12-02 15:49:16
<Shelwien>	well, i can post the current version, though it doesn't work with gcc yet	2009-12-02 15:50:38
	i'm trying to fix that too, but its tricky	2009-12-02 15:50:48
<toffer>	i'll first have a look at coroutines	2009-12-02 15:52:13
	could you grep it	2009-12-02 15:52:19
	?	2009-12-02 15:52:20
<Shelwien>	you can too	2009-12-02 15:52:29
<toffer>	!grep or something	2009-12-02 15:52:33
	ah	2009-12-02 15:52:39
	^^	2009-12-02 15:52:41
<Shelwien>	;)	2009-12-02 15:52:42
<toffer>	just guessed the syntax right	2009-12-02 15:52:51
	!grep coroutine	2009-12-02 15:52:54
	mh	2009-12-02 15:53:09
	was it on wikipedia?	2009-12-02 15:53:12
<Shelwien>	its case-sensitive	2009-12-02 15:53:12
	!grep Coro	2009-12-02 15:53:18
<toffer>	ah	2009-12-02 15:53:23
	thanks	2009-12-02 15:53:25
	btw i evaluated the speed gain of different implementations	2009-12-02 15:58:00
	regarding table lookups	2009-12-02 15:58:06
<Shelwien>	?	2009-12-02 15:58:15
<toffer>	i mean under certain circumstances it's beneficial to use lookup tables instead of hashign	2009-12-02 15:58:45
	hashing	2009-12-02 15:58:48
<Shelwien>	well, hashing is lookup tables with randomized indexing	2009-12-02 15:59:35
	of course direct indexing is faster	2009-12-02 15:59:45
<toffer>	i've written code to detect context masks like 0xff, 0x40ff, 0x405ff and the same for order 2.	2009-12-02 15:59:46
	and gonna use specialized codecs for either one or two directly addressable models	2009-12-02 16:00:25
	in most cases i got order 1 and 2 anyway	2009-12-02 16:00:39
<Shelwien>	well, that's something too, i guess	2009-12-02 16:01:04
	though i hope you don't only support fixed masks, but count mask bits	2009-12-02 16:01:39
<toffer>	i could do that but it'd require to reorder the bits	2009-12-02 16:02:25
	which is slow	2009-12-02 16:02:38
<Shelwien>	yeah, i think that would be still faster than hashing	2009-12-02 16:02:45
<toffer>	the fastest implementation i can think of is to have translation tables	2009-12-02 16:03:22
	e.g. tab[c] is setup for mask m to stuff bits together	2009-12-02 16:04:07
	but that'd still require a loop over 8 bytes	2009-12-02 16:04:28
	which is slow	2009-12-02 16:04:31
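A sketch of the translation-table idea just described: one 256-entry table per context byte, each pre-shifted so the selected bits end up packed together, with the per-byte loop toffer mentions. The names and the 4-byte context width are assumptions. `pack_bits` is a slow bit-by-bit reference used to build and check the tables.

```python
def pack_bits(value: int, mask: int) -> int:
    """Reference: gather the bits of value selected by mask into the low bits."""
    out = pos = bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            out |= ((value >> bit) & 1) << pos
            pos += 1
        bit += 1
    return out

def build_tables(mask: int, nbytes: int = 4):
    """One table per context byte: tab[i][c] is the packed, pre-shifted
    contribution of byte value c at byte position i."""
    tables, shift = [], 0
    for i in range(nbytes):
        m = (mask >> (8 * i)) & 0xFF
        tables.append([pack_bits(c, m) << shift for c in range(256)])
        shift += bin(m).count("1")
    return tables

def pack_with_tables(ctx: int, tables) -> int:
    """Pack a context with one table lookup per byte."""
    idx = 0
    for i, tab in enumerate(tables):
        idx |= tab[(ctx >> (8 * i)) & 0xFF]
    return idx

tables = build_tables(0x405FF)  # one of the masks mentioned above, 11 bits set
```

With 11 mask bits the packed index needs a table of only 2048 entries, at the price of the byte loop on every context update.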
<Shelwien>	well, yeah, though i just precompile the code for that	2009-12-02 16:04:34
	damned google finally completely dropped googlepages a few days ago	2009-12-02 16:05:27
	its annoying as hell now	2009-12-02 16:05:31
<toffer>	^^	2009-12-02 16:05:32
	google is evil	2009-12-02 16:05:35
<Shelwien>	http://sites.google.com/site/shelwien/gmtf_v0a.rar	2009-12-02 16:05:38
<toffer>	i cannot get specialized code for all of that	2009-12-02 16:05:44
	that's impossible	2009-12-02 16:05:47
	even for a single mask	2009-12-02 16:05:51
<Shelwien>	that depends on what you want to do	2009-12-02 16:06:07
<toffer>	since it'd require to have 256^# of bytes to translate different pieces	2009-12-02 16:06:09
	and you overestimate the speed gain of lookups	2009-12-02 16:06:32
	the optimization gets 8%	2009-12-02 16:06:43
	speed improvement	2009-12-02 16:06:49
<Shelwien>	but i don't think that being able to tune to data _and_ use new profiles right away with all possible speed optimization is that important	2009-12-02 16:07:16
	so you can either build new versions by recompiling the model after retuning	2009-12-02 16:08:18
	like i do, and zpaq now	2009-12-02 16:08:23
	or you can also implement a generalized version	2009-12-02 16:08:49
	which would support any profiles	2009-12-02 16:08:57
	but won't be speed-optimized	2009-12-02 16:09:02
<toffer>	well i still like to get that speed hit without specialisation	2009-12-02 16:10:37
	i could modify my code generator to produce a header with the hard-coded parameters.	2009-12-02 16:11:08
<Shelwien>	yeah	2009-12-02 16:11:14
<toffer>	actually it was like that for previous version <= 0.2	2009-12-02 16:11:17
<Shelwien>	and well, i don't see the point with losing the possible gain with specialization	2009-12-02 16:11:37
<toffer>	but my current optimizer approach is more generalized thus support run-time parameter loading and multi threading	2009-12-02 16:11:39
	it's no loss	2009-12-02 16:11:53
	if it cannot use lookup tables it switches to the current implementation: hash tables	2009-12-02 16:12:17
<Shelwien>	well, of course its no loss until you properly optimize the specialized version	2009-12-02 16:12:30
*** chornobl has joined the channel		2009-12-02 16:12:33
	bl?	2009-12-02 16:13:03
<chornobl>	ive shortened it	2009-12-02 16:15:27
	since old nick was banned	2009-12-02 16:15:38
*** Krugz has joined the channel		2009-12-02 16:15:46
	btw, theres question about your p2p idea	2009-12-02 16:20:09
<Shelwien>	?	2009-12-02 16:20:23
<chornobl>	how would it handle multiple nested files	2009-12-02 16:21:11
	like iso which contains zip which contains jpg	2009-12-02 16:21:43
<Shelwien>	well, its not quite related to p2p - that's more about matching recompressed data	2009-12-02 16:21:54
<chornobl>	anyway	2009-12-02 16:22:05
<Shelwien>	and afaiu, we can just compute multiple hashtables for a file	2009-12-02 16:22:29
	i mean, there could be a matching compressed version of original file	2009-12-02 16:22:58
	or, otherwise, some unpacked contents can match	2009-12-02 16:23:16
	but either way, we can detect that	2009-12-02 16:23:35
<chornobl>	so it will be hierarchical structure	2009-12-02 16:24:02
<Shelwien>	although reconstructing the file from multiple sources would be very tricky to implement	2009-12-02 16:24:16
	i mean, if i downloaded half of the zip archive compressed	2009-12-02 16:24:39
	and can't find any more seeds	2009-12-02 16:24:48
*** sami has joined the channel		2009-12-02 16:25:02
	and then i find other files supposedly contained there	2009-12-02 16:25:06
<sami>	hi!	2009-12-02 16:25:14
<Shelwien>	but unpacked, or with different compression	2009-12-02 16:25:40
	still, thats better than nothing	2009-12-02 16:25:59
	hi sami ;)	2009-12-02 16:26:01
<toffer>	hi	2009-12-02 16:26:13
<Shelwien>	sami: http://sites.google.com/site/shelwien/gmtf_v0a.rar	2009-12-02 16:26:18
	its my upcoming coroutine demo (still buggy)	2009-12-02 16:26:32
	do you have any suggestions?	2009-12-02 16:26:41
<chornobl>	depth of encapsulation should be limited, or manually controlled, to get sane hash size	2009-12-02 16:28:39
<Shelwien>	sane hash size doesn't really matter for p2p	2009-12-02 16:29:04
	as it won't be transferred anywhere until matches found	2009-12-02 16:29:33
<chornobl>	still there should be some adaptivity, because video file differs from example mentioned above, so bits need to be spread differently between levels (1 vs 3)	2009-12-02 16:34:03
<Shelwien>	i don't understand	2009-12-02 16:34:41
<chornobl>	i mean more bits can be given to video file (not precompressible)	2009-12-02 16:36:12
<Shelwien>	still don't know what are you talking about	2009-12-02 16:36:40
	the idea is that we can find somebody who has a given data fragment	2009-12-02 16:36:59
<chornobl>	than first nested level of same sized iso (precompressible)	2009-12-02 16:37:00
<Shelwien>	by its hash	2009-12-02 16:37:03
	and some data can have multiple representations	2009-12-02 16:37:37
<chornobl>	guess i lost some communication skills recently =)	2009-12-02 16:38:00
<Shelwien>	so we can index all or at least some of these	2009-12-02 16:38:03
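A rough sketch of indexing more than one representation of the same data, assuming fixed-size chunks, SHA-1 hashes and a plain zlib stream as the only "container". All of these are illustrative choices, not part of the idea as discussed, and the names `CHUNK`, `chunk_hashes` and `index_file` are made up.

```python
import hashlib
import zlib

CHUNK = 4096  # hypothetical fragment size

def chunk_hashes(data: bytes) -> set:
    """Hash every fixed-size fragment of data."""
    return {hashlib.sha1(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)}

def index_file(stored: bytes) -> dict:
    """Index the bytes as stored and, if they form a zlib stream,
    also the unpacked contents."""
    index = {"stored": chunk_hashes(stored)}
    try:
        index["unpacked"] = chunk_hashes(zlib.decompress(stored))
    except zlib.error:
        pass  # not a zlib stream: only the stored form is indexed
    return index

payload = bytes(range(256)) * 64
fast = index_file(zlib.compress(payload, 1))
best = index_file(zlib.compress(payload, 9))
raw = index_file(payload)
```

The two compressed copies share no stored fragments, yet their unpacked fragments match each other and the raw file, which is the kind of match the extra hash tables are meant to find.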
<toffer>	having three specialized coding routines increases code size just by 20kb	2009-12-02 16:38:08
<Shelwien>	well, just think about it in asm terms	2009-12-02 16:38:28
	its still a lot actually ;)	2009-12-02 16:38:31
<toffer>	it's 20% slower now... guess gcc didn't do inlining properly...	2009-12-02 16:39:42
<Shelwien>	;)	2009-12-02 16:40:18
<toffer>	yep the bit coding routine isn't inlined	2009-12-02 16:40:40
	well explicit template instantiation does the job	2009-12-02 16:46:02
	let's see how large the exe will be ^^	2009-12-02 16:46:08
<Shelwien>	i'd remind the idea from ccm_sh	2009-12-02 16:46:32
	you can separately compile multiple codec instances to separate object files	2009-12-02 16:47:00
	and only then link them together	2009-12-02 16:47:21
	its especially helpful if taking into account the PGO	2009-12-02 16:47:47
<sami>	http://compressionratings.com/s_ref.html the "new" test files	2009-12-02 16:47:58
<Shelwien>	did you see new Bulat's benchmark btw?	2009-12-02 16:48:16
<sami>	it appears sorting the n/a gets put into the top	2009-12-02 16:48:28
	no, where is it?	2009-12-02 16:48:32
<Shelwien>	http://encode.dreamhosters.com/showthread.php?t=507	2009-12-02 16:48:43
<sami>	just noticed that bwtmix1 didn't get tested in these ref files, will fix that	2009-12-02 16:49:04
<Shelwien>	hope it won't die	2009-12-02 16:49:19
	i mean, freeze ;)	2009-12-02 16:49:30
<toffer>	somehow that seems to hurt compiler optimizations	2009-12-02 16:49:43
<chornobl>	it won't grow too much either	2009-12-02 16:49:48
<toffer>	it's 10% slower now	2009-12-02 16:49:50
	>.<	2009-12-02 16:49:53
<Shelwien>	what does?	2009-12-02 16:50:02
<chornobl>	as the main purpose (i think) is to promote fa and new srep	2009-12-02 16:50:22
<Shelwien>	there's no sense to promote new srep (also its slow, especially decoding)	2009-12-02 16:50:54
	because people won't really care until he makes it internal	2009-12-02 16:51:17
<chornobl>	repack maniacs already care	2009-12-02 16:51:50
<Shelwien>	sami: http://encode.dreamhosters.com/showthread.php?p=10064#post10064	2009-12-02 16:52:34
*** pinc has left the channel		2009-12-02 17:09:58
<sami>	since bulat has public test file(s) that is reasonable and all switches are run already guarantees I pretty much like any test. seems that this is reasonable multithreading + long match test	2009-12-02 17:10:59
<Shelwien>	yeah, but i wonder about times	2009-12-02 17:12:02
<toffer>	somehow i get best gcc results when the encoding and decoding routines are separately compiled, but both in the same .cpp	2009-12-02 17:13:46
<sami>	the nz times don't look very positive, but I guess those are possible. io is much more expensive than fa and -cd is slower than -cD, which is only possible with some very huge long match	2009-12-02 17:14:06
	also I had to download the script to find out even how much memory is nz using, I wish that info would be on the tables	2009-12-02 17:15:04
<Shelwien>	;)	2009-12-02 17:15:19
	toffer: yeah, that's what i suggested too	2009-12-02 17:15:34
<toffer>	not really	2009-12-02 17:16:00
<Shelwien>	...meanwhile, it seems like i finally fixed that damned thing	2009-12-02 17:16:08
	and it works with gcc now	2009-12-02 17:16:12
<toffer>	i mean separate cpp for encoding and decoding instantiation hurt	2009-12-02 17:16:22
	but both inside the same helps a bit	2009-12-02 17:16:33
<Shelwien>	not sure what you mean then	2009-12-02 17:17:03
	do you use separate .o files for encoder and decoder, or not?	2009-12-02 17:17:28
<toffer>	codec<ENCODE> in enc.o and codec<DECODE> in dec.o separately hurts code generation after profiling. but both in one file helps	2009-12-02 17:18:01
	the thing which helps is to separate the codec instantiation from the driver code	2009-12-02 17:18:41
<Shelwien>	err... but you have to make different profiles for encoding and decoding, and use them properly	2009-12-02 17:20:23
<toffer>	i know	2009-12-02 17:21:16
	it's still weird	2009-12-02 17:21:24
	i got a command line switch to do both, encoding and decoding for profile generation	2009-12-02 17:21:44
<Shelwien>	yeah, but its bad actually	2009-12-02 17:22:01
	you see, the compiler would think that they work at once	2009-12-02 17:22:23
	(decoding and encoding)	2009-12-02 17:22:28
	it only collects numbers of occurrences on branches etc	2009-12-02 17:22:46
	but doesn't understand the order	2009-12-02 17:22:55
	so if in if(cond) branch1; else branch2;	2009-12-02 17:23:22
	branch1 is always taken in encoding	2009-12-02 17:23:29
	and branch2 in decoding	2009-12-02 17:23:33
	it'd think that branch1 probability is 0.5	2009-12-02 17:24:15
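A small sketch of the effect described above, assuming the profile is nothing more than per-branch execution counts. BranchProfile, merge and profile_run are hypothetical names for illustration, not a real compiler's profile format. A branch that always goes one way in encoding and the other way in decoding looks like a coin flip once both runs are summed.

```cpp
#include <cstdint>

// Hypothetical edge counters for one "if (cond) branch1; else branch2;".
struct BranchProfile {
    uint64_t taken1 = 0;  // times branch1 ran
    uint64_t taken2 = 0;  // times branch2 ran

    void record(bool cond) {
        if (cond) ++taken1; else ++taken2;
    }

    // Estimated probability of branch1 from the counts alone.
    double p_branch1() const {
        uint64_t total = taken1 + taken2;
        return total ? static_cast<double>(taken1) / total : 0.5;
    }
};

// Sum the counters of two runs, which is what a single combined
// encode+decode profiling run effectively produces.
BranchProfile merge(const BranchProfile& a, const BranchProfile& b) {
    BranchProfile m;
    m.taken1 = a.taken1 + b.taken1;
    m.taken2 = a.taken2 + b.taken2;
    return m;
}

// Simulate n passes through the branch with a fixed condition.
BranchProfile profile_run(bool cond, uint64_t n) {
    BranchProfile p;
    for (uint64_t i = 0; i < n; ++i) p.record(cond);
    return p;
}
```

With separate profiles the estimate is 1.0 for the encoder and 0.0 for the decoder; merged, it becomes 0.5.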
<toffer>	that makes no sense if both routines are separate	2009-12-02 17:24:46
<Shelwien>	it doesn't understand that	2009-12-02 17:24:57
	and it doesn't understand a thing about layouts	2009-12-02 17:25:15
	so it would just generate functions in order of parsing	2009-12-02 17:25:42
<toffer>	i don't see where the problem should be. there is a specialized function for encoding. it got stats for that. and for decoding there's a specialized function, too	2009-12-02 17:26:07
<Shelwien>	as i said... it thinks that they work both at once	2009-12-02 17:26:32
	so instead of optimizing each function alone	2009-12-02 17:26:58
	it would try to optimize "whole program"	2009-12-02 17:27:11
	28.547s 31.547s ccm_sh1d99	2009-12-02 17:28:25
	29.219s 29.891s ccm_sh1d9b	2009-12-02 17:28:25
	28.187s 29.515s ccm_sh1d9e # modular build	2009-12-02 17:28:25
<toffer>	yes, i understand that. but the odd thing i wanted to point out is that doing it that way hurts the generated code	2009-12-02 17:28:41
<Shelwien>	here first line has global PGO	2009-12-02 17:28:46
	and second decoder PGO	2009-12-02 17:28:50
	and third has both	2009-12-02 17:28:56
<toffer>	separating the driver program and the encoder,decoder helps	2009-12-02 17:29:17
	but having 3 separate components hurts	2009-12-02 17:29:25
	component = object file	2009-12-02 17:29:31
<Shelwien>	well, i had 3 and it helped	2009-12-02 17:29:39
	of course there're various alignment quirks etc	2009-12-02 17:29:52
	which i avoided by using different COFF sections for modules	2009-12-02 17:30:13
	dunno how to do it with gcc though	2009-12-02 17:30:22
<toffer>	i'm gonna do some exact speed tests now. up until now i get 4 models compressing enwik7 in 4.99secs. a single m1 took 2.1sec :D	2009-12-02 17:32:57
	somehow i don't understand why it scales better than linear	2009-12-02 17:33:14
<Shelwien>	memory lookups overlap with computing?	2009-12-02 17:33:42
<toffer>	dunno	2009-12-02 17:33:51
	but it looks odd to me	2009-12-02 17:33:55
	gonna be back in 30 mins	2009-12-02 17:34:00
	bye	2009-12-02 17:34:03
<Shelwien>	sami?	2009-12-02 17:34:17
*** toffer has left the channel		2009-12-02 17:34:25
<sami>	Shelwien, did I miss something? I'm now looking at your mtf stuff	2009-12-02 17:43:11
<Shelwien>	http://ctxmodel.net/files/mix_test/gmtf_v1.rar	2009-12-02 17:43:24
	supposedly i made it to work with gcc	2009-12-02 17:43:42
	please check if you can	2009-12-02 17:43:46
	and as to mtf, the version w/o coroutines might be easier to read - http://ctxmodel.net/files/mix_test/gmtf_v0.rar	2009-12-02 17:44:33
*** Krugz has left the channel		2009-12-02 17:46:01
<sami>	g++ compiles it, but -Wall spills out a lot of stuff	2009-12-02 17:46:57
<Shelwien>	didn't check that, the question is whether it works at all, or crashes	2009-12-02 17:47:36
	g++ mtf.cpp -o mtf	2009-12-02 17:47:53
	./mtf c book1bwt 1	2009-12-02 17:48:01
	./mtf d 1 2	2009-12-02 17:48:04
	should produce file 2 identical to book1bwt	2009-12-02 17:48:19
<sami>	works fine for book1rbwt	2009-12-02 17:48:47
<Shelwien>	ok, great	2009-12-02 17:48:58
	do you know a name for that weird MTF version then?	2009-12-02 17:49:15
<sami>	hopefully I understand soon what the setjmps hackery is	2009-12-02 17:49:24
<Shelwien>	setjmp hackery is http://en.wikipedia.org/wiki/Coroutine	2009-12-02 17:49:43
	after this i'm going to try writing all the coders in this style	2009-12-02 17:50:37
	it allows using memory buffers with fast enough access to everything	2009-12-02 17:51:12
	and also allows writing completely separate modules with a simple API	2009-12-02 17:51:47
	its not really necessary in this MTF example	2009-12-02 17:52:04
	but already for something like Unicode-to-UTF8 converter	2009-12-02 17:53:14
	the main loop with similar buffering would be much messier	2009-12-02 17:53:33
	and with rangecoders	2009-12-02 17:53:52
	I didn't really ever see a good library with a universal API	2009-12-02 17:54:22
<sami>	don't know what to call this. I've seen a lot of this kind of stuff, I don't recall what were they called. I'm not saying I've seen exactly this though. do you have results for this?	2009-12-02 18:04:55
<Shelwien>	what kind? i can benchmark v0 vs v1 if you want, but its not very sensible	2009-12-02 18:05:47
<sami>	probably this is novel anyway	2009-12-02 18:05:59
	I mean this kind of symbol ranking variants	2009-12-02 18:06:26
<Shelwien>	what's interesting	2009-12-02 18:06:43
	is that it gains ~3k vs plain MTF	2009-12-02 18:07:03
	(after entropy coding of book1rbwt)	2009-12-02 18:07:18
	and also it might be actually faster than MTF	2009-12-02 18:07:38
	because rank updates are skipped sometimes	2009-12-02 18:08:01
<sami>	what about the mtf that moves to rank 1 instead of 0 (and only to zero from one)?	2009-12-02 18:08:11
<Shelwien>	well, i can try that	2009-12-02 18:08:47
	btw, this MTF topic appeared	2009-12-02 18:09:16
	because of unary coding actually ;)	2009-12-02 18:09:21
<sami>	also to rank 2 instead of zero and only from <2 to 0	2009-12-02 18:09:30
<Shelwien>	as unary coding uses some ranking	2009-12-02 18:09:33
	235959, 232079, 231772, 229496	2009-12-02 18:15:23
	mtf+fpaq0p, mtf1, mtf2, gMTF	2009-12-02 18:16:07
<sami>	ok, nice	2009-12-02 18:16:34
<Shelwien>	mtf1 updates rank to rank<2?0:1	2009-12-02 18:16:39
	mtf2 - rank<3?0:2	2009-12-02 18:16:45
	its very easy to modify gMTF.inc to do that actually	2009-12-02 18:17:06
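A rough sketch of the mtf1/mtf2 update rules quoted above, written as one parameterized rule: a symbol found at rank r moves to rank 0 if r <= k, otherwise to rank k. With k = 0 this is plain MTF, k = 1 gives "rank<2?0:1" and k = 2 gives "rank<3?0:2". RankList, mtf_encode and mtf_decode are hypothetical names, and this is not the gMTF code from gMTF.inc, whose update rule is not spelled out in the log.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Rank table shared by encoder and decoder.
class RankList {
public:
    explicit RankList(unsigned k) : k_(k), order_(256) {
        std::iota(order_.begin(), order_.end(), 0);  // rank i holds symbol i
    }
    // Linear search for the current rank of a symbol.
    unsigned rank_of(uint8_t sym) const {
        unsigned r = 0;
        while (order_[r] != sym) ++r;
        return r;
    }
    uint8_t symbol_at(unsigned r) const { return order_[r]; }
    // Move the symbol at rank r to rank 0 if r <= k, otherwise to rank k.
    void update(unsigned r) {
        unsigned target = (r <= k_) ? 0 : k_;
        uint8_t sym = order_[r];
        for (unsigned i = r; i > target; --i) order_[i] = order_[i - 1];
        order_[target] = sym;
    }
private:
    unsigned k_;
    std::vector<uint8_t> order_;
};

// Replace every symbol by its current rank, then update the table.
std::vector<uint8_t> mtf_encode(const std::vector<uint8_t>& in, unsigned k) {
    RankList list(k);
    std::vector<uint8_t> out;
    out.reserve(in.size());
    for (uint8_t s : in) {
        unsigned r = list.rank_of(s);
        out.push_back(static_cast<uint8_t>(r));
        list.update(r);
    }
    return out;
}

// Inverse transform: look up the symbol at each rank, same table updates.
std::vector<uint8_t> mtf_decode(const std::vector<uint8_t>& ranks, unsigned k) {
    RankList list(k);
    std::vector<uint8_t> out;
    out.reserve(ranks.size());
    for (uint8_t r : ranks) {
        out.push_back(list.symbol_at(r));
        list.update(r);
    }
    return out;
}
```

For the input bytes 5, 5, 5 the ranks come out as 5, 0, 0 with k = 0, as 5, 1, 0 with k = 1, and as 5, 2, 0 with k = 2.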
<sami>	although probably more testing would be needed, I mean some basic bwt fenwick structured model before we could say you killed mtf with this	2009-12-02 18:17:59
	can you do one more quick test with obj2, mtf2 vs gmtf?	2009-12-02 18:18:27
	or some other binary file	2009-12-02 18:18:36
<Shelwien>	obj2 or obj2bwt?	2009-12-02 18:18:39
<sami>	obj2bwt yeah	2009-12-02 18:18:46
<Shelwien>	ok, wait	2009-12-02 18:18:52
<sami>	the more testing is because the mtfs may be just too quick for fpaq adapt speed, so it may favour your method	2009-12-02 18:21:12
<Shelwien>	79724, 82177, 87487	2009-12-02 18:21:27
	mtf, mtf2, gmtf	2009-12-02 18:21:32
<sami>	ok	2009-12-02 18:21:47
<Shelwien>	gmtf has a parameter though	2009-12-02 18:22:26
*** schnaader has joined the channel		2009-12-02 18:31:01
<sami>	so did anybody check the new benchmark data?	2009-12-02 18:47:22
<Shelwien>	yours? i did open it... and didn't see any benchmark results afair...	2009-12-02 18:47:56
<sami>	the links should be at the top of the page	2009-12-02 18:48:17
<Shelwien>	yeah	2009-12-02 18:48:27
	i didn't get that actually ;)	2009-12-02 18:48:44
	thought that links on files go to file data ;)	2009-12-02 18:49:09
<sami>	ok, perhaps I can try to work out something to keep that from happening :-)	2009-12-02 18:49:59
<Shelwien>	results for book1 seem kinda weird... do they include decoder size?	2009-12-02 18:50:26
<sami>	the second number is without decoder	2009-12-02 18:50:47
	I mean the "w/o stub" column	2009-12-02 18:50:59
	so xwrt is leading in book1 if we don't take into account the one megabyte dictionary	2009-12-02 18:51:46
<Shelwien>	i think maybe you should add a coefficient to it or something	2009-12-02 18:52:06
	because i can compile much smaller ash for sure	2009-12-02 18:52:15
<sami>	unfortunately that doesn't work because some programs use sfx	2009-12-02 18:52:45
	or perhaps we can just do it anyway and ignore the sfx issue like now	2009-12-02 18:53:12
	anyway, the whole point is to keep the decoder small	2009-12-02 18:53:59
<Shelwien>	but ppmy showing "better" results than paqs etc is just dumb	2009-12-02 18:54:22
<sami>	book1,obj2,geo are too small for test files	2009-12-02 18:54:29
	I'm just including them for reference	2009-12-02 18:54:41
<Shelwien>	so i suggest to add a decoder size coefficient	2009-12-02 18:54:48
	like if you compressed 10 such small files	2009-12-02 18:55:02
*** jj has joined the channel		2009-12-02 18:55:50
<schnaader>	Have you checked what this precompressed part of FlashMX.pdf actually includes? There are some big images in it, worst case could be that this is mainly testing image compression, although this wouldn't be that unusual for typical PDFs.	2009-12-02 18:55:57
<Shelwien>	yeah, probably	2009-12-02 18:56:21
<sami>	schnaader, no unfortunately I didn't have time to check it	2009-12-02 18:56:30
<schnaader>	I think I'll have a look at it here.	2009-12-02 18:57:37
<sami>	but yes, I recognize that may be possible, that's why I didn't cut ohs.doc or vcfiu.hlp because I might just be sampling something less interesting	2009-12-02 18:57:38
	Shelwien, so you suggest 0.1 is a good value?	2009-12-02 18:58:39
<Shelwien>	well, i think yes	2009-12-02 18:59:06
<sami>	perhaps I must provide a third size which has such coef, because I cannot replace the main size column because of compressors that use sfx	2009-12-02 18:59:34
	it's also a drawback of the whole system that I cannot easily configure programs to run these tests with no sfx	2009-12-02 19:00:00
<schnaader>	Actually, first 5 MB of FlashMX.pdf seem to be rather well mixed. There's about 3 or maybe 4 MB of it that's image content, but there also is a lot of text in it.	2009-12-02 19:03:42
<sami>	schnaader so we got lucky :-)	2009-12-02 19:04:55
	can you see a better offset there?	2009-12-02 19:05:06
<Shelwien>	yeah, but sorting by combined size makes it all weird	2009-12-02 19:05:36
<sami>	so that perhaps we could sample less of the image?	2009-12-02 19:05:36
	Shelwien you can sort the tables by clicking at the column	2009-12-02 19:06:15
<Shelwien>	and get lots of n/a first, yeah ;)	2009-12-02 19:06:58
<sami>	right, but I will fix that	2009-12-02 19:07:23
<schnaader>	The part after the first big image (1,2 MB decompressed) seems fine, there's another big image block (2*~1,5 MB) later but I think that's far away from it. So you could try 5 MB with a 2 or 3 MB offset, but as images are pretty mixed up with text, I think it might not be worth the effort and you could just leave it like it is :)	2009-12-02 19:08:53
<sami>	please reload the pages I forgot something, now they should look as they're supposed to	2009-12-02 19:09:23
<Shelwien>	could you add some visible separators between column titles too?	2009-12-02 19:10:05
<sami>	schnaader, ok. thanks	2009-12-02 19:10:07
	Shelwien there should be pseudoseparators there already, but not between ct & dt (and not between cm & dm)	2009-12-02 19:12:02
<Shelwien>	i mean like "Size | w/o stub" instead of "Size w/o stub"	2009-12-02 19:12:49
	i checked in chrome and still don't see any separators there	2009-12-02 19:13:06
<sami>	ok not on those table headers. I'll try to figure something out	2009-12-02 19:13:43
*** toffer has joined the channel		2009-12-02 19:22:30
* Guest4706822 slaps toffer around a bit with a large fishbot		2009-12-02 19:24:21
<schnaader>	Ouch.. no trouts here? I guess we'd better not misbehave...	2009-12-02 19:24:52
<Guest4706822>	not sure toffer's awake	2009-12-02 19:25:08
<Shelwien>	sleepwalking?	2009-12-02 19:25:42
*** Guest4706822 has left the channel		2009-12-02 19:27:31
<schnaader>	That would be nice sleepwalking - "Last night I sleepwalked, logged in to IRC and coded some really nice compressors, now I have to understand what I did there, but results are really impressive" :D	2009-12-02 19:32:53
<Shelwien>	it happens sometimes with me, when i have to get up and suddenly do something	2009-12-02 19:34:48
	might not remember what i did later, especially if i'd return to sleeping after that ;)	2009-12-02 19:35:06
<sami>	I think 0.1 is too little. 100kb dictionary becomes only 10kb	2009-12-02 19:58:15
<Shelwien>	http://encode.dreamhosters.com/showthread.php?t=509	2009-12-02 19:58:34
<sami>	you state there that it's better than mtf. too early. i suggest increasing the fpaq0 adapt speed a bit for mtf	2009-12-02 20:01:32
<Shelwien>	i test it not only with fpaq0 there, but also with mix_test o2 coder too	2009-12-02 20:02:05
<sami>	do you have results for mtf2+mixtest vs gmtf+mixtest?	2009-12-02 20:03:02
<Shelwien>	wait...	2009-12-02 20:03:17
<sami>	I've never gotten around to writing a tool for myself to have various simple models at hand for tests like this. many times it would be useful	2009-12-02 20:04:16
<Shelwien>	mtf:224696, mtf2:221903, gmtf:221140	2009-12-02 20:04:23
	well, i always just did it in the form of toolkits somehow	2009-12-02 20:04:54
	unfortunately lots of such tools I had written in asm	2009-12-02 20:06:02
	and they're not quite usable these days	2009-12-02 20:06:14
<schnaader>	But it's not like there's no more assembler out there ;) You could try to convert the code to fasm or some Windows assembler like masm32 (these are nice, it's possible to do WinApi calls with them), although with assembler code that's quite hard if it's old and you don't know exactly what it does anymore.	2009-12-02 20:07:53
<Shelwien>	unfortunately its tasm	2009-12-02 20:08:33
	with very heavy use of macros and other specific things	2009-12-02 20:08:49
	and then, also these are DOS-32 programs using DPMI	2009-12-02 20:09:15
	they still work under XP now, in fact	2009-12-02 20:09:27
	like I have a 2k old PPM implementation etc	2009-12-02 20:09:41
<schnaader>	Ah OK, know these DOS memory things from old PowerBasic programs :) The days when you couldn't simply say "Give me 5 MB of memory"... better not port those monsters, yeah	2009-12-02 20:10:46
<Shelwien>	they're quite cool in fact	2009-12-02 20:11:27
	i'd like very much to have a preprocessor like in masm/tasm for C++	2009-12-02 20:11:48
	(though I use perl for that now)	2009-12-02 20:11:59
	its multipass and worked in a style of declarative programming to a point	2009-12-02 20:12:48
	like, i use a parity align macro somewhere	2009-12-02 20:13:07
	it aligned functions in such a way that parity of low byte of their address was fixed	2009-12-02 20:13:41
	was useful because of PF flag in x86 and JP/SETP etc stuff	2009-12-02 20:14:04
	well, its kinda like what C++ templates could be, but are not ;)	2009-12-02 20:14:42
<schnaader>	Hehe, well C/C++ has never been perfect, although there were some attempts to improve it, but I guess it's just kind of too popular now so everybody wants to improve different things.	2009-12-02 20:16:02
<Shelwien>	here's an example: http://91.124.210.5/lng-ppm.txt	2009-12-02 20:16:04
	all the unknown keywords are my macros basically	2009-12-02 20:16:58
	like functions called by calls are only linked into the program when they're really called from somewhere ;)	2009-12-02 20:17:31
	i'd probably program like that even now	2009-12-02 20:18:28
<sami>	Shelwien, how about this: subtract min(decoder size, median decoder size)*0.9 from the size	2009-12-02 20:18:45
<Shelwien>	it was much more powerful compared to C/C++, as weird as it may sound	2009-12-02 20:18:47
<schnaader>	:) "dvd" is funny, guess it means "define variable data"	2009-12-02 20:19:21
<Shelwien>	yeah ;)	2009-12-02 20:19:25
<sami>	also we could do that for sfx programs as well, to approximate the decoder size	2009-12-02 20:19:27
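A quick sketch of the size adjustment sami proposes above, assuming the main size column includes the decoder (the "w/o stub" column being the one without it). The function names and the choice of the upper median for even counts are assumptions made for illustration; this is not the script behind the site's tables.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Median of the decoder sizes in a table (upper median for even counts).
uint64_t median_size(std::vector<uint64_t> sizes) {
    if (sizes.empty()) return 0;
    auto mid = sizes.begin() + sizes.size() / 2;
    std::nth_element(sizes.begin(), mid, sizes.end());
    return *mid;
}

// total includes the decoder. Subtract 90% of the decoder size, with the
// decoder size capped at the median, so that a small decoder ends up
// counted at 0.1 of its size while a huge one is still mostly counted.
uint64_t adjusted_size(uint64_t total, uint64_t decoder, uint64_t median) {
    uint64_t credit = std::min(decoder, median) * 9 / 10;
    return total - std::min(credit, total);
}
```

With a median decoder of 50000 bytes, a 20000 byte decoder gets 18000 bytes subtracted, while a one megabyte dictionary only gets 45000 bytes subtracted.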
<Shelwien>	not that i can say anything now	2009-12-02 20:19:47
	we only would be able to decide after looking at resulting order i think ;)	2009-12-02 20:20:13
	and as i see it, XWRT winning at book1 is wrong, but PPMY winning is wrong too	2009-12-02 20:21:13
<sami>	well, again book1 is too small for comparing 2 or more compressors as I explain in the text	2009-12-02 20:22:23
	ppmy still wins :-) http://compressionratings.com/sort.cgi?s_book1a.full.html+5+n	2009-12-02 20:26:26
	this is x-min(stub/10,100000)	2009-12-02 20:27:10
	no	2009-12-02 20:27:24
	error	2009-12-02 20:27:26
	this is correct I hope http://compressionratings.com/sort.cgi?s_book1c.full.html+5+n	2009-12-02 20:29:16
*** toffer has left the channel		2009-12-02 20:30:10
*** toffer has joined the channel		2009-12-02 20:43:40
*** toffer has left the channel		2009-12-02 20:44:43
	probably I need to make additional configs for sfx compressors for this test. I'll try to do that next weekend	2009-12-02 21:01:21
*** sami has left the channel		2009-12-02 21:01:45
*** schnaader has left the channel		2009-12-02 21:07:00
*** chornobl has left the channel		2009-12-02 21:09:52
*** Shelwien has left the channel		2009-12-02 21:14:49
*** Guest9968193 has joined the channel		2009-12-02 21:14:53
*** STalKer-Y has left the channel		2009-12-02 21:22:29
*** STalKer-X has joined the channel		2009-12-02 21:23:49
*** STalKer-X has left the channel		2009-12-02 21:46:11
*** STalKer-X has joined the channel		2009-12-02 21:56:39
*** STalKer-X has left the channel		2009-12-02 21:56:43
*** toffer has joined the channel		2009-12-02 22:16:22
*** toffer has left the channel		2009-12-02 23:47:32
*** Krugz has joined the channel		2009-12-03 00:58:43
*** bobzilla has joined the channel		2009-12-03 05:48:07
*** pinc has joined the channel		2009-12-03 06:27:06
*** pinc has left the channel		2009-12-03 06:29:17
*** pinc has joined the channel		2009-12-03 06:40:22
*** pinc has left the channel		2009-12-03 06:41:34
*** bobzilla has left the channel		2009-12-03 06:55:42
*** pinc has joined the channel		2009-12-03 07:48:45
*** STalKer-X has joined the channel		2009-12-03 10:23:26
<Shelwien>	...	2009-12-03 11:06:53
<STalKer-X>	o_o	2009-12-03 11:19:41
* Shelwien goes to bring in another bot		2009-12-03 11:21:38
*** compbooks has joined the channel		2009-12-03 11:27:44
<Shelwien>	!list	2009-12-03 11:28:48
<Krugz>	compbooks?	2009-12-03 11:51:43
<Shelwien>	only DCC articles for now	2009-12-03 11:52:05
<Krugz>	what do you plan to do with it? load it up with computer-related books?	2009-12-03 11:52:31
<Shelwien>	more like compression-related ;)	2009-12-03 11:52:42
<Krugz>	ahh ok	2009-12-03 11:52:46
	sounds good	2009-12-03 11:52:49
	I've been way too busy lately, but in a little while I'll have time to sit down and learn enough to be helpful, or at least interesting, around here	2009-12-03 11:53:27
* Krugz hasn't slept yet, has class in an hour		2009-12-03 11:53:43
<Shelwien>	i don't think you really have to learn anything	2009-12-03 11:54:27
<Krugz>	?	2009-12-03 11:54:34
<Shelwien>	i'm willing to talk about quite a lot of different things ;)	2009-12-03 11:54:42
<Krugz>	ya but I'm interested in data compression, not extremely or anything but enough that I'd be willing to sit down and learn more	2009-12-03 11:55:10
	just don't have the time recently, plus not exactly sure where to get started	2009-12-03 11:55:22
<Shelwien>	statistics probably	2009-12-03 11:55:45
<Krugz>	really? hmm	2009-12-03 11:55:55
<Shelwien>	not whole course maybe	2009-12-03 11:56:22
<Krugz>	alright well I'll look into it when I get some time	2009-12-03 11:57:18
	I have a bit more work to finish off, and then I have to study for finals	2009-12-03 11:57:35
<Shelwien>	but things like this: http://en.wikipedia.org/wiki/Maximum_likelihood#Examples	2009-12-03 11:57:42
<Krugz>	but after that, I'm clear to learn whatever I feel like for a long while	2009-12-03 11:57:46
<Shelwien>	well, i'm not going anywhere as far as i can see	2009-12-03 11:58:23
<Krugz>	ah don't worry about it, I'm not going to drag you around to help me learn stuff :P	2009-12-03 11:59:01
	if you suggest where to start and stuff, that should be good :O	2009-12-03 11:59:11
<Shelwien>	well, there's kinda no publications on real compression algorithms	2009-12-03 11:59:53
	so i'd have to help one way or another	2009-12-03 12:00:10
<Krugz>	ya but I'm far from doing anything with an actual application	2009-12-03 12:00:14
<Shelwien>	"actual application" is something somewhat unrelated too, in fact ;)	2009-12-03 12:00:58
<Krugz>	I looked up BWT just a while ago, I understand the basic idea but there's definitely stuff I need to know before I really look into it	2009-12-03 12:01:07
	ah not "actual application", I meant like, I'm far from being able to understand how a compression algorithm would work	2009-12-03 12:01:44
<Shelwien>	most people only use zip despite availability of compressors with much better performance	2009-12-03 12:01:58
<Krugz>	I use rar mostly	2009-12-03 12:02:08
<Shelwien>	same rar	2009-12-03 12:02:12
	but rar is no better than zip really	2009-12-03 12:02:19
<Krugz>	I don't really need anything compressed much, I'm sloppy with my data	2009-12-03 12:02:26
<Shelwien>	same here	2009-12-03 12:02:41
<Krugz>	I just use it for packaging things to be sent around in one piece	2009-12-03 12:02:46
<Shelwien>	but as i said, there're lots of applications for statistical models	2009-12-03 12:02:54
	and the best way to evaluate such models is by compression	2009-12-03 12:03:09
<Krugz>	hmm	2009-12-03 12:03:31
	so using compression as a tool to test models?	2009-12-03 12:03:41
<Shelwien>	for example, i'm thinking about making a talking bot here	2009-12-03 12:03:47
	which would generate text using a statistical model, by channel log data	2009-12-03 12:04:05
	for me, yes	2009-12-03 12:04:22
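One way to read "evaluate models by compression", as a minimal sketch: the ideal code length of a text under a model is the sum of -log2(p) over the symbols, and a model that predicts the data better gives a smaller total. The adaptive order-0 byte model below (every count starts at 1) is only an assumed toy example, not any of the coders mentioned in this log.

```cpp
#include <cmath>
#include <cstdint>
#include <string>

// Ideal code length, in bits, of `text` under an adaptive order-0 model:
// every byte value starts with count 1, and the count of a byte is
// incremented after that byte has been coded. A good arithmetic coder
// driven by the same probabilities would come close to this number.
double code_length_bits(const std::string& text) {
    uint32_t count[256];
    for (uint32_t& c : count) c = 1;
    uint32_t total = 256;
    double bits = 0.0;
    for (unsigned char ch : text) {
        bits += -std::log2(static_cast<double>(count[ch]) / total);
        ++count[ch];
        ++total;
    }
    return bits;
}
```

Comparing two models then comes down to comparing the bit totals they produce on the same data.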
<Krugz>	ah ok	2009-12-03 12:04:45
	I see what you're saying, I think	2009-12-03 12:04:52
	jeez.. I'm getting really tired all at once	2009-12-03 12:05:13
<Shelwien>	;)	2009-12-03 12:05:19
<Krugz>	not because of you, lol	2009-12-03 12:05:23
	just lack of sleep	2009-12-03 12:05:27
	hitting me just now	2009-12-03 12:05:33
<Shelwien>	i just got up not long ago ;)	2009-12-03 12:06:02
<Krugz>	I woke up about 24 hours ago, now	2009-12-03 12:06:27
	I found an interesting e-book	2009-12-03 12:07:00
	it's a puzzle book, I find those interesting from time to time	2009-12-03 12:07:13
	this one was pretty well written, the puzzles are definitely entertaining	2009-12-03 12:07:32
	ok I have to go shower, and maybe get something to eat	2009-12-03 12:08:53
	I don't think I'll be back today/tonight	2009-12-03 12:09:24
	bye bye	2009-12-03 12:09:28
*** Krugz has left the channel		2009-12-03 12:09:35
*** Shelwien has left the channel		2009-12-03 12:09:50
*** Shelwien has joined the channel		2009-12-03 12:09:55
<Shelwien>	!next	2009-12-03 12:10:06