<STalKer-X> :-)2009-12-15 08:24:56
<Shelwien> its a command to split the log actually ;)2009-12-15 08:34:20
<STalKer-X> it does not do it automagically?2009-12-15 08:35:02
<Shelwien> no, it seemed wrong to me2009-12-15 08:35:34
<STalKer-X> i wonder when people will write some really good algorithm for defragmenting2009-12-15 08:36:38
<Shelwien> doesn't look like something requiring a good algorithm to me2009-12-15 08:37:15
 i mean, imho there's no problem to find the optimal transformation2009-12-15 08:37:34
 as the initial layout and target layout are clearly known2009-12-15 08:37:58
<STalKer-X> but its obviously not implemented when i see how programs work2009-12-15 08:38:02
<Shelwien> and available operations too2009-12-15 08:38:05
 who knows2009-12-15 08:38:24
<STalKer-X> large files don't get optimized very well if the space is limited2009-12-15 08:38:32
<Shelwien> i guess they care about safety too much2009-12-15 08:38:42
 like, i've seen defrags updating dir entries and stuff after moving each cluster ;)2009-12-15 08:39:18
<STalKer-X> yeah :x2009-12-15 08:39:34
<Shelwien> but well, i guess they don't really care anymore2009-12-15 08:39:47
 its similar to how there are no tools like scandisk etc. anymore2009-12-15 08:40:19
<pinc> native api guarantees safety, so thats clearly a problem with algorithm2009-12-15 08:40:22
<Shelwien> err... no it doesn't guarantee anything2009-12-15 08:40:39
 the idea is that the partition state should be always valid2009-12-15 08:41:25
 so that even if the program crashes or machine reboots2009-12-15 08:41:38
 the partition would be still accessible2009-12-15 08:41:50
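
(The "always valid" ordering looks roughly like the toy sketch below, shown on an in-memory model rather than any real filesystem API: the old copy stays referenced until the metadata switch is made, so interrupting any step leaves a consistent layout.)

#include <cstddef>
#include <cstring>
#include <vector>

const int CLUSTER = 4096;
std::vector<char>      disk(1 << 20);     // simulated clusters
std::vector<long long> fileMap;           // cluster number of each extent of one file

void SafeMoveExtent(size_t extent, long long newLcn) {
  long long oldLcn = fileMap[extent];
  // 1. copy the data; the old copy is still what the metadata points to
  std::memcpy(&disk[newLcn * CLUSTER], &disk[oldLcn * CLUSTER], CLUSTER);
  // 2. (real code would flush the device cache here, before touching metadata)
  // 3. switch the metadata to the new copy, then flush again
  fileMap[extent] = newLcn;
  // 4. only now may oldLcn go back to the free pool; a crash at any earlier
  //    step still leaves the partition pointing at a complete, valid copy
}
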
<STalKer-X> there is a windows built-in tool, though2009-12-15 08:43:16
<Shelwien> so?2009-12-15 08:43:36
<STalKer-X> what else do you need :D2009-12-15 08:44:16
<Shelwien> ah, you mean chkdsk?2009-12-15 08:44:40
<STalKer-X> yes2009-12-15 08:44:47
<Shelwien> it doesn't work2009-12-15 08:44:48
<STalKer-X> does not?2009-12-15 08:44:55
<Shelwien> in less trivial cases you have no choice but to use some real partition repair software2009-12-15 08:45:21
 like "getdataback"2009-12-15 08:45:26
 and these don't let you fix anything2009-12-15 08:45:45
 instead, they only let you copy the contents of a broken partition to another place2009-12-15 08:46:14
 which might be wise sometimes2009-12-15 08:46:29
 but is very annoying in most cases2009-12-15 08:46:36
 like, you need an empty 2T drive2009-12-15 08:47:13
 to recover the data from a full 2T drive with broken MFT (8k) or boot sector (0.5k)2009-12-15 08:48:06
<STalKer-X> file systems are complicated2009-12-15 08:48:38
<Shelwien> not really2009-12-15 08:48:49
<pinc> Shelwien: native api on ntfs works seamlessly2009-12-15 08:49:05
<Shelwien> NTFS is not really more complex than FAT32 on low level2009-12-15 08:49:10
 pinc: what do you mean? it can't prevent RAM bugs or machine reboots or other hardware problems2009-12-15 08:49:59
<pinc> it doesn't matter. mft has 2 copies, updated transactionally, so your data won't be affected by defragmentation2009-12-15 08:51:01
<Shelwien> yeah, so i already had 2 cases of winxp killing both mft copies2009-12-15 08:51:31
 and many other cases of it doing other weird stuff2009-12-15 08:51:43
 i have 20+ hdds online here, so i have some experience ;)2009-12-15 08:52:44
<pinc> using native defrag api? never heard about such cases2009-12-15 08:53:24
<Shelwien> without defrag or anything at all2009-12-15 08:53:45
 just that winxp keeps constantly rewriting mfts on all attached drives2009-12-15 08:54:09
 and if a drive goes a little crazy, then winxp kills it like that2009-12-15 08:54:41
 obviously if that happens during defrag, it'd be even worse2009-12-15 08:54:58
 and cheap non-server hdds are not that stable2009-12-15 08:57:21
 they're practically guaranteed to get stuck if you'd try to keep them working for weeks2009-12-15 08:57:59
<STalKer-X> hmmm2009-12-15 09:00:07
<pinc> you're mixing things up. I'm saying that defrag won't kill your data if your hardware works properly, despite reboots2009-12-15 09:00:40
<Shelwien> as to that, its exactly what i was explaining to stalker here2009-12-15 09:01:02
<pinc> if your drives are faulty - its not the software's fault that they lose data2009-12-15 09:01:02
<Shelwien> as the reason to defrag being inefficient2009-12-15 09:01:14
 <STalKer-X> i wonder when people will write some really good algorithm for defragmenting2009-12-15 09:01:28
 and yeah, its a problem of software still2009-12-15 09:01:53
 its kinda the same with http being unfit for transfer of large files2009-12-15 09:05:22
 hardware and low-level protocols are supposed to be stable2009-12-15 09:06:12
 but they're not really, in a long run2009-12-15 09:06:19
 so there's no other way except for compensating for that in software2009-12-15 09:06:44
<STalKer-X> quite funny that o&o defrag hangs at one file2009-12-15 10:15:25
<Shelwien> %)2009-12-15 10:15:38
<STalKer-X> dunno why2009-12-15 10:16:45
 maybe i should use another one2009-12-15 10:23:11
 why is it impossible to defragment a 10MB file when you have 5MB of space?2009-12-15 13:03:34
<Shelwien> dunno2009-12-15 13:19:03
 likely depends on defrag program 2009-12-15 13:19:17
<STalKer-X> not a single one does it2009-12-15 13:19:57
 they all depend on a rule that says: "we need large enough free space to fit the whole file in to defragment it successfully" :x2009-12-15 13:20:39
 free contiguous space2009-12-15 13:20:53
 well, oo defrag at least tries to move files around to get this free space ;x2009-12-15 13:21:42
 but since it hangs... too bad o_o2009-12-15 13:21:59
<Shelwien> well, maybe its requirement of defrag api and all of them use it ;)2009-12-15 13:21:59
<STalKer-X> which would be kind of dumb :x2009-12-15 13:23:55
<Shelwien> yeah, but there's also checkdisk api, and no alternate disk fixers since it appeared2009-12-15 13:25:25
*** Krugz has left the channel2009-12-15 14:23:22
*** scott___ has joined the channel2009-12-15 15:19:30
*** scott___ has left the channel2009-12-15 15:29:32
*** pinc has left the channel2009-12-15 16:54:01
*** pinc has joined the channel2009-12-15 19:14:17
*** mike_____ has joined the channel2009-12-15 19:19:25
*** pinc has left the channel2009-12-15 20:33:04
*** mike_____ has left the channel2009-12-15 20:37:44
<STalKer-X> bleh, both diskeeper and oo defrag have problems o_o2009-12-15 21:23:46
 i best get rid of them :x2009-12-15 21:24:01
 and people want 50 euro update price every year for basically no change :x2009-12-15 21:24:45
*** Shelwien has left the channel2009-12-15 22:08:08
*** Shelwien has joined the channel2009-12-15 22:10:42
<Shelwien> i suspect that they get their money for support2009-12-15 23:43:53
 because its hard to recover the partition after a failed defrag otherwise ;)2009-12-15 23:44:10
*** STalKer-X has left the channel2009-12-16 00:11:36
*** STalKer-X has joined the channel2009-12-16 00:27:25
*** Krugz has joined the channel2009-12-16 00:38:00
*** Shelwien has left the channel2009-12-16 01:44:25
*** Shelwien has joined the channel2009-12-16 01:44:29
*** STalKer-Y has joined the channel2009-12-16 04:47:53
*** STalKer-X has left the channel2009-12-16 04:49:30
*** schnaader has joined the channel2009-12-16 06:08:25
<schnaader> good morning everyone2009-12-16 06:08:44
 meh, shelwien, your site isn't wget-friendly :) wget http://shelwien.googlepages.com/fma-diff_v0.rar fails2009-12-16 06:11:14
 correction: it works, but the output file has to be specified (-Ofma-diff_v0.rar)2009-12-16 06:12:27
<Shelwien> hi2009-12-16 06:12:56
<schnaader> hi :)2009-12-16 06:12:59
<Shelwien> its damned google2009-12-16 06:13:17
 it was ok before2009-12-16 06:13:30
 so i used it for fast uploading via browser2009-12-16 06:13:41
 but now it shows 20 files per page and has no search2009-12-16 06:14:03
 and has that redirect to google sites2009-12-16 06:14:22
<schnaader> otoh, it's free :) how much webspace is provided?2009-12-16 06:16:03
<Shelwien> dunno how's it now2009-12-16 06:16:21
 was 100M and 500 files before2009-12-16 06:16:27
<schnaader> yes, 100M is what I heard a while ago, too2009-12-16 06:16:44
<Shelwien> and as one can expect, i have 98M and 496 files there ;)2009-12-16 06:16:53
 ah, also 10M per file limit2009-12-16 06:17:02
<schnaader> restricting to 500 files is lame :) it's like what they wanted to do for Windows 7 Starter - "You can only start 3 programs in parallel"2009-12-16 06:17:29
 otoh, you can store some information in the filenames :)2009-12-16 06:17:56
<Shelwien> they also had image conversion2009-12-16 06:18:11
 even stuff like png to jpeg2009-12-16 06:18:20
 i had to disable it2009-12-16 06:18:28
<schnaader> :) OK, at least they allow to disable2009-12-16 06:18:47
<Shelwien> well, i already had to find a new way to post temp links2009-12-16 06:19:30
 so maybe i'd just copy the stuff from googlepages to there2009-12-16 06:19:59
<schnaader> guess it's better to do such things if you provide webspace for the masses - most of them will probably upload BMP files renamed to JPG and stuff like this :)2009-12-16 06:20:19
<Shelwien> and use the redirect instead of googlesites like now2009-12-16 06:20:30
<schnaader> the easiest thing still is to put up your own webserver (if you don't care about 100% availability but need MUCH space)2009-12-16 06:21:59
 (and won't have too much people downloading)2009-12-16 06:22:20
<Shelwien> not really2009-12-16 06:22:26
 services like dreamhost provide much more space2009-12-16 06:23:08
 than i can make available from here2009-12-16 06:23:15
<schnaader> that's a good solution, yes, although you better constantly check if the space provider changed the URLs to avoid dead links2009-12-16 06:24:50
<Shelwien> its a paid hosting, so they won't do that2009-12-16 06:25:36
<schnaader> ah, OK, that's another case. thought you're talking about free hosting only.2009-12-16 06:26:03
 paying some bucks a month is the easiest way then, yeah2009-12-16 06:26:28
<Shelwien> but at $100 per year its close enough imho ;)2009-12-16 06:26:34
 ah, that's the worst case though2009-12-16 06:26:54
 i think i still didn't pay anything ;)2009-12-16 06:27:03
<schnaader> btw, I recently had one of those p2p downloads again that run fine until 99% and then stop for long time2009-12-16 06:33:31
<Shelwien> ;)2009-12-16 06:33:42
 some people do that intentionally2009-12-16 06:33:56
<schnaader> for most files (video/audio) it doesn't matter as you can already have a look at it, but it sucks for most archives2009-12-16 06:33:59
<Shelwien> err.. rar/zip and be still used2009-12-16 06:34:18
 *can2009-12-16 06:34:24
<schnaader> yes, most times at least for the 99% case - and RAR has this nice repair function2009-12-16 06:34:53
 except if the first/last bytes are the 1% you need :)2009-12-16 06:35:06
 the best way to solve this inside the box is to e.g. use RAR with a recovery record2009-12-16 06:35:32
<Shelwien> yeah2009-12-16 06:35:42
<schnaader> but I thought about a outside the box way2009-12-16 06:35:47
<Shelwien> that most annoying case with torrents, though2009-12-16 06:35:51
 is with multiple files2009-12-16 06:35:57
 sometimes a torrent block can include end of one file and start of second2009-12-16 06:36:15
 and result is that i can't download the first file, even if somebody puts it there2009-12-16 06:36:50
 (but not the 2nd file)2009-12-16 06:37:02
<schnaader> you could create a recovery file 1% of the file size and upload it to some high available source, but use some better suited algorithm where it doesn't matter which 1% is missing. if you can get the information about what pieces are missing, you could use this to reconstruct the file.2009-12-16 06:37:31
 yes, these blocks across files bother me, too.2009-12-16 06:38:11
<Shelwien> yeah, but that thing about "high available source" is troublesome2009-12-16 06:38:31
<schnaader> well, just use some web upload. the file is only 1% of the original size, and the size is what prevents uploading to those most of the time2009-12-16 06:40:55
 the real problem perhaps is to somehow connect the torrent and the recovery source2009-12-16 06:41:25
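
(The simplest position-independent recovery block of the kind described above is a plain XOR parity piece, sketched below: with the file split into 100 pieces, a parity piece of ~1% overhead can rebuild any single missing piece, no matter which one. Real tools such as PAR2 use Reed-Solomon codes to cover several missing pieces at similar overhead; this is only the one-piece toy version.)

#include <cstddef>
#include <vector>

using Piece = std::vector<unsigned char>;

// Parity over equal-sized pieces: XOR of all of them.
Piece MakeParity(const std::vector<Piece>& pieces) {
  Piece parity(pieces[0].size(), 0);
  for (const Piece& p : pieces)
    for (size_t i = 0; i < parity.size(); i++) parity[i] ^= p[i];
  return parity;
}

// Rebuild the one missing piece from the pieces you do have plus the parity.
Piece Rebuild(const std::vector<Piece>& present, const Piece& parity) {
  Piece missing = parity;
  for (const Piece& p : present)
    for (size_t i = 0; i < missing.size(); i++) missing[i] ^= p[i];
  return missing;
}
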
<Shelwien> i just post torrents with webseeds ;)2009-12-16 06:41:26
<schnaader> how does using webseeds help here?2009-12-16 06:46:01
<Shelwien> well, its basically http ddl with p2p users compensating the load ;)2009-12-16 06:48:02
<schnaader> but it can still happen that only 99% of the file gets uploaded or spread; the upload will just be faster in most cases, right?2009-12-16 06:51:09
 if I think about it, the best way would be to create the recovery in the p2p client, upload that one first and give it a higher priority :)2009-12-16 06:52:26
<Shelwien> well, p2p clients have some block sharing priority2009-12-16 06:57:47
 so if you have a valid file, there's no sense to create some special recovery files or anything2009-12-16 06:58:12
 it would just start with sharing a least available block2009-12-16 06:58:29
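
(The "least available block first" rule is essentially the selection loop below; a simplified sketch, since real clients also randomize among ties and switch strategies in endgame mode.)

#include <vector>

// availability[i] = how many connected peers have piece i;
// have[i]         = whether we already have it.  Returns -1 if nothing is needed.
int PickPiece(const std::vector<int>& availability, const std::vector<bool>& have) {
  int best = -1;
  for (size_t i = 0; i < availability.size(); i++)
    if (!have[i] && (best < 0 || availability[i] < availability[best]))
      best = (int)i;                 // rarest needed piece so far
  return best;
}
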
<schnaader> won't help for the initial upload :)2009-12-16 07:05:14
 and for most rare files, there will only be one upload2009-12-16 07:05:33
 but I agree, with better p2p techniques this problem is getting very rare, so no need for solutions2009-12-16 07:06:39
 I still had no time to try messing with fpaq or modifying it :( hope I find some time for it until weekend, want to do some CM experiments, too :)2009-12-16 07:09:51
<Shelwien> %)2009-12-16 07:10:32
 i was thinking again about making a bytewise CM compressor2009-12-16 07:11:06
 the most troublesome thing is how to store statistics for it2009-12-16 07:11:23
<schnaader> What I need is an eBook reader with additional G++ support, that way I could code on my bus travels :)2009-12-16 07:11:24
<Shelwien> i have g++ on iphone ;)2009-12-16 07:11:51
<schnaader> yeah, iPhone would be the most obvious solution, although I'd prefer an android phone2009-12-16 07:12:37
 do you need to store 256 probabilities in that case, or is it even more difficult?2009-12-16 07:13:42
<Shelwien> well, a straightforward model is simple2009-12-16 07:14:11
 but i already tried making these before, and they're slow ;)2009-12-16 07:14:38
<schnaader> :)2009-12-16 07:14:47
<Shelwien> so now the question is how to make a bytewise CM which would be both faster than bitwise, and compress better ;)2009-12-16 07:15:00
 and common sources of redundancy in bitwise coders are2009-12-16 07:15:23
 1. hashtables. there's basically no other way to store bitwise statistics (a binary tree would be too slow), and a hashtable is lossy, so predictions are imprecise2009-12-16 07:16:06
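
(A minimal sketch of the lossy part, with a toy layout of my own rather than any particular coder's: the statistics live in a fixed-size table indexed by a hash of the context, so two different contexts can land in the same slot and the shared counter gives an imprecise prediction for both.)

#include <cstddef>
#include <cstdint>
#include <vector>

struct BitModel {
  std::vector<uint16_t> p;              // P(bit==1) in 1/65536 units
  int bits;
  explicit BitModel(int b) : p(size_t(1) << b, 32768), bits(b) {}

  size_t Slot(uint64_t ctx) const {     // hash context -> index; collisions lose information
    return (ctx * 0x9E3779B97F4A7C15ull) >> (64 - bits);
  }
  uint16_t Predict(uint64_t ctx) const { return p[Slot(ctx)]; }
  void Update(uint64_t ctx, int bit) {  // simple shift-based adaptive counter
    uint16_t& pr = p[Slot(ctx)];
    if (bit) pr += (65535 - pr) >> 5;
    else     pr -= pr >> 5;
  }
};
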
<schnaader> would it be possible to detect cases where bitwise gets better compression and switch to a bitwise CM there?2009-12-16 07:17:09
<Shelwien> 2. counters. bitwise counters don't work right for text. appearance of a space doesn't really mean that all the symbols in the range 0x20-0x3F should have higher probability2009-12-16 07:17:38
<schnaader> although this would need sharing statistics between the two CMs which could get troublesome...2009-12-16 07:17:45
<Shelwien> yeah2009-12-16 07:17:59
 but even if we'd only target texts (like ppmd is called "text compression" ;)2009-12-16 07:18:38
 there're still issues2009-12-16 07:18:58
 like, ppmd has much better text compression ratio than ccm 2009-12-16 07:19:56
 but only with small enough files, where all statistics fit in memory2009-12-16 07:20:14
 and when ppmd has to reset or cut the tree - its not better than ccm anymore2009-12-16 07:20:40
 and the same applies to speed, for the same reason2009-12-16 07:21:09
<schnaader> so what we really need is more memory :) or new algorithms that don't need that much of it2009-12-16 07:22:06
<Shelwien> yeah, but first i'd like to have an efficient data structure to store the statistics for bytewise coding2009-12-16 07:22:51
 well, this bytewise coding is not completely bytewise though ;)2009-12-16 07:23:55
<schnaader> yeah, you'll most probably use bitmasks somewhere, too :)2009-12-16 07:24:31
<Shelwien> symbols are transformed into some bitcodes and these are encoded with a bitwise rc ;)2009-12-16 07:24:33
 but at least stats should be bytewise2009-12-16 07:24:55
<schnaader> and you'll have to process only once per byte, which is where the speed gain comes from2009-12-16 07:25:27
<Shelwien> err, not exactly2009-12-16 07:26:56
 its just that with bitwise coding we have to encode 8 bits per byte2009-12-16 07:27:16
 and with adaptive byte decomposition it can byte 1 bit in 50%+ cases2009-12-16 07:27:54
 *can be2009-12-16 07:28:05
 of course, word coding could be even faster2009-12-16 07:28:47
 but that requires even more complex data structures2009-12-16 07:28:59
 and its not quite solved even for bytewise case ;)2009-12-16 07:29:12
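
(One possible adaptive decomposition, sketched purely as an assumption of mine rather than the scheme Shelwien has in mind: code a single binary "is it the most probable symbol?" flag per byte, and fall back to a plain 8-bit literal only when it is not. In skewed contexts the flag alone covers well over half of the bytes, which is where the "1 bit in 50%+ cases" figure comes from.)

#include <cstdint>
#include <vector>

struct Decision { int bit; };                  // what a binary range coder would encode

struct ByteDecomp {
  uint8_t  mps = 0;                            // current most probable symbol
  uint32_t count[256] = {};                    // per-symbol frequencies in this context

  // Turn one byte into the sequence of binary decisions to be coded.
  std::vector<Decision> Decompose(uint8_t c) {
    std::vector<Decision> out;
    out.push_back({c == mps});                 // one decision for the common case
    if (c != mps)
      for (int i = 7; i >= 0; i--)             // rare case: plain 8-bit literal
        out.push_back({(c >> i) & 1});
    if (++count[c] > count[mps]) mps = c;      // adapt the MPS to the data
    return out;
  }
};
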
<schnaader> so perhaps start with something between like 4 bits :D2009-12-16 07:30:18
<Shelwien> err, no2009-12-16 07:30:34
 then something like static huffman coding of the symbols would be better2009-12-16 07:30:56
 bytes are important not only because they're larger units2009-12-16 07:31:22
<schnaader> but because they're another "common unit" and most things will either be aligned by bits or by bytes?2009-12-16 07:32:00
<Shelwien> but because many files are generated as a sequence of bytes2009-12-16 07:32:03
 its not about alignment even2009-12-16 07:32:30
 for example, there're even things like codepages2009-12-16 07:32:55
 so the same russian text of the same size2009-12-16 07:33:18
 can contain different codes depending on the used codepage2009-12-16 07:33:31
*** pinc has joined the channel2009-12-16 07:33:32
*** Krugz has left the channel2009-12-16 07:34:00
<schnaader> yeah, same with DOS/Windows/Linux encoding of German texts, just that it's the umlauts only and not the whole text2009-12-16 07:34:27
<Shelwien> so modelling bits there is wrong2009-12-16 07:34:40
<schnaader> although that's a special case you could fix with a table transform, but I get your point.2009-12-16 07:36:23
<Shelwien> such transform won't really change much, if we'd use bitwise model after that2009-12-16 07:41:38
 probability update is completely different in bytewise and bitwise models2009-12-16 07:42:24
<schnaader> yes, you'd only get rid of the problem that 2 different codings won't give the same output although they actually mean the same2009-12-16 07:42:44
<Shelwien> and btw, even though bitwise models usually seem better for stuff like executables2009-12-16 07:43:59
 with some preprocessing it changes too2009-12-16 07:44:11
 like durilca, which disassembles the executables into multiple bytewise streams2009-12-16 07:44:52
 and compresses them with ppmonstr coder2009-12-16 07:45:07
<schnaader> well, executables are a weird filetype anyway because code and data is mixed2009-12-16 07:45:36
<Shelwien> also there're frequently texts, images, and other stuff ;)2009-12-16 07:46:28
 its like another archive format basically ;)2009-12-16 07:46:39
<schnaader> although I guess most EXE preprocessors are already quite good at separating code and data2009-12-16 07:46:49
<Shelwien> not really2009-12-16 07:47:00
<schnaader> that's a point to improve, then :)2009-12-16 07:47:17
<Shelwien> there's basically no good open source exe preprocessor2009-12-16 07:47:22
 the usual E8 filters just fix the relative offsets in CALL instructions2009-12-16 07:48:24
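
(In its simplest form that kind of E8 filter is just the loop below; a rough sketch, not rar's or durilca's actual code. It converts the relative CALL displacement into an absolute target so that repeated calls to the same function produce identical byte patterns; a real filter adds range checks, handles E9 jumps, and has an inverse transform that subtracts the position again.)

#include <cstddef>
#include <cstdint>
#include <cstring>

void E8Transform(uint8_t* buf, size_t size) {
  for (size_t i = 0; i + 5 <= size; i++) {
    if (buf[i] != 0xE8) continue;              // CALL rel32 opcode
    int32_t rel;
    std::memcpy(&rel, buf + i + 1, 4);         // little-endian rel32 operand
    int32_t absTarget = rel + (int32_t)(i + 5);// absolute target = end of instruction + rel
    std::memcpy(buf + i + 1, &absTarget, 4);
    i += 4;                                    // skip the operand we just rewrote
  }
}
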
<schnaader> another problem with exe is that there are too many different formats floating around, like with videos, although PE and MZ (and perhaps ELF) will be the majority2009-12-16 07:48:27
<Shelwien> yeah, nobody really something anything other than PE2009-12-16 07:49:02
 *really supports2009-12-16 07:49:15
<schnaader> although MZ shouldn't be hard and there still are many DOS games etc. floating around2009-12-16 07:49:33
<Shelwien> it is hard in fact2009-12-16 07:49:49
 it has relocations and stuff like FPU emulation and overlays2009-12-16 07:50:18
 and documentation is not really good2009-12-16 07:50:46
<schnaader> but it should be possible to use E8 filters regardless of the format. you just have to detect assembler code2009-12-16 07:52:36
<Shelwien> afaik nobody does that2009-12-16 07:53:00
 at best they just detect the PE headers2009-12-16 07:53:12
 and rar just checks whether it improves the compression2009-12-16 07:53:44
<schnaader> but in this case RAR does it right, I guess. This should work for other formats, too, or does it only do this for PE files?2009-12-16 07:54:24
<Shelwien> for anything i think, at least at -m52009-12-16 07:55:05
 maybe they do some lighter checks at lower levels, dunno2009-12-16 07:55:24
 their detector doesn't always work right even at -m5 though2009-12-16 07:56:43
 sometimes -mce+ helps on archives with multiple executables2009-12-16 07:57:02
<schnaader> I'll go off now - going to work2009-12-16 07:57:56
<Shelwien> ok, good luck ;)2009-12-16 07:58:37
<schnaader> thanks, though at the moment I'm working on my own bad code - for (i = 0..nodecount-1) { delete(node #i) }... should be node #0 here, so that's one easy fix :)2009-12-16 08:00:50
 I guess that's where my good solved bugs/day stats come from ;)2009-12-16 08:01:36
*** schnaader has left the channel2009-12-16 08:02:34
*** pinc has left the channel2009-12-16 16:06:01
*** schnaader has joined the channel2009-12-16 16:19:31
*** mike_____ has joined the channel2009-12-16 16:28:58
*** schnaader has left the channel2009-12-16 16:47:02
*** pinc has joined the channel2009-12-16 18:24:00
*** Krugz has joined the channel2009-12-16 20:28:31
*** pinc has left the channel2009-12-16 20:45:00
<Shelwien> !next2009-12-16 21:04:27