Hopbot log for 2007-09-18 - Helma IRC channel: #helma on irc.freenode.net

2007-09-18:

[19:39] <rjb> i got something pretty funny going on with rhino
[19:40] <rjb> in a rhino shell session, i read the contents of a multimegabyte file into a variable and called a replace() on it
[19:40] <rjb> by now it's been workin for over an hour of pure cpu time, according to `top'
[19:40] <rjb> uh, well over an hour
[19:41] <rjb> hogging the cpu like mad
[19:49] <rjb> hmmm is a search-and-replace on 3+ MB of data really too much to ask for?
[19:54] <rjb> ok, time for a little SIGKILL..
[19:56] <zumbrunn_> are you doing this with a Javascript String?
[19:56] <zumbrunn_> maybe a Java StringBuffer would perform better
[20:19] <rjb> perform better?
[20:19] <rjb> i don't mind waiting a little, but this didn't perform at all
[20:20] <rjb> in that time i could have easily edited the file manually in a text editor
[20:20] <rjb> even w/o any fancy macros
[20:23] <zumbrunn> ok... perform, period :-)
[20:23] <rjb> i suspect it must have somehow worked its way into an infinite loop
[20:23] <rjb> though strangely, it did respond to signals
[20:32] <rjb> (one call to sed would have fixed that file in seconds, of course)
[20:36] <rjb> 2.3 sec, actually
[20:57] <rjb> hey this is quite slow when using a java class, too
[20:57] <rjb> var sb = new java.lang.StringBuilder(readFile('Large_SNR_Dec_Rig.txt'));
[20:58] <rjb> var k = -1; var i = 0;
[20:58] <rjb> while ( (k = sb.indexOf(',', i)) != -1 ) { sb.replace(k, k+1, ' '); i = k; }
[21:00] <rjb> over 3min cpu time, not done yet
[21:01] <rjb> 5min
[21:02] <rjb> uhh this seems wrong
[21:04] <rjb> not my code, but the fact that it never seems to complete
[21:05] <rjb> ok, i'll tell it to print some progress indicator
[21:09] <rjb> yep it is working, but it's incredibly sloooow
[21:10] <rjb> no more than 50 replaces per second, or so
[21:11] <rjb> well maybe up to 80
[21:20] <rjb> 40 .. 30 .. 50 .. 60 ..well it stabilizes a little over 70 replaces per second
[21:22] <rjb> 80
[21:23] <rjb> hey this is ridiculous, i can't believe this needs to be so slooow
[21:37] <rjb> about 90000 items needed to be replaced. (out of 3409234 bytes)
[21:38] <rjb> i just don't believe it. Pike (a rather fast scripting language) does it in about 0.3 seconds, on the very same string
[21:43] <rjb> come on, i really need to know what am i doing wrong
[21:48] <rjb> actually, Pike does 1073824 string replaces per sec. in this trial and is done in 0.08 s
[21:48] <rjb> while java never goes over 90 per sec.
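Note on the StringBuilder loop above: a plausible explanation, not confirmed anywhere in this log, is that StringBuilder.replace(start, end, str) is a general splice that may copy the whole tail of the buffer on every call, even when the replacement has the same length as the text it replaces. With about 90000 commas in a 3.4 MB buffer, that would add up to a very large amount of copying. A minimal sketch of an alternative, assuming the same Rhino shell and file name as above, overwrites each character in place with setCharAt. The helper name replaceCommas is hypothetical, and the sketch relies on Rhino converting the one-character string ' ' to a Java char.

```javascript
// Hypothetical helper: replace every ',' in a StringBuilder-like object
// with ' ' using only indexOf(String, int) and setCharAt(int, char).
// setCharAt overwrites a single char, so nothing after position k is moved.
function replaceCommas(sb) {
    var count = 0;
    var k = sb.indexOf(',', 0);
    while (k != -1) {
        sb.setCharAt(k, ' ');        // in-place overwrite, no splice
        count++;
        k = sb.indexOf(',', k + 1);  // resume the search after the hit
    }
    return count;                    // number of replacements made
}

// Assumed usage in a Rhino shell (readFile is the shell builtin used above):
//   var sb = new java.lang.StringBuilder(readFile('Large_SNR_Dec_Rig.txt'));
//   var n = replaceCommas(sb);
```

No timings are claimed for this version; it would have to be measured on the same data to compare with the figures quoted in the log.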
[22:22] <rjb> on a smaller string (~ 900kB) i get about 443 replacements per second (using StringBuilder)
[22:22] <rjb> while better, this is still ridiculous
[22:31] <rjb> using StringBuffer instead of StringBuilder actually seems to speed up replacement a bit
[22:31] <rjb> contrary to what sun's docs say
[22:31] <rjb> but not significantly
[22:35] <rjb> I mean, 4 orders of magnitude slower than some scripting language sounds like a joke
[22:54] <rjb> btw String.replace() achieves about 30 replacements per sec. on the same data
[22:54] <rjb> unbelievably slow
[23:00] <earl> the way you're doing this is maybe a bit sub-optimal
[23:01] <rjb> show me a better way
[23:01] <earl> String#replace(',', ' ')
[23:03] <rjb> http://helma.pastebin.com/m6ab8298a
[23:03] <rjb> here are my test functions
[23:07] <earl> what about a simple var str = new java.lang.String(s); str.replace(',', ' ');
[23:08] <earl> (assuming you want to replace all , with blanks)
[23:17] <rjb> about 29 replacements per millisecond
[23:18] <rjb> much better, yes
[23:18] <rjb> hardly impressive, though
[23:22] <rjb> let me say again that Pike does over 1M string replacements per sec, working on a large string
[23:23] <rjb> (same machine of course)
[23:24] <earl> mhm
[23:32] <rjb> ok, updated the pastebin
[23:33] <rjb> i still fail to understand why would String.replace() be so pathetically slow
[23:33] <rjb> after all, it's a builtin function
[23:33] <rjb> so presumably, once you hand it the args, the rest is done by java code
[23:39] <rjb> ok, got one that does 47 / ms
[23:40] <rjb> wonder if you can guess how it's done ;-)
[23:43] <rjb> (no, not reaching into java classes)
[23:44] <rjb> the answer is in the pastebin
[23:45] <rjb> still, it's 50/ms vs 1000/ms ..
[23:46] <rjb> imho the pathetic String.replace should be reported as a bug against rhino
[23:54] <rjb> of course the `magic' behind Pike's replace() is that it's a builtin, implemented in C like the rest of the runtime
[23:54] <rjb> and those guys seem to be really bent on optimizing the shit out of their runtime
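Note on the plain-JavaScript variants: the pastebin contents are not preserved in this log, so the test functions rjb used are unknown. For reference, here is a minimal sketch of two common ways to replace every comma in a JavaScript string without reaching into Java classes. The function names are hypothetical, and neither is claimed to be the 47/ms solution mentioned above. Keep in mind that String.replace() with a plain string pattern replaces only the first occurrence, so calling it in a loop builds a new multi-megabyte string on each pass.

```javascript
// Variant 1 (hypothetical name): one call with a global regex
// replaces all occurrences in a single pass.
function replaceAllRegex(s) {
    return s.replace(/,/g, ' ');
}

// Variant 2 (hypothetical name): split on the separator,
// then join the pieces with the replacement.
function replaceAllSplitJoin(s) {
    return s.split(',').join(' ');
}

// Assumed usage in a Rhino shell:
//   var s = readFile('Large_SNR_Dec_Rig.txt');
//   var t = replaceAllSplitJoin(s);
```

Both return a new string and leave the input untouched; which one is faster under Rhino would have to be measured on the actual data.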

 

 
