Show enters and exits. Hide enters and exits.

00:15:50brixenevan: nope, running just rails -h it doesn't seem to have an effect
00:16:03brixenover 5 runs precompiled
00:16:19evanjust curious
00:16:22brixenthe compilation run was slightly less but I only ran one each
00:16:33brixenso could be flutter
00:16:40brixenI will test it with hash
00:16:52brixensince this is what I was essentially asking the other day
00:17:08brixen#each_item can yield a single object, instead of key, value
00:17:15brixenso I'll see what effect that has
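The Ruby-visible analog of the arity difference brixen mentions: a one-parameter block passed to Hash#each receives each entry as a single [key, value] array, instead of a destructured key and value. (A small sketch; the hash contents are illustrative.)

```ruby
h = { a: 1, b: 2 }

# One block parameter: each entry arrives as a single two-element array.
single = []
h.each { |pair| single << pair }

# Two block parameters: the pair is destructured into key and value.
h.each { |k, v| }

single  # => [[:a, 1], [:b, 2]]
```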
00:17:43brixengonna head to a coffee shop, bbiab..
00:18:01slavabrixen: you live in amsterdam?
00:18:08brixenslava: yes :D
00:18:22brixenno, unfortunately, I do not
00:18:33slavaah, ok, so your 'coffee shop' is not the real thing, then
00:18:38brixenbut for coffee, I live in one of the top 2 places in the US :)
00:18:51brixenno, it's a real coffee shop heh
00:18:56slavaI thought you were going to kick rbx dev up a notch with a 3 gram lebanese hash bomber
00:19:11brixenyou knowz it yo
00:20:13evanI think that the StackDepthCalculator is going to prove to be a good benchmark
00:20:32slavayou determine the max operand stack depth statically now?
00:20:35brixenahh good point
00:20:46slavaevan: nice
00:21:00evanI removed all instructions that had an unbound effect on the stack
00:21:04evanlike push_array
00:21:17slavanow you can convert stack values to LLVM IR registers very easily!
00:21:31evani'm not yet
00:21:34evanbut, awesomely
00:21:36evanI don't have to
00:21:42evanLLVM is so fucking smart, it does it for me.
00:21:53evanie, emit code that manipulates a stack pointer
00:21:56slavathe alias analysis pass picks up on your stack locations?
00:22:01evanand it tracks the usage and removes it
00:22:04evanand uses the static locations
00:22:09slavacool, that's what factor's codegen does
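The static max-depth calculation being discussed can be sketched like this: walk the bytecode and track the running operand-stack depth, assuming every instruction has a fixed stack effect (which is why variable-effect instructions like push_array had to be removed). The opcode names and effects below are illustrative, not Rubinius's actual instruction set.

```ruby
# Fixed stack effect per opcode (illustrative names).
STACK_EFFECT = { push_int: 1, push_local: 1, send_1: -1, pop: -1 }

def max_stack_depth(bytecode)
  depth = 0
  max = 0
  bytecode.each do |op|
    depth += STACK_EFFECT.fetch(op)  # raises on an unknown (unbounded) op
    max = depth if depth > max
  end
  max
end

max_stack_depth([:push_int, :push_local, :send_1, :pop])  # => 2
```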
00:57:54headiuscute :)
00:58:03slavawhat does the benchmark do?
00:58:10evancall Array#each in a loop
00:58:28slavahow big is the array?
00:59:09slavaand how many times do you call each on it?
00:59:11evanso calling Array#each 5k times on an array with 5k elements
00:59:19slavaand the block does nothing?
00:59:28slavais the array allocated once?
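The benchmark shape evan describes, as a sketch (not the actual file from the rbx source tree): an array allocated once, then Array#each with an empty block called repeatedly.

```ruby
# Call Array#each `iterations` times on an array of `size` elements,
# with a block that does nothing. Defaults match the numbers in the log.
def array_each_bench(size = 5_000, iterations = 5_000)
  arr = Array.new(size) { |i| i }   # allocated once
  iterations.times { arr.each { |e| } }
  arr.length
end
```

As slava points out below, nothing here uses the elements, which is exactly what makes it easy for a strong optimizer to delete most of the work.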
00:59:30headiushow much of that is actually jitting?
00:59:36evanheadius: zero in this case
00:59:40evanit's stupid
00:59:45headiusoh, what does --fast do then?
00:59:50evan--fast has a list of methods to compile
00:59:55evanand it does them up front
01:00:03evanit's totally dumb
01:00:13evani don't have any metrics hooked up yet
01:00:15headiusoh, so it's jitting at startup then
01:00:28evanthought it might be fun to play with picking some methods to jit up front
01:00:29headiusso what portion of that is jitting at startup? :)
01:00:32evanyeah, jit at startup
01:00:44slavaevan: factor iterates over a 5k array 5k times in 0.07 seconds :-P
01:01:00evanbut the jit is taking no time
01:01:00headiusok, I don't get it then
01:01:03evansince it's just one method
01:01:25evanwell, I wanted to make Array#each fast
01:01:27evanso I did
01:01:40evanby having it compiled up front
01:01:43headiusso each is compiling
01:01:53evanis compiled
01:01:57headiuscompiled when
01:02:08evanin loader.rb
01:02:11headiusso at startup
01:02:12evanbefore it loads a script
01:02:14headiusisn't that what I said?
01:02:20evanand I answered
01:02:22evandidn't I?
01:02:32headiusI asked what portion of this is jitting at startup and you said none
01:02:37evanvirtually none
01:02:41headiusexcept each
01:02:44evanit's not measurable
01:02:44headiuswhich would not be none
01:02:59headiuseach is compiled before you start up or when you start up?
01:03:01slavaheadius: I doubt compiling one method would take long enough to measure
01:03:13evanwhen I start up
01:03:17headiusI was just wondering how much of this bench was running as compiled code
01:03:25evanjust Array#each
01:03:27headiuscompiled to asm
01:03:34evanliterally I'm doing
01:03:40evanRubinius.jit Array.instance_method(:each)
01:03:45evanno really, literally that.
01:03:48headiusso basically the part that would be in C for MRI
01:03:54headiusahh sure
01:04:23headiusok, I understand now :)
01:04:59evanit's pretty stupid
01:05:02evanit's SO stupid
01:05:05evanit's hard to get
01:05:37headiusnah, it makes sense
01:05:58headiusif we had more of core in ruby we'd precompile all of it to jvm bytecode, obviously
01:06:07headiusthis is similar to that, but you have to compile at startup
01:06:42slavahotspot compiles everything on startup too
01:06:44evani might make a mode where I can dump out the compiled code
01:08:41slavaI looked at the array_each benchmark in the source
01:08:55slavaits not a very good benchmark, because you can optimize it down to nothing
01:09:02evanif you can
01:09:05evanawesome for you!
01:09:12evanthen the benchmark shows how studly you are
01:09:18evangrab a cold one and hit the couch!
01:09:59slavasince the actual elements are not used, the array lookup is optimized away, and then the only thing that's done to the array is taking its length, but the length is known since its allocated right there, so the array allocation is eliminated entirely
01:10:07headiuswe can't inline blocks yet, but in theory it could optimize away if we did
01:10:07slavaand it becomes a counted loop
01:10:29headiusslava: that's exactly my concern about a lot of the super-micro benchmarks being run on macruby
01:10:41headiuswith their optimistic LLVM stuff, most of those benchmarks are probably not doing anything at all
01:11:03headiuswhich is cool, but obviously...optimistic
01:11:22slavathat's why the factor version of the benchmark runs 40x faster, because most of the benchmark gets deleted by the optimizer
01:11:25evanslava: please write me an optimizer to do that for ruby! :D
01:12:04slavaif I deleted empty loops then it would boil down to a no-op
01:12:07evanslava: it's not really fair, since you're doing it entirely statically
01:12:16evanand we have to use a lot of dynamic feedback to do it
01:12:19slavaI typed it in the REPL and ran it
01:12:56headiusslava: if I wrote it in java it would probably not even be measurable
01:13:07headiushere, i'll write it in bitescript :)
01:13:13evanif it weren't a closure, it would be easier too
01:13:23headiusfo' sho'
01:13:38evanbut thems not how we does it here in the ruby wild west.
01:13:46slavaheadius: well, the only difference in machine code would be that java keeps values in registers between basic blocks
01:13:51slavaso the loop counter doesn't have to be loaded and stored
01:14:18slavaonce I eliminate this restriction factor should be as fast as hotspot on integer and FP code that doesn't dispatch
01:15:28slavaevan: your job is easier actually, because you're using llvm for codegen
01:15:44slavaso you're writing similar to my high-level optimizer, but the low-level optimizer is just llvm
01:16:21evanit's just running late stage
01:16:28evanto use type feedback to make decisions
01:16:34evanrather than upfront
01:18:29slavaif I move the array allocation outside of the word, then it doesn't optimize as much
01:18:46slava0.2 seconds
01:18:57slavanow its doing a method dispatch on every iteration
01:20:10evannow move it so that the optimizer can't see the body of the loop
01:20:51slavathat won't compile
01:21:00evanwhy not?
01:21:05evanyou can't make a loop that does a method dispatch?
01:21:11evanand the method is the loop body
01:21:30slavathe optimizer always inlines blocks
01:21:34slavaat their call site
01:21:49slavahigher-order functions are more like macros than real functions
01:22:17evanso make it so that it can't inline it.
01:22:22evanmake it suck
01:22:23evanin other words.
01:22:24slavaI can disable the optimizer
01:22:38slava1.7 seconds
01:22:58evanis that calling out to the block now?
01:23:02evanas a function?
01:23:17slavathis is with the non-optimizing compiler, that compiles everything as a string of subroutine calls
01:24:12slavaevan: I disabled PICs. 2.3 seconds :-)
01:24:29evankeep going
01:24:30slavaits officially starting to suck
01:24:31evanmake it really bad
01:24:36evani want you to get up to 5s
01:24:59evanthen reverse it all, and put each step in a blog post
01:25:29slava2.6 seconds if the optimizer is disabled for the library too
01:25:38slavaoh I have an idea
01:25:49slavarecompile the VM with assertions enabled
01:26:08evannow we're talking!
01:27:30mvr(btw, had to do to compile with LLVM, dunno if it makes sense)
01:28:01slavaevan: damn, didn't make a difference. I guess not enough time is spent running C code to matter
01:28:15evanmvr: oh
01:28:21evanmvr: thanks, i'll throw that in.
01:29:13evanoh damn!
01:29:17evanwhere's boyscout?!
01:30:08headiushuh, bitescript bug
01:30:14evanboyscout: stop slackin'!
01:30:16headiuscan't construct primitive array
01:30:22boyscoutsorry boss :/
01:30:44boyscoutCI: 04c99ba success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
01:32:37headiusok yeah...
01:33:10boyscoutAdd extra -D for newer LLVM - 9379ed0 - Evan Phoenix
01:33:11headiusI have to go to nanoseconds to time this
01:34:41headiusbut yeah, basically doesn't do anything
01:35:22boyscoutCI: 9379ed0 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
01:42:47mvrhrm. getting a segfault running bin/rbx?
01:44:32evanWAY at the beginning too
02:35:52boyscoutUse the proper VMMethod - 739a3e8 - Evan Phoenix
02:35:53boyscoutAdd support for long returns into JITd code - bea33b3 - Evan Phoenix
02:38:13evannext I think i'm going to have to add exception handlers to the JIT
02:41:15slavaevan: sweet
02:42:14boyscoutCI: bea33b3 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
02:44:35kliare there any new performance updates on FFI vs regular extensions?
02:46:01evannone that are performance related
02:46:18evanthere are some FFI enhancements to the JIT on the roadmap
02:46:30evanperformance enhancements
02:46:40slavayou should be able to make FFI calls as fast as C calls with the JIT, just by generating machine code stubs
02:46:44slavaand then you can ditch libffi
02:46:49evanthats the plan
02:47:00evanboth inline and callable stubs
02:47:02slavawill you still support non-JITted execution?
02:47:17evandon't see a reason to ditch it
02:47:22slavaless code to maintain
02:47:28evanplus the interpreter is what generates all the data to make the JIT good.
02:48:00slavayou can use LLVM to generate the interpreter
02:48:09slavathen you can do direct threading without gcc extensions
02:48:17evannot really
02:48:23evanLLVM doesn't support label-as-value properly.
02:48:35slavaoh? damn
02:48:43evanyeah, i've asked about it a number of times.
02:49:51kliso umm
02:50:05kliyou guys are saying in the future, that FFI will == native?
02:50:16evannative what?
02:50:25slavaevan: so llvm has no way to generate a jump to an address stored in a register at all?
02:50:31slavaevan: how are function pointers implemented then?
02:50:40evanit can do that
02:50:40klinative C extension
02:50:59evanyou can do "call *%eax"
02:51:05evanwell, the HL equiv
02:51:06slavaevan: can it generate a jmp *%eax ?
02:51:15evanif you structure it to be a tailcall, yes.
02:51:19evanotherwise no
02:51:20slavathen you can do direct threading
02:51:38evantailcall through function pointers as the blocks
02:51:40slavayou don't need labels as values, that's only if you wanted to express dtc in terms of C semantics
02:52:00evankli: well, they're different beasts
02:52:02slavaat the end of each instruction, you advance the instruction pointer, load from it, and jump
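The dispatch scheme slava outlines can be modeled in Ruby: each instruction handler ends by advancing the instruction pointer, loading the next opcode, and jumping to its handler. Real direct threading makes that jump a machine-level `jmp *reg` at the end of every handler; here the loop stands in for it. The opcodes and the tiny program are made up for illustration.

```ruby
def run(program)
  stack = []
  ip = 0
  while ip < program.length
    case program[ip]
    when :push then ip += 1; stack.push(program[ip])  # operand follows opcode
    when :add  then stack.push(stack.pop + stack.pop)
    when :halt then break
    end
    ip += 1  # advance the instruction pointer and dispatch the next opcode
  end
  stack.last
end

run([:push, 2, :push, 3, :add, :halt])  # => 5
```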
02:52:08evankli: FFI will be a bunch faster than native
02:52:19evanand it already is now somewhat faster
02:52:20slavaevan: does rbx have a C extension API other than the MRI compat?
02:52:30evanslava: no, but i've considered adding one
02:52:35slavaevan: what for?
02:52:40evanhaven't, out of discipline
02:52:51slavaI don't think you need it, FFI should be sufficient
02:52:54kliah, you're comparing it to the compat performance?
02:53:08evancompat performance?
02:53:13klii have this big library for MRI right now
02:53:17klithat i'm considering porting to ffi
02:53:26evanit would be faster under rubinius if you use FFI
02:53:27evanmost likely
02:53:35evanbecause of the semantics of the extensions
02:53:36slavakli: in theory FFI should be just as fast, in practice it depends on the implementation of FFI and how inefficient it is
02:53:41evanwe have to copy data back and forth a lot.
02:53:48evanFFI is more straight forward
02:53:52evanbut you'll end up doing more in ruby
02:54:02evanso as we get ruby faster and faster
02:54:15evanFFI will overtake native extensions more
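The FFI approach under discussion can be sketched with Ruby's stdlib Fiddle (a libffi binding), standing in here for Rubinius's FFI / the ffi gem's `attach_function` API: declare a C function's signature and call it straight from Ruby, with no hand-written extension code.

```ruby
require 'fiddle'

# Bind libc's strlen via libffi: give the argument and return types,
# then call it directly from Ruby.
libc   = Fiddle.dlopen(nil)  # symbols already loaded into this process
strlen = Fiddle::Function.new(libc['strlen'],
                              [Fiddle::TYPE_VOIDP],
                              Fiddle::TYPE_SIZE_T)

strlen.call("hello").to_i  # => 5
```

This is the shape evan and slava are talking about making as fast as a C call by generating machine-code stubs instead of going through libffi's generic trampoline.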
03:09:31dkubbI'm trying to convince dbussink to port DataObjects to FFI, at least the do_sqlite3 driver. the biggest unknown right now is the performance it'll have in MRI
03:10:13brixendkubb: ah that's cool
03:10:21brixencould have both C ext and FFI
03:10:28brixenlet the user select
03:10:30dkubbpart of the benefit of DataObjects (DO) is that a lot of the work is done in C, and nicely formed Ruby objects are returned, rather than just an Array of strings that need to be coerced in Ruby
03:10:55brixenhm, yeah, that's the use case for C ext
03:12:15dkubbthe other benefit of DO is a uniform API across all the DB drivers, which still holds regardless of whether it's FFI or a C ext
03:15:51dkubbas far as rubinius goes, dbussink said that DO should run on it soon-ish
03:16:13dkubbI don't have any dates, he just said it was a possibility soon
03:17:54evani should explore dinner.
03:21:17evanperhaps the In and the Out..
03:32:28evanexception handlers!
06:37:56boyscoutHook up exception handling - 6338aa5 - Evan Phoenix
06:38:05evani figured that would take longer.
06:51:12boyscoutCI: 6338aa5 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
07:50:24tmorninibrixen: Do you know how to build and enable JIT on the llvm-jit branch? I see references to -DENABLE_LLVM and RBX_JIT environment var.
07:51:12brixen3 steps
07:51:24brixenget llvm from svn
07:51:27tmorninigot it
07:51:32tmorninioh, not release?
07:51:37brixenapply this patch
07:51:52brixenwhich will require some manual work
07:51:58brixenor at least it did for me
07:51:59evantmornini: the llvm-jit branch has been merged back into master
07:52:04evanbrixen: really?
07:52:09tmorninievan: Ah, very cool.
07:52:10evanlet me redo the patch then
07:52:13brixenevan: yeah
07:52:18evanso it's clean against svn
07:52:25tmorniniThis is really exciting to watch!
07:52:30evanI might be behind a little
07:52:55brixenthat file has changed a bit
07:53:00evanok, one sec.
07:53:05tmorniniSo I need LLVM HEAD?
07:53:05evani'm updating and i'll commit a new patch
07:53:14brixentmornini: yes
07:53:38slavaall this JIT action makes me jealous, I've been working on docs all day
07:54:56evanok, my patch has ported forward
07:55:02evangimme a sec to make sure it's still clean
07:55:15evani'm trying to get this patch accepted to llvm
07:55:16evanto avoid this.
07:55:34evanI guess i could make the functionality the patch uses conditional...
07:55:45tmorniniWe need to get on the LLVM Users page...
07:56:08headiusevenin tmornini
07:56:21tmorniniheadius: Howdy!
07:58:40tmorniniheadius: How are things at Oracle? :-)
07:58:58headiusbeats me, I don't work there yet
07:59:22tmorniniNo ouch intended, honestly, just checking which way the wind is blowing.
07:59:43headiusI'm not really sure how things are at Sun most days :) I pretty much work for JRuby, Inc
08:00:00tmorniniheadius: That's cool.
08:01:23tmorniniheadius: You guys are rockin' on JRuby. Congrats.
08:01:58tmorninillvm checked out.
08:02:07evanone sec.
08:02:07headiusthanks! hopefully for 1.4 we'll move to an optimizing compiler and rock a bit more
08:02:22evanwell, i'll just push the patch
08:03:05evantmornini: ok, grab the newest jit-info.diff
08:05:21evanhaha yes!
08:05:41evanthe governor got in a dig at UCLA during his commencement speech today
08:08:24tmorniniOK, all patched.
08:08:46evanok, now link the check out directory to vm/external_libs/llvm
08:08:58tmorniniI just checked it out there. :-)
08:09:05brixenand configure && make
08:09:11evana quick aside: it might be worth it to "vendor" a compiled version of LLVM for OS X
08:09:29tmorniniIn the future, sure
08:09:54brixentmornini: I'm guessing you haven't built llvm because I heard no complaints about your machine melting
08:10:11tmorniniI think rake makes LLVM if it's in place, right?
08:10:29tmorniniI've built it recently trying to see what's happening. I know it takes a while.
08:10:33headiusevan: does llvm have anything that would help with dynamic deoptimization?
08:11:37evantmornini: yeah, rake should do it if llvm is in the right place
08:11:50evanheadius: not really
08:11:54tmorniniCool. it's building now.
08:11:57evanit's at a layer below that kind of thing
08:12:10headiusmm ok
08:12:25evanbut, given that I have an interpreter, it's easy enough to insert into its IR something like
08:12:28headiusthat's kinda what I was thinking..doesn't preclude you doing that, but doesn't help you do it
08:12:30tmorniniSo what is the LLVM JIT status now?
08:12:38tmorniniMerged, but still being worked on?
08:12:47evanret call rbx_start_interpreter(state, call_frame, <some ip>)
08:13:08evanie, tailcall to a function that starts the interpreter at a particular point
08:13:11slavawhat about on-stack replacement? :)
08:13:19evanslava: thats like a 3.0 feature
08:13:36evanslava: YOU don't even have that :D
08:13:51evandidn't john rose say he thought not having on-stack replacement was a good thing?
08:13:55evani thought he did
08:14:12evanheadius: dynamic deopt makes a lot of assumptions
08:14:19evanabout the infrastructure the code is running in
08:14:31evanit's really a layer above the optimizing compiler
08:14:31headiusI don't recall what he said about on-stack replacement
08:14:41evanit was at lunch one of the days
08:14:46evani think you were at a different table
08:14:57evanslava and I were picking his brain :D
08:15:01headiusmmm perhaps so :)
08:15:23headiusyeah, I know deopt makes a lot of assumptions
08:15:29evanthough I really need to siphon his brain
08:15:31evannot just pick at it
08:15:42headiusthat is, you'd have to know how to deopt and what to deopt to, which presumably would be different for each language
08:15:53slavaevan: you could just port rbx to jvm :)
08:16:07headiushuzzah, HotRubinius
08:17:13evanDING DING DING
08:17:17evanCOME AND GET IT!
08:18:34headiusthat's JRuby 1.5
08:19:01evanhows the invokedynamic stuff coming?
08:19:34headiusI think it's merged into openjdk mainstream now
08:19:43headiusI'm going to try to give it a shot this weekend
08:19:50headiustoo much to do
08:19:56evanI keep forgetting it's friday
08:20:02evanabby went to Cleveland yesterday
08:20:07evanso my bachelor weekend started a day early
08:20:15evanwhich really just meant staying up until 3am coding
08:20:44brixenfunny how not having wife/gf around tends to encourage that :)
08:21:01evanshe doesn't nag me or anything
08:21:35evanjust having her around i'm like "well, bed time."
08:21:47evani guess I revert to college Evan when i'm alone at night
08:24:29tmorniniIf you're saying you work less when Abby is around, EY may be able to send her on vacation. :-)
08:24:48evanI love her to death, and i'm sure she'd agree to that.
08:25:48tmorniniEvan wins employee of the month, May, Abby gets 4 week vacation!
08:26:16slavaas ESR said, sex and programming don't mix
08:26:24slavaall the great programmers like RMS are virgins
08:26:24tmorniniWow, LLVM does take a while.
08:26:37evango get a drink.
08:26:57evanslava: *facepalm*
08:27:25tmorniniI always thought I was a great programmer. Now I know where I went wrong. Pity!
08:27:42slavahe didn't really say that. in fact, he wrote which is one of the all-time lamest things ever
08:27:56headiustmornini: my machine must be faster than yours, just finished
08:28:07evani'd say i'm surprised this is the discussion we're having at 12:30am on a friday, but i'm not surprised.
08:30:56tmorniniI got a late start, had to back up and redo with "RBX_LLVM rake"
08:37:35headiusso RBX_LLVM=true rake?
08:37:51evanset it to anything
08:37:57headiusthen what to run it?
08:37:58evanit's just its presence that matters
08:38:17evanit's not on-demand
08:38:22evani haven't hooked up any metrics
08:38:24evanso you have to do
08:38:34evanRubinius.jit Blah.method(:foo)
08:38:43evanit takes Method and UnboundMethod objects
08:44:17tmorniniCongrats on the merge. I'm going to start counting sheep. :-)
08:44:46evansay hi to the sheeps for us.
08:47:13evanMAN I love LLVM seeing through the stack manipulations
08:47:49slavaits a pretty trivial optimization to implement anyway
08:48:04tmorniniWould this be expected behavior?
08:48:21evanit takes a Method object
08:48:23evanchange that to be
08:48:35evanyou're brave.
08:50:28evanhah ha!
08:50:34evanthats.. actually expected!
08:50:44tmorniniCool. :-)
08:51:00evani should have Rubinius.jit just fail quietly
08:51:03evani'll do that in a sec.
08:51:30tmorniniIf you cannot JIT Rubinius::AccessVariable, JIT probably isn't worth it! :-)
08:57:46boyscoutHandle breaking out of a block, enhance rbx.jit.dump_code - 0de71f8 - Evan Phoenix
08:59:04evanfor those interested:
08:59:26evanrbx.jit.dump_code now takes 3 bit values
08:59:34evan1 for simple (and long) IR
08:59:39evan2 for optimized IR
08:59:45evan4 for machine code
08:59:57evanthus, 7 prints out all of them.
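The flag values evan lists for rbx.jit.dump_code, shown as a bitmask check. The constant and method names below are made up for illustration; only the numeric values come from the log.

```ruby
SIMPLE_IR    = 1  # simple (and long) IR
OPTIMIZED_IR = 2  # optimized IR
MACHINE_CODE = 4  # machine code

# Decode a dump_code flags value into which dumps are enabled.
def dump_targets(flags)
  { simple:    flags & SIMPLE_IR    != 0,
    optimized: flags & OPTIMIZED_IR != 0,
    machine:   flags & MACHINE_CODE != 0 }
end

dump_targets(7)  # => { simple: true, optimized: true, machine: true }
```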
09:03:42boyscoutCI: 0de71f8 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
09:08:53headius"not supported yet" means it won't compile I presume?
09:09:13evanwhat were you trying to compile?
09:09:34headiusjust fib
09:09:40headiusI thought it would be simple enough
09:09:44evanwhich version?
09:09:46evanpastie the code
09:09:51evanyeah, i'd think it would be simple enough
09:09:51headiusrecursive one
09:10:02evanpastie it
09:10:08evanso i don't chase my tail at 1am
09:10:24headiusbench_fib_recursive in jruby/bench
09:10:32headiusI can pastie in a minute
09:11:28evanlet me try.
09:11:44headiusdo I need to pass anything to make Rubinius.jit work?
09:11:53headiusI just did RBX_LLVM=1 rake to build
09:12:20evanthat should be enough.
09:12:29evani got it too...
09:12:34evanok, one sec.
09:12:55evanthe minus meta op
09:12:58evani thought I did that one..
09:13:00evanone sec.
09:13:31headiusiterative seems to compile, but of course the bottleneck there is bignum
09:13:42headiusoh, and it blows up memory too, must still be a leak in bignum
09:13:54headiusate up 1GB in just a couple seconds
09:13:58evanprobably the same one as before
09:14:07evanbignum's internal heap is growing
09:14:13evanand not advising the rubinius GC to run
09:14:20evani'm guessing that old bug came back in
09:14:50slavayou don't do a GC check when allocating?
09:15:04evani do
09:15:08evanthis is a more complicated problem
09:15:17evanbecause we use an external lib for bignum support
09:15:20evanthat uses malloc internally
09:17:38headiushmm, many of the benchmarks I have just aren't suited to testing something like llvm
09:17:45headiusthey end up not doing any work
09:18:33headiuserg, or testing jruby with peephole optz apparently
09:18:46evanwhats a number i should try with fib?
09:18:58headiusthe arg is just number of iterations
09:19:02headiusit defaults to fib(30)
09:19:15headiussecond arg would be what fib to calculate
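The recursive fib being benchmarked is presumably the standard shape below (the exact harness in jruby/bench isn't shown here). This version uses the fib(0) == 0 convention, which matches evan's later value fib(35) == 9227465; slava's 14930352 comes from a convention shifted by one.

```ruby
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

fib(10)  # => 55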
09:20:18evanpretty sure it's working
09:20:19evani'll push
09:20:21evanand double check
09:20:27evani need to recompile not in DEV mode
09:20:56boyscoutAdd meta_send_op_minus - ff5ea90 - Evan Phoenix
09:21:25evanin Activity Monitor
09:21:32evanTweetie shows up as a 64bit program
09:21:38evanusing 3G of virtual memory
09:21:46evanand 52M of real
09:22:25headiusthey're rather pessimistic about how much memory they'll need, perhaps
09:23:04evani think it's a by-product of it being a 64bit binary
09:23:22headiusbleh, I keep forgetting jruby can't build rbx anymore
09:24:33headiusin this case it seems rubypp fails
09:24:57evanthe build process needs work.
09:25:44headiusa local var bench seems to get the expected speedup from llvm
09:25:49headiusmostly because it doesn't actually do any work
09:26:02headiusbunch of a = a lines
09:26:05evanit sees the stack movement
09:26:16evanand says "hey now, i know this story"
09:26:32boyscoutCI: ff5ea90 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
09:27:29headiusyeah, fib looks more like it did with your other jit
09:27:30headiusjit number 2
09:27:40evanhere i'm getting
09:27:44evan4.6s for interpreter
09:27:50evanthis is with fib(35)
09:27:57evan2.09s with the JIT
09:28:24evanbeing a recursive algo
09:28:31evanmakes sense between the 2 jits
09:28:34evanthere isn't a lot to optimize.
09:28:47evansince i'm not spying types for LLVM in this case
09:29:08evanwhats jruby?
09:29:30headiushere with jit, rbx is 1.6s
09:29:38headiuschecking jruby
09:29:41evani've got a faster machine :D
09:29:44evanyou've got!
09:29:52evani'm running fib(35) under MRI
09:29:55evanit's still going....
09:29:56slava35 fib => 0.4 seconds with generic arithmetic
09:29:59evanbeen about 20 seconds
09:30:02headiusjruby is 2s without --fast and 1.35s with --fast
09:30:11evanthere we go
09:30:15headiusslava: do I have to run duby numbers again?
09:30:24slavaheadius: duby doesn't have generic arithmetic semantics
09:30:36headiusduby knocks the pants off factor and looks nicer
09:30:38headiusso there
09:30:54slavaheadius: don't make me code recursive fib in asm :)
09:30:59evanslava: are you type splitting?
09:31:12headiusslava: that *might* beat duby by a bit
09:31:22headiusless guards
09:32:05headiusmacruby is about 0.53s on this
09:32:12evanhey guess what, you want fib(35)? it's 9227465
09:32:15evansee how fast I was?!
09:32:16headiusabout the ratio I see for them on most fixnum math
09:32:20evani'm better than all the rest.
09:32:27evanAND I tell jokes!
09:32:32headiusnow my cheer-me-up tweak
09:32:37headiusmake fib use floats
09:33:03evanheadius: i should see what else they're doing in there
09:33:10evanit should be about the same as i'm doing.
09:33:10headiusit's the recursive optz
09:33:14headiusit's not a generally useful one
09:33:16evanoh right.
09:33:18slavayou guys need to make your math ops dispatch faster
09:33:19evanthat silly thing.
09:33:26evanheadius: do this
09:33:28headiusif you make it self.fib_ruby their performance falls way off
09:33:29evanalias your fib method to
09:33:32evanfib1 and fib2
09:33:35evanand call those inside fib
09:33:43evanto defeat the tailcall opt
09:33:52evanyes, i believe it's that stupid.
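evan's suggested tweak, sketched out: recurse through aliases so the call sites no longer look like plain self-recursion, defeating the optimization. Since alias copies the method body, fib1 and fib2 recurse exactly the way fib does, so the result is unchanged.

```ruby
def fib(n)
  n < 2 ? n : fib1(n - 1) + fib2(n - 2)  # calls go through the aliases
end
alias fib1 fib
alias fib2 fib

fib(10)  # => 55
```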
09:33:53headiusjust self alone defeats it
09:34:02headiusthey don't do it for call, only for fcall/vcall
09:34:07slavawait, are you guys doing recursive fib here?
09:34:11evanwhats the performance if you throw self. in there?
09:34:13evanslava: course
09:34:15headiusbut I've done the alias things too
09:34:16evanslava: did you do iterative?
09:34:17slavathe only tail call there is the call to +
09:34:19slavaevan: no haha
09:34:23headiusmacruby's not hard to defeat on most of these
09:35:12headiusso macruby goes from 0.53s to longer than I care to wait with floats
09:35:22evanwhat about with self. ?
09:35:24slavamy fib(35) is 14930352
09:35:28slavasomething's wrong
09:35:40evanthats fucking perfect
09:35:41slavafib(34) = 9227465
09:35:54slavathat takes 0.27 seconds
09:35:59evanmaybe this algo is wrong.
09:36:16headiusok, 23s for fib(35.0) in macruby
09:36:30headiusbut both cores at solid 100% for that whole time
09:36:41headiusno idea what they're doing
09:36:41evanslava: nope
09:36:45evani'm right!
09:36:53slava0.46 for fib 35 with floats in factor
09:37:18slavaless than 2x slower than the fixnum version
09:37:24evanheadius: what was the time with self. on fixnums?
09:37:38slavaI WIN
09:37:48headiusjruby is about 4.9s with float fib 35
09:37:56headiusok, checking self now
09:38:44slavaare we going to benchmark ackermann and tak soon?
09:39:00evansure, if ya like.
09:39:14headiusok, so it was 0.53s with self-recursion optz, 23s without
09:39:16evanand after that, a rube goldberg machine simulator
09:39:17headiusso basically what float was
09:39:31headiusfloat would probably be unbearably slow without recursion optz
09:39:37evanI think I must have a GC bug
09:39:44evandoing 35.0 blew up the memory footprint
09:39:48evanso something is amiss.
09:39:49headiusI mentioned this on macruby list and I think I made an enemy out of laurent
09:40:02headiusI was just curious
09:40:14slavaheadius: just detect recursive fib in your compiler and convert it to iterative
09:40:16headiusit's a drastic change just adding self
09:40:36evanthat their optz don't really translate into application performance?
09:41:05headiusno, just that it seems like a lot of the benchmarks people are running are dependent on very specific things like self-recursion
09:41:13headiusand wondering if they have some other benchmarks that would be better
09:41:28headiusobviously fib is worthless for comparison because of things like this
09:41:37headiushow often do you have a ton of self-recursion in a ruby app?
09:42:00evanpersonally thats all I do in ruby
09:42:17evanwrite self recursive madlib generators
09:42:21evandidn't I tell you that?
09:42:27evanthats why I'm working on rubinius
09:42:30evanbetter, faster madlibs.
09:42:38headiusif you write your entire program in a single method, the whole thing could be self recursive
09:42:40headiusthen macruby would win
09:43:11headiusso I shouldn't try rbx with the float version?
09:43:15evanheadius: what if you store the results in locals
09:43:17evanand then add them
09:43:25evanso there is no call at the very end
09:43:31evandoes it realize that?
09:43:33headiusthere would be a call to add
09:43:42headiusthat's not really any different than current algorithm
09:44:02evani just mean it seems like they're spying on the direct AST form
09:44:09headiusI could try it
09:44:10evanso any small change changes things radically
09:44:17evanyeah, the float is messed up for rubinius
09:44:19evanhave to look at it
09:44:22headiusI didn't see any spying for the most part
09:44:30headiusjust the recursion optimization
09:44:39evanit's them doing it though
09:44:42evannot LLVM
09:44:44headiusand I figured out how I could do it too, but it seems like it's only useful for benchmark arm-wrestling
09:44:45evanthats not an LLVM opt
09:45:08headiusyeah, they're just looking at every call to see if it's a call to the same exact method
09:45:16headiusand then having llvm skip dispatch logic
09:45:28evanbut only if it's at the very end
09:45:31slavaif you have inline caching a self call should be fast anyway
09:45:32evanor no?
09:45:39headiusbasically call_directly if lookedup_method == current method
09:45:39evanare they still doing a call
09:45:42headiusdoesn't have to be at end
09:45:45evanah ah
09:45:50headiusit's a static dispatch if it's the same method object
09:45:52headiusthat's all
09:45:52evanso they're just bypassing lookup
09:45:56evani thought they turned it into a loop
09:46:07slavaevan: a self tail call is the same as a loop
09:46:07headiusthey still check if the lookup produces the same method object though
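The guard headius describes (and slava's inline-caching point) can be sketched at the Ruby level: cache the method resolved for a receiver's class, and when a later lookup would yield the same method object, dispatch through the cache instead of repeating the lookup. This is a model of the idea, not Rubinius's or MacRuby's actual call-site machinery.

```ruby
class InlineCache
  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass.equal?(@cached_class)
      # slow path: do the full lookup, then fill the cache
      @cached_class  = klass
      @cached_method = klass.instance_method(@name)
    end
    # fast path: dispatch via the cached method, skipping lookup
    @cached_method.bind(receiver).call(*args)
  end
end

ic = InlineCache.new(:length)
ic.call("abc")    # miss: fills the cache
ic.call("hello")  # hit: same class, cached dispatch
```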
09:46:14evani could do that
09:46:19headiusit's not hard
09:46:19evanand not have it based on any forms
09:46:25evani've got all that data from the ICs
09:46:35headiusI don't understand why they only do it for fcall and vcall myself
09:46:47headiusmaybe something about how objc dispatches
09:46:53slavawhat is fcall and vcall?
09:47:00headiusfor me it would be just another method wth a different receiver passed
09:47:02evanslava: a call without a receiver
09:47:07evanit means send it to self
09:47:13slavayou guys and your jargon :)
09:47:17headiusthey all have receivers :)
09:47:24headiusthat's why I don't understand why it's any different
09:47:30evanno programmer typed receiver
09:48:27headiusyou're right
09:48:46headiusassigning to locals and assigning separately does tank perf too
09:49:22headiusI don't understand why
09:49:45evani'm pretty sure they're turning it into a loop
09:49:58evanif they see a self call at the end
09:50:05evanthey just loop back to the top
09:50:05headiusoh wait
09:50:10headiuserg, I'm still running floats
09:50:11evani thought I saw that in there
09:50:17headiusnevermind everything I said for the past 15 minutes
09:50:27evanexpunges his memory
09:50:35headiusrough to_f
09:50:53headiusassigning to locals has no effect
09:51:00headiusthat's what I expected to see
09:51:22evanso ya think it's turning it into a direct call
09:51:26headiusremoving recursion optz puts it in the neighborhood of jruby --fast
09:51:35headiusaround 1.44 for them, 1.35 for us
09:51:46evani wonder....
09:51:57headiusscan for NODE_FCALL in roxor and you'll find it
09:52:21headiusit's basically doing: if incoming_selector == current_selector, have LLVM call the function directly instead
09:52:39headiuswhich causes llvm to skip the dispatch pipeline entirely if it's the same method
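The shortcut described above (compare the cached lookup result against the currently executing method and, on a match, call it directly) can be sketched in plain Ruby. This is a hypothetical illustration, not MacRuby's or Rubinius's actual code; `CallSite`, `dispatch`, and `Counter` are all names invented for the example.

```ruby
# Hypothetical sketch of a monomorphic inline cache with the self-call
# fast path discussed above. None of these names come from MacRuby or
# Rubinius; they exist only for illustration.
CallSite = Struct.new(:cached_class, :cached_method)

class Counter
  def tick(n)
    n + 1
  end
end

def dispatch(site, receiver, selector, current_method, *args)
  klass = receiver.class
  unless site.cached_class == klass
    # cache miss: do the full (slow) lookup, then refill the cache
    site.cached_class  = klass
    site.cached_method = klass.instance_method(selector)
  end
  if site.cached_method == current_method
    # self-recursive call: same method object, so skip dispatch entirely
    return current_method.bind(receiver).call(*args)
  end
  site.cached_method.bind(receiver).call(*args)
end
```

A JIT would emit that comparison once per call site and let the optimizer (LLVM, in MacRuby's case) fold the direct branch; here it just demonstrates the check.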
09:52:56evanok, i'm going to give myself 5 minutes
09:53:00evanand see if i could do the same
09:53:01evanjust for fun.
09:53:06headiusthat's basically the cheat I did when I got fib(30) down to 0.05, which is surprisingly enough exactly what they get for fib(30)
09:53:32headiussee I'm only insecure about this stuff when I don't know *why* someone's faster, or why they might be faster in the future
09:53:38headiusbecause i've looked at this stuff inside and out
09:54:06headiusknowing why fib is faster in macruby made me a lot more comfortable with the world :)
09:55:16headiusmy only fear right now is that I don't know how to write enough of LLVM to make jruby + JVM compete :)
09:55:25headiusbut time heals all wounds
09:55:42evanheadius: try running the float bench
09:55:52evansee if your memory goes..
09:55:53evanoh wait
09:56:01evanle sigh.
09:56:42headiusthe perils of editing in place
09:57:04headiusyeah, I wish I could get an asm dump from macruby
09:57:14headiusit would help me sleep at night
09:57:23headiusbecause I know exactly what asm is coming out of jruby
09:57:45evani wonder how wolframalpha does fib
09:58:15evanit's near instant for any number
09:58:46headiusoh, is it live?
09:58:55headiuscool, wasn't a couple days ago
09:59:01evanwent live today
09:59:35slavaevan: you take the matrix (1 1; 1 0) and raise it to a power n
09:59:39slavathen the top left entry is fib(n)
09:59:55slavathere are fast algorithms for matrix exponentiation
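slava's matrix trick can be sketched in a few lines of Ruby: `Matrix#**` with an integer exponent uses square-and-multiply, so this needs only O(log n) matrix multiplies. The `fib` helper and the F(1) = F(2) = 1 indexing are choices made for this sketch.

```ruby
require 'matrix'

# fib via fast exponentiation of [[1, 1], [1, 0]]; Ruby's Matrix#**
# uses repeated squaring, so only O(log n) multiplies are needed.
# With F(1) = F(2) = 1, the off-diagonal entry of M**n is F(n).
def fib(n)
  (Matrix[[1, 1], [1, 0]] ** n)[0, 1]
end
```

Under this indexing the top-left entry of M**n is F(n+1); slava's "top left is fib(n)" holds if you start counting from fib(0) = 1.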
10:00:06evanto get math problems done
10:00:13evani like to take matrix (42; 1; 1; 0)
10:00:19evanand raise it BY THE POWER OF GREYSKULL!
10:00:35evanthat's more fun than homework
10:01:00headiusbest line in hot fuzz
10:01:03slavayou can also do (phi^n - (1 - phi)^n)/sqrt(5)
10:01:18slavawhere phi is (1+sqrt(5))/2
10:01:25headiuswait, is this the math phi or the ssa phi
10:01:30headiusI'm all confused now
10:01:34slavabut this will have roundoff errors for large values of n
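The closed form slava quotes (Binet's formula) is easy to check in Ruby, along with the roundoff he warns about; `fib_binet` is just a name for this sketch.

```ruby
# Binet's formula: F(n) = (phi**n - (1 - phi)**n) / sqrt(5),
# where phi = (1 + sqrt(5)) / 2. Evaluated in floating point, so it
# is exact only until the doubles run out of mantissa bits.
PHI = (1 + Math.sqrt(5)) / 2

def fib_binet(n)
  ((PHI**n - (1 - PHI)**n) / Math.sqrt(5)).round
end
```

For small n this matches the exact sequence; once F(n) outgrows the 53-bit mantissa of a double, the rounded result starts to drift, which is the roundoff error mentioned above.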
10:01:47headiusmaybe we should use klingon to clarify
10:01:59evanit's more poetic in the original klingon
10:02:21headiusI guess that's actually a line from nazi germany, goebbels or something
10:02:30headius"shakespeare in the original german"
10:03:17headiuswolfram doesn't know what jruby is
10:03:35headiusand how important it is to holistic computology
10:03:42evannor rubinius
10:03:44evandon't feel bad.
10:04:31evanok, no
10:04:34evani'm not going to try to do this
10:04:50evani verified that, yes, the JIT can easily detect from the ICs it's making a self recursive call
10:04:59headiusI don't see how wolfram can ever scale
10:05:01evanit's a plumbing exercise after that
10:05:22headiusmaybe if it has a huge datacenter and caches like mad
10:05:32evani've wondered the same thing
10:06:08evanI thought maybe it would tell me
10:06:17evanwouldn't that be awesome of them
10:06:21headiusmaybe they just want to get bought by google
10:07:10headiusit doesn't know my salary
10:07:13headiuswhat good is it to me
10:07:20headiuseven I know that
10:07:48headius"charles" is on a steady decline in popularity
20:08:08evanevan is on the uptick
10:08:15evanand i guess i'm one of the older ones
10:09:33headiuswow, the curve for oliver is wacky
10:09:49headiusmy son is oliver...we did it because it was my grandfather's name
10:09:56headiusin about 2000
10:10:08headiusmaybe all the olivers were finally dead then and it became popular again
10:11:50evanthat is weird
10:11:50headiusok, late for me
18:16:22ddubsometimes I get excited while reading an IRC channel's history and forget that it isn't active conversation
18:16:44ddubluckily, in many cases (including this one) I catch myself before I type something that makes me feel stupid
18:20:09malumaluddub: you're not alone with that :)
19:11:58evanddub: something from last night you want to discuss?
19:18:47slavahi evan
19:18:53slavaI want to discuss something from last night
19:18:54slavaYOUR MOM
19:19:03evanoh? are you reading her books?
19:19:24evanlet me suggest some: x=0&y=0
19:23:30evantheres swearing and murder in them
19:23:34evanif you're into that kind of thing
19:41:06evanheadius: morning
19:41:16evanheadius: i'm poking around in the macruby code
19:41:22evanfound some pretty funny stuff
19:41:35evani'm sure it's just "we need this to work now" stuff
19:41:38evanbut funny nonetheless
19:41:48evanlike, when it looks up to call a method
19:42:00evanit looks in the ObjC method tables
19:42:26evanand then passes the result to be looked up in a STL map!
19:42:34evanto find out if it's a ruby method or an ObjC method
19:42:49evanso the more methods the system gets, the slower lookup becomes
19:42:51headiusthat's on latest experimental branch stuff?
19:43:36evantheir dispatch to method_missing is buried behind a whole bunch of code too
19:43:45evanso anything that uses it is going to be taking the slow boat to china
19:44:23headiusyeah, I guess my biggest curiosity is how they're going to continue along this path
19:44:39evanoh actually, this code is only being run when the cache is empty
19:44:40evanok, nm.
19:44:50evanso they really need the caches
19:44:58evanbecause they've had to inject a lot more work in refilling them
19:44:59headiusfor polymorphic calls performance really goes down
19:45:21headiusand basically anything that has real objects is much slower
19:46:13evanthey've got a lot of strange workarounds
19:46:28evanlike the spy on calling #new inside the dispatch function
19:46:50evaner. #class, not #new
19:48:34headiusyeah, I have been watching their updates for a while
19:49:21headiusobviously their compiler is promising, but they're really going to have to speed up all the core types
19:50:02evanI guess the upside for them is that when they find ObjC is in their way
19:50:06evanthey just schedule a meeting
19:53:45evanI guess I hope the macruby guys don't think we're picking on them.
19:56:34headiusthey do
19:56:49headiusbut there's also rampant fanboyism going on over there
19:57:11headiusexperimental branch barely runs and it's become the second coming of Ruby for a bunch of people
19:57:32evani'm still mostly ignoring that
19:57:42evani'm just interested in their approach
19:57:49evanand how they're tackling certain problems
19:58:26evanI think that that post by the macruby intern rubbed you (and I) the wrong way
19:58:34evansaying that there was no way we'd ever beat MacRuby
19:59:22headiusyeah, like I mentioned last night, the macruby perf numbers only bothered me when I didn't know how they achieved them
19:59:42headiusthey don't bother me as much now, except where they mean we'll need to do more work to actually optimize our compiler
19:59:47evanthey're the shiny spot welds in a half-finished plane
19:59:48headiuswhich will be fun anyway
20:00:17evani guess i'd hope they'd adopt the attitude you and I have
20:00:28evanwhich is a healthy rivalry :D
20:00:42headiusI guess what bugged me about rbx in the beginning, what bugged me about maglev in the beginning, what bugged me about macruby experimental in the beginning, is the early belittling of other efforts
20:00:49evancombined with camaraderie
20:00:57headiuslike "oh, they must not have known how to make things fast...I'm going to do so much better"
20:01:11headiusthat just seems insulting to me
20:01:20evanwell, i've never meant to belittle any other efforts
20:01:32headiusno, most of the time it's unintentional
20:01:47headiusunderestimating how much work will actually be necessary based on early successes
20:01:49headiusI do the same too
20:02:01evanthats part of the rivalry
20:02:12evanif we didn't think we were doing something worthwhile
20:02:15headius"I got fib running fast, I'll probably have Rails running that fast in a couple months"
20:02:16evanwe wouldn't be doing it :)
20:02:26evanyeah, wouldn't THAT be great.
20:02:43headiuscrosstwine was another new entry, but he's been really mild about it
20:02:51headiusI even tried to bait him
20:03:50evanso, should we publish our fib results?
20:03:54evansaying we're 10x faster?
20:04:10headiusyeah, I don't remember what I said, but he was pretty accepting that there was a lot more work, not all apps would optimize so well, his stuff did nothing to improve core classes and so on
20:04:34headiusI suppose it's techies failing too
20:04:52headiusthat they extrapolate from the specific to the general almost immediately
20:05:12evantechies have big imaginations
20:05:17evanand VM/compiler work is hard
20:05:38evanbig improvements to small forms of code do not scale automatically
20:05:52evanlike making fib fast != rails fast
20:06:17headiusI suppose what I've been trying to do recently is latch on to what makes jruby unique, since it's obvious it won't always be performance
20:06:36headiuslike the fact that our core classes are way faster than anyone else's, JVMs are everywhere, etc
20:06:49headiusnot particularly flashy stuff, not as flashy as a ruby-based core, for example
20:06:51evani'm sure thats good for your sanity
20:06:59evansince i'm never going to say "JVM integration!"
20:07:10evanbut then you have to deal with the groovy/scala crowds
20:07:10headiusnobody will be able to provide the full jruby story, so no matter what we'll be ok
20:07:18headiusso many battles :)
20:07:24headiusscala guys are less of an issue
20:07:31headiusmost people I know that use scala use jruby too
20:07:41headiusgroovy on the other hand seems to be an island
20:56:46ddubevan: you should just hardwire functions named 'fib' in rubinius
20:57:08ddubhave a fib opcode
20:58:16ddubmaybe just build in a lookup table for the first 4k numbers in the fib sequence :)
21:01:56brixenddub: you should just prove that SSA <-> CPS <-> fib(n) for the space of all relevant programs
21:02:16brixenthen well compose fib() and just have one opcode
21:51:12evansounds good!
21:51:22evanok, so lets pick the top 10 trivial benchmarks
21:51:29evanand reimplement them as bytecodes
21:51:56evan--ultra-benchmark mode
21:52:06brixenheh, win!
21:53:42evanwe'll have it pop up a window with the ultra benchmark assistant
21:53:51evan"It looks like you're trying to write fib(), let me help with that."
21:54:48brixenoh man, that would make a sweet screencast
21:55:06brixen"rubinius helps you write performant code..."
21:56:22evan"It looks like you're trying to sort an array, might I suggest a log(n) algorithm?"
22:11:26ddub"You appear to be trying to write a web 2.0 app, shall I help you find VC funding?"
22:13:47evanfixed the memory runaway
22:14:18boyscoutCheck interrupts at the top of every JITd function - 16524e9 - Evan Phoenix
22:14:33evanfib(35.0) in 13s
22:16:40evannow for another tweak...
22:21:16boyscoutCI: 16524e9 success. 2684 files, 10335 examples, 32880 expectations, 0 failures, 0 errors
22:21:23evanhm, darn
22:21:42evanneed to reorganize a little to add an IC to send_op_plus
22:23:18brixenok, meter is up, bbiab..
22:24:24evantime to clean up the condo a bit.