| 00:00:01 | brixen | yeah |
| 00:00:12 | evan | but that's WHEN there is a separator |
| 00:00:17 | brixen | yes |
| 00:00:18 | evan | this is the split on characters path |
| 00:00:28 | brixen | that's what I mean, doesn't make sense for "" |
| 00:02:07 | brixen | heh, the behavior when the limit parameter is 1 is even more wtf |
| 00:02:21 | brixen | "1,2,,3,4,,".split(',', 1) # => ["1,2,,3,4,,"] |
| 00:02:51 | brixen | I mean, seriously? |
| 00:03:11 | brixen | split and give me at most 1 field == the whole string unsplit? |
| 00:03:56 | brixen | haha, oh man |
| 00:04:29 | brixen | a code translator that converts stuff like Array(obj) for obj.kind_of? String to obj.split(",", 1) |
| 00:05:05 | evan | hehe |
| 00:05:12 | evan | yes, it's stupid. |
| 00:05:22 | evan | there are a million dumb edge cases in String#split that should not be there. |
| 00:05:32 | maharg | ? that's entirely sensible. It follows exactly from the meaning of that argument. You want one field, which leaves nothing to split. What would you expect? |
| 00:05:55 | brixen | maharg: I would expect at least the behavior of passing 2 |
| 00:06:06 | brixen | I already have the whole string |
| 00:06:14 | brixen | why do I want to ask to split it and get the whole string back |
| 00:06:19 | brixen | that makes no sense |
| 00:06:30 | evan | 1 is ok |
| 00:06:33 | evan | but what about limit == 0 |
| 00:06:35 | evan | or limit == -1 |
| 00:06:38 | evan | wtf do those mean? |
| 00:07:47 | maharg | if I passed 1 in there and got out two elements to the array, I would not consider that expected behaviour |
| 00:07:49 | brixen | maharg: really what I'd expect from obj.split(sep, n) is obj.split(sep).first(n) |
| 00:08:03 | brixen | maharg: why split then? |
| 00:08:09 | brixen | you already have a string |
| 00:08:49 | maharg | the point of the N>0 argument is so you can have it stop splitting after N arguments. So you can have a CSV with 5 columns and not have to escape ,s in the last column |
| 00:09:00 | brixen | if sep is delimiting fields, and " If limit is a posi- |
| 00:09:02 | brixen | tive number, at most that number of fields will be returned" |
| 00:09:08 | brixen | from pickaxe |
| 00:09:12 | brixen | I'd expect a field |
| 00:09:18 | brixen | not the whole string |
| 00:09:24 | maharg | "blah,blah,blah".split(",", 2) => ["blah", "blah,blah"] |
| 00:09:45 | maharg | without that argument, you can't get that behaviour. The behaviour you want you can get exactly as you showed |
| 00:09:53 | maharg | and it is useful behaviour |
| 00:10:00 | brixen | really? |
| 00:10:07 | brixen | show me a use for 1 |
| 00:11:08 | maharg | ergh. It doesn't matter that there isn't a very good use for 1, the output of 1 is just the obvious result of the behaviour of that argument. You're suggesting it be yet another stupid special case like -N is. |
| 00:11:23 | binary42 | brixen: I see it more so as a case where a variable is passed in so you don't need an extra if somewhere. |
| 00:11:50 | brixen | binary42: but you are asking to split into fields and limit the # of fields returned |
| 00:11:53 | maharg | it returning an array of 2 when you ask it for an array of 1 would be wrong |
| 00:11:58 | brixen | you get no "fields" if you pass 1 |
| 00:12:19 | maharg | I just can't see any way to agree with you that it should do anything else, even if it is pointless. It FOLLOWS |
| 00:12:35 | brixen | follows what? |
| 00:13:02 | brixen | I'm splitting for fields, I have no reason to split if I just want the string |
| 00:13:03 | binary42 | brixen: I'm not sure what you mean by fields. |
| 00:13:20 | binary42 | It says there are 2 segments. CSV is just one use of split. |
| 00:13:42 | brixen | I'm not saying this is limited to CSV |
| 00:13:50 | brixen | why would I split a string? |
| 00:13:56 | brixen | that's the crux of it |
| 00:14:01 | brixen | I'm calling a method |
| 00:14:04 | brixen | why would I do that? |
| 00:14:07 | maharg | "blah,blah,blah".split(",", 3) => ["blah","blah","blah"]; "blah,blah,blah".split(",", 2) => ["blah", "blah,blah"]; "blah,blah,blah".split(",", 1) => ["blah,blah,blah"] |
| 00:14:41 | brixen | maharg: unconvinced |
| 00:14:48 | evan | ANYWAY |
| 00:14:56 | evan | 1 is hardly the weirdest part of String#split |
| 00:15:01 | binary42 | brixen: I haven't used it so feel free to pretend it isn't there. |
| 00:15:25 | brixen | binary42: unfortunately, I'm not free to pretend any of MRI weirdness is not there ;) |
| 00:15:43 | brixen | were it that I could, this sunny day would be sunny |
| 00:15:49 | binary42 | Oh I know. I was being sarcastic about the rant. |
| 00:15:52 | brixen | heh |
| 00:16:08 | evan | brixen: just curious, would you rather 1 raise an ArgumentError then? |
| 00:16:15 | brixen | evan: nope |
| 00:16:22 | brixen | I'd expect the result of passing 2 |
| 00:16:30 | brixen | I want at most 1 field |
| 00:16:35 | evan | that would be weird. |
| 00:16:36 | brixen | I have no reason to split otherwise |
| 00:16:43 | brixen | *shrug* |
| 00:16:44 | evan | we can agree to disagree on a non-issue :) |
| 00:16:49 | brixen | indeed |
| 00:17:06 | brixen | I'll just be sure to use obj.split("", 1) when I mean Array(obj) |
| 00:17:12 | evan | ok! |
| 00:17:23 | evan | i just want split(//) to work on utf8 strings :/ |
| 00:17:30 | evan | that btw, is what i'm trying to fix. |
| 00:17:35 | brixen | ahh yes |
| 00:17:59 | evan | i've added String#find_character(offset) that respects kcode |
| 00:18:01 | binary42 | brixen: So 0 is the whole string and the result is the tail? Just start counting from 1 instead. |
| 00:18:22 | evan | just trying to figure out how to get it into split |
| 00:18:35 | brixen | binary42: 0 is unlimited |
| 00:19:10 | brixen | the fact that pickaxe goes to parenthetical special explanations for the case of 1 I will take as confirmation the behavior is weird |
| 00:19:12 | binary42 | brixen: I know, just illustrating why 1 is good. |
| 00:19:20 | brixen | heh |
| 00:19:22 | brixen | anyway |
| 00:19:28 | binary42 | [*a] is faster anyway... we all know how much evan likes that. |
| 00:19:35 | evan | you know I do. |
| 00:19:44 | evan | i have a [*a] pillow |
| 00:19:48 | evan | i sleep on every night |
| 00:30:42 | maharg | it's possible there could be a better meaning for the argument as a whole (should perhaps be the number of splits to perform, with 1 being the lowest possible), but arbitrarily renumbering 1 to 2 and leaving 2... as their current meanings is just adding an insane edge case imo. I'll leave it at that. |
| 00:31:51 | brixen | "leaving 2... as their current meanings" huh? |
| 00:32:01 | brixen | I'm just saying limit should be the limit of the fields |
| 00:32:09 | brixen | because I have no reason to split otherwise |
| 00:33:05 | evan | treating 1 like 2 is weirder |
| 00:33:06 | evan | imho |
| 00:33:20 | maharg | it's the limit on the number of times it splits (+1), not the limit on the number of fields it returns. |
| 00:33:38 | brixen | at 1 it splits 0 times |
| 00:33:42 | brixen | that makes no sense |
| 00:34:02 | brixen | I'm calling a method but I want to specify it perform no action |
| 00:34:03 | brixen | huh? |
| 00:34:23 | evan | hey! i got split(//) to work on utf8 characters! |
| 00:34:27 | evan | can we stop talking about split now? |
| 00:34:28 | evan | :D |
| 00:34:28 | brixen | sweet! |
| 00:38:42 | evan | i love specs that test 3*3*4*2 things in one it block! |
| 00:38:45 | evan | *eyeroll* |
| 00:41:17 | brixen | me too |
| 00:41:47 | evan | FUUUCK |
| 00:41:59 | evan | also ones that loop over something AND DON'T USE IT |
| 00:42:21 | brixen | which spec? |
| 00:42:39 | evan | string/split_spec.rb |
| 00:42:41 | evan | at the bottom |
| 00:42:51 | evan | it "taints...." |
| 00:42:56 | evan | the limit isn't even used |
| 00:43:22 | brixen | nice |
| 00:44:46 | evan | wait wait wait |
| 00:44:47 | evan | WTF |
| 00:44:58 | evan | i've actually read this spec completely wrong |
| 00:45:11 | evan | the 2nd loop checks that a tainted Regexp DOESN'T taint the output |
| 00:45:15 | evan | ARG |
| 00:46:18 | evan | way to go! add a complete inversion check to an it block for the positive check! |
| 00:46:18 | brixen | which is so clearly described by "taints the resulting strings if self is tainted" |
| 00:46:24 | evan | just to confuse the fuck out of me! |
| 00:46:59 | brixen | welcomes evan to his painful alice-in-wonderland-ish world |
| 00:47:17 | evan | i'm going to bring a moose back from Canada |
| 00:47:23 | evan | to threaten people with. |
| 00:52:47 | evan | woop. |
| 00:52:51 | evan | passes. |
| 00:56:11 | evan | hm, other spec failures... |
| 01:05:58 | evan | oh weird, ok, back on track! |
| 01:06:45 | evan | there we goes! |
| 01:08:03 | evan | hm, need to fix the globals hooks |
| 01:28:33 | evan | continues to struggle against $KCODE |
| 02:16:00 | evan | ok |
| 02:16:04 | evan | 2 more KCODE aware methods done |
| 02:16:08 | evan | there are only like 3 more |
| 02:16:17 | evan | gotta love that it's so spotty |
| 02:16:18 | evan | oh well. |
| 02:26:54 | evan | brixen: in |
| 02:27:02 | evan | string[/blah/] = "foo" |
| 02:27:09 | evan | what should I call "foo" in an it block? |
| 02:27:10 | evan | rhs? |
| 04:05:47 | brixen | evan: sure, rhs or 'value' |
| 04:05:56 | brixen | what's the whole description string? |
| 18:20:49 | boyscout | Add ByteArray#utf8_char and use it to implement String#unpack - 3f50099 - Evan Phoenix |
| 18:20:50 | boyscout | Add additional String#split spec - 79b85e6 - Evan Phoenix |
| 18:20:50 | boyscout | Fix String#split(//) via String#find_character - 83bba61 - Evan Phoenix |
| 18:20:50 | boyscout | Add spec for String#gsub + $KCODE - a9c2b16 - Evan Phoenix |
| 18:20:50 | boyscout | Add spec for String#scan + $KCODE - 0ffc7ea - Evan Phoenix |
| 18:20:50 | boyscout | Teach String#gsub and String#scan about $KCODE - bc6cf71 - Evan Phoenix |
| 18:20:51 | boyscout | Add String#[]= with Regexp specs - b2c7ca4 - Evan Phoenix |
| 18:20:51 | boyscout | Fix typo - db94b58 - Evan Phoenix |
| 18:20:52 | boyscout | Add proper unicode support to Regexp - 72341f2 - Evan Phoenix |
| 18:29:44 | boyscout | CI: rubinius: 72341f2 successful: 3037 files, 11914 examples, 36155 expectations, 0 failures, 0 errors |
| 19:14:21 | kronos_vano | evan, why setting $KCODE='u' breaks String#inspect? Is it known issue? |
| 19:14:35 | evan | i'm not aware of that |
| 19:14:37 | evan | example? |
| 19:15:07 | kronos_vano | $KCODE='u'; "\xe3\x81\x82".inspect |
| 19:15:35 | kronos_vano | output should be "\"\\343\\201\\202\"" |
| 19:15:58 | kronos_vano | but I see only spaces and " |
| 19:16:29 | kronos_vano | or just "\xe3\x81\x82" without inspect |
| 19:17:32 | evan | ok, i have to head out |
| 19:17:36 | evan | i'll take a look and fix that. |
| 19:17:42 | kronos_vano | k |
| 19:19:07 | kronos_vano | Now in rubinius it works like 1.9 |
| 19:20:29 | evan | actually |
| 19:20:31 | evan | works fine for me |
| 19:20:40 | evan | I get the same result on rbx as 1.8 |
| 19:20:58 | evan | with $KCODE = "U" |
| 19:20:59 | evan | i get |
| 19:21:11 | evan | >> str.inspect |
| 19:21:11 | evan | => "\"あ\"" |
| 19:21:13 | evan | on both |
| 19:21:35 | evan | one unicode character with quotes around it. |
| 19:22:07 | kronos_vano | Something wrong with my terminal because if I copy symbols from console to Xchat: I got: "あ" |
| 19:22:30 | evan | thats correct |
| 19:22:45 | evan | you should get quotes around a single japanese character |
| 19:23:38 | evan | thats what I get on 1.8 |
| 19:23:42 | evan | and rbx |
| 19:23:55 | evan | ok, packing up the laptop. |
| 19:23:57 | evan | later |
| 22:09:11 | boyscout | update spec for expand_path(a, b): fixes for 'b' being a relative path - 4e1605c - Sylvain Joyeux |
| 22:09:11 | boyscout | fix File.expand_path(relative_path, relative_path) - 6bff64f - Sylvain Joyeux |
| 22:14:26 | boyscout | CI: rubinius: 6bff64f successful: 3037 files, 11914 examples, 36156 expectations, 0 failures, 0 errors |
| 23:02:43 | kronos_vano | a simple optimization speeds up Array#inspect 2x: http://gist.github.com/308179 |
| 23:06:24 | BrianRice-work | good point; avoiding multiple string allocations |
| 23:07:34 | Zoxc | doesn't it do more string allocations or is the diff inverted? |
| 23:10:49 | kronos_vano | Zoxc, no it isn't. calling << multiple times is faster than creating another big array and then joining its elements. |
| 23:11:36 | Zoxc | not on MRI =P |
| 23:12:11 | Zoxc | creating an array rapes string concatenation |
| 23:27:05 | maharg | array#join already optimizes that way, so it shouldn't really be that much of an improvement. In fact, you just made inspect look pretty much like a duplicate of join. Except without tainting. |
| 23:29:26 | maharg | which it probably should do (and was getting 'free' from using join before) |
| 23:31:12 | maharg | the interpolation syntax ("[#{blah}]") and tainting might have been what really gave you the speedup. It's possible join is too aggressive in tainting, as I'd assume adding something tainted to an array would taint the array but it's checking on every element |
| 23:31:50 | maharg | nope, guess not |
| 23:33:48 | maharg | actually, array#join should probably end up tainting in the out.append anyways, so maybe doesn't need it either? |
| 23:42:21 | Zoxc | my guess is that Array#join doesn't do a single allocation |
| 23:58:42 | maharg | don't need to guess, line 834 of kernel/common/array.rb :) |
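
For reference, the `String#split` limit behaviour argued over earlier in the log can be sketched in a few lines of Ruby. This is only an illustration of what the participants describe: the variable names and the `first_n` lambda are hypothetical and not taken from Rubinius or the spec suite. It contrasts maharg's reading (limit caps the number of pieces, so a limit of 1 performs no split) with brixen's expectation (`obj.split(sep).first(n)`).

```ruby
# Sketch of String#split's limit argument, using the string from the log.
csv = "1,2,,3,4,,"

no_limit   = csv.split(",")      # trailing empty fields are dropped
limit_zero = csv.split(",", 0)   # 0 behaves like no limit ("unlimited")
limit_one  = csv.split(",", 1)   # at most 1 piece: nothing is split
limit_two  = csv.split(",", 2)   # at most 2 pieces: the remainder stays intact
limit_neg  = csv.split(",", -1)  # negative: no limit, trailing empties kept

# Hypothetical helper for what brixen says he would expect instead:
# split fully, then keep only the first n fields.
first_n = ->(str, sep, n) { str.split(sep).first(n) }

p no_limit                 # => ["1", "2", "", "3", "4"]
p limit_zero               # => ["1", "2", "", "3", "4"]
p limit_one                # => ["1,2,,3,4,,"]
p limit_two                # => ["1", "2,,3,4,,"]
p limit_neg                # => ["1", "2", "", "3", "4", "", ""]
p first_n.call(csv, ",", 1) # => ["1"]
```

The `limit_two` case is the use maharg points to: a limit lets the last column contain unescaped separators, which `split(sep).first(n)` cannot reproduce because it discards the remainder.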