Carl Wiedemann

Kaizen: Ruby

2024-05-14

Over the past 3 years, I’ve been using Ruby full time at my day job. At first I wasn’t fond of it. But I’ve come to appreciate it more.

For certain use cases (e.g. scripting), I find it superb. For others (e.g. large-scale applications), I find it less than superb.

Even after learning the basics, the language continues to surprise me with things I didn’t know. I’m documenting some of those here, both as a reminder for myself and for others who are learning the language.

This post will be continuously updated.

#itself (docs)

Calling #itself on any object simply returns…itself 🥁.

a = 123
a == a.itself

Why on earth is this useful? Sometimes there are instances where blocks should simply return the sole argument passed to the block:

a = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3]

# Convert to a hash, keyed by unique array values:
g = a.group_by { |v| v }
g == { 1 => [1, 1, 1], 2 => [2, 2, 2, 2], 3 => [3, 3, 3, 3, 3] }

# Alternative, using numbered parameter shorthand:
g = a.group_by { _1 }
g == { 1 => [1, 1, 1], 2 => [2, 2, 2, 2], 3 => [3, 3, 3, 3, 3] }

Using the #to_proc shorthand (&) with #itself is perhaps more explicit:

g = a.group_by(&:itself)
g == { 1 => [1, 1, 1], 2 => [2, 2, 2, 2], 3 => [3, 3, 3, 3, 3] }

(Sidenote: For this particular example, you might be more interested in Enumerable#tally)
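To make the sidenote concrete, here is a small sketch (the counts variable is only for illustration, and Enumerable#tally assumes Ruby 2.7 or newer). If all you want is how often each value occurs, #tally gets there in one call, whereas group_by(&:itself) needs a second step:

```ruby
a = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3]

# group_by keeps every element, so counting needs a second pass:
counts = a.group_by(&:itself).transform_values(&:size)
counts == { 1 => 3, 2 => 4, 3 => 5 }

# tally produces the same counts directly:
a.tally == { 1 => 3, 2 => 4, 3 => 5 }
```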

Single character strings (docs)

Denote a single character string by prefixing a character with ?:

?a == 'a'

Should you use this? Hard to say. I suppose it saves some keystrokes, but it can be a little tricky to read, especially if you see something like foo.split(?|)
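A quick sketch of what is going on in that split call (the sample string is made up for illustration): the character literal is just an ordinary one-character String, so ?| behaves exactly like '|'.

```ruby
# A character literal is a regular String, not a special character type:
?a.class == String
?a.length == 1

# Splitting on a literal pipe; ?| and '|' are interchangeable here:
"red|green|blue".split(?|) == ["red", "green", "blue"]
"red|green|blue".split('|') == ["red", "green", "blue"]
```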

Splat * destructuring assignment (docs)

The splat can be used in an assignment to collect multiple values into an array:

a, *b = [1, 2, 3, 4]
a == 1
b == [2, 3, 4]

*c, d = [5, 6, 7, 8]
c == [5, 6, 7]
d == 8
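As a further sketch (variable names are only illustrative), the splat can also sit in the middle of the assignment, and it collects an empty array when there is nothing left over:

```ruby
# The splat takes whatever the named variables on either side do not claim:
first, *middle, last = [1, 2, 3, 4, 5]
first == 1
middle == [2, 3, 4]
last == 5

# With nothing left to collect, the splat variable is an empty array:
head, *tail = [1]
head == 1
tail == []
```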

Splat * on Range (stackoverflow)

Splatting a range converts it to an array; it is basically a shorthand for #to_a:

a = *1..10

a == [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

This saves a few keystrokes compared to (1..10).to_a.
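Where this seems most handy is inside an array literal, where the range can be mixed with other elements. A small sketch:

```ruby
# Splat a range in between other elements:
[0, *1..3, 4] == [0, 1, 2, 3, 4]

# Works for any range that can be enumerated, such as strings:
[*'a'..'e'] == ["a", "b", "c", "d", "e"]
```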

Passing Range to Array#[] (docs)

Array#[] can accept a range ending with -1 to match through the end of the array:

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

a[5..-1] == [6, 7, 8, 9, 10]

Interestingly enough, using an exclusive range (three dots) will stop before the last element (probably not useful):

a[5...-1] == [6, 7, 8, 9]

See doc link above for more interesting examples using Enumerator::ArithmeticSequence.
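For a taste of those, here is a minimal sketch. It assumes Ruby 3.0 or newer for the arithmetic sequence indexing, and Ruby 2.6 or newer for the endless range syntax:

```ruby
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Every second element, starting at index 0:
a[(0..).step(2)] == [1, 3, 5, 7, 9]

# Every third element, starting at index 1:
a[(1..).step(3)] == [2, 5, 8]

# An endless range is an alternative spelling of 5..-1:
a[5..] == [6, 7, 8, 9, 10]
```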

Passing Regexp to String#[] (docs)

String#[] can accept a Regexp, returning the first match as a string, or nil if there is no match:

"asdf12345"[/\d/] == "1"
"asdf12345"[/foo/] == nil
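String#[] also takes an optional second argument that selects a capture group, by number or by name. A short sketch using the same string (the group name digits is made up for illustration):

```ruby
s = "asdf12345"

# The whole first match, here a run of digits:
s[/\d+/] == "12345"

# A numbered capture group:
s[/([a-z]+)(\d+)/, 1] == "asdf"
s[/([a-z]+)(\d+)/, 2] == "12345"

# A named capture group:
s[/(?<digits>\d+)/, "digits"] == "12345"
```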

Different ways to call lambdas/procs (docs)

Lambdas/Procs can be called using #call, .() (shorthand for #call), #[], and #yield:

# With arguments...
f = ->(name) { "Hi #{name}" }

f.call("John")  == "Hi John"
f.("John")      == "Hi John"
f["John"]       == "Hi John"
f.yield("John") == "Hi John"

# Without arguments...
g = -> { "Goodbye" }

g.call  == "Goodbye"
g.()    == "Goodbye"
g[]     == "Goodbye"
g.yield == "Goodbye"

What’s best? I think I prefer the explicitness of #call; the other syntaxes seem really unnatural to me (especially #[]). Also, yield is something that I prefer to reserve for methods that receive blocks.

The DATA constant (docs)

If the line __END__ appears in a Ruby file, all subsequent lines are treated as data rather than code, and the code above can read them through DATA, which behaves like an open file:

# data.rb
DATA.each do |line|
  puts "Line: #{line.chomp}"
end
__END__
xray
yankee
zulu

Running:

❯ ruby data.rb
Line: xray
Line: yankee
Line: zulu

What could this be used for? Perhaps coding exercises that must parse some input: you no longer have to save the input in a separate file and read it with File.readlines.

The BEGIN and END keywords (docs)

They can be given blocks that run before and after all other code in the script. The block syntax must use {...} instead of do...end.

# begin_end.rb
puts "Today is #{Time.now.strftime('%F')}"

BEGIN { puts "hai" }
END { puts "bai" }

Running:

❯ ruby begin_end.rb
hai
Today is 2024-05-14
bai

Variables assigned inside a BEGIN block are visible to the top-level code in the rest of the file, but the BEGIN block must appear before they are used. Here is the same script using a variable to store the date string:

# begin_end.rb
BEGIN {
  now_str = Time.now.strftime('%F')
  puts "hai"
}

END { puts "bai" }

# In order to use the variable, the `BEGIN` block must exist on an earlier line.
puts "Today is #{now_str}"

The variables are only available within the file that defines them; they are not available in other files that are require‘d etc.

Is this useful? Maybe for debugging or doing some sort of setup & teardown. (I have learned that these were influenced by Perl.)

Always be ~~closing~~ learning

Hopefully you have learned some things as well. Check back on this post for future updates.