Fun With Gsub

During a recent Ruby meetup we touched on some of the finer points of Ruby's
global substitution method. That's right, gsub. You have likely used it in your
everyday Ruby code without knowing everything it can do. The most common use of
gsub is a simple pattern match and replace.

"some string".gsub(/some/, "some other")
# "some other string"

However, this is just the tip of the iceberg. Let's explore a bit and see what
else gsub can do.

Another way to use gsub is with a Ruby block.

"some string some string".gsub(/some/) do |match|
  match.upcase
end
# "SOME string SOME string"

The block gives you the chance to run more complex computation on each match
as it's passed in, and whatever the block returns becomes the replacement.
That gives you greater control than the first option.
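
As a small illustration of the kind of computation a block makes possible, here is a sketch that doubles every number in a string. The method name double_numbers is made up for this example.

```ruby
# Hypothetical helper: double every run of digits found in the text.
# The block receives each match as a String, so convert it before doing math
# and hand a String back as the replacement.
def double_numbers(text)
  text.gsub(/\d+/) { |digits| (digits.to_i * 2).to_s }
end

double_numbers("3 apples and 10 pears")
# "6 apples and 20 pears"
```

A plain replacement string could not do this, since the replacement depends on the value of each match.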

But there is a more interesting option for gsub that I only recently became
aware of. gsub can also take a hash whose keys are the matched text and whose
values are the replacements.

"some string".gsub(/string|some/, "some" => "some other", "string" => "thing")
# "some other thing"

This can be used to handle multiple matches in the same way each time. While
this is the most specialized use case for gsub, it can be handy to pull out
when the right time arises, such as for a 1337 Sp34k translator.

def translate_to_leet_speak(string)
  string.gsub(/[aetlo]/, 'l' => '1', 'e' => '3', 't' => '7', 'o' => '0', 'a' => '4')
end

translate_to_leet_speak("leet")
# 1337
translate_to_leet_speak("haxor")
# h4x0r
translate_to_leet_speak("speak")
# sp34k
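
One thing worth knowing about the hash form: the translator above works because the character class lists exactly the letters that appear as keys in the hash. If the pattern matches text that has no entry in the hash, that text is replaced with an empty string. The sketch below shows one way to keep the two in sync, by building the pattern from the hash keys with Regexp.union. The helper name substitute_with_table is invented for this example.

```ruby
# Hypothetical helper: build the pattern from the table's keys so that
# every possible match has a replacement.
def substitute_with_table(string, table)
  string.gsub(Regexp.union(table.keys), table)
end

leet = { 'l' => '1', 'e' => '3', 't' => '7', 'o' => '0', 'a' => '4' }

substitute_with_table("speak", leet)
# "sp34k"

# A pattern that matches text missing from the hash deletes that text.
# Here "k" matches but has no entry, so it disappears:
"speak".gsub(/[aek]/, leet)
# "sp34"
```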

Granted, this is not the best translator, since we are not handling the edge
cases of transforming words to their leet speak equivalents. For instance,
"hacker" should be translated to "h4x0r". In the above example we had to pass
in "haxor" instead of just "hacker". However, we could easily use gsub again to
translate known words into their alternate forms, so "elite" would become
"leet" and "hacker" would become "haxor", and then run the result through the
original gsub.

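def translate_to_leet_speak(string)
  # First pass: swap known words for their alternate spelling.
  # gsub! rewrites the string that was passed in, so the caller sees this change too.
  string.gsub!(/hacker|elite/, 'elite' => 'leet', 'hacker' => 'haxor')
  # Second pass: swap individual letters and return the result as a new string.
  string.gsub(/[aetlo]/, 'l' => '1', 'e' => '3', 't' => '7', 'o' => '0', 'a' => '4')
end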

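Notice the bang "!" on the end of the first gsub call. When called with an exclamation mark, or "bang", gsub! modifies the original string it was called on instead of returning a new one. Keep in mind that this also changes the string the caller passed in, and that gsub! returns nil when nothing was replaced.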
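
To make the difference between the two forms concrete, here is a short sketch comparing gsub and gsub! on the same string. The variable names are only for illustration.

```ruby
# dup gives us a mutable copy, so this also runs with frozen string literals.
original = "elite hacker".dup

# gsub returns a new string and leaves the receiver alone.
copy = original.gsub(/hacker/, "haxor")
# copy     is "elite haxor"
# original is still "elite hacker"

# gsub! changes the receiver in place and returns it.
result = original.gsub!(/hacker/, "haxor")
# original is now "elite haxor"

# gsub! returns nil when nothing was replaced, so be careful chaining off it.
no_match = original.gsub!(/ninja/, "n1nj4")
# no_match is nil
```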

Using the above method we can now catch known words that need to be changed to an alternative, while unknown words that contain the key letters are still handled on a basic level. This is essentially how a simple word-for-word translation program works: it takes a table of words that map to other words or series of words, then finds and replaces them. This doesn't yield the best results, though, since it completely ignores grammar and the other considerations of true translation.
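
The table-driven idea can also be written without modifying the caller's string, by chaining two plain gsub calls. This is just one possible sketch, not the only way to do it. The method name translate_to_leet and the local table names are made up here, and the tables hold only the entries used in this post.

```ruby
# Hypothetical variant of the translator that never mutates its argument.
def translate_to_leet(string)
  words   = { 'elite' => 'leet', 'hacker' => 'haxor' }
  letters = { 'l' => '1', 'e' => '3', 't' => '7', 'o' => '0', 'a' => '4' }

  string
    .gsub(Regexp.union(words.keys), words)      # known words first
    .gsub(Regexp.union(letters.keys), letters)  # then single letters
end

translate_to_leet("elite hacker")
# "1337 h4x0r"
translate_to_leet("speak")
# "sp34k"
```

Because only gsub is used, this version also works when it is handed a frozen string.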

Happy hacking with gsub.