Crystal Grep

Today, we’re going to have some more command line fun with Crystal. Along the way, we’re also going to cover Regex, IO and other subjects.

This tutorial was influenced by my friend 8bitmiker and his recent YouTube video, where he shows you how to create a similar program in Ruby. The final version in Crystal can be found here.

To start with, we’re going to make sure that the program is called with an argument. Without any argument, there’s nothing for this program to do. You may remember from my previous blog post that Crystal uses the same ARGV array constant for command line arguments as Ruby. This, and other similarities allows us to use the same line as used in the Ruby version of this program.

abort "Need a regex pattern" unless ARGV.size > 0

The abort command does what you’d expect it to do, it exits the program immediately.

While Ruby has Array methods called length, size and count, Crystal has a non-alias philosophy, and in this case, has chosen size to return the number of elements in an array.

Combine these two with the in-line unless (another Ruby beauty), and the program will exit on the first line if the user didn’t give it any command line arguments.

Next, we’re going to take that Regex from the ARGV array, and store it. That way, we don’t have to keep accessing ARGV, and also make our code more readable. Using another Ruby inspired method, shift, will do the trick:

regex = ARGV.shift

Just like in Ruby, shift removes the first element in an array, and returns it. In this case, passing it into the aptly named regex variable. Now, it’s important to note, ARGV is an array of strings. The regex variable therefore, contains a string of the regex pattern that the user entered. Later, we’ll convert that string to a Regex object.

Now that we have our second line, we can show just how important that first line is. If the user doesn’t enter any command line arguments, ARGV will be an empty array, and trying to run shift on an empty array will result in an Index out of bounds error. Hence, the need to abort the program if there’s no arguments provided. Remember, Crystal may look like Ruby, but it’s not Ruby.

Let’s run a quick test of the program at this point. For a program like this, it’s best to compile and run the compiled version. I saved the file as cgrep.cr, so we can compile it with the command:

$ crystal build cgrep.cr

Now we run it with:

$ ./cgrep

This will of course return the message Need a regex pattern since we didn’t give it a regex pattern. To run it properly, you need to not only give it the regex pattern, but also give it some input to process.

$ ls -l | ./cgrep .*

If you’re not incredibly familiar with Command line interfaces, the pipe(|) operator takes the output from the left side, and inputs it to the standard input of the command on the right side. So this line takes the output from ls -l, and sends it to the standard input of our Crystal program.

Unlike the original Ruby program from 8bitmiker, Crystal doesn’t possess the $_ operator, so we’ll access our standard input using the built-in STDIN variable. STDIN is an IO::FileDescriptor object, so you can use any methods available to this class on your input. Since ls -l returns multiple lines, we’re going to use the popular each_line method, which iterates over each line in the input.

STDIN.each_line do |i|
end

And there’s our main loop. It will iterate over each line of the input, and pass that line into the i variable for access within the loop. Now, what we do we want to do with that line? Well, we want to compare it to the regex. To do that, we need to turn the string representation of our regex into a proper regex using interpolation:

%r(#{regex})

We also have the familiar match operator (=~) to see if the regex matches the line from standard input, and output if we have a match, like so:

puts i if i =~ %r(#{regex})

Placing that into the loop, and we have our finished program:

abort "Need a regex pattern" unless ARGV.size > 0
regex = ARGV.shift

STDIN.each_line do |i|
  puts i if i =~ %r(#{regex})
end

Running it with the same command as before:

$ ls -l | ./cgrep .*

Will produce an output similar to this:

total 2048
-rwxr-xr-x 1 chrislarsen staff 1037464 27 Dec 14:36 cgrep
-rw-r--r-- 1 chrislarsen staff 128 4 Jan 14:20 cgrep.cr

This is definitely not a production ready program. Ideally we’d have a check to make sure that the regex string is a regex pattern, and we’d interpolate that into a Regex object before the loop instead of wasting resources doing that conversion in the loop, but it does work as desired. It’s what 8bitmiker would call “sloppy and dirty.”

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s