Homework 3: The grep program

Homework is important practice for you, but it’s not graded so you don’t have to submit it.

Overview

The third homework assignment is all about a (rather simplistic) clone of the UNIX grep command. We hacked a first version of this in lecture and the code is posted on Piazza for you to start from. Your job is to “complete” the program by addressing its shortcomings regarding memory management. (You may want to take a few minutes to read up on the real grep command as well; it’s a very useful tool.)

The Program

The grep program takes one string as an argument and reads from standard input, one line at a time. For each line, it checks whether the argument string occurs in the line: if it does, the line is printed to standard output; if not, the line is skipped. Here’s an example:

$ ./grep phf
A line without.
A line with phf.
A line with phf.
Amazing things, right?
phf?
phf?
Indeed.

I started the program with the string phf as the “needle” in the line-sized “haystack” as it were. The first line typed into the program disappears, but the second line is printed back since it contains phf as a substring. And so on until I hit CTRL-D to signal end-of-file.
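The behavior described above can be sketched in a few lines of C. This is not the lecture code from Piazza, just a minimal illustration of the idea; the function name filter_lines and the buffer size BUF_SIZE are made up for this example. It reads with fgets, searches with strstr, and has exactly the flaw this homework asks you to fix: a line longer than the buffer is silently treated as several shorter lines.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 1024 /* hypothetical size, not taken from the lecture code */

/* Copy every line of `in` that contains `needle` to `out`.
 * Returns the number of lines (or line pieces) that were printed.
 * Flaw kept on purpose: a line longer than BUF_SIZE - 1 characters is
 * read in several pieces, and each piece is matched separately. */
int filter_lines(FILE *in, FILE *out, const char *needle)
{
    char line[BUF_SIZE];
    int printed = 0;

    /* fgets returns NULL on end-of-file (or on a read error). */
    while (fgets(line, sizeof line, in) != NULL) {
        /* fgets keeps the linefeed, so fputs needs no extra newline. */
        if (strstr(line, needle) != NULL) {
            fputs(line, out);
            printed++;
        }
    }
    return printed;
}
```

A main function would presumably just check argc and call something like filter_lines(stdin, stdout, argv[1]).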

The Issue

The grep we wrote in lecture deals with end-of-file and with linefeed characters, but it doesn’t address what should happen if the buffer we have is not large enough to hold the line. You’re supposed to fix that. Note that there’s no “right” way to fix it for now, just better and worse ways. Completely ignoring the problem results in an incorrect program; that’s the worst option, and it’s what you’re starting from. Terminating with an error when a line is too long is better, but it makes grep completely useless even when only one line out of thousands is too long. Warning about the issue and skipping a line that’s too long might be better. Trying to still find “as much as possible” (while of course also warning) is probably best. Have fun figuring it out.
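To make the “warn and skip” option concrete, here is one possible sketch, assuming the program reads with fgets. If fgets returns a chunk that does not end in a linefeed and more input follows, the line did not fit. All names here (read_line, grep_skip_long, line_status, BUF_SIZE) are hypothetical, and the buffer is tiny on purpose so the problem is easy to trigger. Treat it as a starting point rather than the expected solution; the “find as much as possible” variant is left to you.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 16 /* hypothetical and deliberately tiny, for demonstration */

enum line_status { LINE_OK, LINE_TOO_LONG, LINE_EOF };

/* Read one line into buf, which holds `size` bytes.
 * A line counts as too long if it does not fit together with its linefeed.
 * In that case the rest of the line is consumed and discarded, so the
 * next call starts at the beginning of the following line. */
enum line_status read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return LINE_EOF;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n')
        return LINE_OK; /* complete line, linefeed included */

    /* No linefeed: either the last line of the input lacks one,
     * or the line was cut off because the buffer is full. */
    int c = getc(in);
    if (c == EOF)
        return LINE_OK; /* final line without a trailing linefeed */

    while (c != '\n' && c != EOF)
        c = getc(in); /* throw away the remainder of the long line */
    return LINE_TOO_LONG;
}

/* Print matching lines to `out`, warn on `err` about every line that
 * was too long, and return how many lines were skipped. */
int grep_skip_long(FILE *in, FILE *out, FILE *err, const char *needle)
{
    char line[BUF_SIZE];
    unsigned long lineno = 0;
    int skipped = 0;
    enum line_status st;

    while ((st = read_line(in, line, sizeof line)) != LINE_EOF) {
        lineno++;
        if (st == LINE_TOO_LONG) {
            fprintf(err, "grep: line %lu too long, skipped\n", lineno);
            skipped++;
            continue;
        }
        if (strstr(line, needle) != NULL)
            fputs(line, out);
    }
    return skipped;
}
```

One design choice worth considering: a main function could call grep_skip_long(stdin, stdout, stderr, argv[1]) and return a non-zero exit status whenever the result is greater than zero, so that a test script can tell something went wrong.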

Hints

Deliverables

At minimum, you should have a fixed grep.c with your actual program code and a Makefile that builds the grep program with the correct compiler options. The program should compile without warnings.

Things to look into…

Ideally you’d also have test cases, a test script, a clean target and a test target, etc. Note that in order to test correct error behavior, you’ll have to extend the test framework a little: there must be test cases that trigger problems and your script must be able to tell (exit status!) that grep failed. You can redirect stderr to a file as well to check error messages. Furthermore you should use valgrind as part of the test script, which could produce its own errors. Things are about to get a little more complicated for testing…