This assignment is mostly a warmup exercise that gives you a chance to review your C programming skills by writing two simple programs. You’ll also learn something useful about what’s actually in a file, which is going to be very handy for Assignment 3.
Both programs are very easy to test because you can use existing UNIX tools to compare your output against. If you have trouble with this assignment, you’ll likely have even more trouble with future assignments.
This problem asks you to implement a simplified version of the UNIX tool
wc, the so-called “word count” program.
Start by reading the manual page for
wc to remind yourself what the
program does. Note, in particular, the definition of “word” used there:
A word is a non-zero-length sequence of characters delimited by white space.
Obviously the notion of “white space” is important for this program, but
we’re in luck: there’s already a function for checking whether a
character counts as white space.
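Assuming the function meant here is the standard isspace from <ctype.h> (an assumption, since the text does not name it), a minimal sketch of calling it safely could look like this. The helper name is_white_space is made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if c counts as white space, 0 otherwise.
 * Assumes isspace() is the function the text has in mind. */
int is_white_space(int c)
{
    /* isspace() is only defined for EOF and for values representable as
     * unsigned char, which is exactly the range getchar() returns. */
    assert(c == EOF || (c >= 0 && c <= UCHAR_MAX));
    return isspace(c) != 0;
}
```

In the default "C" locale, isspace reports space, tab, newline, vertical tab, form feed, and carriage return as white space.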
So here’s how your version of
wc should work:
Read input character-by-character from standard input;
maintain three counters,
one for the total number of characters,
one for the total number of words,
one for the total number of lines;
when EOF is encountered, print the number of lines, words, and characters
to standard output in the format detailed below and end the program.
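The steps above can be sketched roughly as follows. This is only one possible shape, not the reference solution: the names struct counts, count_stream, and print_counts are invented for illustration, and the sketch assumes that isspace decides what white space is and that a line is counted for every line feed character.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical container for the three counters. */
struct counts {
    unsigned long lines;
    unsigned long words;
    unsigned long chars;
};

/* Reads `in` character by character until EOF and returns the totals. */
struct counts count_stream(FILE *in)
{
    struct counts n = {0, 0, 0};
    int in_word = 0; /* 1 while we are inside a word */
    int c;           /* int, not char, so EOF can be told apart */

    assert(in != NULL);
    while ((c = getc(in)) != EOF) {
        n.chars++;
        if (c == '\n')
            n.lines++;
        if (isspace(c)) {
            in_word = 0;   /* white space ends the current word */
        } else if (!in_word) {
            in_word = 1;   /* first character of a new word */
            n.words++;
        }
    }
    return n;
}

/* Prints lines, words, and characters separated by single spaces. */
void print_counts(struct counts n)
{
    printf("%lu %lu %lu\n", n.lines, n.words, n.chars);
}
```

A main function would then only need to call print_counts(count_stream(stdin)).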
Here are a few examples of how
wc should behave:
$ ./wc
0 0 0
$ ./wc
Peter was here.
1 3 16
$ ./wc
Here is some more input for you to play with.
4 10 56
In all of these examples, there are no spaces after the last character
on a line (but there’s a line feed character there of course). Note that
for the empty input, you should produce all zeros as output. The numbers
are separated by a single space. Obviously your source code should be in
a file called
The standard wc program formats the output line a bit differently, but you should follow our specification here and not theirs. Just check whether the numbers agree; don’t format your output like theirs.
Start by reading up on what
For this problem, you will write a program
hex.c that produces a
hexdump on standard output for data read from standard input.
Let’s start with an example:
$ ./hex
Hello
00000000: 48 65 6c 6c 6f 0a                                Hello.
The program was started, then the user typed the word “Hello” followed by return/enter, then CTRL-D was used to stop the input. The result shows the ASCII code for each character (in hexadecimal, so it’s guaranteed to be two digits wide for each character), including the newline character generated by the return/enter key. The formatting may look a bit strange, but the purpose of the large gap becomes apparent if we examine a longer input:
$ ./hex
This is a longer example of a hexdump. Marvel at it's magnificence.
00000000: 54 68 69 73 20 69 73 20 61 20 6c 6f 6e 67 65 72  This is a longer
00000010: 20 65 78 61 6d 70 6c 65 20 6f 66 20 61 20 68 65   example of a he
00000020: 78 64 75 6d 70 2e 20 4d 61 72 76 65 6c 20 61 74  xdump. Marvel at
00000030: 20 69 74 27 73 20 6d 61 67 6e 69 66 69 63 65 6e   it's magnificen
00000040: 63 65 2e 0a                                      ce..
This time the user entered two sentences, then signaled end of input with CTRL-D. Again, we see the ASCII code for each character (including spaces and newlines). The formatting is set up so that regardless of the number of characters, we always have three “columns” of output:
(xxd has apparently been updated to use 8 recently.)
Note that there’s a single space between the colon after the offset and the ASCII values, but there are two spaces between the ASCII values and the string-like representation.
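The row layout seen in the examples can be sketched as below. This is an illustration under assumptions, not the starter code: the names format_row, BYTES_PER_ROW, and ROW_BUF_SIZE are invented here, and the sketch assumes 16 bytes per row, an 8-digit hexadecimal offset, and a '.' for every byte that isprint rejects, all read off the sample output.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BYTES_PER_ROW 16 /* assumption: taken from the sample output */
#define ROW_BUF_SIZE 96  /* room for offset, hex column, gap, text, '\0' */

/* Hypothetical helper: formats one row (without a trailing newline). */
void format_row(unsigned long offset, const unsigned char *buf, size_t n,
                char out[ROW_BUF_SIZE])
{
    size_t i;
    int pos;

    assert(n >= 1 && n <= BYTES_PER_ROW);
    pos = sprintf(out, "%08lx:", offset);
    for (i = 0; i < BYTES_PER_ROW; i++) {
        if (i < n) {
            /* one space, then exactly two hex digits */
            pos += sprintf(out + pos, " %02x", (unsigned int)buf[i]);
        } else {
            /* pad missing bytes so the text column stays aligned */
            memset(out + pos, ' ', 3);
            pos += 3;
        }
    }
    out[pos++] = ' '; /* two spaces before the text column */
    out[pos++] = ' ';
    for (i = 0; i < n; i++)
        out[pos++] = isprint(buf[i]) ? (char)buf[i] : '.';
    out[pos] = '\0';
}
```

Padding the hex column for short rows is what creates the large gap in the “Hello” example: the text column always starts at the same position, whether the row holds 16 bytes or only a few.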
On Piazza you’ll find some starter code for this program. You can of course ignore the starter code and write the entire thing from scratch yourself, but we recommend you use the starter code: It contains a few important hints that you may not want to live without. Good luck!
You can use the xxd program to check your output against. Running
xxd -g 1 and then typing into that should produce the same output as your
program. Running ./hex <hex will show you lots of interesting bits that you may not expect.
(In C, the char type is virtually identical to “byte”, so it’s probably close enough to use “character” this way, at least in the context of a hexdump program.)
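One practical consequence of treating characters as bytes is worth a small sketch. Plain char may be signed, so a byte such as 0xe9 stored in a char can turn negative and print as more than two hex digits unless it is converted to unsigned char first. The helper names char_to_byte and byte_to_hex are invented for illustration, and the sketch assumes 8-bit bytes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: maps a char to its byte value in 0..255,
 * regardless of whether plain char is signed on this platform. */
int char_to_byte(char ch)
{
    return (unsigned char)ch;
}

/* Hypothetical helper: writes two lowercase hex digits and a '\0'. */
void byte_to_hex(int byte, char out[3])
{
    /* assumption: 8-bit bytes, so every value fits in two hex digits */
    assert(byte >= 0 && byte <= 255);
    sprintf(out, "%02x", (unsigned int)byte);
}
```

Values returned by getchar are already in this range (or EOF), so the conversion mainly matters when bytes have been stored in a char buffer before printing.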
Please follow the submission instructions as detailed on Piazza. Make sure that your tarball contains no derived files whatsoever (i.e. no executable files), but allows building all required derived files. Also, be sure to include a Makefile that sets the appropriate compiler flags and builds all programs by default. Finally, make sure to include your name and email address in every file you turn in (well, in every file for which it makes sense to do so anyway)!
For reference, here is a short explanation of the grading criteria; some of the criteria don’t apply to all problems, and not all of the criteria are used on all assignments.
Packaging refers to the proper organization of the stuff you hand in, following both the guidelines for Deliverables above as well as the general submission instructions for assignments on Piazza.
Style refers to C programming style, including things like consistent indentation, appropriate identifier names, useful comments, suitable documentation, etc. Simple, clean, readable code is what you should be aiming for. Make sure you follow the style guide posted on Piazza!
Design refers to proper modularization (functions, modules, etc.) and an appropriate choice of algorithms and data structures.
Performance refers to how fast/with how little memory your programs can produce the required results compared to other submissions.
Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous, ask for clarification! (It also refers to you simply doing the required work, which may not be programming alone.)
If your programs cannot be built you will get no points whatsoever.
If your programs cannot be built without warnings using the required
compiler options given on Piazza we will take off 10%
(except if you document a very good reason).
If your programs cannot be built using
make we will take off 10%.
If valgrind detects memory errors in your programs, we will take off 10%.
If your programs fail miserably even once, i.e. terminate with an
exception of any kind or dump core, we will take off 10% (for each such