Assignment 1: Warmup
Out on: January 28, 2008
Due by: February 4, 2008, 11:00 am (before lecture)
Collaboration: None
Grading: Packaging 10%, Style 10%, Performance 10%, Design 10%, Functionality 60%
Overview
The first assignment is mostly a warmup exercise,
giving you a chance to (re-)familiarize yourself with
Unix, the C compiler gcc, and a variety of
pervasive Unix tools such as
tar, gzip, and make.
Of course you'll also do some basic C programming. :-)
Convince us that you're in the right course!
Problem 1: Word Frequencies (90%)
Check out this program (freq.c), which reads from standard input until end-of-file and then prints some statistics about the characters appearing in the input stream (which ones appear and how often). You should compile it and play with it a little just to get back into the Unix swing of things. :-)
Your task, should you choose to accept it, is to write a
program wreq.c that works pretty much the same
way as freq.c does, except for words
instead of characters.
Your program should read from standard input until end-of-file
and then print statistics about the words appearing in the input
stream (which ones appear and how often).
For our purposes, a "word" is any sequence of
printable characters that contains no
whitespace; unprintable characters are simply
ignored.
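One possible reading of this definition, sketched in C. The helper name next_word, the fixed-size buffer, and the silent truncation of overlong words are assumptions made for illustration and are not part of the assignment; the sketch also assumes the default "C" locale, where isspace and isgraph classify characters in the usual ASCII way.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: reads the next word from `in` into `buf`, which
 * has room for `cap` bytes including the terminating '\0'.
 * Returns the length of the word, or 0 once end-of-file is reached
 * and no further word is available.
 * Assumption: characters beyond cap - 1 are dropped; a real solution
 * might grow the buffer instead.
 */
size_t next_word(FILE *in, char *buf, size_t cap)
{
    size_t len = 0;
    int c;

    assert(in != NULL && buf != NULL && cap > 0);

    while ((c = getc(in)) != EOF) {
        if (isspace(c)) {
            if (len > 0)
                break;              /* whitespace ends the current word */
            /* otherwise: still skipping leading whitespace */
        } else if (isgraph(c)) {
            if (len + 1 < cap)
                buf[len++] = (char) c;
        }
        /* anything else is unprintable and simply ignored */
    }
    buf[len] = '\0';
    return len;
}
```

A caller would typically loop with `while (next_word(stdin, buf, sizeof buf) > 0)` and record each word. Note that under this reading an unprintable character inside a word does not split it, which is exactly the kind of choice worth defending in your README.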
You can use all of C as well as all of the standard C library for your program, but nothing beyond that (no Unix system calls, no other libraries, no external Perl interpreters, etc.).
Problem 2: Specification Time (10%)
In your README file, reflect on the specification
for wreq.c given above.
What aspects of the specification gave you
trouble when implementing the program?
What aspects of the specification (not your implementation!)
are going to give the user trouble when they
run the program?
Can you write a better yet still somewhat concise specification
of the problem?
Hints
- Feel free to use the functions for error handling from our text, and (of course) actually handle all error conditions that can arise.
- Modularize your code "properly" into key abstractions. For example, you may want to define some kind of set or dictionary abstraction separately from the main program.
- The indent tool can be quite helpful to ensure that your code is formatted in a consistent way...
- Pay attention to "edge cases" in the input your program can be expected to handle. For example, make sure that you handle an empty file in a reasonable way.
- The man command is your friend! Use it liberally while exploring Unix. Try man man for sure! Maybe man 2 intro, man 3 intro, and man fgets are interesting as well? If you feel like learning a lot, read man gcc. :-)
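To make the modularization hint a little more concrete, here is a minimal sketch of a word-count dictionary kept separate from the main program. Everything in it is an assumption for illustration: the names (wordtab, wordtab_add, and so on), the fixed bucket count, and the choice of a chained hash table are not prescribed by the assignment, and other data structures may be a better fit. It uses only the standard C library (in particular no strdup, which is not part of C99).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WT_BUCKETS 1024u    /* assumed fixed size; a real table might grow */

struct wt_entry {
    char *word;             /* private copy of the key */
    unsigned long count;    /* how often the word has been added */
    struct wt_entry *next;  /* next entry in the same bucket */
};

struct wordtab {
    struct wt_entry *buckets[WT_BUCKETS];
};

/* djb2-style string hash, reduced to a bucket index. */
static size_t wt_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return (size_t) (h % WT_BUCKETS);
}

/* Creates an empty table; returns NULL if memory is exhausted. */
struct wordtab *wordtab_new(void)
{
    struct wordtab *t = malloc(sizeof *t);
    size_t i;

    if (t != NULL)
        for (i = 0; i < WT_BUCKETS; i++)
            t->buckets[i] = NULL;
    return t;
}

/* Records one occurrence of word; returns 0 on success, -1 on failure. */
int wordtab_add(struct wordtab *t, const char *word)
{
    struct wt_entry *e;
    size_t i, n;

    assert(t != NULL && word != NULL);
    i = wt_hash(word);
    for (e = t->buckets[i]; e != NULL; e = e->next) {
        if (strcmp(e->word, word) == 0) {
            e->count++;
            return 0;
        }
    }
    n = strlen(word) + 1;
    e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    e->word = malloc(n);
    if (e->word == NULL) {
        free(e);
        return -1;
    }
    memcpy(e->word, word, n);
    e->count = 1;
    e->next = t->buckets[i];    /* push onto the front of the chain */
    t->buckets[i] = e;
    return 0;
}

/* Returns how often word was added, or 0 if it was never seen. */
unsigned long wordtab_count(const struct wordtab *t, const char *word)
{
    const struct wt_entry *e;

    assert(t != NULL && word != NULL);
    for (e = t->buckets[wt_hash(word)]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e->count;
    return 0;
}

/* Releases the table and every entry in it; NULL is accepted. */
void wordtab_free(struct wordtab *t)
{
    size_t i;

    if (t == NULL)
        return;
    for (i = 0; i < WT_BUCKETS; i++) {
        struct wt_entry *e = t->buckets[i];
        while (e != NULL) {
            struct wt_entry *next = e->next;
            free(e->word);
            free(e);
            e = next;
        }
    }
    free(t);
}
```

A complete solution would also need some way to visit every entry in order to print the statistics; that part is left out here. Keeping the table behind a handful of functions like these means the main program does not have to change if the hash table is later swapped for, say, a sorted array or a tree.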
Deliverables
Please turn in a gzip-compressed tarball of your assignment;
the filename should be cs211-assign-1-login.tar.gz
with login replaced by your Unix login name on ugradx.cs.jhu.edu
(so I would use cs211-assign-1-phf.tar.gz).
The tarball should contain no derived files whatsoever
(i.e. no executable files),
but allow building all derived files.
Include a README file that briefly explains what your
programs do and contains any other notes you want us to check out
before grading.
Grading
For reference, here is a short explanation of the grading criteria.
Packaging refers to the proper organization of the
stuff you hand in, following the guidelines for Deliverables above.
Style refers to C programming style, including
things like consistent indentation, appropriate identifiers,
useful comments, suitable documentation, etc.
Simple, clean, readable code is what you should be aiming for.
Performance refers to the amount of resources
your program needs to produce the required results; this can
include space, time, and other metrics.
Design refers to proper modularization and the
proper choice of algorithms and data structures; often this can
be judged by asking "How hard would it be to add feature X?"
and "How hard is it to replace algorithm X with algorithm Y?".
Functionality refers to your programs being
able to do what they should according to the specification
given above; if the specification is ambiguous and you had
to make a certain choice, defend that choice in your
README file; if the specification is too general
and you had to add certain restrictions, defend those in your
README file as well.
If your programs cannot be built on ugradx.cs.jhu.edu
you will get no points whatsoever.
If your programs cannot be built without warnings using
gcc -ansi -pedantic -Wall -Wextra -std=c99 -O
we will take off 10% (except if you document a very good reason).
If your programs cannot be built using make we will
take off 10%.
If your programs fail miserably even once,
i.e. terminate with an exception of any kind or dump core,
we will take off 10%.
Finally, make sure to include your name and email address in
every file you turn in!