The program for this assignment, and everything else except the README file, is due at 9 PM on Wednesday, February 4, 2004. The README file is due at midnight on the same day (i.e., the moment Thursday begins). Refer to the homework policies page for general homework guidelines.
This assignment has several purposes:
It's next term and you have been promoted from innocent CS 70 student to expert CS 70 grader. As usual, the graders are trying to evaluate student programs without adequate supporting software. They would like to have a program that automatically detects basic mechanical problems with submissions.
Prof. Kuenning is currently trying to get money from the Dean to have a student write the final version of this program, with elaborate features and a sexy web interface, over next summer. Meanwhile, however, they need something to help, even if it's basic. Since you were late to a grader meeting, you have been put in charge of writing a first-draft version. Although it must be written quickly, it must be kept modular so that it can be extended into the final elaborate system.
Your program will analyze the input code and tell the user about:
Your code will make use of a function named dumbReadLine, written by some student GHK who graduated long ago. This function might not be what you'd like, but Prof. Kuenning wants you to build only the analysis program now and deal with the input-reader later. You should refrain from pointing out that dumbReadLine is almost identical to the getline function in the C++ library, because the minor differences will actually make your life easier.
As with homework 1, you must use cs70checkout and cs70submit * to manage your code. Start your assignment by going to wherever you keep your CS70 assignments and typing:
cs70checkout hw02
Your program must be in one file, named assign_02.cc.
Your code must use the dumbReadLine function, supplied in the files dumbreadline.hh and dumbreadline.cc. It must compile with the supplied Makefile. These three files, along with some sample input and output files, will be provided to you when you create your assignment with cs70checkout.
Do not modify any of the three supplied files. Create only the file assign_02.cc and your README file. If you modify any of the supplied files, the graders will replace them with their own copy.
Use cs70submit * to submit your code.
Your code must process the input line-by-line using the function dumbReadLine. Each type of report should be created by a separate function. Each of these functions should examine only one line of input: none of them requires examining more than one line at a time or maintaining state from one line to the next.
The dumbReadLine function supplies each line as an array of characters (see below). (This is the standard way of representing a string in C, and it is still quite common in C++ programs as well.) Thus, the first character on the line is lineBuffer[0], the second is lineBuffer[1], and so forth. To examine all characters, you will have to write a loop that searches through the buffer until it reaches the '\n' (newline) character at the end of the line.
To receive a top grade, your code must:
When counting the number of characters in a line, each TAB character ('\t' in C++) must be converted to the appropriate number of spaces. A TAB in the input moves the cursor to the next column that is a "tab stop." If the cursor is already on a tab stop, TAB moves it to the next such column.
The terminology dates back to old mechanical typewriters. Those allowed tab stops to be put in arbitrary columns. In normal computer programming, however, a column is a tab stop if the column is a multiple of the "tab width". (Columns start with 0, so the first column is the first tab stop.) Your program should set the tab width to the industry standard of 8 characters. It must be capable of handling tabs anywhere in a line, no matter how long that line is.
Here is an example of 8-character tab stops:
          1         2         3         4         5
012345678901234567890123456789012345678901234567890123456789
        X       X       X       X       X       X       X
The X's indicate where the cursor would next land if you typed a TAB at the beginning of the line or immediately after each X.
Hint: it is useful to apply the modulo (%) operator to the current column number when calculating tab stops.
The number of characters in a line does NOT count the final end-of-line character.
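The tab-stop rule above can be sketched in a few lines. This is only an illustration under stated assumptions, not the model solution: the names TAB_WIDTH and expandedLength are hypothetical, and the function stops at '\0' as well as '\n' purely as a safety net.

```cpp
// Tab width stated in the assignment text (8 columns).
const int TAB_WIDTH = 8;

// Hypothetical helper: returns the displayed width of one line, with
// each TAB expanded to the next tab stop. The terminating '\n' is not
// counted. Stopping at '\0' too is a defensive assumption.
int expandedLength(const char line[])
{
    int column = 0;
    for (int i = 0; line[i] != '\n' && line[i] != '\0'; ++i) {
        if (line[i] == '\t') {
            // Jump to the next multiple of TAB_WIDTH. When the cursor is
            // already on a tab stop, this moves a full TAB_WIDTH columns.
            column += TAB_WIDTH - (column % TAB_WIDTH);
        } else {
            ++column;
        }
    }
    return column;
}
```

Because the column is tracked as a running total, the same arithmetic works for tabs anywhere in the line, however long the line is.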
A sequence of capital letters can contain whitespace (but not line breaks). A sequence of whitespace characters counts as part of the sequence of capitals if there are capital alphabetic characters immediately before and after the whitespace. For the purposes of this test, you should count any whitespace, including TAB, as a single character. (It's hard enough to do without properly expanding TABs, and you've already proven that you can expand tab stops...)
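One way to read the rule above is sketched below. It is an interpretation rather than the model solution: the name longestCapitalRun is hypothetical, and the sketch assumes that the length of a run is measured from its first capital to its last capital, with every whitespace character in between counted as one character.

```cpp
#include <cctype>

// Hypothetical helper: length of the longest run of capital letters in
// one line, where whitespace counts only when it sits between capitals.
int longestCapitalRun(const char line[])
{
    int longest = 0;
    int current = 0;        // length of the run ending at the last capital
    int pendingSpaces = 0;  // whitespace seen since that capital

    for (int i = 0; line[i] != '\n' && line[i] != '\0'; ++i) {
        unsigned char ch = static_cast<unsigned char>(line[i]);
        if (std::isupper(ch)) {
            // Whitespace is confirmed as part of the run only now that
            // a capital follows it.
            current += pendingSpaces + 1;
            pendingSpaces = 0;
            if (current > longest)
                longest = current;
        } else if (std::isspace(ch) && current > 0) {
            ++pendingSpaces;
        } else {
            // Any other character (or leading whitespace) ends the run.
            current = 0;
            pendingSpaces = 0;
        }
    }
    return longest;
}
```

Holding whitespace in pendingSpaces until the next capital arrives keeps trailing whitespace out of the count without any look-ahead.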
To test your program, you will probably want to run it on one or more test files. To run your program on a file named test.txt, use the following Turing command:
./assign_02 < test.txt
This command will provide test.txt to your program on the standard input device. If you want to collect the output into a file, for example testoutput, use:
./assign_02 < test.txt > testoutput
Finally, if you want to use diff (see below) to check the output against a sample output file named sampleoutput, use:
./assign_02 < test.txt | diff sampleoutput -
If your program runs exactly correctly, this command will produce no output. Otherwise, the output will list all of the differences between your program's output and the sample output.
All output must be written to the standard output device, cout.
When your code reports lines with the above properties, the reports must be in the following format:
Line n is too long: m characters.
Line n has a string of m capital letters and spaces.
Line n seems to contain a goto statement.
Line n contains a // comment without surrounding whitespace.
If a line suffers from multiple problems, all problems must be reported, and they must be reported in the order given above. No problem should be reported more than once per line.
The assignment includes some input and output example files to test your program on:
hw02-input1.txt is a very simple sample file that exercises each message at least once, so that you can check to see that your output matches the corresponding output file. It does not test the tricky cases.
cs70checkout
.
WARNING: The graders are extremely picky about format. For full credit, your output must exactly match the model solution. Use the program diff (see homework #1) to check for subtle differences.
If diff reports differences, but you can't spot any trouble, try the following variation:
% ./assign_02 | diff original.out - | cat -v -e -t
The "| cat -v -e -t" part causes normally invisible things to become visible. Specifically, it replaces all invisible control characters with a two-character sequence starting with a caret (^) [-v], marks the end of each line by appending a dollar sign ($) so that you can see if there are trailing spaces [-e], and represents TABs as ^I so you can tell them from strings of spaces [-t]. Incidentally, you can write this command more briefly as "cat -vet" and remember it as "take the cat to the vet to diagnose problems."
The required style for this assignment differs slightly from what will be required in later assignments. Specifically:
Put all of your code in one file (assign_02.cc). We will explain soon how to divide your code into several files (e.g., put certain code in a .hh file). Even if you have prior experience and already understand how to use .hh files, put everything in one file for this assignment.
We prefer that the main function be at the start of your file, utilities towards the end, and related functions be grouped next to one another. However, doing this requires that you use forward declarations (prototypes). If you do not understand forward declarations yet, you have permission for this assignment to define functions after functions they depend on, even though this results in "upside-down" code with the main function at the end.
Do not use the class or struct keywords. You don't need classes to complete this assignment.
Do not use the C++ string class.
Do not use the C string routines (those declared in the <string.h> include file), with the exception of strlen, strcmp, and strncmp. (It is unlikely that you would find use for all three of these routines.) The reason for the prohibition is that this assignment is intended to teach you how to write the sorts of processing loops that are included in the string routines.
Put the following lines at the top of your program. These include declarations allow you to use (in order) various useful character-analysis utilities, the input and output operations, and the dumbReadLine function.
#include <cctype>
#include <iostream>
#include "dumbreadline.hh"
To print an item foo to standard output, use the statement cout << foo; Here are some handy examples of using cout:
// Write a literal piece of text
cout << "Test string";
// Write an integer
cout << 3;
// Write the value of a variable x
cout << x;
// Start a new output line
cout << endl;
// Compact way to write several values
cout << "The value of x is " << x << " right now" << endl;
There is much more information on I/O available in the class C++ notes.
The inputs to dumbReadLine are an input stream, an array of characters, and the length of the character array. For this assignment, the input stream will always be cin (standard input) and the array length will always be 1024. For example (bad style alert! -- 1024 is a "magic number"):
char lineBuffer[1024];
dumbReadLine(cin, lineBuffer, 1024);
The call to dumbReadLine will fill lineBuffer (which you have previously created) with characters from the input, stopping when it hits the end of a line or the end of the file. When looking through your array, you can tell when you've reached the end of this line because the last two characters will be an end of line ('\n') character followed by a null ('\0') character. You can stop when you hit the '\n'; there is no need to check for the '\0'.
You can pass the contents of lineBuffer to another function as follows:
// ... use dumbReadLine as above ...
handyHelperFunction(lineBuffer, ...);
In your helper function, declare things like this (bad style alert!):
void handyHelperFunction(char buffer[1024], ...)
{
    ...
    if (buffer[i] == ...)
        ...
}
Note that you pass the array by name only, without brackets, and that the function declares its formal parameter the same way as the main program did (including the size). There are other ways to accomplish the same result, but this one will work fine for now.
If an input line is longer than the buffer you have given to dumbReadLine, it will generate an error message and halt your program.
When you have read all the lines of the input file and there is nothing left, a call to dumbReadLine will return just as if it had succeeded in reading some additional line. Therefore, immediately after you call dumbReadLine, you must check the status of the input stream, to see whether the dumbReadLine call succeeded in reading a line of data or hit the end of the file. Do this as follows:
if (!cin)
    // nothing more in the input cin; we have hit the end of the file
or
if (cin)
    // cin still has input; we have not yet hit the end of the file
Warning: you can't check the status of cin for EOF until after you have tried to read a line. It's sort of like a blind person at a curb: he doesn't know there's a drop-off until he has tried to put his foot down on the pavement. This is a general principle of C++ I/O that often bites people.
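The read-then-check pattern can be sketched as a loop. Because the real dumbReadLine is not reproduced on this page, the sketch uses a hypothetical stand-in, standInReadLine, that only mimics the documented behavior of ending the buffer with '\n' followed by '\0'. The names BUFFER_SIZE, countLines, and countLinesInText are likewise illustrative assumptions.

```cpp
#include <iostream>
#include <sstream>

const int BUFFER_SIZE = 1024;

// HYPOTHETICAL stand-in for the supplied dumbReadLine so that this
// sketch compiles on its own. It is not the real function.
void standInReadLine(std::istream& stream, char buffer[], int size)
{
    // Leave room for both the '\n' and the '\0' that follow the text.
    stream.getline(buffer, size - 1);
    int i = 0;
    while (buffer[i] != '\0')
        ++i;
    buffer[i] = '\n';
    buffer[i + 1] = '\0';
}

// Counts the lines on a stream. The stream state is tested immediately
// after each read attempt, never before it.
int countLines(std::istream& stream)
{
    char lineBuffer[BUFFER_SIZE];
    int lines = 0;
    while (true) {
        standInReadLine(stream, lineBuffer, BUFFER_SIZE);
        if (!stream)
            break;      // this read attempt hit the end of the file
        ++lines;        // lineBuffer holds a real line; process it here
    }
    return lines;
}

// Convenience wrapper for trying the loop on text held in memory.
int countLinesInText(const char text[])
{
    std::istringstream input(text);
    return countLines(input);
}
```

In the assignment itself the stream would be cin and the per-line processing would replace the simple counter.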
The C++ utility function isspace determines whether a character is whitespace (including tab, blank, newline, and a few other characters). The function isupper determines whether it is uppercase. The function isalnum checks for alphanumeric characters. Use these functions: don't improvise your own. They are used as follows:
if (isspace(lineBuffer[i]))
    // The character in lineBuffer[i] is whitespace
For the section of your program that deals with "//" comments, it might help to know that a newline character (end of line) is considered to be whitespace.
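As an illustration of why that fact helps, here is a minimal sketch of a "//" check. It is not the model solution, and it rests on assumptions the page does not confirm: the name hasCrampedComment is hypothetical, a "//" at the very start of a line is treated as having whitespace before it, only the first "//" on the line is examined, and no attempt is made to skip "//" inside string literals.

```cpp
#include <cctype>

// Hypothetical helper: true if the first "//" on the line lacks
// whitespace immediately before or immediately after it.
bool hasCrampedComment(const char line[])
{
    for (int i = 0; line[i] != '\n' && line[i] != '\0'; ++i) {
        if (line[i] == '/' && line[i + 1] == '/') {
            // ASSUMPTION: the start of the line counts as whitespace.
            bool spaceBefore = (i == 0)
                || std::isspace(static_cast<unsigned char>(line[i - 1]));
            // A "//" that ends the line is followed by '\n', which
            // isspace reports as whitespace, so no special case is needed.
            bool spaceAfter =
                std::isspace(static_cast<unsigned char>(line[i + 2]));
            return !(spaceBefore && spaceAfter);
        }
    }
    return false;   // no "//" on this line
}
```

Reading line[i + 2] is safe because the two slashes guarantee that the terminating '\n' has not yet been passed.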
There is more information on these operations, as well as on the general subject of C/C++ string processing, available in the class C++ notes.
Note that for this assignment you are prohibited from using strcpy or strcasecmp.
It is possible (though not likely) that you will find that you need to use the constant 4 to represent the length of the string goto. It turns out that there's a clean way to avoid building that number into your program. You can use the following construct:
sizeof "goto" - 1
to generate the proper value. You can even give it a name:
const int GOTO_SIZE = sizeof "goto" - 1;
Even better, you can also make "goto" a constant:
const char GOTO_STATEMENT[] = "goto";
const int GOTO_SIZE = sizeof GOTO_STATEMENT - 1;
so that there is a single point of change in the extremely unlikely event that you want to change the spelling of goto.
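To show how such constants might be used, here is a hedged sketch of a search built on strncmp (one of the permitted routines) and isalnum. The exact rule for what counts as a goto statement is not spelled out on this page, so the sketch assumes that "goto" must stand alone as a word, with no alphanumeric character directly before or after it. The name seemsToContainGoto is hypothetical.

```cpp
#include <cctype>
#include <cstring>

const char GOTO_STATEMENT[] = "goto";
const int GOTO_SIZE = sizeof GOTO_STATEMENT - 1;   // excludes the '\0'

// Hypothetical helper: true if the line contains "goto" as a separate
// word. ASSUMPTION: "separate" means not touching an alphanumeric
// character on either side.
bool seemsToContainGoto(const char line[])
{
    for (int i = 0; line[i] != '\n' && line[i] != '\0'; ++i) {
        // strncmp stops at the first mismatch or '\0', so it never
        // reads past the end of the line's text.
        if (std::strncmp(&line[i], GOTO_STATEMENT, GOTO_SIZE) != 0)
            continue;
        bool startsWord = (i == 0)
            || !std::isalnum(static_cast<unsigned char>(line[i - 1]));
        bool endsWord =
            !std::isalnum(static_cast<unsigned char>(line[i + GOTO_SIZE]));
        if (startsWord && endsWord)
            return true;
    }
    return false;
}
```

Changing GOTO_STATEMENT is then the only edit needed to search for a different keyword.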
For a passing grade, we expect that:
In particular, it is unacceptable to submit undocumented or very poorly documented code, ignore the formatting guidelines, and/or write one ginormous main function that does most of the work. Merely producing the correct output is not sufficient.
Prof. Kuenning is a nut about spelling (see "ispell -v"). Check out the style guidelines on spelling and the instructions on how to use ispell.
For an "A", we expect that:
Ok, ok. No one's perfect. An "A" submission can have small deviations from the above. But only small ones.
© 2004, Geoff Kuenning
This page is maintained by Geoff Kuenning.