Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utcsri!uthub!utecfb!larry
From: larry@utecfb.Toronto.Edu (Larry Philps)
Newsgroups: comp.edu
Subject: Re: Cheating on Programming Assignments
Message-ID: <139@utecfb.Toronto.Edu>
Date: Fri, 3-Apr-87 10:14:06 EST
Article-I.D.: utecfb.139
Posted: Fri Apr  3 10:14:06 1987
Date-Received: Sat, 4-Apr-87 20:04:41 EST
References: <248@rruxa.UUCP> <403@bacchus.MIT.EDU>
Reply-To: larry@utecfb.UUCP (Larry Philps)
Organization: Engineering Computing Facility, University of Toronto
Lines: 39
Keywords: autotest semantic comparison

Several years ago two of us here at U of T wrote a (large) system called
"autotest".  This system allows an instructor and his tutors to build
a database of test cases for a particular assignment.  Solutions are
generated by running the instructor's (working, we hope) program against
all the test cases and storing the results.  The system comes with a
reasonably flexible comparison function that ignores white space and case,
and believes that if you generate an error message in the right place, you
must be reporting the correct condition.  Alternatively, you can write and
debug a comparison program of your own and have the system run it instead.
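
To make the comparison rule concrete, here is a minimal sketch in Python.
It is not the autotest code.  The names outputs_match, normalize and
ERROR_RE are made up for illustration, and it assumes an "error message"
is any line containing the word "error", that blank lines do not count,
and that "ignoring white space" means dropping it entirely.

```python
import re

# Assumption: a line is an "error message" if it contains the word "error".
ERROR_RE = re.compile(r"\berror\b", re.IGNORECASE)


def normalize(line):
    """Drop all white space and fold case, so 'Sum =  10' equals 'sum=10'."""
    return "".join(line.split()).lower()


def outputs_match(expected, actual):
    """Compare a student's output with the stored solution, line by line."""
    # Assumption: blank lines are treated as white space and skipped.
    exp_lines = [l for l in expected.splitlines() if l.strip()]
    act_lines = [l for l in actual.splitlines() if l.strip()]
    if len(exp_lines) != len(act_lines):
        return False
    for exp, act in zip(exp_lines, act_lines):
        if ERROR_RE.search(exp):
            # An error in the right place is accepted whatever its wording.
            if not ERROR_RE.search(act):
                return False
        elif normalize(exp) != normalize(act):
            return False
    return True
```

A site-specific comparison program would replace outputs_match while
keeping the same "expected, actual, yes or no" shape.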

Students can access part of the data via a program called "exercise"
which will test their solution against a random set of data and
tell them whether or not they got an acceptable result.  When satisfied,
they "submit" their source and an executable.  After the "due date"
a tutor runs "evaluate" which runs every submitted program against every
piece of test data and produces a performance summary for each.
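
The "every program against every piece of test data" loop can be sketched
as follows.  This is an illustration under assumptions, not the real
"evaluate": the function name, the dictionary layout, and the run and
matches callbacks are all hypothetical.  In practice run would launch the
submitted executable and capture its output.

```python
def evaluate(submissions, test_cases, run, matches):
    """Run every submission on every test case and summarize the results.

    submissions: {student: program}
    test_cases:  {case_name: (test_input, expected_output)}
    run:         callback (program, test_input) -> actual output
    matches:     callback (expected, actual) -> True or False
    """
    summary = {}
    for student, program in submissions.items():
        failed = []
        for name, (test_input, expected) in test_cases.items():
            try:
                actual = run(program, test_input)
            except Exception:
                # A crash counts as a failure on this test case.
                actual = None
            if actual is None or not matches(expected, actual):
                failed.append(name)
        summary[student] = {
            "passed": len(test_cases) - len(failed),
            "failed": failed,
        }
    return summary
```

The list of failed case names is what lets a marker go straight to the
relevant part of the code.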

Usually the tutors only actually look at about 20% of the paper submissions
for each assignment, and just rely on the evaluate output for marking.
I personally love having the evaluate output beside me when marking the
code.  It is great when you see that their program failed in the case when,
say, too much data was given.  It is easy to find that part of the code
and circle it with an appropriate comment.  Basically you can just read
the code for style and leave the test of functionality to "evaluate."

In addition, I just found out a few days ago that a grad student here wrote
a program which parses Turing programs to an intermediate form and then
compares them.  The result, from 0 (different) to 1 (identical), is
completely independent of variable names, program structure, and
organization.  This is now being run to compare all submitted programs.
If the result of a comparison is higher than about 0.85 (I think), then
the programs are virtually the same.
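
I have not seen how that program works, so the following is only a rough
Python sketch of the general idea, not the grad student's method.  It
swaps in a much simpler technique: split the source into tokens, rename
identifiers in order of first appearance, and score the two token streams
with difflib.  That makes it independent of variable names only, not of
program structure or organization.  KEYWORDS is a small illustrative
subset, and canonical_tokens and similarity are hypothetical names.

```python
import re
from difflib import SequenceMatcher

# Assumption: a small illustrative subset of keywords, not the full language.
KEYWORDS = {"var", "int", "if", "then", "else", "end", "loop", "exit",
            "when", "for", "put", "get"}

# Identifiers, integer literals, the := operator, then any other symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|:=|\S")


def canonical_tokens(source):
    """Tokenize, renaming identifiers to id0, id1, ... by first appearance."""
    names = {}
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if re.match(r"[A-Za-z_]", tok) and tok.lower() not in KEYWORDS:
            tok = names.setdefault(tok, "id%d" % len(names))
        tokens.append(tok)
    return tokens


def similarity(a, b):
    """Score two programs from 0.0 (different) to 1.0 (identical)."""
    matcher = SequenceMatcher(None, canonical_tokens(a),
                              canonical_tokens(b), autojunk=False)
    return matcher.ratio()
```

With this sketch, two programs that differ only in variable names score
exactly 1.0, and pairs scoring above a chosen threshold (such as the 0.85
mentioned above) would be flagged for a human to look at.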
-- 
Larry Philps	Engineering Computing Facility	University of Toronto
NEW PATH:   larry@ecf.toronto.edu
USENET:     {linus, ihnp4, allegra, decvax, floyd}!utcsri!ecfb!larry
CSNET:      larry@Toronto
ARPA:       larry%Toronto@CSNet-Relay
BITNET:     larry@ecf.utoronto.BITNET