Automatic Generation of Exams in R
This first introduction to the R package exams is a (slightly) modified version of
Grün and Zeileis (2009), published in the Journal of Statistical Software. It describes
how to produce PDF files from exercises in Sweave format. Meanwhile, exams has been
considerably extended by Zeileis, Umlauf, and Leisch (2014) to also produce HTML output
or e-learning exams for Moodle, OLAT/OpenOLAT, etc. This has resulted in some small
changes that are not fully backward-compatible and which are marked with "UPDATE"
in the text below.
Package exams provides a framework for automatic generation of standardized statistical
exams which is especially useful for large-scale exams. To employ the tools, users
just need to supply a pool of exercises and a master file controlling the layout of the final
PDF document. The exercises are specified in separate Sweave files (containing R code
for data generation and LaTeX code for problem and solution description) and the master
file is a LaTeX document with some additional control commands. This paper gives an
overview of the main design aims and principles as well as strategies for adaptation and
extension. Hands-on illustrations - based on example exercises and control files provided
in the package - are presented to get new users started easily.
Keywords: exams, multiple choice, arithmetic problems, Sweave, LaTeX, R.
1. Introduction
Several lecturers from the Department of Statistics and Mathematics teach
this course in parallel. In order to ensure an efficient, consistent, and transparent organization,
the course format and all its teaching materials (presentation slides, collections of exercises,
exams, etc.) were re-designed in a collaborative effort during 2006/7. Among many other
aspects - such as specification of a topic list or definition of learning outcomes, etc. - this
re-design encompassed several technological challenges. Hence, the exams package was designed
to address these challenges and thus facilitate the discussions about the content of the new
course. More specifically, exams aims to provide software infrastructure for:
• Scalable exams: Automatic generation of a large number of different exams in order to
provide an individual test to each student.
• Associated self-study materials: Collections of exercises and solutions from the same
pool of examples.
• Joint development: Development and maintenance of a large pool of exercises in a
multi-author and cross-platform setting.
Specifically, at WU Wien about 10-15 lecturers were working in small teams of 2-4 people on
different chapters for the presentation slides. For each chapter, the corresponding team would
also provide suitable exercise templates that could be used for self-study materials, exams,
and solutions.
The pool of exercises needs to contain not only different types of exercises but also
variants of the same type, so that students cannot simply learn the solutions "by heart". Correction
should be fast and easy. This restricts the suitable types of exercises to those that have
a single number as the result, which only needs to be checked to assess correctness, to
multiple-choice questions, and potentially to questions that require only a short text answer.
These requirements on maintenance, variation, and correction of exercises led to the following
design principles for package exams:
• Maintenance: Each exercise template is a single file (also just called "exercise").
• Variation: Exercises are dynamic documents, containing a problem/solution along with
a data-generating process (DGP) so that random samples can be drawn easily.
• Correction: Solutions for exercises are either multiple-choice answers (logical vectors),
numeric values (e.g., a test statistic or a confidence interval), short text answers (e.g.,
the appropriate null hypothesis corresponding to a given problem), or combinations of
these.
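To make these principles concrete, the following is a minimal sketch in plain R of an exercise that bundles a DGP, a question text, and the three kinds of solutions. It is not the Sweave format used by the package: the function name make_ttest_exercise, the returned list structure, and the t-test setting are all assumptions chosen for illustration.

```r
# Hypothetical helper, NOT part of the exams package: one "exercise"
# combines a data-generating process (DGP), a question, and its solutions.
make_ttest_exercise <- function(seed) {
  set.seed(seed)                               # each seed gives one random variant

  # DGP: draw a sample size and the observations
  n <- sample(20:40, 1)
  x <- round(rnorm(n, mean = 50, sd = 10), 1)

  # Solution computed from the generated data
  tstat <- (mean(x) - 50) / (sd(x) / sqrt(n))

  list(
    question = sprintf(
      "A sample of n = %d has mean %.2f and standard deviation %.2f. Compute the t statistic for H0: mu = 50.",
      n, mean(x), sd(x)
    ),
    # numeric solution: the test statistic
    numeric = round(tstat, 3),
    # multiple-choice solution as a logical vector, one entry per statement:
    # (a) "H0 is rejected at the 5% level", (b) "The sample mean exceeds 50"
    mchoice = c(abs(tstat) > qt(0.975, df = n - 1), mean(x) > 50),
    # short text solution: the null hypothesis
    string = "mu = 50"
  )
}

ex <- make_ttest_exercise(seed = 1)
cat(ex$question, "\n")
str(ex[c("numeric", "mchoice", "string")])
```

Calling the function with different seeds yields different variants of the same exercise, while the same seed reproduces a variant exactly, which is what makes automatic correction possible.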
Thus, the DGP of an exercise controls the distribution of possible solutions and can be utilized
to make them (approximately) evenly distributed and difficult to "guess" or "learn by heart".
In addition to the variability within an exercise, one can add further variation by providing
several exercise templates for the same type of problem. Depending on the flexibility of the
DGP, the pool of exercises can thus be rather small or needs to be somewhat larger.
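One way a DGP can control the distribution of solutions is to draw the correct answer first and then construct the problem to match it. The sketch below illustrates this for a single true/false statement; the function draw_item, the offset range, and the simulated data are hypothetical and are not taken from the package.

```r
# Hypothetical sketch, NOT exams-package code: the DGP decides the answer
# first, then builds a statement that has exactly this answer.
draw_item <- function(x) {
  truth  <- sample(c(TRUE, FALSE), 1)        # TRUE and FALSE equally likely
  offset <- runif(1, min = 0.5, max = 2)     # distance of threshold from the mean
  # threshold below the mean makes the statement true, above makes it false
  threshold <- if (truth) mean(x) - offset else mean(x) + offset
  list(
    statement = sprintf("The sample mean is larger than %.2f.", threshold),
    threshold = threshold,
    solution  = truth
  )
}

set.seed(42)
x <- rnorm(30, mean = 10, sd = 2)            # illustrative data set

# Draw many variants and inspect how the solutions are distributed
items     <- replicate(2000, draw_item(x), simplify = FALSE)
solutions <- vapply(items, function(it) it$solution, logical(1))
mean(solutions)                              # share of TRUE answers, close to 0.5
```

Because the answer is drawn before the statement is built, the share of true statements is fixed by design rather than left to the data, so neither "always true" nor "always false" is a useful guessing strategy.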
Mixing problems/solutions and DGPs for exam generation poses challenges that are similar
to those of making data analysis reproducible. Thus, exams employs many ideas from literate
data analysis