1.6 Design Principles
Educational design principles guided both the design of the autograder and the direction of the exercises we created. While these principles shaped the autograder's features and user interface, they are also important as guidelines for how instructors and content authors should think about writing exercises and deploying them in the classroom.
1.6.1 Dual Modalities
A central challenge in the development of the autograder was deciding what to prioritize: feedback or grading? While each feature supports the other, there is an acute tension between a tool primarily motivated by providing open-ended feedback and one designed to assign grades. The goal of this section is to consider the best possible interface we could give students to help them improve their programming skills and complete lab exercises.
A key aspect of this tension is how to handle the idea of correctness. In an introductory computer science course, we are often lenient about small differences in a program's output. (For example, Snap! allows students to use both traditional arrays and linked lists, which have the same visual output. If test authors are not careful, it is easy to mark one data type as incorrect when it should be accepted.) While it is often possible to account for these differences when writing test cases, doing so can be quite difficult. We need to make sure that when these tools are used for grading, they do not cause students unnecessary stress or frustration.
1.6.2 Learner-Centered Design
Learner-centered design (LCD) is a design principle adapted from user-centered design (UCD) [Citation not found] [Citation not found] [Citation not found]. Both LCD and UCD begin by establishing attributes of the user and their goals. LCD identifies four main attributes:
• Learners do not possess the same domain expertise as users.
• Learners are heterogeneous.
• Learners may not be intrinsically motivated in the same manner as experts.
• Learners’ understanding grows as they engage in a new work domain.
While languages like Snap! are designed to be easier to learn, they do not necessarily employ LCD principles, because they are still intended as general-purpose programming environments. We can ensure our autograder takes each of these principles into account:
Domain expertise: Programming languages must show error messages that make sense in any situation. Unfortunately, this means they usually fall back to lowest-common-denominator messages and do not provide any contextual information to the user. In Snap!, it is not uncommon to see a message like "Type Error: expecting a list but got number". We can improve upon these messages by showing students hints that are specific to the problem at hand.
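Such hints could be implemented as a simple lookup from a generic runtime message to an exercise-specific explanation. The sketch below is illustrative only (the exercise name, the hint text, and the `explain_error` helper are hypothetical), not the autograder's actual mechanism:

```python
# Hypothetical sketch: wrapping a generic type error with a
# problem-specific hint. HINTS and explain_error are illustrative
# names, not part of the real autograder.

HINTS = {
    # keyed by (exercise, generic error message) -> contextual hint
    ("map-practice", "expecting a list but got number"):
        "Your reporter should return the whole list, not a single "
        "element. Check what value the MAP block is being given.",
}

def explain_error(exercise: str, generic_message: str) -> str:
    """Return a contextual hint if one is authored, else the raw message."""
    hint = HINTS.get((exercise, generic_message))
    if hint is None:
        return generic_message
    return f"{generic_message}\nHint: {hint}"

print(explain_error("map-practice", "expecting a list but got number"))
```

Unmatched messages fall through unchanged, so students never see less information than the language already provides.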
Heterogeneity: This is one of the harder aspects for the autograder to handle. Not everyone approaches problems in the same way. Our general approach is to be as lenient as possible (while still ensuring correctness) when writing test cases. We have spent considerable time on how authors should handle different output formats so that students avoid nit-picky errors.
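One way to sketch this leniency is to normalize superficial differences (whitespace, letter case, floating-point noise, sequence type) before comparing outputs. The rules and names below are assumptions for illustration, not the grader's actual behavior:

```python
# Hypothetical sketch of a "lenient" output comparator. The specific
# normalization rules are illustrative assumptions.

def normalize(value):
    """Reduce superficial differences before comparing outputs."""
    if isinstance(value, str):
        return value.strip().lower()        # ignore padding and case
    if isinstance(value, float):
        return round(value, 6)              # tolerate float noise
    if isinstance(value, (list, tuple)):
        # Treat arrays and other sequence types interchangeably,
        # normalizing each element recursively.
        return [normalize(v) for v in value]
    return value

def outputs_match(expected, actual) -> bool:
    return normalize(expected) == normalize(actual)

# Superficially different but semantically equal outputs are accepted:
assert outputs_match("hello", "  Hello ")
assert outputs_match([1, 2.0000001], (1, 2.0))
```

Centralizing these rules in one comparator means individual test authors do not have to remember every acceptable variation themselves.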
Motivation: We try to motivate users by carefully choosing how we present the tool and the results. In class, and in the text which appears on screen we try to downplay the idea of grades or errors and instead focus on helping students improve.
Changing Understanding: Dynamically capturing a user's understanding is incredibly difficult. At this point, we are not able to dynamically adjust the exercises or feedback presented, but we have planned possible methods for doing so. Currently, the best way for us to approximate this is to have Teaching Assistants (TAs) and instructors, who are conscious of students' needs, recommend different practice problems to students.
1.6.3 Knowledge Integration
Knowledge integration (KI) is a framework for approaching how students should synthesize information [Citation not found]. The KI framework has four components for organizing ideas:
- Adding Knowledge involves bringing in new ideas that students have not seen before.
- Eliciting Ideas is the process of critically examining ideas students already know.
- Distinguishing Knowledge asks students to take multiple ideas and figure out how they fit together; whether they are compatible or not.
- Reflecting is the process of drawing conclusions from what students have learned.
We used KI as a basis for writing the feedback messages students see through the autograder. The goal is to focus primarily on the eliciting ideas and distinguishing knowledge components. We chose these two because many common computer science problems can be viewed through this lens: systematically debugging code follows a process of eliciting ideas when trying to figure out why something is broken, and distinguishing knowledge (such as the differences between two kinds of loops, or between recursion and iteration) is a natural part of programming.
We chose not to use the adding knowledge component because the autograder is not currently set up to give good feedback while students are doing exploratory work (where they would be most likely to encounter new concepts). However, these types of messages will likely appear in future versions. Similarly, while reflecting is a valuable step, we do not yet have the capability to collect responses to, or give feedback on, open-ended reflection questions.
However, writing proper KI messages proved challenging in the current setup. The initial version of the autograder was designed around presenting the results of test cases rather than longer forms of feedback. (This is one area for improvement.) Furthermore, trying to follow KI occasionally led to messages that did not fit well within the rest of the BJC curriculum, which was not designed around the KI framework.
1.6.4 "TA-Centered Design"
Though this is certainly lower on the priority list than learner-centered design, we make a point of describing TA-centered design and why it matters for the tools we build. Teaching Assistants (TAs) and instructors are critical users of course infrastructure. They need to be able to easily update and create content, handle grades, and so on. The longer or more difficult these tasks are, the less time TAs can spend helping students learn.
When considering TA-centered design, a TA is much more like a typical user in UCD than a learner in LCD, but several considerations apply specifically to TAs:
- TAs often lack pedagogical content knowledge (PCK): the distinction between knowing how to program and knowing how to teach programming. TAs can benefit from guidance in applying good pedagogy.
  - The admin dashboards built into λ give TAs more insight than they currently have into how students are performing and how often they are completing the lab work. While the dashboards still have limited functionality, they are an improvement and give TAs a reason to keep using the system.
  - The test case authoring interface should make it effortless to write consistent and detailed feedback.
- While TAs are motivated to teach, they are not always motivated to complete the extra work required of them, such as grading or writing assignments.
  - Test cases need to be easy to write and upload. The Implementation chapter describes the problems with our initial approach using edX's tools, and the Future Work chapter describes how we can improve the experience of writing autograder test files to lower the barrier.
- TAs, like most users, have a limited amount of time to complete their work, and they are often not experts in the tools required for their teaching duties, such as LMSs and grading systems like λ.
  - We should limit (or automate) repetitive tasks, especially ones that involve configuration.
  - The ability to automatically upload grades is a huge time saver that allows TAs to focus on more important tasks and lets students stop worrying about the status of their grades.
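To make easy test authoring concrete, one possible direction is a small declarative format that TAs fill in without touching grader internals. The field names and the `run_tests` helper below are hypothetical, sketched only to illustrate the idea:

```python
# Hypothetical sketch of a declarative test-case format a TA could
# author. TESTS and run_tests are illustrative names, not part of
# the actual system.

TESTS = [
    {"name": "doubles a number",
     "input": 21,
     "expected": 42,
     "feedback": "Your block should report twice its input."},
    {"name": "handles zero",
     "input": 0,
     "expected": 0,
     "feedback": "What should doubling zero report?"},
]

def run_tests(student_fn, tests):
    """Run each case, pairing any failure with its authored feedback."""
    results = []
    for case in tests:
        passed = student_fn(case["input"]) == case["expected"]
        results.append((case["name"], passed,
                        None if passed else case["feedback"]))
    return results

# A correct submission passes every case; a wrong one surfaces the
# TA-written feedback instead of a bare failure.
results = run_tests(lambda x: x * 2, TESTS)
```

Because the TA only writes data (inputs, expected outputs, feedback strings), the format needs no knowledge of the grading engine itself.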
While these three ideas may seem obvious, they are important to recognize if our work is to be used beyond our initial test implementation. The success of a new autograder, even one that benefits students, requires that TAs be comfortable configuring and writing new autograder tests.
Ultimately, we are trying to recognize that TAs (and individual instructors) are already limited on time. Making test cases as easy as possible to write is a necessary part of the process.