TY - GEN
T1 - Automatic grading of programming assignments
T2 - 41st IEEE/ACM International Conference on Software Engineering: Software Engineering Education and Training, ICSE-SEET 2019
AU - Liu, Xiao
AU - Wang, Shuai
AU - Wang, Pei
AU - Wu, Dinghao
PY - 2019/5
Y1 - 2019/5
N2 - Programming assignment grading can be time-consuming and error-prone if done manually. Existing tools generate feedback with failing test cases. However, this method is inefficient and the results are incomplete. In this paper, we present AutoGrader, a tool that automatically determines the correctness of programming assignments and provides counterexamples given a single reference implementation of the problem. Instead of counting the passed tests, our tool searches for semantically different execution paths between a student's submission and the reference implementation. If such a difference is found, the submission is deemed incorrect; otherwise, it is judged to be a correct solution. We use weakest preconditions and symbolic execution to capture the semantics of execution paths and detect potential path differences. AutoGrader is the first automated grading tool that relies on program semantics and generates feedback with counterexamples based on path deviations. It also reduces the human effort of writing test cases and makes grading more complete. We implement AutoGrader and test its effectiveness and performance with real-world programming problems and student submissions collected from an online programming site. Our experiment reveals no false negatives with the proposed method, and we detected 11 errors made by online platform judges.
AB - Programming assignment grading can be time-consuming and error-prone if done manually. Existing tools generate feedback with failing test cases. However, this method is inefficient and the results are incomplete. In this paper, we present AutoGrader, a tool that automatically determines the correctness of programming assignments and provides counterexamples given a single reference implementation of the problem. Instead of counting the passed tests, our tool searches for semantically different execution paths between a student's submission and the reference implementation. If such a difference is found, the submission is deemed incorrect; otherwise, it is judged to be a correct solution. We use weakest preconditions and symbolic execution to capture the semantics of execution paths and detect potential path differences. AutoGrader is the first automated grading tool that relies on program semantics and generates feedback with counterexamples based on path deviations. It also reduces the human effort of writing test cases and makes grading more complete. We implement AutoGrader and test its effectiveness and performance with real-world programming problems and student submissions collected from an online programming site. Our experiment reveals no false negatives with the proposed method, and we detected 11 errors made by online platform judges.
UR - http://www.scopus.com/inward/record.url?scp=85072117626&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072117626&partnerID=8YFLogxK
U2 - 10.1109/ICSE-SEET.2019.00022
DO - 10.1109/ICSE-SEET.2019.00022
M3 - Conference contribution
T3 - Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering Education and Training, ICSE-SEET 2019
SP - 126
EP - 137
BT - Proceedings - 2019 IEEE/ACM 41st International Conference on Software Engineering
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 25 May 2019 through 31 May 2019
ER -