Doctor Code: A machine learning-based approach to program repair

Document Type: Article

Authors

Faculty of Computer Science and Engineering, Shahid Beheshti University, Tehran, Iran

Abstract

To address the limitations of existing automatic program repair (APR) techniques, we present Doctor Code, a new APR technique that uses machine learning to choose repair operators by systematically learning from the features of the most common bugs in different programs. Selecting repair operators in this informed way reduces the number of candidate patches. We compare our technique against Mutation repair, a test suite-based APR technique, using the Siemens suite. The experimental results indicate that Doctor Code fixes 41 bugs, whereas the baseline repairs only 22. In addition, Doctor Code can produce patches that do not exist in the search space of three test suite-based techniques: SPR, Prophet, and SemFix. To demonstrate its ability to repair large programs, we also apply Doctor Code to three buggy versions of a program called Space (9K LOC). Furthermore, we compare Doctor Code against 7 state-of-the-art APR tools, such as Elixir, using the Defects4J dataset. The results indicate that our technique outperforms the other tools in both the number of fixed bugs and the number of overfitted patches. Finally, a comparison with RAPR as the baseline indicates that using machine learning reduces the number of overfitted patches by 33.33% and the patch production time by 82.68%.
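
The idea of learning which repair operators to try first can be illustrated with a minimal sketch. This is not the authors' implementation: the feature names, operator names, training examples, and the simplified naive Bayes scoring below are all hypothetical and chosen only for illustration. The sketch learns, from past fixes, how often each operator repaired bugs with given features, and then ranks operators for a new bug so that only the top-ranked ones are used to generate candidate patches.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data (not from the paper): each entry pairs the
# features of a buggy statement with the repair operator that fixed it.
TRAINING = [
    ({"in_condition", "has_relational_op"}, "replace_relational_operator"),
    ({"in_condition", "has_relational_op"}, "replace_relational_operator"),
    ({"in_condition", "has_logical_op"}, "replace_logical_operator"),
    ({"is_assignment", "has_arithmetic_op"}, "replace_arithmetic_operator"),
    ({"is_assignment", "has_constant"}, "change_constant"),
    ({"in_condition", "has_constant"}, "change_constant"),
]


def train(examples):
    """Count how often each operator fixed a bug, overall and per feature."""
    op_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for feats, op in examples:
        op_counts[op] += 1
        for f in feats:
            feat_counts[op][f] += 1
            vocab.add(f)
    return op_counts, feat_counts, vocab


def rank_operators(model, feats, top_k=2):
    """Return the top_k operators for a bug, scored by a simplified naive Bayes.

    Only features present on the bug contribute to the score; counts are
    Laplace-smoothed so unseen (operator, feature) pairs are not ruled out.
    """
    op_counts, feat_counts, vocab = model
    total = sum(op_counts.values())
    scores = {}
    for op, n in op_counts.items():
        score = math.log(n / total)  # log prior of the operator
        for f in feats & vocab:
            score += math.log((feat_counts[op][f] + 1) / (n + 2))
        scores[op] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


model = train(TRAINING)
# Operators to try first for a bug located in a condition with a relational operator.
print(rank_operators(model, {"in_condition", "has_relational_op"}))
```

Restricting patch generation to the top-ranked operators is what shrinks the candidate space in this sketch; a real system would use a richer feature set and a trained classifier in place of the toy counts.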
