Automated Code Review System

By Sonu J on 19th January 2024

Problem statement

Manual code review processes can be time-consuming, and identifying code quality issues may require significant effort. The project addresses this challenge by developing an automated code review system that leverages static code analysis and machine learning to identify code quality issues efficiently and provide constructive feedback to developers.

Abstract

This project focuses on automating the code review process by implementing a system that analyzes source code using static code analysis techniques. Machine learning algorithms will be employed to identify common code quality issues, potential bugs, and adherence to coding standards. The goal is to streamline the code review process, improve code quality, and enhance collaboration among development teams.
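As a rough illustration of the static analysis step, the sketch below walks a Python syntax tree with the standard `ast` module and reports a few common quality issues. The rule set, the `MAX_PARAMS` threshold, and the names `Finding` and `review_source` are assumptions made for this example, not part of the project description. A fuller system might feed findings like these to a machine learning model as features.

```python
import ast
from dataclasses import dataclass

MAX_PARAMS = 5  # hypothetical threshold, chosen only for this example


@dataclass
class Finding:
    line: int      # 1-based line number in the analyzed source
    message: str   # human-readable review comment


def review_source(source: str) -> list[Finding]:
    """Run a few illustrative static checks and return findings sorted by line."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Rule 1 (assumed): every function should have a docstring.
            if ast.get_docstring(node) is None:
                findings.append(
                    Finding(node.lineno, f"function '{node.name}' has no docstring"))
            # Rule 2 (assumed): long parameter lists are hard to review.
            if len(node.args.args) > MAX_PARAMS:
                findings.append(
                    Finding(node.lineno, f"function '{node.name}' takes too many parameters"))
        elif isinstance(node, ast.ExceptHandler) and node.type is None:
            # Rule 3 (assumed): a bare 'except:' can hide real errors.
            findings.append(Finding(node.lineno, "bare 'except:' clause"))
    # sorted() is stable, so findings on the same line keep their discovery order.
    return sorted(findings, key=lambda f: f.line)


SAMPLE = '''
def load(path, mode, encoding, errors, newline, buffering):
    try:
        return open(path).read()
    except:
        return None
'''

for finding in review_source(SAMPLE):
    print(f"line {finding.line}: {finding.message}")
```

The checks here are purely rule-based. They only show the kind of signal a static analyzer can extract before any learning component is involved.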

Outcome

  • Implementation of an automated code review system using static code analysis.
  • Identification of code quality issues and potential bugs.
  • Improved code quality and efficiency in the software development process.

Reference

Code review is a common process that is used by developers, in which a reviewer provides useful comments or points out defects in the submitted source code changes via pull request. Code review has been widely used for both industry and open-source projects due to its capacity in early defect identification, project maintenance, and code improvement. With rapid updates on project developments, code review becomes a non-trivial and labor-intensive task for reviewers. Thus, an automated code review engine can be beneficial and useful for project development in practice. Although there exist prior studies on automating the code review process by adopting static analysis tools or deep learning techniques, they often require external sources such as partial or full source code for accurate review suggestion. In this paper, we aim at automating the code review process only based on code changes and the corresponding reviews but with better performance. The hinge of accurate code review suggestion is to learn good representations for both code changes and reviews. To achieve this with limited source, we design a multi-level embedding (i.e., word embedding and character embedding) approach to represent the semantics provided by code changes and reviews. The embeddings are then well trained through a proposed attentional deep learning model, as a whole named CORE. We evaluate the effectiveness of CORE on code changes and reviews collected from 19 popular Java projects hosted on GitHub. Experimental results show that our model CORE can achieve significantly better performance than the state-of-the-art model (DeepMem), with an increase of 131.03% in terms of Recall@10 and 150.69% in terms of Mean Reciprocal Rank. Qualitative general word analysis among project developers also demonstrates the performance of CORE in automating code review.
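The two metrics named in the abstract, Recall@10 and Mean Reciprocal Rank, can be sketched in a few lines. This is a minimal sketch that assumes the common retrieval definitions: a query counts as a hit for Recall@k if any relevant review appears in the top k, and the reciprocal rank uses the position of the first relevant review. The paper's exact definitions may differ, and the function names and toy data are hypothetical.

```python
def recall_at_k(ranked_lists, relevant_sets, k=10):
    """Fraction of queries with at least one relevant item in the top k."""
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)


def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item (0 if none is retrieved)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)


# Toy data (made up): three code changes, each with a ranked list of review ids.
ranked = [["r3", "r1", "r7"], ["r2", "r9", "r4"], ["r5", "r6", "r8"]]
relevant = [{"r1"}, {"r4"}, {"r0"}]

print(recall_at_k(ranked, relevant, k=2))      # 1 of 3 queries hit within the top 2
print(mean_reciprocal_rank(ranked, relevant))  # (1/2 + 1/3 + 0) / 3
```

On the toy data, only the first change has a relevant review in its top two, and the third change never retrieves one, so it contributes zero to both metrics.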

1. M. E. Fagan, “Design and code inspections to reduce errors in program development”, IBM Systems Journal, vol. 15, no. 3, pp. 182-211, 1976.

2. C. Sadowski, E. Söderberg, L. Church, M. Sipko and A. Bacchelli, “Modern code review: A case study at Google”, International Conference on Software Engineering: Software Engineering in Practice track (ICSE SEIP), 2018.

3. M. Beller, A. Bacchelli, A. Zaidman and E. Juergens, “Modern code reviews in open-source projects: Which problems do they fix?”, Proceedings of the 11th Working Conference on Mining Software Repositories, ser. MSR 2014, pp. 202-211, 2014.

4. L. Fan, T. Su, S. Chen, G. Meng, Y. Liu, L. Xu, et al., “Large-scale analysis of framework-specific exceptions in Android apps”, 2018 IEEE/ACM 40th International Conference on Software Engineering (ICSE), pp. 408-419, 2018.

5. L. Fan, T. Su, S. Chen, G. Meng, Y. Liu, L. Xu, et al., “Efficiently manifesting asynchronous programming errors in Android apps”, Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ACM, pp. 486-497, 2018.

https://ieeexplore.ieee.org/document/9054794/references#references