College of William and Mary, USA
Deep Learning & Software Engineering: Past, Present and Future
Abstract:
Bridging the abstraction gap between concepts and source code is the essence of Software Engineering (SE). SE researchers regularly use machine learning to bridge this gap, but there are two fundamental issues with traditional applications of machine learning in SE research. Traditional applications are too reliant on human intuition, and they are not capable of learning expressive yet efficient internal representations. Ultimately, SE research needs approaches that can automatically learn representations of massive, heterogeneous datasets in situ, apply the learned features to a particular task, and possibly transfer knowledge from task to task.
Improvements in both computational power and the amount of memory in modern computer architectures have enabled new approaches to canonical machine learning tasks. Specifically, these architectural advances have enabled machines that are capable of learning deep, compositional representations of massive data depots. This rise of Deep Learning (DL) has led to tremendous advances in several fields. Given the complexity of software repositories, deep learning has the potential to usher in new analytical frameworks and methodologies for SE research and the practical applications it reaches. Conversely, the development of DL algorithms and models represents an entirely new type of software engineering that is still evolving.
This talk examines how DL algorithms can enhance or automate several critical SE tasks involving natural language and code, including program generation and program repair, among others. We demonstrate that deep learners significantly outperform state-of-practice canonical machine learning approaches for these tasks. These examples illustrate the tremendous impact that DL can have on the science and practice of software engineering by moving SE research from the art of feature engineering to the science of automated discovery. We also explore how advancements in software engineering tools and practices can enable further progress in making DL frameworks more accessible and useful for researchers, programmers, and data scientists. The talk will conclude with a discussion of promising future directions of work as well as an overview of potential opportunities for the Deep Learning for SE research community to continue to drive impactful, open, and reproducible work.
Biography: Denys Poshyvanyk is a Full Professor in the Computer Science Department at William and Mary, where he leads the SEMERU research group. He received his Ph.D. from Wayne State University. His current research is in the area of software engineering, evolution and maintenance, program comprehension, applications of machine learning (deep learning) and information retrieval in SE, mobile app (Android) development and testing, reverse engineering, repository mining, source code analysis, and traceability. His papers have received several Best Paper Awards and ACM SIGSOFT Distinguished Paper Awards at the most important SE conferences, and he also received the Most Influential Paper Awards at ICSME'16, ICPC'17, ICPC'20, and ICSME'21. He is a recipient of the NSF CAREER award (2013).