A Recap On Our Pilot Webinar With Boris Cergol

Speaking on how Deep Learning can support software development.

For years, there have been many debates and discussions surrounding the possibilities of Artificial Intelligence (AI) and Machine Learning (ML). With more and more technologies able to do things that humans do in homes and businesses across the globe, this is an area that is fast evolving and new approaches are constantly emerging.

Take Deep Learning (DL), for example. A subset of machine learning in which artificial neural networks learn from large volumes of data, it is already being used in a variety of different areas. However, it is only recently that there has been a surge of interest in researching how DL methods can be applied to data sets containing source code.

DL certainly offers fantastic potential and there are indications that this particular form of ML will have a major impact on software engineering practices in the years to come.

Code search;

For some time, ML has been used for code search, for example when you have a question in natural language and are seeking a snippet of code that answers it. A key part of retrieving code fragments (the answer) from larger masses of code involves code embeddings, which are mappings of code fragments to vectors of numbers. DL is very effective not only at finding these embeddings but also at identifying better representations in complex spaces.
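To make the idea of embeddings concrete, here is a minimal sketch in Python. It is not a DL model: the `embed` function below is a hypothetical stand-in that simply counts tokens, whereas a real system would use vectors learned by a neural network. The snippets and function names are invented for illustration. What carries over is the retrieval step, where the query and each code fragment are mapped to vectors and compared by cosine similarity.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: a sparse bag-of-tokens vector.
    Assumption: stands in for a learned DL embedding."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical code base to search over.
SNIPPETS = [
    "def read_file(path): return open(path).read()",
    "def sort_list(items): return sorted(items)",
    "def reverse_string(s): return s[::-1]",
]

def search(query, snippets):
    """Return the snippet whose vector is closest to the query's vector."""
    q = embed(query)
    return max(snippets, key=lambda s: cosine(q, embed(s)))

best = search("how to sort a list of items", SNIPPETS)
```

In this sketch the match relies on shared words, so it would miss a query phrased with different vocabulary. Learned embeddings are meant to close exactly that gap.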

Code completion;

Another area where DL has been applied, and where it is more advanced, is code suggestion and completion. Here, the aim is to assist the software engineer with useful suggestions for completing a given piece of source code, much like predicting the end of a sentence once the first part has been entered. As well as auto-completing code, DL can offer a range of suggestions for the content and, in some rare cases, recommendations for reviews of code snippets that have been written.
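The prediction step can be illustrated with a much simpler model than a neural network. The sketch below, which is an assumption for illustration only, counts which token tends to follow which in a tiny invented corpus and suggests the most frequent followers. A DL-based completer replaces the counting with a learned model that looks at far more context, but the interface is the same: given what has been typed, rank what comes next.

```python
from collections import Counter, defaultdict

def train_bigrams(lines):
    """Count, for each token, which tokens follow it in the corpus."""
    model = defaultdict(Counter)
    for line in lines:
        tokens = line.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            model[prev][nxt] += 1
    return model

def suggest(model, prev_token, k=3):
    """Return up to k most frequent followers of prev_token."""
    followers = model.get(prev_token, Counter())
    return [tok for tok, _ in followers.most_common(k)]

# Hypothetical, pre-tokenized training corpus.
CORPUS = [
    "for i in range ( n ) :",
    "for item in items :",
    "for i in range ( len ( xs ) ) :",
]

model = train_bigrams(CORPUS)
top = suggest(model, "in", k=1)
```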

Unsupervised pre-training;

The idea here is that you train a DL model on a very large amount of data without supervision. You mask one part of the code, the goal being that the model predicts what is missing based on the surrounding information. If successful, the resulting model can be used for all sorts of things, from natural language processing to analyzing data sets containing source code.
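As a rough sketch of how such training data could be prepared, the Python snippet below turns one tokenized line of code into (masked input, target) pairs. The `<mask>` placeholder and the function name are assumptions made for this example. For simplicity it masks every position in turn, whereas real pre-training setups usually mask only a random subset of tokens.

```python
MASK = "<mask>"  # assumed placeholder token

def make_masked_examples(tokens):
    """For each position, hide that token and keep it as the prediction target."""
    examples = []
    for i, target in enumerate(tokens):
        masked = tokens[:i] + [MASK] + tokens[i + 1:]
        examples.append((masked, target))
    return examples

tokens = ["return", "a", "+", "b"]
examples = make_masked_examples(tokens)
# examples[2] pairs the input with "+" hidden and the target "+"
```

No labels have to be written by hand here: the code itself supplies the answers, which is what makes this kind of training possible on very large data sets.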

Code translation;

Motivated by results from natural language processing, DL can also be used to translate source code from one programming language into another. However, while there have been some interesting results in this area, this application of DL still has severe limitations, and more research and development are required.

Detection of code defects;

DL can also be applied to detect code defects or bugs in early software development and provide suggestions on how to remove or repair them. This can help to speed up the development process, saving time dealing with issues at a later stage, and the model can even be trained to identify and handle a specific type of defect.

Program synthesis;

The use of deep neural networks in learning how to generate computer programs based on the inputs and outputs defined by the software engineer is becoming more common and effective. It presents many opportunities in terms of software development, reinforcement learning algorithms and more general AI application. However, there are many challenges involved here including the best approach to take (neural versus symbolic).
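To show what "generating a program from inputs and outputs" means in the simplest case, here is a small sketch of the symbolic side of that debate. It is a brute-force search, not a neural network, and the three primitives are invented for the example: it tries ever longer chains of operations until one reproduces every input/output pair. Neural approaches aim, among other things, to guide or replace this kind of search, which quickly becomes infeasible as programs grow.

```python
from itertools import product

# Hypothetical mini-language: each primitive maps an integer to an integer.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def run(names, x):
    """Apply the named primitives to x, left to right."""
    for name in names:
        x = PRIMITIVES[name](x)
    return x

def synthesize(examples, max_depth=3):
    """Return the shortest chain of primitives consistent with all examples, or None."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            if all(run(names, i) == o for i, o in examples):
                return list(names)
    return None

# Input/output pairs defined by the engineer; here they describe f(x) = 2 * (x + 1).
program = synthesize([(1, 4), (2, 6), (3, 8)])
```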

It is certain that some of these developments will have a major impact on the world of software engineering. The question, therefore, is not ‘if’ but ‘when’ these practices will be implemented on a more widespread basis. Significant progress has been made already. When DL first emerged, the frameworks were complex, which made working with them time-consuming and demanded deep technical expertise. Now, DL is much easier to use and there is greater engagement from software engineers.

William Gibson once said: “the future is already here; it is just unevenly distributed.”

We say: “DL is already here; we just haven’t seen its true potential yet.”

There’s more to come!