AI Translates From One Programming Language To Another
Written by Mike James
Wednesday, 10 June 2020
AI is coming for your programming job! Well, that's the headline. The truth is that it is going to take longer to get the human completely out of the loop, but that doesn't mean there aren't tasks that AI can take over now.
The recent paper "Unsupervised Translation of Programming Languages" comes from Facebook AI Research and describes how a neural network learned to translate existing programs from one language to another, a process known as transcompilation. As the paper says:
"A transcompiler, also known as source-to-source translator, is a system that converts source code from a high-level programming language (such as C++ or Python) to another. Transcompilers are primarily used for interoperability, and to port codebases written in an obsolete or deprecated language (e.g. COBOL, Python 2) to a modern one. They typically rely on handcrafted rewrite rules, applied to the source code abstract syntax tree. Unfortunately, the resulting translations often lack readability, fail to respect the target language conventions, and require manual modifications in order to work properly. The overall translation process is time-consuming and requires expertise in both the source and target languages, making code-translation projects expensive."
This is, of course, the reason that COBOL is still used in so many financial systems. If you have ever tried translating even a small program between two similar languages, Python 2 to 3 say, you know that it is surprisingly difficult. Things that you never thought about crop up and stop the translation from going according to plan.
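To see why handcrafted rewrite rules are fragile, here is a minimal sketch of one applied to the abstract syntax tree using Python's standard `ast` module. The rule itself (`NaiveDivisionRule`) and the helper `port_py2_division` are made up for illustration and are not taken from the paper or from any real porting tool. The rule tries to preserve Python 2's integer division by rewriting every `/` as `//`, which is only correct when both operands are integers.

```python
import ast


class NaiveDivisionRule(ast.NodeTransformer):
    """Hypothetical rewrite rule: turn every `a / b` into `a // b`.

    Python 2's `/` floors when both operands are ints, so the rule looks
    reasonable, but it is wrong whenever an operand is a float.
    """

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Div):
            node.op = ast.FloorDiv()
        return node


def port_py2_division(source):
    """Apply the rule to source text and return the rewritten text."""
    tree = NaiveDivisionRule().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9 or later


if __name__ == "__main__":
    # Integer operands: Python 2 gave 3, and so does the rewritten code.
    print(port_py2_division("7 / 2"), "=", eval(port_py2_division("7 / 2")))
    # Float operand: Python 2 gave 3.5, but the rewritten code gives 3.0.
    print(port_py2_division("7.0 / 2"), "=", eval(port_py2_division("7.0 / 2")))
```

The rule cannot tell the two cases apart because the syntax tree carries no type information, which is exactly the kind of thing that crops up unexpectedly in a port.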
The Facebook group took lots of code from GitHub in C++, Java and Python. The idea was to use techniques from natural language processing to extract the patterns from the languages. The program learned a language-independent representation of a function and then was able to use this to generate the function in another language. The key factor is that this representation was learned in an unsupervised way - that is, no human told the neural network what the program did, there was no target to learn and there was no reinforcement reward applied. The patterns in the language are apparently enough. Which is surprising, but a similar approach works for natural languages - which is even more surprising.
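The article doesn't spell out which natural language processing techniques are involved. One widely used technique that needs no human labels is masked language modelling: hide some tokens and ask the model to predict them from the surrounding context, so the raw text supplies its own training signal. The sketch below shows only the data-preparation step, applied to source code. The tokenizer, the `<MASK>` symbol and the 15% rate are illustrative assumptions, not details taken from the paper.

```python
import random
import re

MASK = "<MASK>"  # placeholder symbol; the exact name is an assumption


def tokenize(code):
    """Split source text into identifiers, integer literals and single symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def mask_tokens(tokens, rate=0.15, seed=0):
    """Hide a random subset of tokens.

    Returns the corrupted sequence plus a dict mapping each hidden
    position to the original token, which is what a model would be
    asked to predict.
    """
    rng = random.Random(seed)
    corrupted, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < rate:
            corrupted.append(MASK)
            targets[position] = token
        else:
            corrupted.append(token)
    return corrupted, targets


if __name__ == "__main__":
    tokens = tokenize("int max(int a, int b) { return a > b ? a : b; }")
    corrupted, targets = mask_tokens(tokens, rate=0.3, seed=1)
    print(" ".join(corrupted))
    print(targets)
```

Nothing here says what the function computes. The only supervision is the code itself, which is the sense in which the patterns in the language are enough.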
The system wasn't trained by presenting it with examples of the same function written in different languages, which is how you might imagine it would work. Instead, what it does is closer to the way a human would do the job: reading the function, understanding what it does and then re-expressing it in the new language.
Does it work?
It seems that it does.
"We observe that TransCoder successfully understands the syntax specific to each language, learns data structures and their methods, and correctly aligns libraries across programming languages."
It doesn't always get it right, but it is impressive enough to suggest that something deep is going on.
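One simple way to decide whether a translated function "gets it right" is to run the original and the translation on the same inputs and compare the results. The sketch below is a hypothetical harness, not the paper's evaluation code; every function name and test case in it is invented for illustration, and both versions are written in Python so the example is self-contained.

```python
def same_behaviour(original, translated, test_inputs):
    """Return True if both functions agree on every test input.

    Each entry in test_inputs is a tuple of positional arguments.
    Agreement on a finite set of inputs is evidence, not proof.
    """
    for args in test_inputs:
        if original(*args) != translated(*args):
            return False
    return True


# Hypothetical example: a reference function and two candidate translations.
def reference_max(a, b):
    return a if a > b else b


def good_translation(a, b):
    return max(a, b)


def bad_translation(a, b):
    return a  # forgets to compare with b


CASES = [(1, 2), (5, 3), (-4, -4), (0, 7)]

if __name__ == "__main__":
    print(same_behaviour(reference_max, good_translation, CASES))
    print(same_behaviour(reference_max, bad_translation, CASES))
```

A check like this judges a translation by what it does rather than by how closely its text matches a reference, which matters because two correct translations can look quite different.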
The idea of a language-independent representation of a program that can be used to create an implementation in a particular language is something that could lead to the sort of AI that could generate programs from requirements or descriptions.
It may not be with us just yet, but, yes, I have to say the day that AI may take over generating programs seems a lot closer after this research.
Unsupervised Translation of Programming Languages by Marie-Anne Lachaux, Baptiste Roziere, Lowik Chanussot and Guillaume Lample