|Netflix Releases Polynote|
|Written by Kay Ewbank|
|Thursday, 31 October 2019|
Netflix has released Polynote, a new open source polyglot notebook with first-class Scala support, Apache Spark integration, and multi-language interoperability across Scala, Python, and SQL.
Polynote is described as providing a notebook environment that allows data scientists and machine learning researchers to integrate Netflix's JVM-based ML platform with the Python ecosystem’s machine learning and visualization libraries. Netflix's ML platform, Infra, uses Scala and is used to create personalized recommendations for "discoveries of engaging video content that maximize member joy".
The developers on the Netflix Personalization Infrastructure team were frustrated with the Scala support in existing notebook tools. Such tools are mainly aimed at Python developers working in an environment constructed using a package manager with a relatively small number of dependencies. In contrast, Scala developers typically work in a project-based environment with a build tool managing hundreds of dependencies, some of which conflict with others. In addition, developers using Spark need their distributed code to work no matter which node of the cluster it's running on.
Polynote aims to support this way of working by saving configuration and dependency setup information within the notebook itself. Another feature of Polynote is "reproducibility by design". In most notebooks, cells can be modified and executed independently and in any order, yet they can also depend on the output of other cells. As a result, a notebook often can't be reliably rerun from the top, which makes it very difficult to reproduce and share. By taking a cell’s position in the notebook into account when executing it, Polynote helps prevent the bad practices that cause this problem.
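To make the idea of position-aware execution concrete, here is a minimal Python sketch. It is a toy model, not Polynote's implementation: the `ToyNotebook` class, its `run_cell` method, and the `defined` list are hypothetical names invented for illustration. The only assumption carried over from the description above is that a cell should see values produced by cells above it, and never values from cells below it.

```python
from typing import Any, Dict, List


class ToyNotebook:
    """Toy model of position-aware execution (hypothetical, not Polynote's code)."""

    def __init__(self, cells: List[str]) -> None:
        self.cells = cells
        # defined[i] holds only the names that cell i itself produced.
        self.defined: List[Dict[str, Any]] = [{} for _ in cells]

    def run_cell(self, index: int) -> Dict[str, Any]:
        # Build the input scope only from cells above this one, in order.
        scope: Dict[str, Any] = {}
        for earlier in self.defined[:index]:
            scope.update(earlier)

        env = dict(scope)
        exec(self.cells[index], env)  # run the cell source against that scope

        # Record names the cell created or rebound, ignoring exec's builtins.
        self.defined[index] = {
            name: value
            for name, value in env.items()
            if name != "__builtins__"
            and (name not in scope or scope[name] is not value)
        }
        return self.defined[index]


# Cell 0 uses `x`, but `x` is only defined further down, in cell 1.
notebook = ToyNotebook(["y = x + 1", "x = 1"])
notebook.run_cell(1)  # defines x

try:
    notebook.run_cell(0)  # x lives below cell 0, so it is not visible here
    leaked = True
except NameError:
    leaked = False

print("x leaked upward:", leaked)
```

In a notebook backed by a single shared global namespace, running cell 1 and then cell 0 would succeed, and the notebook would then fail when rerun from the top. In this sketch the out-of-order dependency fails immediately, which is the kind of early signal that position-aware execution is meant to give.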
Other benefits, according to the development team, include the ability to write each cell in a notebook in a different language, with variables shared between them. Currently Scala, Python, and SQL cell types are supported. The software is also integrated with matplotlib and Vega to give users a way to communicate with others through visualizations.
|Last Updated ( Thursday, 31 October 2019 )|