Kedro Open Source Library For Machine Learning
Written by Kay Ewbank   
Thursday, 20 June 2019

A new open source development workflow framework for creating machine learning code has been released. Kedro has PySpark integration and an SDK for working with datasets.

Kedro has been developed by QuantumBlack, an analytics firm acquired by McKinsey in 2015, and the name Kedro derives from the Greek word meaning center or core. Kedro helps structure your data pipeline using software engineering principles. It also provides a standardized approach to collaboration for teams.

The rationale for developing Kedro is that data scientists aren't used to working in teams, so a common ground needs to be agreed on for efficient collaboration. Kedro is designed to let teams adopt an unbiased standard. It is also designed to create code that is reproducible, modular, monitored, tested and well documented.

Kedro has been deployed internally at QuantumBlack and McKinsey on over 50 projects, and the developers say it has revolutionised their workflows.

The software is based on a standard project template that can be configured for credentials, logging, data loading and Jupyter Notebooks. It has Sphinx integration for creating documentation, and it takes care of data abstraction and versioning.

Kedro has support for pure Python functions (nodes) to break large chunks of code into small independent sections. It has automatic resolution of dependencies between nodes, and there are plans for a visualization tool that will show you the pipeline structure of Kedro projects.
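
To give a feel for the node idea, here is a minimal sketch in plain Python. It does not use Kedro's API: the node table, the run helper and all the function and data names are hypothetical, and the ordering is done with the standard library's graphlib module (Python 3.9 or later). It only illustrates how pure functions that declare named inputs and outputs can be ordered automatically.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def clean(raw):
    """Drop missing values from the raw records."""
    return [x for x in raw if x is not None]


def total(cleaned):
    """Sum the cleaned records."""
    return sum(cleaned)


def report(cleaned, total_value):
    """Summarise the cleaned records as a short string."""
    return f"{len(cleaned)} records, total {total_value}"


# Hypothetical node table: (function, input names, output name).
# Deliberately listed out of execution order.
NODES = [
    (report, ["cleaned", "total_value"], "summary"),
    (total, ["cleaned"], "total_value"),
    (clean, ["raw"], "cleaned"),
]


def run(nodes, data):
    """Run every node once, ordered by its data dependencies."""
    producers = {output: (func, inputs) for func, inputs, output in nodes}
    # A node depends on whichever nodes produce its inputs. Names with no
    # producer (such as "raw") must be supplied by the caller in `data`.
    graph = {
        output: {name for name in inputs if name in producers}
        for output, (func, inputs) in producers.items()
    }
    results = dict(data)
    # static_order() yields each output only after its prerequisites.
    for output in TopologicalSorter(graph).static_order():
        func, inputs = producers[output]
        results[output] = func(*(results[name] for name in inputs))
    return results


results = run(NODES, {"raw": [3, None, 4, 5]})
print(results["summary"])
```

Because each function depends only on its arguments, the nodes can be listed in any order and tested in isolation; the execution order falls out of the input and output names alone.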

Kedro also includes Kedro-Airflow, a tool that lets you prototype your data pipeline in Kedro before deploying to Airflow. There's also Kedro-Docker, a tool for packaging and shipping Kedro projects in Docker containers.

Kedro can be deployed locally, on on-premises and cloud (AWS, Azure and GCP) servers, or on clusters (EMR, Azure HDInsight, GCP and Databricks).

It is suitable for a wide range of applications, from single-user projects to enterprise-level software driving business decisions backed by machine learning models.

More Information

Kedro On GitHub

Kedro On PyPI

Last Updated ( Thursday, 20 June 2019 )