LinkedIn Open Sources Feathr Machine Learning Feature Store
Written by Kay Ewbank   
Friday, 22 April 2022

LinkedIn has made Feathr open source. Feathr is the feature store LinkedIn built to simplify machine learning feature management and improve developer productivity.

The developers say that at LinkedIn dozens of applications use Feathr to define features, compute them for training, deploy them in production, and share them across teams.

Feathr was developed to address a problem LinkedIn faced: preparing and managing features derived from raw data sources for use by machine learning models. LinkedIn has hundreds of ML models running in applications such as Search, Feed, and Ads, and those models are powered by thousands of features about entities.

Preparing and managing features for use by those ML applications is difficult and time-consuming. Feature preparation pipelines are the systems and workflows that transform raw data into features for model training and inference. The pipelines bring together time-sensitive data, potentially from multiple sources, and turn it into features. Those features are then joined to training labels, stored, and used by the ML applications.

Feathr makes it easier to create feature preparation pipelines. It is an abstraction layer that provides a common feature namespace for defining features and a common platform for computing, serving, and accessing them “by name” from within ML workflows.

Feathr can be used to define features based on raw data sources, including time-series data, using simple APIs. Once the features have been defined, they can be accessed by name during model training and model inference. Features can also be shared across teams.
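To make the idea of a named feature over time-series data more concrete, here is a minimal Python sketch. It is not Feathr's API: the `Event` and `WindowFeature` types, the feature name `clicks_7d`, and the sample data are all hypothetical and exist only to illustrate a trailing-window aggregation that is defined once and then referred to by name.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Event:
    """One raw observation for an entity (hypothetical schema)."""
    entity_id: str
    timestamp: datetime
    value: float


@dataclass(frozen=True)
class WindowFeature:
    """A named aggregation over a trailing time window (hypothetical type)."""
    name: str
    window: timedelta
    aggregate: Callable[[List[float]], float]

    def compute(self, events: Iterable[Event], entity_id: str, as_of: datetime) -> float:
        # Keep only this entity's events inside (as_of - window, as_of].
        start = as_of - self.window
        values = [
            e.value
            for e in events
            if e.entity_id == entity_id and start < e.timestamp <= as_of
        ]
        return self.aggregate(values)


# Made-up raw data for illustration only.
events = [
    Event("member_1", datetime(2022, 4, 1), 3.0),
    Event("member_1", datetime(2022, 4, 18), 2.0),
    Event("member_1", datetime(2022, 4, 20), 5.0),
    Event("member_2", datetime(2022, 4, 19), 7.0),
]

clicks_7d = WindowFeature(name="clicks_7d", window=timedelta(days=7), aggregate=sum)
value = clicks_7d.compute(events, "member_1", as_of=datetime(2022, 4, 22))
```

In this sketch the window, the aggregation, and the name travel together, so anyone computing `clicks_7d` gets the same definition rather than re-implementing it.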

Feathr automatically computes feature values and joins them to training data, using point-in-time-correct semantics to avoid data leakage. It also supports deploying features for use online in production.
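The point-in-time idea can be illustrated with a small standalone Python sketch. This is an assumption-laden illustration of the general technique, not Feathr's implementation: the data, the `value_as_of` helper, and the variable names are hypothetical. For each training label, the join picks the most recent feature value recorded at or before the label's observation time, so no value from the future leaks into the training row.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one entity: (timestamp, value), sorted by time.
feature_history = [
    (datetime(2022, 4, 1), 10),
    (datetime(2022, 4, 10), 25),
    (datetime(2022, 4, 20), 40),
]

# Hypothetical training labels: (observation time, label).
labels = [
    (datetime(2022, 3, 30), 0),
    (datetime(2022, 4, 12), 1),
    (datetime(2022, 4, 25), 0),
]


def value_as_of(history, when):
    """Return the latest feature value recorded at or before `when`, else None."""
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, when)  # count of entries with ts <= when
    return history[idx - 1][1] if idx > 0 else None


# Each row: (observation time, feature value known at that time, label).
training_rows = [
    (when, value_as_of(feature_history, when), label) for when, label in labels
]
```

A naive join that always took the latest value (40) would attach information to the 12 April label that did not exist until 20 April, which is the kind of leakage point-in-time semantics are meant to prevent.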

Feathr’s abstraction creates producer and consumer personas for features. Producers define features and register them into Feathr, and consumers access/import groups of features into their ML model workflows.

For the consumer, Feathr acts like a software package management tool for ML features. Feathr lets feature-consumers list the names of the features they want to “import” in their model, abstracting the nontrivial details about how they are sourced and computed.
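The producer and consumer split can be sketched with a toy in-memory registry in Python. Everything here is hypothetical and is not Feathr's API: the `FeatureRegistry` class, its `register` and `get_features` methods, and the feature names are invented to show how a consumer might ask for features by name without knowing how they are computed.

```python
from typing import Callable, Dict, List


class FeatureRegistry:
    """Toy in-memory registry mapping feature names to functions (hypothetical)."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[dict], float]] = {}

    def register(self, name: str, fn: Callable[[dict], float]) -> None:
        # A shared namespace only works if names are unique.
        if name in self._features:
            raise ValueError(f"feature {name!r} is already registered")
        self._features[name] = fn

    def get_features(self, names: List[str], raw: dict) -> Dict[str, float]:
        missing = [n for n in names if n not in self._features]
        if missing:
            raise KeyError(f"unknown features: {missing}")
        return {n: self._features[n](raw) for n in names}


registry = FeatureRegistry()

# Producer side: define how each feature is derived from a raw record.
registry.register("connection_count", lambda raw: float(len(raw["connections"])))
registry.register(
    "profile_completeness",
    lambda raw: raw["filled_fields"] / raw["total_fields"],
)

# Consumer side: list the feature names to "import"; sourcing details stay hidden.
raw_record = {"connections": ["a", "b", "c"], "filled_fields": 8, "total_fields": 10}
features = registry.get_features(
    ["connection_count", "profile_completeness"], raw_record
)
```

Rejecting duplicate names and unknown names loudly is a design choice in this sketch: it mirrors how a package manager fails on a conflicting or unresolved dependency rather than guessing.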

Feathr is available on GitHub now.

More Information

Feathr On GitHub


Last Updated ( Friday, 22 April 2022 )