Apache PredictionIO Reaches Top Level Status
Written by Kay Ewbank   
Thursday, 26 October 2017

PredictionIO is an open source machine learning platform that has been primarily supported by Salesforce. Having been donated to Apache last summer, it has now graduated from the Apache Incubator to become a Top-Level Project, making it more generally available.

PredictionIO is based on an open source machine learning stack that can be used to build, evaluate and deploy engines with machine learning algorithms. It also has an event server that handles events from multiple platforms, and a template gallery with engine templates for different types of machine learning applications. These include recommendation, classification, regression, and natural language processing. The developers say that using the templates cuts down the time to build a recommendation engine to a couple of weeks with one or two engineers.


The code of the engines consists of DASE components:

  • [D] Data Source and Data Preparator
    The Data Source reads data from an input source and transforms it into a desired format. The Data Preparator preprocesses the data and forwards it to the algorithm for model training.

  • [A] Algorithm
    The Algorithm component includes the machine learning algorithm and the settings of its parameters, which together determine how a predictive model is constructed.

  • [S] Serving
    The Serving component takes prediction queries and returns prediction results. If the engine has multiple algorithms, this element will combine the results into one. Additionally, business-specific logic can be added in Serving to further customize the final returned results.

  • [E] Evaluation Metrics
    An Evaluation Metric quantifies prediction accuracy with a numerical score. It can be used for comparing algorithms or algorithm parameter settings.

All the elements can be customized for your particular needs.
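
As a rough illustration of how the four DASE roles divide the work, here is a minimal sketch in plain Python. It is not the PredictionIO API (the real engines are built on Spark and have their own interfaces); every function and class name below, including `PopularityAlgorithm`, is hypothetical, and a simple popularity ranking stands in for a real learning algorithm.

```python
from collections import Counter

# Toy stand-ins for the four DASE roles. All names here are hypothetical and
# are NOT part of PredictionIO; they only mirror the split of responsibilities.


def read_training_data(raw_events):
    """[D] Data Source: read raw events and convert them to (user, item) pairs."""
    return [(e["user"], e["item"]) for e in raw_events]


def prepare(pairs):
    """[D] Data Preparator: drop duplicate pairs before training."""
    return sorted(set(pairs))


class PopularityAlgorithm:
    """[A] Algorithm: rank items by the number of distinct users who touched them."""

    def __init__(self, top_n=2):
        self.top_n = top_n  # parameter setting: default number of results
        self.ranking = []

    def train(self, prepared):
        counts = Counter(item for _, item in prepared)
        # Most popular first; ties broken by item name so the order is stable.
        ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        self.ranking = [item for item, _ in ordered]
        return self

    def predict(self, query):
        return self.ranking[: query.get("num", self.top_n)]


def serve(query, algorithms, blocked=()):
    """[S] Serving: merge the results of all algorithms and apply a business rule."""
    merged = []
    for algo in algorithms:
        for item in algo.predict(query):
            if item not in merged and item not in blocked:
                merged.append(item)
    return merged


def precision(predicted, actual):
    """[E] Evaluation Metric: fraction of predicted items that are relevant."""
    if not predicted:
        return 0.0
    return sum(1 for p in predicted if p in actual) / len(predicted)
```

Training then amounts to `PopularityAlgorithm().train(prepare(read_training_data(events)))`, and a query passes through `serve`, where the `blocked` argument plays the part of the business-specific logic mentioned under Serving.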

The event server is used to collect data from your application while it's running. The PredictionIO engine then builds predictive models based on one or more algorithms using the data. Once you've deployed it as a web service, it listens to queries from your application and responds with predicted results in real time.
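
The two interactions described here, recording an event and querying a deployed engine, can be sketched as plain HTTP requests using only the Python standard library. The endpoint paths (`/events.json` with an `accessKey` parameter, and `/queries.json`), the ports, and the JSON field names are assumptions based on the project's documented defaults and the recommendation template; check them against your own installation. The access key and the IDs are placeholders. The functions only build the requests and do not send anything.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

JSON_HEADERS = {"Content-Type": "application/json"}


def build_event_request(event_server_url, access_key, event):
    """Build (but do not send) a POST that records one event.

    Assumes the event server accepts JSON at /events.json and identifies the
    application through an accessKey query parameter.
    """
    url = f"{event_server_url}/events.json?{urlencode({'accessKey': access_key})}"
    body = json.dumps(event).encode("utf-8")
    return Request(url, data=body, headers=JSON_HEADERS, method="POST")


def build_query_request(engine_url, query):
    """Build (but do not send) a POST that asks a deployed engine for a prediction.

    Assumes the deployed engine accepts JSON at /queries.json.
    """
    body = json.dumps(query).encode("utf-8")
    return Request(f"{engine_url}/queries.json", data=body,
                   headers=JSON_HEADERS, method="POST")


# Hypothetical event: user u1 rates item i1. Field names are assumptions.
rate_event = {
    "event": "rate",
    "entityType": "user",
    "entityId": "u1",
    "targetEntityType": "item",
    "targetEntityId": "i1",
    "properties": {"rating": 4},
}

# Placeholder hosts, ports and access key.
event_request = build_event_request("http://localhost:7070", "MY_ACCESS_KEY", rate_event)
query_request = build_query_request("http://localhost:8000", {"user": "u1", "num": 4})
```

With both servers running, passing either request to `urllib.request.urlopen` would send it. The SDKs mentioned below wrap the same kind of calls, so application code would normally use one of those rather than raw HTTP.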

There are a number of SDKs for the engine, currently for Java, PHP, Python and Ruby. PredictionIO can also be installed as a full machine learning stack, bundled with Apache Spark, MLlib, HBase, Spray and Elasticsearch.




Last Updated ( Thursday, 26 October 2017 )