Linux Data Science Virtual Machine
Written by Kay Ewbank   
Thursday, 28 April 2016

A virtual machine image packed with data science tools has been released by Microsoft.

Two versions of the machine have been released, one for Windows and one for Linux. The Data Science Virtual Machines are custom images; the Linux version is built on the OpenLogic CentOS-based Linux distribution, version 7.2. Both versions contain tools used by data scientists, developers, educators and researchers. The idea is that the VM saves users the time and effort of discovering, installing, configuring and managing the tools individually.

In addition to the standard operating system utilities, the pre-installed tools include:

  • Microsoft R Open with the Intel Math Kernel Library.
  • Anaconda Python Distribution with Python 2.7 and 3.5.
  • Jupyter Notebooks with Python and R kernels for browser-based data exploration and development.

You also get a local Postgres database instance and a set of machine learning tools. These include Azure ML support: pre-installed libraries let you create R and Python models locally on the VM, then publish them to Microsoft's cloud-based Azure ML service. You also get the Computational Network Toolkit (CNTK), a deep learning tool from Microsoft Research; Vowpal Wabbit, an ML system supporting techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning; and XGBoost, a fast and accurate boosted tree implementation.

Also included on the machine is Rattle (the R Analytical Tool To Learn Easily), a GUI tool for learning to do data analysis with R.

The collection of development tools includes:

  • Azure SDK in Java, Python, Node.js, Ruby and PHP.
  • Eclipse IDE with the Azure Toolkit plugin.
  • Code editors such as vim, gedit and Emacs (with the ESS and AUCTeX add-ons).
  • SQL Server drivers and command line tools such as bcp (Bulk Copy) and sqlcmd (a text-based SQL Server query utility).
  • The SQuirreL SQL graphical client for accessing various databases.
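Once the VM is running, a quick way to confirm that the command line tools named above are available is to look for them on the PATH. The following is a minimal sketch rather than anything supplied with the image: the command names are assumptions based on the tool names in this article, and the actual executables on the VM may be named or located differently.

```python
import shutil

# Command names assumed from the tools the article lists; the actual
# executable names on the VM image may differ.
EXPECTED_COMMANDS = ["python", "R", "jupyter", "vim", "gedit", "emacs", "bcp", "sqlcmd"]


def find_tools(commands):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in commands}


def report(commands=EXPECTED_COMMANDS):
    """Print one line per command showing where it was found, if at all."""
    for name, path in find_tools(commands).items():
        status = path if path else "not found on PATH"
        print(f"{name:10s} {status}")


if __name__ == "__main__":
    report()
```

Anything reported as not found may simply live outside the default PATH, so a miss is a prompt to look further rather than proof that the tool is absent.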

The idea is that you should be up and running with your own data science VM in around 15 minutes. You have full administrative access to the VM and can install additional software as needed. There’s no separate fee to use the VM image; you pay only for the actual compute usage, which depends on the size of the VM you provision.
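Because the charge depends on VM size and running time, a rough estimate is just the hourly compute rate multiplied by the hours used. The sketch below is illustrative only: the function name is hypothetical, the rate is a placeholder and not a real Azure price, and any charges other than compute are outside what it models.

```python
def estimate_compute_cost(hourly_rate, hours_per_day, days):
    """Estimate the compute charge for a VM from a per-hour rate and usage."""
    if hourly_rate < 0 or days < 0:
        raise ValueError("hourly_rate and days must be non-negative")
    if not 0 <= hours_per_day <= 24:
        raise ValueError("hours_per_day must be between 0 and 24")
    return hourly_rate * hours_per_day * days


# Placeholder figures only: 0.50 per hour is NOT a real price for any VM size.
cost = estimate_compute_cost(hourly_rate=0.50, hours_per_day=8, days=20)
print(f"Estimated compute cost: {cost:.2f}")  # prints "Estimated compute cost: 80.00"
```

Substituting the published hourly rate for the VM size you choose gives a first approximation of what a month of part-time use would cost.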


More Information

Machine Learning Data Science Linux Intro

Machine Learning Data Science Windows Intro

Related Articles

R Programming Course

Get On The Machine Learning Bandwagon With Google

R Tools for Visual Studio

 
