GitHub For Data Under Development
Written by Kay Ewbank   
Monday, 24 February 2020

Gretel, which will enable developers to work collaboratively with data, comes from a team made up of engineers and developers who previously worked for the National Security Agency, Google and Amazon Web Services.

Their project addresses the problem of needing realistic user data to work with and provides a way for developers to share sensitive data in real time while maintaining data privacy.

The developers of Gretel, Alex Watson, John Myers, Ali Golshan and Laszlo Bock, say it works in real time to enable safe sharing and collaboration between developers and applications, and has tools that are "open, intelligent, and integrated".

The team highlights the importance of developers being able to safely learn and experiment with data in order to support rapid innovation on behalf of customers:

"As developers, we don’t always need full access to sensitive customer data. We know that it’s often best to only select the data that you need for developing new features or exploring insights — especially if you can use your developer identity to access data in seconds, instead of spending weeks or months of waiting for compliance approval."

Gretel's solution is to meet this need with a combination of machine learning, synthetic data, and formal reasoning that offers provable privacy guarantees for data. By building privacy into developer workflows, Gretel can enable safe access to data within seconds of its creation, unlocking siloed data and opening the door to new ideas.

The synthetic data is fake data that follows the same patterns as real user data, but is presumably more realistic than the old "A. Person, 3, High Street, Sometown" variety that programmers usually resort to. Gretel uses machine learning to work out the categories of the data and attaches as many tags to it as it can find. Those tags are then used to apply "differential privacy", which anonymizes the data so that it doesn't match customer information. The result is an entirely fake data set generated by machine learning.
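
For readers unfamiliar with the term, the general idea behind differential privacy can be illustrated with the classic Laplace mechanism: random noise, scaled to a privacy budget usually called epsilon, is added to a query result so that the presence or absence of any single record is hard to infer. The sketch below is not Gretel's method or API, which the announcement does not describe in detail. The function name private_count, the sample records and the choice of epsilon are all hypothetical, chosen only for illustration.

```python
import random


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching predicate.

    Illustrative Laplace mechanism, not Gretel's implementation.
    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean
    # `scale` follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical records standing in for sensitive customer data.
customers = [
    {"city": "Sometown", "age": 34},
    {"city": "Sometown", "age": 51},
    {"city": "Othertown", "age": 27},
    {"city": "Sometown", "age": 45},
]

noisy = private_count(customers, lambda r: r["city"] == "Sometown", epsilon=0.5)
print(f"Noisy count of Sometown customers: {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy, while a larger epsilon gives more accurate but less private answers. Generating a complete synthetic data set, as Gretel is described as doing, is a considerably more involved task than perturbing a single count.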

Alongside the data privacy aspects, the team is developing machine learning models to help developers make sense of their data, and to automate joining data with complementary open source datasets, private datasets, or anything in between. They say all Gretel services are available via simple APIs that integrate with developers’ existing workflows and tools. 

More Information

Gretel Homepage

 


Last Updated ( Tuesday, 25 February 2020 )