GitHub For Data Under Development
Written by Kay Ewbank   
Monday, 24 February 2020

Gretel, a service under development that aims to let developers work collaboratively with data, comes from a team of engineers and developers who previously worked for the National Security Agency, Google and Amazon Web Services.

Their project addresses the problem of needing realistic user data to work with, and provides a way for developers to share sensitive data in real time while maintaining data privacy.


The developers of Gretel, Alex Watson, John Myers, Ali Golshan and Laszlo Bock, say it works in real time to enable safe sharing and collaboration between developers and applications, and has tools that are "open, intelligent, and integrated".

The team highlights the importance of developers being able to safely learn and experiment with data in order to support rapid innovation on behalf of customers:

"As developers, we don’t always need full access to sensitive customer data. We know that it’s often best to only select the data that you need for developing new features or exploring insights — especially if you can use your developer identity to access data in seconds, instead of spending weeks or months of waiting for compliance approval."

Gretel's answer is to combine machine learning, synthetic data, and formal reasoning to offer provable privacy guarantees for data. By building privacy into developer workflows, the team says, Gretel can enable safe access to data within seconds of the time it is created, unlocking siloed data and opening the door for new ideas.

Synthetic data is fake data that follows the same patterns as real user data, and is presumably more realistic than the old "A. Person, 3, High Street, Sometown" variety that programmers usually resort to. Gretel uses machine learning to work out the categories of the data and labels it with as many tags as it can find. Those tags are then used to apply "differential privacy", which anonymizes the data so that it doesn't match customer information. The result is an entirely fake data set generated by machine learning.
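
To give a rough idea of what "differential privacy" means in practice, here is a minimal Python sketch of the textbook Laplace mechanism applied to a simple counting query. It is not Gretel's implementation, which the team has not published in detail; the function names, the `records` list and the choice of epsilon are all made up for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponential samples with the
    # same mean is Laplace-distributed with that mean as its scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that match predicate.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical records, invented for this example.
records = [{"age": a} for a in (23, 35, 41, 52, 67, 29)]

# Smaller epsilon means more noise and stronger privacy.
noisy = private_count(records, lambda r: r["age"] >= 40, epsilon=1.0)
print(round(noisy, 2))
```

The point of the noise is that the published answer looks almost the same whether or not any one person's record is in the data set, so the result reveals very little about any individual. Generating a whole synthetic data set with such guarantees is considerably more involved than this single query.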

Alongside the data privacy aspects, the team is developing machine learning models to help developers make sense of their data, and to automate joining data with complementary open source datasets, private datasets, or anything in between. They say all Gretel services are available via simple APIs that integrate with developers’ existing workflows and tools. 


More Information

Gretel Homepage

Related Articles

Google Dataset Search Out Of Beta

New Database For Data Scientists

Project Cortex Adds AI To Office 365

Google Open Sources Differential Privacy Library



Last Updated ( Tuesday, 25 February 2020 )