Data, Data, Everywhere ...
Written by Sue Gee   
Tuesday, 08 November 2022

... and not enough people with the skills to cope with it all. Udacity's whole raft of data-related Nanodegrees and courses restarts on November 9th, and Udacity is currently offering a 75% discount!



Disclosure: When you make a purchase having followed a link from this article, we may earn an affiliate commission.

Udacity's 75% discount means that even for a 4-month course you would pay just $399 - the usual "Pay as you go" rate for a single month. The offer applies to all Udacity programs and courses, and you'll find all the options by searching its Program Catalog. This article looks at some of what is on offer in the School of Data Science, culminating in the Data Architect Nanodegree, a valuable credential to earn if you want to move into a senior, data-related role responsible for defining the policies, procedures, models and technologies to be used in collecting, organizing, storing and accessing company information.
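The arithmetic behind that figure can be checked with a quick sketch. It assumes the undiscounted price of a 4-month course is simply four times the $399 monthly rate quoted above; Udacity's actual pricing may be structured differently, and the function name is just for illustration.

```python
# Figures quoted in the article; treating the full price as
# months x monthly rate is an assumption made for this sketch.
MONTHLY_RATE = 399   # usual "Pay as you go" rate in dollars
MONTHS = 4           # nominal course length
DISCOUNT = 0.75      # 75% off


def discounted_price(monthly_rate, months, discount):
    """Return the price after applying a fractional discount to the full price."""
    full_price = monthly_rate * months
    return full_price * (1 - discount)


full_price = MONTHLY_RATE * MONTHS
price = discounted_price(MONTHLY_RATE, MONTHS, DISCOUNT)
print(f"Full price: ${full_price}, discounted price: ${price:.0f}")
```

Under that assumption, the discounted price of the whole course works out to the same amount as a single month at the usual rate.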

In this digital age we are collecting data at an unprecedented rate. A couple of years ago, in Data Scientist or Data Engineer? Choose Your Path On Udacity, I quoted the statistic that 2.5 million terabytes of data are created every day - giving us the problems of how to store, organize and analyze it all.

Referring back to that article, we can make this distinction between Data Scientists and Data Engineers:

  • Data Scientists can be thought of as those who make sense of the data, present the information it contains and contribute to making decisions based on it.
  • Data Engineers are those who build the infrastructure to work with massive datasets - the data warehouses or data lakes and the data pipelines - and devise the models that help companies make sense of it all.

Depending on your starting point, you might need to complete two or more Nanodegrees to prepare for either of these roles. If you aspire to be a Data Scientist, the prerequisite for embarking on the 4-month Data Scientist Nanodegree is Machine Learning, with Intro to Machine Learning with PyTorch being recommended. You might also like to follow up with the Data Visualization Nanodegree, which I covered in detail here.

Prior to embarking on the Data Engineer Nanodegree, the modules for which are listed here, you need both Python and SQL skills at intermediate level, which can be gained through the Programming for Data Science with Python Nanodegree.

The role of Data Architect can be thought of as an extension of Data Engineer, focused on high-level business intelligence relationships and data policies, combined with that of a Database Architect. The skills that you need as prerequisites for enrolling in Udacity's Data Architect Nanodegree therefore include intermediate SQL, for which the Learn SQL Nanodegree would help, and the basics of ETL/Data Pipelines, for which you might want the Data Streaming Nanodegree, as outlined here.


The Data Architect Nanodegree is a 4-month program, assuming you devote around 10 hours per week to it. As it is self-paced, you can reduce or extend the time frame to suit your other commitments.

Its overview states:

In this program, you’ll plan, design and implement enterprise data infrastructure solutions and create the blueprints for an organization’s data management system. You’ll create a relational database with PostgreSQL, design an Online Analytical Processing (OLAP) data model to build a cloud based data warehouse, and design scalable data lake architecture that meets the needs of Big Data. Finally, you’ll learn how to apply the principles of data governance to an organization’s data management system.

It comprises four courses, each of which is based around a hands-on project: 

  • Data Architecture Foundations
    Project - Designing an HR Database
    Begin by learning the characteristics of good data architecture and how to apply them. Next you will move on to data modeling. You will learn to design a data model, normalize data and create a professional ERD. Finally, you will take everything you learned and create a physical database using PostgreSQL.
  • Designing Data Systems
    Project - Design a Data Warehouse for Reporting and OLAP
    Learn to design enterprise data architecture. You will build a cloud based data warehouse with Snowflake; evaluate various data assets of an organization and characteristics of these data sources; design a staging area for ingesting varieties of data coming from source systems and design an Operational Data Store (ODS). Finally, you will learn to design OLAP dimensional data models, design ELT data processing capable of moving data from an ODS to a data warehouse, and write SQL queries for the purpose of building reports.

  • Big Data Systems
    Project - Design an Enterprise Data Lake System
    Learn about how to help organizations with massive amounts of data, including identification of Big Data problems and how to design Big Data solutions. You will learn about the internal architecture of many of the Big Data tools such as HDFS, MapReduce, Hive and Spark, and how these tools work internally to provide distributed storage, distributed processing capabilities, fault tolerance and scalability. You will also learn how to evaluate NoSQL databases and their use cases. Finally, you will learn how to implement Data Lake design patterns and how to enable transactional capabilities in a Data Lake.

  • Data Governance
    Project - Data Governance at SneakerPark
    Learn how to design a data governance solution that meets your company’s needs. First, you will learn about the different types of metadata and how to build a Metadata Management System, Enterprise Data Model and Enterprise Data Catalog. Next, you will learn how to perform data profiling using various techniques including data quality dimensions, how to identify remediation options for data quality issues, and how to measure and monitor data quality. Finally, you will learn the concepts of Master Data and golden record, different types of Master Data Management Architectures, as well as the golden record creation and master data governance processes.
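As a rough illustration of the golden record idea mentioned in the Data Governance course, the sketch below merges duplicate customer records from different systems into a single record. The survivorship rule used (the most recently updated non-empty value wins), the field names and the sample data are all assumptions made for this example and are not taken from the Udacity course materials.

```python
from datetime import date

# Hypothetical duplicate records for one customer, held in three source systems.
records = [
    {"source": "crm", "updated": date(2022, 3, 1),
     "name": "J. Smith", "email": "js@example.com", "phone": None},
    {"source": "billing", "updated": date(2022, 9, 15),
     "name": "Jane Smith", "email": None, "phone": "555-0100"},
    {"source": "support", "updated": date(2021, 11, 20),
     "name": "Jane Smith", "email": "jane@example.com", "phone": "555-0199"},
]


def build_golden_record(records, fields):
    """Assumed survivorship rule: for each field, keep the most
    recently updated value that is not empty."""
    newest_first = sorted(records, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for field in fields:
        # Take the first non-empty value, scanning from newest to oldest.
        golden[field] = next((r[field] for r in newest_first if r[field]), None)
    return golden


golden = build_golden_record(records, ["name", "email", "phone"])
print(golden)
```

Real Master Data Management tools typically combine several survivorship rules, such as trusting particular source systems for particular fields, so this single rule is only a starting point.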

The skills acquired through this program are already in demand and will be increasingly required in the coming years. Earning this credential is a passport to a well-paid job in many different industries.

More Information

Udacity Program Catalog

Data Architect Nanodegree

Data Scientist Nanodegree

Data Visualization Nanodegree 

Data Engineer Nanodegree

Programming for Data Science with Python Nanodegree

Data Streaming Nanodegree

Learn SQL Nanodegree

Related Articles

A Time To Enrol, A Time To Save

Data Scientist or Data Engineer? Choose Your Path On Udacity

Udacity Data Science Nanodegrees Restarting 

Data Scientists Salary Data 

Data Scientist Best Paying Entry-Level Job Says Glassdoor 

New Udacity Nanodegree In Data Streaming


Last Updated ( Tuesday, 08 November 2022 )