The Data Engineering Vault
Written by Nikos Vaggalis   
Friday, 11 October 2024

A curated network of knowledge designed to facilitate exploration, discovery, and deep learning in the field of data engineering.

Or, in other words, an encyclopedia of everything about data engineering. Attempts like this are interesting because, despite the advent of LLMs that can answer just about anything, we are witnessing a resurgence in dictionaries and encyclopedias of tech terminology.

Recently we took a look at "Dev Encyclopedia", an open-source, easy-to-use online resource that helps make sense of complicated tech terms. Its mission is simple:

How many times have you attended a meeting, watched a presentation or read a manual that was riddled with jargon unknown to you? Wouldn't an accessible and quick lookup guide help in such situations?

Well, wish granted. Here is the "Dev Encyclopedia", made by a single Python backend developer who thought along those lines and set out to assist the bewildered by providing clear, concise explanations for all those tricky tech terms.

The database of terms of Dev Encyclopedia is exhaustive, encompassing many SE disciplines. It ranges from ACID to APIs and Antivirus, from Big O to Deep Learning and LLMs, and from MVC up to Zombie processes. By contrast, The Data Engineering Vault is not generalized but specialized, focusing on the narrow field of data engineering.

The knowledge base is organized in seven categories:

  • Data Engineering Concepts
  • Data Engineering Tools and Technologies
  • Data Engineering Practices
  • Modern Data Engineering
  • Data Engineering Management and Analysis
  • Data Engineering Design and Development
  • Additional Data Engineering Resources

As such, you'll find information ranging from the Classical Architecture of Data Warehouse to the Relational Model, SQL, Python and Apache Arrow, and from ELT and ETL to OLAP Cubes and Business Intelligence. It's all there.

Navigating from the categories to the main material, say by picking "Classical Architecture of Data Warehouse", we are presented with an HTML page resembling a Markdown document (a Digital Garden) that contains clear and structured information, diagrams included:

[Image: secondbraindiagram]

 

We find that the Classical Architecture of Data Warehouse is split into layers: the Staging, Cleansing and Core areas, the Data Mart and the Metadata. The document then delves into each of them.
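
To make the layering a little more concrete, here is a minimal Python sketch of records moving through the areas named above. Everything in it is an illustrative assumption rather than material from the Vault: the sample records, the function names and what each step does are hypothetical, and treating the Metadata as a simple log of row counts per step is just one possible simplification.

```python
from collections import defaultdict

# Hypothetical raw records as they might arrive from a source system.
RAW_ORDERS = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "BOB", "amount": "n/a"},
    {"order_id": "3", "customer": "Alice", "amount": "4.50"},
]


def staging(records, log):
    """Staging area: land a copy of the source data unchanged."""
    staged = [dict(r) for r in records]
    log.append(("staging", len(staged)))
    return staged


def cleansing(records, log):
    """Cleansing area: normalize values, drop rows that cannot be repaired."""
    cleaned = []
    for r in records:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # skip rows with an unparseable amount
        cleaned.append({
            "order_id": int(r["order_id"]),
            "customer": r["customer"].strip().lower(),
            "amount": amount,
        })
    log.append(("cleansing", len(cleaned)))
    return cleaned


def core(records, log):
    """Core: integrated, typed data keyed by order id."""
    table = {r["order_id"]: r for r in records}
    log.append(("core", len(table)))
    return table


def data_mart(core_table, log):
    """Data mart: an aggregate shaped for one report (revenue per customer)."""
    totals = defaultdict(float)
    for r in core_table.values():
        totals[r["customer"]] += r["amount"]
    log.append(("data_mart", len(totals)))
    return dict(totals)


def run_pipeline(raw):
    """Run all layers in order; return the mart and the metadata log."""
    log = []  # assumption: metadata modeled as (step, row count) pairs
    mart = data_mart(core(cleansing(staging(raw, log), log), log), log)
    return mart, log
```

Calling `run_pipeline(RAW_ORDERS)` returns the per-customer totals together with the metadata log, so you can see how many rows survived each layer.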

The point is that the encyclopedia does not consist of stiff explanations of terminology; it delves into each topic, going through its advantages and disadvantages too. You can expect the same level of detail for all the material, rendering the Vault an indispensable guide in your data engineering journey.


More Information

Data Engineering Vault: A Second Brain Knowledge Network 

Related Articles

Dev Encyclopedia Shares The Knowledge

 
