Machine Learning My Way - Self-study and Review Guide

April 2, 2024

0.3.0 Databases and storage section updated

Databases and storage 

  1. Disk: HDD, SSD and cloud (S3) storage. Disk costs and access times. File systems (HDFS).

  2. Relational databases, ER diagrams, normal forms. SQL. Other types of databases and NoSQL (key-value, graph, column, vector).

  3. Processing large data in parallel: MapReduce, Hadoop (HDFS+YARN+MapReduce), Spark.

  4. Cloud providers (AWS, GCP, Azure). New data solution providers (Snowflake, Databricks).

  5. Decoupling storage and compute. Data warehouses and DW architectures. OLAP and OLTP.

  6. Mergers and acquisitions, venture financing and forks.
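As a review aid for item 3, here is a minimal single-process sketch of the MapReduce model (map, shuffle, reduce) applied to word counting. It is not how Hadoop or Spark implement it: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count`, and the sample documents, are illustrative assumptions, and a real cluster would run the map and reduce steps on many machines with the shuffle moving data over the network.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped independently (in parallel on a cluster).
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    # Each key could likewise be reduced independently after the shuffle.
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical sample input, made up for illustration.
docs = ["spark reads from hdfs", "hadoop writes to hdfs", "spark and hadoop"]
counts = word_count(docs)
```

The point to take away is that map and reduce have no shared state, which is what lets a framework distribute them across workers.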

Extra video: The Ancient Art of Data Management (2023) from a DuckDB co-founder.
