TravisH0301/learning

Learning

Repository containing brief notes made during learning.

Table of Contents

  1. Software Engineering
  2. Backend Engineering
  3. Data Engineering
  4. Data Science / Machine Learning
  5. Miscellaneous

1. Software Engineering

Back to table of contents

Python

  • Context Manager: Using context managers to manage external resources in Python
  • pre-commit: How to set up Git hooks with pre-commit to review code automatically before each commit
  • Multiprocessing and Ray on Python: How to implement parallel processing in Python using Multiprocessing and Ray, and how the two compare
  • Pandas Parallelism via Modin: How to run Pandas operations in parallel using Modin
  • Concurrency: How to accelerate I/O-bound & CPU-bound tasks by leveraging asynchronous programming, multi-threading and multi-processing
  • Recursion: Recursion in Python with examples
  • Python Style Guides: Summary of recommended style guides from PEP 8, Google and Black

Network

  • SSH: How to establish an SSH session between a server and a client using public key authentication, and how to transfer files using SFTP
  • Cloud Networking in AWS: Basic networking concepts in AWS

Security

DevOps

  • Git: Instructions for version control using Git
  • GitOps: How GitOps streamlines continuous deployment for systems with declarative desired states (e.g. Kubernetes)
  • Git Workflows: Covers different types of development workflows using Git
  • Codefresh: What Codefresh is and how its CI/CD pipeline works, with examples
  • Test-Driven Development (TDD): Definition of Test-Driven Development with examples of unit tests in Python using the unittest module

2. Backend Engineering

Back to table of contents

Internet

  • Internet: Basic explanation of what the internet is, and how information is communicated across it through different protocol layers
  • HTTP: Characteristics of HTTP, how a client and a server communicate using HTTP requests and HTTP responses, and HTTP/2 & HTTP/3

API

  • REST API: Architectural constraints of REST API

Authentication

  • OAuth: How OAuth works to delegate access to applications

3. Data Engineering

Back to table of contents

Database

  • Database Engine & API: Definition of a database engine in a database management system and introduction to database engine APIs such as Open Database Connectivity (ODBC) and Object Linking and Embedding, Database (OLE DB)
  • Distributed Database: Pros & cons of distributed databases, with an introduction to the distributed NoSQL database Apache Cassandra
  • MPP Database: Introduction to Massively Parallel Processing (MPP) and its architectures of grid computing and clustering | Methods of table partitioning: Distribution style & Sorting key
  • Partitioning in Teradata: How data is partitioned in Teradata and how to optimise for queries by further partitioning data in nodes and collecting statistics
  • Query Optimisation in Modern Data Warehouses: Query optimisation methods used in modern data warehouses

Data Modelling

  • Database vs Data Warehouse vs Data Lake: Definition of relational database (OLTP & OLAP), data warehousing (architecture - Kimball's & Inmon's, dimensional data modelling, ETL vs ELT & OLAP Cube) and data lake
  • Data Modelling: How to do data modelling (Entity Relationship Diagram) and aspects of relational database & non-relational (NoSQL) database
  • Relational Data Model: How to structure normalised/denormalised data models
  • Types of Fact tables: Different types of fact tables and what they are used for
  • Star Schema & Snowflake Schema: Introduction to star schema & snowflake schema
  • Slowly Changing Dimension (SCD): Types of slowly changing dimensions (SCDs) to adapt to changes in the data source
  • Data Vault: Data vault architecture and its components, and how data vault fits into the medallion architecture
  • Semantic Layer: What a semantic layer is, and how it differs from a metrics layer, metrics store, and headless BI

Data Pipeline

Data Governance

  • Data Governance: What is data governance? Key components of data governance - processes, people & technology

Event Streaming

Spark

SQL

dbt

Storage

4. Data Science / Machine Learning

Back to table of contents

Statistics

Machine Learning

Deep Learning

LLM

5. Miscellaneous

Back to table of contents

Computer Science

  • Binary, Bit & Byte: Explanation of binary, bit and byte, and how they are used in modern computer architecture and character encoding
  • Encoding and Schema: Types of encoding and schema (Avro as an example)

Linux

  • Linux Server: How to connect to a remote Linux server, with some basic Linux terminal commands

Anaconda

Geographic Information System

On-prem SharePoint API
