1️⃣ Introduction to databases. Relational databases: data about entities is organized into tables; each row (record) is an instance of an entity; each column holds information about an attribute; tables can be linked to each other via unique keys. Databases support more data, multiple simultaneous users, and data quality controls; data types are specified for each column. SQL is used to interact with databases. Connecting ..
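The points above (typed columns, tables linked by keys, SQL as the query language) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration only: the `departments` and `employees` tables and their rows are hypothetical and do not come from the original notes.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical tables: every column is declared with a data type, and
# employees.dept_id links each employee row to a unique key in departments.
cur.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
cur.execute(
    """CREATE TABLE employees (
           id INTEGER PRIMARY KEY,
           name TEXT NOT NULL,
           dept_id INTEGER REFERENCES departments(id)
       )"""
)

# Each inserted row is one instance (record) of the entity.
cur.executemany("INSERT INTO departments VALUES (?, ?)", [(1, "Data"), (2, "Sales")])
cur.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", 1), (2, "Ben", 2), (3, "Cho", 1)],
)
conn.commit()

# SQL joins the two tables through the shared key.
rows = cur.execute(
    """SELECT e.name, d.name
       FROM employees AS e
       JOIN departments AS d ON e.dept_id = d.id
       ORDER BY e.id"""
).fetchall()
conn.close()

print(rows)
```

In a pandas workflow, a query like this one could likely be passed to `pd.read_sql()` together with an open connection to get a data frame back instead of a list of tuples.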
Pieces of My Youth
6 Posts
DATA/Data Engineer
DATA/Data Engineer
CH2) 3. Importing Data from Databases
DATA/Data Engineer
CH2) 2. Importing Data From Excel Files
2. Importing Data From Excel Files. 1️⃣ Introduction to spreadsheets: spreadsheets (Excel files) can have formatting and formulas. Load them with pd.read_excel(); nrows, skiprows, and usecols work the same as in read_csv(). 2️⃣ Getting data from multiple worksheets: read_excel() loads the first sheet by default; use the sheet_name keyword to load other sheets by position number or by a list of names; arguments passed to read_excel() apply to all sh..
DATA/Data Engineer
CH2 Streamlined Data Ingestion with pandas) 1. Importing Data from Flat Files
1. Importing Data from Flat Files. 1️⃣ Introduction to flat files. pandas: a Python library to load and manipulate data. Data frames: the pandas-specific structure for 2D data. Flat files: a simple, easy-to-produce format; data is stored as plain text, one row per line, with values separated by a delimiter (e.g. CSV). One pandas function loads them all: read_csv(). Loading other flat files: specify a different delimiter with sep. 2..
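As a rough sketch of the `read_csv()` and `sep` notes above, the snippet below loads the same small table from comma-separated and tab-separated text. The sample data is hypothetical, and `io.StringIO` stands in for real files so the example is self-contained.

```python
import io

import pandas as pd

# Hypothetical flat-file contents: one record per line,
# values separated by a delimiter.
csv_text = "name,score\nAna,90\nBen,85\n"
tsv_text = "name\tscore\nAna\t90\nBen\t85\n"

# read_csv() assumes a comma delimiter by default.
df_csv = pd.read_csv(io.StringIO(csv_text))

# For other flat files, pass the delimiter explicitly with sep.
df_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Both calls should produce the same two-column data frame.
print(df_tsv)
print(df_csv.equals(df_tsv))
```

With a file on disk, the first argument would be a path instead of a `StringIO` object; the `sep` handling stays the same.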
DATA/Data Engineer
CH1) 3. Extract, Transform and Load (ETL)
3. Extract, Transform and Load (ETL). 1️⃣ Extract. Extracting data: what does it mean? Extract from text files. Unstructured: plain text. Flat files: row = record, column = attribute (e.g. CSV). JSON (JavaScript Object Notation): semi-structured; atomic types are number, string, boolean, and null; composite types are array and object. Data on the Web: available through APIs, which send data in JSON format. API: application programming inter..
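To make the JSON atomic and composite types listed above concrete, here is a small sketch using Python's standard json module. The record itself is hypothetical; it only exists to show how each JSON type maps to a Python type after parsing.

```python
import json

# Hypothetical JSON text, such as an API might send back.
raw = (
    '{"name": "Ana", "age": 30, "active": true, "manager": null, '
    '"skills": ["SQL", "Python"], "address": {"city": "Seoul"}}'
)

record = json.loads(raw)

# Atomic JSON types map to Python str, int/float, bool, and None.
atomic_types = {key: type(record[key]).__name__ for key in ("name", "age", "active", "manager")}

# Composite JSON types map to Python list (array) and dict (object).
composite_types = {key: type(record[key]).__name__ for key in ("skills", "address")}

print(atomic_types)
print(composite_types)
```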
DATA/Data Engineer
CH1) 2. Data Engineering Toolbox
2. Data Engineering Toolbox. 1️⃣ Databases. What are databases? They hold data, organize data, and let you retrieve and search data through a DBMS (database management system). Structured and unstructured data. Structured: has a database schema (e.g. a relational database). Semi-structured: e.g. JSON. Unstructured: schemaless, more like files (e.g. videos, photos). SQL: tables, database schema, relational databases. NoSQL: non-relational data..
DATA/Data Engineer
1. Introduction to Data Engineering
1️⃣ What is data engineering? The concept of data engineering. When is a data engineer needed? To gather data from different sources, build a database optimized for analysis, remove corrupt data, and work with cloud technology. Data scientists: “Yay!” Data engineer definition: an engineer who develops, constructs, tests, and maintains architectures such as databases and large-scale processing systems. Q. Does the data engineer come up with a ..