6 Posts
DATA/Data Engineer
1️⃣ Introduction to databases. Relational databases: data about entities is organized into tables. Each row (record) is an instance of an entity; each column holds information about an attribute. Tables can be linked to each other via unique keys. Relational databases support more data, multiple simultaneous users, and data quality controls. Data types are specified for each column. SQL is used to interact with databases. Connecting ..
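The points above (typed columns, tables linked via keys, SQL as the interface) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the post's own example: the table names `authors` and `posts` and their contents are hypothetical.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: each column has a declared data type,
# and posts.author_id links each post to a row in authors.
conn.execute(
    "CREATE TABLE authors (author_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE posts (
           post_id   INTEGER PRIMARY KEY,
           title     TEXT NOT NULL,
           author_id INTEGER NOT NULL REFERENCES authors(author_id)
       )"""
)

conn.execute("INSERT INTO authors VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(1, "Flat files", 1), (2, "Excel files", 1)],
)

# SQL joins the two tables through the shared key.
rows = conn.execute(
    """SELECT p.title, a.name
       FROM posts AS p
       JOIN authors AS a ON a.author_id = p.author_id
       ORDER BY p.post_id"""
).fetchall()

# Data quality control: a post pointing at a missing author is rejected.
try:
    conn.execute("INSERT INTO posts VALUES (3, 'Orphan', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

conn.close()
```

Here `rows` holds one `(title, name)` tuple per post, and the final insert fails because no author with id 99 exists.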
2. Importing Data From Excel Files 1️⃣ Introduction to spreadsheets. Spreadsheets (= Excel files) have formatting and formulas. Load them with pd.read_excel(); nrows, skiprows, and usecols work the same as in read_csv(). 2️⃣ Getting data from multiple worksheets. read_excel() loads the first sheet by default; use the sheet_name keyword to load other sheets by position number, by name, or as a list of names. Arguments passed to read_excel() apply to all sh..
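A small sketch of the sheet_name behaviour described above. It assumes the openpyxl engine is installed, and it builds a hypothetical two-sheet workbook in memory (sheet names `scores` and `ages` are made up) so that no file on disk is needed.

```python
import io

import pandas as pd

# Hypothetical data for two worksheets.
scores = pd.DataFrame({"name": ["Ada", "Lin"], "score": [90, 85]})
ages = pd.DataFrame({"name": ["Ada", "Lin"], "age": [36, 29]})

# Write both sheets into an in-memory workbook (assumes openpyxl is available).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    scores.to_excel(writer, sheet_name="scores", index=False)
    ages.to_excel(writer, sheet_name="ages", index=False)
workbook_bytes = buffer.getvalue()

# Default: only the first sheet is loaded.
first_sheet = pd.read_excel(io.BytesIO(workbook_bytes))

# Load another sheet by zero-indexed position.
second_sheet = pd.read_excel(io.BytesIO(workbook_bytes), sheet_name=1)

# A list of names returns a dict of data frames; usecols and nrows
# are applied to every sheet that is read.
both = pd.read_excel(
    io.BytesIO(workbook_bytes),
    sheet_name=["scores", "ages"],
    usecols=["name"],
    nrows=1,
)
```

With a real workbook, the `io.BytesIO(...)` argument would simply be replaced by the file path.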
1. Importing Data from Flat Files 1️⃣ Introduction to flat files. pandas: Python library to load & manipulate data. Data frames: the pandas-specific structure for 2D data. Flat files: a simple, easy-to-produce format. Data is stored as plain text, one row per line, with values separated by a delimiter, ex) csv. One pandas function to load them all: read_csv(). Loading other flat files: specify a different delimiter with sep. 2..
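As a rough illustration of loading flat files with read_csv() and the sep argument, the sketch below reads the same hypothetical records once as comma-separated and once as tab-separated text. In-memory strings stand in for files here; a file path works the same way.

```python
import io

import pandas as pd

# Hypothetical flat-file contents: one row per line, values split by a delimiter.
csv_text = "name,score\nAda,90\nLin,85\n"
tsv_text = "name\tscore\nAda\t90\nLin\t85\n"

# read_csv() assumes a comma delimiter by default.
csv_df = pd.read_csv(io.StringIO(csv_text))

# For other flat files, pass the delimiter with sep.
tsv_df = pd.read_csv(io.StringIO(tsv_text), sep="\t")
```

Both calls produce the same two-column data frame, since only the delimiter differs.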
3. Extract, Transform and Load (ETL) 1️⃣ Extract. Extracting data: what does it mean? Extract from text files. Unstructured: plain text. Flat files: row = record, column = attribute, ex) csv. JSON (JavaScript Object Notation): semi-structured. Atomic types: number, string, boolean, null. Composite types: array, object. Data on the Web: available through APIs, which send data in JSON format. API: application programming inter..
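To make the atomic and composite JSON types concrete, here is a minimal sketch using Python's standard json module. The document being parsed is a made-up example, not data from the post.

```python
import json

# Hypothetical JSON document covering every atomic and composite type.
raw = """
{
  "name": "Ada",
  "age": 36,
  "active": true,
  "nickname": null,
  "tags": ["etl", "json"],
  "address": {"city": "London"}
}
"""

record = json.loads(raw)

# Atomic types map to str, int/float, bool, and None.
atomic_types = {
    "name": type(record["name"]).__name__,
    "age": type(record["age"]).__name__,
    "active": type(record["active"]).__name__,
    "nickname": type(record["nickname"]).__name__,
}

# Composite types map to list (array) and dict (object).
composite_types = {
    "tags": type(record["tags"]).__name__,
    "address": type(record["address"]).__name__,
}
```

The same `json.loads` call applies to the body of an API response once it has been received as text.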
2. Data Engineering Toolbox 1️⃣ Databases. What are databases? They hold data, organize data, and retrieve/search data through a DBMS (database management system). Structured and unstructured data. Structured: has a database schema, ex) relational database. Semi-structured: ex) JSON. Unstructured: schemaless, more like files, ex) videos, photos. SQL: tables, database schema, relational databases. NoSQL: non-relational data..
1️⃣ What is data engineering? The concept of data engineering. When is a data engineer needed? To gather data from different sources, provide a database optimized for analysis, remove corrupt data, and use cloud technology. Data scientists: “Yay!” Data engineer definition: an engineer who develops, constructs, tests, and maintains architectures such as databases and large-scale processing systems. Q. Does the data engineer come up with a ..