Data Engineering Bootcamp - Series 1
Published 6/2025
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 Ch
Language: English | Duration: 10h 0m | Size: 3.71 GB

Get Started Today and Build Your Career in Data Engineering!

What you'll learn
- Understand the Fundamentals of Modern Data Engineering
- Build and Manage Scalable Data Lakes on AWS S3
- Design Star Schema Data Models with Fact & Dimension Tables
- Implement Slowly Changing Dimensions (SCD1 & SCD2)
- Develop ETL Pipelines Using PySpark with Data Quality Checks
- Query and Explore Data Lakes with AWS Athena and Glue Catalog
- Automate Workflows and Pipelines Using Apache Airflow
- Create Custom Airflow Plugins to Manage EMR Spark Jobs
- Apply the WAP (Write-Audit-Publish) Pattern for Production Pipelines
- Implement Data Quality Frameworks and Data Contracts
- Deploy and Monitor Data Pipelines on AWS EMR
- Optimize Data Workflows for Cost, Performance, and Reliability
- Gain Hands-On Experience with Real-World Use Cases
- Prepare for Data Engineering Interviews with Confidence

Requirements
- Basic knowledge of SQL and Python
- Familiarity with Docker and Bash scripting helpful

Description
Take your first step into the world of data engineering and future-proof your career with this hands-on, project-based bootcamp built on the modern data stack.
Taught by a seasoned data architect with over 11 years of industry experience, this course blends theory with practice and is designed for aspiring data engineers, software engineers, analysts, and anyone eager to learn how to build real-world data pipelines.

You'll learn to design scalable data lakes, build dimensional data models, implement data quality frameworks, and orchestrate pipelines using Apache Airflow, all through a real-life ride-hailing application use case that simulates enterprise-scale systems.

What You'll Learn

Section 1: Context Setup
Build your foundation with the Modern Data Stack, understand OLTP systems, and explore real-world data platform architectures.
- Gain clarity on how data flows in data-driven companies
- Learn using a ride-hailing app scenario
- Get properly onboarded into the bootcamp journey

Section 2: Data Lake Essentials
Learn how to build and manage scalable data lakes on AWS S3.
- S3 architecture, partitioning, layers, and schema evolution
- IAM, encryption, storage classes, event notifications
- Lifecycle management, backup & recovery
- Hands-on with Boto3 S3 APIs

Section 3: Data Modeling
Master star schema design and implement SCD Type 1 and Type 2 dimensions.
- Dimensional & fact modeling
- ETL development for analytical reporting
- Build end-to-end models and data marts with hands-on labs

Section 4: Data Quality
Ensure trust and integrity in your data pipelines.
- Understand accuracy, completeness, and consistency
- Implement DQ checks using industry best practices
- Use data contracts for accountability

Section 5: AWS Athena
Query massive datasets with serverless power using AWS Athena.
- Learn DDL, Glue Catalog, and workgroup management
- Automate queries using Boto3 APIs
- Compare Athena vs Presto vs Trino
- Optimize queries with best practices

Section 6: Apache Spark
Build production-grade data pipelines with PySpark on AWS EMR.
- Learn Spark architecture and PySpark APIs
- Build data pipelines using the WAP (Write-Audit-Publish) pattern
- Run scalable jobs on AWS EMR
- Apply UDFs and data quality within transformation logic

Section 7: Apache Airflow
Orchestrate workflows using Airflow and build custom plugins.
- Design DAGs, schedule pipelines, manage dependencies
- Automate Spark jobs using a custom AWS EMR plugin
- Hands-on labs for ingestion and transformation DAGs
- Build reliable, reusable orchestration solutions

What You'll Build
A production-style data platform for a ride-hailing company, including:
- Data lake on AWS S3
- Dimensional data model with SCD logic
- Spark-based transformation pipelines
- Automated orchestration with Airflow
- Query layer with Athena
- Built-in data quality validations

Who this course is for
- Aspiring data engineers looking to break into the field
- Software engineers or analysts transitioning into data roles
- Professionals seeking real-world, hands-on data engineering experience
- Anyone interested in mastering the modern data engineering stack