PySpark Essential Training: Introduction to Building Data Pipelines
Published 08/2025 | MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz, 2 ch | Language: English | Duration: 1h 18m | Size: 146 MB

Video description

PySpark is a powerful library that brings Apache Spark's distributed computing capabilities to Python, making it a key tool for processing large-scale data efficiently. In this course, data engineer and analyst Sam Bail provides a structured and hands-on introduction to PySpark, starting with an overview of Apache Spark, its architecture, and its ecosystem. Learn about Spark's core concepts, such as the DataFrame API, transformations, lazy evaluation, and actions, before setting up a lab environment and working with a real dataset. Plus, gain insights into how PySpark fits into the broader data engineering ecosystem and best practices for running PySpark in a production environment.
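As a taste of the concepts the course covers, here is a minimal PySpark sketch illustrating the DataFrame API, transformations, lazy evaluation, and actions. The file name "sales.csv" and the column names are hypothetical placeholders, and a local Spark/PySpark installation is assumed.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Create a local SparkSession, the entry point to the DataFrame API.
spark = SparkSession.builder.appName("pyspark-intro").getOrCreate()

# Read a CSV file into a DataFrame ("sales.csv" is a placeholder path).
df = spark.read.csv("sales.csv", header=True, inferSchema=True)

# Transformations (filter, groupBy, agg) are lazy: Spark only builds an
# execution plan here; no data is actually processed yet.
totals = (
    df.filter(F.col("amount") > 0)
      .groupBy("region")
      .agg(F.sum("amount").alias("total_amount"))
)

# An action (show) triggers execution of the whole plan.
totals.show()

spark.stop()
```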