Overview
A study group that translates the target book and reads it together in turns.
Target book
- Fundamentals of Data Engineering (O'Reilly, 2022/07)
- https://www.amazon.co.jp/dp/1098108302/
Session 00 Preface
- 2023/04/03
- connpass: https://gaisaba.connpass.com/event/279517/
- Materials: https://qiita.com/Shumpei_Kubo/items/9cb9145b4b695b3b5346
- Section: Preface
Session 01 Chapter 3: Good Data Architecture
- 2023/04/10
- connpass: https://gaisaba.connpass.com/event/279986/
- Materials: https://qiita.com/Shumpei_Kubo/items/44ba187698d3799372a2
- Section:
- Chapter 3
- Good Data Architecture
- "Principles of Good Data Architecture"
Session 02 Chapter 3: Tight vs Loose Coupling: Tiers, Monoliths, and Microservices
- 2023/04/17
- connpass: https://gaisaba.connpass.com/event/280778/
- Materials: https://qiita.com/hikarumosity/items/38d8e713b8203645cf7c
- Section:
- Chapter 3
- Tight vs Loose Coupling: Tiers, Monoliths, and Microservices
- (up to just before User Access: Single Versus Multitenant)
Session 03 Chapter 3: "User Access: Single Versus Multitenant", "Event-Driven Architecture"
- 2023/04/27
- connpass: https://gaisaba.connpass.com/event/281425/
- Materials:
- Section:
- Chapter 3
- "User Access: Single Versus Multitenant"
- "Event-Driven Architecture"
Session 04 Chapter 3: "User Access: Single Versus Multitenant", "Event-Driven Architecture"
- 2023/05/08
- connpass: https://gaisaba.connpass.com/event/282438/
- Materials: https://cellardoor.hatenablog.jp/entry/fundamentalsofdataengineering-part4/
- Section:
- Chapter 3
- User Access: Single Versus Multitenant
- Event-Driven Architecture
Session 05 Chapter 3: Convergence, Next-Generation Data Lakes, and the Data Platform, etc.
- 2023/05/15
- connpass: https://gaisaba.connpass.com/event/283380/
- Materials: https://techblog.otakashi.jp/archives/301
- Section:
- Chapter 3
- Convergence, Next-Generation Data Lakes, and the Data Platform
- Modern Data Stack
- Lambda Architecture
- Kappa Architecture
- The Dataflow Model and Unified Batch and Streaming
Session 06 Chapter 3: Data Mesh onward
- 2023/05/22
- connpass: https://gaisaba.connpass.com/event/284114/
- Materials: https://afoil.notion.site/Fundamentals-of-Data-Engineering-6-ac601910f1b4414eb95749da58be9604
- Section:
- Chapter 3
- From Data Mesh to the end of Chapter 3 (the reference list is not needed)
Session 07 Chapter 4: Choosing Technologies Across the Data Engineering Lifecycle
- 2023/05/29
- connpass: https://gaisaba.connpass.com/event/285167/
- Materials: https://qiita.com/mongolmongol/private/463e7c57879d0c8ea4d6
- Section:
- Chapter 4
- From Choosing Technologies Across the Data Engineering Lifecycle:
- Cost Optimization and Business Value
- Total Cost of Ownership
- Total Opportunity Cost of Ownership
- FinOps
Session 08 Chapter 4: Choosing Technologies Across the Data Engineering Lifecycle
- 2023/06/05
- connpass: https://gaisaba.connpass.com/event/285671/
- Materials: https://qiita.com/Shumpei_Kubo/items/6543b3c2e7a11d31152b
- Section:
- Chapter 4
- From Choosing Technologies Across the Data Engineering Lifecycle:
- Today Versus the Future: Immutable Versus Transitory Technologies
- (through the end of Our Advice, up to just before Location)
Session 09 Chapter 4: Choosing Technologies Across the Data Engineering Lifecycle
- 2023/06/12
- connpass: https://gaisaba.connpass.com/event/286435/
- Materials: https://cellardoor.hatenablog.jp/entry/fundamentalsofdataengineering-part9/
- Section:
- Chapter 4
- From Choosing Technologies Across the Data Engineering Lifecycle:
- From Undercurrents and Their Impacts on Choosing Technologies to the end of Chapter 4
Session 10 Chapter 2: The Data Engineering Lifecycle
- 2023/06/19
- connpass: https://gaisaba.connpass.com/event/287143/
- Materials: https://afoil.notion.site/Fundamentals-of-Data-Engineering-10-02de3e92a5c5447497a10cb541b7f20d
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From the start of Chapter 2 up to just before Storage
Session 11 Chapter 2: The Data Engineering Lifecycle
- 2023/06/26
- connpass: https://gaisaba.connpass.com/event/287901/
- Materials:
- Section:
- Chapter 2
- From The Data Engineering Lifecycle:
- Storage (presented by ホイル焼き)
- Ingestion (presented by okamoto)
Session 12 Chapter 2: The Data Engineering Lifecycle
- 2023/07/03
- connpass: https://gaisaba.connpass.com/event/288576/
- Materials: https://qiita.com/hikarumosity/items/a82757627c5aa36a37d9
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Transformation through Embedded analytics under Serving Data
- (up to just before Machine Learning)
Session 13 Chapter 2: The Data Engineering Lifecycle
- 2023/07/10
- connpass: https://gaisaba.connpass.com/event/289254/
- Materials:
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Serving Data:
- Machine Learning
- Reverse ETL
Session 14 Chapter 2: The Data Engineering Lifecycle
- 2023/07/18
- connpass: https://gaisaba.connpass.com/event/290023/
- Materials: https://qiita.com/ak-sakatoku/items/286f92bc911e1e981c69
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Serving Data:
- Major Undercurrents Across the Data Engineering Lifecycle
- (up to just before Data governance)
Session 15 Chapter 2: The Data Engineering Lifecycle
- 2023/07/24
- connpass: https://gaisaba.connpass.com/event/290705/
- Materials: https://qiita.com/Shumpei_Kubo/items/5dcc5a0ff63ade09c22d
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Major Undercurrents Across the Data Engineering Lifecycle:
- The following parts under Data Management:
- Data governance
- Discoverability
- Metadata
- (up to just before Data accountability)
Session 16 Chapter 2: The Data Engineering Lifecycle
- 2023/07/31
- connpass: https://gaisaba.connpass.com/event/291256/
- Materials: https://www.slideshare.net/secret/4JBetounXOoNAU
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Major Undercurrents Across the Data Engineering Lifecycle:
- The following parts under Data Management (including Master Data Management, which appears partway through):
- Data accountability
- Data quality
- Data modeling and design
- Data lineage
- Data integration and interoperability
- Data lifecycle management
- Ethics and privacy
- (up to just before DataOps)
Session 17 Chapter 2: The Data Engineering Lifecycle
- 2023/08/21
- connpass: https://gaisaba.connpass.com/event/291970/
- Materials: https://qiita.com/ak-sakatoku/items/baf5c130c67d6b3c9fef
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Major Undercurrents Across the Data Engineering Lifecycle:
- DataOps
- Data Architecture
- (up to just before Orchestration)
Session 18 Chapter 2: The Data Engineering Lifecycle
- 2023/09/04
- connpass: https://gaisaba.connpass.com/event/294111/
- Materials: https://qiita.com/Takamori_Sonoda/items/3bcb3ba379e1604ce596
- Section:
- Chapter 2
- The Data Engineering Lifecycle
- From Major Undercurrents Across the Data Engineering Lifecycle:
- Orchestration
- Software Engineering
- Conclusion
- (Additional Resources is not needed)
Session 19 Chapter 8: Queries, Modeling, and Transformation
- 2023/09/25
- connpass: https://gaisaba.connpass.com/event/295482/
- Materials: https://afoil.notion.site/Chapter-8-Queries-Modeling-and-Transformation-4e1283d95aa94741b30c2956e0896b80
- Section:
- Chapter 8: Queries, Modeling, and Transformation
- Queries
- What is a Query?
- The Life of a Query
- The Query Optimizer
- (up to just before Improving Query Performance)
Session 20 Chapter 8: Queries, Modeling, and Transformation
- 2023/10/16
- connpass: https://gaisaba.connpass.com/event/298222/
- Materials: https://qiita.com/kota9/items/0f1b7bc28df151159abe
- Section:
- Chapter 8: Queries, Modeling, and Transformation
- Improving Query Performance
- (up to just before "Queries on Streaming Data")
Session 21 Chapter 8: Queries, Modeling, and Transformation
- 2023/10/30
- connpass: https://gaisaba.connpass.com/event/299659/
- Materials: https://qiita.com/IQ_Bocchi/items/a86bf374ccfccb3e98ac
- Section:
- Chapter 8: Queries, Modeling, and Transformation
- Queries on Streaming Data (up to just before Data Modeling)
Session 22 Chapter 8: Data Modeling
- 2023/11/13
- connpass: https://gaisaba.connpass.com/event/301330/
- Materials: https://afoil.notion.site/Fundamentals-of-Data-Engineering-22-7aca59c67d594dd1bee7f3b0bec11de6
- Section:
- Chapter 8: Data Modeling
- (up to just before Techniques for Modeling Batch Analytical Data)
Session 23 Chapter 8: Techniques for Modeling Batch Analytical Data
- 2023/11/27
- connpass: https://gaisaba.connpass.com/event/301330/
- Materials: https://techblog.otakashi.jp/archives/467
- Section:
- Chapter 8: Techniques for Modeling Batch Analytical Data
- Introduction
- Inmon
- Kimball
- Fact tables
- (up to just before Star schema)
Session 24 Chapter 8: Techniques for Modeling Batch Analytical Data
- 2023/12/11
- connpass: https://gaisaba.connpass.com/event/304283/
- Materials: https://afoil.notion.site/Fundamentals-of-Data-Engineering-24-481cc0716ac54549bfad360b9a88356c
- Section:
- Chapter 8: Techniques for Modeling Batch Analytical Data
- Star schema
- Data Vault
- Hubs
- Links
- Satellites
- (up to just before Wide denormalized tables)
Session 25 Chapter 8: Techniques for Modeling Batch Analytical Data
- 2024/01/15
- connpass: https://gaisaba.connpass.com/event/305037/
- Materials: https://qiita.com/Shumpei_Kubo/items/5efdf7db68fda12c71f4
- Section:
- Chapter 8: Techniques for Modeling Batch Analytical Data
- Wide denormalized tables
- Modeling Streaming Data
- (up to just before Transformation)
Session 26 Chapter 8: Techniques for Modeling Batch Analytical Data
- 2024/01/29
- connpass: https://gaisaba.connpass.com/event/307845/
- Materials: https://techblog.otakashi.jp/archives/523
- Section:
- Chapter 8: Techniques for Modeling Batch Analytical Data
- Transformations
- Batch Transformation
- Broadcast join
- Shuffle hash join
- ETL, ELT, and data pipeline
Session 27 Chapter 8: Transformation
- 2024/03/04
- connpass: https://gaisaba.connpass.com/event/309141/
- Materials: https://qiita.com/IQ_Bocchi/items/cfbe13c29bd9a94442b5
- Section:
- Chapter 8
- Transformation
- From Batch Transformation:
- SQL and code-based transformation tools
- SQL is declarative...but it can still build complex data workflows
- Example: When to avoid SQL for batch transformations in Spark
- Example: Optimizing Spark and other processing frameworks
- (up to just before Update patterns)
Session 28 Chapter 8: Update patterns
- 2024/03/18
- connpass: https://gaisaba.connpass.com/event/312534/
- Materials: https://afoil.notion.site/2024-03-18-f14f3a7a836e44e9a4e7590129bec5c9
- Section: Chapter 8
- Update patterns
- Truncate and reload
- Insert only
- Delete
- Upsert / merge
- (up to just before Schema updates)
Session 29 Chapter 8: Schema Updates
- 2024/04/01
- connpass: https://gaisaba.connpass.com/event/313833/
- Materials: https://techblog.otakashi.jp/archives/568
- Section: Chapter 8
- Schema updates
- Data wrangling
- Example: Data transformation in Spark
- Business logic and derived data
Session 30 Chapter 8: Materialized Views, Federation, and Query Virtualization
- 2024/04/15
- connpass: https://gaisaba.connpass.com/event/315096/
- Materials: https://afoil.hatenablog.com/entry/2024/04/15/002450
- Section: Chapter 8
- Materialized Views, Federation, and Query Virtualization
- Streaming Transformations and Processing
Session 31 Chapter 8: Whom You'll Work With
- 2024/05/13
- connpass: https://gaisaba.connpass.com/event/316502/
- Materials: https://techblog.otakashi.jp/archives/588
- Section: Chapter 8
- From Whom You'll Work With to the end of Chapter 8
Session 32 Chapter 11: The Future of Data Engineering
- 2024/05/27
- connpass: https://gaisaba.connpass.com/event/319782/
- Materials: https://qiita.com/Shumpei_Kubo/items/2ab938d665f3952923a2
- Section: Chapter 11
- From the introduction through The Decline of Complexity and the Rise of Easy-to-Use Data Tools
- (up to just before The Cloud-Scale Data OS and Improved Interoperability)
Session 33 Chapter 11: The Future of Data Engineering
- 2024/06/10
- connpass: https://gaisaba.connpass.com/event/320973/
- Materials: https://qiita.com/IQ_Bocchi/items/a55385fdae07ebb7db90
- Section: Chapter 11
- The Cloud-Scale Data OS and Improved Interoperability
- "Enterprisey" Data Engineering
- Titles and Responsibilities Will Morph...
- (up to just before Moving Beyond the Modern Data Stack, Toward the Live Data Stack)
Session 34 Chapter 11: The Future of Data Engineering
- 2024/06/24
- connpass: https://gaisaba.connpass.com/event/322000/
- Materials: https://afoil.notion.site/ch-11-modern-data-stack-0a9cb544c17448b09bcfc23cd0a5b1f7
- Section: Chapter 11
- From Moving Beyond the Modern Data Stack, Toward the Live Data Stack to the end of Chapter 11
Session 35 Chapter 1: Data Engineering Described (データエンジニアリング概説)
- 2024/07/08
- connpass: https://gaisaba.connpass.com/event/323400/
- From this session on, the group reads Japanese-language materials.
- Materials: none
Session 36 Chapter 4: Choosing Technologies Across the Data Engineering Lifecycle (データエンジニアリングライフサイクルにおけるテクノロジの選択)
- 2024/07/29
- connpass: https://gaisaba.connpass.com/event/326399/
- Materials: https://techblog.otakashi.jp/archives/612
Session 37 Chapter 5: Data Generation in Source Systems (ソースシステムにおけるデータ生成)
- 2024/08/19
- connpass: https://gaisaba.connpass.com/event/328140/
- Materials: https://afoil.notion.site/ch5-561a6229a31f496db07915c7dcf59719
Session 38 Chapter 6: Storage (ストレージへの保存)
- 2024/09/02
- connpass: https://gaisaba.connpass.com/event/328612/
- Materials: https://qiita.com/IQ_Bocchi/items/ca8e76a1bf50b5c475a4
Session 39 Chapter 7: Ingestion (データ取り込み)
- 2024/09/30
- connpass: https://gaisaba.connpass.com/event/329964/
- Materials: https://afoil.notion.site/ch7-50a1ea2f69634e8d850e138b95466d8d
Session 40 Chapter 9: Serving Data for Analytics, Machine Learning, and Reverse ETL (アナリティクス、機械学習、リバースETLへのデータの提供)
- 2024/10/21
- connpass: https://gaisaba.connpass.com/event/334165/
- Materials: https://qiita.com/Shumpei_Kubo/items/245b6bc67341d5bf0437
Session 41 Chapter 10: Security and Privacy (セキュリティとプライバシー) (final session)
- 2024/10/28
- connpass: https://gaisaba.connpass.com/event/334166/
- Materials: https://qiita.com/IQ_Bocchi/items/b9f716feaa74da9713bc