Course Outline
Introduction to Apache Iceberg
- Overview of Apache Iceberg
- Importance and use cases in modern data architecture
- Key features and benefits
Core Concepts
- Iceberg table format and architecture
- Comparison with other table formats
- Partitioning and schema evolution
- Time travel and data versioning
Setting Up Apache Iceberg
- Installation and configuration
- Integrating Iceberg with various data processing engines
- Setting up an Iceberg environment on a local machine
Basic Operations
- Creating and managing Iceberg tables
- Writing to and reading from Iceberg tables
- Basic CRUD operations
Data Migration and Integration
- Migrating data from Hive and other systems to Iceberg
- Integration with BI tools
- Migrating a sample dataset to Iceberg
Optimizing Performance
- Performance tuning techniques
- Optimizing queries and data scans
- Performance optimization in Iceberg
Overview of Advanced Features
- Partition evolution and hidden partitioning
- Table evolution and schema changes
- Time travel and rollback features
- Implementing advanced features in Iceberg
Summary and Next Steps
Requirements
- Familiarity with concepts such as tables, schemas, partitions, and data ingestion
- Basic knowledge of SQL
Audience
- Data engineers
- Data architects
- Data analysts
- Software developers
Testimonials (5)
The trainer's practical experience, shared without overselling the solution under discussion and without casting it in a negative light. I feel that the trainer is preparing me for real, practical use of the tool; these valuable details are usually not found in books.
Krzysztof Miodek - Krajowy Rejestr Dlugow Biuro Informacji Gospodarczej S.A.
Course - Apache Spark Fundamentals
The live examples
Ahmet Bolat - Accenture Industrial SS
Course - Python, Spark, and Hadoop for Big Data
Very interactive.
Richard Langford
Course - SMACK Stack for Data Science
Sufficient hands-on practice; the trainer is knowledgeable.
Chris Tan
Course - A Practical Introduction to Stream Processing
Got to learn Spark Streaming, Databricks, and AWS Redshift.