Course Outline
Section 1: Data Management in HDFS
- Various Data Formats (JSON / Avro / Parquet)
- Compression Schemes
- Data Masking
- Labs : Analyzing different data formats; enabling compression
Section 2: Advanced Pig
- User-defined Functions
- Introduction to Pig Libraries (ElephantBird / Data-Fu)
- Loading Complex Structured Data using Pig
- Pig Tuning
- Labs : advanced Pig scripting, parsing complex data types
Section 3: Advanced Hive
- User-defined Functions
- Compressed Tables
- Hive Performance Tuning
- Labs : creating compressed tables, evaluating table formats and configuration
Section 4: Advanced HBase
- Advanced Schema Modelling
- Compression
- Bulk Data Ingest
- Wide-table / Tall-table comparison
- HBase and Pig
- HBase and Hive
- HBase Performance Tuning
- Labs : tuning HBase; accessing HBase data from Pig & Hive; using Phoenix for data modeling
Requirements
- comfortable with the Java programming language (most programming exercises are in Java)
- comfortable in a Linux environment (able to navigate the Linux command line and edit files using vi / nano)
- a working knowledge of Hadoop
Lab environment
Zero Install: There is no need to install Hadoop software on students’ machines. A working Hadoop cluster will be provided for students.
Students will need the following:
- an SSH client (Linux and Mac already have SSH clients; for Windows, PuTTY is recommended)
- a browser to access the cluster (Firefox is recommended)
Participant Reviews (5)
Many hands-on sessions.
Jacek Pieczątka
Course - Administrator Training for Apache Hadoop
practical things of doing, also theory was served good by Ajay
Dominik Mazur - Capgemini Polska Sp. z o.o.
Course - Hadoop Administration on MapR
A project to prepare independently, an interesting example of DevOps-style work with Ambari, and trainer support (logging in to the virtual machine, good and direct communication)
Bartłomiej Krasiński - Rossmann SDP
Course - HBase for Developers
The practical part.
Arkadiusz Iwaszko
Course - Big Data Hadoop Analyst Training
I thought he did a great job of tailoring the experience to the audience. This class is mostly designed to cover data analysis with HIVE, but me and my co-worker are doing HIVE administration with no real data analytics responsibilities.