Course Outline
Introduction
Principles of Distributed Computing
- Apache Spark
- Hadoop
Principles of Data Serialization
- How a data object is passed over the network
- Serialization of objects
- Serialization approaches
  - Thrift
  - Protocol Buffers
  - Apache Avro
    - data structure
    - size, speed, format characteristics
    - persistent data storage
    - integration with dynamic languages
    - dynamic typing
    - schemas
    - untagged data
    - change management
Data Serialization and Distributed Computing
- Avro as a subproject of Hadoop
- Java serialization
- Hadoop serialization
- Avro serialization
Using Avro with
- Hive (AvroSerDe)
- Pig (AvroStorage)
Porting Existing RPC Frameworks
Summary and Conclusion
Requirements
- A general familiarity with distributed computing.
Participant Reviews (5)
Many hands-on sessions.
Jacek Pieczątka
Course - Administrator Training for Apache Hadoop
The practical exercises; the theory was also presented well by Ajay.
Dominik Mazur - Capgemini Polska Sp. z o.o.
Course - Hadoop Administration on MapR
A project to prepare independently, an interesting example of DevOps-style work with Ambari, and support from the trainer (logging in to the virtual machine, good and direct communication).
Bartłomiej Krasiński - Rossmann SDP
Course - HBase for Developers
The practical part.
Arkadiusz Iwaszko
Course - Big Data Hadoop Analyst Training
I thought he did a great job of tailoring the experience to the audience. This class is mostly designed to cover data analysis with Hive, but my co-worker and I are doing Hive administration with no real data analytics responsibilities.