Course Outline

Introduction to Data Analysis and Big Data

  • What Makes Big Data "Big"?
    • Velocity, Volume, Variety, Veracity (VVVV)
  • Limits to Traditional Data Processing
  • Distributed Processing
  • Statistical Analysis
  • Types of Machine Learning Analysis
  • Data Visualization

Big Data Roles and Responsibilities

  • Administrators
  • Developers
  • Data Analysts

Languages Used for Data Analysis

  • R Language
    • Why R for Data Analysis?
    • Data manipulation, calculation and graphical display
  • Python
    • Why Python for Data Analysis?
    • Manipulating, processing, cleaning, and crunching data
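
As a rough illustration of the "manipulating, processing, cleaning, and crunching" item above, the following minimal sketch uses only the Python standard library. The city names and temperature readings are made-up sample data, not course material.

```python
import csv
import io
from statistics import mean

# Hypothetical raw input: one reading is missing, one has stray whitespace.
raw = """city,temperature
Warsaw, 21.5
Krakow,
Gdansk,19.0
"""

rows = csv.DictReader(io.StringIO(raw))

# Cleaning: drop rows with no reading, strip whitespace, convert to float.
clean = [
    (row["city"].strip(), float(row["temperature"]))
    for row in rows
    if row["temperature"].strip()
]

# Crunching: a simple aggregate over the cleaned records.
average = mean(temp for _, temp in clean)
print(clean, average)
```

In practice a library such as pandas would usually handle these steps; the standard library is used here only to keep the sketch self-contained.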

Approaches to Data Analysis

  • Statistical Analysis
    • Time Series analysis
    • Forecasting with Correlation and Regression models
    • Inferential Statistics (estimating)
    • Descriptive Statistics in Big Data sets (e.g. calculating mean)
  • Machine Learning
    • Supervised vs unsupervised learning
    • Classification and clustering
    • Estimating cost of specific methods
    • Filtering
  • Natural Language Processing
    • Processing text
    • Understanding the meaning of text
    • Automatic text generation
    • Sentiment analysis / topic analysis
  • Computer Vision
    • Acquiring, processing, analyzing, and understanding images
    • Reconstructing, interpreting and understanding 3D scenes
    • Using image data to make decisions
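
To make the statistical items above more concrete (descriptive statistics such as the mean, and forecasting with a regression model), here is a minimal sketch in plain Python. The `fit_line` helper and the monthly sales figures are hypothetical, chosen only for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical monthly sales figures (made-up illustration data).
months = [1, 2, 3, 4, 5]
sales = [10.0, 12.0, 14.0, 16.0, 18.0]

# Descriptive statistic: the mean of the observed values.
average_sales = mean(sales)

# Regression: fit a line, then extrapolate one month ahead.
slope, intercept = fit_line(months, sales)
forecast_month_6 = slope * 6 + intercept
print(average_sales, slope, intercept, forecast_month_6)
```

On real data the fit would not be exact, and a forecast like this is only as good as the assumption that the trend is linear.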

Big Data Infrastructure

  • Data Storage
    • Relational databases (SQL)
      • MySQL
      • PostgreSQL
      • Oracle
    • Non-relational databases (NoSQL)
      • Cassandra
      • MongoDB
      • Neo4j
    • Understanding the nuances
      • Hierarchical databases
      • Object-oriented databases
      • Document-oriented databases
      • Graph-oriented databases
      • Other
  • Distributed Processing
    • Hadoop
      • HDFS as a distributed filesystem
      • MapReduce for distributed processing
    • Spark
      • All-in-one in-memory cluster computing framework for large-scale data processing
      • Structured streaming
      • Spark SQL
      • Machine Learning libraries: MLlib
      • Graph processing with GraphX
  • Scalability
    • Public cloud
      • AWS, Google Cloud, Alibaba Cloud (Aliyun), etc.
    • Private cloud
      • OpenStack, Cloud Foundry, etc.
    • Auto-scalability
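
The MapReduce model listed under Hadoop above can be sketched in a single Python process. This is not Hadoop's API: the function names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

# Made-up input splits standing in for files stored on a distributed filesystem.
documents = ["big data is big", "data is everywhere"]

mapped = chain.from_iterable(map_phase(doc) for doc in documents)
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each map call sees only its own split and each reduce call sees only one key, both phases can be distributed without shared state, which is the property that lets the model scale.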

Choosing the Right Solution for the Problem

The Future of Big Data

Summary and Conclusion

Requirements

  • A general understanding of math.
  • A general understanding of programming.
  • A general understanding of databases.

Audience

  • Developers / programmers
  • IT consultants

Duration: 35 hours
