Course Outline
Introduction
- Data mining as the analysis step of the KDD process ("Knowledge Discovery in Databases")
- Subfield of computer science
- Discovering patterns in large data sets
Sources of methods
- Artificial intelligence
- Machine learning
- Statistics
- Database systems
What is involved?
- Database and data management aspects
- Data pre-processing
- Model and inference considerations
- Interestingness metrics
- Complexity considerations
- Post-processing of discovered structures
- Visualization
- Online updating
Data mining main tasks
- Automatic or semi-automatic analysis of large quantities of data
- Extracting previously unknown interesting patterns:
  - groups of data records (cluster analysis)
  - unusual records (anomaly detection)
  - dependencies (association rule mining)
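As a rough illustration of the "unusual records" task above, the sketch below flags values that lie far from the mean of a small numeric sample. The z-score rule, the threshold of 2.0, the function name find_outliers, and the sample readings are all assumptions chosen for illustration; the outline does not prescribe a particular anomaly detection method.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds threshold.

    Hypothetical helper: a simple z-score rule, one of many possible
    anomaly detection approaches.
    """
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Made-up sample data with one obviously unusual reading.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
outliers = find_outliers(readings)
print(outliers)
```

A z-score rule is sensitive to the outliers it is trying to find, because they inflate the standard deviation; more robust variants (for example, based on the median) may be preferable on real data.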
Data mining: common classes of tasks
- Anomaly detection (Outlier/change/deviation detection)
- Association rule learning (Dependency modeling)
- Clustering
- Classification
- Regression
- Summarization
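To make the association rule learning item above more concrete, here is a minimal sketch of the two measures usually attached to a rule A -> B: support (the share of transactions containing all the items) and confidence (how often B appears when A does). The function names and the shopping-basket data are hypothetical and serve only as an example; this is not a full rule mining algorithm such as Apriori.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Made-up shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

In practice a rule is usually kept only when both measures exceed chosen minimums, which ties in with the "interestingness metrics" topic listed earlier in the outline.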
Use and applications
- Able Danger
- Behavioral analytics
- Business analytics
- Cross Industry Standard Process for Data Mining
- Customer analytics
- Data mining in agriculture
- Data mining in meteorology
- Educational data mining
- Human genetic clustering
- Inference attack
- Java Data Mining
- Open-source intelligence
- Path analysis (computing)
- Reactive business intelligence
Data dredging, data fishing, data snooping
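The risk behind data dredging can be sketched with a small simulation: when many unrelated variables are tested against the same target, some will look correlated purely by chance. Everything in the example is an assumption made for illustration, including the function names, the sample sizes, and the cutoff of 0.444, which is approximately the two-sided 5% critical value of Pearson's r for 20 samples.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant sequence
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_features=200, n_samples=20, r_cutoff=0.444, seed=0):
    """Count purely random features that pass the correlation cutoff."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_samples)]
    hits = 0
    for _ in range(n_features):
        # Each feature is independent noise, unrelated to the target.
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        if abs(pearson(feature, target)) > r_cutoff:
            hits += 1
    return hits


print(count_spurious())
```

With a 5% cutoff, roughly one random feature in twenty would be expected to pass, so reporting only the "hits" from such a search gives a misleading picture unless the number of tests is accounted for.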
Requirements
Fair knowledge of relational data structures and SQL
Testimonials (5)
How the trainer shows his knowledge of the subject he's teaching
John Ernesto II Fernandez - Philippine AXA Life Insurance Corporation
Course - Data Vault: Building a Scalable Data Warehouse
Prepared material. Full professionalism. Very good contact with the trainer. Full engagement and openness to changing the planned training format (very valuable open discussions on the topics we prepared)
Kamil Trebacz - Bank Gospodarstwa Krajowego
Course - Pentaho Data Integration (PDI) - ETL data processing module (advanced level)
Open discussion with trainer
Tomek Danowski - GE Medical Systems Polska Sp. Z O.O.
Course - Process Mining
Very useful because it helps me understand what we can do with the data in our context. It will also help me
Nicolas NEMORIN - Adecco Groupe France
Course - KNIME Analytics Platform for BI
I genuinely enjoyed the hands-on exercises.