Data Mining Training

Data Mining Courses

Data Mining Course Outlines

ID Name Duration (7 clock hours per day) Overview
463964 MATLAB Fundamental 21 hours
  This three-day course provides a comprehensive introduction to the MATLAB technical computing environment. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course.
  Topics include: Working with the MATLAB user interface; Entering commands and creating variables; Analyzing vectors and matrices; Visualizing vector and matrix data; Working with data files; Working with data types; Automating commands with scripts; Writing programs with logic and flow control; Writing functions
  Part 1
  A Brief Introduction to MATLAB
    Objectives: Offer an overview of what MATLAB is, what it consists of, and what it can do for you.
    An Example: C vs. MATLAB; MATLAB Product Overview; MATLAB Application Fields; What MATLAB can do for you?; The Course Outline
  Working with the MATLAB User Interface
    Objective: Get an introduction to the main features of the MATLAB integrated design environment and its user interfaces. Get an overview of course themes.
    MATLAB Interface; Reading data from file; Saving and loading variables; Plotting data; Customizing plots; Calculating statistics and best-fit line; Exporting graphics for use in other applications
  Variables and Expressions
    Objective: Enter MATLAB commands, with an emphasis on creating and accessing data in variables.
    Entering commands; Creating variables; Getting help; Accessing and modifying values in variables; Creating character variables
  Analysis and Visualization with Vectors
    Objective: Perform mathematical and statistical calculations with vectors, and create basic visualizations. See how MATLAB syntax enables calculations on whole data sets with a single command.
    Calculations with vectors; Plotting vectors; Basic plot options; Annotating plots
  Analysis and Visualization with Matrices
    Objective: Use matrices as mathematical objects or as collections of (vector) data. Understand the appropriate use of MATLAB syntax to distinguish between these applications.
    Size and dimensionality; Calculations with matrices; Statistics with matrix data; Plotting multiple columns; Reshaping and linear indexing; Multidimensional arrays
  Part 2
  Automating Commands with Scripts
    Objective: Collect MATLAB commands into scripts for ease of reproduction and experimentation. As the complexity of your tasks increases, entering long sequences of commands in the Command Window becomes impractical.
    A Modelling Example; The Command History; Creating script files; Running scripts; Comments and Code Cells; Publishing scripts
  Working with Data Files
    Objective: Bring data into MATLAB from formatted files. Because imported data can be of a wide variety of types and formats, emphasis is given to working with cell arrays and date formats.
    Importing data; Mixed data types; Cell arrays; Conversions amongst numerals, strings, and cells; Exporting data
  Multiple Vector Plots
    Objective: Make more complex vector plots, such as multiple plots, and use color and string manipulation techniques to produce eye-catching visual representations of data.
    Graphics structure; Multiple figures, axes, and plots; Plotting equations; Using color; Customizing plots
  Logic and Flow Control
    Objective: Use logical operations, variables, and indexing techniques to create flexible code that can make decisions and adapt to different situations. Explore other programming constructs for repeating sections of code, and constructs that allow interaction with the user.
    Logical operations and variables; Logical indexing; Programming constructs; Flow control; Loops
  Matrix and Image Visualization
    Objective: Visualize images and matrix data in two or three dimensions. Explore the difference between displaying images and visualizing matrix data using images.
    Scattered Interpolation using vector and matrix data; 3-D matrix visualization; 2-D matrix visualization; Indexed images and colormaps; True color images
  Part 3
  Data Analysis
    Objective: Perform typical data analysis tasks in MATLAB, including developing and fitting theoretical models to real-life data. This leads naturally to one of the most powerful features of MATLAB: solving linear systems of equations with a single command.
    Dealing with missing data; Correlation; Smoothing; Spectral analysis and FFTs; Solving linear systems of equations
  Writing Functions
    Objective: Increase automation by encapsulating modular tasks as user-defined functions. Understand how MATLAB resolves references to files and variables.
    Why functions?; Creating functions; Adding comments; Calling subfunctions; Workspaces; Subfunctions; Path and precedence
  Data Types
    Objective: Explore data types, focusing on the syntax for creating variables and accessing array elements, and discuss methods for converting among data types. Data types differ in the kind of data they may contain and the way the data is organized.
    MATLAB data types; Integers; Structures; Converting types
  File I/O
    Objective: Explore the low-level data import and export functions in MATLAB that allow precise control over text and binary file I/O. These functions include textscan, which provides precise control of reading text files.
    Opening and closing files; Reading and writing text files; Reading and writing binary files
  Conclusion
    Objectives: Summarise what we have learnt.
    A summary of the course; Other upcoming courses on MATLAB
  Note that the course as delivered may differ slightly from the outline above without prior notice.
118127 The MapReduce Model in the Apache Hadoop Implementation 14 hours
  The course is aimed at organizations that want to deploy solutions for processing large data sets on clusters.
  Data Mining and Business Intelligence
    Introduction; Application areas; Capabilities; Fundamentals of data mining and knowledge discovery
  Big data
    What do we mean by Big data?; Big data vs. Data mining
  MapReduce
    Model description; Example application; Statistics; Cluster model
  Hadoop
    What is Hadoop; Installation; Basic configuration; Cluster settings; Architecture and configuration of the Hadoop Distributed File System; Console commands and operation; The DistCp tool; MapReduce and Hadoop Streaming; Administration and configuration of Hadoop On Demand; Alternative solutions
417029 From Data to Decision with Big Data and Predictive Analytics 21 hours
  Audience
    If you are trying to make sense of the data you have access to, or want to analyse unstructured data available on the net (Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analyzing. It is not aimed at people configuring the solution, although they will benefit from the big picture.
  Delivery Mode
    During the course delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentations and simple exercises by the participants.
  Content and Software used
    All software used is updated each time the course is run, so we work with the newest versions possible. The course covers the process from obtaining, formatting, processing and analysing the data to automating the decision-making process with machine learning.
  Quick Overview
    Data Sources; Mining Data; Recommender systems; Target Marketing
  Datatypes
    Structured vs unstructured; Static vs streamed; Attitudinal, behavioural and demographic data; Data-driven vs user-driven analytics; Data validity; Volume, velocity and variety of data
  Models
    Building models; Statistical Models; Machine learning
  Data Classification
    Clustering; kGroups, k-means, nearest neighbours; Ant colonies, birds flocking
  Predictive Models
    Decision trees; Support vector machine; Naive Bayes classification; Neural networks; Markov Model; Regression; Ensemble methods
  ROI
    Benefit/Cost ratio; Cost of software; Cost of development; Potential benefits
  Building Models
    Data Preparation (MapReduce); Data cleansing; Choosing methods; Developing model; Testing Model; Model evaluation; Model deployment and integration
  Overview of Open Source and commercial software
    Selection of R-project package; Python libraries; Hadoop and Mahout; Selected Apache projects related to Big Data and Analytics; Selected commercial solution; Integration with existing software and data sources
417013 Data Mining with R 14 hours
  Sources of methods
    Artificial intelligence; Machine learning; Statistics
  Sources of data
    Pre-processing of data; Data Import/Export; Data Exploration and Visualization; Dimensionality Reduction; Dealing with missing values; R Packages
  Data mining main tasks
    Automatic or semi-automatic analysis of large quantities of data; Extracting previously unknown interesting patterns: groups of data records (cluster analysis), unusual records (anomaly detection), dependencies (association rule mining)
  Data mining
    Anomaly detection (Outlier/change/deviation detection); Association rule learning (Dependency modeling); Clustering; Classification; Regression; Summarization; Frequent Pattern Mining; Text Mining; Decision Trees; Regression; Neural Networks; Sequence Mining; Frequent Pattern Mining
  Data dredging, data fishing, data snooping
417012 Predictive Models with PMML 7 hours
  The course is designed for scientists, developers, analysts, and anyone else who wants to standardize or exchange their models using the Predictive Model Markup Language (PMML) file format.
  Predictive Models
    Intro to predictive models; Predictive models supported by PMML
  PMML Elements
    Header; Data Dictionary; Data Transformations; Model; Mining Schema; Targets; Output
  API
    Overview of API providers for PMML; Executing your model in a cloud
463779 Data Shrinkage for Government 14 hours
  Why shrink data
  Relational databases
    Introduction; Aggregation and disaggregation; Normalisation and denormalisation; Null values and zeroes; Joining data; Complex joins
  Cluster analysis
    Applications; Strengths and weaknesses; Measuring distance; Hierarchical clustering; K-means and derivatives; Applications in Government
  Factor analysis
    Concepts; Exploratory factor analysis; Confirmatory factor analysis; Principal component analysis; Correspondence analysis; Software; Applications in Government
  Predictive analytics
    Timelines and naming conventions; Holdout samples; Weights of evidence; Information value; Scorecard building demonstration using a spreadsheet; Regression in predictive analytics; Logistic regression in predictive analytics; Decision Trees in predictive analytics; Neural networks; Measuring accuracy; Applications in Government
295297 Fundamentals of Recommender Systems 7 hours
  The course is aimed at marketing department staff and IT department leaders.
  Problems and hopes related to data collection
    Information overload; Types of data collected; The potential of data today and tomorrow; Basic concepts of Data Mining
  Recommendation vs. search
    Searching and filtering; Sorting; Weighting results; Using synonyms; Full-text search
  The Long Tail concept
    Chris Anderson's idea; Arguments of opponents of the Long Tail concept; Anita Elberse's argumentation
  Attempting to determine similarities
    Products; Users; Documents and web pages
  Content-Based Recommendation and similarity measures
    Cosine distance; Euclidean distance between vectors; TF-IDF and the concept of term frequency
  Collaborative filtering
    Recommendation based on community ratings
  Using graphs
    Capabilities of graphs; Determining graph similarity; Recommendation based on relationships between users
  Neural networks
    How they work; Reference data; Example application of neural networks to HR recommender systems
  Encouraging users to share information
    Convenience of the service; Navigation aids; Functionality and UX
  Recommender systems around the world
    Problems and popularity of recommender systems; Successful deployments of recommender systems; Examples based on popular services
165013 Statistics with SPSS Predictive Analytics SoftWare 14 hours
  Goal: To become able to work with SPSS independently.
  Audience: Analysts, researchers, scientists, students, and anyone who wants to learn to use the SPSS package and get to know popular data mining techniques.
  Using the program
    Dialog windows; Entering/loading data; The concept of a variable and measurement scales; Preparing the database; Generating tables and charts; Formatting the report
  Command language (syntax)
    Automating analyses; Saving and modifying procedures; Creating your own analytical procedures
  Data analysis
    Descriptive statistics: key terms (variable, hypothesis, statistical significance); measures of central tendency; measures of dispersion; distributions of statistical variables; standardization
    Introduction to studying relationships between variables: correlational vs. experimental methods
  Summary: case analysis and discussion
463718 Introduction to Neo4j, a Graph Database 7 hours
  Introduction to Neo4j
    Installation and configuration; Structure of a Neo4j application; Relational and graph ways of representing data
  The graph data model
    Can and should the problem be represented as a graph?; Selected use cases and modelling of a selected problem; Key concepts of the Neo4j graph model: Node, Relationship, Property, Label
  The Cypher query language and graph operations
    Creating and managing the schema with Cypher; CRUD operations on data; Cypher queries and their SQL equivalents; Graph algorithms used in Neo4j; REST interface
  Basic administration topics
    Creating and restoring backups; Managing the database from the browser; Importing and exporting data in universal formats
417032 Data Mining 21 hours
  The course can be provided with any tools, including free open-source data mining software and applications.
  Introduction
    Data mining as the analysis step of the KDD process ("Knowledge Discovery in Databases"); Subfield of computer science; Discovering patterns in large data sets
  Sources of methods
    Artificial intelligence; Machine learning; Statistics; Database systems
  What is involved?
    Database and data management aspects; Data pre-processing; Model and inference considerations; Interestingness metrics; Complexity considerations; Post-processing of discovered structures; Visualization; Online updating
  Data mining main tasks
    Automatic or semi-automatic analysis of large quantities of data; Extracting previously unknown interesting patterns: groups of data records (cluster analysis), unusual records (anomaly detection), dependencies (association rule mining)
  Data mining
    Anomaly detection (Outlier/change/deviation detection); Association rule learning (Dependency modeling); Clustering; Classification; Regression; Summarization
  Use and applications
    Able Danger; Behavioral analytics; Business analytics; Cross Industry Standard Process for Data Mining; Customer analytics; Data mining in agriculture; Data mining in meteorology; Educational data mining; Human genetic clustering; Inference attack; Java Data Mining; Open-source intelligence; Path analysis (computing); Police-enforced ANPR in the UK; Reactive business intelligence; SEMMA; Stellar Wind; Talx; Zapaday
  Data dredging, data fishing, data snooping
809300 Big Data Hadoop Analyst Training 28 hours
  Big Data Analyst Training is a hands-on course recommended for anyone who wants to become an expert Data Scientist. The course focuses on the skills a modern analyst needs to work with Big Data technology. It presents tools for accessing, changing, transforming, and analysing complex data structures stored in a Hadoop cluster, and covers topics from the Hadoop Ecosystem (Pig, Hive, Impala, ELK and others).
  What you will learn
    The functionality of Pig, Hive, Impala, and ELK for collecting data, storing results, and analytics.
    How Pig, Hive, and Impala can improve the performance of typical, everyday analytical tasks.
    Running interactive, real-time analyses of huge data sets to extract information of value to the business, and how to interpret the conclusions.
    Running complex queries on very large data volumes.
  Outline
    Hadoop fundamentals.
    Introduction to Pig.
    Basic data analysis with Pig.
    Processing complex data with Pig.
    Operations on multiple data sets with Pig.
    Troubleshooting and optimizing Pig.
    Introduction to Hive, Impala, ELK.
    Running queries in Hive, Impala, ELK.
    Managing data in Hive.
    Data storage and performance.
    Analyses with Hive and Impala.
    Working with Impala and ELK.
    Analysing text and complex data types.
    Optimizing Hive, Pig, Impala, ELK.
    Interoperability and workflow.
    Questions, exercises, certification.
1029501 Data Visualization 28 hours
  This course teaches you to create effective plots and to present your data in ways that appeal to decision makers and help them uncover hidden information.
  Day 1
    What is data visualization; Why it is important; Data visualization vs data mining; Human cognition; HMI; Common pitfalls
  Day 2
    Different types of curves; Drill-down curves; Categorical data plotting; Multi-variable plots; Data glyph and icon representation
  Day 3
    Plotting KPIs with data; R and X charts examples; What-if dashboards; Parallel axes; Mixing categorical data with numeric data
  Day 4
    Different hats of data visualization; How data visualization can lie; Disguised and hidden trends; A case study of student data; Visual queries and region selection
209768 Big Data Business Intelligence for Govt. Agencies 40 hours
  Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset, and government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to evolve data-driven organizations to successfully accomplish their mission, they are laying the groundwork to correlate dependencies across events, people, processes, and information.
  High-value government solutions will be created from a mashup of the most disruptive technologies: mobile devices and applications; cloud services; social business technologies and networking; Big Data and analytics.
  IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data, whether related or unrelated, structured or unstructured.
  But accomplishing these feats takes far more than simply accumulating massive quantities of data. “Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information,” Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog. The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it.
  The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge.
  Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.) Custom-developed Big Data tools also are allowing agencies to address the need to analyze their data. For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers.
  Breakdown of topics on a daily basis (each session is 2 hours):
  Day-1: Session-1: Business Overview of Why Big Data Business Intelligence in Govt.
    Case Studies from NIH, DoE; Big Data adoption rate in Govt. Agencies and how they are aligning their future operation around Big Data Predictive Analytics; Broad Scale Application Area in DoD, NSA, IRS, USDA etc.; Interfacing Big Data with Legacy data; Basic understanding of enabling technologies in predictive analytics; Data Integration & Dashboard visualization; Fraud management; Business Rule/Fraud detection generation; Threat detection and profiling; Cost benefit analysis for Big Data implementation
  Day-1: Session-2: Introduction of Big Data-1
    Main characteristics of Big Data: volume, variety, velocity and veracity; MPP architecture for volume; Data Warehouses - static schema, slowly evolving dataset; MPP Databases like Greenplum, Exadata, Teradata, Netezza, Vertica etc.; Hadoop Based Solutions - no conditions on structure of dataset; Typical pattern: HDFS, MapReduce (crunch), retrieve from HDFS; Batch - suited for analytical/non-interactive; Volume: CEP streaming data; Typical choices - CEP products (e.g. Infostreams, Apama, MarkLogic etc.); Less production ready - Storm/S4; NoSQL Databases (columnar and key-value): best suited as analytical adjunct to data warehouse/database
  Day-1: Session-3: Introduction to Big Data-2
    NoSQL solutions:
      KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)
      KV Store - Dynamo, Voldemort, Dynomite, SubRecord, Mo8onDb, DovetailDB
      KV Store (Hierarchical) - GT.m, Cache
      KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord
      KV Cache - Memcached, Repcached, Coherence, Infinispan, EXtremeScale, JBossCache, Velocity, Terracoqua
      Tuple Store - Gigaspaces, Coord, Apache River
      Object Database - ZopeDB, DB40, Shoal
      Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Prsevere, Riak-Basho, Scalaris
      Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI
    Varieties of Data: Introduction to Data Cleaning issues in Big Data; RDBMS - static structure/schema, doesn't promote an agile, exploratory environment; NoSQL - semi-structured, enough structure to store data without an exact schema before storing data; Data cleaning issues
  Day-1: Session-4: Big Data Introduction-3: Hadoop
    When to select Hadoop?; STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active exploration); SEMI STRUCTURED data - tough to do with traditional solutions (DW/DB); Warehousing data = HUGE effort and static even after implementation; For variety & volume of data, crunched on commodity hardware - HADOOP; Commodity H/W needed to create a Hadoop Cluster; Introduction to MapReduce/HDFS; MapReduce - distribute computing over multiple servers; HDFS - make data available locally for the computing process (with redundancy); Data - can be unstructured/schema-less (unlike RDBMS); Developer responsibility to make sense of data; Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
  Day-2: Session-1: Big Data Ecosystem - Building Big Data ETL: universe of Big Data Tools - which one to use and when?
    Hadoop vs. other NoSQL solutions; For interactive, random access to data; HBase (column oriented database) on top of Hadoop; Random access to data but restrictions imposed (max 1 PB); Not good for ad-hoc analytics, good for logging, counting, time-series; Sqoop - import from databases to Hive or HDFS (JDBC/ODBC access); Flume - stream data (e.g. log data) into HDFS
  Day-2: Session-2: Big Data Management System
    Moving parts, compute nodes start/fail: ZooKeeper - for configuration/coordination/naming services; Complex pipeline/workflow: Oozie - manage workflow, dependencies, daisy chain; Deploy, configure, cluster management, upgrade etc. (sys admin): Ambari; In Cloud: Whirr
  Day-2: Session-3: Predictive analytics in Business Intelligence-1: Fundamental Techniques & Machine learning based BI
    Introduction to Machine learning; Learning classification techniques; Bayesian Prediction - preparing training file; Support Vector Machine; KNN p-Tree Algebra & vertical mining; Neural Network; Big Data large variable problem - Random forest (RF); Big Data Automation problem - Multi-model ensemble RF; Automation through Soft10-M; Text analytic tool - Treeminer; Agile learning; Agent based learning; Distributed learning; Introduction to Open source Tools for predictive analytics: R, Rapidminer, Mahout
  Day-2: Session-4: Predictive analytics eco-system-2: Common predictive analytic problems in Govt.
    Insight analytic; Visualization analytic; Structured predictive analytic; Unstructured predictive analytic; Threat/fraudster/vendor profiling; Recommendation Engine; Pattern detection; Rule/Scenario discovery - failure, fraud, optimization; Root cause discovery; Sentiment analysis; CRM analytic; Network analytic; Text Analytics; Technology assisted review; Fraud analytic; Real Time Analytic
  Day-3: Session-1: Real Time and Scalable Analytic Over Hadoop
    Why common analytic algorithms fail in Hadoop/HDFS; Apache Hama - for Bulk Synchronous distributed computing; Apache SPARK - for cluster computing for real time analytic; CMU Graphics Lab2 - Graph based asynchronous approach to distributed computing; KNN p-Algebra based approach from Treeminer for reduced hardware cost of operation
  Day-3: Session-2: Tools for eDiscovery and Forensics
    eDiscovery over Big Data vs. Legacy data - a comparison of cost and performance; Predictive coding and technology assisted review (TAR); Live demo of a TAR product (vMiner) to understand how TAR works for faster discovery; Faster indexing through HDFS - velocity of data; NLP or Natural Language Processing - various techniques and open source products; eDiscovery in foreign languages - technology for foreign language processing
  Day-3: Session-3: Big Data BI for Cyber Security - Understanding whole 360 degree views of speedy data collection to threat identification
    Understanding basics of security analytics - attack surface, security misconfiguration, host defenses; Network infrastructure/Large datapipe/Response ETL for real time analytic; Prescriptive vs predictive - Fixed rule based vs auto-discovery of threat rules from Meta data
  Day-3: Session-4: Big Data in USDA: Application in Agriculture
    Introduction to IoT (Internet of Things) for agriculture - sensor based Big Data and control; Introduction to Satellite imaging and its application in agriculture; Integrating sensor and image data for fertility of soil, cultivation recommendation and forecasting; Agriculture insurance and Big Data; Crop Loss forecasting
  Day-4: Session-1: Fraud prevention BI from Big Data in Govt - Fraud analytic
    Basic classification of Fraud analytics - rule based vs predictive analytics; Supervised vs unsupervised Machine learning for Fraud pattern detection; Vendor fraud/over charging for projects; Medicare and Medicaid fraud - fraud detection techniques for claim processing; Travel reimbursement frauds; IRS refund frauds
    Case studies and live demo will be given wherever data is available.
  Day-4: Session-2: Social Media Analytic - Intelligence gathering and analysis
    Big Data ETL API for extracting social media data; Text, image, meta data and video; Sentiment analysis from social media feed; Contextual and non-contextual filtering of social media feed; Social Media Dashboard to integrate diverse social media; Automated profiling of social media profile
    Live demo of each analytic will be given through Treeminer Tool.
  Day-4: Session-3: Big Data Analytic in image processing and video feeds
    Image Storage techniques in Big Data - Storage solution for data exceeding petabytes; LTFS and LTO; GPFS-LTFS (Layered storage solution for Big image data); Fundamentals of image analytics; Object recognition; Image segmentation; Motion tracking; 3-D image reconstruction
  Day-4: Session-4: Big Data applications in NIH
    Emerging areas of Bio-informatics; Meta-genomics and Big Data mining issues; Big Data Predictive analytic for Pharmacogenomics, Metabolomics and Proteomics; Big Data in downstream Genomics process; Application of Big Data predictive analytics in Public health
    Big Data Dashboard for quick accessibility of diverse data and display: Integration of existing application platform with Big Data Dashboard; Big Data management; Case Study of Big Data Dashboard: Tableau and Pentaho; Use Big Data app to push location based services in Govt.; Tracking system and management
  Day-5: Session-1: How to justify Big Data BI implementation within an organization
    Defining ROI for Big Data implementation; Case studies for saving Analyst Time for collection and preparation of Data - increase in productivity gain; Case studies of revenue gain from saving the licensed database cost; Revenue gain from location based services; Saving from fraud prevention; An integrated spreadsheet approach to calculate approx. expense vs. revenue gain/savings from Big Data implementation
  Day-5: Session-2: Step by Step procedure to replace legacy data system with Big Data System
    Understanding practical Big Data Migration Roadmap; What important information is needed before architecting a Big Data implementation; What are the different ways of calculating volume, velocity, variety and veracity of data; How to estimate data growth; Case studies
  Day-5: Session 4: Review of Big Data Vendors and review of their products. Q/A session
    Accenture; APTEAN (Formerly CDC Software); Cisco Systems; Cloudera; Dell; EMC; GoodData Corporation; Guavus; Hitachi Data Systems; Hortonworks; HP; IBM; Informatica; Intel; Jaspersoft; Microsoft; MongoDB (Formerly 10Gen); MU Sigma; Netapp; Opera Solutions; Oracle; Pentaho; Platfora; Qliktech; Quantum; Rackspace; Revolution Analytics; Salesforce; SAP; SAS Institute; Sisense; Software AG/Terracotta; Soft10 Automation; Splunk; Sqrrl; Supermicro; Tableau Software; Teradata; Think Big Analytics; Tidemark Systems; Treeminer; VMware (Part of EMC)
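Several of the outlines above refer to the MapReduce model (map, group by key, reduce). As a rough illustration only, the idea can be sketched in plain Python with a word count. This is a single-process sketch and not Hadoop code; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count`, and the sample documents, are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input, not taken from any course material.
counts = word_count(["big data", "data mining", "big data mining"])
```

In a real cluster the map and reduce steps would run on different machines and the grouping would be handled by the framework; here everything runs in one process purely to show the data flow.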

Discounted Courses

Course | Location | Course Date | Price [Remote/Classroom]
Java Spring | Kraków, ul. Rzemieślnicza 1 | Mon, 2016-08-29 09:00 | 7039PLN / 5245PLN
Programming in WPF 4.5 | Warszawa, ul. Złota 3/11 | Mon, 2016-09-05 09:00 | 2809PLN / 1805PLN
Java Spring | Szczecin, ul. Małopolska 23 | Mon, 2016-09-05 09:00 | 7039PLN / 5044PLN
Building Web Applications in PHP | Szczecin, ul. Małopolska 23 | Tue, 2016-09-06 09:00 | 2688PLN / 2081PLN
Building Web Apps using the MEAN stack | Szczecin, ul. Małopolska 23 | Mon, 2016-09-12 09:00 | 4388PLN / 3003PLN
Java Spring | Gdańsk, ul. Powstańców Warszawskich 45 | Mon, 2016-09-12 09:00 | 7039PLN / 5153PLN
Java Spring | Poznań, Garbary 100/63 | Mon, 2016-09-12 09:00 | 7039PLN / 4961PLN
MS Access - Intermediate Level | Bydgoszcz, ul. Dworcowa 94 | Tue, 2016-09-13 09:00 | 1218PLN / 910PLN
Java Spring | Warszawa, ul. Złota 3/11 | Mon, 2016-09-19 09:00 | 7039PLN / 4961PLN
Java Performance Tuning | Gdynia, ul. Ejsmonda 2 | Mon, 2016-09-19 09:00 | 4150PLN / 2866PLN
Java Spring | Wrocław, ul. Ludwika Rydygiera 2a/22 | Mon, 2016-09-19 09:00 | 7039PLN / 4961PLN
Oracle 11g - Programming in PL/SQL II | Wrocław, ul. Ludwika Rydygiera 2a/22 | Mon, 2016-09-26 09:00 | 2363PLN / 1785PLN
BPMN 2.0 for Business Analysts | Wrocław, ul. Ludwika Rydygiera 2a/22 | Tue, 2016-09-27 09:00 | 3110PLN / 2337PLN
ITIL® Foundation Certificate in IT Service Management | Warszawa, ul. Złota 3/11 | Mon, 2016-10-10 09:00 | 2639PLN / 2076PLN
Visual Basic for Applications (VBA) in Excel - Advanced Level | Wrocław, ul. Ludwika Rydygiera 2a/22 | Mon, 2016-10-10 09:00 | 1689PLN / 1296PLN
Market Forecasting | Poznań, Garbary 100/63 | Thu, 2016-10-13 09:00 | 2936PLN / 2112PLN
Microsoft Office Excel - Working Efficiently with Spreadsheets | Rzeszów, Plac Wolności 13 | Tue, 2016-10-18 09:00 | 918PLN / 843PLN
Implementing Effective Pricing Strategies | Poznań, Garbary 100/63 | Wed, 2016-10-26 09:00 | 1427PLN / 1093PLN
Agile Project Management with Scrum | Kraków, ul. Rzemieślnicza 1 | Wed, 2016-11-02 09:00 | 1746PLN / 1449PLN
Visual Basic for Applications (VBA) in Excel - Advanced Level | Białystok, ul. Malmeda 1 | Mon, 2016-11-14 09:00 | 1689PLN / 1413PLN
Graphic Techniques (Adobe Photoshop, Adobe Illustrator) | Wrocław, ul. Ludwika Rydygiera 2a/22 | Tue, 2016-12-06 09:00 | 1963PLN / 1470PLN

Upcoming Courses
