Big Data Business Intelligence for Govt. Agencies - Course Outline

Duration

35 hours (usually 5 days including breaks)

Requirements

  1. Basic knowledge of business operations and data systems in government, within the participant's own domain

  2. Basic understanding of SQL/Oracle or another relational database

  3. Basic understanding of statistics (at spreadsheet level)

Overview

Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset and that government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to build data-driven organizations that can successfully accomplish their missions, they are laying the groundwork to correlate dependencies across events, people, processes, and information.

High-value government solutions will be created from a mashup of the most disruptive technologies:

  • Mobile devices and applications

  • Cloud services

  • Social business technologies and networking

  • Big Data and analytics

IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data — related and unrelated, structured and unstructured.

But accomplishing these feats takes far more than simply accumulating massive quantities of data. “Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information,” Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog.

The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it.

The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge.

Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.)

Custom-developed Big Data tools also are allowing agencies to address the need to analyze their data. For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers.

Course Outline

Breakdown of topics by day (each session is 2 hours):

  1. Day-1: Session-1: Business Overview: Why Big Data Business Intelligence in Govt.

  • Case Studies from NIH, DoE

  • Big Data adoption rate in Govt. agencies and how they are aligning their future operations around Big Data predictive analytics

  • Broad Scale Application Area in DoD, NSA, IRS, USDA etc.

  • Interfacing Big Data with Legacy data

  • Basic understanding of enabling technologies in predictive analytics

  • Data Integration & Dashboard visualization

  • Fraud management

  • Business rule generation for fraud detection

  • Threat detection and profiling

  • Cost benefit analysis for Big Data implementation

  1. Day-1: Session-2: Introduction to Big Data-1

  • Main characteristics of Big Data: volume, variety, velocity and veracity. MPP architecture for volume.

  • Data Warehouses – static schema, slowly evolving dataset

  • MPP Databases like Greenplum, Exadata, Teradata, Netezza, Vertica etc.

  • Hadoop Based Solutions – no conditions on structure of dataset.

  • Typical pattern : HDFS, MapReduce (crunch), retrieve from HDFS

  • Batch – suited for analytical/non-interactive workloads

  • Velocity: CEP streaming data

  • Typical choices – CEP products (e.g. InfoSphere Streams, Apama, MarkLogic etc.)

  • Less production ready – Storm/S4

  • NoSQL Databases – (columnar and key-value): Best suited as analytical adjunct to data warehouse/database

  1. Day-1: Session-3: Introduction to Big Data-2

NoSQL solutions

    • KV Store - Keyspace, Flare, SchemaFree, RAMCloud, Oracle NoSQL Database (OnDB)

    • KV Store - Dynamo, Voldemort, Dynomite, SubRecord, MotionDb, DovetailDB

    • KV Store (Hierarchical) - GT.M, Caché

    • KV Store (Ordered) - TokyoTyrant, Lightcloud, NMDB, Luxio, MemcacheDB, Actord

    • KV Cache - Memcached, Repcached, Coherence, Infinispan, EXtremeScale, JBossCache, Velocity, Terracotta

    • Tuple Store - Gigaspaces, Coord, Apache River

    • Object Database - ZopeDB, db4o, Shoal

    • Document Store - CouchDB, Cloudant, Couchbase, MongoDB, Jackrabbit, XML-Databases, ThruDB, CloudKit, Persevere, Riak-Basho, Scalaris

    • Wide Columnar Store - BigTable, HBase, Apache Cassandra, Hypertable, KAI, OpenNeptune, Qbase, KDI

Varieties of Data: Introduction to Data Cleaning issue in Big Data

    • RDBMS – static structure/schema, does not promote an agile, exploratory environment.

    • NoSQL – semi-structured, with enough structure to store data without defining an exact schema beforehand

    • Data cleaning issues

  1. Day-1: Session-4: Introduction to Big Data-3: Hadoop

  • When to select Hadoop?

  • STRUCTURED - Enterprise data warehouses/databases can store massive data (at a cost) but impose structure (not good for active exploration)

  • SEMI STRUCTURED data – tough to do with traditional solutions (DW/DB)

  • Warehousing data = HUGE effort and static even after implementation

  • For variety & volume of data, crunched on commodity hardware – HADOOP

  • Commodity H/W needed to create a Hadoop Cluster

Introduction to MapReduce/HDFS

  • MapReduce – distribute computing over multiple servers

  • HDFS – make data available locally for the computing process (with redundancy)

  • Data – can be unstructured/schema-less (unlike RDBMS)

  • Developer responsibility to make sense of data

  • Programming MapReduce = working with Java (pros/cons), manually loading data into HDFS
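
The MapReduce pattern listed above can be illustrated with a minimal single-machine sketch in Python. It is not Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are hypothetical, and the grouping step that a real framework performs across servers is simulated here with a dictionary.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input; in Hadoop these lines would be read from HDFS blocks.
sample_lines = ["big data for government", "big data analytics"]
counts = word_count(sample_lines)
print(counts)
```

In a real cluster the map and reduce functions run on different servers next to the data, which is why each function here works only on its own arguments and shares no state.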

  1. Day-2: Session-1: Big Data Ecosystem – Building Big Data ETL: the universe of Big Data tools, which one to use and when?

  • Hadoop vs. Other NoSQL solutions

  • For interactive, random access to data

  • HBase (column oriented database) on top of Hadoop

  • Random access to data but restrictions imposed (max 1 PB)

  • Not good for ad-hoc analytics, good for logging, counting, time-series

  • Sqoop - Import from databases to Hive or HDFS (JDBC/ODBC access)

  • Flume – Stream data (e.g. log data) into HDFS

  1. Day-2: Session-2: Big Data Management System

  • Moving parts, compute nodes start/fail: ZooKeeper – for configuration/coordination/naming services

  • Complex pipeline/workflow: Oozie – manage workflow, dependencies, daisy chain

  • Deploy, configure, cluster management, upgrade etc. (sys admin): Ambari

  • In Cloud : Whirr

  1. Day-2: Session-3: Predictive analytics in Business Intelligence-1: Fundamental techniques & machine learning based BI

  • Introduction to Machine learning

  • Learning classification techniques

  • Bayesian prediction – preparing a training file

  • Support Vector Machine

  • KNN p-Tree Algebra & vertical mining

  • Neural Network

  • Big Data large variable problem – Random forest (RF)

  • Big Data automation problem – Multi-model ensemble RF

  • Automation through Soft10-M

  • Text analytic tool-Treeminer

  • Agile learning

  • Agent based learning

  • Distributed learning

  • Introduction to open source tools for predictive analytics: R, RapidMiner, Mahout
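
As a rough illustration of the classification techniques in this session, the sketch below implements a plain k-nearest-neighbours (KNN) classifier in pure Python. It is the textbook majority-vote version, not the p-Tree or vertical-mining variant named above. The training rows, feature meanings and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training file:
# (claim amount in $1000s, claims per year) -> label
training_rows = [
    ((1.0, 1.0), "normal"),
    ((1.5, 2.0), "normal"),
    ((2.0, 1.0), "normal"),
    ((9.0, 8.0), "suspicious"),
    ((8.5, 9.0), "suspicious"),
    ((9.5, 7.5), "suspicious"),
]

prediction_low = knn_predict(training_rows, (1.2, 1.5))
prediction_high = knn_predict(training_rows, (9.0, 9.0))
print(prediction_low, prediction_high)
```

Sorting every training row for each query is acceptable for a classroom-sized file. At Big Data scale this full scan becomes the bottleneck, which is one reason why distributed or index-based variants exist.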

  1. Day-2: Session-4: Predictive analytics eco-system-2: Common predictive analytic problems in Govt.

  • Insight analytic

  • Visualization analytic

  • Structured predictive analytic

  • Unstructured predictive analytic

  • Threat/fraudster/vendor profiling

  • Recommendation Engine

  • Pattern detection

  • Rule/Scenario discovery –failure, fraud, optimization

  • Root cause discovery

  • Sentiment analysis

  • CRM analytic

  • Network analytic

  • Text Analytics

  • Technology assisted review

  • Fraud analytic

  • Real Time Analytic

  1. Day-3: Session-1: Real Time and Scalable Analytic Over Hadoop

  • Why common analytic algorithms fail in Hadoop/HDFS

  • Apache Hama – for bulk synchronous distributed computing

  • Apache Spark – for cluster computing for real-time analytics

  • CMU GraphLab 2 – graph-based asynchronous approach to distributed computing

  • KNN p-Algebra based approach from Treeminer for reduced hardware cost of operation

  1. Day-3: Session-2: Tools for eDiscovery and Forensics

  • eDiscovery over Big Data vs. Legacy data – a comparison of cost and performance

  • Predictive coding and technology assisted review (TAR)

  • Live demo of a TAR product (vMiner) to understand how TAR works for faster discovery

  • Faster indexing through HDFS – velocity of data

  • NLP (Natural Language Processing) – various techniques and open source products

  • eDiscovery in foreign languages – technology for foreign language processing

  1. Day-3: Session-3: Big Data BI for Cyber Security – understanding the whole 360-degree view, from speedy data collection to threat identification

  • Understanding the basics of security analytics: attack surface, security misconfiguration, host defenses

  • Network infrastructure / large data pipe / response ETL for real-time analytics

  • Prescriptive vs predictive – fixed rule based vs auto-discovery of threat rules from metadata

  1. Day-3: Session-4: Big Data in USDA: Application in Agriculture

  • Introduction to IoT (Internet of Things) for agriculture – sensor-based Big Data and control

  • Introduction to satellite imaging and its application in agriculture

  • Integrating sensor and image data for soil fertility, cultivation recommendation and forecasting

  • Agriculture insurance and Big Data

  • Crop Loss forecasting

  1. Day-4: Session-1: Fraud prevention BI from Big Data in Govt. – Fraud analytics

  • Basic classification of fraud analytics – rule-based vs predictive analytics

  • Supervised vs unsupervised Machine learning for Fraud pattern detection

  • Vendor fraud/over charging for projects

  • Medicare and Medicaid fraud- fraud detection techniques for claim processing

  • Travel reimbursement frauds

  • IRS refund frauds

  • Case studies and live demo will be given wherever data is available.
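
The difference between rule-based and unsupervised fraud detection covered in this session can be sketched in a few lines of Python. The claim amounts, the fixed limit and the z-score threshold are hypothetical values chosen only for illustration. A simple z-score check stands in here for the more capable unsupervised methods discussed in class.

```python
from statistics import mean, stdev

# Hypothetical reimbursement amounts; one entry is clearly unusual.
claims = [120, 135, 110, 128, 140, 125, 900, 131]


def rule_based_flags(amounts, limit=500):
    """Fixed business rule: flag any amount above a preset limit."""
    return [amount for amount in amounts if amount > limit]


def zscore_flags(amounts, threshold=2.0):
    """Unsupervised check: flag amounts far from the mean in std-dev units."""
    mu, sigma = mean(amounts), stdev(amounts)
    return [amount for amount in amounts if abs(amount - mu) / sigma > threshold]


print(rule_based_flags(claims))
print(zscore_flags(claims))
```

The rule needs someone to choose the limit in advance, while the z-score check derives what counts as unusual from the data itself. That is the trade-off between fixed rules and discovered patterns.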

  1. Day-4: Session-2: Social Media Analytics – Intelligence gathering and analysis

  • Big Data ETL API for extracting social media data

  • Text, image, metadata and video

  • Sentiment analysis from social media feed

  • Contextual and non-contextual filtering of social media feed

  • Social Media Dashboard to integrate diverse social media

  • Automated profiling of social media profiles

  • Live demo of each analytic will be given through Treeminer Tool.
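
As a toy illustration of sentiment analysis on a social media feed, the sketch below scores text against two small word lists. The word lists and sample posts are hypothetical, and this lexicon approach is only an assumption for teaching purposes. It is not how the Treeminer tool works.

```python
# Hypothetical sentiment lexicons; real ones contain thousands of entries.
POSITIVE = {"good", "great", "helpful", "fast"}
NEGATIVE = {"bad", "slow", "broken", "unhelpful"}


def sentiment_score(text):
    """Return (# positive words) - (# negative words) for one post."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


posts = [
    "The new portal is fast and helpful!",
    "Service was slow and the form is broken.",
]
scores = [sentiment_score(post) for post in posts]
print(scores)
```

A score above zero suggests a positive post and below zero a negative one. Negation ("not helpful") and sarcasm defeat this approach, which motivates the contextual filtering listed above.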

  1. Day-4: Session-3: Big Data Analytics in image processing and video feeds

  • Image storage techniques in Big Data – storage solutions for data exceeding petabytes

  • LTFS and LTO

  • GPFS-LTFS (layered storage solution for Big image data)

  • Fundamentals of image analytics

  • Object recognition

  • Image segmentation

  • Motion tracking

  • 3-D image reconstruction

  1. Day-4: Session-4: Big Data applications in NIH:

  • Emerging areas of Bio-informatics

  • Meta-genomics and Big Data mining issues

  • Big Data Predictive analytic for Pharmacogenomics, Metabolomics and Proteomics

  • Big Data in downstream Genomics process

  • Application of Big data predictive analytics in Public health

  1. Big Data Dashboard for quick accessibility of diverse data and display:

  • Integration of existing application platform with Big Data Dashboard

  • Big Data management

  • Case Study of Big Data Dashboard: Tableau and Pentaho

  • Use Big Data app to push location based services in Govt.

  • Tracking system and management

  1. Day-5 : Session-1: How to justify Big Data BI implementation within an organization:

  • Defining ROI for Big Data implementation

  • Case studies on saving analyst time in collecting and preparing data – productivity gains

  • Case studies of revenue gain from saving the licensed database cost

  • Revenue gain from location based services

  • Saving from fraud prevention

  • An integrated spreadsheet approach to calculate approx. expense vs. Revenue gain/savings from Big Data implementation.
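
The spreadsheet-style comparison of expense against savings can be sketched as a short Python calculation. Every cost and savings figure below is a hypothetical placeholder, and the category names only echo the bullet points of this session.

```python
def simple_roi(annual_costs, annual_savings):
    """Return (net gain, ROI ratio) where ROI = net gain / total cost."""
    total_cost = sum(annual_costs.values())
    total_gain = sum(annual_savings.values())
    net_gain = total_gain - total_cost
    return net_gain, net_gain / total_cost


# Hypothetical yearly figures in dollars.
costs = {"hardware": 200_000, "licenses": 50_000, "staff_training": 50_000}
savings = {
    "analyst_time": 150_000,
    "database_licenses": 120_000,
    "fraud_prevention": 180_000,
}

net_gain, roi = simple_roi(costs, savings)
print(f"Net gain: {net_gain}, ROI: {roi:.0%}")
```

Keeping each line item as a named entry mirrors a spreadsheet row, so a single assumption can be changed to see how the result moves.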

  1. Day-5: Session-2: Step-by-step procedure to replace a legacy data system with a Big Data system:

  • Understanding a practical Big Data migration roadmap

  • What important information is needed before architecting a Big Data implementation

  • What are the different ways of calculating the volume, velocity, variety and veracity of data

  • How to estimate data growth

  • Case studies
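
One common way to estimate data growth is to assume a constant yearly growth rate and compound it. The sketch below does this in Python. The starting volume and growth rate are hypothetical, and the constant-rate assumption should be checked against an agency's actual history.

```python
def project_volume(current_tb, annual_growth_rate, years):
    """Return projected volume in TB for year 0 through `years`, compounding yearly."""
    return [current_tb * (1 + annual_growth_rate) ** year for year in range(years + 1)]


# Hypothetical: 100 TB today, growing 50% per year, projected 3 years ahead.
projection = project_volume(100.0, 0.5, 3)
for year, volume in enumerate(projection):
    print(f"Year {year}: {volume:.1f} TB")
```

The same list can feed a storage budget: multiply each yearly volume by the replication factor and the expected price per terabyte.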

  1. Day-5: Session-4: Review of Big Data vendors and their products. Q/A session:

  • Accenture

  • Aptean (formerly CDC Software)

  • Cisco Systems

  • Cloudera

  • Dell

  • EMC

  • GoodData Corporation

  • Guavus

  • Hitachi Data Systems

  • Hortonworks

  • HP

  • IBM

  • Informatica

  • Intel

  • Jaspersoft

  • Microsoft

  • MongoDB (formerly 10gen)

  • Mu Sigma

  • NetApp

  • Opera Solutions

  • Oracle

  • Pentaho

  • Platfora

  • QlikTech

  • Quantum

  • Rackspace

  • Revolution Analytics

  • Salesforce

  • SAP

  • SAS Institute

  • Sisense

  • Software AG/Terracotta

  • Soft10 Automation

  • Splunk

  • Sqrrl

  • Supermicro

  • Tableau Software

  • Teradata

  • Think Big Analytics

  • Tidemark Systems

  • Treeminer

  • VMware (Part of EMC) 

Bookings, Prices and Enquiries

Guaranteed training: we run the course even for a single participant!

Closed Training
Participants come from one organization only; external participants cannot join. The course program is usually tailored to the specific group, and the topics are agreed between the client and the trainer.

Remote Training
from 34000 PLN
The instructor and the participants are in different physical locations and communicate over the Internet.

The more participants you register, the greater the savings. The tables show the price per participant depending on the number of people registered and only illustrate sample prices. The current offer for this training may differ.

Number of participants    Remote Training
1                         34000 PLN
2                         18000 PLN
3                         12667 PLN
4                         10000 PLN

Open Training
from 18000 PLN
Participants from different companies attend the course. The course follows the outline published on our website.

Number of participants    Open Training
1                         18000 PLN
2                         10000 PLN
3                         7333 PLN
4                         6000 PLN

Upcoming Courses

Location                      Course Date              Price [Remote / Classroom]
Tarnów, ul. Kościuszki 10     Mon, 2017-11-06 09:00    34000 PLN / 18750 PLN
Szczecin, ul. Sienna 9        Mon, 2017-11-06 09:00    34000 PLN / 18250 PLN
Kielce, ul. Warszawska 19     Mon, 2017-11-06 09:00    34000 PLN / 18000 PLN
Gdynia, ul. Ejsmonda 2        Mon, 2017-11-06 09:00    34000 PLN / 18250 PLN
Bydgoszcz, ul. Dworcowa 94    Mon, 2017-11-06 09:00    34000 PLN / 18250 PLN
