Big Data Training


Big Data is a term used to describe solutions for storing and processing large data sets. Big Data solutions were pioneered by Google, although many open-source implementations are now available, such as Apache Hadoop, Cassandra and Cloudera Impala. According to reports published by Gartner, Big Data is the next big step in the IT industry after cloud computing, and it will remain a leading trend over the next few years.


Big Data Course Outlines

Code Name Duration Course description
datavault Data Vault: Building a Scalable Data Warehouse 28 hours Data vault modeling is a database modeling technique that provides long-term historical storage of data that originates from multiple sources. A data vault stores a single version of the facts, or "all the data, all of the time". Its flexible, scalable, consistent and adaptable design encompasses the best aspects of 3rd normal form (3NF) and the star schema. In this instructor-led, live training, participants will learn how to build a Data Vault. By the end of this training, participants will be able to: Understand the architecture and design concepts behind Data Vault 2.0, and its interaction with Big Data, NoSQL and AI. Use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse. Develop a consistent and repeatable ETL (Extract, Transform, Load) process. Build and deploy highly scalable and repeatable warehouses. Audience Data modelers Data warehousing specialists Business Intelligence specialists Data engineers Database administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
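A minimal sketch of the hub-and-satellite pattern at the heart of data vault modeling, using Python's built-in sqlite3 purely for illustration (table and column names are hypothetical; a production vault would also include link tables and hash-based keys):

    import sqlite3

    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Hub: holds the immutable business key plus load metadata only.
    cur.execute("""
        CREATE TABLE hub_customer (
            customer_hk   TEXT PRIMARY KEY,   -- surrogate/hash key
            customer_id   TEXT NOT NULL,      -- business key
            load_date     TEXT NOT NULL,
            record_source TEXT NOT NULL
        )""")

    # Satellite: descriptive attributes, historized by load date, so every
    # change is kept ("all the data, all of the time").
    cur.execute("""
        CREATE TABLE sat_customer_details (
            customer_hk TEXT NOT NULL REFERENCES hub_customer (customer_hk),
            load_date   TEXT NOT NULL,
            name        TEXT,
            city        TEXT,
            PRIMARY KEY (customer_hk, load_date)
        )""")

    cur.execute("INSERT INTO hub_customer VALUES ('h1', 'C-1001', '2018-01-01', 'crm')")
    cur.execute("INSERT INTO sat_customer_details VALUES ('h1', '2018-01-01', 'Alice', 'Warsaw')")
    cur.execute("INSERT INTO sat_customer_details VALUES ('h1', '2018-02-01', 'Alice', 'Krakow')")

    # The full history of the customer is preserved, never overwritten.
    for row in cur.execute("SELECT * FROM sat_customer_details ORDER BY load_date"):
        print(row)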
hadoopforprojectmgrs Hadoop for Project Managers 14 hours As more and more software and IT projects migrate from local processing and data management to distributed processing and big data storage, Project Managers are finding the need to upgrade their knowledge and skills to grasp the concepts and practices relevant to Big Data projects and opportunities. This course introduces Project Managers to the most popular Big Data processing framework: Hadoop. In this instructor-led training, participants will learn the core components of the Hadoop ecosystem and how these technologies can be used to solve large-scale problems. In learning these foundations, participants will also improve their ability to communicate with the developers and implementers of these systems as well as the data scientists and analysts that many IT projects involve. Audience Project Managers wishing to implement Hadoop into their existing development or IT infrastructure Project Managers needing to communicate with cross-functional teams that include big data engineers, data scientists and business analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
neo4j Beyond the relational database: neo4j 21 hours Relational, table-based databases such as Oracle and MySQL have long been the standard for organizing and storing data. However, the growing size and fluidity of data have made it difficult for these traditional systems to efficiently execute highly complex queries on the data. Imagine replacing rows-and-columns-based data storage with object-based data storage, whereby entities (e.g., a person) could be stored as data nodes, then easily queried on the basis of their vast, multi-linear relationships with other nodes. And imagine querying these connections and their associated objects and properties using a compact syntax, up to 20 times lighter than SQL. This is what graph databases, such as neo4j, offer. In this hands-on course, we will set up a live project and put into practice the skills to model, manage and access your data. We contrast and compare graph databases with SQL-based databases as well as other NoSQL databases and clarify when and where it makes sense to implement each within your infrastructure. Audience Database administrators (DBAs) Data analysts Developers System administrators DevOps engineers Business analysts CTOs CIOs Format of the course Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.
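As a taste of the compact graph query syntax described above, here is a small sketch using the official neo4j Python driver; the connection URI and credentials are placeholder assumptions for a local instance:

    from neo4j import GraphDatabase

    # Placeholder connection details; adjust to your own Neo4j instance.
    driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "secret"))

    with driver.session() as session:
        # Create two person nodes and a relationship between them.
        session.run(
            "MERGE (a:Person {name: $a}) "
            "MERGE (b:Person {name: $b}) "
            "MERGE (a)-[:KNOWS]->(b)",
            a="Alice", b="Bob")

        # Traverse the relationship; in SQL this would require a join table.
        result = session.run(
            "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name AS a, b.name AS b")
        for record in result:
            print(record["a"], "knows", record["b"])

    driver.close()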
iottech IoT (Internet of Things) - Technology Overview 7 hours The Internet of Things (IoT) is the concept of a connected network of objects (physical devices) - vehicles, buildings, mobile phones, etc. - which, through embedded electronics, software, sensors and network interfaces, can communicate with each other and exchange and collect data. IoT allows devices to be sensed and controlled remotely across existing network infrastructure. It creates the possibility of more direct integration of the physical world with computer systems, which can result in, for example, improved safety, optimized road traffic flow, smart homes and many other tangible business benefits. The training aims to give participants an understanding of IoT and its development trends, to show what data to collect and how to collect it, and to explain how that data can later be used.
hadoopadm Big Data Hadoop Administration Training 21 hours This training provides a thorough understanding of all the steps necessary to operate and maintain a Hadoop cluster. It covers everything from hardware specification, system installation and configuration through to load balancing, tuning, diagnostics and troubleshooting during deployment. The course is intended for administrators who will build and/or maintain a Hadoop cluster. Training materials: Student Guide and Lab Guide.
datameer Datameer for Data Analysts 14 hours Datameer is a business intelligence and analytics platform built on Hadoop. It allows end-users to access, explore and correlate large-scale, structured, semi-structured and unstructured data in an easy-to-use fashion. In this instructor-led, live training, participants will learn how to use Datameer to overcome Hadoop's steep learning curve as they step through the setup and analysis of a series of big data sources. By the end of this training, participants will be able to: Create, curate, and interactively explore an enterprise data lake Access business intelligence data warehouses, transactional databases and other analytic stores Use a spreadsheet user interface to design end-to-end data processing pipelines Access pre-built functions to explore complex data relationships Use drag-and-drop wizards to visualize data and create dashboards Use tables, charts, graphs, and maps to analyze query results Audience Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
cassdev1 Cassandra for Developers - Bespoke 21 hours This course will introduce Cassandra, a popular NoSQL database. It will cover Cassandra principles, architecture and data model. Students will learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. This session also discusses Cassandra internals and some admin topics. Duration: 3 days Audience: Developers
IntroToAvro Apache Avro: Data serialization for distributed applications 14 hours Apache Avro is a data serialization system that uses JSON-defined schemas to exchange data in a compact binary format between distributed applications. This course is intended for Developers Format of the course Lectures, hands-on practice, small tests along the way to gauge understanding
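To illustrate what Avro serialization looks like in practice, a minimal sketch using the third-party fastavro package (the schema, record values and file name are invented for the example):

    from fastavro import writer, reader, parse_schema

    # An Avro schema is itself declared as a JSON-like structure.
    schema = parse_schema({
        "name": "User",
        "type": "record",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "age", "type": "int"},
        ],
    })

    records = [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]

    # Serialize to a compact, schema-tagged binary container file...
    with open("users.avro", "wb") as out:
        writer(out, schema, records)

    # ...and read it back; the schema travels with the data.
    with open("users.avro", "rb") as fo:
        for record in reader(fo):
            print(record)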
bigdarch Big Data Architect 35 hours Day 1 - provides a high-level overview of essential Big Data topic areas. The module is divided into a series of sections, each of which is accompanied by a hands-on exercise. Day 2 - explores a range of topics that relate analysis practices and tools for Big Data environments. It does not get into implementation or programming details, but instead keeps coverage at a conceptual level, focusing on topics that enable participants to develop a comprehensive understanding of the common analysis functions and features offered by Big Data solutions. Day 3 - provides an overview of the fundamental and essential topic areas relating to Big Data solution platform architecture. It covers Big Data mechanisms required for the development of a Big Data solution platform and architectural options for assembling a data processing platform. Common scenarios are also presented to provide a basic understanding of how a Big Data solution platform is generally used. Day 4 - builds upon Day 3 by exploring advanced topics relating to Big Data solution platform architecture. In particular, different architectural layers that make up the Big Data solution platform are introduced and discussed, including data sources, data ingress, data storage, data processing and security. Day 5 - covers a number of exercises and problems designed to test the delegates' ability to apply knowledge of topics covered on Days 3 and 4.
apacheh Administrator Training for Apache Hadoop 35 hours The main goal of the training is to build advanced knowledge of administering Apache Hadoop in MapReduce and YARN environments. The course focuses primarily on the architecture of Hadoop, in particular the HDFS file system and the MapReduce and YARN programming models, as well as on planning, installing, configuring, administering, managing and monitoring a Hadoop cluster. Other Big Data topics such as HBase, Cassandra, Impala, Pig, Hive and Sqoop are also covered, though only briefly. The course is aimed mainly at IT professionals who want to prepare for and pass the CCAH exam (Cloudera Certified Administrator for Apache Hadoop).
vespa Vespa: Serving large-scale data in real-time 14 hours Vespa is an open-source big data processing and serving engine created by Yahoo. It is used to respond to user queries, make recommendations, and provide personalized content and advertisements in real-time. This instructor-led, live training introduces the challenges of serving large-scale data and walks participants through the creation of an application that can compute responses to user requests, over large datasets, in real-time. By the end of this training, participants will be able to: Use Vespa to quickly compute data (store, search, rank, organize) at serving time while a user waits Implement Vespa into existing applications involving feature search, recommendations, and personalization Integrate and deploy Vespa with existing big data systems such as Hadoop and Storm. Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
teraintro Teradata Fundamentals 21 hours Teradata is one of the most popular relational database management systems. It is mainly suitable for building large-scale data warehousing applications, which it achieves through the concept of parallelism. This course introduces delegates to Teradata.
dsbda Data Science for Big Data Analytics 35 hours Big data refers to data sets that are so voluminous and complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy.
osovv OpenStack Overview 7 hours The course is dedicated to IT engineers and architects who are looking for a solution to host a private or public IaaS (Infrastructure as a Service) cloud. It is also a great opportunity for IT managers to gain an overview of the possibilities that OpenStack can enable. Before you spend a lot of money on an OpenStack implementation, you can weigh all the pros and cons by attending our course. This topic is also available as individual consultancy. Course goal: gaining basic knowledge of OpenStack.
rneuralnet Neural Networks in R 14 hours This training is an introduction to implementing neural networks in real-world applications using the R-project software.
CCAH (Certified Administrator for Apache Hadoop) Exam Preparation 35 hours The course is intended for IT professionals working on solutions that require storing and processing large data sets in distributed systems. Course goals: gaining knowledge of Apache Hadoop administration and preparing for the CCAH exam (Cloudera Certified Administrator for Apache Hadoop).
apex Apache Apex: Processing big data-in-motion 21 hours Apache Apex is a YARN-native platform that unifies stream and batch processing. It processes big data-in-motion in a way that is scalable, performant, fault-tolerant, stateful, secure, distributed, and easily operable. This instructor-led, live training introduces Apache Apex's unified stream processing architecture and walks participants through the creation of a distributed application using Apex on Hadoop. By the end of this training, participants will be able to: Understand data processing pipeline concepts such as connectors for sources and sinks, common data transformations, etc. Build, scale and optimize an Apex application Process real-time data streams reliably and with minimum latency Use Apex Core and the Apex Malhar library to enable rapid application development Use the Apex API to write and re-use existing Java code Integrate Apex into other applications as a processing engine Tune, test and scale Apex applications Audience Developers Enterprise architects Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
ApHadm1 Apache Hadoop: Manipulation and Transformation of Data Performance 21 hours This course is intended for developers, architects, data scientists or any profile that requires access to data either intensively or on a regular basis. The major focus of the course is data manipulation and transformation. Among the tools in the Hadoop ecosystem, this course covers the use of Pig and Hive, both of which are heavily used for data transformation and manipulation. This training also addresses performance metrics and performance optimisation. The course is entirely hands-on and is punctuated by presentations of the theoretical aspects.
cassadmin Cassandra Administration 14 hours This course will introduce Cassandra, a popular NoSQL database. It will cover Cassandra principles, architecture and data model. Students will learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. This session also discusses Cassandra internals and some admin topics.
matlab2 MATLAB Fundamentals 21 hours This three-day course provides a comprehensive introduction to the MATLAB technical computing environment. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: working with the MATLAB user interface; entering commands and creating variables; analyzing vectors and matrices; visualizing vector and matrix data; working with data files; working with data types; automating commands with scripts; writing programs with logic and flow control; writing functions.
iotemi IoT (Internet of Things) for Entrepreneurs, Managers and Investors 21 hours Estimates of the Internet of Things (IoT) market value are massive, since by definition the IoT is an integrated and diffused layer of devices, sensors, and computing power that overlays entire consumer, business-to-business, and government industries. The IoT will account for an increasingly huge number of connections: 1.9 billion devices today, and 9 billion by 2018. That year, it will be roughly equal to the number of smartphones, smart TVs, tablets, wearable computers, and PCs combined. In the consumer space, many products and services have already crossed over into the IoT, including kitchen and home appliances, parking, RFID, lighting and heating products, and a number of applications in the Industrial Internet. The underlying technologies of IoT are nothing new, as M2M communication has existed since the birth of the Internet. What has changed in the last couple of years is the emergence of a number of inexpensive wireless technologies, coupled with the overwhelming adoption of smartphones and tablets in every home. The explosive growth of mobile devices led to the present demand for IoT. Owing to the unbounded opportunities in the IoT business, a large number of small and medium-sized entrepreneurs have jumped on the bandwagon of the IoT gold rush. Thanks to the emergence of open-source electronics and IoT platforms, the cost of developing an IoT system and managing its sizeable production is increasingly affordable. Existing electronic product owners are experiencing pressure to integrate their devices with the Internet or mobile apps. This training is intended as a technology and business review of an emerging industry, so that IoT enthusiasts and entrepreneurs can grasp the basics of IoT technology and business. Course objectives The main objective of the course is to introduce emerging technological options, platforms and case studies of IoT implementation in home and city automation (smart homes and cities), the Industrial Internet, healthcare, government, mobile cellular and other areas. Basic introduction to all the elements of IoT: mechanical and electronics/sensor platforms, wireless and wireline protocols, mobile-to-electronics integration, mobile-to-enterprise integration, data analytics and the total control plane. M2M wireless protocols for IoT - WiFi, Zigbee/Z-Wave, Bluetooth, ANT+: when and where to use which one? Mobile/desktop/web apps for registration, data acquisition and control; available M2M data acquisition platforms for IoT - Xively, Omega, NovoTech, etc. Security issues and security solutions for IoT. Open-source/commercial electronics platforms for IoT - Raspberry Pi, Arduino, ARM mbed LPC, etc. Open-source/commercial enterprise cloud platforms for IoT - Ayla, iO Bridge, Libelium, Axeda, Cisco fog cloud. Studies of the business and technology of some common IoT devices, such as home automation, smoke alarms, vehicles, military and home health applications. Target Audience Investors and IoT entrepreneurs Managers and engineers whose company is venturing into the IoT space Business analysts and investors
aifortelecom AI Awareness for Telecom 14 hours AI is a collection of technologies for building intelligent systems capable of understanding data and the activities surrounding the data to make "intelligent decisions". For Telecom providers, building applications and services that make use of AI could open the door for improved operations and servicing in areas such as maintenance and network optimization. In this course we examine the various technologies that make up AI and the skill sets required to put them to use. Throughout the course, we examine AI's specific applications within the Telecom industry. Audience Network engineers Network operations personnel Telecom technical managers Format of the course Part lecture, part discussion, hands-on exercises
alluxio Alluxio: Unifying disparate storage systems 7 hours Alluxio is an open-source virtual distributed storage system that unifies disparate storage systems and enables applications to interact with data at memory speed. It is used by companies such as Intel, Baidu and Alibaba. In this instructor-led, live training, participants will learn how to use Alluxio to bridge different computation frameworks with storage systems and efficiently manage multi-petabyte scale data as they step through the creation of an application with Alluxio. By the end of this training, participants will be able to: Develop an application with Alluxio Connect big data systems and applications while preserving one namespace Efficiently extract value from big data in any storage format Improve workload performance Deploy and manage Alluxio standalone or clustered Audience Data scientists Developers System administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
BigData_ A practical introduction to Data Analysis and Big Data 35 hours Participants who complete this training will gain a practical, real-world understanding of Big Data and its related technologies, methodologies and tools. Participants will have the opportunity to put this knowledge into practice through hands-on exercises. Group interaction and instructor feedback make up an important component of the class. The course starts with an introduction to elemental concepts of Big Data, then progresses into the programming languages and methodologies used to perform Data Analysis. Finally, we discuss the tools and infrastructure that enable Big Data storage, distributed processing, and scalability. Audience Developers / programmers IT consultants Format of the course Part lecture, part discussion, hands-on practice and implementation, occasional quizzing to measure progress.
SolrAdm Apache Solr Administration Essentials 7 hours This training is aimed primarily at system administrators and IT professionals interested in how a Solr / SolrCloud cluster works and how to maintain and manage it. The one-day workshop focuses on administrative operations, and the exercises have been prepared with this focus in mind. Nevertheless, it contains all the essential information about Apache Solr needed to understand how the technology works and what it is for. After the training, participants will have a general knowledge of Apache Solr and, above all, will know: the basics of indexing, processing and searching documents; scaling and performance issues; the principles of maintaining an Apache Solr installation; and advanced configuration of the search engine.
datashrinkgov Data Shrinkage for Government 14 hours
bdbiga Big Data Business Intelligence for Govt. Agencies 35 hours Advances in technologies and the increasing amount of information are transforming how business is conducted in many industries, including government. Government data generation and digital archiving rates are on the rise due to the rapid growth of mobile devices and applications, smart sensors and devices, cloud computing solutions, and citizen-facing portals. As digital information expands and becomes more complex, information management, processing, storage, security, and disposition become more complex as well. New capture, search, discovery, and analysis tools are helping organizations gain insights from their unstructured data. The government market is at a tipping point, realizing that information is a strategic asset, and government needs to protect, leverage, and analyze both structured and unstructured information to better serve and meet mission requirements. As government leaders strive to evolve data-driven organizations to successfully accomplish mission, they are laying the groundwork to correlate dependencies across events, people, processes, and information. High-value government solutions will be created from a mashup of the most disruptive technologies: Mobile devices and applications Cloud services Social business technologies and networking Big Data and analytics IDC predicts that by 2020, the IT industry will reach $5 trillion, approximately $1.7 trillion larger than today, and that 80% of the industry's growth will be driven by these 3rd Platform technologies. In the long term, these technologies will be key tools for dealing with the complexity of increased digital information. Big Data is one of the intelligent industry solutions and allows government to make better decisions by taking action based on patterns revealed by analyzing large volumes of data, related and unrelated, structured and unstructured. But accomplishing these feats takes far more than simply accumulating massive quantities of data. "Making sense of these volumes of Big Data requires cutting-edge tools and technologies that can analyze and extract useful knowledge from vast and diverse streams of information," Tom Kalil and Fen Zhao of the White House Office of Science and Technology Policy wrote in a post on the OSTP Blog. The White House took a step toward helping agencies find these technologies when it established the National Big Data Research and Development Initiative in 2012. The initiative included more than $200 million to make the most of the explosion of Big Data and the tools needed to analyze it. The challenges that Big Data poses are nearly as daunting as its promise is encouraging. Storing data efficiently is one of these challenges. As always, budgets are tight, so agencies must minimize the per-megabyte price of storage and keep the data within easy access so that users can get it when they want it and how they need it. Backing up massive quantities of data heightens the challenge. Analyzing the data effectively is another major challenge. Many agencies employ commercial tools that enable them to sift through the mountains of data, spotting trends that can help them operate more efficiently. (A recent study by MeriTalk found that federal IT executives think Big Data could help agencies save more than $500 billion while also fulfilling mission objectives.) Custom-developed Big Data tools also are allowing agencies to address the need to analyze their data.
For example, the Oak Ridge National Laboratory’s Computational Data Analytics Group has made its Piranha data analytics system available to other agencies. The system has helped medical researchers find a link that can alert doctors to aortic aneurysms before they strike. It’s also used for more mundane tasks, such as sifting through résumés to connect job candidates with hiring managers.
apachedrill Apache Drill for On-the-Fly Analysis of Multiple Big Data Formats 21 hours Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL and other Cloud and file storage systems. Apache Drill's power lies in its ability to join data from multiple data stores using a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. In this instructor-led, live training, participants will learn the fundamentals of Apache Drill, then leverage the power and convenience of SQL to interactively query big data without writing code. Participants will also learn how to optimize their Drill queries for distributed SQL execution. By the end of this training, participants will be able to: Perform "self-service" exploration on structured and semi-structured data on Hadoop Query known as well as unknown data using SQL queries Understand how Apache Drill receives and executes queries Write SQL queries to analyze different types of data, including structured data in Hive, semi-structured data in HBase or MapR-DB tables, and data saved in files such as Parquet and JSON. Use Apache Drill to perform on-the-fly schema discovery, bypassing the need for complex ETL and schema operations Integrate Apache Drill with BI (Business Intelligence) tools such as Tableau, Qlikview, MicroStrategy and Excel Audience Data analysts Data scientists SQL programmers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
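For a flavor of "querying big data without writing code", the hedged sketch below submits SQL to Drill's REST API over HTTP (8047 is the common default port for a local Drill instance; the JSON file path is a placeholder):

    import requests

    # Query a raw JSON file directly; Drill discovers the schema on the fly.
    payload = {
        "queryType": "SQL",
        "query": "SELECT t.name, t.age FROM dfs.`/tmp/users.json` t LIMIT 10",
    }

    resp = requests.post("http://localhost:8047/query.json", json=payload)
    resp.raise_for_status()

    # The response carries the result set as a list of row dictionaries.
    for row in resp.json().get("rows", []):
        print(row)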
flink Flink for scalable stream and batch data processing 28 hours Apache Flink is an open-source framework for scalable stream and batch data processing. This instructor-led, live training introduces the principles and approaches behind distributed stream and batch data processing, and walks participants through the creation of a real-time, data streaming application. By the end of this training, participants will be able to: Set up an environment for developing data analysis applications Package, execute, and monitor Flink-based, fault-tolerant, data streaming applications Manage diverse workloads Perform advanced analytics using Flink ML Set up a multi-node Flink cluster Measure and optimize performance Integrate Flink with different Big Data systems Compare Flink capabilities with those of other big data processing frameworks Audience Developers Architects Data engineers Analytics professionals Technical managers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
voldemort Voldemort: Setting up a key-value distributed data store 14 hours Voldemort is an open-source distributed data store that is designed as a key-value store. It is used at LinkedIn by numerous critical services powering a large portion of the site. This course will introduce the architecture and capabilities of Voldemort and walk participants through the setup and application of a key-value distributed data store. Audience Software developers System administrators DevOps engineers Format of the course Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
bdhat Big Data Hadoop Analyst Training 28 hours Big Data Analyst Training is a practical course recommended for anyone who wants to become a Data Scientist. The course focuses on the skills a modern analyst needs to work with Big Data technology. It presents tools that provide access to, and allow the modification, transformation and analysis of, complex data structures stored in a Hadoop cluster. The course covers the Hadoop ecosystem (Pig, Hive, Impala, ELK and others), including: the functionality of Pig, Hive, Impala and ELK for collecting data, saving results and performing analytics; how Pig, Hive and Impala can improve the performance of typical, everyday analytical tasks; running real-time, interactive analyses of huge data sets to extract valuable business insights and interpret the results; and executing complex queries over very large data volumes.
wdneo4j Introduction to Neo4j, a Graph Database 7 hours
bdbitcsp Big Data Business Intelligence for Telecom & Communication Service Providers 35 hours Overview Communications service providers (CSPs) are facing pressure to reduce costs and maximize average revenue per user (ARPU) while ensuring an excellent customer experience, but data volumes keep growing. Global mobile data traffic will grow at a compound annual growth rate (CAGR) of 78 percent to 2016, reaching 10.8 exabytes per month. Meanwhile, CSPs are generating large volumes of data, including call detail records (CDR), network data and customer data. Companies that fully exploit this data gain a competitive edge. According to a recent survey by The Economist Intelligence Unit, companies that use data-directed decision-making enjoy a 5-6% boost in productivity. Yet 53% of companies leverage only half of their valuable data, and one-fourth of respondents noted that vast quantities of useful data go untapped. The data volumes are so high that manual analysis is impossible, and most legacy software systems can't keep up, resulting in valuable data being discarded or ignored. With Big Data & Analytics' high-speed, scalable big data software, CSPs can mine all their data for better decision making in less time. Different Big Data products and techniques provide an end-to-end software platform for collecting, preparing, analyzing and presenting insights from big data. Application areas include network performance monitoring, fraud detection, customer churn detection and credit risk analysis. Big Data & Analytics products scale to handle terabytes of data, but implementing such tools requires a new kind of cloud-based database system, such as Hadoop, or massively parallel computing processors (KPU, etc.). This course on Big Data BI for Telco covers all the emerging new areas in which CSPs are investing for productivity gains and to open up new business revenue streams. The course provides a complete 360-degree overview of Big Data BI in Telco, so that decision makers and managers can have a very wide and comprehensive view of the possibilities of Big Data BI in Telco for productivity and revenue gain. Course objectives The main objective of the course is to introduce new Big Data business intelligence techniques in four sectors of the Telecom business (Marketing/Sales, Network Operation, Financial Operation and Customer Relation Management).
Students will be introduced to the following: Introduction to Big Data - what the 4 Vs (volume, velocity, variety and veracity) are in Big Data; generation, extraction and management from a Telco perspective How Big Data analytics differs from legacy data analytics In-house justification of Big Data - a Telco perspective Introduction to the Hadoop ecosystem - familiarity with all Hadoop tools such as Hive, Pig and Spark; when and how they are used to solve Big Data problems How Big Data is extracted for analysis with analytics tools - how business analysts can reduce their pain points in the collection and analysis of data through an integrated Hadoop dashboard approach Basic introduction to insight analytics, visualization analytics and predictive analytics for Telco Customer churn analytics and Big Data - how Big Data analytics can reduce customer churn and customer dissatisfaction in Telco; case studies Network failure and service failure analytics from network metadata and IPDR Financial analysis - fraud, wastage and ROI estimation from sales and operational data Customer acquisition problems - target marketing, customer segmentation and cross-selling from sales data Introduction and summary of all Big Data analytics products and where they fit into the Telco analytics space Conclusion - how to take a step-by-step approach to introducing Big Data Business Intelligence in your organization Target Audience Network operation, financial managers, CRM managers and top IT managers in the Telco CIO office Business analysts in Telco CFO office managers/analysts Operational managers QA managers
BDATR Big Data Analytics for Telecom Regulators 16 hours To meet the compliance requirements of regulators, CSPs (Communication Service Providers) can tap into Big Data analytics, which not only helps them meet compliance but, within the scope of the same project, also lets them increase customer satisfaction and thus reduce churn. In fact, since compliance is related to the quality of service tied to a contract, any initiative toward meeting compliance will improve the "competitive edge" of CSPs. It is therefore important that regulators be able to advise on and guide a set of Big Data analytics practices for CSPs that will be of mutual benefit to regulators and CSPs. The course comprises 2 days: 8 modules, 2 hours each = 16 hours
samza Samza for stream processing 14 hours Apache Samza is an open-source, near-realtime, asynchronous computational framework for stream processing. It uses Apache Kafka for messaging, and Apache Hadoop YARN for fault tolerance, processor isolation, security, and resource management. This instructor-led, live training introduces the principles behind messaging systems and distributed stream processing, while walking participants through the creation of a sample Samza-based project and job execution. By the end of this training, participants will be able to: Use Samza to simplify the code needed to produce and consume messages Decouple the handling of messages from an application Use Samza to implement near-realtime asynchronous computation Use stream processing to provide a higher level of abstraction over messaging systems Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
druid Druid: Build a fast, real-time data analysis system 21 hours Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive, analytic dashboards for end-users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, Paypal, and Yahoo. In this course we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment. Audience Application developers Software engineers Technical consultants DevOps professionals Architecture engineers Format of the course Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
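For a sense of the low-latency OLAP queries mentioned above, here is a hedged sketch posting a native JSON timeseries query to a Druid broker; the datasource name, interval and aggregator are assumptions, and 8082 is the usual broker default port:

    import requests

    # A native Druid timeseries query: hourly event counts over one day.
    query = {
        "queryType": "timeseries",
        "dataSource": "pageviews",            # hypothetical datasource
        "granularity": "hour",
        "intervals": ["2018-01-01/2018-01-02"],
        "aggregations": [
            {"type": "longSum", "name": "views", "fieldName": "count"}
        ],
    }

    resp = requests.post("http://localhost:8082/druid/v2/", json=query)
    resp.raise_for_status()

    # Each result bucket pairs a timestamp with its aggregated values.
    for bucket in resp.json():
        print(bucket["timestamp"], bucket["result"]["views"])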
hbasedev HBase for Developers 21 hours This course introduces HBase, a NoSQL store on top of Hadoop. The course is intended for developers who will be using HBase to develop applications, and administrators who will manage HBase clusters. We will walk a developer through HBase architecture, data modelling and application development on HBase. It will also discuss using MapReduce with HBase, and some administration topics related to performance optimization. The course is very hands-on with lots of lab exercises. Duration: 3 days Audience: Developers & Administrators
smtwebint Semantic Web Overview 7 hours The Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) that promotes common formats for data on the World Wide Web. The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
mdlmrah The MapReduce Model in the Apache Hadoop Implementation 14 hours This training is aimed at organizations that want to implement solutions for processing large data sets on clusters.
ApacheIgnite Apache Ignite: Improve speed, scale and availability with in-memory computing 14 hours Apache Ignite is an in-memory computing platform that sits between the application and data layer to improve speed, scale and availability. In this instructor-led, live training, participants will learn the principles behind persistent and pure in-memory storage as they step through the creation of a sample in-memory computing project. By the end of this training, participants will be able to: Use Ignite for in-memory, on-disk persistence as well as a purely distributed in-memory database Achieve persistence without syncing data back to a relational database Use Ignite to carry out SQL and distributed joins Improve performance by moving data closer to the CPU, using RAM as storage Spread data sets across a cluster to achieve horizontal scalability Integrate Ignite with RDBMS, NoSQL, Hadoop and machine learning processors Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
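A minimal sketch of Ignite used as a distributed in-memory key-value store, via the pyignite thin client; the host, port and cache name are placeholders (10800 is the thin-client default), and this assumes an Ignite node is already running:

    from pyignite import Client

    client = Client()
    client.connect("127.0.0.1", 10800)   # default thin-client port

    # Caches live in RAM across the cluster; reads avoid a trip to disk.
    cache = client.get_or_create_cache("quotes")
    cache.put("AAPL", 167.78)
    cache.put("GOOG", 1031.79)

    print(cache.get("AAPL"))

    client.close()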
zeppelin Zeppelin for interactive data analytics 14 hours Apache Zeppelin is a web-based notebook for capturing, exploring, visualizing and sharing Hadoop- and Spark-based data. This instructor-led, live training introduces the concepts behind interactive data analytics and walks participants through the deployment and usage of Zeppelin in a single-user or multi-user environment. By the end of this training, participants will be able to: Install and configure Zeppelin Develop, organize, execute and share data in a browser-based interface Visualize results without referring to the command line or cluster details Execute and collaborate on long workflows Work with any of a number of plug-in language/data-processing backends, such as Scala (with Apache Spark), Python (with Apache Spark), Spark SQL, JDBC, Markdown and Shell Integrate Zeppelin with Spark, Flink and MapReduce Secure multi-user instances of Zeppelin with Apache Shiro Audience Data engineers Data analysts Data scientists Software developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
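A rough sketch of the kind of code a Zeppelin paragraph holds. Inside the notebook, a paragraph starting with the %pyspark directive gets a ready-made SparkSession named spark and a display helper named z; here the session is created explicitly so the snippet also runs standalone (the sample data is invented):

    # In Zeppelin, a %pyspark paragraph provides `spark` and `z` automatically.
    # Outside the notebook we create the session explicitly so the sketch runs:
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("zeppelin-style-demo").getOrCreate()

    df = spark.createDataFrame(
        [("INFO", "started"), ("ERROR", "disk full"), ("INFO", "stopped")],
        ["level", "message"])

    summary = df.groupBy("level").count()
    summary.show()        # in Zeppelin, z.show(summary) renders charts instead

    spark.stop()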
accumulo Apache Accumulo: Building highly scalable big data applications 21 hours Apache Accumulo is a sorted, distributed key/value store that provides robust, scalable data storage and retrieval. It is based on the design of Google's BigTable and is powered by Apache Hadoop, Apache Zookeeper, and Apache Thrift. This course covers the working principles behind Accumulo and walks participants through the development of a sample application on Apache Accumulo. Audience Application developers Software engineers Technical consultants Format of the course Part lecture, part discussion, hands-on development and implementation, occasional tests to gauge understanding
sparkdev Spark for Developers 21 hours OBJECTIVE: This course will introduce Apache Spark. The students will learn how Spark fits into the Big Data ecosystem, and how to use Spark for data analysis. The course covers the Spark shell for interactive data analysis, Spark internals, Spark APIs, Spark SQL, Spark streaming, machine learning and GraphX. AUDIENCE: Developers / Data Analysts
d2dbdpa From Data to Decision with Big Data and Predictive Analytics 21 hours Audience If you try to make sense of the data you have access to, or want to analyse unstructured data available on the net (such as Twitter, LinkedIn, etc.), this course is for you. It is mostly aimed at decision makers and people who need to choose what data is worth collecting and what is worth analyzing. It is not aimed at people configuring the solution; those people will benefit from the big picture, though. Delivery Mode During the course delegates will be presented with working examples of mostly open source technologies. Short lectures will be followed by presentation and simple exercises by the participants. Content and Software used All software used is updated each time the course is run, so we check the newest versions possible. It covers the process of obtaining, formatting, processing and analysing the data, and explains how to automate the decision-making process with machine learning.
TalendDI Talend Open Studio for Data Integration 28 hours Talend Open Studio for Data Integration is an open-source data integration product used to combine, convert and update data in various locations across a business. In this instructor-led, live training, participants will learn how to use the Talend ETL tool to carry out data transformation, data extraction, and connectivity with Hadoop, Hive, and Pig. By the end of this training, participants will be able to: Explain the concepts behind ETL (Extract, Transform, Load) and propagation Define ETL methods and ETL tools to connect with Hadoop Efficiently amass, retrieve, digest, consume, transform and shape big data in accordance with business requirements Upload to and extract large records from Hadoop, Hive, and NoSQL databases Audience Business intelligence professionals Project managers Database professionals SQL Developers ETL Developers Solution architects Data architects Data warehousing professionals System administrators and integrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
sparkpython Python and Spark for Big Data (PySpark) 21 hours Python is a high-level programming language famous for its clear syntax and code readability. Spark is a data processing engine used in querying, analyzing, and transforming big data. PySpark allows users to interface Spark with Python. In this instructor-led, live training, participants will learn how to use Python and Spark together to analyze big data as they work on hands-on exercises. By the end of this training, participants will be able to: Learn how to use Spark with Python to analyze Big Data Work on exercises that mimic real-world circumstances Use different tools and techniques for big data analysis using PySpark Audience Developers IT Professionals Data Scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
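A small, self-contained sketch of the Python-plus-Spark workflow the course teaches; the column names and sample values are invented, and in a real exercise the data would come from a large CSV or Parquet dataset on HDFS:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("pyspark-demo").getOrCreate()

    # Tiny in-memory stand-in for a big dataset.
    sales = spark.createDataFrame(
        [("north", 120.0), ("south", 80.0), ("north", 200.0)],
        ["region", "amount"])

    # Transform and aggregate with the DataFrame API; Spark distributes the work.
    totals = (sales
              .groupBy("region")
              .agg(F.sum("amount").alias("total"))
              .orderBy(F.desc("total")))

    totals.show()
    spark.stop()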
magellan Magellan: Geospatial Analytics on Spark 14 hours Magellan is an open-source distributed execution engine for geospatial analytics on big data. Implemented on top of Apache Spark, it extends Spark SQL and provides a relational abstraction for geospatial analytics. This instructor-led, live training introduces the concepts and approaches for implementing geospatial analytics and walks participants through the creation of a predictive analysis application using Magellan on Spark. By the end of this training, participants will be able to: Efficiently query, parse and join geospatial datasets at scale Implement geospatial data in business intelligence and predictive analytics applications Use spatial context to extend the capabilities of mobile devices, sensors, logs, and wearables Audience Application developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
altdomexp Analytics Domain Expertise 7 hours This course is part of the Data Scientist skill set (Domain: Analytics Domain Expertise).
solrdev Solr for Developers 21 hours This course introduces students to the Solr platform. Through a combination of lecture, discussion and labs, students will gain hands-on experience configuring effective search and indexing. The class begins with basic Solr installation and configuration, then teaches the attendees the search features of Solr. Students will gain experience with faceting, indexing and search relevance, among other features central to the Solr platform. The course wraps up with a number of advanced topics including spell checking, suggestions, Multicore and SolrCloud. Duration: 3 days Audience: Developers, business users, administrators
pjfs Programming in F# 7 hours This training is aimed at developers, analysts and anyone who wants to learn the basics and capabilities of the F# language on the .NET platform.
nifi Apache NiFi for Administrators 21 hours Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a web-based user interface to manage dataflows in real time. In this instructor-led, live training, participants will learn how to deploy and manage Apache NiFi in a live lab environment. By the end of this training, participants will be able to: Install and configure Apache NiFi Source, transform and manage data from disparate, distributed data sources, including databases and big data lakes Automate dataflows Enable streaming analytics Apply various approaches for data ingestion Transform big data into business insights Audience System administrators Data engineers Developers DevOps Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
bigdatabicriminal Big Data Business Intelligence for Criminal Intelligence Analysis 35 hours Advances in technologies and the increasing amount of information are transforming how law enforcement is conducted. The challenges that Big Data poses are nearly as daunting as Big Data's promise. Storing data efficiently is one of these challenges; effectively analyzing it is another. In this instructor-led, live training, participants will learn the mindset with which to approach Big Data technologies, assess their impact on existing processes and policies, and implement these technologies for the purpose of identifying criminal activity and preventing crime. Case studies from law enforcement organizations around the world will be examined to gain insights on their adoption approaches, challenges and results. By the end of this training, participants will be able to: Combine Big Data technology with traditional data gathering processes to piece together a story during an investigation Implement industrial big data storage and processing solutions for data analysis Prepare a proposal for the adoption of the most adequate tools and processes for enabling a data-driven approach to criminal investigation Audience Law Enforcement specialists with a technical background Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
hdp Hortonworks Data Platform (HDP) for administrators 21 hours Hortonworks Data Platform is an open-source Apache Hadoop support platform that provides a stable foundation for developing big data solutions on the Apache Hadoop ecosystem. This instructor-led, live training introduces Hortonworks and walks participants through the deployment of a Spark + Hadoop solution. By the end of this training, participants will be able to: Use Hortonworks to reliably run Hadoop at a large scale Unify Hadoop's security, governance, and operations capabilities with Spark's agile analytic workflows Use Hortonworks to investigate, validate, certify and support each of the components in a Spark project Process different types of data, including structured, unstructured, in-motion, and at-rest Audience Hadoop administrators Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
dmmlr Data Mining & Machine Learning with R 14 hours R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining.
hadoopadm1 Hadoop For Administrators 21 hours Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. In this three-day (optionally four-day) course, attendees will learn about the business benefits and use cases for Hadoop and its ecosystem, how to plan cluster deployment and growth, and how to install, maintain, monitor, troubleshoot and optimize Hadoop. They will also practice cluster bulk data loads, get familiar with various Hadoop distributions, and practice installing and managing Hadoop ecosystem tools. The course finishes off with a discussion of securing the cluster with Kerberos. "…The materials were very well prepared and covered thoroughly. The Lab was very helpful and well organized" — Andrew Nguyen, Principal Integration DW Engineer, Microsoft Online Advertising Audience Hadoop administrators Format Lectures and hands-on labs, approximate balance 60% lectures, 40% labs.
datamin Data Mining 21 hours The course can be delivered with any tools, including free open-source data mining software and applications.
nifidev Apache NiFi for Developers 7 hours Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the moving, tracking and automation of data between systems. It is written using flow-based programming and provides a web-based user interface to manage dataflows in real time. In this instructor-led, live training, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi. By the end of this training, participants will be able to: Understand NiFi's architecture and dataflow concepts Develop extensions using NiFi and third-party APIs Develop their own custom Apache NiFi processors Ingest and process real-time data from disparate and uncommon file formats and data sources Audience Developers Data engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
graphcomputing Introduction to Graph Computing 28 hours A large number of real-world problems can be described in terms of graphs. For example, the Web graph, the social network graph, the train network graph and the language graph. These graphs tend to be extremely large; processing them requires a specialized set of tools and mindset referred to as graph computing. In this instructor-led, live training, participants will learn about the various technology offerings and implementations for processing graph data. The aim is to identify real-world objects, their characteristics and relationships, then model these relationships and process them as data using graph computing approaches. We start with a broad overview and narrow in on specific tools as we step through a series of case studies, hands-on exercises and live deployments. By the end of this training, participants will be able to: Understand how graph data is persisted and traversed Select the best framework for a given task (from graph databases to batch processing frameworks) Implement Hadoop, Spark, GraphX and Pregel to carry out graph computing across many machines in parallel View real-world big data problems in terms of graphs, processes and traversals Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
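Before reaching for a distributed framework, the underlying idea can be shown in plain Python: model entities and relationships as an adjacency structure, then answer questions by traversal. The tiny social graph here is invented; frameworks such as Pregel and GraphX parallelize exactly this kind of traversal across many machines:

    from collections import deque

    # A graph as an adjacency mapping: who follows whom.
    follows = {
        "alice": ["bob", "carol"],
        "bob":   ["carol"],
        "carol": ["dave"],
        "dave":  [],
    }

    def reachable(graph, start):
        """Breadth-first traversal: everyone reachable from `start`."""
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        return seen - {start}

    print(reachable(follows, "alice"))   # {'bob', 'carol', 'dave'}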
PentahoDI Pentaho Data Integration Fundamentals 21 hours Pentaho Data Integration is an open-source data integration tool for defining jobs and data transformations. In this instructor-led, live training, participants will learn how to use Pentaho Data Integration's powerful ETL capabilities and rich GUI to manage an entire big data lifecycle, maximizing the value of data to the organization. By the end of this training, participants will be able to: Create, preview, and run basic data transformations containing steps and hops Configure and secure the Pentaho Enterprise Repository Harness disparate sources of data and generate a single, unified version of the truth in an analytics-ready format Provide results to third-party applications for further processing Audience Data analysts ETL developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
rprogda R Programming for Data Analysis 14 hours This course is part of the Data Scientist skill set (Domain: Data and Technology).
dataar Data Analytics With R 21 hours R is a very popular, open source environment for statistical computing, data analytics and graphics. This course introduces the R programming language to students. It covers language fundamentals, libraries and advanced concepts, as well as advanced data analytics and graphing with real-world data. Audience Developers / data analysts Duration 3 days Format Lectures and hands-on labs
hadoopmapr Hadoop Administration on MapR 28 hours Audience: This course is intended to demystify big data/Hadoop technology and to show that it is not difficult to understand.
matlabpredanalytics Matlab for Predictive Analytics 21 hours Predictive analytics is the process of using data analytics to make predictions about the future. This process uses data along with data mining, statistics, and machine learning techniques to create a predictive model for forecasting future events. In this instructor-led, live training, participants will learn how to use Matlab to build predictive models and apply them to large sample data sets to predict future events based on the data. By the end of this training, participants will be able to: Create predictive models to analyze patterns in historical and transactional data Use predictive modeling to identify risks and opportunities Build mathematical models that capture important trends Use data from devices and business systems to reduce waste, save time, or cut costs Audience Developers Engineers Domain experts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
pythonmultipurpose Advanced Python 28 hours In this instructor-led training, participants will learn advanced Python programming techniques, including how to apply this versatile language to solve problems in areas such as distributed applications, finance, data analysis and visualization, UI programming and maintenance scripting. Audience Developers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice Notes If you wish to add, remove or customize any section or topic within this course, please contact us to arrange it.
DM7 Getting started with DM7 21 hours Audience Beginner or intermediate database developers Beginner or intermediate database administrators Programmers Format of the course Heavy emphasis on hands-on practice. Most of the concepts are learned through samples, exercises and hands-on development.
hadoopba Hadoop for Business Analysts 21 hours Apache Hadoop is the most popular framework for processing Big Data. Hadoop provides rich and deep analytics capability, and it is making inroads into the traditional BI analytics world. This course will introduce analysts to the core components of the Hadoop ecosystem and its analytics. Audience Business Analysts Duration 3 days Format Lectures and hands-on labs.
68780 Apache Spark 14 godz.
flockdb Flockdb: A Simple Graph Database for Social Media 7 godz. FlockDB is an open-source distributed, fault-tolerant graph database for managing wide but shallow network graphs. It was initially used by Twitter to store relationships among users. In this instructor-led, live training, participants will learn how to set up and use a FlockDB database to help answer social media questions such as who follows whom, who blocks whom, etc. By the end of this training, participants will be able to: Install and configure FlockDB Understand the unique features of FlockDB relative to other graph databases such as Neo4j Use FlockDB to maintain a large graph dataset Use FlockDB together with MySQL to provide distributed storage capabilities Query, create and update extremely fast graph edges Scale FlockDB horizontally for use in online, low-latency, high-throughput web environments Audience Developers Database engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
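FlockDB's actual API is served over Thrift and is not sketched here; the following toy Python model of directed edges only illustrates the kind of wide-but-shallow set queries (follows, blocks, mutual follows) that FlockDB is optimized for:

    # Conceptual only -- this is NOT FlockDB's API; it merely models
    # directed "follows"/"blocks" edges as dictionaries of sets.
    from collections import defaultdict

    follows = defaultdict(set)
    blocks = defaultdict(set)

    follows["alice"].update({"bob", "carol"})
    follows["bob"].add("alice")
    blocks["carol"].add("bob")

    print("alice follows:", follows["alice"])   # who follows whom
    print("carol blocks:", blocks["carol"])     # who blocks whom
    # mutual follows as a set query -- the shallow, high-fan-out lookup
    # this kind of graph store is built for
    print("mutual:", {u for u in follows["alice"] if "alice" in follows[u]})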
matlabfundamentalsfinance MATLAB Fundamentals + MATLAB for Finance 35 godz. This course provides a comprehensive introduction to the MATLAB technical computing environment + an introduction to using MATLAB for financial applications. The course is intended for beginning users and those looking for a review. No prior programming experience or knowledge of MATLAB is assumed. Themes of data analysis, visualization, modeling, and programming are explored throughout the course. Topics include: Working with the MATLAB user interface Entering commands and creating variables Analyzing vectors and matrices Visualizing vector and matrix data Working with data files Working with data types Automating commands with scripts Writing programs with logic and flow control Writing functions Using the Financial Toolbox for quantitative analysis
bigddbsysfun Big Data & Database Systems Fundamentals 14 godz. The course is part of the Data Scientist skill set (Domain: Data and Technology).
cassdev Cassandra for Developers 21 godz. This course will introduce Cassandra, a popular NoSQL database. It will cover Cassandra principles, architecture and data model. Students will learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. This session also discusses Cassandra internals and some admin topics. Audience: Developers
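To give a flavour of the CQL labs, here is a minimal sketch using the DataStax Python driver (pip install cassandra-driver). It assumes a Cassandra node on localhost; the keyspace and table names are illustrative:

    from cassandra.cluster import Cluster

    cluster = Cluster(["127.0.0.1"])
    session = cluster.connect()
    session.execute("""
        CREATE KEYSPACE IF NOT EXISTS demo
        WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
    """)
    session.set_keyspace("demo")

    # Cassandra tables are modeled around queries: the partition key
    # (user_id) decides placement, the clustering key (event_time) ordering.
    session.execute("""
        CREATE TABLE IF NOT EXISTS user_events (
            user_id text, event_time timestamp, event text,
            PRIMARY KEY (user_id, event_time))
    """)
    session.execute(
        "INSERT INTO user_events (user_id, event_time, event) "
        "VALUES (%s, toTimestamp(now()), %s)",
        ("alice", "login"))
    for row in session.execute(
            "SELECT * FROM user_events WHERE user_id = %s", ("alice",)):
        print(row.user_id, row.event_time, row.event)
    cluster.shutdown()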
68736 Hadoop for Developers 14 godz.
hypertable Hypertable: Deploy a BigTable like database 14 godz. Hypertable is an open-source software database management system based on the design of Google's Bigtable. In this instructor-led, live training, participants will learn how to set up and manage a Hypertable database system. By the end of this training, participants will be able to: Install, configure and upgrade a Hypertable instance Set up and administer a Hypertable cluster Monitor and optimize the performance of the database Design a Hypertable schema Work with Hypertable's API Troubleshoot operational issues Audience Developers Operations engineers Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
kylin Apache Kylin: From classic OLAP to real-time data warehouse 14 godz. Apache Kylin is an extreme, distributed analytics engine for big data. In this instructor-led live training, participants will learn how to use Apache Kylin to set up a real-time data warehouse. By the end of this training, participants will be able to: Consume real-time streaming data using Kylin Utilize Apache Kylin's powerful features, including snowflake schema support, a rich SQL interface, Spark cubing and sub-second query latency Note We use the latest version of Kylin (as of this writing, Apache Kylin v2.0) Audience Big Data engineers Big Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
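Kylin's SQL interface is also reachable programmatically over its REST API. A minimal sketch in Python (pip install requests), assuming a local Kylin instance with the bundled sample cube loaded; the host, project, and SQL are placeholders, and ADMIN/KYLIN is only the out-of-the-box default login:

    import requests

    resp = requests.post(
        "http://localhost:7070/kylin/api/query",
        auth=("ADMIN", "KYLIN"),
        json={
            "sql": "SELECT part_dt, SUM(price) FROM kylin_sales GROUP BY part_dt",
            "project": "learn_kylin",
        },
    )
    resp.raise_for_status()
    for row in resp.json()["results"]:
        print(row)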
rintrob Introductory R for Biologists 28 godz. R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has also found followers among statisticians, engineers and scientists without computer programming skills, who find it easy to use. Its popularity is due to the increasing use of data mining for goals such as setting ad prices, finding new drugs more quickly, or fine-tuning financial models. R has a wide variety of packages for data mining.
hadoopdev Hadoop for Developers (4 days) 28 godz. Apache Hadoop is the most popular framework for processing Big Data on clusters of servers. This course will introduce a developer to the various components of the Hadoop ecosystem (HDFS, MapReduce, Pig, Hive and HBase).
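To illustrate the MapReduce component, here is the classic word-count example written for Hadoop Streaming, which lets mappers and reducers be plain Python scripts reading stdin; the jar path in the comment is illustrative:

    # mapper.py -- word-count mapper for Hadoop Streaming: reads raw text
    # from stdin and emits "word<TAB>1" pairs. Submitted roughly as
    # (paths illustrative):
    #   hadoop jar hadoop-streaming.jar -input /in -output /out \
    #     -mapper mapper.py -reducer reducer.py
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # reducer.py -- sums the counts per word; Hadoop Streaming hands the
    # reducer its input sorted by key, so equal words arrive consecutively.
    import sys

    current, count = None, 0
    for line in sys.stdin:
        word, n = line.rsplit("\t", 1)
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")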
apachemdev Apache Mahout for Developers 14 godz. Audience Developers involved in projects that use machine learning with Apache Mahout. Format Hands-on introduction to machine learning. The course is delivered in a lab format based on real-world practical use cases.
kdbplusandq kdb+ and q: Analyze time series data 21 godz. kdb+ is an in-memory, column-oriented database and q is its built-in, interpreted vector-based language. In kdb+, tables are columns of vectors and q is used to perform operations on the table data as if it were a list. kdb+ and q are commonly used in high-frequency trading and are popular with major financial institutions, including Goldman Sachs, Morgan Stanley, Merrill Lynch, JP Morgan, etc. In this instructor-led, live training, participants will learn how to create a time series data application using kdb+ and q. By the end of this training, participants will be able to: Understand the difference between a row-oriented database and a column-oriented database Select data, write scripts and create functions to carry out advanced analytics Analyze time series data such as stock and commodity exchange data Use kdb+'s in-memory capabilities to store, analyze, process and retrieve large data sets at high speed Think of functions and data at a higher level than the standard function(arguments) approach common in non-vector languages Explore other time-sensitive applications for kdb+, including energy trading, telecommunications, sensor data, log data, and machine and network usage monitoring Audience Developers Database engineers Data scientists Data analysts Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
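q itself is beyond the scope of this outline, but the core idea of operating on tables as columns of vectors can be approximated in Python with pandas; the trade data in this sketch is made up:

    # kdb+/q treats a table as a dictionary of column vectors and applies
    # operations to whole columns at once; pandas can mimic that style.
    import pandas as pd

    trades = pd.DataFrame({
        "sym":   ["AAPL", "AAPL", "MSFT", "AAPL"],
        "price": [187.0, 187.2, 402.5, 186.9],
        "size":  [100, 250, 50, 300],
    })

    # Whole-column (vector) operations, no explicit row loop -- roughly
    # the spirit of q's: select vwap: size wavg price by sym from trades
    trades["pv"] = trades["price"] * trades["size"]
    vwap = trades.groupby("sym")["pv"].sum() / trades.groupby("sym")["size"].sum()
    print(vwap)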
glusterfs GlusterFS for System Administrators 21 godz. GlusterFS is an open-source distributed file storage system that can scale up to petabytes of capacity. GlusterFS is designed to provide additional space depending on the user's storage requirements. A common application for GlusterFS is cloud computing storage systems. In this instructor-led training, participants will learn how to use normal, off-the-shelf hardware to create and deploy a storage system that is scalable and always available.  By the end of the course, participants will be able to: Install, configure, and maintain a full-scale GlusterFS system. Implement large-scale storage systems in different types of environments. Audience System administrators Storage administrators Format of the Course Part lecture, part discussion, exercises and heavy hands-on practice.
scylladb Scylla database 21 godz. Scylla is an open-source distributed NoSQL data store. It is compatible with Apache Cassandra but performs at significantly higher throughputs and lower latencies. In this course, participants will learn about Scylla's features and architecture while obtaining practical experience with setting up, administering, monitoring, and troubleshooting Scylla. Audience Database administrators Developers System Engineers Format of the course The course is interactive and includes discussions of the principles and approaches for deploying and managing Scylla distributed databases and clusters. The course includes a heavy component of hands-on exercises and practice.
hadoopdeva Advanced Hadoop for Developers 21 godz. Apache Hadoop is one of the most popular frameworks for processing Big Data on clusters of servers. This course delves into data management in HDFS, advanced Pig, Hive, and HBase.  These advanced programming techniques will be beneficial to experienced Hadoop developers. Audience: developers Duration: three days Format: lectures (50%) and hands-on labs (50%).  
bigdatar Programming with Big Data in R 21 godz.
deckgl deck.gl: Visualizing Large-scale Geospatial Data 14 godz. deck.gl is an open-source, WebGL-powered library for exploring and visualizing data assets at scale. Created by Uber, it is especially useful for gaining insights from geospatial data sources, such as data on maps. This instructor-led, live training introduces the concepts and functionality behind deck.gl and walks participants through the set up of a demonstration project. By the end of this training, participants will be able to: Take data from very large collections and turn it into compelling visual representations Visualize data collected from transportation and journey-related use cases, such as pick-up and drop-off experiences, network traffic, etc. Apply layering techniques to geospatial data to depict changes in data over time Integrate deck.gl with React (for Reactive programming) and Mapbox GL (for visualizations on Mapbox-based maps) Understand and explore other use cases for deck.gl, including visualizing points collected from a 3D indoor scan, visualizing machine learning models in order to optimize their algorithms, etc. Audience Developers Data scientists Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
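deck.gl is a JavaScript library, but a minimal sketch of its layering model can be written from Python via pydeck, its official Python binding (pip install pydeck); the pick-up coordinates below are invented for illustration:

    import pydeck as pdk

    pickups = [
        {"lng": -73.99, "lat": 40.73},
        {"lng": -73.98, "lat": 40.75},
        {"lng": -74.00, "lat": 40.72},
    ]

    layer = pdk.Layer(
        "ScatterplotLayer",         # one of deck.gl's built-in layer types
        data=pickups,
        get_position="[lng, lat]",  # accessor evaluated per data row
        get_radius=100,
        get_fill_color=[255, 0, 0],
    )
    view = pdk.ViewState(latitude=40.73, longitude=-73.99, zoom=11)
    pdk.Deck(layers=[layer], initial_view_state=view).to_html("pickups.html")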
storm Apache Storm 28 godz. Apache Storm is a distributed, real-time computation engine used for enabling real-time business intelligence. It does so by enabling applications to reliably process unbounded streams of data (a.k.a. stream processing). "Storm is for real-time processing what Hadoop is for batch processing!" In this instructor-led live training, participants will learn how to install and configure Apache Storm, then develop and deploy an Apache Storm application for processing big data in real-time. Some of the topics covered in this training include: Apache Storm in the context of Hadoop Working with unbounded data Continuous computation Real-time analytics Distributed RPC and ETL processing Audience Software and ETL developers Mainframe professionals Data scientists Big data analysts Hadoop professionals Format of the course Part lecture, part discussion, exercises and heavy hands-on practice
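Storm topologies are typically written in Java, so the following is not Storm code; it is only a toy Python sketch of the spout/bolt idea: a source emitting an unbounded stream, and a downstream step maintaining a continuous computation over it:

    # NOT Storm code: a toy stand-in for a spout (unbounded source) and a
    # bolt (continuous computation) written in plain Python.
    import itertools
    import random

    def word_spout():
        """Endlessly emit tuples, like a Storm spout would."""
        words = ["big", "data", "storm", "stream"]
        while True:
            yield random.choice(words)

    counts = {}
    # A real topology never terminates; islice() samples 20 tuples so the
    # sketch finishes.
    for word in itertools.islice(word_spout(), 20):
        counts[word] = counts.get(word, 0) + 1  # the "bolt" updates state
    print(counts)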
kdd Knowledge Discovery in Databases (KDD) 21 godz. Knowledge discovery in databases (KDD) is the process of discovering useful knowledge from a collection of data. Real-life applications of this data mining technique include marketing, fraud detection, telecommunications and manufacturing. In this course, we introduce the processes involved in KDD and carry out a series of exercises to practice the implementation of those processes. Audience Data analysts or anyone interested in learning how to interpret data to solve problems Format of the course After a theoretical discussion of KDD, the instructor will present real-life cases which call for the application of KDD to solve a problem. Participants will prepare, select and cleanse sample data sets and use their prior knowledge about the data to propose solutions based on the results of their observations.
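The KDD steps (selection, cleaning/transformation, mining, evaluation) can be seen in miniature in a few lines of Python with scikit-learn; the dataset and model choice here are illustrative only:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)                 # data selection
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = make_pipeline(StandardScaler(),           # cleaning/transform
                          DecisionTreeClassifier(max_depth=3))  # mining
    model.fit(X_tr, y_tr)
    print("accuracy:", model.score(X_te, y_te))       # evaluation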
bigdatastore Big Data Storage Solution - NoSQL 14 godz. When traditional storage technologies cannot handle the amount of data you need to store, there are hundreds of alternatives. This course guides participants through the alternatives for storing and analyzing Big Data, along with their respective pros and cons. The course focuses mostly on discussion and presentation of solutions, though hands-on exercises are available on demand.
psr Podstawy systemów rekomendacyjnych 7 godz. This training is aimed at marketing department staff and IT team leaders.


Courses at promotional prices

Course Venue Course date Price [Remote / On-site]
Natural Language Processing with Python Warszawa, ul. Złota 3/11 Tue, 2018-02-27 09:00 3950PLN / 5200PLN
WordPress Gdańsk, ul. Grodzka 19 Wed, 2018-02-28 09:00 2960PLN / 3960PLN
Symfony 3 Lublin, ul. Spadochroniarzy 9 Wed, 2018-02-28 09:00 2000PLN / 3000PLN
Introduction to AUTOSAR RTE for Automotive Software Professionals Kraków, ul. Rzemieślnicza 1 Wed, 2018-02-28 09:00 1970PLN / 2720PLN
Enterprise Architecture Overview Warszawa, ul. Złota 3/11 Thu, 2018-03-01 09:00 1970PLN / 2720PLN
Techniki DTP (InDesign, Photoshop, Illustrator, Acrobat) Katowice, ul. Opolska 22 Mon, 2018-03-05 09:00 2150PLN / 3650PLN
Adobe Photoshop Katowice, ul. Opolska 22 Mon, 2018-03-05 09:00 850PLN / 1600PLN
Test Automation with Selenium and Jenkins Wrocław, ul. Ludwika Rydygiera 2a/22 Mon, 2018-03-05 09:00 4940PLN / 5940PLN
Visual Basic for Applications (VBA) w Excel - wstęp do programowania Katowice, ul. Opolska 22 Mon, 2018-03-05 09:00 3560PLN / 4810PLN
Microsoft SQL Server 2008/2012 (MSSQL) Gdańsk, ul. Grodzka 19 Tue, 2018-03-06 09:00 1970PLN / 2720PLN
Microsoft Office Excel - poziom podstawowy Opole, Władysława Reymonta 29 Tue, 2018-03-06 09:00 850PLN / 1600PLN
Adobe Illustrator Katowice, ul. Opolska 22 Wed, 2018-03-07 09:00 850PLN / 1600PLN
Techniki DTP (InDesign, Photoshop, Illustrator, Acrobat) Gdańsk, ul. Grodzka 19 Thu, 2018-03-08 09:00 2130PLN / 3630PLN
Spring Cloud: Building Microservices with Spring Cloud Gdańsk, ul. Grodzka 19 Thu, 2018-03-15 09:00 1970PLN / 2720PLN
Certified Agile Tester Katowice, ul. Opolska 22 Mon, 2018-04-02 09:00 8910PLN / 10410PLN
Comprehensive Git Gdańsk, ul. Grodzka 19 Wed, 2018-04-04 09:00 2170PLN / 3170PLN
Perfect tester Szczecin, ul. Sienna 9 Wed, 2018-04-04 09:00 1790PLN / 2540PLN
Kontrola jakości i ciągła integracja Katowice, ul. Opolska 22 Thu, 2018-04-12 09:00 2670PLN / 3420PLN
Protokół SIP w VoIP Rzeszów, Plac Wolności 13 Tue, 2018-04-17 09:00 2960PLN / 3960PLN
UML for the IT Business Analyst Gdańsk, ul. Grodzka 19 Wed, 2018-04-25 09:00 4940PLN / 5940PLN
Microsoft Office Excel - poziom zaawansowany Katowice, ul. Opolska 22 Mon, 2018-04-30 09:00 1280PLN / 2280PLN
Oracle 12c – wprowadzenie do języka SQL Łódź, ul. Tatrzańska 11 Tue, 2018-06-12 09:00 3960PLN / 4710PLN
