Big Data Training Courses in the Philippines

Big Data Training Courses

Online or onsite, instructor-led live Big Data training courses start with an introduction to the fundamental concepts of Big Data, then progress into the programming languages and methodologies used to perform data analysis. Tools and infrastructure for enabling Big Data storage, distributed processing, and scalability are discussed, compared and implemented in demo practice sessions.

Big Data training is available as "online live training" or "onsite live training". Online live training (aka "remote live training") is carried out by way of an interactive, remote desktop. Onsite live Big Data training in the Philippines can be carried out locally on customer premises or in NobleProg corporate training centers.

NobleProg -- Your Local Training Provider


Big Data Course Outlines in the Philippines

Course Name
Duration
Overview
14 hours
Overview
Goal:

Learn to work with SPSS independently.

Audience:

Analysts, researchers, scientists, students and all those who want to acquire the ability to use the SPSS package and learn popular data mining techniques.
28 hours
Overview
MonetDB is an open-source database that pioneered the column-store technology approach.
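
As a rough illustration of the column-store idea, the following minimal sketch pivots a handful of made-up records from a row layout into one list per column, so that an aggregate reads only the column it needs. It is plain Python, not MonetDB code; the sample data and the helper names `to_columns` and `column_mean` are assumptions made for this example.

```python
# Row-oriented layout: one tuple per record (hypothetical sample data).
rows = [
    ("alice", 34, 52000.0),
    ("bob", 45, 61000.0),
    ("carol", 29, 48000.0),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column (a toy column store)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def column_mean(columns, name):
    """Aggregate a single column without touching the other columns."""
    values = columns[name]
    return sum(values) / len(values)


columns = to_columns(rows, ["name", "age", "salary"])
average_age = column_mean(columns, "age")
```

In the row layout, the same average would have to step over every name and salary value as well; the columnar layout keeps all ages contiguous.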

In this instructor-led, live training, participants will learn how to use MonetDB and how to get the most value out of it.

By the end of this training, participants will be able to:

- Understand MonetDB and its features
- Install and get started with MonetDB
- Explore and perform different functions and tasks in MonetDB
- Accelerate the delivery of their project by maximizing MonetDB capabilities

Audience

- Developers
- Technical experts

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
7 hours
Overview
Spark SQL is Apache Spark's module for working with structured data. Spark SQL provides information about the structure of the data as well as the computation being performed. This information can be used to perform optimizations. Two common uses for Spark SQL are:
- to execute SQL queries.
- to read data from an existing Hive installation.

In this instructor-led, live training (onsite or remote), participants will learn how to analyze various types of data sets using Spark SQL.

By the end of this training, participants will be able to:

- Install and configure Spark SQL.
- Perform data analysis using Spark SQL.
- Query data sets in different formats.
- Visualize data and query results.

Format of the Course

- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.

Course Customization Options

- To request a customized training for this course, please contact us to arrange.
7 hours
Overview
The objective of the course is to enable participants to gain a mastery of the fundamentals of R and how to work with data.
14 hours
Overview
Objective: This training course aims at helping attendees understand why Big Data is changing our lives and how it is altering the way businesses see us as consumers. Indeed, users of big data in businesses find that big data unleashes a wealth of information and insights which translate to higher profits, reduced costs, and less risk. However, putting too much emphasis on individual technologies, and not enough on the pillars of big data management, can sometimes lead to frustration.

During this course, attendees will learn how to manage big data using its three pillars of data integration, data governance and data security in order to turn big data into real business value. Different exercises conducted on a customer management case study will help attendees better understand the underlying processes.
14 hours
Overview
Apache Hama is a framework based on the Bulk Synchronous Parallel (BSP) computing model and is primarily used for Big Data analytics.
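
To make the BSP model more concrete, here is a minimal single-process sketch of its three phases: local computation, message exchange, and a barrier before the next superstep. It is not the Apache Hama API; the function name `bsp_global_max` and the in-memory "inboxes" are assumptions made for this illustration.

```python
def bsp_global_max(local_values):
    """Simulate two BSP supersteps so every peer learns the global maximum.

    local_values holds one list of numbers per simulated peer.
    """
    num_peers = len(local_values)
    inboxes = [[] for _ in range(num_peers)]

    # Superstep 1: each peer computes on its own data, then queues
    # a message containing its local maximum for every peer.
    outgoing = []
    for values in local_values:
        local_max = max(values)
        for target in range(num_peers):
            outgoing.append((target, local_max))

    # Barrier: messages become visible only after all peers have
    # finished the superstep.
    for target, message in outgoing:
        inboxes[target].append(message)

    # Superstep 2: each peer reduces the messages it received.
    return [max(inbox) for inbox in inboxes]


result = bsp_global_max([[3, 9], [4], [7, 1]])
```

In a real BSP framework the peers would run on separate machines and the barrier would be a cluster-wide synchronization point; the structure of the computation stays the same.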

In this instructor-led, live training, participants will learn the fundamentals of Apache Hama as they step through the creation of a BSP-based application and a vertex-centric program using the Apache Hama frameworks.

By the end of this training, participants will be able to:

- Install and configure Apache Hama
- Understand the fundamentals of Apache Hama and the Bulk Synchronous Parallel (BSP) programming model
- Build a BSP-based program using the Apache Hama BSP framework
- Build a vertex-centric program using the Apache Hama Graph framework
- Build, test, and debug their own Apache Hama applications

Audience

- Developers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Note

- To request a customized training for this course, please contact us to arrange.
14 hours
Overview
This classroom-based training session will explore Big Data. Delegates will work through computer-based examples and case study exercises using relevant big data tools.
7 hours
Overview
Kafka Streams is a client-side library for building applications and microservices whose data is passed to and from a Kafka messaging system. Traditionally, Apache Kafka has relied on Apache Spark or Apache Storm to process data between message producers and consumers. By calling the Kafka Streams API from within an application, data can be processed directly within Kafka, bypassing the need for sending the data to a separate cluster for processing.

In this instructor-led, live training, participants will learn how to integrate Kafka Streams into a set of sample Java applications that pass data to and from Apache Kafka for stream processing.

By the end of this training, participants will be able to:

- Understand Kafka Streams features and advantages over other stream processing frameworks
- Process stream data directly within a Kafka cluster
- Write a Java or Scala application or microservice that integrates with Kafka and Kafka Streams
- Write concise code that transforms input Kafka topics into output Kafka topics
- Build, package and deploy the application

Audience

- Developers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Notes

- To request a customized training for this course, please contact us to arrange.
21 hours
Overview
Dremio is an open-source "self-service data platform" that accelerates the querying of different types of data sources. Dremio integrates with relational databases, Apache Hadoop, MongoDB, Amazon S3, Elasticsearch, and other data sources. It supports SQL and provides a web UI for building queries.

In this instructor-led, live training, participants will learn how to install, configure and use Dremio as a unifying layer for data analysis tools and the underlying data repositories.

By the end of this training, participants will be able to:

- Install and configure Dremio
- Execute queries against multiple data sources, regardless of location, size, or structure
- Integrate Dremio with BI tools and data sources such as Tableau and Elasticsearch

Audience

- Data scientists
- Business analysts
- Data engineers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Notes

- To request a customized training for this course, please contact us to arrange.
7 hours
Overview
Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL and other Cloud and file storage systems. The power of Apache Drill lies in its ability to join data from multiple data stores using a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. Apache Drill is the open source version of Google's Dremel system which is available as an infrastructure service called Google BigQuery.

In this instructor-led, live training, participants will learn how to optimize and debug Apache Drill to improve the performance of queries on very large data sets. The course begins with an architectural overview and feature comparison between Apache Drill and other interactive data analysis tools. Participants then step through a series of interactive, hands-on practice sessions that include installation, configuration, performance evaluation, query optimization, data partitioning, and debugging of an Apache Drill instance in a live lab environment.

By the end of this training, participants will be able to:

- Install and configure Apache Drill
- Understand Apache Drill's architecture and features
- Understand how Apache Drill receives and executes queries
- Optimize Drill queries for distributed SQL execution
- Debug Apache Drill

Audience

- Developers
- Systems administrators
- Data analysts

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Notes

- To request a customized training for this course, please contact us to arrange.
7 hours
Overview
In this instructor-led, live training, participants will learn the core concepts behind MapR Stream Architecture as they develop a real-time streaming application.

By the end of this training, participants will be able to build producer and consumer applications for real-time stream data processing.

Audience

- Developers
- Administrators

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Note

- To request a customized training for this course, please contact us to arrange.
21 hours
Overview
Apache Drill is a schema-free, distributed, in-memory columnar SQL query engine for Hadoop, NoSQL and other Cloud and file storage systems. The power of Apache Drill lies in its ability to join data from multiple data stores using a single query. Apache Drill supports numerous NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. Apache Drill is the open source version of Google's Dremel system which is available as an infrastructure service called Google BigQuery.

In this instructor-led, live training, participants will learn the fundamentals of Apache Drill, then leverage the power and convenience of SQL to interactively query big data across multiple data sources, without writing code. Participants will also learn how to optimize their Drill queries for distributed SQL execution.

By the end of this training, participants will be able to:

- Perform "self-service" exploration on structured and semi-structured data on Hadoop
- Query known as well as unknown data using SQL queries
- Understand how Apache Drill receives and executes queries
- Write SQL queries to analyze different types of data, including structured data in Hive, semi-structured data in HBase or MapR-DB tables, and data saved in files such as Parquet and JSON.
- Use Apache Drill to perform on-the-fly schema discovery, bypassing the need for complex ETL and schema operations
- Integrate Apache Drill with BI (Business Intelligence) tools such as Tableau, QlikView, MicroStrategy and Excel

Audience

- Data analysts
- Data scientists
- SQL programmers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
28 hours
Overview
MemSQL is an in-memory, distributed, SQL database management system for cloud and on-premises. It's a real-time data warehouse that immediately delivers insights from live and historical data.

In this instructor-led, live training, participants will learn the essentials of MemSQL for development and administration.

By the end of this training, participants will be able to:

- Understand the key concepts and characteristics of MemSQL
- Install, design, maintain, and operate MemSQL
- Optimize schemas in MemSQL
- Improve queries in MemSQL
- Benchmark performance in MemSQL
- Build real-time data applications using MemSQL

Audience

- Developers
- Administrators
- Operation Engineers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
21 hours
Overview
Amazon Redshift is a petabyte-scale cloud-based data warehouse service in AWS.

In this instructor-led, live training, participants will learn the fundamentals of Amazon Redshift.

By the end of this training, participants will be able to:

- Install and configure Amazon Redshift
- Load, configure, deploy, query, and visualize data with Amazon Redshift

Audience

- Developers
- IT Professionals

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Note

- To request a customized training for this course, please contact us to arrange.
28 hours
Overview
Hadoop is a popular Big Data processing framework. Python is a high-level programming language famous for its clear syntax and code readability.

In this instructor-led, live training, participants will learn how to work with Hadoop, MapReduce, Pig, and Spark using Python as they step through multiple examples and use cases.

By the end of this training, participants will be able to:

- Understand the basic concepts behind Hadoop, MapReduce, Pig, and Spark
- Use Python with Hadoop Distributed File System (HDFS), MapReduce, Pig, and Spark
- Use Snakebite to programmatically access HDFS within Python
- Use mrjob to write MapReduce jobs in Python
- Write Spark programs with Python
- Extend the functionality of Pig using Python UDFs
- Manage MapReduce jobs and Pig scripts using Luigi
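
The MapReduce model named in the list above can be sketched in plain Python as a word count with explicit map, shuffle and reduce phases. This is a single-process illustration under simplifying assumptions, not Hadoop or mrjob code; the function names `mapper`, `shuffle`, `reducer` and `word_count` are chosen for this example.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a total."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data", "Big deal"])
```

On a cluster, the map and reduce calls would run in parallel on different machines and the shuffle would move data across the network; the framework, rather than the programmer, handles that distribution.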

Audience

- Developers
- IT Professionals

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
21 hours
Overview
In this instructor-led, live training in the Philippines, participants will learn how to use Python and Spark together to analyze big data as they work on hands-on exercises.

By the end of this training, participants will be able to:

- Use Spark with Python to analyze Big Data.
- Work on exercises that mimic real-world cases.
- Use different tools and techniques for big data analysis using PySpark.
35 hours
Overview
Advances in technologies and the increasing amount of information are transforming how law enforcement is conducted. The challenges that Big Data poses are nearly as daunting as Big Data's promise. Storing data efficiently is one of these challenges; effectively analyzing it is another.

In this instructor-led, live training, participants will learn the mindset with which to approach Big Data technologies, assess their impact on existing processes and policies, and implement these technologies for the purpose of identifying criminal activity and preventing crime. Case studies from law enforcement organizations around the world will be examined to gain insights on their adoption approaches, challenges and results.

By the end of this training, participants will be able to:

- Combine Big Data technology with traditional data gathering processes to piece together a story during an investigation
- Implement industrial big data storage and processing solutions for data analysis
- Prepare a proposal for the adoption of the most adequate tools and processes for enabling a data-driven approach to criminal investigation

Audience

- Law Enforcement specialists with a technical background

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
14 hours
Overview
To meet regulatory compliance requirements, CSPs (communication service providers) can tap into Big Data analytics, which not only helps them achieve compliance but also, within the scope of the same project, lets them increase customer satisfaction and thus reduce churn. In fact, since compliance is related to quality of service tied to a contract, any initiative towards meeting compliance will improve the “competitive edge” of the CSPs. Therefore, it is important that regulators be able to advise on and guide a set of Big Data analytics practices for CSPs that will be of mutual benefit to the regulators and the CSPs.

The course consists of 8 modules (4 on day 1, and 4 on day 2).
28 hours
Overview
In this instructor-led, live training in the Philippines, participants will learn about the technology offerings and implementation approaches for processing graph data. The aim is to identify real-world objects, their characteristics and relationships, then model these relationships and process them as data using a Graph Computing (also known as Graph Analytics) approach. We start with a broad overview and narrow in on specific tools as we step through a series of case studies, hands-on exercises and live deployments.

By the end of this training, participants will be able to:

- Understand how graph data is persisted and traversed.
- Select the best framework for a given task (from graph databases to batch processing frameworks).
- Implement Hadoop, Spark, GraphX and Pregel to carry out graph computing across many machines in parallel.
- View real-world big data problems in terms of graphs, processes and traversals.
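
As a small-scale illustration of modeling relationships as a graph and traversing them, the sketch below stores a made-up "knows" graph as an adjacency list and uses breadth-first search to count hops from one person to everyone reachable. The data and the function name `hops_from` are assumptions for this example; frameworks such as GraphX or Pregel apply comparable ideas across many machines.

```python
from collections import deque

# Adjacency list: each person maps to the people they point to
# (hypothetical sample data).
graph = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": [],
    "eli": [],
}


def hops_from(graph, start):
    """Breadth-first traversal returning the hop count to each reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


distances = hops_from(graph, "ana")
```

Nodes that cannot be reached from the start, such as `eli` here, simply do not appear in the result.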
7 hours
Overview
In this instructor-led, live training in the Philippines, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi.

By the end of this training, participants will be able to:

- Understand NiFi's architecture and dataflow concepts.
- Develop extensions using NiFi and third-party APIs.
- Develop their own custom Apache NiFi processor.
- Ingest and process real-time data from disparate and uncommon file formats and data sources.
21 hours
Overview
In this instructor-led, live training in the Philippines (onsite or remote), participants will learn how to deploy and manage Apache NiFi in a live lab environment.

By the end of this training, participants will be able to:

- Install and configure Apache NiFi.
- Source, transform and manage data from disparate, distributed data sources, including databases and big data lakes.
- Automate dataflows.
- Enable streaming analytics.
- Apply various approaches for data ingestion.
- Transform Big Data into business insights.
14 hours
Overview
Apache SolrCloud is a distributed data processing engine that facilitates the searching and indexing of files on a distributed network.

In this instructor-led, live training, participants will learn how to set up a SolrCloud instance on Amazon AWS.

By the end of this training, participants will be able to:

- Understand SolrCloud's features and how they compare to those of conventional master-slave clusters
- Configure a SolrCloud centralized cluster
- Automate processes such as communicating with shards, adding documents to the shards, etc.
- Use ZooKeeper in conjunction with SolrCloud to further automate processes
- Use the interface to manage error reporting
- Load balance a SolrCloud installation
- Configure SolrCloud for continuous processing and fail-over

Audience

- Solr Developers
- Project Managers
- System Administrators
- Search Analysts

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
28 hours
Overview
In this instructor-led, live training in the Philippines, participants will learn how to build a Data Vault.

By the end of this training, participants will be able to:

- Understand the architecture and design concepts behind Data Vault 2.0, and its interaction with Big Data, NoSQL and AI.
- Use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse.
- Develop a consistent and repeatable ETL (Extract, Transform, Load) process.
- Build and deploy highly scalable and repeatable warehouses.
14 hours
Overview
Datameer is a business intelligence and analytics platform built on Hadoop. It allows end-users to access, explore and correlate large-scale, structured, semi-structured and unstructured data in an easy-to-use fashion.

In this instructor-led, live training, participants will learn how to use Datameer to overcome Hadoop's steep learning curve as they step through the setup and analysis of a series of big data sources.

By the end of this training, participants will be able to:

- Create, curate, and interactively explore an enterprise data lake
- Access business intelligence data warehouses, transactional databases and other analytic stores
- Use a spreadsheet user-interface to design end-to-end data processing pipelines
- Access pre-built functions to explore complex data relationships
- Use drag-and-drop wizards to visualize data and create dashboards
- Use tables, charts, graphs, and maps to analyze query results

Audience

- Data analysts

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice
21 hours
Overview
Stream Processing refers to the real-time processing of "data in motion", that is, performing computations on data as it is being received. Such data is read as continuous streams from data sources such as sensor events, website user activity, financial trades, credit card swipes, click streams, etc. Stream Processing frameworks are able to read large volumes of incoming data and provide valuable insights almost instantaneously.
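
The idea of computing on data as it arrives can be sketched with a Python generator that updates a running average one record at a time, without ever holding the full stream in memory. This is a single-process illustration, not code for any particular framework; `sensor_readings` is a hypothetical stand-in for a real event source.

```python
def running_average(events):
    """Yield the updated average after each incoming value."""
    count = 0
    total = 0.0
    for value in events:
        count += 1
        total += value
        yield total / count


def sensor_readings():
    """Hypothetical event source; a real one would read from a socket or queue."""
    for reading in (20.0, 22.0, 21.0):
        yield reading


averages = list(running_average(sensor_readings()))
```

Each result is available as soon as its input record has been seen, which is the property that distinguishes stream processing from batch processing.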

In this instructor-led, live training (onsite or remote), participants will learn how to set up and integrate different Stream Processing frameworks with existing big data storage systems and related software applications and microservices.

By the end of this training, participants will be able to:

- Install and configure different Stream Processing frameworks, such as Spark Streaming and Kafka Streams.
- Understand and select the most appropriate framework for the job.
- Process data continuously, concurrently, and in a record-by-record fashion.
- Integrate Stream Processing solutions with existing databases, data warehouses, data lakes, etc.
- Integrate the most appropriate stream processing library with enterprise applications and microservices.

Audience

- Developers
- Software architects

Format of the Course

- Part lecture, part discussion, exercises and heavy hands-on practice

Notes

- To request a customized training for this course, please contact us to arrange.
28 hours
Overview
Pentaho Open Source BI Suite Community Edition (CE) is a business intelligence package that provides data integration, reporting, dashboards, and load capabilities.

In this instructor-led, live training, participants will learn how to maximize the features of Pentaho Open Source BI Suite Community Edition (CE).

By the end of this training, participants will be able to:

- Install and configure Pentaho Open Source BI Suite Community Edition (CE)
- Understand the fundamentals of Pentaho CE tools and their features
- Build reports using Pentaho CE
- Integrate third party data into Pentaho CE
- Work with big data and analytics in Pentaho CE

Audience

- Programmers
- BI Developers

Format of the course

- Part lecture, part discussion, exercises and heavy hands-on practice

Note

- To request a customized training for this course, please contact us to arrange.
14 hours
Overview
In this instructor-led, live training in the Philippines, participants will learn the principles behind persistent and pure in-memory storage as they step through the creation of a sample in-memory computing project.

By the end of this training, participants will be able to:

- Use Ignite for in-memory storage with on-disk persistence, as well as a purely distributed in-memory database.
- Achieve persistence without syncing data back to a relational database.
- Use Ignite to carry out SQL and distributed joins.
- Improve performance by moving data closer to the CPU, using RAM as storage.
- Spread data sets across a cluster to achieve horizontal scalability.
- Integrate Ignite with RDBMS, NoSQL, Hadoop and machine learning processors.
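
Spreading a data set across a cluster, as mentioned in the list above, comes down to a rule that maps each key to a node. The sketch below uses a simple hash-modulo rule as a stand-in; it is not Ignite's own affinity function, and the names `partition_for` and `distribute` as well as the sample records are assumptions for this example.

```python
import zlib


def partition_for(key, num_nodes):
    """Map a string key to a node index using a stable CRC32 hash."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(records, num_nodes):
    """Place each key/value pair on the node chosen by partition_for."""
    nodes = [dict() for _ in range(num_nodes)]
    for key, value in records.items():
        nodes[partition_for(key, num_nodes)][key] = value
    return nodes


# Hypothetical sample records spread over three simulated nodes.
records = {"user:1": "ana", "user:2": "ben", "user:3": "cy", "user:4": "dee"}
nodes = distribute(records, 3)
```

Because the same key always hashes to the same node, a lookup needs to contact only one node. A plain modulo rule reshuffles most keys when the node count changes, which is one reason production systems use more elaborate schemes.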
21 hours
Overview
This instructor-led, live training in the Philippines (online or onsite) is aimed at software engineers who wish to stream big data with Spark Streaming and Scala.

By the end of this training, participants will be able to:

- Create Spark applications with the Scala programming language.
- Use Spark Streaming to process continuous streams of data.
- Process streams of real-time data with Spark Streaming.
7 hours
Overview
This instructor-led, live training in the Philippines (online or onsite) is aimed at data engineers, data scientists, and programmers who wish to use Apache Kafka features in data streaming with Python.

By the end of this training, participants will be able to use Apache Kafka to monitor and manage conditions in continuous data streams using Python programming.
28 hours
Overview
This instructor-led, live training in the Philippines (online or onsite) is aimed at technical persons who wish to deploy Talend Open Studio for Big Data to simplify the process of reading and crunching through Big Data.

By the end of this training, participants will be able to:

- Install and configure Talend Open Studio for Big Data.
- Connect with Big Data systems such as Cloudera, HortonWorks, MapR, Amazon EMR and Apache.
- Understand and set up Open Studio's big data components and connectors.
- Configure parameters to automatically generate MapReduce code.
- Use Open Studio's drag-and-drop interface to run Hadoop jobs.
- Prototype big data pipelines.
- Automate big data integration projects.
