Certified Data Engineering Professional

Master the art of data engineering with our expertly designed program that covers everything from SQL proficiency to advanced big data concepts. Gain unparalleled insights into modernizing data infrastructures while learning from industry leaders. Enroll now to transform your career trajectory with cutting-edge skills tailored for today's digital landscape.

Face-to-Face | Jul 14-17, 2025 | 9:00 AM - 5:00 PM | Trainer: Tarun Sukhani
Updated · Beginner
We price match

Public Pricing

MYR 7000

Corporate Pricing

Pax:

Training Fees: MYR 6500/day
Total Fees: MYR 26000 ++ (MYR 6500/day × 4 days)

Training Provider Pricing

Pax:

Training Fees: MYR 9600
Material Fees: MYR 400
Total Fees: MYR 10000

Certification

Certified Data Engineering Professional (CCSD)
Validity: 2 years
Price: $149.00

Features

4 days
28 modules
11 intakes
English

Subsidies

HRDC Claimable

What you'll learn

  • Understand and implement both relational and NoSQL data models (a minimal PostgreSQL sketch follows this list).
  • Learn to automate robust data pipelines using Apache Airflow.
  • Work with diverse NoSQL databases such as Cassandra, Riak, Redis, Neo4j, and Elasticsearch.
  • Develop proficiency in SQL using PostgreSQL for effective database management.
  • Gain expertise in business intelligence tools like Pentaho for enhanced decision-making.
  • Optimize performance in Spark-based environments for efficient data processing.
  • Explore big data fundamentals including HDFS and MapReduce.
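To give a concrete feel for the relational-modeling outcome above, here is a minimal sketch of a normalized PostgreSQL schema created from Python with psycopg2. The table names, columns, sample rows, and connection string are illustrative placeholders, not the actual course materials.

```python
# Minimal sketch: a normalized relational model in PostgreSQL via psycopg2.
# All identifiers and the connection string are illustrative placeholders.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS customers (
    customer_id SERIAL PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT UNIQUE
);
CREATE TABLE IF NOT EXISTS orders (
    order_id    SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id),
    ordered_at  TIMESTAMP NOT NULL DEFAULT now(),
    total_myr   NUMERIC(10, 2) NOT NULL
);
"""

conn = psycopg2.connect("dbname=training user=postgres")  # placeholder credentials
with conn, conn.cursor() as cur:
    cur.execute(DDL)
    # Parameterized inserts keep the data consistent with the foreign key above.
    cur.execute(
        "INSERT INTO customers (name, email) VALUES (%s, %s) RETURNING customer_id",
        ("Aisha", "aisha@example.com"),
    )
    customer_id = cur.fetchone()[0]
    cur.execute(
        "INSERT INTO orders (customer_id, total_myr) VALUES (%s, %s)",
        (customer_id, 199.90),
    )
conn.close()
```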

Why should you attend?

This course offers a comprehensive exploration of data engineering concepts, focusing on the practical application of SQL and PostgreSQL to build fluency in database management. Participants will learn to create relational data models and understand the principles of normalization, providing a solid foundation for efficient data handling. The curriculum then contrasts SQL with NoSQL data models, guides learners through implementing denormalized schemas such as star and snowflake, and gives hands-on experience creating NoSQL databases with Apache Cassandra.

Business intelligence and data warehousing are covered extensively, with modules on implementing data warehouses on AWS and building multi-dimensional cubes using Pentaho. The course also introduces Spark SQL, DataFrames, and Datasets, emphasizing their use over traditional RDDs, and explores Spark MLlib for machine learning applications. Learners will apply Spark to managing data lakes, including techniques for debugging and optimization, and see why modernizing data lakes and warehouses matters for running successful data pipelines.

Automation is another key focus area: participants will create data pipelines with Apache Airflow while ensuring data quality and tracking lineage. The fundamentals of big data are addressed through HDFS, MapReduce in Hadoop, and Hadoop ecosystem components such as Hive and HBase. Finally, the course covers working with Cassandra and other common NoSQL databases, including Riak, Redis, Neo4j, and Elasticsearch, and concludes with an introduction to MapReduce architecture, detailing its phases and benefits.
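As a taste of the automation topic, below is a minimal sketch of a three-step ETL pipeline in Apache Airflow, assuming Airflow 2.x. The DAG id, schedule, and task bodies are hypothetical placeholders rather than the actual course lab.

```python
# Minimal sketch of a daily extract -> transform -> load pipeline in Airflow 2.x.
# The DAG id, schedule, and task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull raw data from the source system")


def transform():
    print("clean and reshape the extracted data")


def load():
    print("write the transformed data to the warehouse")


with DAG(
    dag_id="daily_sales_pipeline",  # hypothetical name
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    # Run extract, then transform, then load, once per day.
    extract_task >> transform_task >> load_task
```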

Course Syllabus

Day 1
Introduction to SQL and PostgreSQL: Build fluency in SQL using PostgreSQL
Morning: two 15-min short breaks and a 15-min Recap and Q&A
Lunch: 1 hour
Afternoon: three 15-min short breaks and a 15-min Recap and Q&A

Day 2
Morning: two 15-min short breaks and a 15-min Recap and Q&A
Lunch: 1 hour
Afternoon: three 15-min short breaks and a 15-min Recap and Q&A

Day 3
Morning: two 15-min short breaks and a 15-min Recap and Q&A
Lunch: 1 hour
Afternoon: three 15-min short breaks and a 15-min Recap and Q&A

Day 4
Morning: two 15-min short breaks and a 15-min Recap and Q&A
Lunch: 1 hour
Afternoon: three 15-min short breaks and a 15-min Recap and Q&A

Minimum Qualification

graduate

Target Audience

Entry-level engineers

Methodologies

lecture
slides
case studies
labs
group discussion
Q&A
