Hadoop Administration

Master the art of Hadoop administration with our expertly crafted training program. Gain unparalleled insights into managing complex data ecosystems under the guidance of industry leaders. Enroll now to elevate your skills in deploying, securing, and optimizing Hadoop clusters across diverse environments.

Face-to-Face May 19, 2025 - May 22, 2025
new
intermediate
Hadoop Administration
MYR 7000

Training Provider Pricing

Material Fees: MYR 600

Pax:

MYR 11200
Total (training + material fees): MYR 11800

Features

4 days (9:00 AM - 5:00 PM)
28 modules
1 intake
Full lifetime access
English

Subsidies

HRDC Claimable

What you'll learn

  • Learn data ingestion techniques using Sqoop and Flume.
  • Gain proficiency in HDFS operations including file read/write processes.
  • Plan and deploy efficient Hadoop clusters tailored for large datasets.
  • Develop expertise in cloud-based Hadoop deployments on AWS, Azure, and Google Cloud.
  • Understand the fundamentals of Big Data challenges and Hadoop architecture.
  • Optimize resource management by understanding YARN architecture.
  • Implement security measures such as Kerberos authentication within Hadoop clusters.
  • Integrate ecosystem tools like Hive and Pig for enhanced data processing capabilities.

Why should you attend?

This course provides a comprehensive exploration of Hadoop administration, designed to equip participants with the skills needed to manage and optimize Hadoop clusters effectively. Beginning with an introduction to Big Data challenges and the Hadoop ecosystem, learners will gain foundational knowledge in setting up a single-node cluster. The course delves into the architecture and operations of the Hadoop Distributed File System (HDFS), covering critical concepts such as block replication and rack awareness. Participants will explore various data ingestion techniques using tools like Sqoop and Flume, enabling seamless integration of diverse data sources into HDFS.

Security is a key focus, with modules on Kerberos authentication and HDFS permissions ensuring that learners can secure their clusters against unauthorized access. The curriculum also covers essential aspects of cluster planning and deployment, including hardware selection and network design. Advanced topics include YARN architecture for resource management, configuration file optimization, and resource scheduling strategies. Learners will engage in hands-on exercises to reinforce their understanding, such as simulating NameNode failover for high availability and configuring schedulers for service level agreements (SLAs).

The course concludes with cloud-based Hadoop administration best practices, offering insights into deploying Hadoop on platforms like AWS EMR, Azure HDInsight, and Google Cloud Dataproc. Throughout the course, participants will benefit from practical labs that simulate real-world scenarios, preparing them to tackle complex challenges in both on-premises and cloud environments. By the end of this training program, learners will be well-equipped to administer robust Hadoop ecosystems efficiently.

Course Syllabus

Big Data challenges, Hadoop architecture, ecosystem overview
Exercise: Install a single-node Hadoop cluster via tarball
Daily schedule (Day 1 to Day 4)
Short Break: 15 mins (twice before lunch)
Recap and Q&A: 15 mins (before lunch)
Lunch: 1 hour
Short Break: 15 mins (three times after lunch)
Recap and Q&A: 15 mins (end of day)

Instructor

Mohammad Mehdi Lotfinejad
Certified Data Science Trainer and Data Engineer

Mohammad Mehdi Lotfinejad is an accomplished Chief Data Officer with deep expertise in data science and engineering and over ten years of experience developing data processing pipelines for enterprise insights. He is a proven leader with strong communication, presentation, and leadership skills, and a certified HRDF trainer with more than 15 years of experience as a computer science instructor, both in academia and as a professional data science and engineering trainer in industry. He holds a PhD in Computer Science from Universiti Malaya, Malaysia, and certifications in Business Analytics from Harvard Business School. Mohammad specializes in technologies such as Apache Spark, MySQL, PostgreSQL, MongoDB, Snowflake, Redshift, Apache Airflow, APIs and microservices, and Amazon Web Services.

Since February 2024, Mohammad has served as Chief Data and Knowledge Officer at Magna.ai, where he leads the development of graph databases and data warehouses to support AI-driven law case analysis services. He architects API and microservice solutions to enhance system interoperability and scalability while ensuring data security and compliance with legal standards. Before this role, he worked as a Senior Data Engineer at AXIATA Digital Advertising (ADA) from March 2020, where he helped design automated data pipelines using Amazon Redshift and Snowflake for storing telco data and implemented BI dashboards on Google BigQuery. From June 2018 to February 2020, he was the Lead Senior Data Scientist Professional Trainer at The Center of Applied Data Science in Kuala Lumpur, where he led teams of data scientists and engineers designing professional training programs for prominent clients such as CIMB, PETRONAS, SHELL, and TNB.

His earlier roles include leading big data engineering teams at RAHA in Iran, where he developed large-scale analytics pipelines using Hadoop ecosystem tools such as Hive and Spark. In academia, Mohammad served as a faculty member at Payame Noor University from September 2014 to June 2018, where he supervised graduate research projects and contributed significantly to curriculum development. He also held leadership positions at Islamic Azad University, where he managed departments to achieve high academic standards.

His technical proficiencies span RDBMS such as MySQL and PostgreSQL, programming languages such as Python and C++, and web design with HTML, CSS, and Bootstrap; his project management skills include Scrum Master certification. His published works include books on Object-Oriented Programming and Project Management Fundamentals, along with numerous journal articles on topics ranging from solar radiation prediction models to machine learning algorithms for intrusion detection systems.

2 Students
57 Courses
18 Years

Minimum Qualification

undergraduate

Target Audience

engineers

Methodologies

lecture
slides
case studies
labs
Q&A

Instructor Reviews

Mohammad Mehdi Lotfinejad Certified Data Science Trainer and Data Engineer
Michael Ogheneme
1 year ago

Mehdi and I worked on several projects with companies such as Petronas, Shell and CIMB Regional ETC. I must say Mehdi's training was highly appreciated by our clients, as he was able to put his vast knowledge as a data professional on full display. I would highly recommend him to anyone looking for a top-tier training expert.

Amin Jula
1 year ago

Not only knowledgeable but also hands-on with what he knows. Friendly, and builds networks quickly.

Kennedy Okonkwo
1 year ago

I had the pleasure of working with Mehdi together on some high-level initiatives such as the Petronas data scientist program and Shell's project to become a data-driven organization. During these projects, Mehdi received numerous accolades for his ability to share his knowledge and mentor up-and-coming data scientists. Based on our shared experiences, I have no hesitation in recommending Mehdi for any project or position he may be considered for.
