• Education Bureau Registration Number: 575690, 597600
Course: Big Data Scientist (Modules 4, 5, 6)
Centre: Wanchai
Day: N/A
Date: To be confirmed
Time: 7:00 – 10:00pm
Hours: 18
Trainer: Houston Ho
Status: Planning

Big Data Scientist

Great Learning is the first Arcitura – Licensed Training Partner in Hong Kong

Course Duration:

18 hours (6 sessions of 3 hours each). The exam schedule will be discussed and determined in class.

Course Fee:


Language of Delivery: Cantonese with English terms


The Big Data Science Certified Professional (BDSCP) program from the Arcitura™ Big Data Science School is dedicated to excellence in the fields of Big Data science, analysis, analytics, business intelligence, technology architecture, design and development, as well as governance.

A collection of courses establishes a set of vendor-neutral industry certifications with different areas of specialization. Founded by best-selling author, Thomas Erl, this curriculum enables IT professionals to develop real-world Big Data science proficiency. Because of the vendor-neutral focus of the course materials, the skills acquired by attaining certifications are applicable to any vendor or open-source platform.


A Certified Big Data Scientist has demonstrated proficiency in the application of principles, processes, and techniques required for exploring and analyzing large volumes of complex data with the goals of discovering novel insights, developing data products, and communicating analytic results that can drive decision making.

Along with a firm understanding of fundamental and advanced Big Data concepts and terminologies, a Certified Big Data Scientist is also required to have a thorough understanding of Big Data analysis lifecycle and foundational mechanisms essential for acquiring, processing and storing Big Data datasets. Exploratory data analysis (EDA) and confirmatory data analysis (CDA) techniques, statistical concepts, visualization tools and machine learning algorithms are taught and assessed in this certification program. A Certified Big Data Scientist understands the art of model development and evaluation and possesses an in-depth knowledge of both fundamental and advanced analysis techniques required for building descriptive and predictive models.

Note that the Big Data Scientist certification program is based on vendor-neutral coverage of technologies and a broad treatment of various statistical techniques and machine learning algorithms. Attaining this certification does not require any knowledge of specific products or of the underlying mathematical formulas involved in performing analysis and developing models. This certification imparts the skills and understanding required for successful exploration and interpretation of Big Data datasets. This knowledge establishes a sound foundation that can be further built upon with additional training, accreditation and experience.

Module 4: Fundamental Big Data Analysis & Science (duration: 6 hours)

This module provides an in-depth overview of essential topic areas pertaining to data science and analysis techniques relevant and unique to Big Data, with an emphasis on how analysis and analytics need to be carried out individually and collectively in support of the distinct characteristics, requirements and challenges associated with Big Data datasets.

Module 5: Advanced Big Data Analysis & Science (duration: 6 hours)

This module delves into a range of advanced data analysis practices and analysis techniques that are explored within the context of Big Data. The course content focuses on topics that enable participants to develop a thorough understanding of statistical, modeling, and analysis techniques for data patterns, clusters, and text analytics, as well as the identification of outliers and errors that affect the significance and accuracy of predictions made on Big Data datasets.

Module 6: Big Data Analysis & Science Lab (duration: 6 hours)

This course module covers a series of exercises and problems designed to test the participant’s ability to apply knowledge of topics covered previously in course modules 4 and 5. Completing this lab will help highlight areas that require further attention, and will further prove hands-on proficiency in Big Data analysis and science practices as they are applied and combined to solve real-world problems.

  • Module 4
    • Data Science, Data Mining & Data Modeling
    • Big Data Dataset Categories
    • Exploratory Data Analysis (EDA) (including numerical summaries, rules & data reduction)
    • EDA analysis types (including univariate, bivariate & multivariate)
    • Essential Statistics (including variable categories & relevant mathematics)
    • Statistics Analysis (including descriptive, inferential, correlation, covariance & hypothesis testing)
    • Data Munging & Machine Learning
    • Variables & Basic Mathematical Notations
    • Statistical Measures & Statistical Inference
    • Distributions & Data Processing Techniques
    • Data Discretization, Binning, Clustering
    • Visualization Techniques & Numerical Summaries
    • Correlation for Big Data
    • Time Series Analysis for Big Data
  • Module 5
    • Statistical Models, Model Evaluation Measures (including cross-validation, bias-variance, confusion matrix & f-score)
    • Machine Learning Algorithms, Pattern Identification (including association rules & apriori algorithm)
    • Advanced Statistical Techniques (including parametric vs. non-parametric, clustering vs. non-clustering distance-based, supervised vs. semi-supervised)
    • Linear Regression & Logistic Regression for Big Data
    • Decision Trees for Big Data
    • Classification Rules for Big Data
    • K Nearest Neighbor (kNN) for Big Data
    • Naïve Bayes for Big Data
    • Association Rules for Big Data
    • K-means for Big Data
    • Text Analytics for Big Data
    • Outlier Detection for Big Data
  • Module 6
    • As a hands-on lab, this course incorporates a set of detailed exercises that require participants to solve various inter-related problems, with the goal of fostering a comprehensive understanding of how different data analysis techniques can be applied to solve problems in Big Data environments and used to make significant, relevant predictions that offer increased business value.
    • For instructor-led delivery of this lab course, the Certified Trainer works closely with participants to ensure that all exercises are carried out completely and accurately. Attendees can voluntarily have exercises reviewed and graded as part of the class completion. For individual completion of this course as part of the Module 6 Self-Study Kit, a number of supplements are provided to help participants carry out exercises with guidance and numerous resource references.
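
As an illustration of one Module 5 topic listed above (model evaluation with a confusion matrix and F-score), here is a minimal Python sketch. It is not taken from the official course workbook. The function names and the spam/ham labels are hypothetical, chosen only for this example, and the data is made up.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct (tp) or a false alarm (fp).
            counts["tp" if a == positive else "fp"] += 1
        else:
            # Predicted negative: a miss (fn) or correct (tn).
            counts["fn" if a == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def f_score(tp, fp, fn, beta=1.0):
    """F-beta score; beta=1.0 gives the usual F1 score."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up labels, purely for illustration.
actual = ["spam", "spam", "ham", "ham", "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham", "ham", "spam"]

tp, fp, fn, tn = confusion_counts(actual, predicted, positive="spam")
print("TP, FP, FN, TN:", tp, fp, fn, tn)
print("F1 score:", f_score(tp, fp, fn))
```

The four counts are the cells of the confusion matrix. The F-score combines precision and recall into a single number, which is useful when the classes are imbalanced and plain accuracy would be misleading.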

Candidates who have passed Exam B90.01 and Exam B90.02 and would like to gain more in-depth knowledge of data science

Certified Big Data Scientist examination

Comprised of BDSCP Modules 4, 5 and 6

Prerequisite: Passing B90.01, B90.02 exams

Exam fee: HKD4,000

Exam Duration: 2 hours

Exam Format: Paper-based, 57 questions (including standard multiple-choice questions & lab-style questions)

Passing Score: 68%

Exam Location: Great Learning Education Centre Limited


Official course workbook

  • Great Learning is the first education centre to deliver Arcitura Big Data training in Hong Kong.
  • Unlimited re-sits within 2 years. Any lesson can be retaken to refresh your knowledge.
  • We are IT soft skill specialists, highly experienced in delivering complex and conceptual knowledge in an effective way.
  • We have successfully delivered more than 20 classes of Big Data courses.