08 Jan 2022 by dzlab
I recently passed the Google Professional Machine Learning Engineer certification. During the preparation I went through a lot of resources about the exam. The exam is relatively easier than the Data Engineer certification exam, as the questions are more direct (almost no ambiguous questions), but it has 60 questions instead of the typical 50. It focuses on the following areas:
I could not find a comprehensive resource that covers all aspects of the exam when I started preparing. I had to go over a lot of Google Cloud product pages and general Machine Learning resources, and at no point did I feel ready, as both topics are huge. Here I will try to provide a summary of the resources I found helpful for passing the exam.
Machine Learning
A big part of the exam consists of general ML questions that touch on concepts not specific to Google. This is a huge topic by itself, but for the exam it should be enough to go over most of the material in the Google ML Crash Course.
You should also get familiar with Privacy in Machine Learning - link
Metrics
You need to know which metrics you can use and which kinds of ML problems they apply to. For instance, for a classification problem you can use:
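As an illustration for the classification case, here is a minimal sketch of four commonly used metrics (accuracy, precision, recall and F1) computed from the confusion-matrix counts. The function name and the sample labels are made up for this example; in practice you would more likely rely on a library implementation.

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics from lists of 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels, only to exercise the function.
metrics = classification_metrics(
    y_true=[1, 0, 1, 1, 0, 1, 0, 0],
    y_pred=[1, 0, 0, 1, 1, 1, 0, 0],
)
print(metrics)
```

Precision matters most when false positives are costly, recall when false negatives are costly, and F1 balances the two, which is the kind of trade-off exam questions tend to ask about.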
Regularization techniques
Neural Networks
Some common issues with neural network training and how to address them:
To summarize:
AI Explanations
Explainable AI is another topic to know about, along with the different techniques available to explain a model.
Another important tool to know about is the What-If Tool: when do you use it? How do you use it? How do you discover different outcomes? How do you conduct experiments?
TensorFlow
You need to know basic model architectures, layers (e.g. dense, dropout, convolutional, pooling) and which ones define trainable parameters. Also, knowing the Keras API is important.
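To make the point about layers and parameters concrete, here is a small plain-Python sketch (not the Keras API itself) that counts trainable parameters per layer type. The helper names and the toy architecture are assumptions chosen for illustration: dense and convolutional layers carry weights and biases, while dropout and pooling layers carry none.

```python
def dense_params(inputs: int, units: int, use_bias: bool = True) -> int:
    """Weights (inputs x units) plus one bias per unit."""
    return inputs * units + (units if use_bias else 0)


def conv2d_params(kernel_h: int, kernel_w: int, in_channels: int,
                  filters: int, use_bias: bool = True) -> int:
    """One kernel of size h x w x in_channels per filter, plus one bias each."""
    return kernel_h * kernel_w * in_channels * filters + (filters if use_bias else 0)


# Toy CNN on 28x28x1 inputs (hypothetical architecture):
# a 3x3 conv with 32 filters and no padding gives 26x26x32,
# 2x2 max pooling gives 13x13x32, which is flattened into a dense layer.
layers = [
    ("conv2d", conv2d_params(3, 3, 1, 32)),
    ("max_pooling2d", 0),   # pooling has no trainable parameters
    ("dropout", 0),         # dropout has no trainable parameters
    ("dense", dense_params(13 * 13 * 32, 10)),
]

for name, count in layers:
    print(f"{name:15s} {count}")

total_params = sum(count for _, count in layers)
print("total", total_params)
```

The same numbers are what a Keras `model.summary()` would report for an equivalent model, so this is a handy way to sanity-check your intuition about where a model's capacity sits.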
Accelerators
You need to know the differences between CPUs, GPUs and TPUs and when to use each one. As a general rule, GPU training is faster than CPU training, and a GPU usually doesn't require any additional setup. TPUs are faster than GPUs but they don't support custom operations.
You may also want to learn about basic troubleshooting - link
Distributed training
You need to know the differences between the distributed training strategies in TensorFlow - link.
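Multi-worker TensorFlow jobs describe their cluster through the `TF_CONFIG` environment variable, a JSON document listing the nodes of each role and the role of the current node. The sketch below builds one with the standard library only; the host names, ports and node counts are hypothetical, and the coordinating node is named `chief` here, as in recent TensorFlow versions (older setups call it master).

```python
import json


def make_tf_config(cluster: dict, task_type: str, task_index: int) -> str:
    """Serialize a TF_CONFIG value for one node of the cluster."""
    if task_index >= len(cluster[task_type]):
        raise ValueError(f"no {task_type} with index {task_index}")
    return json.dumps({
        "cluster": cluster,
        "task": {"type": task_type, "index": task_index},
    })


# Hypothetical cluster: one chief, two workers, one parameter server.
cluster = {
    "chief": ["chief-0.example.internal:2222"],
    "worker": ["worker-0.example.internal:2222",
               "worker-1.example.internal:2222"],
    "ps": ["ps-0.example.internal:2222"],
}

# Every node receives the same "cluster" section but its own "task" section.
tf_config = make_tf_config(cluster, task_type="worker", task_index=1)
print(tf_config)
```

Each process would export this string as `TF_CONFIG` before starting training; managed services usually set it for you, so the value is mostly useful for understanding and debugging a job's topology.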
Also make sure to know the components of a distributed training architecture (master, worker, parameter server, evaluator) and how many of each you can have.
MLOps
TFX
You have to know TFX (TensorFlow Extended) and its limitations (it can be used to build pipelines for TensorFlow models only), what its standard components are (e.g. ingestion, validation, transform) and how to build a pipeline out of them.
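One way to internalize how a pipeline is assembled is to think of it as a dependency graph over the standard components. The sketch below is plain Python, not the TFX API: it lists a common subset of the standard components with their upstream dependencies (simplified for illustration) and derives a valid execution order with a topological sort.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each component maps to the components whose outputs it consumes.
upstream = {
    "ExampleGen": set(),                                  # data ingestion
    "StatisticsGen": {"ExampleGen"},                      # dataset statistics
    "SchemaGen": {"StatisticsGen"},                       # inferred schema
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},   # anomaly detection
    "Transform": {"ExampleGen", "SchemaGen"},             # feature engineering
    "Trainer": {"Transform", "SchemaGen"},                # model training
    "Evaluator": {"ExampleGen", "Trainer"},               # model validation
    "Pusher": {"Trainer", "Evaluator"},                   # deployment
}

# static_order() yields every component after all of its dependencies.
run_order = list(TopologicalSorter(upstream).static_order())
print(" -> ".join(run_order))
```

In a real TFX pipeline you do not declare this graph by hand: the orchestrator infers it from how each component's outputs are wired into the next component's inputs.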
Kubeflow
You need to know Kubeflow and that you should use it if your modeling framework is not TensorFlow (i.e. when you need PyTorch or XGBoost) or if you want to dockerize every step of the flow - link
CI/CD
Here is a flow chart to help with deciding what Google ML product to use depending on the situation:
BigQuery ML
BigQuery is a managed data warehouse service that also has ML capabilities. So if you see a question where the data is in BigQuery and the output will also be stored there, then a natural answer is to use BigQuery ML for modeling.
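To show what "modeling inside BigQuery" looks like, here is a minimal sketch that builds the two SQL statements involved: `CREATE MODEL` to train and `ML.PREDICT` to score. The project, dataset, table and column names are hypothetical, and logistic regression is just one of the available model types. The helpers only produce SQL strings, so nothing is sent to BigQuery.

```python
def create_model_sql(model: str, source_table: str, label_column: str) -> str:
    """SQL that trains a logistic regression model on every column of a table."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'logistic_reg', "
        f"input_label_cols = ['{label_column}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


def predict_sql(model: str, input_table: str) -> str:
    """SQL that scores every row of a table with a trained model."""
    return f"SELECT * FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`)"


# Hypothetical resource names, for illustration only.
train_statement = create_model_sql(
    model="my_project.my_dataset.churn_model",
    source_table="my_project.my_dataset.churn_training",
    label_column="churned",
)
score_statement = predict_sql(
    model="my_project.my_dataset.churn_model",
    input_table="my_project.my_dataset.churn_new_customers",
)
print(train_statement)
print(score_statement)
```

Both statements can be run from the BigQuery console or any BigQuery client, and the predictions come back as a regular query result that can be written to another table, so the data never has to leave the warehouse.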
AI Platform
You need to know AI Platform, its built-in algorithms, hyperparameter tuning and distributed training, and which container images to use based on your modeling framework (e.g. tensorflow, pytorch, xgboost, sklearn). The following resources cover most of what you need to know for the exam:
Natural Language
AutoML API
Train your own high-quality machine learning custom models to classify, extract, and detect sentiment with minimum effort and machine learning expertise using Vertex AI for natural language, powered by AutoML. You can use the AutoML UI to upload your training data and test your custom model without a single line of code. - link
Natural Language API
The powerful pre-trained models of the Natural Language API empower developers to easily apply natural language understanding (NLU) to their applications with features including sentiment analysis, entity analysis, entity sentiment analysis, content classification, and syntax analysis. - link
Healthcare Natural Language AI
Gain real-time analysis of insights stored in unstructured medical text. Healthcare Natural Language API allows you to distill machine-readable medical insights from medical documents, while AutoML Entity Extraction for Healthcare makes it simple to build custom knowledge extraction models for healthcare and life sciences apps, with no coding skills required. - link
Translation
Cloud Translation API helps with translating text, discovering supported languages, detecting the language of a text, and creating and using glossaries when translating.
Vision AI
Create a dataset of images, train a custom AutoML model for Cloud or Edge, then deploy it. If Edge is the target, you can then export the model in TF Lite, TF.js, CoreML, or Coral Edge TPU format.
Video AI
Other products
Certification SWAG
After passing the exam, you can choose one of the official certification swags:
Which Cloud certification is best for machine learning engineer?
Google, AWS and Azure offer machine learning certifications for the cloud that can further your career. Learn what to expect from each exam, skills you need to know and study tips.
Is Google Cloud learning certification worth it?
Since the Professional Machine Learning Engineer exam tests you in equal parts on developing, training and improving ML models as well as the Google Cloud products and services that enhance and enable those models, it's a great certification if your network depends on Google products.
How do I become a Google ML engineer?
Exam overview.
Step 1: Get real world experience. Before attempting the Machine Learning Engineer exam, it's recommended that you have 3+ years of hands-on experience with Google Cloud products and solutions. ...
Step 2: Understand what's on the exam. ...
Step 3: Review the sample questions. ...
Step 5: Schedule an exam.
How do I prepare for GCP ML Engineering certification?
You should use flashcards, mind maps, or other techniques to remember a lot of details about solutions. The main focus of the test is MLOps and ML pipelines; however, do not discard data engineering knowledge and Machine Learning model-specific questions.