AWS Certification for Data Science | Is it Mandatory?

Data science engineering combines the skills of data science with the principles and practices of software engineering. It involves designing, building, and maintaining systems that collect, process, and analyze large amounts of data to extract valuable insights.

Data science engineers typically have a strong background in computer science and programming, as well as expertise in statistical analysis and machine learning. They are responsible for building and maintaining the infrastructure and systems that enable data scientists to work effectively, including data pipelines, data lakes, and data warehouses. They may also be involved in the design and implementation of machine learning models, as well as the deployment of those models into production environments.

Overall, the role of a data science engineer is to bridge the gap between data science and software engineering, bringing together the skills and knowledge needed to build and maintain large-scale data systems and to apply data-driven techniques to solve real-world problems.

AWS Certification for Data Science

There are several Amazon Web Services (AWS) certification options related to data science. These certifications demonstrate proficiency in various aspects of data science and cloud computing, and can be valuable for professionals working in the field. Here are some of the options:

  • AWS Certified Machine Learning – Specialty: This certification is designed for professionals who have experience designing, implementing, and maintaining machine learning solutions on the AWS platform. It covers topics such as machine learning concepts, AWS services for machine learning, and machine learning workloads on the AWS platform.
  • AWS Certified Big Data – Specialty: This is the predecessor of the Data Analytics – Specialty certification listed next; AWS renamed and updated the exam in 2020. It was aimed at professionals with experience working with big data who wanted to demonstrate their skills in using AWS technologies to process, store, and analyze large datasets. It covered topics such as big data concepts, AWS services for big data, and the design and implementation of big data solutions on the AWS platform.
  • AWS Certified Data Analytics – Specialty: This certification is for professionals who have experience working with data analytics and want to demonstrate their skills in using AWS technologies to process, store, and analyze data. It covers topics such as data analytics concepts, AWS services for data analytics, and design and implementation of data analytics solutions on the AWS platform.
  • AWS Certified Solutions Architect – Associate: While not specifically focused on data science, this certification is relevant for professionals working in the field as it covers the design and implementation of solutions on the AWS platform, including data processing and analytics. It is suitable for professionals who have experience designing and deploying cloud infrastructure.

It is worth noting that passing these exams requires a certain level of experience and knowledge. AWS recommends that candidates have at least one year of hands-on experience with the relevant technologies before attempting the certification exams.

Is an AWS Certification for Data Science Recommended?

Whether or not an AWS certification in data science is recommended for you depends on your goals and circumstances. Here are a few factors to consider:

  • Are you looking to build or improve your skills in data science and cloud computing? AWS certifications can be a good way to learn new technologies and best practices, and to demonstrate your expertise to potential employers.
  • Do you want to advance your career in data science? An AWS certification may be viewed positively by employers and could potentially lead to new job opportunities or salary increases.
  • Do you have the necessary experience and knowledge to successfully complete the certification exams? AWS recommends that candidates have at least one year of hands-on experience with the relevant technologies before attempting the certification exams. If you do not have this level of experience, you may want to gain more experience before pursuing a certification.

Ultimately, the decision whether or not to pursue an AWS certification in data science is up to you and should be based on your own goals and needs. It is a good idea to carefully research the certification requirements and assess your current skills and experience before making a decision.

Recently, between March and June, I got certified in Machine Learning and Data Science across the major cloud players: AWS, Azure and GCP. I decided to blog about that experience for people considering stepping up their ML game to hyper-scale.

Why I wanted to get these certifications: Back in my undergrad days, I took a special interest in getting trained in cloud technologies, just when "cloud" was starting to become a buzzword. I spoke about this interest so much that some of my classmates started calling me “Sanjana Cloud”. Fast-forward to today, and it’s not just hype but the present and future of how digital businesses are run. My career up to this point has been along the lines of an applied ML Research Engineer and a Data Scientist. So getting these certifications was, for me, a great opportunity to combine these two valuable skill streams and be ready for the industry’s future.

To folks who feel the same about these certifications, and are considering taking 1 or 2 or all of the following:

  • Microsoft Certified: Azure Data Scientist Associate
  • AWS Certified Machine Learning – Specialty
  • Google Cloud Certified: Professional Machine Learning Engineer

I use this article to compare these certifications under 8 categories:

  • Ease of preparation
  • Affordability
  • Exam experience
  • Challenge the Data Scientist in you
  • Scope of improvement post exam
  • Post certification benefits
  • Which one should you take?
  • Where would I use each?

Now that I’ve laid out the whole outline, I should also mention that these cloud providers and exam providers change their exam content and format at whatever intervals they see fit. So some of the points I am making now, in July 2021, might not apply if you’re reading this much later. Still, I hope the more timeless points in this comparison help you make informed decisions.

Ease of preparation:
The first step to taking these certifications is learning what each provider has to offer in end-to-end ML solutions. You also need to be strong on ML foundations and concepts, but that material can be found in various courses and online sources and doesn’t really change from one provider to another.

So it comes down to how easy and accessible it is to gain provider-specific knowledge (theory and hands-on) of data pre-processing, exploratory analysis, modeling, deployment and operations. Azure was the most accessible: they provide a free learning path that is very detailed, has everything you need to know in one place, and also lets you spin up assets to try out their offerings. They also offer free official practice tests.

AWS would be next. They provide an exam readiness course, which serves as an introduction and overview, but I definitely wouldn’t call it exhaustive or a single stop for the exam. Being ready for the exam requires, in addition to strong ML basics, thorough knowledge of the ocean of cloud ML offerings AWS has. And with AWS being the oldest cloud player, that ocean is wide and deep. The data engineering offerings in particular took me a while to wrap my head around, given the number of alternative instances and services built for the same overlapping use cases with only minute differences.

For this, one might use external paid courses like the one offered by SunDog Education, but also make sure to go through the developer documentation for all the relevant offerings on the AWS website. A trick I used to make my life easier with the developer docs was the Edit -> Speech -> Start Speaking option (or similar functionality) provided by most browsers. This worked for me because I remember things I hear better than things I read.

GCP was the hardest to prepare for, since the certification itself is very new and was still in Beta a while ago. If you look at their webpage for this exam, they seem to provide a learning path, but it only points you to several external links and courses, a lot of which are paid. Hence, it is definitely not a single stop resource. There is a Coursera course which might serve as one, but I did not try it as I was not willing to pay for it (and didn’t want to rush through the 7 day free trial either). People willing to spend can try it out and let me and others know if it was useful.

So how did I prepare for the GCP ML exam? I relied on a lot of Medium blogs from past test takers. I should say that I did get sidetracked by some of the blogs from Beta exam takers, since the beta exam and the current exam happen to differ in content. I came across Sathish VJ’s blog and used it as a guide to land on the initial developer docs and navigate to related developer docs from there. And I used the same Edit -> Speech -> Start Speaking technique again.

I can say this blog was a very good outline, but some of the links it points to may no longer be valid, since GCP keeps rebranding things (AI Platform to Vertex AI, for example). So whatever you read, you’ll have to read with both the old and the new branding in mind, as it’s not clear which one will be asked about in the exam. GCP has an opportunity to improve here by providing a better exam readiness course, ideally one that is updated with each rebranding. I also had access to the partner learning provided by GCP for a short period of 2 weeks through my employer, but that could use some restructuring too: in my opinion, it has a lot of repetitive information, especially around AI Platform Pipelines and Kubeflow, and is not organized very well under descriptive headings and sections.

So for this category my rating is: Azure > AWS > GCP

Affordability:
Phew, now that we’ve gotten past the biggest task of preparing for the exam, let’s sign up to take the exam and obtain the certification.

The affordability is as follows: Azure > GCP > AWS

Azure costs about $100, GCP costs $120 and AWS costs $300 (all before tax). One should consider validity along with cost too: both the Azure and GCP ML certifications are valid for 2 years, while AWS is valid for 3 years. Even with that one extra year, AWS costs 2x as much as Azure and a little above 1.5x as much as GCP on a per-year basis.

This is when we consider only the ML exams against each other. One more thing about AWS, though: if you take another, cheaper AWS exam first, you get 50% off the ML Specialty exam and also a free practice test. That can bring the cost of this single exam down a bit.
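
The per-year comparison above can be checked with a quick calculation. This is only a minimal sketch using the prices and validity periods quoted in this article (before tax). The `Exam` class and `cost_per_year` helper are hypothetical names introduced for illustration, and the discounted AWS figure assumes the 50% discount applies to the $300 list price and ignores the cost of the cheaper exam that earns it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exam:
    provider: str
    price_usd: float      # exam fee quoted in this article, before tax
    validity_years: int   # how long the certification stays valid


def cost_per_year(exam: Exam, discount: float = 0.0) -> float:
    """Exam fee spread over the validity period, after an optional discount."""
    return exam.price_usd * (1.0 - discount) / exam.validity_years


# Figures as reported in this article (July 2021); they may have changed since.
EXAMS = {
    "Azure": Exam("Azure", 100, 2),
    "GCP": Exam("GCP", 120, 2),
    "AWS": Exam("AWS", 300, 3),
}

yearly = {name: cost_per_year(exam) for name, exam in EXAMS.items()}

# Assumption: the 50% discount is applied to the full AWS list price.
aws_discounted = cost_per_year(EXAMS["AWS"], discount=0.5)

if __name__ == "__main__":
    for name, cost in yearly.items():
        print(f"{name}: ${cost:.2f} per year")
    print(f"AWS vs Azure: {yearly['AWS'] / yearly['Azure']:.2f}x")
    print(f"AWS vs GCP: {yearly['AWS'] / yearly['GCP']:.2f}x")
    print(f"AWS with 50% discount: ${aws_discounted:.2f} per year")
```

With these figures, Azure works out to $50 per year, GCP to $60 and AWS to $100. Under the stated assumption, the 50% discount brings AWS down to $50 per year, level with Azure.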

Exam Experience:
This includes experience during the duration of the exam and up to the point you receive results.

Both Azure and AWS are 3 hour long exams, with Azure having a varying number of questions (up to 80) and AWS a fixed 65. The exam provider I used for both was Pearson VUE. I noticed a calculator was available for Azure; I did not find one for AWS, and did not end up needing it either. GCP has 60 questions and a duration of 2 hours, and is delivered through the test provider Kryterion. Between Pearson VUE and Kryterion, I liked Kryterion’s UX better.

One might wonder how someone can manage GCP when it has nearly the same number of questions as AWS but provides an hour less. I would say GCP is ideally timed, since even 2 hours is a long time to stare non-stop at a screen (with even looking away from the screen being treated as a sign of cheating in remote-proctored exams during the pandemic). Besides, I ended up with half an hour to spare for AWS and an hour to spare for Azure. GCP was the one I finished right on time.
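
To put the timing in perspective, the average time available per question follows directly from the durations and question counts mentioned above. This is a minimal sketch: `minutes_per_question` is a hypothetical helper, and Azure is computed with its maximum of 80 questions, so its figure is a worst case rather than what every candidate will see.

```python
def minutes_per_question(duration_minutes: int, questions: int) -> float:
    """Average time budget per question, in minutes."""
    if questions <= 0:
        raise ValueError("questions must be positive")
    return duration_minutes / questions


# (duration in minutes, number of questions) as reported in this article.
# Azure's question count varies; 80 is the upper bound mentioned above.
EXAM_FORMATS = {
    "Azure": (180, 80),
    "AWS": (180, 65),
    "GCP": (120, 60),
}

budgets = {
    name: minutes_per_question(duration, questions)
    for name, (duration, questions) in EXAM_FORMATS.items()
}

if __name__ == "__main__":
    for name, budget in budgets.items():
        print(f"{name}: {budget:.2f} minutes per question")
```

On these numbers GCP allows 2 minutes per question, against roughly 2.8 for AWS and at least 2.25 for Azure.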

When it comes to variety of questions, Azure had the most: fill in the blanks, multiple choice, multiple response etc. That variety removed the monotony associated with these testing experiences, at least for me. AWS had the longest questions; it took me longer to figure out the question than the answer. Maybe they were trying to teach us the life skill of surviving meetings that should’ve been an email. And since I took GCP right after AWS on the same weekend, seeing direct questions on GCP was a relief. That is partly what made it possible to finish almost the same number of questions on time, even with one less hour.

After finishing the exam and submitting it, you will see a Pass / Fail status in the test software for all 3 exams. For Azure, your scores are computed and displayed right away. For AWS, you get the score only later, when you receive your certificate. For GCP, no score is provided at any time. Since the testing software kills and disables all other apps on your system, it doesn’t allow you to take a screenshot of that momentary flash of your result. With that being the case, the speed and mode of delivering proof of the result play an important role.

For Azure, I got the final result delivered to my email inbox within an hour of taking the test. For AWS, apart from that momentary on-screen status, you do not get preliminary results displayed or provided anywhere, and it can take up to 5 business days for the result to be finalized and for you to receive your certificate, scorecard and badge. For GCP, it can take up to 10 days, but you will see the preliminary result displayed on your WebAssessor candidate profile (the same one you used to register for the test).

I am guessing both AWS and GCP take time to rewatch your remote proctoring video to ensure there has been no cheating. Kind of invalidates the presence of the proctor, but I much prefer GCP’s way of providing proof of preliminary results first and then taking their own sweet time to validate your result, over AWS’s method of causing unnecessary anxiety to test takers who have no proof of their result, even if it’s for a few days. Imagine the amount of effort and money put into this certification, only to cost you more anxiety and stress after successfully completing it.

So for this category my rating is: Azure > GCP > AWS

Challenge the Data Scientist in you:
This, I think, is inversely proportional to the ease of preparation. Because if everything is handed to you, what’s the challenge in it? It is also in line with the type of questions and the race against the exam clock. While I did say that AWS questions were really long, I feel that’s more a test of someone’s reading comprehension skills (something my colleague also mentioned) than of pure Data Science and analytical skills. GCP, I found, tested core and applied Data Science skills in a more direct manner, and also tested how well you do under pressure by being the shortest of the exams.

If you’re someone who likes to challenge yourself, the rating is: GCP > AWS > Azure

Scope of improvement post exam:
One can improve only when there’s feedback, and that is something you definitely get with the AWS certification. You get scores plus a section-wise rating on whether you performed well or need improvement in that section. Azure also displays a score but no detailed breakdown. It still helps you assess how far you have to go to reach the goal you’ve set for yourself. GCP, however, does not provide any scores or feedback, so it’s hard to assess where you stand.

For this category my rating is: AWS > Azure > GCP
