Microsoft DAT208x: Introduction to Python for Data Science, a review

In my quest to complete the Microsoft Professional Program for Data Science, I took their course Introduction to Python for Data Science earlier this month, with disappointing results.

It could be that I had very different expectations, or that I already have too much background in Python for another introductory course, but I wasn’t impressed and I’m loath to pay for the verified certificate.

This felt more like an overview than a proper introduction. If this were a university class, it would have been the first day, when the instructor hands out the syllabus and walks through the course expectations.

Would I discourage you from taking the course? Yes, actually.

(To follow my progress on the program, check out the Microsoft Professional Program tag)


The Structure

DAT208x claims to “cover Python basics and prepare you to undertake data analysis using Python”. Like the Microsoft courses that come before it, it is a self-paced course made up of video lectures and lab exercises.

The modules are as follows:

  1. Python Basics
  2. Lists
  3. Functions and Packages
  4. Numpy
  5. Plotting with Matplotlib
  6. Control Flow and Pandas

This course is brought to you by a partnership between Microsoft and DataCamp, the latter an online data science school similar to DataQuest. In an old post I mentioned my apprehension about DataCamp, as I’ve heard they favor R over Python, but I decided to give them the benefit of the doubt and try their Python course.

It’s due to this partnership that most of the lab activities live outside of edX, i.e., we’re redirected to DataCamp’s interface for the lab exercises.

These exercises are the meat of the course. If you’ve tried DataQuest before then the DataCamp interface should be familiar:

Instructions are to the left, interactive Python shell to the right. After you submit your answer, DataCamp checks whether your code is correct.

Unlike other Microsoft courses I’ve tried, this one has a final exam. In this exam you are given 4 hours to answer 50 questions: a mixture of knowledge checks, pseudocode, and actual coding.

Considering the quizzes, exercises, and final exam, you need to score at least 70% to pass the course. Pretty easy considering 40% is just course surveys.


The Positives

Given the similarity of DataCamp to DataQuest it should be no surprise that the two share the same advantages: gamification and hands-on exercises.

Unlike DataQuest though, the DataCamp course offers lectures, which addresses my complaint about DataQuest’s lack of explanations.

These lectures are given by Filip: an engaging instructor who’s obviously excited about what he’s teaching. His explanation of logical operators (e.g. and, or, not) is one of the best.
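If you haven’t met those operators yet, here’s a quick sketch of what they do. This is my own illustration, not code from the course, and the variable names are made up.

```python
# Hypothetical values, chosen only for illustration.
has_python_background = True
paid_for_certificate = False

# `and` is True only when both sides are True.
both = has_python_background and paid_for_certificate

# `or` is True when at least one side is True.
either = has_python_background or paid_for_certificate

# `not` flips a boolean.
unpaid = not paid_for_certificate

print(both, either, unpaid)  # prints: False True True
```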

I’m surprised they introduced data analysis libraries like Numpy and Pandas, as these are considered intermediate-level Python by DataQuest. I’m not complaining though. I was happy to get the chance to try them out.

I fangirled a bit during the Matplotlib module as they used Hans Rosling’s famous data visualization from his TED talk, The best stats you’ve ever seen, for demonstration.

I never knew you could use Python to recreate it! Good to know.


The Negatives

If your objective is to learn Python, this isn’t the course for you.

The coverage is too light. Even the lab exercises don’t present a challenge to beginner coders. I guess the point is to be encouraging in that anybody can code, but the output won’t be useful in real life.

Speaking of output, I can’t help but compare it to Udacity. With Udacity you at least have a working search engine by the end of the course. DAT208x has nothing, except maybe for that one Rosling graph.

I find it odd that loops weren’t included as part of the course. Aren’t they considered basic? They usually follow shortly after if/else statements in other programming courses.

And while I do commend Filip for being an excellent instructor, he’s no match for Udacity’s Dave Evans. Dave spoiled other online instructors for me!

I hated the final exam. I got some items wrong because, while I knew how to write the actual code, I didn’t understand the pseudocode some questions required. I also wasn’t expecting memorization-type items to come up in what I thought was a coding exam.

But most especially, I hated that I couldn’t go back and redo my previous answers. Isn’t a big part of coding about cleaning and debugging code?!



I’m confused what this course is for.

On one hand, it covers some Python basics but it isn’t sufficient for a Python course. I’d say other MOOCs like the one over at Udacity do a better job.

On the other hand, if it was meant to introduce common data analysis libraries, then it’s too much to call itself a MOOC. It’s nothing more than a demonstration of data science libraries.

What happened is they tried to be too many things at the same time, so there wasn’t any depth to the content.

edX claims the course will take six weeks, while Microsoft claims 16-32 hours.

In reality it took me 5.5 hours over three days.

Time spent on Microsoft Intro to Python for Data Science, as tracked by toggl.


That’s how easy it was.

While I may have an advantage for having taken the Udacity course beforehand, it doesn’t change the fact that I went through all the videos, quizzes, and lab exercises.

There was nothing challenging there.

I remember that part of the reason I took so long with the Udacity course was that whenever I struggled with a concept (say, recursive functions and modulo %), I would answer the heck out of the problem sets until the concept sank in. There isn’t an option like that with Microsoft/DataCamp.
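To give a flavor of those two concepts, here’s a minimal sketch that uses both. It’s my own illustration, not a problem from either course: a recursive greatest-common-divisor function (the name `gcd_recursive` is made up) that leans on `%`, following Euclid’s algorithm.

```python
def gcd_recursive(a, b):
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    # Base case: nothing left to divide by, so `a` is the answer.
    if b == 0:
        return a
    # Recursive case: `a % b` is the remainder of a divided by b,
    # and gcd(a, b) equals gcd(b, a % b).
    return gcd_recursive(b, a % b)


print(gcd_recursive(48, 18))  # prints: 6
```

The function keeps calling itself with smaller numbers until the remainder hits zero, which is exactly the kind of thing that only sinks in after working through a few problems by hand.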

In addition, I want to question the rating of lab exercises. By default it’s set to full stars, and you manually have to pull the rating down otherwise.

Psychologically people will be too lazy to put in the manual effort to change the default rating, so DataCamp will be receiving full marks for student experience just because they know we’re lazy. Sneaky!

All in all, no, I would not recommend this course.

