Microsoft DAT208x: Introduction to Python for Data Science, a review

In my quest to complete the Microsoft Professional Program for Data Science, I took their course Introduction to Python for Data Science earlier this month, with disappointing results.

It could be that I had very different expectations, or that I already have too much background in Python for another introductory course, but I wasn’t impressed and I’m loath to pay for the verified certificate.

This felt more like an overview than a proper introduction. If this were a university course, it would have been the first day, when the instructor hands out the syllabus and walks through the course expectations.

Would I discourage you from taking the course? Yes, actually.

(To follow my progress on the program, check out the Microsoft Professional Program tag)

 

The Structure

DAT208x claims to “cover Python basics and prepare you to undertake data analysis using Python”. Like the Microsoft courses that come before it, it is a self-paced course made up of video lectures and lab exercises.

The modules are as follows:

  1. Python Basics
  2. Lists
  3. Functions and Packages
  4. Numpy
  5. Plotting with Matplotlib
  6. Control Flow and Pandas

This course is brought to you by a partnership between Microsoft and DataCamp, the latter an online data science school similar to DataQuest. In an old post I mentioned my apprehension about DataCamp, as I’ve heard they favor R over Python, but I decided to give them the benefit of the doubt and give their Python course a try.

It’s due to this partnership that most of the lab activities take place outside of edX; that is, we’re redirected to DataCamp’s interface for the lab exercises.

These exercises are the meat of the course. If you’ve tried DataQuest before, then the DataCamp interface should be familiar: instructions are on the left, and an interactive Python shell is on the right. After you submit your answer, DataCamp checks whether your code is correct.

Unlike other Microsoft courses I’ve tried, this one has a final exam. In this exam you are given 4 hours to answer 50 questions: a mixture of knowledge checks, pseudo coding, and actual coding.

Across the quizzes, exercises, and final exam, you need to score at least 70% to pass the course. That’s pretty easy, considering 40% of the grade is just course surveys.
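
To put a number on “pretty easy”, here is a quick back-of-the-envelope sketch. It assumes the surveys are worth 40 of 100 grade points and that you get full credit just for completing them; both are my reading of the grading breakdown, not something the course spells out this way. The function name `required_share` is made up for illustration.

```python
def required_share(pass_mark, free_points, total=100):
    """Fraction of the remaining (non-free) points needed to reach pass_mark.

    All arguments are whole grade points. Assumes the free points
    (here, the course surveys) are awarded in full.
    """
    remaining = total - free_points            # points that take actual work
    needed = max(0, pass_mark - free_points)   # points still missing to pass
    return needed / remaining


# 70 points to pass, 40 assumed to come from the surveys
print(f"{required_share(70, 40):.0%}")  # prints 50%
```

Under those assumptions, you only need half of the points from the quizzes, exercises, and final exam combined.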

 


Udacity CS101: Intro to Computer Science, a review

I’ve been trying to learn how to code in Python for a while now. Of all the beginner resources I’ve tried, Udacity’s Intro to Computer Science (UD CS101) has been my favorite.

To clarify: I’m not learning Python with the intention of becoming a software developer. Rather, I like analyzing data, and I hear Python can help with that. R can too, but Python 1) is recommended for beginners, and 2) has more applications outside of big data.

I do have some programming experience, though never anything formal, never to this depth, and never in Python.

 

THE STRUCTURE

UD CS101’s premise is that you’ll create “The Next Google”: the course teaches you how to build your own search engine.

The self-paced course is broken down into 7 modules*. Each module introduces a new concept that helps you improve your search engine.

Each module contains:

  • Videos. Here the instructor explains the theory behind the concepts and demonstrates how to use them on the search engine.
  • Q&As. These help nail down the concepts. These aren’t too difficult and are usually similar to the demonstrations.
  • Problem sets. These are machine problems that build on the concepts you’ve learned so far and are more challenging than the Q&As.

By the end of the course you will have built a search engine with an algorithm similar to that of AltaVista, which was the #1 search engine in the ’90s before Google took over.

For your class project you then build a mini social network based on the concepts you learned from the course.

*As of this writing, Udacity has revamped their classrooms, so this modular approach may no longer apply.

 


What’s the code under the hood? Find out with Gomix

There’s a new tool on the block: Gomix.

The premise is simple:

  1. Find a piece of code you’d like to tweak (or are maybe just curious about).
  2. Tweak it.
  3. Save it so others can tweak it too.

That’s it. Easy.

It’s perfect for all those times you’ve come across a program and wondered, “How did they do that?”

Gomix lets you not only view the code underneath, but play around with it to create something new.

It’s like a less structured version of GitHub, which has its own pros and cons… I think it’s more fun?

Gomix is by the creators of Trello, so we’re guaranteed similar levels of collaboration and intuitiveness.

More information available here.

 

P.S. I’ve had this post in my drafts for a while now and apparently forgot about it. Oops. Gomix is no longer as “new” as it was in the first draft.

Danna on Data

It’s been a while since I’ve talked about my data analysis self-study.

I’ve been trying this and that, but haven’t felt anything was worth writing about. I mean, who would want to know that I tried something and failed, right?

Oh wait. Me. I would want to know.

When I’m about to try something new, like skincare or a restaurant, I look up blogs for reviews. I try to see if I can relate to the blogger and put myself in their shoes–Would I have failed as well?

It saves me a lot of effort because someone else has already gone through the experience for me.

That’s why I’m writing about all my data science-related updates so far, incomplete and disorganized as they are. Maybe it’ll help.


DataQuest: Day 3-ish.

Quick update to say I’ve given DataQuest a try.

It’s radically different from the Microsoft or MOOC approach. Zero videos, all lab work. They’re big fans of the learn-by-doing approach.

I’m still on the (free) Python introduction, but already I can say it’s a step above Interactive Python. There are fewer walls of text and more chances to play around with code.

It costs ~29 USD a month, though. I haven’t been on the program long enough to judge whether it’s worth it.

Also, I’ve decided to go for the Data Analyst path. I feel it’s less intimidating than the Data Scientist path. And I like that the progress bar goes up faster due to the smaller scope (I’m a bit of a completionist gamer, sorry). I can switch tracks later on anyway.

A more in-depth review is in the works.