Microsoft DAT208x: Introduction to Python for Data Science, a review

In my quest to complete the Microsoft Professional Program for Data Science, I took their course Introduction to Python for Data Science earlier this month to disappointing results.

It could be that I had very different expectations, or that I already have too much background in Python for another introductory course, but I certainly wasn’t impressed and I’m loath to pay for the verified certificate.

In a nutshell: This felt more like an overview than a proper introduction. If this were a university setting, this would have been the first day, when the instructor gives out the syllabus and walks through the course expectations. If they’re a smart aleck, they’d force an awkward icebreaker.

Would I discourage you from taking the course? Yes, actually.

(To follow my progress on the program, check out the Microsoft Professional Program tag)

 

The Structure

DAT208x claims to “cover Python basics and prepare you to undertake data analysis using Python”. Similar to the Microsoft courses that come before it, it is a self-paced course composed of modules that consist of video lectures and lab exercises.

The modules are as follows:

  1. Python Basics
  2. Lists
  3. Functions and Packages
  4. Numpy
  5. Plotting with Matplotlib
  6. Control Flow and Pandas

This course is brought to you by a partnership between Microsoft and DataCamp, an online data science school similar to DataQuest. In an old post I mentioned my apprehension about DataCamp, as I’ve heard they favor R over Python, but I decided to give them the benefit of the doubt and give their Python course a try.

It’s due to this partnership that most of the lab activities are outside of edX; i.e., we’re redirected to DataCamp’s interface for the lab exercises.

These exercises are the meat of the course. If you’ve tried DataQuest before then the DataCamp interface should be familiar:

Instructions to the left, interactive Python shell to the right.

Unlike other Microsoft courses I’ve tried, this one has a final exam. You are given 4 hours to answer 50 questions: a mixture of knowledge checks, pseudo coding, and actual coding.

Counting the knowledge checks, exercises, and final exam, you need to score at least 70% to pass the course. An easy feat, considering 40% of the grade is just course surveys.
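To put a number on how easy that is, here is a rough sketch of the arithmetic. It assumes the surveys really are worth 40% of the total grade and that you get full credit just for completing them; the function name is my own, not anything from the course.

```python
def required_share_of_graded_work(pass_mark_pct, free_pct):
    """Share of the remaining (non-survey) points needed to reach the pass mark.

    Assumes the survey points (free_pct) are awarded in full just for
    completing the surveys. Both arguments are whole-number percentages.
    """
    remaining = 100 - free_pct                   # points that are actually graded
    needed = max(pass_mark_pct - free_pct, 0)    # points still missing after the surveys
    return needed / remaining


share = required_share_of_graded_work(pass_mark_pct=70, free_pct=40)
print(f"You need {share:.0%} of the graded work to pass")
```

Under those assumptions, you only need to get half of the knowledge checks, exercises, and exam right.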

 


Danna on Data

It’s been a while since I’ve talked about my data analysis self-study.

I’ve been trying this and that, but haven’t felt anything was worth writing about. I mean, who would want to know that I tried something and failed, right?

Oh wait. Me. I would want to know.

When I’m about to try something new, like skincare or a restaurant, I look up blogs for reviews. I try to see if I can relate to the blogger and put myself in their shoes–Would I have failed as well?

It saves me a lot of effort because someone else has already gone through the experience for me.

That’s why I’m writing about all my data science-related updates so far, incomplete and disorganized as they are. Maybe it’ll help.


I may not love data science

Rather, I may not love data science as a whole. Just a part of it.

I’ve been having these thoughts since I started the statistics course for the Microsoft Data Science Program.

It’s boring.

I’m sorry, but it’s true. I’ve made no secret of how much I hate lectures. This course… it’s sickeningly brimming with them. And there have been no lab activities so far. Only reading comprehension quizzes which, frankly, can be answered by a simple Ctrl+F*.

My lack of interest is reflected in my progress. I used to average around a month per course, but now I’ve been stuck on the introductory modules for a while. It’s not realistic for me to finish by the end of the year. It’s put my target study schedule at risk.

It’s been so bad it’s made me question if I’m cut out for this whole big data business after all.

BUT BUT BUT I still have some semblance of faith.

Maybe, just maybe, I don’t have to be a data scientist. Maybe I just have to be good enough to be part of a data science team. As the keynote speaker at the recent big data conference said,

He calls the data scientist a unicorn: difficult to find and even harder to keep. For most businesses starting out with analytics, investing in a data scientist will be too much of an overhead. Instead, he recommends building out a data science team with distributed data science skills; e.g., team members would include a statistics expert, a communicator, a programmer, a visualizer, etc.

–my notes from Isaac Reyes’ keynote speech during #BigDataPH2016

I’m pretty confident in my communication and visualization skills. I have some programming background. It’s statistics that’s my crux. I know I’ll have to study it anyway, just so I can speak the same language.

But I have to accept I may not have the same affinity for statistics as I do for other data science skills.

It’s helping that I’ve been working on infographics and Excel charts lately. It’s reminded me of how much I love visualizing information. And discovering FlowingData? Peg, right there.

Actions

Of course, I can’t allow this feeling of disinterest to fester. I have to move on. So here is some of the productive procrastination I’ve been doing:

  1. Make myself excited about data science again.


I bought the book Dataclysm on a whim. It’s not exactly a data science book. But it is full of insights the author picked up while analyzing his own data from managing the popular dating site OkCupid.

It’s a fascinating look into what kind of story numbers can tell you. I’m just on the first chapter on dating, and already I find it interesting how women are much more transparent with their love interests than men. And a bit disheartened to find how men are obsessed with youth.

It’s this ability to tell stories using numbers that got me curious about data science in the first place.

2. Switch gears.

I planned to start on coding when the new year starts, but I wanted to code so much more than to sit through another statistics lecture.

I’d never coded before so, at a reader’s recommendation, I started on Interactive Python’s “How to Think Like a Computer Scientist.” Except I’ve surprised myself by saying,

Hey, I know this.

It might not be much, but apparently I do have some background in programming. I’d forgotten how much coding I did back in school, and even at my first official job (creating and modifying UNIX accounts).

3. Don’t just switch gears, switch the whole damn car.

One problem I’m finding with MOOC-based learning is how it’s heavy on the videos but light on follow-through. I thought the problem was already pretty bad with the Microsoft courses, but this statistics MOOC by Columbia just takes the cake. It’s a big problem for hands-on learning types like myself.

Unsurprisingly, people have complained, er, asked about this before, and one answer that frequently pops up is DataQuest:

At Dataquest, our unique teaching approach means that you’ll be able to learn all the relevant data science concepts, then build your own projects. These projects will help build your skills, and also form a portfolio that you can show to potential employers.

–DataQuest, “Why Learn Data Science?”

I’m finding this prospect of project-based learning very appealing. I’ll give it some more thought, but if any of you have tried it before I’d appreciate the feedback.

Locally, Data Seer offers data science training. Based on the schedules, though, it looks to be one of those workshop-type trainings I attend just for compliance. The ones I don’t really learn from but that look damn nice on the resume. I’d be happy to be corrected though.

So… the hunt for a learning style that works is still on!

*I know, I know, it’s not the proper way to learn blah blah blah. Cut me a break ok? I’m an engineer. It’s ingrained in me to try to find the most efficient way.

Microsoft DAT206x: Analyzing and Visualizing Data with Excel Review

You never actually analyze and visualize data, but this course is worth taking as it’s a good introduction to using Power Pivot and Power Query–both of which are useful for managing large amounts of data in Excel. Just make sure you manage your expectations.

Update: To follow my progress in this program, check the Microsoft Professional Program tag.

 

Context

For those who are following this blog for my data science updates, it might be of interest to you that I am still working on Microsoft’s Professional Program for Data Science (in beta). I have recently completed my second course, Analyzing and Visualizing Data with Excel.

This was my gateway course to the program. Excel enthusiasts at work had recommended it as a good introduction to Power Pivot, and it was only later that I found out the course was part of a larger data science program.

My primary purpose for taking the course was increasing my proficiency in Excel. I currently manage a large-scale project with an equally large-scale tracking spreadsheet. The spreadsheet easily gets out of hand due to the sheer number of assets involved and because it pulls data regularly from multiple data sources. I was hoping the course would help me clean up the data and make it sustainable to maintain in the long run.

Because of this, I’m reviewing the course from a more practical Can I use this at work? perspective rather than its relation (or lack thereof) to data science.

It took me about a month to complete, starting September 2016. You can follow my progress in the MS Data Science Program by using my tag Microsoft Professional Program.


Microsoft DAT101x Data Science Orientation Review

At $25 (beta price), this orientation course is overpriced for what it offers.

Update: To follow my progress in this program, check the Microsoft Professional Program tag.

 

I’ve mentioned in my Getting Started with Data Science tips that I’m currently taking the Microsoft Professional Program for Data Science.

The program is still in beta, so:

1. Microsoft needs the feedback, and

2. Potential students would want to know if the program will be worth their time, money, and effort.

The program is pretty extensive, so I thought it best to break up my reviews by course as I take them. This review is on the orientation course, DAT101x Data Science Orientation.

For context, I took the course around late September 2016, and got my certificate early October.

 
