Friday, July 10, 2015

Google Tag Manager certificate

One more certificate for mastering one more very useful Google tool. This time it is Google Tag Manager. I had already used it in my work, but now my knowledge of it is more systematic.
Looking forward to working more closely with GTM while implementing and customizing analytics on websites and mobile.

Monday, July 6, 2015

Kaggle Walmart competition

Just pushed my code to GitHub for the competition that gave me my first Kaggle badge (top 25%).
I used Gradient Boosting via the R package caret, but before that there was a lot of preprocessing.
This was my second Kaggle competition. I finished 86th out of 485 participants, with a score within 1% of the winner's.
Probably not bad for a start :)

Sunday, July 5, 2015

Google Analytics certified

I've worked with Google Analytics almost since its birth in 2005 and never paid much attention to certification. But now it's done! I just passed the GAIQ exam with a score of 95%. So now I'm Google Analytics certified.

Thanks to Google for that brilliant tool!

Tuesday, June 9, 2015

Wine quality assessment app

The Developing Data Products course, the ninth and last in Coursera's Data Science Specialization, is finished. As a course project, we had to build a web application with elements of machine learning and then make a pitch presentation for that app. We were expected to use the Shiny package and platform to publish our projects on the web.

I found an interesting dataset in the UCI repository with the chemical characteristics and assessed quality of Portuguese wines. It gave me the idea that one can predict wine quality from a chemical analysis.
I chose Random Forest as the modelling tool. I hosted the model right on my GitHub account:

And there you are! :)

The pitch presentation was limited to five pages, including the cover. I added a bit of humor to it :) Use the right and left arrow keys on your keyboard to turn pages.

Tuesday, May 5, 2015

Machines are learned!

It finally happened! The Machine Learning course on Coursera is finished.
Mixed feelings, as after an excellent movie: it's great that the good guys have won, but I really want to know what happens next! :)

All the lectures were very engaging. I watched many of them two or three times. Now I have a notebook full of lecture notes. I hope Andrew will eventually write a book, which I will definitely buy.

The programming assignments were far from "annoying homework". They were interesting little challenges and, at the same time, a continuation of the lectures with many practical tips and tricks inside.

I'm proud to have earned the maximum grade, 100%. As a course bonus, I've connected on LinkedIn with lots of my classmates from all around the world! My network has grown severalfold.

I want to say Thank You Very Much to our instructor, Andrew Ng, professor at Stanford University. That was an amazing experience!

Wednesday, April 8, 2015

Work out the right way

Do you do physical exercises? Do you do them right? :)
Our course project in the Practical Machine Learning course was on this topic. Several people wore accelerometers and performed exercises both the right way and (deliberately) the wrong way. Our task was to build a model that predicts whether a person does barbell lifts correctly or incorrectly.
I used the Random Forest and Gradient Boosting algorithms. RF won in cross-validation. Here is my report:

As usual, many thanks to our instructor Jeff Leek, professor at Johns Hopkins University!

Saturday, March 7, 2015

Statistical Inference and Regression Models

That was different! Two new courses in the Data Science Specialization on Coursera. Two courses of almost pure mathematics. Pure pleasure! :) Many thanks to my instructor Brian Caffo, professor at Johns Hopkins University!
Even though these concepts are mostly familiar to me thanks to my alma mater, Moscow State University, it's very useful to revisit them in combination with the R language.

Monday, February 2, 2015

A project for Reproducible Research

Just finished a big report for the course project, part of the Reproducible Research course on Coursera.
The analysis covered the most severe weather events and their consequences for population health and the economy.
A couple of interesting details.
I used the generalized Levenshtein distance to clean the raw data, because the event coding in the formal instructions and in the real database wasn't identical.
One of the requirements was to draw at least one plot, so I made two US maps showing the impact of weather events on population health and the economy.
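
For the curious, here is roughly how Levenshtein-based matching of event names can work. This is only a minimal Python sketch, not the code from my report: the list of official names, the sample raw labels, and the function names are made up for illustration. "Generalized" here means that insertions, deletions, and substitutions can each have their own cost.

```python
def levenshtein(a, b, ins=1, dele=1, sub=1):
    """Generalized (weighted) Levenshtein distance: cost of turning a into b."""
    # prev[j] holds the cost of turning the processed prefix of a into b[:j].
    prev = [j * ins for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * dele]  # turning a[:i] into the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + dele,                          # delete ca
                curr[j - 1] + ins,                       # insert cb
                prev[j - 1] + (0 if ca == cb else sub),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def closest_event(raw, official):
    """Map a raw label to the official name with the smallest distance."""
    key = raw.strip().upper()
    return min(official, key=lambda name: levenshtein(key, name))


# Hypothetical official names and messy raw labels, for illustration only.
OFFICIAL = ["TORNADO", "FLASH FLOOD", "THUNDERSTORM WIND", "HAIL"]

for raw in ["FLASH FLOODS", " thunderstorm winds", "TORNDAO"]:
    print(repr(raw), "->", closest_event(raw, OFFICIAL))
```

In practice it also makes sense to set a maximum allowed distance, so that a label with no reasonable match is flagged for manual review instead of being forced onto the nearest name.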

Tuesday, January 27, 2015

Machine Learning

Enrolled in the Machine Learning course on Coursera.
The instructor is Stanford professor and Coursera co-founder Andrew Ng! Looking forward to a very fun couple of months :)