Sunday, April 10, 2016

Scroll depth tracking

Understanding how users scroll the pages of your site is important for measuring engagement. Combined with metrics like “time on page”, it gives you valuable information about people’s interest in your content.

I know, I know... If you google something like “scroll depth tracking”, you’ll find a ready-to-use Google Analytics plugin, which is apparently quite good. But what if you don’t want to involve jQuery, or you wish to tune every subtle detail, or you have another reason to reject a ready-made solution? Also, I don’t really like the idea of sending scroll stats (e.g. “50% achieved”) to GA before the user has finished working with the page.

Here is my approach.
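
As a rough illustration of the idea only (this is not the implementation from the full post), the sketch below keeps track of the deepest point reached and reports it once, instead of firing an event at every threshold. The names `scrollDepthPercent`, `MaxScrollDepthTracker` and `sendToAnalytics` are hypothetical, and the browser wiring is left in comments so the block also runs outside a browser.

```typescript
// Percentage of the document that has been scrolled into view (0 to 100).
function scrollDepthPercent(
  scrollTop: number,
  viewportHeight: number,
  documentHeight: number
): number {
  if (documentHeight <= 0) return 0;
  // The bottom edge of the viewport marks how far the user has seen.
  const seen = Math.min(scrollTop + viewportHeight, documentHeight);
  return Math.max(0, Math.round((seen / documentHeight) * 100));
}

// Remembers the maximum depth and reports it a single time.
class MaxScrollDepthTracker {
  private maxPercent = 0;
  private reported = false;
  private report: (maxPercent: number) => void;

  constructor(report: (maxPercent: number) => void) {
    this.report = report;
  }

  // Call on every scroll event; scrolling back up never lowers the maximum.
  update(scrollTop: number, viewportHeight: number, documentHeight: number): void {
    const current = scrollDepthPercent(scrollTop, viewportHeight, documentHeight);
    if (current > this.maxPercent) this.maxPercent = current;
  }

  // Call when the user leaves the page; later calls are ignored.
  flush(): void {
    if (this.reported) return;
    this.reported = true;
    this.report(this.maxPercent);
  }
}

// Assumed browser wiring (sendToAnalytics is a hypothetical reporting function):
// const tracker = new MaxScrollDepthTracker((p) => sendToAnalytics(p));
// window.addEventListener("scroll", () =>
//   tracker.update(window.scrollY, window.innerHeight,
//                  document.documentElement.scrollHeight));
// window.addEventListener("pagehide", () => tracker.flush());
```

Reporting once on page exit means a single hit per pageview carries the final depth, which is the behaviour the paragraph above asks for; the cost is that the hit can be lost if the browser kills the page before the handler runs.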

Friday, March 11, 2016

Tracking categorical variables with Google Analytics and Google Tag Manager

Recently I was working with a magazine’s website, and one of the tasks was to build a report on the authors’ popularity. Surprisingly, this turned out to be a bit challenging.

I needed to sum up the visits to all of the article pages for each author. Using these numbers, I would build a rating of authors. The problem is that any author can have many articles, and any article can be written by several authors (coauthors). In SQL terms, it’s a many-to-many relationship.

First of all, ‘author’ is a categorical variable, since it takes discrete string values, such as an author’s first name concatenated with a surname. Therefore, we can’t use Custom Metrics in Google Analytics, since a custom metric can only be a number, a time or a currency value. There is no dictionary option here.

On the other hand, Google Analytics offers Custom Dimensions for such cases. If you have implemented Google Tag Manager, you can just create a new custom dimension ‘author’ and pass the author’s name from each article’s page through the Data Layer. But not in our many-to-many case!

If an article was written by two or more authors, you can’t pass one long string with the authors’ names, like this: “A.Johns, B.Johnson, C.Jacobs”. This is not an option, because you need to make three (in this example) distinct database entries: one for each author.

So we have a categorical variable with multiple values for each pageview.

Here is my approach to handling the situation.
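
To make the reporting goal concrete, here is a minimal sketch of the aggregation itself; it is not the Google Analytics / Google Tag Manager setup from the full post. It assumes the pageview counts per article are already available as plain records, and the names `ArticlePageviews`, `splitAuthors`, `pageviewsByAuthor` and `authorRating` are hypothetical.

```typescript
// One row per article: a comma-separated author string and its pageviews.
interface ArticlePageviews {
  path: string;
  authors: string;
  pageviews: number;
}

// "A.Johns, B.Johnson" -> ["A.Johns", "B.Johnson"], without blanks or duplicates.
function splitAuthors(authors: string): string[] {
  return authors
    .split(",")
    .map((name) => name.trim())
    .filter((name, index, all) => name.length > 0 && all.indexOf(name) === index);
}

// Credits every coauthor of an article with all of that article's pageviews.
function pageviewsByAuthor(rows: ArticlePageviews[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    for (const author of splitAuthors(row.authors)) {
      totals[author] = (totals[author] ?? 0) + row.pageviews;
    }
  }
  return totals;
}

// Sorts authors by total pageviews, most popular first.
function authorRating(
  totals: Record<string, number>
): Array<{ author: string; pageviews: number }> {
  return Object.keys(totals)
    .map((author) => ({ author, pageviews: totals[author] ?? 0 }))
    .sort((a, b) => b.pageviews - a.pageviews);
}

// Made-up sample data, for illustration only.
const sampleRows: ArticlePageviews[] = [
  { path: "/article-1", authors: "A.Johns, B.Johnson, C.Jacobs", pageviews: 100 },
  { path: "/article-2", authors: "B.Johnson", pageviews: 50 },
];
console.log(authorRating(pageviewsByAuthor(sampleRows)));
```

Each multi-author article becomes one entry per author before summing, which is exactly the "three distinct database entries" requirement described above.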

Friday, July 10, 2015

Google Tag Manager certificate

One more certificate, for mastering one more very useful tool from Google. This time it is Google Tag Manager. I've already used it in my work, but now my knowledge of it is more systematic.
I'm looking forward to working more closely with GTM while implementing and customizing analytics on websites and mobile.

Monday, July 6, 2015

Kaggle Walmart competition

I've just published on GitHub the code that earned me my first Kaggle badge (top 25%).
https://github.com/Oleg-Davydov/kaggle_walmart_competition
I used Gradient Boosting with the R package caret. But before that, there was a lot of preprocessing.
This was my second Kaggle competition. I finished 86th out of 485 participants, with a score within 1% of the winner's.
Probably not bad for a start :)

Sunday, July 5, 2015

Google Analytics certified

I've worked with Google Analytics almost since its birth in 2005, and I never paid much attention to certification. But now it's done! I just passed the GAIQ exam with a score of 95%. So now I'm Google Analytics certified:
https://www.google.com/partners/#i_profile;idtf=112782820638777969559;

Thanks to Google for that brilliant tool!

Tuesday, June 9, 2015

Wine quality assessment app

I've finished Developing Data Products, the ninth and last course in Coursera's Data Science Specialization. As a course project, we had to build a web application with elements of machine learning and then make a pitch presentation for that app. We were expected to use the Shiny package and the shinyapps.io platform to publish our projects on the web.

In the UCI repository, I found an interesting dataset of Portuguese wines' chemical characteristics and assessed quality. It gave me the idea of predicting wine quality from a chemical analysis.
I chose Random Forest as the modelling tool. I hosted the model right on my GitHub account:
https://github.com/Oleg-Davydov/winequality

And there you are! :)
https://laborant.shinyapps.io/winequality

The pitch presentation was limited to five pages, including the cover. I added a bit of humor to it :) Use the right and left arrow keys on your keyboard to turn pages.
http://oleg-davydov.github.io/winequality/Rpresenter.html

Tuesday, May 5, 2015

Machines are learned!

It finally happened! I've finished the Machine Learning course on Coursera. https://www.coursera.org/learn/machine-learning
Mixed feelings, as after an excellent movie: it's great that the good guys have won, but I really want to know what happens next! :)

All the lectures were very engaging. I watched many of them two or three times. Now I have a notebook full of lecture notes. I hope Andrew will eventually write a book, which I will definitely buy.

The programming assignments were far from "annoying homework". They were interesting little challenges and, at the same time, a continuation of the lectures with many practical tips and tricks inside.

I'm proud to have earned the maximum grade, 100%. As a course bonus, I've connected on LinkedIn with lots of my classmates from all around the world! My network has grown several times over.

I want to say thank you very much to our instructor, Andrew Ng, a professor at Stanford University. That was an amazing experience!