User Flow Report

kin_wai_koo
Inactive

Here's a little servlet I wrote that pulls the Purelytics useraction data from Elasticsearch, parses it into a tree structure, and outputs JSON. The JSON is then rendered as a Sankey diagram, similar to Google Analytics' user flow report.
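For readers who want a feel for the tree-building step without opening the zip, here is a minimal sketch in plain Java. It is not the attached servlet's code: the class name `VisitFlowSketch`, the `Node` fields, and the JSON layout (`name`, `count`, `children`) are assumptions made for illustration, and it takes each visit as an already ordered list of user action names rather than querying Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the attached servlet: merges visits into a prefix tree.
class VisitFlowSketch {

    /** One node per user action at a given step; count = visits that passed through it. */
    static final class Node {
        final String name;
        int count;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Merges each visit's ordered user actions into a shared prefix tree. */
    static Node buildTree(List<List<String>> visits) {
        Node root = new Node("(start)");
        for (List<String> visit : visits) {
            root.count++;
            Node current = root;
            for (String action : visit) {
                // Reuse the child for this action if it exists, otherwise create it.
                current = current.children.computeIfAbsent(action, Node::new);
                current.count++;
            }
        }
        return root;
    }

    /** Serializes the tree as nested JSON (assumed layout: name, count, children). */
    static String toJson(Node node) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"name\":\"").append(escape(node.name))
          .append("\",\"count\":").append(node.count)
          .append(",\"children\":[");
        boolean first = true;
        for (Node child : node.children.values()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(toJson(child));
            first = false;
        }
        return sb.append("]}").toString();
    }

    /** Escapes backslashes and double quotes so action names stay valid JSON strings. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Node root = buildTree(List.of(
                List.of("Home", "Search"),
                List.of("Home", "Cart")));
        System.out.println(toJson(root));
    }
}
```

Each node's count is the number of visits that reached that action along that path, which is the kind of value a Sankey link width would be drawn from.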

The servlet is attached if you want to try this for yourself.

visitflow.zip

To run this, unpack the zip file into the webapps directory of a web container such as Tomcat or Jetty. You'll also need to copy the .jar files from the elasticsearch/lib directory into visitflow/WEB-INF/lib.

If your Elasticsearch node is on another host, edit visitflow/WEB-INF/web.xml and set the elasticserver parameter to point to your Elasticsearch node.

Edit: Added the ability to focus on a particular user action.

Edit 2: Added the ability to generate goal flows. Goal flows are accessed at goalflow.html.

5 REPLIES

nalin_agrawal
Dynatrace Organizer

this is awesome.

dominik_punz
Dynatrace Pro

Wow Kin,

That's really awesome! Is there a reason why you created a servlet that queries the data from Elasticsearch instead of querying it directly from JavaScript? Querying directly would make it easier to set up.

Best regards,

Dominik

Hi Dominik,

The servlet queries Elasticsearch for user actions and then builds a tree in memory. I wasn't sure whether the browser's JavaScript engine could handle the volume of data.

Regards,

Kin Wai

dominik_stadler
Dynatrace Pro

Quite a nice way to visualize this information!

However, the current query seems to iterate over all results one by one (in batches), so it will likely become quite slow as more data is examined, and it generates a lot of network traffic to transfer all that data. For me, it could not handle a few tens of thousands of entries in a reasonable time.

You would probably need to store the data in a slightly different way to make use of aggregations in Elasticsearch to retrieve the results for larger amounts of data. Probably storing "next"/"previous" user action information could make it work to some degree already.
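To make the "next"/"previous" idea concrete, here is a rough in-memory sketch in Java of the result such a data layout would let you ask for: a count per (action, next action) pair. It is only a stand-in under stated assumptions. The class name `TransitionSketch`, the `(exit)` marker, and the `"A -> B"` key format are made up for illustration, and the counting that happens in this loop is what an aggregation inside Elasticsearch would be expected to do instead if each user action document carried its next action.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: counts action-to-next-action transitions across visits.
class TransitionSketch {

    /** Assumed marker for "no next action", i.e. the visit ended here. */
    static final String EXIT = "(exit)";

    /**
     * Counts how often each user action is followed by each next action.
     * Keys use the assumed format "action -> next"; TreeMap keeps them sorted.
     */
    static Map<String, Integer> countTransitions(List<List<String>> visits) {
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> visit : visits) {
            for (int i = 0; i < visit.size(); i++) {
                // The last action of a visit transitions to the exit marker.
                String next = (i + 1 < visit.size()) ? visit.get(i + 1) : EXIT;
                counts.merge(visit.get(i) + " -> " + next, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countTransitions(List.of(
                List.of("Home", "Search", "Cart"),
                List.of("Home", "Cart")));
        counts.forEach((pair, n) -> System.out.println(pair + ": " + n));
    }
}
```

With pairs like these precomputed per document, the server would only need to return one bucket per distinct transition rather than every user action, which is where the saving in network traffic would come from.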

Thanks for your comment Dominik.

I totally agree. It becomes too slow when the query set becomes larger. I guess a solution to this would be to build the next/previous data set using entity-centric indexing:

https://www.elastic.co/elasticon/2015/sf/building-...

Regards,

Kin Wai