A peek into the analytics engine running under the hood, and moving to v2
Product article, 08 Sep, 2022


For network operators, a comprehensive set of analytics is available: cumulative analytics and historical (per-week) analytics for the network and for each channel.

The MainCross #analytics engine runs in parallel behind the scenes and collects usage stats for the various actions performed across all the networks. This includes member analytics, channel analytics and network analytics.

  1. Member analytics - Popularity, Engagement & Activity
  2. Channel analytics - ChannelPulse, subscribers, views, posts, etc
  3. Network analytics - NetworkPulse, members, visitors, posts, pages, etc

The v1 analytics engine - cumulative stats

The v1 analytics engine displays snapshot analytics (called Vital Stats) for each member and channel and for the network as a whole.

As the name suggests, these are cumulative snapshots in time - they are calculated periodically and are a great way to see where the member, channel or network currently stands.

Network vital stats - live example from one of the Network Sites
Member vital stats - live example for a MainCross member

But I need more!

But what if the network operator wants to know the history of these parameters? Looking at a graph or a timeline of any statistic is a great way to get a bird's eye view of what's been happening with the channel or the network. One can quickly sense whether the stats are growing (yay!) or whether they've flattened or even started dipping - in which case it's time for a course correction!

So with that in mind, we set out to build v2 of the analytics engine - the historical analytics engine for the channel and network.

Note that historical channel and network stats are available on paid plans only.

The v2 analytics engine - historical stats

Building a history engine is no mean task. There are broadly 3 parts to such a product:

  1. Collecting raw statistics
  2. Whipping them into shape
  3. Displaying them

Collecting stats

We've been collecting raw stats from the start, so every network has detailed stats available from the day it was launched. The same is true for channels, and for members since signup.

Massaging the data

This is the complex part - the collected data is of no use until it is massaged into usable info over time. One of the primary decisions is how granular the time step should be, ie should stats be displayable at an hourly interval, or daily, or monthly? In the end, we chose to go with weekly, ie channel and network stats are massaged internally and snapshots are created per week from the raw data.

A weekly interval serves well for the parameters we display, and maintains the balance between sufficiently detailed views vs the amount of data we need to store.
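To make the weekly step concrete, here is a minimal sketch of bucketing raw events into per-week snapshots. It is an illustration only and not the MainCross implementation: the event shape `(timestamp, metric)`, the metric names and the choice of Monday as the start of the week are all assumptions.

```python
from collections import Counter, defaultdict
from datetime import date, datetime, timedelta


def week_start(ts: datetime) -> date:
    """Return the Monday of the week containing ts (assumed week boundary)."""
    d = ts.date()
    return d - timedelta(days=d.weekday())


def weekly_snapshots(events):
    """Group raw (timestamp, metric) events into per-week counts."""
    buckets = defaultdict(Counter)
    for ts, metric in events:
        buckets[week_start(ts)][metric] += 1
    # Sort by week so the result reads as a time series.
    return {week: dict(counts) for week, counts in sorted(buckets.items())}


# Hypothetical raw events for one channel.
raw_events = [
    (datetime(2022, 9, 5, 9, 30), "views"),
    (datetime(2022, 9, 7, 14, 0), "views"),
    (datetime(2022, 9, 8, 8, 15), "posts"),
    (datetime(2022, 9, 12, 11, 45), "views"),
]
snapshots = weekly_snapshots(raw_events)
```

Each week collapses to one small record per metric regardless of how many raw events it contained, which is the storage trade-off described above: an hourly step would keep up to 168 times as many points per week.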

Massaging the data and creating historical snapshots is a compute-intensive task and cannot be performed on demand (ie at the moment the network operator wants to view the charts). Hence it has to run continuously in the background, so that the historical charts are instantly available whenever the network operator opens them.
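One common way to get this split between background work and instant reads is to precompute incrementally: a background job folds in only the events it has not seen yet, and the chart simply reads the stored snapshots. The sketch below illustrates that idea under assumptions; the class name `WeeklySnapshotStore`, the in-memory storage and the timestamp watermark are hypothetical and not taken from the MainCross codebase.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class WeeklySnapshotStore:
    """Keeps precomputed weekly counts so chart reads stay cheap."""

    def __init__(self):
        self.snapshots = defaultdict(lambda: defaultdict(int))
        self.watermark = datetime.min  # newest event timestamp processed so far

    def refresh(self, raw_events):
        """Background step: fold in only events newer than the watermark.

        Simplification: a late event sharing the watermark's exact
        timestamp would be skipped.
        """
        new_events = [e for e in raw_events if e[0] > self.watermark]
        for ts, metric in new_events:
            monday = ts.date() - timedelta(days=ts.weekday())
            self.snapshots[monday][metric] += 1
        if new_events:
            self.watermark = max(ts for ts, _ in new_events)
        return len(new_events)

    def series(self, metric):
        """Read path: a lookup over the stored snapshots, no raw data scanned."""
        return [
            (week, counts.get(metric, 0))
            for week, counts in sorted(self.snapshots.items())
        ]


store = WeeklySnapshotStore()
events = [
    (datetime(2022, 9, 5, 9, 30), "views"),
    (datetime(2022, 9, 7, 14, 0), "views"),
    (datetime(2022, 9, 8, 8, 15), "posts"),
]
first_run = store.refresh(events)   # processes all three events
events.append((datetime(2022, 9, 12, 11, 45), "views"))
second_run = store.refresh(events)  # processes only the new event
views_series = store.series("views")
```

In a real system the `refresh` call would be triggered by a scheduler or queue worker and the snapshots would live in a database, but the shape is the same: the expensive work happens ahead of time and the read is a lookup.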

Displaying time line charts

The final step is to display all the parameters on interactive charts. Each chart is a time series which can be zoomed in and examined. The data can be exported as CSV, and the chart graphic can be exported as an image.
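For a sense of what a CSV export of one chart might contain, here is a small sketch using Python's standard `csv` module. The column names and the `(week_start, value)` series shape are assumptions for illustration; the actual export format of the dashboard may differ.

```python
import csv
import io
from datetime import date


def series_to_csv(series, metric):
    """Serialise a weekly time series as CSV text, one row per week."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["week_start", metric])  # header row
    for week, value in series:
        writer.writerow([week.isoformat(), value])
    return buf.getvalue()


# Hypothetical weekly view counts for a channel.
csv_text = series_to_csv(
    [(date(2022, 9, 5), 2), (date(2022, 9, 12), 1)],
    "views",
)
```

Because each snapshot is already one row per week, the export is a straight serialisation of what the chart displays.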

Channel historical analytics - live example from one of the network sites
Network historical analytics


Why did we build our own history machine?

There are a lot of analytics tools out there, the best known of course being Google Analytics (GA). That raises the question of why we would build our own analytics engine when there are so many available. Here's why:

  1. Only we know which statistics are relevant to display, and how to differentiate between the raw stats at a granular level.
  2. As a simple example - view counts for pages vs posts vs user profiles vs channels and so on. A third-party analytics tool just clubs these together and displays consolidated stats.
  3. Similarly, showing the difference between members and visitors.
  4. Using an external analytics tool is not simple for our customers - there is a learning curve from setup to usage, and getting value out of all the data is sometimes a challenge. We wanted to distill the most relevant parameters and showcase only those, within the dashboard. This removes the friction of setting up a third-party site, opening it and mucking around in it, when all one wants to see is the most important stats at a glance.
  5. The vast majority of our network operators certainly don't care about anything beyond GA, and as it turns out most don't care even about GA*. Having an inbuilt tool which displays relevant stats is Good Enough™ 😉
  6. Integrating with one of these tools is non-trivial. Out of the box, the MC system supports integration with GA. If a network operator requires an integration with some other preferred analytics tool, that's going to be on a chargeable basis. Further, other than GA, most other tools are paid, so in addition to the integration cost there is a running cost (monthly or annually).

* Data based on how many networks have actually added a GA tracking ID.
