Few samples of interesting data manipulation and forecasting methodologies for effective data:
- Forecasting >Trend analysis and Gut Method
- Benchmarking > client vs. competitors activity and > analytics vs. ad impressions
Timely identification of newly emerging trends is very important to businesses. Predictive modelling is usually done automatically by computer software that learns patterns from data. Here, we’ll focus our attention on 2 methodologies that can be used by any analysts: regression modeling and decision tree. The former is extremely useful in time series analysis.
Had been asked to predict how many visits our sites was expected to counted at the end of Q2 using 3 years site behavioral data. I used a trend model adjusted by seasonality. It went pretty well and, eliminating the noises, we reached our end-of-period target.
Here how to build this model using Excel (please, refer to the below table – indeed all data are unrelated to the real case).
1) Organise your data considering the correct time frame: week, year, and time period). If you start your analysis on the week beginning with July the 8th, this represents Time Period 1.
2) Quadratic time period is calculated using the following formula: (Time Period 1)^2, (Time Period 2)^2, ….
3) The Quadratic Trend formula is: TREND($ActualVisit$11:$ActualVisits$166; $TimePeriod$11:$TimePeriod$166, TimePeriod11:Time^2 11)
4) Actual as % of the Trend is simply the ratio Actual Visits on Quadratic trend
5) The Seasonal Index formula is: sum.if ($Quarter$11:$Quarter$166; Seasonal Quarter 3, $Actual as % of Trend$11: $Actual as % of Trend$166)/count.if (($Quarter$11:$Quarter$166; Seasonal Quarter 3)
6) The Seasonal Forecast formula is: Quadratic Trend 11*Lookup(Quarter 11, $Seasonal Quarter$3: $Seasonal Index$ 6, 2).
Please note I assumed the Time Series data spam from cell 11 to cell 166, whilst seasonal quarter/index inputs are on cell 3 to 6.
The Gut Modelling
The Gut method? I discovered that this is the preferred predictive methods used by many agencies: start forecasting the traffic from marketing sources using as predictor solely your expected ad spends for the period and use this result for guessing how many sales you will get from your mkt effort.
Now look at your overall historical traffic and derive your organic traffic and sales outcome from the marketing traffic. You can play with the stats using “adjustable weekly seasonal indexes”, but be aware that you’ll need to run this model almost on a weekly basis! Why? Because the assumptions are statistically pretty weak, the distribution is likely to be not normal; organic sales are brand related; …
Benchmarking: client vs. competitors activity
How am I doing against my competitors? Is my marketing effort efficient? Are getting the most of my online presence?
The following is a typical example of competitors benchmarking in an EU digital landscape. In the top left area in the graph I synthetized monthly ad spend and ad impressions by competitors and countries. On the bottom left, all marketing mix monthly variations were collected by market including factors as technological platforms, email, newsletters, sponsorships and affiliates programmes, presence in social networking sites, brand reputation indexes, as well as both country own characteristics (e.g. broadband penetration, eCommerce propensity index, …) and internal market limitations (e.g. supply chains constraints).
An excel file with all data, coupled with market penetration representation and a descriptive word file.
Benchmarking: analytics vs. ad impressions – the Decision Tree Modeling
How many visitors are currently exposed to an ad? Do they have any preferences? In messaging? format? location?
The Decision Tree Model is graphically the most intuitive attribution model. Based on historical behavioural data, such as how many visitors were exposed to a specific advert in a well specific location, this model help you to calculate the probability a visitor will click on a advert according to selective attributes (e.g. format, placement, messaging). In other words, it divides your population into smaller segments repeatedly and each node will include information regarding probability of the event (CRT) to occur, averages and total visitors at each stage, most frequent attributes …