Exploring trends in the daily deals market.
How does revenue change over time? Find out by following the width of the streamgraph as time runs from top to bottom.
How much revenue is generated by new merchants and how much by returning merchants? The proportion of the colors on the chart answers that question. At the beginning, revenue was generated only by new merchants (which is not surprising for a new market). Later, more than half of the total revenue is generated by returning merchants, those offering a deal for the fourth time or more (the 4th+ group). Should more effort be made to turn new merchants into returning ones?
This visualization was originally prepared for the Visualizing Daily Deals contest hosted at Crowdanalytix.com
To preserve the integrity of the analysis, the dataset was cleaned as follows:
- Canadian deals were removed: they account for only 0.48% of the data, while the remaining 99.52% come from the US.
- Only deals with start dates in 2009-2012 were kept, as they account for 99.82% of the data. End dates range from 2009 to 2013.
- A derived column was added: the duration of the deal (end date minus start date). Rows with a negative duration were removed.
- A derived column was added indicating whether each deal was the 1st, 2nd, 3rd or 4th+ deal offered by its merchant. To compute it, deals were ordered by start date, and for each deal the number of deals offered by that merchant up to that time was counted.
- Rows with missing values in the price and value columns account for only 0.0128% of the remaining dataset, so they were also removed.
Finally, the cleaned dataset comprises 250644/252528 = 99.25% of the original dataset. It was used to conduct the analysis and prepare the visualizations presented here.
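The cleaning steps above can be sketched in a few lines of Python. This is only an illustration, not the code used for the contest entry: the field names (country, merchant, start, end, price, value), the "US" country code and the order in which the filters are applied are assumptions.

```python
from collections import defaultdict
from datetime import date


def clean_deals(deals):
    """Apply the filtering steps described above to a list of deal dicts.

    Field names are hypothetical; adjust them to the real dataset.
    """
    kept = []
    for d in deals:
        if d["country"] != "US":                      # drop Canadian deals
            continue
        if not 2009 <= d["start"].year <= 2012:       # keep start dates in 2009-2012
            continue
        if d["price"] is None or d["value"] is None:  # drop missing price/value
            continue
        duration = (d["end"] - d["start"]).days       # derived column: duration
        if duration < 0:                              # drop negative durations
            continue
        kept.append({**d, "duration": duration})

    # Derived column: is this the merchant's 1st, 2nd, 3rd or 4th+ deal?
    kept.sort(key=lambda d: d["start"])
    seen = defaultdict(int)
    for d in kept:
        seen[d["merchant"]] += 1
        n = seen[d["merchant"]]
        d["deal_rank"] = "4th+" if n >= 4 else ("1st", "2nd", "3rd")[n - 1]
    return kept
```

Deals that share a start date keep their input order here, because Python's sort is stable; the original analysis may have broken such ties differently.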
Circular heat map
A library and Circular heat chart were used to prepare the circular graph. The data behind that display was aggregated and then normalized within each category so that the maximum value in the category is 1. Thus, absolute values (depicted as color intensities) should not be compared between categories, but they can be used to find general trends within each category and across categories. For instance, when the "Number of deals" measure is selected, one can observe that the highest number of deals in each category was usually offered at the end of 2011 and the beginning of 2012. Nightlife deals appeared late compared with the other categories, quickly reached their peak in Q4 2011, and have been steadily decreasing since. What happened at the end of 2012 that made the number of deals decrease in every category?
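The per-category normalization can be illustrated with a small Python sketch. The input layout (a dict mapping category to a dict of period to value) and the function name are assumptions made for the example.

```python
def normalize_by_category(counts):
    """Scale each category's series so that its maximum value becomes 1."""
    normalized = {}
    for category, series in counts.items():
        peak = max(series.values())
        normalized[category] = {
            # Guard against an all-zero category to avoid dividing by zero.
            period: (value / peak if peak else 0.0)
            for period, value in series.items()
        }
    return normalized
```

After this step a small category and a large one both peak at 1, which is why color intensities show the shape of a trend but not its absolute size.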
Wordle was used to prepare the tag clouds. At most 50 words from the textual descriptions of the deals are displayed for each category. English stopwords were removed. The underlying rule of this visualization is that the more frequent a word is, the bigger it appears on the screen. The downside of using Wordle is that color has no actual meaning, so it should not be interpreted. The tag cloud should change when you roll over each category on the circular chart. If you do not see the change immediately, please wait a moment while your browser downloads the image.
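The word selection behind a tag cloud like this can be approximated as follows. This is a sketch, not what Wordle does internally: the tiny stopword set and the tokenization rule are placeholders chosen for the example.

```python
import re
from collections import Counter

# Illustrative subset only; a real analysis would use a full English stopword list.
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "in", "on", "at", "or", "with"}


def top_words(descriptions, limit=50):
    """Return up to `limit` (word, count) pairs, most frequent first."""
    words = []
    for text in descriptions:
        tokens = re.findall(r"[a-z']+", text.lower())
        words += [w for w in tokens if w not in STOPWORDS]
    return Counter(words).most_common(limit)
```

The counts returned here would then drive the font size of each word.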
The navy blue area represents merchants sorted by the selected measure (e.g. revenue) in descending order. Their cumulative number can be read on the horizontal axis. The line, on the other hand, corresponds to the cumulative value of the measure, which can be read on the vertical axis. One can examine the proportion between the cumulative measure (revenue, number of deals or number of sold deals) and the percentage of merchants that generated it. The selected point on the line corresponds to 80% of the selected measure. The revealed structure has significant consequences: depending on the business strategy, should more attention be paid to the few important merchants, or should more emphasis be put on the not-so-significant majority?
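The 80% point on the line can be computed as in the Python sketch below, which assumes the input is simply a list with one value of the selected measure per merchant. The function name is hypothetical.

```python
def merchants_share_for(values, target=0.8):
    """Return the fraction of merchants needed to reach `target` of the total.

    Merchants are taken in descending order of the measure, as on the chart.
    """
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running >= target * total:
            return count / len(ordered)
    return 1.0
```

For example, with revenues of 60, 25, 10 and 5, the two largest merchants (half of all merchants) already account for more than 80% of the total.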
This visualization was inspired by the Canadian Deals Association analysis.
But instead of a stacked bar chart, a streamgraph is used. A streamgraph is basically the same, but the bars are not aligned to the bottom baseline, so it is easier to follow changes in the structure while preserving one additional dimension: the width, which in this case represents the total revenue for a given month. Code adapted from the D3 streamgraph example.
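The difference from a stacked bar chart comes down to where each stack starts. The Python sketch below uses a baseline centered on zero, which is one common choice and an assumption here; the D3 example may use a different offset. The group names and function name are hypothetical.

```python
def centered_layers(series_by_group):
    """Return (start, end) offsets per group and time step, centered on zero.

    `series_by_group` maps a group name to a list of values, one per time step.
    """
    groups = list(series_by_group)
    steps = len(next(iter(series_by_group.values())))
    layout = {g: [] for g in groups}
    for t in range(steps):
        total = sum(series_by_group[g][t] for g in groups)
        offset = -total / 2  # a stacked bar chart would start at 0 instead
        for g in groups:
            value = series_by_group[g][t]
            layout[g].append((offset, offset + value))
            offset += value
    return layout
```

At every time step the distance from the first start to the last end equals the total, so the overall width of the shape still encodes total revenue.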