Categories

Exploring trends in the daily deals market.

How can trends and patterns be found in tons of raw data? This page proposes two main tools for that purpose. The circular heat chart allows you to find patterns in the temporal dimension of the data. Colors represent the intensity of the selected measure. For instance, we can observe that the average price of the deals increases over time for almost all categories. Does the industry evolve? Is the business model changing? When the 'Number of deals' measure is selected, you can observe when the different categories emerged.

Select measure: Average price Number of deals

What are the deals about? Which items or services appear most frequently? Tag clouds are a popular way of visualizing textual data. Here, they give you insight into what each category actually represents, as they were created from the textual descriptions of the deals. You can observe that some categories, such as Art and Entertainment, are quite heterogeneous, while others, e.g. Active Life, are dominated by just a few services.

Merchants

Discovering the structure of the market.

Have you heard about the Pareto rule? Also known as the 80/20 rule, it says that often 20% of the causes are responsible for 80% of the effects. It is amazing how well this rule holds for the daily deals market: 20.47% of merchants generate 80% of the total revenue, 19.71% are behind 80% of the total number of sold deals, and just 9.60% are responsible for offering 80% of all the deals. The above chart, also known as a Pareto chart, depicts this relation.

Revenue Number of deals Number of sold deals

How does the revenue change over time? Find out by following the width of the streamgraph as time progresses from top to bottom.

How much revenue is generated by new merchants and how much by returning merchants? Find the answer by observing the proportion of the colors on the chart. At the beginning, the revenue was generated only by new merchants (which is not surprising for a new market). Later, more than half of the total revenue is generated by returning merchants, i.e. ones that offer their deals for the 4th time or more. Should more effort be made to turn new merchants into returning ones?
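The split between new and returning merchants can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `revenue_by_merchant_status`, the `(date, merchant_id, revenue)` tuple layout and the two-bucket split are hypothetical, and the chart itself may use more color bands than the two shown here. The threshold of 4 follows the "4th time or more" definition above.

```python
from collections import defaultdict


def revenue_by_merchant_status(deals, returning_from=4):
    """Sum revenue separately for 'new' and 'returning' merchants.

    deals: list of (date, merchant_id, revenue) tuples (hypothetical layout);
           dates are assumed to sort chronologically as strings (ISO style).
    returning_from: a merchant's deal counts as 'returning' from this
                    ordinal onwards (4 = the 4th deal or later).
    """
    deals_seen = defaultdict(int)  # merchant_id -> number of deals so far
    totals = {"new": 0.0, "returning": 0.0}
    for _date, merchant_id, revenue in sorted(deals):
        deals_seen[merchant_id] += 1
        is_returning = deals_seen[merchant_id] >= returning_from
        totals["returning" if is_returning else "new"] += revenue
    return totals


# Made-up sample data, not taken from the contest dataset.
sample_deals = [
    ("2011-01", "m1", 100),
    ("2011-02", "m1", 100),
    ("2011-03", "m1", 100),
    ("2011-04", "m1", 100),  # 4th deal of m1 -> counted as returning
    ("2011-04", "m2", 50),
]
status_totals = revenue_by_merchant_status(sample_deals)
```

Running the same function once per month would give the proportions that the colors on the chart encode.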

Description

This visualization was originally prepared for the Visualizing Daily Deals contest hosted at Crowdanalytix.com

Data treatment

For the sake of the integrity of the analysis, the following rows have been removed from the dataset:

The cleaned dataset comprises 250644/252528 = 99.25% of the original dataset and was used to conduct the analysis and prepare the presented visualizations.

Circular heat map

The D3 library and Circular heat chart were used to prepare the circular graph. The data behind that display was aggregated and then normalized for each category so that the maximum value in the category is 1. Thus, absolute values (depicted as color intensities) should not be compared between categories, but they can be used to find general trends within each category and to compare those trends between categories. For instance, when the "Number of deals" measure is selected, one can observe that the highest number of deals in each category was usually offered at the end of 2011 and at the beginning of 2012. Nightlife deals started to be offered really late compared with the other categories, quickly reached the peak of their popularity in Q4 2011, and have been steadily decreasing since. What happened at the end of 2012 that made the number of deals decrease in every category?
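The per-category normalization described above can be illustrated with a short Python sketch. It is not the code used for the page; the function name `normalize_by_category`, the `(category, period, value)` layout and the sample numbers are assumptions made for the example.

```python
from collections import defaultdict


def normalize_by_category(rows):
    """Scale values so that the maximum within each category equals 1.

    rows: list of (category, period, value) tuples (hypothetical layout).
    Returns a dict mapping (category, period) to the normalized value.
    """
    # First pass: find the peak value of every category.
    peak = defaultdict(float)
    for category, _period, value in rows:
        peak[category] = max(peak[category], value)

    # Second pass: divide each value by the peak of its own category.
    normalized = {}
    for category, period, value in rows:
        top = peak[category]
        normalized[(category, period)] = value / top if top else 0.0
    return normalized


# Made-up sample data, not taken from the contest dataset.
sample_rows = [
    ("Nightlife", "2011-Q3", 10),
    ("Nightlife", "2011-Q4", 40),
    ("Active Life", "2011-Q3", 200),
    ("Active Life", "2011-Q4", 100),
]
normalized = normalize_by_category(sample_rows)
```

In this sample both categories peak at 1 even though Active Life has far more deals in absolute terms, which is why color intensities should not be compared between categories.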

Tag clouds

Wordle was used to prepare the tag clouds. At most 50 words from the textual descriptions of the deals are displayed for each category. English stopwords were removed. The underlying rule of the visualization is that the more frequent a word is, the bigger it appears on the screen. A downside of using Wordle is that color has no actual meaning, so it should not be interpreted. The tag cloud should change when you roll over a category on the circular chart. If you do not see the change immediately, please wait a moment, as the image needs to be downloaded by your browser.
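The word counts that drive such a tag cloud can be approximated with a small Python sketch. Wordle does its own counting and layout, so this is only an illustration: the function name `top_words`, the tokenization rule and the tiny `STOPWORDS` set are assumptions, and a real run would use a full English stopword list.

```python
import re
from collections import Counter

# Deliberately tiny stopword set for illustration only.
STOPWORDS = {"the", "a", "an", "and", "or", "for", "of", "to", "in", "at", "on", "with"}


def top_words(descriptions, limit=50):
    """Return up to `limit` (word, count) pairs, most frequent first."""
    counts = Counter()
    for text in descriptions:
        # Lowercase and keep runs of letters/apostrophes as words.
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(limit)


# Made-up deal descriptions, not taken from the contest dataset.
sample_descriptions = [
    "Yoga classes for two",
    "Ten yoga classes and a massage",
    "Massage at the spa",
]
top = top_words(sample_descriptions, limit=3)
```

The counts returned here would then determine the font size of each word in the cloud.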

Pareto chart

The navy blue area represents merchants sorted by the selected measure (e.g. revenue) in descending order. Their cumulative number can be read on the horizontal axis. The line, on the other hand, corresponds to the cumulative value of the selected measure (e.g. cumulative revenue), which can be read on the vertical axis. One can examine the proportion between the cumulative measure (revenue, number of deals or number of sold deals) and the percentage of merchants that generated it. The selected point on the line corresponds to 80% of the selected measure. The revealed structure has tremendous consequences: depending on the business strategy, should more attention be paid to the few important merchants, or should more emphasis be put on the not-so-significant majority?
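The 80% point on the Pareto line can be computed with a sketch like the following. The function name `merchant_share_for_target` and the sample revenues are hypothetical; the logic simply sorts merchants in descending order and accumulates the measure until the target share is reached.

```python
def merchant_share_for_target(values, target=0.80):
    """Fraction of merchants (largest first) needed to reach `target` of the total.

    values: one number per merchant, e.g. its total revenue.
    """
    ordered = sorted(values, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("the measure must have a positive total")

    cumulative = 0.0
    for merchants_so_far, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative >= target * total:
            return merchants_so_far / len(ordered)
    return 1.0


# Made-up revenues for ten merchants, not taken from the contest dataset.
sample_revenue = [500, 350, 50, 40, 20, 15, 10, 5, 5, 5]
share = merchant_share_for_target(sample_revenue)
```

For this sample, two merchants out of ten already account for more than 80% of the revenue, so `share` is 0.2.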

Revenue streamgraph

This visualization was inspired by the Canadian Deals Association analysis. Instead of a stacked bar chart, however, a streamgraph is used. A streamgraph is basically the same, but the bars are not aligned to a bottom baseline; it is therefore easier to follow changes in the structure while preserving one additional dimension, the width, which in this case represents the total revenue for a given month. The code was adapted from the D3 streamgraph example.
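The difference from a stacked bar chart comes down to where each stack starts. The Python sketch below assumes the simplest variant, a baseline centered around zero, so that the width of the stack equals the monthly total; the D3 example may use a different baseline rule. The function name `centered_stack` and the sample values are hypothetical.

```python
def centered_stack(layers):
    """Stack layers around a centered baseline instead of a bottom one.

    layers: list of equal-length lists, one list per series and one value
            per month (hypothetical layout).
    Returns, for each layer, a list of (lower, upper) bounds per month.
    """
    months = len(layers[0])
    bands = [[] for _ in layers]
    for month in range(months):
        total = sum(layer[month] for layer in layers)
        lower = -total / 2  # start half the total below the center line
        for index, layer in enumerate(layers):
            upper = lower + layer[month]
            bands[index].append((lower, upper))
            lower = upper  # the next layer sits on top of this one
    return bands


# Made-up monthly revenue for two series, not taken from the contest dataset.
new_merchants = [10, 6]
returning_merchants = [0, 6]
bands = centered_stack([new_merchants, returning_merchants])
```

In each month the distance between the lowest and the highest bound equals the total revenue, which is the "width" dimension mentioned above.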

Author: Dominik Cygalski