The Beauty Of Data
- Muhammed Hamzah
- Mar 28, 2022
- 4 min read
Imagine you own a start-up that sells electronic appliances and devices. You need to decide when your advertising should be displayed to maximize the chance that potential customers buy your products. Your current setup includes a PoS (Point of Sale) system that sends each order to some form of storage, such as cloud storage. This gathering of data is data collection, the first stage of what is often loosely called data mining. Now suppose that, instead of using the valuable data being gathered, you leave it on the cloud provider’s racks, where it sits, slowly losing value and becoming obsolete. Neglecting data in this way leads to poor decision making and inefficient business operations. The business loses ground to its competitors and may eventually be driven out of the market, especially in today’s world, where we are surrounded by technology.
Now, let’s look at this from a different perspective: suppose the data is handled and analyzed to answer key business questions and to put forward an effective solution to them. First, the data goes through a process known as data cleaning, in which incomplete or erroneous records are removed from the dataset to be analyzed. This can be as simple as deleting rows with empty cells or stripping certain characters from the cells in a column. It can also be as involved as removing orders for which the gap between the time the order was placed and the time its data was persisted is too long, since such a delay may indicate data corrupted by a communication failure or by malicious intent.
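The three cleaning rules described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names (`product`, `quantity`, `placed_at`, `persisted_at`), the sample rows, the stray `#` character, and the five-minute threshold are all hypothetical and would need to match your own PoS export.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
# Assumed threshold: orders persisted more than 5 minutes after placement are suspect.
MAX_PERSIST_DELAY = timedelta(minutes=5)

# Hypothetical raw PoS rows, invented for illustration.
raw_orders = [
    {"product": "Cable", "quantity": "2",
     "placed_at": "2022-03-01 12:05:00", "persisted_at": "2022-03-01 12:05:03"},
    {"product": "", "quantity": "1",  # empty cell
     "placed_at": "2022-03-01 12:10:00", "persisted_at": "2022-03-01 12:10:02"},
    {"product": "Headphones#", "quantity": "1",  # stray character
     "placed_at": "2022-03-01 19:02:00", "persisted_at": "2022-03-01 19:02:05"},
    {"product": "Laptop", "quantity": "1",  # persisted far too late
     "placed_at": "2022-03-01 19:10:00", "persisted_at": "2022-03-01 21:40:00"},
]


def clean_orders(rows, max_delay=MAX_PERSIST_DELAY):
    """Return only the rows that pass all three cleaning rules."""
    cleaned = []
    for row in rows:
        # Rule 1: drop rows that contain an empty cell.
        if any(value.strip() == "" for value in row.values()):
            continue
        placed = datetime.strptime(row["placed_at"], TIME_FORMAT)
        persisted = datetime.strptime(row["persisted_at"], TIME_FORMAT)
        # Rule 2: drop rows persisted before placement or too long after it.
        delay = persisted - placed
        if delay < timedelta(0) or delay > max_delay:
            continue
        cleaned.append({
            # Rule 3: strip unwanted characters from the product column.
            "product": row["product"].replace("#", "").strip(),
            "quantity": int(row["quantity"]),
            "placed_at": placed,
        })
    return cleaned


cleaned_orders = clean_orders(raw_orders)
```

With the sample rows above, only the cable and headphones orders survive. What counts as "too long" depends on the system, so the threshold is best treated as a tunable parameter.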
After the raw data is cleaned, it undergoes data exploration: the critical process of performing initial investigations on the data to discover patterns, spot anomalies, test hypotheses, and check assumptions with the help of summary statistics and graphical representations. A key aspect of data exploration is that it relies heavily on intuition, that is, on human curiosity and the habit of asking questions. Which year were sales the highest? Or, in this case, at what time of day are sales the highest? Because exploration ties the data so closely to the business, this is the point at which data is most valuable to a business. In simple words, data exploration involves setting out business problems and forming assumptions based on visual analysis of the data, usually done with programming languages rather than no-code or advanced data visualization tools. An example of a data visualization obtained from exploratory data analysis would be:

As shown, the exploratory analysis and basic visualization show us that sales are highest at 12:00 and 7:00. From this, we can form a hypothesis about the business problem of when advertising should be displayed: around 12 o’clock and 7 o’clock.
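For readers who want to try this kind of exploration themselves, here is a minimal sketch that counts orders per hour of the day and prints a rough text bar chart. The timestamps are invented so that the peaks fall at the hours mentioned above. They are not the data behind the original chart, and the helper name `sales_by_hour` is hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical order timestamps, invented for illustration.
order_times = [
    "2022-03-01 07:10:00", "2022-03-01 07:45:00", "2022-03-01 09:30:00",
    "2022-03-01 12:05:00", "2022-03-01 12:20:00", "2022-03-01 12:50:00",
    "2022-03-01 15:15:00", "2022-03-02 07:05:00", "2022-03-02 12:40:00",
]


def sales_by_hour(timestamps):
    """Count how many orders fall into each hour of the day (0-23)."""
    hours = (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").hour for ts in timestamps)
    return Counter(hours)


hourly_sales = sales_by_hour(order_times)

# A quick text "bar chart": one '#' per order in that hour.
for hour in sorted(hourly_sales):
    print(f"{hour:02d}:00 {'#' * hourly_sales[hour]}")

# The two busiest hours become the hypothesis for when to advertise.
peak_hours = [hour for hour, _count in hourly_sales.most_common(2)]
```

In practice a plotting library would replace the text chart, but the underlying step is the same: group by hour, count, and look for the peaks.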
From this analysis, we can then pick out certain points and features to turn those assumptions into more solid and accurate facts. This is known as feature engineering, and it is key to polishing business solutions built on data. For example, we can group all sales at the peak times and figure out which products bring in the most revenue. Or, even better, we can figure out which products are bought in the greatest quantity rather than which earn the most revenue. Such a stringent measure allows the right products to be advertised at the right time, because a high-revenue product may not sell in high quantities, while a low-revenue product may well sell in high quantities, as shown below:
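The quantity-versus-revenue comparison can also be sketched in code. The rows below are made up purely to illustrate the point. The product names echo the ones used in this post, but the prices, quantities, and the helper `totals_by_product` are assumptions.

```python
from collections import defaultdict

# Hypothetical peak-hour orders: (product, hour, quantity, unit_price).
peak_orders = [
    ("Cable", 12, 5, 4.0),
    ("Batteries", 12, 8, 2.5),
    ("Laptop", 12, 1, 900.0),
    ("Headphones", 7, 4, 20.0),
    ("Cable", 7, 6, 4.0),
    ("Laptop", 7, 1, 900.0),
]


def totals_by_product(orders):
    """Sum units sold and revenue earned for each product."""
    quantity = defaultdict(int)
    revenue = defaultdict(float)
    for product, _hour, qty, unit_price in orders:
        quantity[product] += qty
        revenue[product] += qty * unit_price
    return dict(quantity), dict(revenue)


quantity, revenue = totals_by_product(peak_orders)

# Rank the same products two ways: by units sold and by money earned.
by_quantity = sorted(quantity, key=quantity.get, reverse=True)
by_revenue = sorted(revenue, key=revenue.get, reverse=True)

print("Most units sold:", by_quantity)
print("Most revenue:   ", by_revenue)
```

In this toy data the laptop tops the revenue ranking but comes last by quantity, while cables and batteries lead by quantity. That gap is exactly why the choice of measure matters.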

The data solution created so far is sufficient: advertising can be shown at noon and at 7:00, featuring products such as cables, batteries, and headphones. However, the solution is only temporary, as the data used would most likely have already lost 50% of its value, with its accuracy decreasing significantly, meaning that the solution would quickly become inefficient and outdated. To cope with this, we can train machine learning models on the data collected and cleaned in the earlier steps, so that they make predictions based on key features and decide at what time the advertising should be displayed. The model can then be used continuously to keep the advertising solution up to date. It can be made even more accurate by accounting for sudden external factors that might otherwise make it outdated, such as the economic situation and average consumer expenditure.
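To make the idea of a solution that keeps itself up to date more concrete, here is a deliberately simple stand-in. It is an exponentially weighted average of sales per hour, not a trained machine learning model. It only illustrates how newer data can gradually outweigh older data. The class name, the `decay` value, and the daily counts are all hypothetical.

```python
class HourlySalesEstimator:
    """Keeps an exponentially weighted average of sales for each hour of the day."""

    def __init__(self, decay=0.5):
        # decay is the weight kept by the old estimate on each update (assumed value).
        self.decay = decay
        self.estimates = {}

    def update(self, daily_counts):
        """Blend one day of {hour: number_of_sales} into the running estimates."""
        for hour in set(self.estimates) | set(daily_counts):
            observed = daily_counts.get(hour, 0)
            if hour in self.estimates:
                previous = self.estimates[hour]
                self.estimates[hour] = self.decay * previous + (1 - self.decay) * observed
            else:
                # First time this hour is seen: start from the observed value.
                self.estimates[hour] = float(observed)

    def best_hours(self, n=2):
        """Return the n hours with the highest estimated sales."""
        ranked = sorted(self.estimates, key=self.estimates.get, reverse=True)
        return ranked[:n]


estimator = HourlySalesEstimator(decay=0.5)

# Two days with made-up counts that peak at 12:00 and 7:00.
estimator.update({7: 30, 12: 40, 18: 10})
estimator.update({7: 28, 12: 42, 18: 12})
early_choice = estimator.best_hours()

# Buying habits shift: sales at 7:00 fade and an 18:00 peak appears.
for _ in range(3):
    estimator.update({7: 5, 12: 40, 18: 45})
later_choice = estimator.best_hours()

print("Advertise at (early):", early_choice)
print("Advertise at (later):", later_choice)
```

A real model would also take in other features, such as the external factors mentioned above, but the principle of refreshing the decision as new data arrives stays the same.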
The last step in making a data-driven business decision is to inform the stakeholders about the findings and implement the solution. The findings can be communicated in various ways, such as simply displaying the raw results or the low-level visualizations produced during data exploration. However, to really drive the point home for the decision-makers of the business, data visualization tools such as Power BI or Tableau can be used. Such tools not only display a large amount of data in an attractive and interactive way, but are also more appealing to people with little knowledge of what goes on behind the scenes to produce these findings, many of whom are likely to be visual learners.
And there you have it: a data-driven business decision that optimizes your business operations, increases their efficiency, and helps you keep up with the competition, or even get a step ahead of other businesses!