Discover how applied AI can automate tasks and enhance productivity in data analysis. Explore a real life example of Applied AI in report generation.
Based on our current experience and assessment of applied AI capabilities, it is unlikely that analysts will be replaced completely. On the other hand, we see that LLMs can multiply analyst productivity and release significant time for higher-value activities.
Consider the following example.
We recently did a proof of concept for one of our client, and the results are promising. Our client needed to produce a report that tracks the market share of mobile apps in a specific category across the globe. There are several apps competing in each country and more than 30 countries are analyzed. The analysis involves two market share metrics: usage penetration and download share.
Every quarter, analysts refresh the market share data in a spreadsheet and then convert the tables to a slide deck. Each slide focuses on one market and displays the market share trend for several apps.
Once the presentation is created, analysts interpret the trends and write a few bullet points on each slide with a summary if their analysis.
Given a set of competitor mobile apps for each of the 30 geographical markets, the following tasks need to be performed:
Given the number of markets, different competitor sets, and multiple metrics, these tasks take a lot of effort and are error-prone.
The good news is that all of the above tasks can be automated to some degree with a data transformation involving an applied AI step, resulting in a significant increase in efficiency.
Below are the analysis steps after the report generation workflow is set up in an environment utilizing Modern Data Stack and Applied AI:
The markets and competitors of interest change over time. As a part of the automated report generation process, this is the step where the analyst “configures” the markets and competitors that will be included in the report. This information can be stored in a Google Sheet that is automatically synced to the Snowflake data warehouse. This way, a non-technical analyst can pick the markets and competitors of interest.
One of our analytics engineer coded this data transformation step, which runs as a series of SQL statements in the Snowflake data warehouse. Once set up, the data transformation pipeline runs automatically and produces up-to-date data without human intervention (The transformation is set up as a dbt project.) This pipeline produces market share metrics for each market for the period of interest.
We used a template provided by the client as a starting point and automated the creation of each slide using a Python script.
Although this step is normally done manually, we use Applied AI to “interpret” the numbers in each market and populate each slide with the findings.
In the steps laid out above, all are standard in the modern data stack approach, while the last step, the data interpretation, has become possible recently with AI.
At the start of step 4 above, we have the overall structure of the slide deck with charts showing KPI trends over quarters. Now, it is time to derive insights from the charts. This is a time-consuming process since 30 countries need to be reviewed. This work can be automated using OpenAI’s GPT-4 API.
Let’s take an example of a chart with hypothetical values for Monthly Active Users for every quarter between 2019-Q4 to 2023-Q2 for five competitors in one market. That visual is built from tabular data. Thus, we can pass this tabular information to a Large Language Model (LLM) and ask it to generate some key findings.
Take, as an example, a few lines of the underlying data used for building the chart above:
We need to interact with LLMs using texts. Therefore, passing that tabular data as a text to the LLM is necessary to allow it to reason about it, produce insight, and respond us. That is what we will do when we interact with GPT-4 from OpenAI.
It is recommended to provide some context to the model and to be very clear about the task you want to achieve, as in the prompt below:
You will act as a mobile app strategic analyst and provide me with insightful takeaways based on the information I give you.
Consider the following table in markdown format with data about the monthly active users (MAU) market share for different apps in the United States. These apps are competitors in the social media market. Give me the main findings and takeaways based on the data in this table:
[The table data is shared in this part of the prompt, in the text form]
Here is the output provided by ChatGPT:
Based on the provided data, we can discern several key findings and insights about the monthly active users (MAU) market share for different apps in the US social media market: Here's the analysis:
The prompt above has not been optimized. Additional instructions can be provided to focus the analysis on a specific aspect such as recent changes, to be less verbose, or to provide the analysis results in a structured format that is easy to parse by our automated script.
Despite this not being an optimal prompt, it’s clear that adding the OpenAI API to our technology stack can greatly improve the efficiency of analysts.
LLMs are impressive; however, sometimes they hallucinate. They can deviate from the task or even makeup information that was not there to start with.
Therefore, having a human in the loop for such tasks is still crucial. In our example ,the current plans are to have an expert analyst receive the draft crafted slide deck with automated charts and AI-generated key takeaways. Our expert analyst will refine and finalize the content. As a result, instead of constructing everything from scratch, our analyst will be able to concentrate on enhancing the commentary and ensuring its alignment with his analysis, thus making the process more efficient and targeted.
We have experimented with Chat GPT 3.5 vs. GPT 4.0 in this project. We observed significant improvement in performance when the task was completed by GPT 4.0. In our tests, ChatGPT 3.5 produced 121 statements with an accuracy of 88%, while GPT 4.0 achieved 99% accuracy across 140 sentences. The 3.5 model constantly got confused with trends of increase and decrease over time. You can see some examples below.
“X” also experienced growth, with market share increasing from 36% in Q4 2022 to 30% in Q1 2023
“X” was the market leader in Q4 2022 with a market share of 35.85%. However, its market share decreased to 30.36% in Q1 2023.
“Y” market share in Italy remained relatively stable, with a slight increase from 3.89% in Q4 2022 to 2.61% in Q1 2023.
“Y” market share decreased slightly from 3.89% in Q4 2022 to 2.61% in Q1 2023.
“W” remained relatively stable, with a market share of 23% in Q4 2022 and 16% in Q1 2023.
“W” experienced a significant decrease in its market share, dropping from 23.45% in Q4 2022 to 15.65% in Q1 2023.
The above is an example of an everyday use case for using applied AI for analysis purposes. Applied AI can provide significant efficiency and allow more time for high-value activities.
Harnessing the power of AI presents an important opportunity for our clients’ data solutions and products.
At 205 Data Lab, we build data solutions and products, enhancing our clients’ custom workflows. We lay out the data infrastructure, and make quality, timely data available, and integrate AI for our client’s specific use cases.
Stay in the loop with everything you need to know.