Duc Pham

(1) Exploring the Role of AI in Data Analysis: Why We've Stayed Silent Until Now

The world is experiencing a transformative shift with the emergence of Generative AI (Gen-AI). By harnessing machine resources, individuals and organizations can perform previously impossible tasks, including design, content writing, and coding, with respectable outcomes. With these new capabilities, people have debated since the early days of OpenAI's ChatGPT whether Gen-AI can replace humans in data analysis. Datainsider.co has been exploring this possibility for the past few years, and we want to give you some clues based on the latest updates from key AI players.

But before we dive in, let's explain the basics of Gen-AI and Data Analysis.

What is Generative AI?

Generative AI (or Gen-AI) refers to a subset of artificial intelligence that focuses on generating new content or predictions based on existing data. It encompasses a variety of models, including but not limited to:

  • Generative Adversarial Networks (GANs)

  • Variational Autoencoders (VAEs)

  • Transformer models (like GPT-4)

These models are designed to understand patterns in data and generate new, similar data, making them incredibly useful for new content creation.


Here are several (but not all) issues with using Gen-AI for Data Analytics:

  • Data Quality and Bias: AI learns patterns from the data it was trained on. However, not all data is equal.

  • Interpretability: Understanding how generative AI models arrive at their outputs can be difficult. For example: How much weight does the answer give to pre-trained data versus newly input data?

  • AI hallucination is real: When you work with business data and numbers, that is a big no-go.

  • Customizability: It can be frustrating if you simply need to switch from a bar chart to a line chart and the AI loses its context or data, requiring you to start over from scratch.

  • Cost: Implementing pre-trained models is already costly, and that is before we even get into the expense of training custom models.

  • Security: In many cases, you need to provide AI models with your business data. This is always a sensitive topic.


May 2024 - The AI Arms Race Heats Up:

OpenAI ChatGPT 4o

OpenAI kicked things off on May 13th by unveiling ChatGPT 4o in a polished, digital-only broadcast. The "o" in the name stands for "omni", reflecting that one model handles text, vision, and audio. It is also significantly more efficient, with speed and resource enhancements that enable longer, better AI responses at lower cost. The update included:

  • Text: Expect more nuanced and informative responses, along with a deeper understanding of context and intent. The context and intent of the "problem" are critically important in business intelligence. More context means it can keep the whole conversation and the input data in view.

  • Vision: ChatGPT 4o can now analyze and understand visual information, potentially opening doors to applications like image description or real-time scene comprehension.

We tested it, and it's incredible. Here are the charts ChatGPT 4o generated from a CSV file:

We asked ChatGPT 4o to draw some analyses from a CSV

In addition to reading the text and numbers accurately from the image, it can also identify the peaks and trends in the charts. I'll let you judge the outcome here:

1. Overall Growth in Offline Revenue: The trend component shows a clear upward trend in offline revenue from 2016 to 2022.
+ Peak Values:
The highest peak in the observed data is around early 2019, reaching approximately $14.56 billion.
The trend component shows a gradual increase, peaking in early 2022 at about $5.5 billion.
+ Dip in Revenue:
A noticeable dip occurs around 2020-2021, which could be due to external factors like economic downturns or market changes.

2. Seasonal Patterns: The seasonal component highlights periodic peaks and troughs within each year.
+ Recurring Peaks:
Peaks are observed annually around the same periods, suggesting predictable increases in revenue during these times.
For instance, there are significant peaks in the seasonal component every year, typically reaching around $2.5 billion.
+ Recurring Troughs:
Troughs are also consistent, indicating lower revenue periods. The seasonal lows drop to nearly zero, suggesting minimal revenue during these periods.

3. Monthly Variations: To provide detailed monthly variations, let's break down the average monthly revenue based on the seasonal component:
+ High Season Months: Peak revenue months typically occur around the end of each year (November-December) and mid-year (June-July), aligning with holiday seasons and mid-year sales.
+ Low Season Months: Lower revenue months appear consistently at the start of each year (January-February) and around mid-year (April-May).

4. Residual Analysis: The residuals show how much of the observed revenue is unexplained by the trend and seasonal components.
+ Magnitude of Residuals:
The residuals are relatively small, generally ranging between -$1 billion and $1 billion.
This indicates that the model's trend and seasonal components effectively capture most of the variations in the data.

5. Yearly Growth Rate:
Calculate the yearly growth rate to quantify the increase in revenue:
2016 to 2017: Approximately 20% increase in offline revenue.
2017 to 2018: Around 25% increase.
2018 to 2019: Significant growth, over 50%.
2019 to 2020: Noticeable decline, about 30% decrease.
2020 to 2021: Recovery phase, about 15% increase.
2021 to 2022: Strong recovery, approximately 40% increase.
  • Audio: We won't discuss the voice functionality since it doesn't contribute significantly to data analysis, but it does enhance the user experience.
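
The trend, seasonal, and residual components quoted in the analysis above come from a classical time-series decomposition, and the yearly growth figures are plain year-over-year percentages. Because hallucinated numbers are a no-go with business data, it can help to recompute both yourself. The sketch below is a minimal illustration in plain Python, not a description of how ChatGPT 4o produced its answer: it assumes monthly data, an additive model, an even seasonal period of 12, and at least two full years of history. The function names and the sample series are hypothetical.

```python
from statistics import mean


def centered_moving_average(values, period=12):
    """Estimate the trend with a centered moving average (even period assumed)."""
    half = period // 2
    trend = [None] * len(values)  # edges stay None: not enough neighbors there
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        # Half-weight both end points so the window spans exactly one period.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[i] = total / period
    return trend


def decompose_additive(values, period=12):
    """Split a series into trend + seasonal + residual (additive model)."""
    trend = centered_moving_average(values, period)
    detrended = [v - t if t is not None else None for v, t in zip(values, trend)]
    # Average the detrended values at each position in the seasonal cycle.
    cycle = [
        mean(d for d in detrended[pos::period] if d is not None)
        for pos in range(period)
    ]
    offset = mean(cycle)  # center the seasonal effects around zero
    seasonal = [cycle[i % period] - offset for i in range(len(values))]
    residual = [
        v - t - s if t is not None else None
        for v, t, s in zip(values, trend, seasonal)
    ]
    return trend, seasonal, residual


def yearly_growth(totals_by_year):
    """Year-over-year growth in percent, keyed by (previous_year, year)."""
    years = sorted(totals_by_year)
    return {
        (prev, cur): (totals_by_year[cur] - totals_by_year[prev]) * 100
        / totals_by_year[prev]
        for prev, cur in zip(years, years[1:])
    }


if __name__ == "__main__":
    # Synthetic monthly revenue (made-up numbers): linear trend + yearly pattern.
    pattern = [-5, -5, 0, -3, -3, 4, 4, 0, 0, 0, 4, 4]
    revenue = [100 + 2 * i + pattern[i % 12] for i in range(36)]
    trend, seasonal, residual = decompose_additive(revenue)
    print("trend at month 6:", trend[6])
    print("seasonal cycle:", [round(s, 2) for s in seasonal[:12]])
    totals = {2016 + y: sum(revenue[12 * y : 12 * y + 12]) for y in range(3)}
    print("yearly growth (%):", yearly_growth(totals))
```

Running the same functions on your own CSV column gives you an independent set of numbers to compare against whatever the AI reports, which is a cheap sanity check before any figure reaches a dashboard.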


Gemini Everywhere

Just a day after OpenAI's event, Sundar Pichai, CEO of Alphabet, took center stage at Google I/O 2024. While Google didn't unveil a single, unified update like ChatGPT 4o, they showcased a range of AI-powered advancements across their product suite. Except for the corporate style, I love everything else in this Gemini update. It almost feels like Google is cheating, because they bundled in features similar to those of all the great AI Chrome add-ons, such as:

  • Gemini in Gmail: How is it related to data analysis? In Google's demonstration, you can instruct Gemini to scan your inbox, organize all receipts by vendor name, date, and cost, and enter them into a Google Sheet.

This automated data processing caught our interest, and we look forward to testing it upon release.

Gemini AI data mining assistant for non-tech users

It may well flood our Google Sheets connector with an abundance of new ideas and possibilities.

  • Gemini Live: Judging by the demo, it is similar to the Vision capability of ChatGPT 4o.

  • Gems: Create an AI assistant similar to a ChatGPT assistant, but with the twist of making it accessible from your organization's Workspace Chat conversations. You can create a Data Analyst Assistant, and because it has access to the full chat history, context should not be a problem for this AI chatbot.
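
To make the Gmail receipts demo concrete, here is a rough sketch of the target output: receipts organized by vendor name, date, and cost, in a form that a spreadsheet can import. It says nothing about how Gemini works internally or about any Google API; the `Receipt` type, the `receipts_to_csv` function, and the sample values are all hypothetical, and the extraction step (reading the emails) is assumed to have already happened.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Receipt:
    """One receipt as it might be extracted from an email (hypothetical shape)."""
    vendor: str
    issued: date
    cost: float


def receipts_to_csv(receipts):
    """Return CSV text with receipts ordered by vendor name, then by date."""
    ordered = sorted(receipts, key=lambda r: (r.vendor.lower(), r.issued))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["vendor", "date", "cost"])
    for r in ordered:
        # ISO dates and two-decimal costs keep the sheet easy to sort and sum.
        writer.writerow([r.vendor, r.issued.isoformat(), f"{r.cost:.2f}"])
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = [
        Receipt("Beta Cloud", date(2024, 5, 2), 10),
        Receipt("Acme Supplies", date(2024, 5, 14), 3.5),
        Receipt("Acme Supplies", date(2024, 5, 1), 20),
    ]
    print(receipts_to_csv(sample))
```

The resulting text can be saved as a .csv file and imported into a spreadsheet, which is the kind of tidy table a data connector would then pick up.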


Microsoft All-in on Copilot

Finally, a week later at Microsoft Build 2024, Satya Nadella discussed Azure AI and Copilot Studio, emphasizing their evolution into not only personal AI assistants but also enterprise-grade AI tools. Personally, I find their updates more intriguing than those from OpenAI and Google. They received less attention because they use ChatGPT as the foundational model. Nevertheless, they have significantly expanded its capabilities and deserve recognition. Features relevant to data analysis include:

  • Azure AI: It is already used by over 50,000 companies, which makes you wonder how these large enterprises have integrated AI into their processes. I believe the convenience of deploying OpenAI models on your private cloud (on Azure) is one of the major use cases.

Azure AI built for Enterprises
  • SLMs with advanced capacity: When it comes to language models, Microsoft is actually leading in various areas, such as multimodal models, open-source models, and small language models (SLMs). What is even more intriguing is their demonstration of using a small model to explain a chart and provide an insightful overview of the data.

Small model, big capacity for data analysis
  • Intelligent Data Platform: My first reaction was, "Oh no!!!" This deserves a full blog post of its own.

Is it our biggest competitor?
  • Copilot connectors: This is where you build your data ingestion. The only catch is that it only works with the Microsoft data stack, such as Dataverse, Fabric, or Azure Data Cloud.

  • Copilot Agents: Taking a significantly different approach from OpenAI or Google, Microsoft allows you to build a functional workflow by integrating directly with third-party services.

Copilot Agents are built different
  • Knowledge select (part of Copilot Studio): Letting you choose the training data sources not only makes Copilot's responses more concise but also speeds up the process.

  • Actions & Publishing Channels: Automate AI actions through direct integrations with Slack, Telegram, Twilio, Zendesk, and, of course, Dynamics 365.


Our first impressions:

  • For Google enthusiasts, be patient: Gemini Studio has already been rolled out, but it is designed for more advanced users. I believe Gemini will become the go-to platform for business professionals without technical expertise.

  • ChatGPT 4o: For savvy data professionals, by which I mean digital marketers, data-driven salespeople, and tech entrepreneurs. With OpenAI's Data Analyst Assistant, you can streamline complex data analyses, saving valuable time every day.

  • Microsoft will be the focus for enterprises: While integrating Microsoft's BI stack may present challenges and costs, its robust support for leading data warehouses and databases (including Snowflake, Redis, MongoDB, and Oracle), coupled with Copilot's deep integration across Microsoft's suite, solidifies its status as the trusted enterprise solution.

  • For Builders: I will write a more in-depth article on each foundation model and its AI offering from the viewpoint of a startup founder.


What about Rocket.BI?

We have had many discussions and pilots on how to integrate AI into our data intelligence platform. Because our platform is the single source of truth trusted by our clients in Fintech, Retail, and eCommerce, the security and accuracy of insights are our top priority. Get ready for some exciting new features coming your way next month:

  • Data analyst assistant.

  • Chat with chart.

  • Data analyst query copilot.

  • Chat with dataset.

Upcoming AI update for RocketBI

Register Now to start your data-driven journey with RocketBI. And stay tuned for some thrilling surprises!

