The golden age of (data) exploration

An Axios article titled Behind the Curtain: A White-Collar Bloodbath by Jim VandeHei and Mike Allen has gotten a lot of attention recently, with coverage including:

·  CNBC Segment on YouTube

·  CNN Segment on YouTube

·  Fox & Friends Segment on YouTube

·  LinkedIn News

The article starts with a stark warning:

“Dario Amodei, CEO of Anthropic, one of the world's most powerful creators of artificial intelligence, has a blunt, scary warning for the U.S. government and all of us:

AI could wipe out half of all entry-level white-collar jobs and spike unemployment to 10 to 20 percent in the next one to five years, Amodei told us in an interview from his San Francisco office.”

The article later describes

“the possible mass elimination of jobs across technology, finance, law, consulting, and other white-collar professions, especially entry-level gigs.”

The warning is attention-grabbing, but is it true, and what would it mean for entry-level roles?

The AI-jobs conversation keeps coming up and grabbing headlines, especially among younger generations. We were told to go to college for success, but AI may change that advice. AI can empower one person to do work that once took several people or a whole team. Technology fosters change, and the tech industry itself may not be immune to its own doing.

Some claim the coming job losses will resemble the decline in blue-collar jobs during the ’80s and ’90s. The graph below shows the shift from blue-collar to white-collar jobs. Note the decrease in Manufacturing and the increase in Professional/Business Services and Education/Health Services.

(Ted C. Jones, U.S. supersector employment changes from 1950 to 2020)

However, this time around, AI might be coming for all the jobs, blue collar or white collar. There may be no stopping it, so it is important to stay on top of AI and how it may affect your industry.

What Makes AI So Disruptive?

A quick definition of AI:

“The capability of computational systems (AKA computers) to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making.”

AI systems can outperform humans at specific, repetitive tasks, especially those involving pattern recognition across large datasets. Given enough good data and a sound approach, computers can make scarily accurate predictions.

Have you ever wondered how YouTube knows what video you want to watch?
How does Spotify recommend that song that just hits different?

That’s AI learning from your behavioral patterns and the patterns of those like you. One popular predictive approach is decision trees, which can be used for regression or classification tasks depending on the problem.

 

Understanding the Tech: Data and Decision Trees

Decision Tree Example

You may be familiar with drawings like mine above even if you aren’t in data science. A decision tree is one method for predicting future outcomes. One question we may ask is: “Which customers are more likely to churn in the future?” Past data is used to answer this question: feed in a customer, and the tree returns that customer’s likelihood of churning.
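To make the churn idea concrete, here is a minimal sketch using scikit-learn's DecisionTreeClassifier. The customers, the two features (months as a customer and recent support tickets), and the labels are all made up for illustration; they are assumptions, not data from my project.

```python
from sklearn.tree import DecisionTreeClassifier

# Made-up past customers: [months_as_customer, support_tickets_last_quarter]
X_past = [
    [2, 5], [3, 4], [1, 6], [4, 5], [5, 4],        # short tenure, many tickets
    [24, 0], [36, 1], [30, 0], [48, 1], [18, 1],   # long tenure, few tickets
]
# What actually happened to each of them: 1 = churned, 0 = stayed
y_past = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Learn the tree's yes/no questions from the past data
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(X_past, y_past)

# Feed in a new customer and read off the estimated likelihood of churn
new_customer = [[3, 5]]
churn_probability = model.predict_proba(new_customer)[0][1]
print(f"Estimated churn likelihood: {churn_probability:.0%}")
```

With ten perfectly separated toy customers, the answer comes out far more clear-cut than it would on real data, where the groups overlap and the probabilities land somewhere in between.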

With a large enough sample, the law of large numbers means the churn rates we observe settle close to the true rates, and we start to see patterns in churn. These patterns are used to predict future outcomes. This process uses machine learning and can flag likely events before they happen. So how is this done?

To create a good prediction machine, you need to put good data in. That is generally tough because real-world data is messy. A good predictor requires a good data model. When I build a data model, I want something that is fast and reliable and that uses well-organized, consistently formatted data. This is an essential step.

But how do we know if the data is good?

The first step in most data projects is exploring the data, or getting a feel for it. Somebody or something needs to go through the data, understand it, and format it to fit business requirements. This can be done in Excel with pivot tables, in SQL with queries, or in Python with libraries. I want to talk more about Python and why it is so powerful.
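As a rough illustration of how the same idea carries over to Python, the sketch below builds a pivot table with pandas. The table and its column names (product, location, revenue) are hypothetical and only there to show the shape of the operation.

```python
import pandas as pd

# Hypothetical sales records; the column names are assumptions for illustration
sales = pd.DataFrame({
    "product": ["Widget", "Widget", "Gadget", "Gadget", "Widget"],
    "location": ["East", "West", "East", "West", "East"],
    "revenue": [100.0, 150.0, 200.0, 50.0, 300.0],
})

# Like an Excel pivot table or a SQL GROUP BY:
# total revenue for each product, split out by location
revenue_pivot = sales.pivot_table(
    index="product", columns="location", values="revenue", aggfunc="sum"
)
print(revenue_pivot)
```

The result has one row per product and one column per location, which is the same layout you would get from dragging Product to rows and Location to columns in Excel.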

My Python Project: Data Exploration in Action

Python comes with tools that engineers have already built and shared with the community. Think of Python functions as prebuilt tools, like a robot arm that assembles parts for you so you don’t have to build everything from scratch. Now, imagine you can log onto the internet and start using that robot arm to build your own projects. That is basically what a function is: a packaged calculation that you can use as a tool in your own projects.

One more thing: libraries are sets of functions.
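Here is a small sketch of that difference, using Python's built-in statistics library. The average_by_hand function and the monthly_sales numbers are made up for illustration.

```python
import statistics  # a library: a collection of ready-made functions


def average_by_hand(values):
    """Our own function: add the numbers up and divide by how many there are."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


monthly_sales = [120, 95, 143, 110]  # made-up numbers

# Building the tool ourselves
print(average_by_hand(monthly_sales))

# Using the prebuilt tool from the library: one line, already written and tested
print(statistics.mean(monthly_sales))
```

Both lines print the same average. The library version is the robot arm: somebody else built it, and you just use it.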

In Python data exploration, there are a few popular libraries:

  • Pandas – Data exploration

  • Matplotlib – Visualizing data, similar to Power BI or Tableau

  • Seaborn – Works with Matplotlib to create more detailed visuals

  • Scikit-learn – Machine learning toolkit that includes many model types, including regression, classification, clustering, and more

  • NumPy – Great for mathematics

Case Study: My Sales & Revenue Project

I worked on a project that looked at sales and revenue, breaking them down by dimensions like Product, Customer, and Location to tell a story using Python. To do this, I used three of the libraries listed above: Pandas, Matplotlib, and Seaborn. These libraries were essential tools for data exploration and can do things like clean the data and help you learn about it. Exploration is generally a good first step in data projects such as dashboards and prediction tools.

Everybody talks about AI and predicting the future. The essential part in all of that is a good data model, which comes from strong data exploration and preparation.

I created a Python data exploration script that can be reused for future projects with just a few minor tweaks. Throughout this process, I uncovered data errors, trends, and distributions, which gave me a better sense of the shape of my data. High-level knowledge of your data is essential before diving deep into a business problem.
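A reusable exploration step could look something like the sketch below. This is not the script from the project; the explore helper, the orders table, and its column names are hypothetical, and the checks shown (missing values, duplicate rows, a numeric summary) are just common starting points.

```python
import pandas as pd


def explore(df: pd.DataFrame) -> dict:
    """Collect a few high-level facts about a DataFrame (hypothetical helper)."""
    return {
        "rows": len(df),
        "missing_by_column": df.isna().sum().to_dict(),   # gaps in the data
        "duplicate_rows": int(df.duplicated().sum()),     # exact repeated rows
        "numeric_summary": df.describe(),                 # min, max, mean, quartiles
    }


# Made-up orders with a few deliberate problems planted in them
orders = pd.DataFrame({
    "customer": ["Acme", "Acme", "Birch", "Cedar", "Cedar"],
    "product": ["Widget", "Widget", "Gadget", None, "Widget"],
    "revenue": [100.0, 100.0, 250.0, 80.0, -40.0],
})

report = explore(orders)
print(report["missing_by_column"])
print(report["duplicate_rows"])
print(report["numeric_summary"])
```

On this toy table the report surfaces a missing product, a repeated row, and a negative minimum revenue, which are the kinds of issues worth questioning before building a dashboard or a model. Pointing the same function at a different table is the "few minor tweaks" part.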

Sales & Revenue Dashboard

The future of work is changing fast. Stay curious and keep learning!

If you are interested in learning more about the sales and revenue project, you can visit the project on my website. Or if you just want to see the Python code and Power BI file, go over to my GitHub. Also, you can find links to connect on my website or email me at Jakelender@gmail.com to speak more about this.

