• Adam Dathi

What You Need to Know BEFORE Hiring Your First Data Scientist

Data science is firmly established as one of the ‘sexiest’ terms in data and, under the right conditions, it can be extremely beneficial to a business. However, too often it is viewed as a panacea for problems with data and poor decision-making. If you do not understand how and when to start investing in a data science function, you will be setting your new hire up for failure. Luckily, there are some pitfalls that you can easily avoid when considering your first hire.

What is Data Science?

First, let’s establish what we mean by data science. In the context of this article, data science seeks to extract meaning from often complex data sets (e.g., voice recordings or images), ‘scientifically’ assess why an event occurred, forecast and predict a future event, or even suggest how to behave to achieve a given goal. It is an intersection of computer science and statistics that either creates a one-off insight (e.g., ‘what were the main causes of client churn?’) or periodically re-calculates a result (e.g., ‘forecasting future sales’).

To understand the position further, let’s distinguish data scientists from three other data roles (the data scientist is listed last for comparison):

  • Data Engineer - Ensures that data is collected, transferred, transformed, and stored appropriately. They create the data infrastructure that the company will use.
  • Analytics Engineer - Responsible for modeling the data so that it can be used to answer business questions and setting up tools to enable company-wide reporting.
  • Data Analyst (Business Intelligence Analyst) - Creates company-wide reporting, interfaces with non-technical data consumers (like marketing and sales) to satisfy their data needs, and ensures they understand what’s happening in their teams and the wider company.
  • Data Scientist - Leverages the work of data engineers and (often) data analysts to answer more complex business questions. For example, while a standard report may show how many customers have churned and the lost value, a data science project might predict when a customer will churn and why.

Why Do Investments in Data Science Fail Before They Start?

Given that almost every business wants to answer complex ‘why’ questions, the appeal of data science is clear. However, too many companies hire data scientists without evaluating whether the business is in a position to make them successful. This usually comes down to four reasons.

1. Your Data Infrastructure is Too Immature

The most common issue in setting up a data science function is doing so when your company’s data infrastructure is too immature.

Consider the data value pyramid, listed here from the base upward:

  • Data (foundation)
  • Reporting (what happened?)
  • Insights (why did it happen?)
  • Predictions (what will happen?)
  • Prescriptions (what should we do?)

Each layer increases in complexity. If you can’t report on the basics, you are unlikely to generate complex insights or predictions. Before hiring a data scientist, ensure you have a strong foundation of good quality data that’s actively used and improved by the organization, and that it’s accessible in a non-production environment.

2. You Have Not Properly Explored Your Data Using Analysts

Data Analysts should be a company’s first port of call for making use of its data. Analysts are more cost-effective than data scientists and specialize in fast data exploration and reporting. They also interface with the rest of the company, which is essential for determining data quality.

For example, the finance team might identify discrepancies between sales figures and bank deposits, or the buying team might flag issues with product categorization. These issues only become apparent when teams start interacting with the data. Until these are resolved, asking a data scientist to forecast product sales becomes inefficient, if not impossible.
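As a rough illustration of the finance example, the sketch below compares daily sales totals against daily bank deposits and lists the days that do not match. The sample figures, the function names (`daily_totals`, `find_discrepancies`), and the tolerance are all hypothetical; in practice the rows would come from your sales system and bank statement exports.

```python
from collections import defaultdict

# Hypothetical sample data: (date, amount) pairs.
sales = [
    ("2024-01-01", 1200.0),
    ("2024-01-01", 300.0),
    ("2024-01-02", 950.0),
    ("2024-01-03", 400.0),
]
deposits = [
    ("2024-01-01", 1500.0),
    ("2024-01-02", 900.0),
]


def daily_totals(rows):
    """Sum the amounts for each date."""
    totals = defaultdict(float)
    for day, amount in rows:
        totals[day] += amount
    return dict(totals)


def find_discrepancies(sales_rows, deposit_rows, tolerance=0.01):
    """Return {date: sales total minus deposit total} for days that differ."""
    sales_by_day = daily_totals(sales_rows)
    deposits_by_day = daily_totals(deposit_rows)
    gaps = {}
    # Check every date that appears in either source.
    for day in sorted(set(sales_by_day) | set(deposits_by_day)):
        diff = sales_by_day.get(day, 0.0) - deposits_by_day.get(day, 0.0)
        if abs(diff) > tolerance:
            gaps[day] = round(diff, 2)
    return gaps


discrepancies = find_discrepancies(sales, deposits)
print(discrepancies)
```

A check like this is analyst-level work, and an empty result is a reasonable precondition before anyone is asked to forecast from the same sales data.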

3. You Do Not Have a Prioritized Pipeline of Data Science Projects

This is a symptom of a larger issue. Brainstorming data science projects is easy, and it is a task that could (and should) be handled by a data engineer or analyst before you hire a data scientist. If you haven’t developed a project backlog, you haven’t properly evaluated whether you should make the hire. A well-defined data science backlog helps answer three critical questions:

  • Is there enough value in these projects to justify the hire?
  • Do we have the right data to facilitate these projects?
  • Could these projects be accomplished without hiring a data scientist?
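One possible way to make these three questions concrete is to record each backlog idea with a rough value estimate and two yes/no flags, then group the ideas accordingly. The sketch below is only an illustration: the `ProjectIdea` fields, the sample projects, the value estimates, and the `hire_cost` figure are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class ProjectIdea:
    name: str
    estimated_annual_value: float  # rough estimate, in your own currency
    data_available: bool           # do we have the right data?
    needs_data_scientist: bool     # False if the current team could do it


# Hypothetical backlog entries.
backlog = [
    ProjectIdea("Churn prediction", 250_000, True, True),
    ProjectIdea("Sales forecast", 120_000, True, False),
    ProjectIdea("Image-based quality checks", 400_000, False, True),
]


def summarize(ideas, hire_cost):
    """Group backlog ideas by the three questions and compare value to cost."""
    ready = [p for p in ideas if p.data_available and p.needs_data_scientist]
    ready.sort(key=lambda p: p.estimated_annual_value, reverse=True)
    blocked = [p for p in ideas if not p.data_available]
    current_team = [
        p for p in ideas if p.data_available and not p.needs_data_scientist
    ]
    ready_value = sum(p.estimated_annual_value for p in ready)
    return {
        "ready_for_data_scientist": [p.name for p in ready],
        "blocked_on_data": [p.name for p in blocked],
        "doable_by_current_team": [p.name for p in current_team],
        "value_justifies_hire": ready_value > hire_cost,
    }


summary = summarize(backlog, hire_cost=150_000)  # hire_cost is a placeholder
for key, value in summary.items():
    print(f"{key}: {value}")
```

Only projects that have data and genuinely need a data scientist count toward the value of the hire; everything else is either blocked or belongs with the existing team.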

4. You Have Data Science Project Ideas, But No Plan for Implementation

This is a continuation of the previous point and something I’ve seen repeatedly. Data scientists are often hired and given ‘high value’ projects, only for their work to be forgotten because the business lacks the capacity to implement their findings. Before starting a data science project, there must be a clear path to implementation.

While some projects are one-off discoveries that can be re-run as needed, others (especially those to be productized and sold) require engineering support. This support must be secured before making your first data science hire.

When Should You Hire a Data Scientist?

To determine whether a data science function is beneficial, cost-effective, and timely, ask yourself these two questions:

  1. [Client-Facing] Does a data science function form the basis of a product or service on your roadmap to offer to clients/customers?
  2. [Internal-Facing] Are there data science projects with a clear route to implementation that your data engineers and analysts cannot handle due to the problem’s complexity?

If the answer to either is yes, then hiring a data scientist makes sense because:

  • You’ve identified a value-adding initiative (beneficial).
  • You’ve exhausted your current data team’s capabilities (cost-effective).
  • You have a clear path to implementation and potentially a deadline (timely).

What If I Shouldn’t Hire a Data Scientist Yet?

If you answered ‘no’ to both questions above, consider these alternatives:

1. Develop Your Analyst Team

A data scientist shouldn’t spend time fixing your data pipeline or generating company reports, as this would be an inefficient use of their skills. Analysts are better suited (and more cost-effective) for these tasks. They can also handle simple data science projects using out-of-the-box machine learning algorithms for tasks like churn prediction or revenue forecasting. This approach allows your business to start using, reviewing, and improving its data without the cost of a dedicated data science hire.
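To give a sense of how simple a first revenue forecast can be, here is a minimal sketch that fits a straight-line trend to monthly revenue and projects it forward. It uses a hand-written least-squares fit from the Python standard library as a stand-in for an out-of-the-box algorithm; the revenue figures and function names are hypothetical, and a real project would likely use an established forecasting library and account for seasonality.

```python
from statistics import mean

# Hypothetical monthly revenue figures (oldest first).
monthly_revenue = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ..."""
    if len(values) < 2:
        raise ValueError("need at least two data points to fit a trend")
    xs = range(len(values))
    x_bar = mean(xs)
    y_bar = mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def forecast(values, periods_ahead):
    """Extend the fitted trend line for the next `periods_ahead` periods."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    return [intercept + slope * (n + i) for i in range(periods_ahead)]


next_quarter = forecast(monthly_revenue, 3)
print(next_quarter)
```

A baseline like this also gives you something to measure against: if a later data science project cannot clearly beat it, that says something about the value of the hire.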

2. Develop Your Data Product Roadmap

If you don’t have a data product roadmap, hiring a data product manager should be your priority. A well-developed product backlog helps evaluate whether there’s real value in data science and identifies the resources needed for implementation. It also allows you to tailor your recruitment to specific needs, such as time series forecasting or multivariate testing.

3. Explore Out-of-the-Box Solutions

Many products offer data science capabilities that may be less flexible but are worth considering. Before building from scratch, investigate existing solutions for sales forecasting, customer churn analysis, or marketing attribution. These can serve as benchmarks for cost, accuracy, and speed-to-value.

Final Thoughts

This article highlights common pitfalls in data science initiatives based on my experience. While I have a deep respect for data science and its potential value, I’ve seen too many companies hire data scientists without the right foundation or expectations. For organizations that will thrive in the coming decades, establishing a data science function may be inevitable. By following the advice above, you can approach this decision cautiously, set realistic expectations, and make your first data science hire at the right time and under the right circumstances.