One of the most publicized, and arguably most important, use cases of Big Data technologies is enabling Data Science techniques to solve complex business problems. Companies that treat their data as a corporate asset can improve business processes and inform corporate strategy with insights gained from predictive and optimization models. Although the potential business value of these projects is high, their success depends on overcoming an age-old enterprise challenge: bridging the communication and prioritization gaps between a company’s business units and its IT department. While the IT department houses the people most knowledgeable about the data and the project-enabling technologies, project results are relevant only if they align with the needs of the business stakeholders.
Because of this mutual dependency, a strong focus on business analysis is necessary to derive business-critical insights from a Data Science project. The project team must understand the hypotheses that could answer the business question, as well as the data that may be relevant to the analysis. The team must be able to integrate the business needs with the analytic model, choosing data sources that the business believes to be relevant while remaining flexible if hypotheses change or data sources prove unusable. Whether the responsibility for understanding the business needs falls to a Business Analyst, Data Analyst, or Data Scientist, this information is as essential to discovering business insights as the analytic model itself.
Business Analysis in Project Preparation
The Business Analyst’s first step in a Data Science project is to brainstorm the problem with key stakeholders. Generally, by the time a Business Analyst and Data Scientist are brought into a project, the question that the business wants answered has already been defined. The Analyst therefore works with the stakeholders to brainstorm potential reasons for the problem and to determine which data sets and sources could provide insight. The same elicitation techniques used to gather software requirements are used to identify hypotheses (proposed explanations to be tested) and data sources for the Data Scientist to explore. Understanding the stakeholders’ point of view is crucial to uncovering specific insights into their business processes and the data that supports them.
Continued stakeholder involvement is needed to prioritize the hypotheses and to decide which ones to pursue and which to set aside because the data is unavailable or not in a usable format. The Business Analyst needs to keep the discovery process on track, ensuring that busy stakeholders stay engaged, answer questions, and provide the needed insights. Sending reminders is a common Business Analyst tactic that often proves useful in moving the process along.
The elicitation process culminates in a documented list of hypotheses and the data sources needed to test them. The Business Analyst collects all of the inputs and authors the document, which is then given to the stakeholders for sign-off. Once the hypotheses are agreed upon, data preparation and modeling can begin.
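For teams that prefer to keep the hypothesis list in a structured form, the prioritization and set-aside decisions described above could be captured in a small register. The following Python sketch is only an illustration under stated assumptions: the class names (`DataSource`, `Hypothesis`), the `triage` function, the rule that every source must be both available and usable, and all of the example hypotheses are hypothetical and do not come from any particular project or tool.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataSource:
    """A candidate data source named by stakeholders (hypothetical model)."""
    name: str
    available: bool  # can the team obtain access to it?
    usable: bool     # is it in a format the Data Scientist can work with?


@dataclass
class Hypothesis:
    """A proposed explanation to be tested, with a stakeholder-assigned priority."""
    statement: str
    priority: int  # 1 = highest priority (assumed convention)
    sources: List[DataSource] = field(default_factory=list)

    def testable(self) -> bool:
        # Assumed rule: a hypothesis can be pursued only if it has at least
        # one data source and every source is both available and usable.
        return bool(self.sources) and all(
            s.available and s.usable for s in self.sources
        )


def triage(hypotheses: List[Hypothesis]) -> Tuple[List[Hypothesis], List[Hypothesis]]:
    """Split the register into hypotheses to pursue (ordered by priority)
    and hypotheses to set aside for lack of workable data."""
    pursue = sorted((h for h in hypotheses if h.testable()),
                    key=lambda h: h.priority)
    set_aside = [h for h in hypotheses if not h.testable()]
    return pursue, set_aside


# Invented example data, for illustration only.
crm = DataSource("CRM exports", available=True, usable=True)
calls = DataSource("Call-centre transcripts", available=True, usable=False)

register = [
    Hypothesis("Customers leave after repeated support calls", 1, [calls]),
    Hypothesis("Customers leave after a price increase", 2, [crm]),
    Hypothesis("Customers on legacy plans are less likely to renew", 3, [crm]),
]

pursue, set_aside = triage(register)
for h in pursue:
    print("Pursue:", h.statement)
for h in set_aside:
    print("Set aside:", h.statement)
```

In this sketch the highest-priority hypothesis is set aside because its only data source is not in a usable format, which mirrors the trade-off stakeholders are asked to sign off on. A shared spreadsheet would serve the same purpose; the point is only that each hypothesis is recorded together with its data sources and their status.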