How data scientists help achieve value from ERP

Many companies have complex ERP systems that capture and store most of their business data in one place. Yet they rarely use the full potential of this data to extract insights that could improve their work. This is where data scientists can help. AI and ML are the next step for many companies that want to get more from their data, and a team with the right combination of business knowledge, technology expertise and data science skills can make the move easier.
Data Science
A Data Scientist is someone who can use a set of data to build a use case, extract knowledge, run experiments, analyze the results and come up with a solution.
Data is a collection of facts (numbers, measurements, words, etc.).
Data can be machine-readable (structured) or human-readable (unstructured). Unstructured data refers to information that requires intelligence to interpret and study, such as an image or the meaning of a block of text.
Structured data refers to information that computer programs can process. A program is a set of instructions for manipulating data. In order for a program to perform instructions on data, that data must have some kind of uniform structure. Structured data is collected from multiple sources:
- Web Data – any type of data you might pull from the internet, such as what your competitors are selling, published government data or football scores. Web data can be used to monitor competitors, track potential customers, and keep track of channel partners.
- Transactional Data – anything that requires an action to collect, like making a purchase or clicking and navigating on a website (almost every website you visit collects transactional data of some kind, either through Google Analytics or another third-party system). By examining large amounts of this data, it is possible to uncover hidden patterns and correlations. These patterns can create competitive advantages and result in business benefits like more effective marketing and increased revenue.
- Sensor Data – data produced by physical objects, often referred to as the Internet of Things (IoT). Examples are a smartwatch measuring your heart rate or a building with external sensors that measure the weather. By measuring what is happening around them, machines can make smart changes to increase productivity and alert people when they need maintenance.
Most companies have a lot of unused data that could help in decision making. Customer data reveals which products the company should focus on. IoT data can inform process optimization, for example by replacing parts before they fail. Forecasting improves with more information and context.
Data needs to be processed and interpreted before it can yield insights, using data management and data science techniques.
Data Ingestion
There are four main steps in extracting knowledge from data. Data Ingestion is the first: the process of obtaining and importing data for immediate use or for storage in a database. Data can be streamed in real time or ingested in batches. This work is done by the Data Engineer.
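Batch ingestion can be sketched with nothing more than Python's standard library. The example below is a minimal illustration, not a production pipeline: the CSV content, the `sales_staging` table and the `ingest_batch` helper are all hypothetical, and an in-memory SQLite database stands in for the real target store.

```python
import csv
import io
import sqlite3

# Hypothetical ERP export; in practice this would be a file or an API response.
RAW_EXPORT = """order_id,order_date,sku,quantity
1001,2024-01-05,SUN-50,12
1002,2024-01-05,CGH-10,30
1003,2024-01-06,SUN-50,7
"""

def ingest_batch(csv_text, conn):
    """Load one batch of CSV rows into a staging table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_staging ("
        "order_id INTEGER PRIMARY KEY, order_date TEXT, sku TEXT, quantity INTEGER)"
    )
    rows = [
        (int(r["order_id"]), r["order_date"], r["sku"], int(r["quantity"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    # INSERT OR REPLACE makes re-running the same batch idempotent.
    conn.executemany("INSERT OR REPLACE INTO sales_staging VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")  # stand-in for the real database
loaded = ingest_batch(RAW_EXPORT, conn)
total_quantity = conn.execute("SELECT SUM(quantity) FROM sales_staging").fetchone()[0]
print(loaded, total_quantity)
```

Keying the staging table on the order number means a batch that is delivered twice does not create duplicate rows, which is a common concern when batches are retried.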
Data Processing
After the data is stored, the Data Scientist has to go through the data and clean it. Data Processing (Cleansing) is the process of ensuring that your data is correct, consistent and usable. Common inaccuracies include missing values and typographical errors. Improving data quality is critical at this step, so the data should be validated against every rule that makes sense for it. It is also important to check whether missing data can be obtained from another source, and whether values recorded in different units or measures can be converted to a common one. The Data Scientist must also check the accuracy and consistency of the data against other data sets and real values. Normally a process is created (using Python scripts, R or other tools) that searches the data for unexpected or incorrect values, cleans them by fixing or removing them, checks the data again, and then reports the changes and the quality of the resulting data.
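The search, fix-or-remove and report loop described above might look roughly like the following. This is a sketch under stated assumptions: the record fields (`sku`, `weight`, `unit`), the three validation rules and the `clean_records` helper are invented for illustration, and real rules would come from the business context.

```python
def clean_records(records):
    """Return (clean_rows, report) after validating and normalising each record."""
    clean, report = [], {"fixed": 0, "removed": 0}
    for rec in records:
        row = dict(rec)  # work on a copy so the raw data stays untouched
        # Rule 1: a missing or non-numeric weight cannot be repaired here, so drop the row.
        try:
            weight = float(row.get("weight"))
        except (TypeError, ValueError):
            report["removed"] += 1
            continue
        # Rule 2: convert grams to kilograms so every row uses the same unit.
        unit = (row.get("unit") or "").strip().lower()
        if unit == "g":
            weight, unit = weight / 1000, "kg"
            report["fixed"] += 1
        # Rule 3: reject unknown units and impossible values.
        if unit != "kg" or weight <= 0:
            report["removed"] += 1
            continue
        row["weight"], row["unit"] = weight, unit
        clean.append(row)
    return clean, report

raw = [
    {"sku": "A1", "weight": "2.5", "unit": "kg"},
    {"sku": "A2", "weight": "750", "unit": "g"},   # different unit, fixable
    {"sku": "A3", "weight": "", "unit": "kg"},     # missing value
    {"sku": "A4", "weight": "-1", "unit": "kg"},   # impossible value
]
rows, report = clean_records(raw)
print(rows, report)
```

Returning a report alongside the cleaned rows keeps the last step of the process, reporting the changes and the resulting data quality, part of the same routine.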
Data Mining
Data mining is the process of finding patterns, anomalies and correlations in data to solve problems through data analysis. Additionally, data mining techniques are used to build machine learning (ML) models that power applications such as search engine algorithms and recommendation systems.
Mathematical and statistical models are used to find patterns in the data. Many Python and R libraries make these tools available, such as TensorFlow. Some examples of techniques include:
- Sequence or Path Analysis looks for patterns where one event leads to another later event.
- Clustering finds groupings in the data that were previously unknown. Items are grouped based on how similar they are to each other.
- Classification looks for new patterns and might result in a change in the way the data is organized. These algorithms predict classifications based on multiple features.
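As an illustration of clustering, here is a small k-means loop on one-dimensional data, written in plain Python rather than with a library so that every step is visible. The order values, the `kmeans_1d` name and the deterministic starting points are assumptions made for this sketch; a real project would more likely rely on an established library.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numeric values into k clusters with a plain k-means loop."""
    ordered = sorted(values)
    # Deterministic start: take the midpoint element of k equal slices of the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assign each value to the nearest centroid.
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters

# Hypothetical order values: a few small orders and a few large ones.
order_values = [20, 22, 25, 480, 500, 510]
centroids, clusters = kmeans_1d(order_values, k=2)
print(clusters)
```

No labels are supplied: the two groups emerge purely from how close the values are to each other.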
In the end, the results are evaluated and compared to business objectives. Businesses can learn more about their customers and develop more effective strategies related to various business functions and in turn leverage resources in a more optimal and insightful manner. This helps businesses be closer to their objectives and make better decisions.
Visualization - Reporting
Findings are communicated through Reporting and Monitoring so key resources of the business can understand the results in a clear and concise manner. Monitoring usually provides an alert or warning for a specific point in time, while Reporting typically displays information in an organized manner. A report usually takes the shape of a table, graph, or chart. In the field of information technology, reporting is divided into two types: executive and operational. Operational reporting presents information that tends to be more technical and detailed. Executive reporting tends to be of a broader or higher-level perspective and is generally used to educate managers about financial decisions.
ERP and Data Science
Companies with ERP systems and data-driven processes have a lot of data about their day-to-day operations that they can use in their digital transformation. Data Science can help these companies gain more insights for better decision-making. Here are a few examples:
Demand Forecasting
ERP platforms normally contain data about sales over time. Data scientists can discover trends that help the company determine how soon to order more products and which items are likely to sell in each season. Understanding seasonality gives key users guidance so they can, for example, stock sunscreen and cough drops at the appropriate times. The same approach can be used in manufacturing to predict when prices may change, affecting how many supplies should be purchased, or in customer service to determine how many people the company will need in the call center.
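A very simple form of seasonal forecasting is to average past sales per calendar month and use that profile to decide how much to order. The sketch below assumes hypothetical monthly figures and a made-up `safety_factor`; it is a baseline for illustration, not a recommended forecasting model.

```python
from collections import defaultdict

# Hypothetical monthly unit sales of sunscreen: (year, month, units).
sales_history = [
    (2022, 6, 120), (2022, 7, 180), (2022, 12, 30),
    (2023, 6, 140), (2023, 7, 200), (2023, 12, 20),
]

def seasonal_profile(sales):
    """Average units sold per calendar month across all years in the history."""
    by_month = defaultdict(list)
    for _year, month, units in sales:
        by_month[month].append(units)
    return {m: sum(u) / len(u) for m, u in sorted(by_month.items())}

def reorder_quantity(profile, month, on_hand, safety_factor=1.1):
    """Units to order so stock covers expected demand plus a safety margin."""
    expected = profile[month] * safety_factor
    return max(0, round(expected - on_hand))

profile = seasonal_profile(sales_history)
print(profile)
print(reorder_quantity(profile, month=7, on_hand=50))
```

With these numbers the July average is 190 units, so with 50 units on hand and a 10% margin the sketch suggests ordering 159 more, while December needs no order if 100 units are already in stock.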
Merging ERP with Outside Data
ERP systems hold a lot of historical data but don't always provide much context. Data scientists can match data sets from external sources to ERP data to enhance insights for decision making or prediction models. They can combine business data with other factors that influence customer demand, such as weather or news. This makes it possible to understand whether events at a given time had any effect on business processes.
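Matching external data to ERP records usually comes down to joining on a shared key such as the date. The following sketch joins hypothetical daily sales with hypothetical maximum temperatures and compares average sales on hot and mild days; the field names, the 30 °C cut-off and both helper functions are assumptions for illustration.

```python
# Hypothetical ERP extract: daily unit sales for one product.
erp_sales = [
    {"date": "2024-07-01", "sku": "SUN-50", "units": 40},
    {"date": "2024-07-02", "sku": "SUN-50", "units": 95},
    {"date": "2024-07-03", "sku": "SUN-50", "units": 35},
    {"date": "2024-07-04", "sku": "SUN-50", "units": 20},
]
# Hypothetical external data: maximum temperature in Celsius per day.
weather = {"2024-07-01": 24.0, "2024-07-02": 33.5, "2024-07-03": 22.0}

def merge_on_date(sales, temps):
    """Left join: keep every ERP row and add the temperature, or None if unknown."""
    return [{**row, "max_temp_c": temps.get(row["date"])} for row in sales]

def average_units(rows, predicate):
    """Mean units over the rows that satisfy the predicate, or None if there are none."""
    matched = [r["units"] for r in rows if predicate(r)]
    return sum(matched) / len(matched) if matched else None

merged = merge_on_date(erp_sales, weather)
hot = average_units(merged, lambda r: r["max_temp_c"] is not None and r["max_temp_c"] >= 30)
mild = average_units(merged, lambda r: r["max_temp_c"] is not None and r["max_temp_c"] < 30)
print(hot, mild)
```

A left join keeps ERP rows even when the external source has no matching entry, so gaps in the outside data stay visible instead of silently shrinking the data set.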
Predictive Maintenance
Production machine sensors create data used to monitor production quality and efficiency. Knowing when machines are beginning to produce low-quality items is essential for ERP planning. Preventing downtime later by performing preventative machine work now improves business processes. Data scientists create models of machine performance designed to minimize downtime and fit irregular preventative work into ERP production schedules.
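One simple way to notice that a machine is starting to produce low-quality items is to watch a rolling average of its defect rate. The sketch below is a threshold rule rather than a trained model, and the readings, window size and 5% threshold are hypothetical values chosen for illustration.

```python
from collections import deque

def first_maintenance_alert(defect_rates, window=3, threshold=0.05):
    """Return the index of the first reading whose rolling mean exceeds the threshold."""
    recent = deque(maxlen=window)  # holds only the latest `window` readings
    for i, rate in enumerate(defect_rates):
        recent.append(rate)
        if len(recent) == window and sum(recent) / window > threshold:
            return i
    return None  # no alert needed

# Hypothetical share of defective items per production run.
readings = [0.01, 0.02, 0.01, 0.03, 0.06, 0.08, 0.09]
print(first_maintenance_alert(readings))
```

Averaging over a window avoids raising an alert for a single bad run, while still reacting once the trend is sustained; the alert index could then be used to slot maintenance work into the production schedule.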
We already have an article that explains this topic in depth here.
ML to predict likely errors
It is possible to develop machine learning (ML) algorithms to find patterns in very different datasets. One example is detecting invoices that are more likely to contain errors, based on the supplier, the number of items, the amount and other features that can be found in the related data. This can minimize any rework during month-end closing by reducing the previously undetected errors.
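Before training a full ML model, a frequency baseline can already rank invoices by risk. The sketch below uses only one of the features mentioned above, the supplier, and estimates a smoothed historical error rate per supplier. It is deliberately not a complete ML algorithm; the labelled history, the pseudo-count of 2 and the function names are assumptions for illustration.

```python
from collections import Counter

# Hypothetical labelled history: (supplier, invoice_had_error).
invoice_history = [
    ("ACME", True), ("ACME", True), ("ACME", False),
    ("GLOBEX", False), ("GLOBEX", False), ("GLOBEX", False), ("GLOBEX", True),
]

def fit_error_rates(labelled, pseudo_count=2):
    """Estimate a smoothed error rate per supplier from labelled invoices."""
    totals, errors = Counter(), Counter()
    for supplier, had_error in labelled:
        totals[supplier] += 1
        errors[supplier] += int(had_error)
    overall = sum(errors.values()) / sum(totals.values())
    # Pull each supplier's rate towards the overall rate so that suppliers
    # with little history do not get extreme scores.
    rates = {
        s: (errors[s] + pseudo_count * overall) / (totals[s] + pseudo_count)
        for s in totals
    }
    return rates, overall

def error_risk(supplier, rates, overall):
    """Risk score for a new invoice; unknown suppliers get the overall rate."""
    return rates.get(supplier, overall)

rates, overall = fit_error_rates(invoice_history)
print(sorted(rates, key=rates.get, reverse=True))
```

Invoices from the highest-scoring suppliers can be reviewed first. A real model would add further features such as the number of items and the amount, and would be evaluated on held-out invoices.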
How can Tenthpin help you?
Having Data Scientists working closely with the business is an excellent way to produce short- and long-term results. Gaining new insights from the existing data can improve decision-making. The process does not have to be complicated. It can use simple programming languages (e.g. Python) and utilize any visualization tools (e.g. PowerBI, Tableau). It could also be much more complex and use elements of existing enterprise architecture (Azure Cloud Services, AWS).
We are a globally leading business and technology boutique consultancy for the Life Sciences industry. Our clients are leading companies from pharma, biotech, med tech, healthcare & animal health.
© 2025 Tenthpin AG | Illustrations by: www.till-lauer.ch