Many companies have complex ERPs that capture and store most of their business data in one place. But they often do not use the full potential of this data to retrieve meaningful insights that could improve the way they work. This is where data scientists can help. AI and ML are the next steps for many companies that want to get more from their data, and a team with the right combination of business knowledge, technology expertise and data science skills can make the move easier.
A Data Scientist is someone who can use a set of data to build a use case, extract knowledge, do experiments, analyze the results and come up with a solution.
Data can be machine-readable (structured) or human-readable (unstructured). Unstructured data refers to information that requires intelligence to interpret and study, such as an image or the meaning of a block of text.
Structured data refers to information that computer programs can process. A program is a set of instructions for manipulating data, and for a program to perform instructions on data, that data must have some kind of uniform structure. Structured data can be collected from multiple sources.
All companies have data that is not being used and that could help in decision-making. Customer data reveals which products the company should focus on. IoT data can inform process optimisation, such as replacing parts before they cause issues. Forecasting improves with more information and context.
Data needs to be processed and interpreted before it can be used for insights, leveraging data management and data science techniques.
There are four main steps in extracting knowledge from data. Data ingestion is the first: the process of obtaining and importing data for immediate use or for storage in a database. Data can be streamed in real time or ingested in batches. This work is done by the Data Engineer.
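As a minimal sketch of batch ingestion, the snippet below loads a CSV export into a local database for later processing. The file name, table name and the `order_date` column are hypothetical placeholders, not part of any specific ERP.

```python
import sqlite3
import pandas as pd

def ingest_batch(csv_path: str, db_path: str = "warehouse.db") -> int:
    """Load one ERP export file and append it to a local staging table."""
    # 'order_date' is an assumed column name in the export
    df = pd.read_csv(csv_path, parse_dates=["order_date"])
    with sqlite3.connect(db_path) as conn:
        df.to_sql("sales_orders", conn, if_exists="append", index=False)
    return len(df)

if __name__ == "__main__":
    rows = ingest_batch("erp_sales_export.csv")  # hypothetical export file
    print(f"Ingested {rows} rows")
```

In a real pipeline the same idea scales up to scheduled jobs or streaming tools, but the principle of moving raw data into one queryable place stays the same.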
After the data is stored, the Data Scientist has to go through it and clean it. Data processing (cleansing) is the process of ensuring that the data is correct, consistent and usable. Common inaccuracies include missing values and typographical errors. Improving data quality is critical at this step, so the data should be validated against all the rules that make sense for that specific data. It is also important to understand whether missing data can be obtained from another source, and whether values can be converted to uniform units and measures. The Data Scientist must also check the accuracy and consistency of the data against other data sets and real values. Normally a process is created (using Python scripts, R or other tools) that goes through the data looking for unexpected or incorrect values, cleans them by fixing or removing them, checks the data again, and then reports the changes and the quality of the resulting data set.
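A minimal cleansing pass could look like the sketch below. The column names, the reference price list and the business rules are all hypothetical; the point is the pattern of deduplicating, normalising units, filling gaps from another source and validating rules.

```python
import pandas as pd

# Hypothetical fallback source for missing prices
REFERENCE_PRICES = {"SKU-001": 9.90, "SKU-002": 14.50}

def clean_orders(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    # Remove exact duplicates produced by repeated exports
    df = df.drop_duplicates()
    # Normalise text fields (inconsistent casing and whitespace)
    df["country"] = df["country"].str.strip().str.title()
    # Enforce uniform units: assume weights arrive mixed in g and kg
    grams = df["weight_unit"].eq("g")
    df.loc[grams, "weight"] = df.loc[grams, "weight"] / 1000.0
    df["weight_unit"] = "kg"
    # Fill missing prices from the reference list, drop what remains unknown
    df["unit_price"] = df["unit_price"].fillna(df["product_id"].map(REFERENCE_PRICES))
    df = df.dropna(subset=["unit_price"])
    # Validate a simple business rule: quantities must be positive
    invalid = df["quantity"] <= 0
    print(f"Removed {invalid.sum()} rows with non-positive quantities")
    return df[~invalid]
```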
Data mining is the process of finding patterns, anomalies and correlations in data to solve problems through data analysis. Additionally, data mining techniques are used to build machine learning (ML) models that power applications such as search engine algorithms and recommendation systems.
Mathematical and statistical models are used to find patterns in the data. Many libraries in Python or R make these tools available, such as TensorFlow. Examples of techniques include regression, classification and clustering.
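As an illustration of the clustering case, the sketch below uses scikit-learn (one common Python library alongside TensorFlow) to group customers by purchase behaviour. The features and values are invented for the example.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer behaviour data
df = pd.DataFrame({
    "orders_per_year": [2, 3, 40, 38, 15, 14],
    "avg_order_value": [500, 450, 30, 35, 120, 110],
})

# Put features on a common scale, then look for natural groups
X = StandardScaler().fit_transform(df)
df["segment"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(df)
```

The resulting segments (for example, few large orders versus many small ones) are the kind of pattern that then gets evaluated against the business objectives.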
In the end, the results are evaluated and compared with the business objectives. Businesses can learn more about their customers, develop more effective strategies for different business functions, and allocate resources in a more optimal and insightful way. This brings them closer to their objectives and supports better decisions.
Findings are communicated through reporting and monitoring so that key people in the business can understand the results in a clear and concise way. Monitoring usually provides an alert or warning at a specific point in time, while reporting displays information in an organised manner, typically as a table, graph or chart. In information technology, reporting is divided into two types: operational and executive. Operational reporting presents information that tends to be more technical and detailed. Executive reporting takes a broader, higher-level perspective and is generally used to inform managers about financial decisions.
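To make the distinction concrete, monitoring can be as simple as a scheduled check against a threshold, with reporting left to visualisation tools. The metric and threshold below are hypothetical.

```python
import pandas as pd

ALERT_THRESHOLD = 0.05  # hypothetical: alert if the daily error rate exceeds 5%

def check_error_rate(daily_errors: pd.Series, daily_invoices: pd.Series) -> None:
    """Compare today's error rate with the agreed threshold."""
    rate = daily_errors.iloc[-1] / daily_invoices.iloc[-1]
    if rate > ALERT_THRESHOLD:
        # In practice this would notify by email or chat; here we just print
        print(f"ALERT: error rate {rate:.1%} exceeds {ALERT_THRESHOLD:.0%}")
```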
Companies with ERPs and data-driven processes have a lot of data about their day-to-day operations that they can use in their digital transformation. Data science can help companies get more insights for better decision-making. Here are a few examples:
ERP platforms normally contain data about sales over time. Data scientists can discover trends that help the company understand how soon to order more products and which items are likely to sell each season. Understanding seasonality gives key users guidance so they can, for example, keep stock of sunscreen and cough drops at appropriate times. The same approach can be used in manufacturing to predict when prices may change, affecting how many supplies should be purchased, or in customer service to determine how many people the company will need in the call center.
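A simple way to surface such seasonality is to resample the order history by month and average across years, as sketched below. The export file and column names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical ERP order export with order_date, product_family, quantity
orders = pd.read_csv("erp_sales_export.csv", parse_dates=["order_date"])

monthly = (
    orders
    .set_index("order_date")
    .groupby("product_family")["quantity"]
    .resample("MS")          # month-start frequency
    .sum()
    .unstack("product_family")
)

# Averaging each calendar month across years gives a rough seasonality profile
seasonality = monthly.groupby(monthly.index.month).mean()
print(seasonality)
```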
ERP systems hold a lot of historical data but do not always provide much context. Data scientists can merge data sets from external sources and match them to ERP data to enrich decision-making or prediction models. For example, they can combine business data with other factors that influence customer demand, such as weather or news, to understand whether current events have any effect on business processes.
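The sketch below shows one way such a join could look, matching daily sales to a weather history by date and city. The three files and their columns are hypothetical.

```python
import pandas as pd

sales = pd.read_csv("daily_sales.csv", parse_dates=["date"])        # date, store_id, revenue
weather = pd.read_csv("weather_history.csv", parse_dates=["date"])  # date, city, avg_temp, rain_mm
stores = pd.read_csv("stores.csv")                                  # store_id, city

enriched = (
    sales
    .merge(stores, on="store_id", how="left")
    .merge(weather, on=["date", "city"], how="left")
)

# A quick first check: does revenue differ between rainy and dry days?
print(enriched.groupby(enriched["rain_mm"] > 0)["revenue"].mean())
```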
Production machine sensors create data used to monitor production quality and efficiency. Knowing when machines are beginning to produce low-quality items is essential for ERP planning: performing preventative work now avoids downtime later. Data scientists create models of machine performance designed to minimise downtime and fit irregular preventative work into ERP production schedules.
We already have an article that explains this topic in depth here.
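As a very small sketch of the sensor-monitoring idea above, readings that drift outside a rolling band can be flagged as early candidates for preventative work. The sensor file, the `vibration` column and the 3-sigma rule are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sensor log with timestamp and vibration level
readings = pd.read_csv("machine_sensor.csv", parse_dates=["timestamp"])
readings = readings.set_index("timestamp").sort_index()

window = readings["vibration"].rolling("24h")
mean, std = window.mean(), window.std()

# Flag readings more than three standard deviations from the 24h rolling mean
readings["anomaly"] = (readings["vibration"] - mean).abs() > 3 * std
print(readings.loc[readings["anomaly"]].head())
```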
It is possible to develop machine learning (ML) algorithms to find patterns in very different datasets. One example is detecting invoices that are more likely to contain errors, based on the supplier, the number of items, the amount and other features that can be found in the related data. This can minimize any rework during month-end closing by reducing the previously undetected errors.
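One possible shape for such an algorithm is a simple classifier trained on past invoices that needed correction, as sketched below. The training file, its columns and the `had_error` label are hypothetical; any comparable features from the ERP could be used.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical history: past invoices labelled 1 if they later needed correction
invoices = pd.read_csv("invoices_history.csv")
features = pd.get_dummies(
    invoices[["supplier_id", "item_count", "total_amount"]],
    columns=["supplier_id"],  # one-hot encode the supplier
)
target = invoices["had_error"]

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```

New invoices with a high predicted error probability can then be routed for review before month-end closing, instead of being found after the fact.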
Having Data Scientists working closely with the business is an excellent way to produce short- and long-term results. Gaining new insights from the existing data can improve decision-making. The process does not have to be complicated. It can use simple programming languages (e.g. Python) and utilize any visualization tools (e.g. PowerBI, Tableau). It could also be much more complex and use elements of existing enterprise architecture (Azure Cloud Services, AWS).