The AI technological revolution we are currently experiencing brings new use cases every day. We see players gaining a growing competitive advantage by extracting value from their data through AI. As a result, more and more companies are embarking on "data science" projects. This usually starts with the recruitment or internal promotion of a data scientist who must "enhance" the company's data. However, we observe that 50% of data science projects fail before they have brought any value to the company.
The success of an AI project is not simply a matter of connecting data with a data scientist. It requires the implementation of a methodology and an organization.
A real business problem
Data science is gaining in maturity. More and more training courses are being created around AI. They teach typical AI use cases as well as the techniques to apply in these situations. This standardization creates the risk of proposing projects that are not adapted to the specificities of the company. For example, a typical project would be to build a model that detects churn (customers leaving for the competition). At first glance, such a model looks valuable: what respectable company doesn't want to keep its current customers? However, churn may not be one of the company's major issues at this time. In that case, there is a good chance that the model will not be used, or even deployed.
As the field has matured, it has acquired its own standard methods: a given situation calls for a given type of model. However, these standard techniques are not always aligned with business needs. For example, based on a customer relationship history, a data scientist can propose a model for reducing churn, even though churn may not be a limiting factor for the business at all. So there is a gap between the business need and standard data science projects.
Before embarking on a data science project, it is essential to ask yourself the right questions. What are the limitations or bottlenecks in my business? What do I really need in order to overcome them? On this first point, we come back to the basics of project management: before creating a product, you have to understand the customer's needs as well as possible.
A metric aligned with needs
By definition, an artificial intelligence model uses data to minimize the gap between its prediction and the true value of the variable it seeks to predict. There are several ways to measure this gap, and each leads to a different model. There are also metrics that play no part in building the model but that estimate its overall performance. Moreover, there are standard metrics for each type of problem to be solved. These are the metrics data scientists use to justify the performance of a model.
Unfortunately, these standard metrics are not always understandable or relevant for the business. Let's take the example of a system for detecting fraudulent companies, developed in collaboration with clients. In this case, the model seeks to separate companies that have actually committed fraud from those that have not. Internally, we use a metric (the ROC) that measures how well fraudulent companies are separated from safe ones. This metric means nothing to the customer. Instead, we proposed a metric that estimates, for a given accepted risk level, the number of prospects that will be marked as risky and the rate of truly fraudulent companies among them. With this, the client can choose the level of risk they are willing to accept according to their internal constraints (file processing capacity, loss of turnover...).
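To make the idea concrete, here is a minimal sketch of such a business metric. It is not the implementation used in the project described above: the function name `business_metric`, the score convention (higher means riskier) and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BusinessMetric:
    threshold: float   # accepted risk level: scores at or above it are flagged
    flagged: int       # number of prospects marked as risky
    fraud_rate: float  # share of truly fraudulent companies among the flagged ones


def business_metric(scores, is_fraud, threshold):
    """Summarise a model at a given risk threshold in business terms."""
    flagged_labels = [
        label for score, label in zip(scores, is_fraud) if score >= threshold
    ]
    flagged = len(flagged_labels)
    # Avoid dividing by zero when nothing is flagged at this threshold.
    fraud_rate = sum(flagged_labels) / flagged if flagged else 0.0
    return BusinessMetric(threshold, flagged, fraud_rate)


# Hypothetical model scores (higher = riskier) and known outcomes (1 = fraud).
scores = [0.95, 0.80, 0.70, 0.40, 0.30, 0.10]
is_fraud = [1, 1, 0, 1, 0, 0]

# One line per candidate risk level, readable without any data science background.
for t in (0.9, 0.6, 0.2):
    print(business_metric(scores, is_fraud, t))
```

Each printed line answers two questions the client can act on: how many files will have to be processed at this risk level, and how many of them are likely to be real fraud.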
Once the customer's needs have been identified, it is essential to define a business metric that lets the end user estimate the quality of the model and its impact on their activities.
A performance validated by the experts
Once we have identified a real problem to solve, we have to make sure that the way we solve it is consistent with the knowledge of the experts. We must then build trust between the solution and the user. To that end, the user and/or the domain experts must take part in the development and validation of the system, and we make sure that it reproduces the standard heuristics of the domain. This also makes it possible to test the new heuristics that the model has found. This last point is detailed in another of our articles, on the explainability of AI systems.
When Ellisphere's default score was created and put into production in 2018, numerous studies were carried out to confirm its relevance. Predictions were submitted to financial analysts so that their analysis of each company, and the arguments behind it, could be compared with the model's. This allowed us to verify traditional heuristics, such as: companies that do not make profits are risky, or companies are particularly risky between their second and fifth year of life and less risky thereafter.
An adapted technical solution
Most data science courses and blogs focus on algorithms and, unfortunately, very rarely mention the deployment phase. How can a statistical model be moved from the data scientist's personal computer to an environment where end users can interact with it? The sources that do talk about deployment mainly propose a web API: a web interface, queried like a website, that returns the model's result when input data is passed to it.
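As an illustration of what such a web API looks like, here is a minimal sketch built only on Python's standard library. The `predict` function is a placeholder with a made-up rule, not a real model, and the names `ModelHandler` and `serve` as well as the port are assumptions; a production service would normally rely on a dedicated web framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Placeholder for a trained model: returns a made-up risk score."""
    # Hypothetical rule, for illustration only: loss-making companies score higher.
    return 0.8 if features.get("profit", 0) < 0 else 0.2


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and validate the JSON payload sent by the client.
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))
            if not isinstance(features, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            self.send_error(400, "Body must be a JSON object")
            return
        # Return the model output as JSON.
        body = json.dumps({"score": predict(features)}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve predictions on localhost until the process is interrupted."""
    HTTPServer(("localhost", port), ModelHandler).serve_forever()
```

Calling `serve()` starts the service. A client then sends a POST request with a JSON body such as `{"profit": -1200}` and receives a JSON object containing the score.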
The web API solution makes an assumption about the company: that it has an appropriate IT infrastructure to transmit the data and process the response. However, an SME does not necessarily have such an infrastructure. Many processes may be run in Excel, and the notions of database and information system can be far removed from the daily reality of the company. For an AI project to be successful, it is necessary to think from the beginning about how end users will interact with the solution, so that they can really benefit from it.
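For a company that works mostly in spreadsheets, one possible alternative is to score a file exported from Excel as CSV and hand back the same file with one extra column. The sketch below only illustrates that idea: `predict` is a placeholder rule rather than a real model, and the column names and sample rows are assumptions.

```python
import csv
import io


def predict(row):
    """Placeholder model: hypothetical rule returning a made-up risk score."""
    return 0.8 if float(row["profit"]) < 0 else 0.2


def score_csv(source, target):
    """Copy rows from `source` to `target`, appending a `score` column."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=[*reader.fieldnames, "score"])
    writer.writeheader()
    for row in reader:
        row["score"] = predict(row)
        writer.writerow(row)


# In-memory stand-in for a file exported from a spreadsheet.
exported = io.StringIO("company,profit\nAcme,-1200\nGlobex,5300\n")
scored = io.StringIO()
score_csv(exported, scored)
print(scored.getvalue())
```

With real files, `source` and `target` would be opened with `open(path, newline="")`, and the result can be reopened directly in the spreadsheet the user already knows.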
What to conclude?
Extracting value from a company's data is a major issue today. However, many companies have not yet fully integrated the use of data into their processes, through Business Intelligence for example. Setting up AI projects is therefore complicated, and many projects fail to bring value. A few simple rules can nevertheless limit the risk of failure. They mainly refocus the project on the need and the context, and push the data science part down into the implementation details. Finally, even if AI today enables applications that were unimaginable yesterday, we must keep in mind that it is a tool in the service of a concrete problem.