Hello Christophe, can you explain the role of the "data scientist" today?
Today, I would rather talk about data science in the plural. It is no longer a single profession but several, and the field is still taking shape.
In terms of profile, there are two types of data scientists today. On the one hand, there is the versatile profile that can do everything. On the other hand, structured companies that already have a data science department employ more specialized profiles. At Ellisphere, we use a hybrid model: standard processes are typically managed by the IT department, while the data science department explores new data.
In my previous company, I played the role of a free agent. I was in charge of retrieving data from the databases, maintaining and cleaning it, developing interpretation models from it, and finally deploying those models to production and maintaining them. I was in charge of the entire pipeline.
What role does the data scientist play within an already structured company?
In an already structured team, the data scientist will try to improve the performance of the models. In concrete terms, if we take the example of a company like Google, which is currently working on deep learning technologies to optimize and predict responses to emails for example, the role of the data scientist will be to improve the performance of the algorithm. In this case, data scientists are located at the end of the production chain and will have few problems related to data.
There is also the problem of putting these models into production. In a structured team, data engineers specialize in connecting databases, models, and web services so that everything works together efficiently.
In my daily work at Ellisphere, I am in the middle of the process. We have our own qualified database of financial information. However, our current challenge is to integrate external databases. This work requires going through the entire value chain (analysis, sorting, correlation) to ensure the consistency of our database and to develop it further. This then allows us to think about new and more efficient models.
How has the role of the data scientist evolved?
IT departments are very structured these days. In the 1980s, there was an "IT guy" who was in charge of most IT-related tasks. Today, I think that this system still exists in the smallest structures (SMEs, VSEs). The same phenomenon can be observed in data science. I think that over time, we will see the profession split by area of expertise.
There are already specializations by type of algorithm or by type of technique, much as in the programming world. We can imagine that tomorrow, we will see a diversification of expertise, with experts specializing in model performance or in model explainability.
What does your daily work look like?
At Ellisphere, my job is to use technology to build solutions that make business-to-business relationships more secure and transparent. In concrete terms, Ellisphere's current business failure probability score follows this logic. The idea is this: I want to do business with a given company, and I need to know whether it will remain viable so that I can enter the partnership with confidence.
Our score is constantly evolving. We are working every day to improve it by adding new data that will increase its relevance. Today, the score is a good indicator of the health of companies. We manage to get good results with almost exclusively financial or macro-economic data.
What are your upcoming projects?
In the future, we would like to correlate data from other areas. Typically, we are working on vehicle registration data.
For example, in the case of a company that does not publish its accounts, this gives us visibility into the renewal of its vehicle fleet, the range of vehicles purchased, and so on. This can give us insight into the management and financial health of the company.
What is the most interesting thing about your work?
I would say that the most interesting thing is the range of possibilities open to us. The ways of processing data are endless, and we have to be creative in our work to approach problems from new angles. This requires having the right data and getting your hands dirty.