Data are everywhere: they pervade our world. Most of our daily routines now leave digital traces: sending an email, visiting a website, calling a friend, posting on a social network, or using a loyalty card at the supermarket. In addition, the last few years have seen an explosion of connected devices, most obviously smartphones but also more specialised devices in areas such as health or home automation. The number of connected devices generating, collecting and sharing data could reach 75 billion by 2025. Other figures related to this exponential data generation are staggering: an estimated 90% of the world's data was created in the last two years alone, the volume of generated data more than doubles every year, and by 2023 the digital data universe is projected to reach roughly 100 trillion gigabytes, with the big data market valued at around $200 billion.
Simultaneously, our capacity to collect and store these data is growing just as fast, meaning that we are now able to keep track of this huge amount of information. This dramatic growth in digital information is what we call Big Data: data that we generate and acquire far more rapidly than we can process, analyse and exploit them.
Indeed, despite this flood of digital traces, few initiatives have succeeded in efficiently leveraging large-scale digital traces to address the many challenges facing the information business. This is partly due to the rise of ever-growing volumes of unstructured data such as images, videos and text (which represent more than 80% of the data generated) and the inadequacy of traditional approaches for managing and analysing such data.
Consequently, new concepts and approaches aimed at resolving these issues have been introduced over the last few years. Numerous marketing terms, arguably too many, have been used to describe this new analytics paradigm: Artificial Intelligence, Advanced Analytics, Machine Learning, Big Data Analytics, Deep Learning, Natural Language Processing and Cognitive Computing.
Given how rapidly such concepts have emerged and gained attention in industry, but also the apparent complexity and confusion they may bring, I believe it is crucial to provide a better understanding of Artificial Intelligence in the context of large-scale data. My core objective is precisely this challenge: performing deep dives into some of the cutting-edge algorithms and analytical engines within AI, and describing how large-scale data can be stored and managed.
Press articles about my research: