Soccer: When Data Drives Performance
Our society is becoming increasingly digitized thanks to the use of a wide range of digital technologies that generate massive data flows, collectively referred to as big data. This is particularly true of sports-related data collected through smartwatches, smartphone apps, satellite tracking systems, and smart clothing.
Stéphane Perrey, University of Montpellier; Gérard Dray, IMT Mines Alès – Institut Mines-Télécom and Jacky Montmain, IMT Mines Alès – Institut Mines-Télécom

Such data makes it possible to identify key performance indicators and to describe athletes’ movements on the field: the number and length of sprints and changes of direction, distance covered at certain speeds, changes in heart rate, and the player’s trajectory.
This information can thus guide training strategies. From there, it is only a short step for it to become an essential tool for improving performance or preventing injuries. It may therefore be of interest to a wide range of stakeholders: athletes, coaches, strength and conditioning specialists, adapted physical education instructors, physicians, and sports agents. The data, along with the methodologies that accompany it, opens up new avenues for research in sports science and digital science.
Recent advancements in wearable and connected sensors, cloud-based data storage, and artificial intelligence tools have been the cornerstones of a major shift in how sports-related data is analyzed. Over the past decade, research in sports science has benefited from the reduction in sensor size and the resulting increase in the ability to collect and analyze simultaneous measurements, driven by immense advances in wireless transmission.
Soccer is one of the most popular sports in the world, and over the years, more and more data has become available, which has sparked greater interest in the sport among data analysts. Related professions, such as data scientists, have emerged, requiring specific skills in statistics and computer science to collect, process, and analyze big data—with the goal of uncovering insights that inform decision-making.
In the sports world, the term "Sports Data Analyst" is commonly used, and several think tanks, academic programs, and thematic research groups (such as EuroMov Digital Health in Motion) that bring together multidisciplinary expertise have been established. How can sports science and digital science, from data collection and data management to predictive modeling of performance or injury, impact the field of sports?
The proliferation of motion data
In recent years, there has been a surge in the use of position-tracking systems to provide spatio-temporal tracking data on players on the field. Although semi-automatic camera systems have been used to track player positions during professional soccer matches, automatic tracking systems using Global Navigation Satellite Systems (GNSS) or local positioning systems are now commonly adopted by professional organizations and teams in team sports (soccer, rugby, basketball, handball, ice hockey). Some positioning systems have even equipped balls with built-in sensors.

One of the most noticeable advancements in many sports is the introduction of autonomous inertial measurement units (IMUs), which measure linear acceleration (accelerometers), rotational velocity (gyroscopes), and the Earth’s magnetic field for orientation (magnetometers) in three dimensions.
The integration of these portable sensors (IMU-GNSS) is an ongoing trend in the development of systems for tracking and detecting human movement, with applications in sports such as soccer to monitor players’ movements. Generally speaking, movement (speed, distance, and derived metrics) is primarily quantified using GNSS data, while the detection and characterization of collisions and impacts are handled by IMU data. Despite these advances, is it possible to free the player from these sensors when quantifying movement? It would appear so.
Markerless motion capture systems and algorithms have been continuously improved over the past five years to measure kinematics (i.e., the description of motion in terms of position, velocity, or acceleration) in sports. Modern computer vision algorithms using neural networks have been adapted to evaluate various forms of motor actions, providing practical means for faster data analysis with validity under real-world conditions—that is, outside the restrictive laboratory environment.
The classification of human locomotor activities from the perspective of athletic performance can be improved when acquired signals are used as inputs for machine learning algorithms. Current algorithms can analyze such datasets to identify, for example, changes of direction, which are often decisive during a game. Applying deep learning algorithms (a subfield of machine learning built on artificial neural networks, which are inspired by the structure and functioning of the brain) to video footage would enable highly accurate recognition of different types of soccer shots.
Using multiple fused IMU-GNSS sensors and various machine learning algorithms, Reilly et al. developed an automated classification model to accurately identify players’ movements involving changes of direction during competitive matches. Improvements to classification models in sports such as soccer are consistently hampered by the variability of movement patterns, which directly impacts the quality and quantity of available datasets.
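To illustrate how metrics such as distance, speed, and sprint counts can be derived from position data, here is a minimal Python sketch. It is not the method of any system or study mentioned above. The input format (timestamped x/y positions in meters in a local planar frame), the function name, the 25 km/h sprint threshold, and the one-second minimum sprint duration are all assumptions chosen for illustration; real pipelines also filter GNSS noise, which is omitted here.

```python
import math


def movement_metrics(samples, sprint_kmh=25.0, min_sprint_s=1.0):
    """Summarize a track of (t_seconds, x_m, y_m) samples.

    The sample format and both thresholds are illustrative assumptions.
    """
    threshold = sprint_kmh / 3.6  # convert km/h to m/s
    total_distance = 0.0
    sprint_count = 0
    sprint_distance = 0.0
    run_duration = 0.0  # duration of the current above-threshold run
    run_distance = 0.0  # distance of the current above-threshold run

    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order timestamps
        step = math.hypot(x1 - x0, y1 - y0)
        total_distance += step
        if step / dt >= threshold:
            run_duration += dt
            run_distance += step
            continue
        # Speed dropped below the threshold: close the current run, if any.
        if run_duration >= min_sprint_s:
            sprint_count += 1
            sprint_distance += run_distance
        run_duration = 0.0
        run_distance = 0.0

    # Close a run that is still open at the end of the track.
    if run_duration >= min_sprint_s:
        sprint_count += 1
        sprint_distance += run_distance

    return {
        "total_distance_m": total_distance,
        "sprint_count": sprint_count,
        "sprint_distance_m": sprint_distance,
    }


def demo_track():
    """Synthetic 10 Hz track: 2 s at 2 m/s, 2 s at 8 m/s, 2 s at 2 m/s."""
    samples = [(0.0, 0.0, 0.0)]
    x = 0.0
    for i in range(1, 61):
        speed = 8.0 if 20 < i <= 40 else 2.0
        x += speed * 0.1
        samples.append((i * 0.1, x, 0.0))
    return samples


if __name__ == "__main__":
    print(movement_metrics(demo_track()))
```

The sprint is counted only once it has lasted long enough, so a single noisy sample above the threshold does not inflate the count.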
From data to predicting performance or injury status
So how could big data improve performance in elite sports? The data collected by the aforementioned tools and equipment makes it possible to characterize movements in detail and then determine the athlete’s training load. Physical demand, a measure widely used to monitor soccer players, can be determined from objective mechanical parameters calculated, for example, from GNSS-IMU signals combined with heart rate data. The data collected from these wearable devices provides useful information for understanding a player’s activity and performance in competition, and for reducing the risk of injury during training.
To achieve this, one approach involves conducting descriptive analyses to characterize target exercise intensities in relation to physical performance over time and to identify interpretable analytical relationships. With sufficient data collected over several months or even years, predictive analyses can be used to estimate performance on “D-Day” or to provide useful information to coaches, the team, or players in order to guide training protocols and optimize training prescriptions (volume, intensity, and type of exercises), thereby enhancing performance.
Regarding individual physical performance in professional soccer players, researchers have presented an approach for predicting individual acceleration-velocity profiles based on GNSS data measurements collected during real-game situations. These profiles can provide relevant information regarding the theoretical maximum strength of the hip extensors and the ability to generate significant horizontal force at high running speeds; these factors are key determinants of the onset of muscle injuries, as well as sprint performance.
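As a rough illustration of what an acceleration-velocity profile is, the Python sketch below fits a straight line to the highest accelerations observed at each running speed. This is a simplified stand-in, not the researchers’ actual procedure: the input format (pairs of speed in m/s and acceleration in m/s²), the speed bin width, the minimum speed, and the names `a0` and `s0` (the acceleration and speed at which the fitted line crosses each axis) are assumptions made for this example.

```python
def fit_accel_speed_profile(points, bin_width=0.5, min_speed=3.0):
    """Fit a linear acceleration-speed profile to (speed, accel) points.

    Keeps the highest positive acceleration in each speed bin, then fits
    accel = a0 + slope * speed by ordinary least squares. The binning
    parameters are illustrative assumptions.
    """
    best = {}  # bin index -> (speed, accel) with the highest accel
    for speed, accel in points:
        if speed < min_speed or accel <= 0:
            continue
        index = int(speed // bin_width)
        if index not in best or accel > best[index][1]:
            best[index] = (speed, accel)

    if len(best) < 2:
        raise ValueError("need maximal efforts in at least two speed bins")

    xs = [speed for speed, _ in best.values()]
    ys = [accel for _, accel in best.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a0 = mean_y - slope * mean_x  # intercept on the acceleration axis

    if slope >= 0:
        raise ValueError("acceleration should decrease as speed increases")

    return {"a0": a0, "s0": -a0 / slope, "slope": slope}


def demo_points():
    """Synthetic data: maximal efforts on accel = 7 - 0.8 * speed,
    plus sub-maximal efforts at half that acceleration."""
    points = []
    for k in range(11):
        speed = 3.0 + 0.5 * k
        accel = 7.0 - 0.8 * speed
        points.append((speed, accel))        # maximal effort
        points.append((speed, accel * 0.5))  # sub-maximal effort
    return points


if __name__ == "__main__":
    print(fit_accel_speed_profile(demo_points()))
```

Only the upper envelope of the data is fitted because most in-game efforts are sub-maximal and would otherwise pull the line down.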
In the context of sports injuries, the ability to predict risk factors and assess an athlete’s readiness following surgery or any other medical procedure is essential. The application of machine learning techniques appears capable of providing insights into the risk of non-contact injuries by taking into account changes in training loads over a week (short term) and a month (medium term), based on data collected via GNSS and IMU coupled with questionnaires. The results of a study conducted over a sports season among professional soccer players in Ligue 2 show that, depending on the complexity of the predictive model, the classification performance for predicting injury risks can approach 100%, particularly over a one-month time horizon.
Furthermore, it appears that subjective variables (such as sleep quality, physical condition, mood, satisfaction, and enjoyment) are significant factors in predicting the risk of injury, just as distance covered can be. This initial information can help guide the development of personalized training programs designed to reduce the risk of injury.
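The weekly and monthly training-load windows mentioned above can be sketched in a few lines of Python. The example computes rolling average loads over 7 and 28 days and their ratio, a heuristic often discussed in the sports science literature. It is an assumption here that such a ratio is a useful input; the study does not state which features its models used, and the daily load values below are invented purely for the demonstration.

```python
def rolling_loads(daily_loads, acute_days=7, chronic_days=28):
    """Return, for each day, (acute_mean, chronic_mean, ratio) or None.

    daily_loads is a list of one load value per day (arbitrary units).
    Days without a full chronic window yield None. The window lengths
    are illustrative assumptions.
    """
    results = []
    for day in range(len(daily_loads)):
        end = day + 1
        if end < chronic_days:
            results.append(None)  # not enough history yet
            continue
        acute = sum(daily_loads[end - acute_days:end]) / acute_days
        chronic = sum(daily_loads[end - chronic_days:end]) / chronic_days
        ratio = acute / chronic if chronic > 0 else None
        results.append((acute, chronic, ratio))
    return results


if __name__ == "__main__":
    # Invented example: three steady weeks followed by a doubled week.
    loads = [400.0] * 21 + [800.0] * 7
    print(rolling_loads(loads)[-1])
```

A ratio well above 1 signals that the most recent week was much harder than the player’s monthly baseline; subjective questionnaire scores could be added as further columns alongside it.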
Today, portable mobile devices provide the data needed to analyze player performance during training and competition. In addition, new machine learning algorithms—including deep learning and data mining—enable the assessment of player progress throughout all phases of their training.
However, it remains to be seen whether current technological and scientific advances are mature enough for automated decision-making systems to be deployed in real time during actual competition and to influence the outcome of a match.
Stéphane Perrey, Professor of Exercise Physiology / Integrative Neuroscience, Director of the EuroMov Digital Health in Motion Research Unit, University of Montpellier; Gérard Dray, Professor, IMT Mines Alès – Institut Mines-Télécom and Jacky Montmain, Professor – EuroMov Digital Health in Motion, IMT Mines Alès – Institut Mines-Télécom
This article is republished from The Conversation under a Creative Commons license. Read the original article.