Contact: Dominique Vaufreydaz
Category: Disciplinary training
Theme: Research training
Language of instruction: French and English
Number of hours: 6
Credits/Points: 6
Minimum participants: 1
Maximum participants: 30
Number pending registration: 1
Number of places available: 30
Priority audience: None
Target audience: Doctoral students
Offered by: MSTII - Mathématiques, Sciences et technologies de l'information, Informatique (Mathematics, Information Science and Technology, Computer Science)
| Location: to be completed (UFR/lab)
Remarks: The dates are provisional and given for information only.
Start of training: 19 January 2026
End of training: 16 February 2026
Registration opening date:
Registration closing date:
Objectives: Optimized management and processing for learning.

Artificial Intelligence has experienced unprecedented growth in recent years. This growth comes at the cost of more complex models with more parameters and of increasingly large datasets. To be efficient, data preparation and training of these models must therefore take into account the constraints of the training platform: the number of available CPU cores, RAM size, the number of GPUs or other specialized cores, available VRAM, available disk space, and the specific constraints of certain file systems, particularly on clusters (e.g., a limited number of inodes). It also requires a thorough understanding of the nature of the data being processed (images/videos, sound, text, or lidar data from an autonomous vehicle) and of its particularities in terms of processing, storage, and access efficiency when preparing batches for training. The parallelization capabilities of the system performing the training are likewise crucial when deciding how to optimize the training process.

In this course, we will focus on different types of data and discuss how to prepare them efficiently and how to store them so that they can be read and processed efficiently. We will explore how these choices fit with the parallelization capabilities of the host system (CPU/RAM/disk) and of the GPUs available for training machine learning models, including deep learning ones. This awareness-raising lecture focuses on understanding where effort can be spent to reduce machine learning training and inference time.
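As an illustration of the inode constraint mentioned in the objectives, here is a minimal sketch, not taken from the course material, of one common way to store many small samples: packing them into a single tar shard and streaming batches back from it. The function names (`write_shard`, `iter_batches`, `demo`), the file names, and the dummy payloads are all hypothetical and chosen for this example only; it relies solely on the Python standard library.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def write_shard(shard_path, samples):
    """Pack (name, bytes) pairs into one uncompressed tar file (one inode)."""
    with tarfile.open(shard_path, "w") as tar:
        for name, payload in samples:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))


def iter_batches(shard_path, batch_size):
    """Read the shard sequentially and yield lists of (name, bytes)."""
    batch = []
    with tarfile.open(shard_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            payload = tar.extractfile(member).read()
            batch.append((member.name, payload))
            if len(batch) == batch_size:
                yield batch
                batch = []
    if batch:
        # Last, possibly incomplete, batch.
        yield batch


def demo(num_samples=10, batch_size=4):
    """Hypothetical end-to-end run with dummy 16-byte samples."""
    samples = [
        (f"sample_{i:05d}.bin", bytes([i % 256]) * 16)
        for i in range(num_samples)
    ]
    with tempfile.TemporaryDirectory() as tmp:
        shard = Path(tmp) / "shard-000000.tar"
        write_shard(shard, samples)
        # All samples live in a single file on disk.
        files_on_disk = len(list(Path(tmp).iterdir()))
        batches = list(iter_batches(shard, batch_size))
    return files_on_disk, [len(b) for b in batches], batches


if __name__ == "__main__":
    files_on_disk, batch_sizes, _ = demo()
    print(files_on_disk, batch_sizes)
```

The design choice illustrated here is a trade-off: one large file read sequentially uses a single inode and favors disk throughput, but random access to an individual sample becomes more expensive than with one file per sample.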
Program: identical to the objectives above.
Teaching team: Dominique Vaufreydaz
Teaching method: 3 sessions of 2 hours each
Competencies and skills targeted by the end of the training (RNCP sheets)
Order of 22 February 2019 defining the competencies of doctoral graduates and registering the doctorate in the national register of professional certifications (RNCP): https://www.legifrance.gouv.fr/loda/id/JORFTEXT000038200990/
Block 1: Design and development of a research and development, study, and foresight approach
- Have scientific expertise, both general and specific, in a given field of research and work
- Take stock of the state and limits of knowledge within a given sector of activity, at the local, national, or international level
Block 4: Scientific and technological watch at the international level
- Acquire, synthesize, and analyze cutting-edge scientific and technological data and information at the international level
- Have an understanding of, perspective on, and critical view of all the state-of-the-art information available
- Go beyond the boundaries of available data and knowledge by cross-referencing with other fields of knowledge or other professional sectors
- Have the curiosity, adaptability, and openness needed to keep learning and maintain a high level of general culture
The training contributes to the following objective: strengthening doctoral students' scientific culture within their discipline or across disciplines
|