Enhancing AI through Hybrid Data Sets




Artificial intelligence (AI) promises profound change in society and industry. The effectiveness of AI systems depends heavily on the quality and diversity of the underlying data. However, acquiring, processing, and labeling real data is resource-intensive and time-consuming. This problem, known as the "data problem", has led to a shift toward synthetic data. While synthetic data offers cost advantages, it often lacks realism, a shortfall referred to as the "reality gap". Hybrid data addresses this gap by combining synthetic and real data. In this seminar, we will analyze the terminology related to real, synthetic, augmented, and hybrid data, propose a unified taxonomy, and practically evaluate the benefits of hybrid data sets for AI.
To explore hybrid data sets for AI, we will combine real data with synthetic data; knowledge of Python or another scripting language is therefore a prerequisite. During the course, student groups will explore how and when real and synthetic data can be used, e.g., using synthetic data to pre-train an AI model and real data for transfer learning, training on a mixed data set, and using augmented data sets generated from real and synthetic data. In addition to researching hybrid data sets, students will be introduced to writing scripts for the high-performance computing (HPC) cluster, which enables systematic evaluation.
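
As a rough illustration of the mixed data set strategy mentioned above, the following minimal sketch draws a training set with a chosen share of synthetic samples. It is not course material: the function name `mix_datasets`, the `synthetic_fraction` parameter, and the toy tuples are assumptions made up for this example, and only the Python standard library is used.

```python
import random


def mix_datasets(real, synthetic, synthetic_fraction, size, seed=0):
    """Return `size` samples, of which roughly `synthetic_fraction` are synthetic.

    Hypothetical helper for illustration; samples are drawn without replacement.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1]")

    n_synthetic = round(size * synthetic_fraction)
    n_real = size - n_synthetic
    if n_synthetic > len(synthetic) or n_real > len(real):
        raise ValueError("not enough samples in one of the source sets")

    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    mixed = rng.sample(synthetic, n_synthetic) + rng.sample(real, n_real)
    rng.shuffle(mixed)  # avoid a block of synthetic samples followed by real ones
    return mixed


# Toy stand-ins for real and synthetic samples: (feature, label, origin)
real = [(i, i % 2, "real") for i in range(100)]
synthetic = [(i, i % 2, "synthetic") for i in range(1000)]

mixed = mix_datasets(real, synthetic, synthetic_fraction=0.75, size=80)
n_synthetic_in_mix = sum(1 for sample in mixed if sample[2] == "synthetic")
print(len(mixed), n_synthetic_in_mix)
```

Sweeping `synthetic_fraction` from 0 to 1 and training one model per setting is one possible way to compare purely real, purely synthetic, and hybrid training sets; the pre-training and augmentation strategies would need a different setup.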

Further details

Location: 35/E25
Times: Tue. 10:00 - 12:00 (weekly)
First session: Tuesday, 02.04.2024 10:00 - 12:00, Location: 35/E25
Course type: Seminar (official courses)
ECTS credits: 4
Mode of delivery: In-person sessions without video recording [in person]


  • Courses > Cognitive Science > Master's program
  • Courses in English > Human Sciences (e.g. Cognitive Science, Psychology)