Master/Bachelor Theses

We offer various Bachelor/Master thesis topics. A non-exhaustive list of open topics can be found below. If you are interested in a thesis, please send your CV and transcript of records to Prof. Martin Atzmüller via email, and we will arrange a meeting to discuss potential topics.

  • Symbolic Time Series Embedding (Information: Leonid Schwenke): In Deep Learning, embeddings are used to create an informative data format. Especially in NLP, word embeddings such as word2vec (https://arxiv.org/abs/1301.3781) are commonly used. A similar solution is desirable for time series data, and multiple new embeddings have been proposed in recent years. However, as described in journals.flvc.org/FLAIRS/article/view/133107, those embeddings often lack interpretability. The goal of the proposed thesis is to take a time series embedding such as link.springer.com/article/10.1007/s00521-020-04916-5 and adapt it into a more interpretable, symbol-based approach. Multiple approaches would be valid here and could be discussed as a thesis goal. For example, a Master thesis could tackle the following: as stated in the conclusion of the paper, a more abstract symbolic approach (e.g. SAX or SFA) could be used as the core mechanism to approximate a word2vec-like approach. Combining multiple symbolic features is especially desirable. The concept of journals.flvc.org/FLAIRS/article/view/133107 should be considered here in order to maintain the interpretability of the embedding. Alternatively, for a Bachelor thesis: symbolic approximations bring time series tasks closer to natural language processing and thus tend to highlight distinctive patterns which can be used for time series classification, e.g. BOSS (https://link.springer.com/article/10.1007/s10618-014-0377-7) and WEASEL (arxiv.org/abs/1701.07681). The task would be to use SFA to find word-like patterns and train a word2vec approach on those words.
  • Comparing Attention-based Interpretability Methods with SHAP (Information: Leonid Schwenke): In Deep Learning, interpretability is a desirable feature for any neural network. Saliency maps or attribution scores help to find the most "important" features for a given input/task. However, it is quite hard to evaluate those values. Approaches like SHAP (github.com/slundberg/shap), on the other hand, are mathematically grounded but very time-consuming to compute. The goal of the thesis is to compare multiple saliency-map-based methods against the output of SHAP on simple and clear tabular classification tasks. Questions like "On which data/patterns do they agree/disagree?", "Does a combination of SHAP and saliency maps make sense?" and "Do correlations exist?" should be answered. The main focus should lie on local and global attention-based approaches (Transformers) such as LASA (https://journals.flvc.org/FLAIRS/article/download/128399/130111) and GCR (https://ieeexplore.ieee.org/document/9564126). For a Master thesis, a more in-depth comparison and further attention-based XAI methods should be included.
  • Evaluation of the Learning Procedure of CNN Architectures using Information (Bottleneck) Theory (Information: Arnab Ghosh Chowdhury): Evaluate the learning procedure of CNN architectures using information (bottleneck) theory; see X. Shi et al., "Evaluating the Learning Procedure of CNNs through a Sequence of Prognostic Tests Utilising Information Theoretical Measures", and the PhD thesis of Ravid Shwartz-Ziv, "Information Flow in Deep Neural Networks" (arxiv.org/abs/2202.06749).
  • Measure Uncertainty for Semantic Segmentation (Image) in Active Learning (Information: Arnab Ghosh Chowdhury): Investigate approaches for measuring uncertainty in semantic segmentation (image) within active learning; cf. Cygert, Sebastian, et al. "Closer look at the uncertainty estimation in semantic segmentation under distributional shift." 2021 International Joint Conference on Neural Networks (IJCNN). IEEE, 2021.
  • Probabilistic Programming and Deep Learning (Information: Steffen Meinert): Evaluate and improve the applied inference technique, Hamiltonian Monte Carlo, using advanced approaches.
  • Combining Graph Neural Networks and Bayesian Neural Networks (Information: Steffen Meinert): Combine the approach of Bayesian Neural Networks (BNN) with that of Graph Neural Networks (GNN); see ieeexplore.ieee.org/abstract/document/9555949.
  • How to train your anomaly detector: examining the impact of different types of synthetic anomaly on the training of a state-of-the-art neural network anomaly detector (Information: Dan Hudson): Anomaly detection for time series is a topic with considerable practical importance, e.g., for monitoring sensor readings in critical infrastructure. One of the most successful methods in this domain uses a neural network called ‘NCAD’, described in “Neural Contextual Anomaly Detection for Time Series” (Carmona et al., 2021, https://arxiv.org/abs/2107.07702). This approach injects synthetic anomalies during training; however, so far there has been only limited investigation of how the results are influenced by the way these synthetic anomalies are constructed. Therefore, this study will consider different ways of creating synthetic anomalies and investigate how they impact the predictions of the trained NCAD model. Inspiration on how to construct synthetic anomalies can be found in “TimeEval: a benchmarking toolkit for time series anomaly detection algorithms” (Wenig, Schmidl and Papenbrock, 2022, https://hpi.de/fileadmin/user_upload/fachgebiete/naumann/publications/PDFs/2022_wenig_timeeval.pdf).
  • How much data is enough data for anomaly detection? Investigating the relationship between data availability and model performance in neural networks for anomaly detection (Information: Dan Hudson): Deep learning methods have made considerable improvements over previous ML techniques when identifying anomalies in benchmark datasets; however, such methods are ‘data-hungry’. In many contexts, data availability is limited, raising the question of how much data is enough to successfully train deep learning models for anomaly detection. This research project will investigate the impact of reducing the quantity of training data on the performance of a selection of state-of-the-art deep learning models for anomaly detection. Examples of neural networks that might be especially data-hungry are “TranAD: Deep Transformer Networks for Anomaly Detection in Multivariate Time Series Data” (Tuli, Casale and Jennings, 2022, https://arxiv.org/abs/2201.07284) and “Neural Contextual Anomaly Detection for Time Series” (Carmona et al., 2021, https://arxiv.org/abs/2107.07702). A recent review of general ML anomaly detection techniques can be found in “Anomaly Detection in Time Series: A Comprehensive Evaluation” (Schmidl, Wenig and Papenbrock, 2022, http://vldb.org/pvldb/vol15/p1779-wenig.pdf).
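To give a feel for the SAX discretisation mentioned in the symbolic time series embedding topic, here is a minimal sketch (not the thesis method itself): z-normalise, reduce with piecewise aggregate approximation, then map each segment mean to a letter via the standard equiprobable breakpoints of a four-symbol alphabet.

```python
import numpy as np

# Standard SAX breakpoints that split N(0,1) into four equiprobable
# regions, one per symbol of a four-letter alphabet.
BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])
ALPHABET = "abcd"

def sax_word(series, n_segments=8):
    """Discretise a time series into a SAX word: z-normalise,
    reduce with piecewise aggregate approximation (PAA), then map
    each segment mean to a symbol."""
    x = np.asarray(series, dtype=float)
    x = (x - x.mean()) / (x.std() + 1e-12)                    # z-normalisation
    paa = np.array([s.mean() for s in np.array_split(x, n_segments)])
    return "".join(ALPHABET[i] for i in np.searchsorted(BREAKPOINTS, paa))
```

A full sine period, for instance, maps to a short word whose symbols rise and fall with the signal; such words could then serve as the "vocabulary" for a word2vec-style model.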
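For the SHAP-versus-saliency topic, a toy sketch of the kind of comparison intended: for a linear model the exact Shapley values have a closed form (phi_i = w_i * (x_i - E[x_i])), which can be compared against a simple gradient-times-input "saliency" via rank correlation. The data and weights here are entirely hypothetical, and a real thesis would use the shap library and attention-based attributions instead.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # hypothetical tabular data
w = np.array([3.0, -2.0, 1.0, 0.5, 0.0])      # weights of a toy linear model

def shap_linear(x, X_bg, w):
    # For a linear model, the exact Shapley values have a closed form:
    # phi_i = w_i * (x_i - E[x_i])
    return w * (x - X_bg.mean(axis=0))

def grad_times_input(x, w):
    # A simple saliency-style attribution: gradient (= w) times input.
    return w * x

def spearman(a, b):
    # Rank correlation between two attribution vectors.
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

x = X[0]
rho = spearman(np.abs(shap_linear(x, X, w)), np.abs(grad_times_input(x, w)))
```

The per-instance correlation rho is one concrete way to quantify "do the methods agree?" across a dataset.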
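The information-bottleneck topic rests on estimating mutual information between inputs and layer activations. A minimal histogram-based (binning) estimator, as commonly used in information-plane analyses, can be sketched as follows; bin count and variables are illustrative choices.

```python
import numpy as np

def mutual_information(x, t, bins=10):
    """Plug-in estimate of I(X;T) in bits between two 1-D variables,
    using the histogram binning common in information-plane analyses."""
    joint, _, _ = np.histogram2d(x, t, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)          # marginal over t
    pt = pxy.sum(axis=0, keepdims=True)          # marginal over x
    nz = pxy > 0                                 # avoid log(0)
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ pt)[nz])).sum())
```

Tracking such estimates per layer over training epochs is what produces the information-plane trajectories studied in the cited works.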
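For the active-learning segmentation topic, one of the simplest uncertainty measures is the mean per-pixel entropy of the softmax output, used to rank unlabelled images for annotation. A minimal sketch (function names and the ranking scheme are illustrative, not from the cited paper):

```python
import numpy as np

def mean_pixel_entropy(prob_map):
    """Image-level uncertainty for one segmentation output.
    prob_map: (H, W, C) array of per-pixel softmax probabilities."""
    p = np.clip(prob_map, 1e-12, 1.0)
    pixel_entropy = -(p * np.log(p)).sum(axis=-1)    # (H, W) entropy map
    return float(pixel_entropy.mean())

def select_most_uncertain(prob_maps, k=2):
    """Rank unlabelled images by uncertainty and return the top-k
    indices, i.e. the images to send to the annotator next."""
    scores = [mean_pixel_entropy(p) for p in prob_maps]
    return list(np.argsort(scores)[::-1][:k])
```

A thesis would compare such baselines against more elaborate estimates (e.g. ensembles or MC dropout), especially under distributional shift.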
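The probabilistic programming topic centres on Hamiltonian Monte Carlo. As background, a minimal 1-D HMC sampler with leapfrog integration and a Metropolis correction can be written in a few lines; the target and all parameters below are illustrative.

```python
import numpy as np

def hmc_sample(logp_grad, x0, n_samples=2000, eps=0.1, n_leapfrog=20, seed=0):
    """Minimal 1-D Hamiltonian Monte Carlo.
    logp_grad(x) must return (log p(x), d/dx log p(x))."""
    rng = np.random.default_rng(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        p = rng.normal()                          # resample momentum
        logp, g = logp_grad(x)
        x_new, p_new = x, p
        p_new += 0.5 * eps * g                    # leapfrog: initial half step
        for i in range(n_leapfrog):
            x_new += eps * p_new                  # full position step
            logp_new, g = logp_grad(x_new)
            if i != n_leapfrog - 1:
                p_new += eps * g                  # full momentum step
        p_new += 0.5 * eps * g                    # final half step
        # Metropolis correction for the discretisation error
        h_old = -logp + 0.5 * p * p
        h_new = -logp_new + 0.5 * p_new * p_new
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        samples.append(x)
    return np.array(samples)

# Standard normal target: log p(x) = -x^2 / 2 up to a constant.
samples = hmc_sample(lambda x: (-0.5 * x * x, -x), x0=0.0)
```

"Advanced approaches" in the topic description would replace the fixed eps and n_leapfrog with adaptive schemes such as NUTS-style trajectory selection.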
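The BNN+GNN topic can be illustrated with the cheapest Bayesian approximation available: keeping dropout active at prediction time over a graph-convolution layer and reading off the spread across stochastic passes as uncertainty. This numpy sketch is a stand-in for the approaches in the cited paper, with all shapes and constants hypothetical.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution step with symmetric normalisation and ReLU."""
    A_hat = A + np.eye(A.shape[0])                # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ A_hat @ d_inv_sqrt @ X @ W, 0.0)

def mc_dropout_forward(A, X, W, p=0.5, n_samples=50, seed=0):
    """Approximate Bayesian inference by keeping dropout active at
    prediction time and averaging stochastic forward passes; the
    spread across passes serves as a per-node uncertainty estimate."""
    rng = np.random.default_rng(seed)
    outs = []
    for _ in range(n_samples):
        mask = rng.random(W.shape) > p            # randomly drop weights
        outs.append(gcn_layer(A, X, W * mask / (1.0 - p)))
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)    # predictive mean, uncertainty
```

A full BNN would instead place distributions over W and use variational inference or MCMC, but the mean/uncertainty interface stays the same.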
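For the synthetic-anomaly topic, here is a sketch of three common injection types (spike, level shift, noise burst) in the spirit of the TimeEval toolkit; the function names and magnitudes are illustrative, not taken from NCAD or TimeEval.

```python
import numpy as np

def inject_spike(ts, position, magnitude=5.0):
    """Single-point outlier: add a large deviation at one index."""
    out = ts.copy()
    out[position] += magnitude * ts.std()
    return out

def inject_level_shift(ts, start, length, magnitude=2.0):
    """Shift a whole window of the series by a constant offset."""
    out = ts.copy()
    out[start:start + length] += magnitude * ts.std()
    return out

def inject_noise_burst(ts, start, length, scale=3.0, seed=0):
    """Overlay a window with amplified Gaussian noise."""
    rng = np.random.default_rng(seed)
    out = ts.copy()
    out[start:start + length] += rng.normal(0.0, scale * ts.std(), size=length)
    return out
```

The study would then vary which types (and magnitudes) are injected during training and measure how the detector's predictions change.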
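The data-availability topic boils down to a learning-curve experiment: train the same model on progressively larger subsets and plot error against training-set size. As a shape of the experiment only, here is a toy sweep with ridge regression standing in for a deep anomaly detector; all data is synthetic.

```python
import numpy as np

def learning_curve(train_sizes, n_test=500, dim=20, noise=0.5, seed=0):
    """Toy learning-curve sweep: fit ridge regression (a stand-in for
    a deep detector) on progressively larger training sets and record
    the test error for each size."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=dim)
    def make(n):
        X = rng.normal(size=(n, dim))
        return X, X @ w_true + noise * rng.normal(size=n)
    X_te, y_te = make(n_test)
    errors = []
    for n in train_sizes:
        X_tr, y_tr = make(n)
        # ridge solution: (X'X + I)^-1 X'y
        w = np.linalg.solve(X_tr.T @ X_tr + np.eye(dim), X_tr.T @ y_tr)
        errors.append(float(np.mean((X_te @ w - y_te) ** 2)))
    return errors

errors = learning_curve([30, 100, 1000])
```

In the actual project the same loop would wrap TranAD or NCAD, with anomaly-detection metrics in place of test MSE.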