Mutations in DNA have a profound impact on a range of biological processes, including evolution, ageing, and disease. DNA mutation rates result from replication errors, various sources of damage, and the activity of counteracting repair pathways. Experimental and medical data indicate essential links between transcription and mutation: transcription causes DNA damage and is coupled to DNA repair. However, the molecular aspects of these links and the extent to which they shape mutation rates remain poorly understood. To address these questions, our consortium is performing large-scale mutation-accumulation experiments in dedicated microfluidic devices (developed in F. Malloggi's team, CEA/Iramis) on genetically engineered mutant strains of the yeast Saccharomyces cerevisiae. This organism is a highly relevant model, given the conservation of transcription and DNA repair with multicellular eukaryotes, its short generation time, and its genetic tractability.
The PhD project will cover all the main steps of analysing the mutation-accumulation data, with particular attention to statistical and mathematical modelling. It will start with the bioinformatic processing of the raw sequencing data to establish the lists of mutations in the mutation-accumulation lines, which are needed to estimate mutation rates. These mutation rates will then be compared between genomic contexts and experimental conditions (genetic background and/or exposure to a mutagen) using appropriate statistical methods and graphical representations. The results will be interpreted in light of current knowledge of the biological processes involved in DNA damage and DNA repair, and hypotheses will be proposed to explain the observed patterns. Mathematical and statistical modelling will be essential to strengthen the analyses, from the numerical exploration of mechanistic models, used to assess the explanatory power of the hypotheses with proper uncertainty quantification, to the possible development of innovative approaches to improve the treatment of the data (e.g. unsupervised classification of mutation profiles).
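As an illustration of the rate estimation and comparison steps described above, the following minimal Python sketch computes a per-site, per-generation mutation rate from mutation-accumulation counts and compares two conditions with an exact conditional test for two Poisson rates. It is not the consortium's pipeline: the function names, the assumption that mutation counts are Poisson distributed, and all numerical values (mutation counts, number of lines, generations, callable sites) are hypothetical and chosen only for the example.

```python
from math import exp, lgamma, log


def mutation_rate(n_mutations, n_lines, generations, callable_sites):
    """Point estimate of the mutation rate per site per generation."""
    exposure = n_lines * generations * callable_sites
    return n_mutations / exposure


def exact_rate_comparison(count_a, exposure_a, count_b, exposure_b):
    """Two-sided exact conditional test that two Poisson rates are equal.

    Conditional on the total count, count_a is binomial with success
    probability exposure_a / (exposure_a + exposure_b) under the null.
    Exposures must be strictly positive.
    """
    total = count_a + count_b
    p0 = exposure_a / (exposure_a + exposure_b)

    def log_pmf(k):
        # Binomial log-probability, computed in log space to avoid overflow.
        log_choose = lgamma(total + 1) - lgamma(k + 1) - lgamma(total - k + 1)
        return log_choose + k * log(p0) + (total - k) * log(1 - p0)

    pmf = [exp(log_pmf(k)) for k in range(total + 1)]
    observed = pmf[count_a]
    # Sum the probabilities of all outcomes no more likely than the observed one.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical example: 20 lines per condition, 1000 generations each,
# 12 million callable sites (illustrative values, not experimental data).
exposure = 20 * 1000 * 12_000_000
rate_reference = mutation_rate(40, 20, 1000, 12_000_000)
rate_mutant = mutation_rate(120, 20, 1000, 12_000_000)
p_value = exact_rate_comparison(40, exposure, 120, exposure)
print(rate_reference, rate_mutant, p_value)
```

The conditional test is only one possible choice. A real analysis would also need to handle overdispersion between lines, differences in callable genome size, and multiple testing across genomic contexts.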