Mathématiques et Informatique Appliquées
du Génome à l'Environnement


Transcriptome Analysis from High-Throughput Sequencing Count Data

Speaker's affiliation
Biologie Computationnelle et Quantitative, UMR 7238 CNRS-UPMC; MaIAGE INRA
Speaker
Bogdan Mirauta

The most common RNA-Seq strategy consists of random shearing, amplification, and high-throughput sequencing of the RNA fraction. Methods are needed to analyze variations in transcription level along the genome from the read count profiles generated by this global RNA-Seq protocol. We developed statistical approaches to estimate local transcription levels and to identify transcript borders. The reconstruction of the transcriptional landscape relies on a state-space model that describes transcription level variations in terms of abrupt shifts and more progressive drifts. A new emission model is introduced to capture not only the variance of read counts within a transcript but also their short-range autocorrelation and the fraction of positions with zero counts. Estimation relies on a Sequential Monte Carlo algorithm, Particle Gibbs.
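
As a rough illustration of the kind of model described in the abstract, the following sketch simulates a read count profile from a toy state-space model. It is not the speaker's model: the function name, all parameter values, the Gaussian drift on the log scale, and the zero-inflated negative binomial emission are assumptions made for this example, and the short-range autocorrelation of counts mentioned in the abstract is not modeled here.

```python
import numpy as np

def simulate_profile(n_positions=1000, p_shift=0.01, drift_sd=0.02,
                     dispersion=2.0, p_zero=0.2, seed=0):
    """Simulate a toy read count profile (all settings are hypothetical).

    Returns the latent log transcription level and the observed counts.
    """
    rng = np.random.default_rng(seed)

    # Latent state: log transcription level at each genomic position.
    log_level = np.empty(n_positions)
    log_level[0] = rng.uniform(0.0, 5.0)
    for t in range(1, n_positions):
        if rng.random() < p_shift:
            # Abrupt shift: jump to a new level (e.g. a transcript border).
            log_level[t] = rng.uniform(0.0, 5.0)
        else:
            # Progressive drift: small Gaussian step on the log scale.
            log_level[t] = log_level[t - 1] + rng.normal(0.0, drift_sd)

    # Emission: negative binomial counts written as a gamma-Poisson mixture,
    # which gives a variance larger than the mean (overdispersion).
    mean = np.exp(log_level)
    rate = rng.gamma(shape=dispersion, scale=mean / dispersion)
    counts = rng.poisson(rate)

    # Zero inflation: force a fraction of positions to have zero reads.
    dropout = rng.random(n_positions) < p_zero
    counts[dropout] = 0
    return log_level, counts

if __name__ == "__main__":
    log_level, counts = simulate_profile()
    print("fraction of zero-count positions:", np.mean(counts == 0))
```

Inference in such a model runs in the opposite direction: given only the counts, the latent level and the positions of the shifts have to be estimated, which is the task the abstract assigns to Particle Gibbs.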

Meeting room 142, building 210
Date