PhD project: Fast and reliable methods for genome meta-assembly based on quality criteria.
More and more teams of biologists are sequencing their study species and using assembly software to reconstruct the genome. However, because the assembly problem is NP-hard and APX-hard, these tools rely on heuristics and produce assemblies of varying quality. To obtain a more reliable, higher-quality assembly, one can perform ‘meta-assembly’, that is, combine several assemblies of the same species. Several methods already exist, but none combines all the criteria biologists expect, such as an explicit quality index and fast execution. The goal of this thesis is first to evaluate the quality of assemblies (quality criteria, method for comparing assemblies), then to define a method that merges two assemblies into a better-quality assembly according to those criteria, and finally to evaluate how this merging method could be extended to more than two assemblies. The central point of this work is to develop a fast and reliable method, because the volume of input data is considerable (e.g. human genome: 300 GB per assembly).
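The project description does not say which quality criteria will be used. As an illustration only, the sketch below computes N50, a widely used contiguity measure: the largest contig length L such that contigs of length at least L cover at least half of the assembly. The function name `n50` and the two toy assemblies are hypothetical and are not part of the project.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (0 if empty).

    N50 is the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first
    reaches half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


# Hypothetical toy assemblies of the same genome (contig lengths in bp).
assembly_a = [2, 3, 4, 5, 6]
assembly_b = [1, 1, 1, 1, 16]

# Compare the two assemblies on this single criterion.
best = max((assembly_a, assembly_b), key=n50)
```

N50 measures contiguity only and says nothing about correctness, so a comparison method of the kind the project aims for would presumably combine it with other criteria.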