Technology and Species Independent Simulation of Sequencing Data and Genomic Variants

Highly accurate genotyping is essential for genomic projects aimed at understanding the etiology of diseases, as well as for routine screening of patients. For this reason, genotyping software packages are subject to a strict validation process that requires a large amount of sequencing data endowed with accurate genotype information. In-vitro assessment of genotyping is a long, complex and expensive activity that also depends on the specific variation and locus, and thus it cannot really be used to validate in-silico genotyping algorithms. In this scenario, sequencing simulation has emerged as a practical alternative. Simulators must keep up with the continuous improvement of the different sequencing technologies, producing datasets that are as indistinguishable from real ones as possible. Moreover, they must be able to mimic as many types of genomic variants as possible. In this paper we describe OmniSim, a simulator whose ultimate goal is to be suitable for all possible application scenarios. To fulfill this goal, OmniSim uses an abstract model in which variations are read from a .vcf file and mapped into edit operations (insertion, deletion, substitution) on the reference genome. Technological parameters (e.g. error distributions, read length and per-base quality) are learned from real data. By combining the abstract model with the parameter learning module, OmniSim is able to output data similar in all aspects to those produced in a real sequencing experiment. The source code of OmniSim is freely available at https://gitlab.com/geraci/omnisim
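
To make the idea of mapping .vcf records onto edit operations more concrete, here is a minimal Python sketch. It is not taken from the OmniSim source code: the names EditOp, parse_vcf_line and apply_edits are hypothetical, and the sketch assumes simple records with a single ALT allele, no symbolic alleles, and variants that do not overlap.

```python
from typing import List, NamedTuple


class EditOp(NamedTuple):
    """Hypothetical edit operation on the reference (illustrative only)."""
    kind: str  # "substitution", "insertion" or "deletion"
    pos: int   # 0-based offset into the reference
    ref: str   # bases removed from the reference
    alt: str   # bases written in their place


def parse_vcf_line(line: str) -> EditOp:
    """Turn one VCF data line (CHROM, POS, ID, REF, ALT, ...) into an edit.

    Assumption: exactly one ALT allele given as plain bases.
    """
    fields = line.rstrip("\n").split("\t")
    pos = int(fields[1]) - 1  # VCF positions are 1-based
    ref, alt = fields[3], fields[4]
    if len(ref) == len(alt):
        kind = "substitution"
    elif len(ref) < len(alt):
        kind = "insertion"
    else:
        kind = "deletion"
    return EditOp(kind, pos, ref, alt)


def apply_edits(reference: str, ops: List[EditOp]) -> str:
    """Apply non-overlapping edits right to left so earlier offsets stay valid."""
    seq = reference
    for op in sorted(ops, key=lambda o: o.pos, reverse=True):
        if seq[op.pos:op.pos + len(op.ref)] != op.ref:
            raise ValueError(f"REF mismatch at offset {op.pos}")
        seq = seq[:op.pos] + op.alt + seq[op.pos + len(op.ref):]
    return seq


# Toy reference and three made-up variant records (tab-separated, as in VCF).
reference = "ACGTACGTAC"
vcf_lines = [
    "chr1\t2\t.\tC\tT\t.\t.\t.",    # substitution: C > T
    "chr1\t4\t.\tT\tTGG\t.\t.\t.",  # insertion: GG after T
    "chr1\t7\t.\tGTA\tG\t.\t.\t.",  # deletion: TA after G
]
ops = [parse_vcf_line(line) for line in vcf_lines]
mutated = apply_edits(reference, ops)
print([op.kind for op in ops])
print(mutated)
```

Applying the edits from the rightmost position to the leftmost keeps the offsets of the remaining operations valid without any coordinate bookkeeping. A real simulator would additionally need to handle multi-allelic records, genotypes, and structural variants.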

International Conference on Bioinformatics and Bioengineering (BIBE), Athens, 2019

External authors: Riccardo Massidda (Università di Pisa), Nadia Pisanti (Università di Pisa)
IIT authors:

Type: Conference proceedings paper
Discipline area: Computer Science & Engineering

File: BIBE-19.pdf

Activity: Computational biology