Electronic archive

Effect of method of deduplication on estimation of differential gene expression using RNA-seq

dc.contributor.author Klepikova A.
dc.contributor.author Kasianov A.
dc.contributor.author Chesnokov M.
dc.contributor.author Lazarevich N.
dc.contributor.author Penin A.
dc.contributor.author Logacheva M.
dc.date.accessioned 2018-09-19T22:30:28Z
dc.date.available 2018-09-19T22:30:28Z
dc.date.issued 2017
dc.identifier.uri https://dspace.kpfu.ru/xmlui/handle/net/145291
dc.description.abstract © 2017 Klepikova et al. Background. RNA-seq is a useful tool for analysis of gene expression. However, its robustness is greatly affected by a number of artifacts. One of them is the presence of duplicated reads. Results. To infer the influence of different methods of removal of duplicated reads on estimation of gene expression in cancer genomics, we analyzed paired samples of hepatocellular carcinoma (HCC) and non-tumor liver tissue. Four protocols of data analysis were applied to each sample: processing without deduplication, deduplication using a method implemented in samtools, and deduplication based on one or two molecular indices (MI). We also analyzed the influence of sequencing layout (single read or paired end) and read length. We found that deduplication without MI greatly affects estimated expression values; this effect is the most pronounced for highly expressed genes. Conclusion. The use of unique molecular identifiers greatly improves accuracy of RNA-seq analysis, especially for highly expressed genes. We developed a set of scripts that enable handling of MI and their incorporation into RNA-seq analysis pipelines. Deduplication without MI affects results of differential gene expression analysis, producing a high proportion of false negative results. The absence of duplicate read removal is biased towards false positives. In those cases where using MI is not possible, we recommend using paired-end sequencing layout.
dc.subject Cancer genomics
dc.subject Deduplication
dc.subject Differential expression
dc.subject Hepatocarcinoma
dc.subject RNA-seq
dc.title Effect of method of deduplication on estimation of differential gene expression using RNA-seq
dc.type Article
dc.relation.ispartofseries-issue 3
dc.relation.ispartofseries-volume 2017
dc.collection Publications of KFU staff
dc.source.id SCOPUS-2017-2017-3-SID85015218572


This item appears in the following collections

  • Publications of KFU staff, Scopus [24551]
    The collection contains publications by staff of Kazan Federal University (Kazan State University until 2010) indexed in the Scopus database, starting from 1970.
