Optimized de novo variant calling in family trios using GPUs
Stephanie Sarkar
Dr. Tychele Turner
Detecting de novo variants (DNVs) is important for studying genetic variation relevant to human evolution, genetics, and disease. In this study, we constructed a GPU-based workflow to make DNV calling easier. We applied it to the 602 sequenced parent-child trios from the publicly available 1000 Genomes Project, which served as a set of controls. The DNA used for whole-genome sequencing of these individuals was derived from lymphoblastoid cell lines. We detected 445,711 DNVs with a bimodal distribution, with a first peak at approximately 200 DNVs and a second peak at 2,000 DNVs. Because de novo calling can be challenging, we validated our calls by visualizing the raw read data and manually checking the mutational profiles of each proband (~4,000 de novo variant sites). For each of the ~4,000 sites, we examined the first column of the mutational profile from the mother, father, and proband and checked whether it differed from the original letter listed as the first letter of the sequence. If a change was seen in the proband but in neither the mother nor the father, we considered the site a validated de novo mutation. From this process, we found that ~92% of the variants looked de novo, which supports the accuracy of our de novo calls. We then used MuPiT to visualize de novo mutations in DNA repair genes, since mutations in these genes can be more detrimental and can lead to cancer. Our overall approach allows for efficient and accurate detection of DNVs, which is important when analyzing large disease datasets, particularly in autism. It also shows that cell line artifacts in lymphoblastoid cell lines are not always random but can be associated with cancer mutation profiles, which has significant implications for using the variant data as a control.
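The manual validation rule described above (a change seen in the proband but in neither parent) can be sketched in code. This is a minimal illustration under stated assumptions, not the workflow used in the study: the function names, the representation of each sample's reads at a site as a string of bases, and the `min_alt_reads` threshold are all hypothetical.

```python
from collections import Counter


def looks_de_novo(ref, mother, father, proband, min_alt_reads=2):
    """Return the candidate de novo allele at one site, or None.

    ref      -- reference base at the site (e.g. "A")
    mother, father, proband -- strings of read bases observed at the site
    min_alt_reads -- hypothetical minimum number of proband reads that
                     must carry the alternate allele
    """
    ref = ref.upper()
    # Count non-reference bases seen in the proband's reads.
    child_alts = Counter(
        base for base in proband.upper() if base in "ACGT" and base != ref
    )
    # Any base seen in either parent cannot be called de novo.
    parental_bases = set(mother.upper()) | set(father.upper())
    for allele, count in child_alts.most_common():
        if count >= min_alt_reads and allele not in parental_bases:
            return allele
    return None


def validation_rate(sites):
    """Fraction of (ref, mother, father, proband) sites that pass the check."""
    if not sites:
        return 0.0
    passed = sum(1 for site in sites if looks_de_novo(*site) is not None)
    return passed / len(sites)
```

Applied to a list of candidate sites, `validation_rate` gives the kind of summary figure reported above, the share of calls that still look de novo after inspecting the reads. A real pipeline would read these bases from alignment files and would also account for base quality and sequencing depth, which this sketch ignores.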