Merge single-end and paired-end Illumina raw sequences for SNP calling
3.9 years ago

Hello there, I am new to bioinformatics and want to call SNPs from Illumina sequence data. I have the raw reads from sequencing 192 haploid honey bee brothers from a single mother. This is old data from the lab, which was sequenced both paired-end and single-end. More specifically:

  • Genomic DNA from 192 samples was pooled, with 96 samples per pool
    • Each pool was sequenced on an Illumina HiSeq 2000 in one 100-bp single-end run and two lanes of 100-bp paired-end runs
    • Paired-end sequence data: 2 runs x 2 sequence files x 192 samples = 768 sequence files
    • Single-end sequence data: 1 run x 1 sequence file x 192 samples = 192 sequence files (1 is missing, so we have 191)

I ran FastQC and Trimmomatic for quality filtering and trimming. I am following some pipelines that I found, and I believe the next step is alignment to the reference genome. But how can I merge all the files into a single file so I can do the alignment step? Or do I need to align single-end and paired-end separately? Also, I would be grateful if anyone can provide me a simple tutorial/pipeline for SNP calling.

Thanks in advance, Prashant

illumina snp snp calling concatenate • 995 views
3.9 years ago

Or do I need to align single-end and paired-end separately?

Align the single-end and paired-end reads separately, then merge the sorted BAM files afterwards with samtools merge.
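
A minimal sketch of that approach for a single sample, written as a small Python helper that only builds the command lines and prints them (it does not run anything). It assumes bwa mem as the aligner, which this answer does not name. The sample ID, file names, thread count, and read-group values are hypothetical placeholders.

```python
import shlex


def alignment_commands(sample, ref, se_fastq, pe_pairs, threads=4):
    """Build shell commands that align one sample's single-end and
    paired-end reads separately, then merge the sorted BAMs."""
    commands, sorted_bams = [], []

    def add_alignment(tag, fastqs):
        bam = f"{sample}.{tag}.sorted.bam"
        # Assumption: same SM for every run of a sample, distinct ID per run,
        # so the runs are treated as one sample after merging.
        read_group = f"@RG\\tID:{sample}.{tag}\\tSM:{sample}\\tPL:ILLUMINA"
        bwa = ["bwa", "mem", "-t", str(threads), "-R", read_group, ref, *fastqs]
        sort = ["samtools", "sort", "-o", bam, "-"]
        commands.append(f"{shlex.join(bwa)} | {shlex.join(sort)}")
        sorted_bams.append(bam)

    # Single-end run: one FASTQ file.
    add_alignment("se", [se_fastq])
    # Paired-end runs: one (R1, R2) pair per lane.
    for lane, (r1, r2) in enumerate(pe_pairs, start=1):
        add_alignment(f"pe{lane}", [r1, r2])

    # Merge the per-run sorted BAMs, then index the result.
    merged = f"{sample}.merged.bam"
    commands.append(shlex.join(["samtools", "merge", merged, *sorted_bams]))
    commands.append(shlex.join(["samtools", "index", merged]))
    return commands


# Hypothetical file names for one sample.
example_commands = alignment_commands(
    sample="bee001",
    ref="reference.fa",
    se_fastq="bee001_SE.trimmed.fastq.gz",
    pe_pairs=[
        ("bee001_L1_R1.trimmed.fastq.gz", "bee001_L1_R2.trimmed.fastq.gz"),
        ("bee001_L2_R1.trimmed.fastq.gz", "bee001_L2_R2.trimmed.fastq.gz"),
    ],
)
for command in example_commands:
    print(command)
```

The reference is assumed to be indexed already (bwa index). Looping the helper over all sample IDs would give one merged BAM per sample.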

Also, I would be grateful if anyone can provide me a simple tutorial/pipeline for SNP calling.

http://www.htslib.org/workflow/
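
One common way to get from per-sample BAM files to SNP calls is bcftools mpileup piped into bcftools call. The sketch below only builds and prints that command. The file names are hypothetical, and the haploid setting (--ploidy 1) is an assumption based on the samples being haploid drones; adjust it to your data.

```python
import shlex


def snp_call_command(ref, bams, out_vcf, ploidy="1"):
    """Build a bcftools mpileup | bcftools call pipeline as one shell string."""
    # Uncompressed BCF (-Ou) is passed through the pipe to the caller.
    mpileup = ["bcftools", "mpileup", "-Ou", "-f", ref, *bams]
    # -m: multiallelic caller, -v: variant sites only, -Oz: bgzipped VCF output.
    call = ["bcftools", "call", "-m", "-v", "--ploidy", ploidy, "-Oz", "-o", out_vcf]
    return f"{shlex.join(mpileup)} | {shlex.join(call)}"


# Hypothetical inputs: one BAM per sample, called jointly.
example_call = snp_call_command(
    ref="reference.fa",
    bams=["bee001.merged.bam", "bee002.merged.bam"],
    out_vcf="drones.vcf.gz",
)
print(example_call)
```

Passing all sample BAMs in one call produces a single multi-sample VCF, which is usually more convenient for comparing brothers than 192 separate files.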
