Retrieving identical positions from mpileup file
7.4 years ago
samocarp • 0

Hello all,

I mapped reads from one species against a reference genome and converted file_sorted.bam to file.mpileup. Now I'd like to create a file containing the positions that are shared (identical) between the reference genome and the mapped species. These shared positions must have both a good mapping quality and a read depth > 10. Does anyone know the best way to do this?

Thank you!

snp alignment • 1.1k views
7.4 years ago
Samuel Brady ▴ 330

I would suggest (1) using VarScan to identify differences and similarities between reference and your sample or (2) writing a Python script if you know Python. If you use VarScan you could do something like this:

java -Xmx8g -jar VarScan.v2.3.9.jar mpileup2snp file.mpileup --min-coverage 10

If you write a Python script, take a look at the pileup file format spec here: http://samtools.sourceforge.net/pileup.shtml. It's pretty easy to identify SNPs (differences between sample and reference) and read depth (the fourth column) in a pileup file. In your case, since you want sites that are the same between reference and sample, you will want only periods or commas (a match to the reference genome on the forward or reverse strand, respectively) in your fifth column.
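A minimal sketch of the Python route, assuming a single-sample pileup with tab-separated columns (chromosome, position, reference base, depth, read bases, base qualities). The function names and the min_depth parameter are made up for illustration. Besides periods and commas, the fifth column can also carry read-start markers (^ followed by a mapping-quality character) and read-end markers ($), so the sketch strips those before checking. It does not filter on mapping quality; it assumes low-quality alignments were already excluded when the pileup was generated (for example with the -q option of samtools mpileup).

```python
import re

# '^' marks a read start and is followed by one mapping-quality character.
READ_START = re.compile(r"\^.")


def is_identical_site(line, min_depth=11):
    """Return True if every read at this pileup line matches the reference.

    min_depth=11 corresponds to "read depth > 10" (hypothetical default).
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        return False  # malformed or truncated line
    depth = int(fields[3])
    if depth < min_depth:
        return False
    # Drop read-start and read-end markers, keeping only the base calls.
    bases = READ_START.sub("", fields[4]).replace("$", "")
    # Anything other than '.' or ',' (mismatch, indel, '*') disqualifies the site.
    return bool(bases) and set(bases) <= {".", ","}


def identical_sites(lines, min_depth=11):
    """Yield (chromosome, position, reference base) for identical sites."""
    for line in lines:
        if is_identical_site(line, min_depth):
            chrom, pos, ref = line.split("\t")[:3]
            yield chrom, int(pos), ref
```

To use it, open file.mpileup, pass the file handle to identical_sites(), and write each returned tuple to your output file.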
