Extract specific bases from BAM/SAM files
3.1 years ago
Ankit ▴ 500

Hi everyone,

I want to extract the bases at specific positions (given as a list of positions) from the reads in a BAM/SAM file.

If a specific tool or package exists for this, please let me know.

I followed the long and useful thread here, but I could not get the specific bases.

I would appreciate any help.

Thank you
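
For illustration, here is a minimal Python sketch of what "the base at a reference position" means for a single SAM record: walk the CIGAR string and map the reference coordinate back to an index in the read sequence. The function names `base_at` and `base_from_sam_line` are hypothetical, and the sketch assumes plain-text SAM lines (for example from samtools view); it is not a replacement for a dedicated tool.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def base_at(pos, cigar, seq, ref_pos):
    """Return the read base aligned to 1-based reference position ref_pos.

    Returns '-' if ref_pos falls inside a deletion or skipped region,
    and None if the read does not cover ref_pos.
    """
    ref = pos   # 1-based reference coordinate of the next aligned base
    read = 0    # 0-based index into seq
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":            # consumes both reference and read
            if ref <= ref_pos < ref + n:
                return seq[read + (ref_pos - ref)]
            ref += n
            read += n
        elif op in "IS":           # consumes read only
            read += n
        elif op in "DN":           # consumes reference only
            if ref <= ref_pos < ref + n:
                return "-"
            ref += n
        # H and P consume neither reference nor read
    return None

def base_from_sam_line(line, ref_pos):
    """Apply base_at to one tab-separated SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    return base_at(int(fields[3]), fields[5], fields[9], ref_pos)
```

Looping this over every read and every position in the list gives one base per (read, position) pair. Note that for reverse-strand reads the SEQ field is already stored in reference orientation, so no extra reverse-complementing is needed here.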

BAM SAM Bases Extract • 1.0k views

Thank you for the tools.

My reads come from an amplicon.

It gives an error.

With the full genome hg38.fa:

Sequence dictionaries are not the same size (455, 25)

With only chr14.fa (I thought the amplicon could be the cause of the error):

Sequence dictionaries are not the same size (1, 25)

What could be the issue?

I created the dictionary file with Picard:

java -jar picard-2.26.2/picard.jar CreateSequenceDictionary R=hg38.fa O=hg38.dict

or

java -jar picard-2.26.2/picard.jar CreateSequenceDictionary R=chr14.fa O=chr14.dict

My command:

java -jar dist/sam2tsv.jar -R chr14.fa ./../S49.sort.bam

I would appreciate any suggestions.

Thanks


The dictionary in the BAM file is not the same as the one in hg38.fa: the reference you supplied is not the one that was used to map the reads. See the @SQ lines in the output of samtools view -H S49.sort.bam
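
To make the comparison concrete, here is a small Python sketch that lists which sequences differ between a BAM header and a FASTA file. It assumes the header text was saved from samtools view -H; the function names (`header_contigs`, `fasta_contigs`, `compare`) are hypothetical. The error messages above suggest, though this is only an inference, that the BAM header lists 25 sequences while hg38.fa has 455 and chr14.fa has 1.

```python
def header_contigs(header_text):
    """Parse the @SQ lines of a SAM header into {name: length}."""
    contigs = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # Each field after the record type looks like TAG:value.
            tags = dict(f.split(":", 1) for f in line.split("\t")[1:])
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs

def fasta_contigs(fasta_lines):
    """Compute {name: length} from an iterable of FASTA lines."""
    contigs = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # The sequence name is the first word after '>'.
            name = line[1:].split()[0]
            contigs[name] = 0
        elif name is not None:
            contigs[name] += len(line)
    return contigs

def compare(bam, ref):
    """Return (names only in BAM, names only in FASTA, names whose lengths differ)."""
    only_bam = sorted(set(bam) - set(ref))
    only_ref = sorted(set(ref) - set(bam))
    length_diff = sorted(n for n in set(bam) & set(ref) if bam[n] != ref[n])
    return only_bam, only_ref, length_diff
```

Usage would be along the lines of `compare(header_contigs(open("header.txt").read()), fasta_contigs(open("hg38.fa")))`, where header.txt is a hypothetical file holding the header. Three empty lists mean the names and lengths agree; tools may additionally require the same order.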


Thank you for pointing it out.

I have two limitations:

  1. My data are bisulfite reads from an amplicon. I used the individual chromosome files (chr1.fa, chr2.fa, and so on) to create the bisulfite reference genome with Bismark, and then aligned the data. How do I create a dictionary file for this? I tried

    java -jar picard-2.26.2/picard.jar CreateSequenceDictionary R=*.fa O=hg38.dict

but it gives an error, and I think that is not the correct syntax anyway. I have not found a solution.

  2. The sam2tsv -R option. The command

    java -jar dist/sam2tsv.jar -R hg38.fa ./../S49.sort.bam

needs hg38.fa, but my bisulfite genome reference folder has two FASTA files: a) genome_mfa.CT_conversion.fa and b) genome_mfa.GA_conversion.fa.

I do not know how to deal with the bisulfite reference, or which one to supply to the -R option.

thanks
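
One possible approach, sketched here under assumptions: CreateSequenceDictionary takes a single reference file, so the per-chromosome FASTA files can first be joined into one multi-FASTA, and the dictionary built from that. This assumes the @SQ lines in the BAM name the original, unconverted chromosomes (worth confirming with samtools view -H) and that the files are given in the same order as those @SQ lines. The function name `concat_fasta` and the output name are hypothetical.

```python
def concat_fasta(paths, out_path):
    """Concatenate FASTA files, in the given order, into one multi-FASTA file."""
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as src:
                last = "\n"
                for line in src:
                    out.write(line)
                    last = line
                # Make sure the next file's '>' header starts on its own line.
                if not last.endswith("\n"):
                    out.write("\n")

# Hypothetical usage, with paths listed in @SQ order:
# concat_fasta(["chr1.fa", "chr2.fa", "chr3.fa"], "genome_combined.fa")
```

The combined file can then be passed as R= to CreateSequenceDictionary and, if the assumption above holds, as -R to sam2tsv.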
