How to convert a BAM file into a FASTA file of consensus sequences labeled with the chromosome name
6.7 years ago
Kritika ▴ 270

Hi all,

E00477:196:HCYTLCCXY:3:1222:9425:46542  99  A01 4   0   150M    =   157 303 AAACACGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGACGCCGTTTTGTGTTTAATATAAATATAAAAAAAAGGCTAAAAACAAAACTGCTTCATCATTTTGTTGAAAAAACAGAGAGAAAACTCTC  AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJFJ  XA:Z:A10,-4883245,150M,0;D06,+59383175,150M,1;A06,+3484528,87M1I62M,3;D05,-25737025,55M1I94M,4; MC:Z:150M   MD:Z:150    RG:Z:4  NM:i:0  AS:i:150    XS:i:150   
E00477:196:HCYTLCCXY:3:2107:21105:50814 99  A01 5   0   150M    =   69  214 AACACGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGACGCCGTTTTGTGTTTAATATAAATATAAAAAAAAGGCTAAAAACAAAACTGCTTCATCATTTTGTTGAAAAAACAGAGAGAAAACTCTCT  AAFFFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFJJJJFJJJ  XA:Z:A10,-4883244,150M,0;D06,+59383176,150M,1;A06,+3484529,86M1I63M,3;D05,-25737024,56M1I93M,4; MC:Z:150M   MD:Z:150    RG:Z:4  NM:i:0  AS:i:150    XS:i:150
E00477:196:HCYTLCCXY:3:1122:22648:4262  163 A01 6   0   150M    =   194 338 ACACGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGGCGCCGTTTTGTGTTTAATATAAATATAAAAAAAAGGCTAAAAACAAAACTGCTTCATCATTTTGTTGAAAAAACAGAGAGAAAACTCTCTC  AAFFFJJJJJJJJJJJJJFJJJJAJJJJJJJJJJJJJJJJJJJJJJJJJJJJAFJJJJJFJFJJ<A-AJFF7<AAAFJJJJJJJJJJAJJJJJJJJJJJJFJJFFAJJJJJF<FAJJJJJJJJJJJJJJJFFJFJFJJJFJJJJJJFJJJ  XA:Z:D06,+59383177,150M,0;A10,-4883243,150M,1;A06,+3484530,85M1I64M,2;D05,-25737023,57M1I92M,3;A12,+73788594,126M1I23M,6;   MC:Z:150M   MD:Z:58A91  RG:Z:4  NM:i:1  AS:i:145    XS:i:150
E00477:196:HCYTLCCXY:3:1202:19076:68816 163 A01 9   0   150M    =   124 265 CGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGACGCCGTTTTGTGTTTAATATAAATATAAAAAAAAGGCTAAAAACAAAACTGATTCATCATTTTGTTGAAAAAACAGAGAGAAAACTCTCTCTTT  A-AAFFJF-<F-7AJAAFJJJJA7-A7FJ7-7-AJ<AFFFJFJJJ7JA<FA-<-<FFFJJJAJFJJFJJJJJJJF7FJAJJ-7<FJFFJFFJAJ77FJFJJJJJ-77-7AA-A7FJF-<JJJJJFFFJ<<AFAFF<<-AJJJFJFFJJFJ  XA:Z:A10,-4883240,150M,1;D06,+59383180,150M,2;A06,+3484533,82M1I67M,4;D05,-25737020,60M1I89M,4; MC:Z:150M   MD:Z:107C42 RG:Z:4  NM:i:1  AS:i:145    XS:i:145

What I want is a FASTA file in this format:

A01 AAACACGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGACGCCGTTTTGTGTTTAATATAAATATAAAAAAAAGGCTAAAAACAAAACTGCTTCATCATTTTGTTGAAAAAACAGAGAGAAAACTCTCTCTTT

fasta bam • 2.1k views

I'd suggest that you change the title of the thread, because you are not asking for a simple conversion but for creating a consensus.

6.7 years ago

One solution with awk:

samtools view your.bam | awk '{print ">"$3"\n"$10}'

The identifier line in FASTA must start with ">". If you don't need it in your case, just remove it from the print command.
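To make the behaviour of the awk step concrete, here is a minimal sketch that runs the same print statement on two made-up SAM records instead of a real BAM file. The function names `sam_records` and `reads_to_fasta` and the shortened 5-base reads are assumptions for illustration only. The `-v OFS` assignment is left out because the fields are joined by explicit concatenation, so it has no effect.

```shell
#!/bin/sh
# Two made-up, tab-separated SAM records (hypothetical toy data).
# Columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
sam_records() {
  printf 'r1\t99\tA01\t4\t0\t5M\t=\t157\t303\tAAACA\tFFFFF\n'
  printf 'r2\t99\tA01\t5\t0\t5M\t=\t69\t214\tAACAC\tFFFFF\n'
}

# Column 3 is the reference (chromosome) name, column 10 the read sequence.
# One FASTA record is printed per input line, i.e. per read.
reads_to_fasta() {
  awk '{print ">"$3"\n"$10}'
}

sam_records | reads_to_fasta
```

This prints two records, ">A01" followed by "AAACA" and ">A01" followed by "AACAC": every read becomes its own FASTA entry carrying the chromosome name, and nothing is merged. On real data the same function would be fed from `samtools view your.bam | reads_to_fasta`.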

fin swimmer


But will it generate a consensus, or will it give me each read sequence with its chromosome name?


No. Building a consensus FASTA from a BAM file is a totally different task from simply converting BAM to FASTA, and I couldn't read that from your initial post.


The FASTA sequence I listed above,

A01 AAACACGCGGATCCTTCGGGTCGGGTCGGGTCGACGCGCGGATCCCCCTTTGCTAAAACGACGCCGTTTTGTGTTTAATATAAATATAAAA

has the chromosome name together with all mapped contigs. What I need is to convert my BAM file into FASTA and use it as my reference.


I have the impression that you are asking an XY question. Could you please explain what you are really trying to solve?

fin swimmer


If you need to generate a consensus sequence from your aligned BAM file, follow the instructions here.
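The linked instructions are not reproduced on this page. As a rough illustration of what a consensus involves, the sketch below piles up overlapping reads per chromosome and takes a majority vote at each position. It is a toy under stated assumptions, not the linked method: the function names `toy_sam` and `consensus_from_sam` and the 5-base reads are made up, only ungapped reads with a CIGAR of the form `<n>M` are used, base qualities are ignored, and ties go to the first base in A, C, G, T order.

```shell
#!/bin/sh
# Three made-up, tab-separated SAM records (hypothetical toy data).
# r3 carries a mismatch (G) at reference position 8.
toy_sam() {
  printf 'r1\t0\tA01\t4\t0\t5M\t*\t0\t0\tAAACA\tFFFFF\n'
  printf 'r2\t0\tA01\t5\t0\t5M\t*\t0\t0\tAACAC\tFFFFF\n'
  printf 'r3\t0\tA01\t6\t0\t5M\t*\t0\t0\tACGCG\tFFFFF\n'
}

# Toy majority-vote consensus over SAM text read from standard input.
consensus_from_sam() {
  awk -F'\t' '
    /^@/ { next }                                   # skip SAM header lines
    $3 == "*" || $10 == "*" { next }                # skip unmapped reads or missing sequence
    $6 !~ /^[0-9]+M$/ { next }                      # toy limitation: ungapped reads only
    {
      chrom = $3; start = $4 + 0; seq = $10
      n = length(seq); end = start + n - 1
      if (!(chrom in lo) || start < lo[chrom]) lo[chrom] = start
      if (!(chrom in hi) || end > hi[chrom]) hi[chrom] = end
      # count each observed base at its reference position
      for (i = 1; i <= n; i++)
        count[chrom, start + i - 1, substr(seq, i, 1)]++
    }
    END {
      nb = split("A C G T", bases, " ")
      for (chrom in lo) {                           # chromosome order is unspecified in awk
        out = ""
        for (pos = lo[chrom]; pos <= hi[chrom]; pos++) {
          best = "N"; bestn = 0                     # N where no read covers the position
          for (b = 1; b <= nb; b++) {
            c = count[chrom, pos, bases[b]] + 0
            if (c > bestn) { bestn = c; best = bases[b] }
          }
          out = out best
        }
        print ">" chrom
        print out
      }
    }'
}

toy_sam | consensus_from_sam
```

On the toy input this prints ">A01" followed by "AAACACG": the three reads collapse into one sequence spanning positions 4 to 10, and the lone G at position 8 is outvoted by two A's. Note that the output starts at the first covered position, not at position 1 of the chromosome. For real data with indels, soft clipping and quality-aware calling, a dedicated tool is the safer choice; recent samtools releases, for example, include a `samtools consensus` subcommand.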
