Error Converting SAM To BAM With Samtools
11.6 years ago
Raghav ▴ 100

Hello everyone,

After mapping, I encountered an error while converting my SAM output file into a BAM file.

[cdac@nbri samtools-0.1.19]$ samtools view -bS /storage/home/cdac/raghav/bowtie2-2.1.0/at_wt.sam > at_wt2.bam
[samopen] SAM header is present: 7 sequences.
Parse error at line 38967971: sequence and quality are inconsistent
Aborted

It generated a file of approximately 1 GB and then aborted. How can I overcome this error?

I have some other questions related to SAM and BAM files:

Is it necessary to convert a SAM file to a BAM file? If yes, why?

What methods can we use to cross-check whether our reads were mapped correctly? Which tools are suitable for visualizing mapping results (other than Circos)?

Your valuable comments and suggestions are always welcome.

sam bam samtools • 8.2k views

Dear Sir Philipp,

Whether to convert SAM to BAM basically depends on the downstream tool we are going to use. The basic advantage of the conversion is lower storage space (let me confirm that).

When I checked the corresponding read:

[cdac@nbri bowtie2-2.1.0]$ sed -n '38967971p' at_wt.sam
SRR681003.19483981      145     1       19313630        42      50M     =       19461614        148034  TGATATGTTTCCATGGACGTTTGATTTCACCATGGAATCGAGAATCGAAC   hiiiiiiih

From this line, I think the read mapped to chromosome 1 at position 19313630 (is anything wrong with this line?). I am also having trouble with the Picard tools: when I ask for help with ValidateSamFile --help it gives me all the details, but when I run:

[cdac@nbri picard-tools-1.90]$ ValidateSamFile -I /storage/home/cdac/raghav/bowtie2-2.1.0/at_wt.sam
-bash: ValidateSamFile: command not found

I am basically interested in chimeric reads (I don't know whether bowtie2 will give them to me or not).


I can see your problem - TGATATGTTTCCATGGACGTTTGATTTCACCATGGAATCGAGAATCGAAC is the nucleotide sequence of your read, and "hiiiiiiih" is the quality string for these nucleotides. It should be the same length as the 50-base sequence, which it obviously isn't.
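The same comparison can be run over the whole file instead of one line at a time. A minimal sketch, assuming a tab-separated SAM file with SEQ in column 10 and QUAL in column 11 (as the SAM format defines them); the function name check_seq_qual is made up for this illustration:

```shell
# Report SAM records whose SEQ (column 10) and QUAL (column 11) lengths differ.
# Header lines start with "@" and are skipped. A "*" in SEQ or QUAL means
# "not stored", so such records are not treated as errors.
check_seq_qual() {   # usage: check_seq_qual file.sam
    awk -F'\t' '
        /^@/ { next }
        NF < 11 || ($10 != "*" && $11 != "*" && length($10) != length($11)) {
            printf "line %d: %s SEQ=%d QUAL=%d\n", NR, $1, length($10), length($11)
        }
    ' "$1"
}
```

Running `check_seq_qual at_wt.sam` would list every offending record, which shows whether line 38967971 is an isolated case or one of many.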

Can you check your original fastq for the read "SRR681003.19483981" and see if that one has the full quality string? If it doesn't have the full quality, you can just remove the four lines in your original fastq and re-run bowtie.
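One possible way to do both steps with awk, assuming an uncompressed FASTQ file with exactly four lines per record and a read ID that is the first word of the header line. The function names show_read and drop_broken_reads are placeholders for this sketch, not part of any tool:

```shell
# Print the four-line FASTQ record (header, sequence, "+", quality) for one read ID.
show_read() {   # usage: show_read READ_ID reads.fastq
    awk -v id="@$1" '
        NR % 4 == 1 { split($0, h, " "); keep = (h[1] == id) }
        keep
    ' "$2"
}

# Copy a FASTQ file, dropping records whose sequence and quality lengths differ.
drop_broken_reads() {   # usage: drop_broken_reads in.fastq > out.fastq
    awk '
        NR % 4 == 1 { head = $0 }
        NR % 4 == 2 { seq  = $0 }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 { if (length(seq) == length($0)) print head "\n" seq "\n" plus "\n" $0 }
    ' "$1"
}
```

For example, `show_read SRR681003.19483981 reads.fastq` (with reads.fastq standing in for your own file) prints the record in question. With paired-end data, the mate of any removed read has to be removed from the other file as well, otherwise the two files go out of sync.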


Dear Sir Philipp, I understand what you mean: there may be many reads that do not have a full quality string. So before mapping, a FASTQ file should pass a quality check, and tools such as the FASTX-Toolkit, FastQC and FTQC are available for that purpose. Unfortunately I did not use any of them. The important thing is how to configure one of these tools for paired-end reads and obtain a good-quality FASTQ file.
Thank you, sir.


Dear Sir Alex,

You have always helped me since I started NGS data analysis a month ago.

Sir, I am unable to interpret this line: "... adds an index, so that you can retrieve portions of it without filtering the rest." Which index are you talking about (is it related to the reference index or something else), and what type of information does the retrieval of portions relate to? Please explain a bit, sir.

Sir, there is nothing very special about Circos; I just want to visualize my alignment file. As you suggested, I will try to look for another SAM or BAM viewer.


A BAI index file associated with a BAM file is like the index of a book, which helps you look up something you're interested in. In a cookbook, the index might tell you where to find the recipe for shrimp fried rice. With a BAM file, the index will help you pull out some genomic region you're interested in.
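In practice that lookup could look like the sketch below. It assumes a recent samtools (1.3 or later), where `sort` takes `-o`; the 0.1.19 release used earlier in this thread expects an output prefix instead (`samtools sort in.bam out_prefix`). The function name fetch_region is hypothetical:

```shell
# Sort a BAM file by coordinate, build its BAI index, and print the
# alignments overlapping one region. Assumes samtools >= 1.3 on the PATH.
fetch_region() {   # usage: fetch_region in.bam REGION
    sorted="${1%.bam}.sorted.bam"
    samtools sort -o "$sorted" "$1"   # indexing requires coordinate-sorted input
    samtools index "$sorted"          # writes the index next to it as "$sorted.bai"
    samtools view "$sorted" "$2"      # uses the index to jump straight to REGION
}
```

A call such as `fetch_region at_wt2.bam 1:19313000-19314000` would then print only the reads in that window of chromosome 1 (the coordinates are an arbitrary example) without scanning the rest of the file.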

11.6 years ago

A BAM file is the compressed binary version of a SAM file. Some downstream programs work only with BAM files, while others can also work with SAM files; it depends on what you want to do with your files.

Have you checked which read is aligned at line 38967971? Could be that the fastq-file you put in is corrupted at that position, leading to a broken SAM-file.

Edit: I just remembered that if you ran the alignment on a cluster and the system killed your job prematurely, converting your SAM-file will result in similar error-messages. Check if your bowtie-run actually finished.
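A quick way to test for that kind of truncation is to look at the last line of the SAM file. A rough sketch, assuming the file ends with an alignment record; last_record_ok is a made-up helper name:

```shell
# Succeed only if the last line of a SAM file looks like a complete record:
# at least 11 tab-separated fields, and SEQ/QUAL of equal length (or QUAL "*").
last_record_ok() {   # usage: last_record_ok file.sam
    tail -n 1 "$1" | awk -F'\t' '
        { ok = (NF >= 11 && ($11 == "*" || length($10) == length($11))) }
        END { exit !ok }
    '
}
```

For example, `last_record_ok at_wt.sam || echo "last record looks truncated"`. bowtie2 also prints an alignment summary to stderr when it finishes, so a job log without that summary is another hint that the run was cut short.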

You can always check your SAM-files using Picard's ValidateSamFile and see what it has to say about your SAM-file.

"which tools are suitable to visualize our mapping results"

I'm not sure what you mean here - what do you want to see? Do you have an assembly you made yourself and want to see how well your reads align? For just a general look, I'd use Tablet.

11.6 years ago

Perhaps you could use sed to retrieve that line and see what the issue might be. I think BAM is a binary version of SAM that allows generating an associated index, so that you (or some visualization tool) can retrieve portions of it without processing the rest. I don't know why you would use circos to visualize reads, but you could search other threads on biostars for SAM or BAM viewers.
