Output From Picard And Samtools Flagstat Not Very Useful From My Sambam Files:
1
2
Entering edit mode
12.8 years ago
Angel ▴ 220

Sorry for another post ... as I could not fit this message as a "comment" in my last post of alignment statistics:

I got the following from Picard CollectAligment and samtools flagstat ... I found flagstat result more informative than picard but no software allows me to fill the info in table at the end that I want. Sorry for my dumb questions. Any comments? Is there stand-alone perl or any other script that I can use? Thanks very much.

CATEGORY FIRSTOFPAIR SECONDOFPAIR PAIR

TOTAL_READS 269056355 269056355 538112710

PF_READS 269056355 269056355 538112710

PCTPFREADS 1 1 1

PFNOISEREADS 256 288 544

PFREADSALIGNED 0 0 0

PCTPFREADS_ALIGNED 0 0 0

PFALIGNEDBASES 0 0 0

PFHQALIGNED_READS 0 0 0

PFHQALIGNED_BASES 0 0 0

PFHQALIGNEDQ20BASES 0 0 0

PFHQMEDIAN_MISMATCHES 0 0 0

PFMISMATCHRATE 0 0 0

PFHQERROR_RATE 0 0 0

PFINDELRATE 0 0 0

MEANREADLENGTH 100 100 100

READSALIGNEDIN_PAIRS 0 0 0

PCTREADSALIGNEDINPAIRS 0 0 0

BAD_CYCLES 0 0 0

STRAND_BALANCE 0 0 0

PCT_CHIMERAS 0 0 0

PCT_ADAPTER 0.001418 0.00058 0.000999


I tried: samtools flagstat as well and I got:

538112710 + 0 in total (QC-passed reads + QC-failed reads)

0 + 0 duplicates

512844854 + 0 mapped (95.30%:-nan%)

538112710 + 0 paired in sequencing

269056355 + 0 read1

269056355 + 0 read2

501199328 + 0 properly paired (93.14%:-nan%)

509847920 + 0 with itself and mate mapped

2996934 + 0 singletons (0.56%:-nan%)

728423 + 0 with mate mapped to a different chr

568502 + 0 with mate mapped to a different chr (mapQ>=5)


Table that I want to fill is:

Total read = 538112710

Uniquely paired read#

Non-uniquely paired read#

Uniquely mapped read1#

Non-uniquely mapped read1#

Uniquely mapped read2#

Non-uniquely mapped read2#

Unmapped read1#

Unmapped read2#

Uniquely paired read%

Non-uniquely paired read%

Uniquely mapped read1%

Non-uniquely mapped read1%

Uniquely mapped read2%

Non-uniquely mapped read2%

Unmapped read1%

Unmapped read2%

samtools exome picard • 5.5k views
ADD COMMENT
0
Entering edit mode

What is your mapping program ?

ADD REPLY
0
Entering edit mode

have you tried Bioconductor or Biopython? If you are desperate consider placing the project up for bid on Guru or Elance.

ADD REPLY
0
Entering edit mode

Hi Jeremy, I looked into http://www.bioconductor.org/developers/new_packages/ and do not see a package mentioned that can read bam file and give me the metrics I am looking for. Am I sorry if I missed it. Any comments? My bam file are of the size of 60-70GB, would bioconductor be able to handle these files?

ADD REPLY
0
Entering edit mode

Hi Tony, I am using bwa for alignment and samtools for the rest of the pipeline.

ADD REPLY
2
Entering edit mode
12.8 years ago
User 59 13k

You might want to try CollectMultipleMetrics from Picard instead. For whole exome work I use that and CalculateHsMetrics and get as much information as I could possibly need..

ADD COMMENT
0
Entering edit mode

Hi Daniel, I could not understand/find info on how to get input "Bait_file" and "Interval_file". COuld you point me to a command or link on how to get these input files?

ADD REPLY
0
Entering edit mode

Depends what your capture method is. We use mainly Agilent SureSelect, so we get this information from E-Arrayhttps://earray.chem.agilent.com/earray/ I'm sure whoever makes your capture kits will have a similar resource

ADD REPLY
0
Entering edit mode

Hi Daniel Thanks!

I am being told the data was created using Agilent SureSelect whole exome 50MB kit for the hybrid capture (in a solution, not on a chip).

I registered at the site you gave me but sorry I still don't know how to get the "bait" and "interval" file. Would you be able to provide little bit more info on how can I use the site to get these files? I highly appreciate your help.

ADD REPLY
0
Entering edit mode

Angel, I have these files for the 50MB kit for hg19. Drop me an email: dan[dot]swan[at]gmail[dot]com

ADD REPLY

Login before adding your answer.

Traffic: 1903 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6