Missing header sam file
7.3 years ago
G.Car ▴ 20

Hi everyone

I am trying to analyse some old data on STAT3 binding locations in macrophages upon IL-10 treatment. I found a dataset that matches what I want perfectly; however, it was only available in Bowtie format. My initial aim is to view it in the UCSC Genome Browser and quickly check for peaks at my genes of interest, then move on to a more detailed analysis.

I managed (with some difficulty, as I'm very new to all this) to convert it into a SAM file; however, when I try to upload it to UCSC I get an error. After spending a while trying to figure out what was wrong, I discovered that it's missing the @SQ header lines.

Now, I know that it has been mapped to mm9 as the reference genome, so I was hoping someone could help me generate a basic header.

I have read around and attempted to convert it to a BAM file, using a FASTA file of mm9 chr1; however, I get an error:

samtools view -bT Documents/chr2.fa Documents/ChIP/STAT3.txt > STAT3.bam
[samfaipath] build FASTA index...
[W::sam_parse1] urecognized reference name; treated as unmapped
[W::sam_read1] Parse error at line 1
[main_samview] truncated file.

I would appreciate any help people can provide, or alternative methods of generating the @SQ header. Please explain in detail: I'm new to all this, and it takes me a while to understand exactly what I have to do.

Thank you!

SAM ChIP-Seq BAM header • 5.4k views

What is the output of

head Documents/ChIP/STAT3.txt
head Documents/ChIP/STAT3.txt

A204RKABXX:3:26:4008:112894#TGACCAAT 0 chr15 14328859 25 47M * TACCTTGCTTTGGGGATTACAGTTAAGTGACTGAATGAACCTCAGGA GGGGGGGGGGGGGGGGGGGGGGGGGGGFGEGGGGGGGGGGGGGDGGD NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:6009:112894#TGACCAAT 0 chr18 43952303 25 47M * CAGCCCAGTGTTCTTTATGTGGCGCCAAAATGCCCCTCCCCTTTAGT GFGGGGGGGGGGGGGGGGFGGGGGGGGGDFGGGGGGGGGGGGGGEGE NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:5766:112899#TGACCAAT 0 chr2 17589600 25 47M * AGGAAGACACTGGACTTTTTATGGCTGGTACTAGGCATATCTCCCTG GGGFFGGGFGGGGGGGGGGGEGEGGGG#################### NM:i:1 X1:i:1 MD:Z:27G16C2
A204RKABXX:3:26:8370:112900#TGACCAAT 16 chr8 64320362 25 47M * TCGCCTATTTTGTTAGTTTGAAACAACTATGCAGCCCTGAATGACTT GFGGGGFFFGGGGGGGDGGGGGGGGGGGGGGGGGGGGGGGGFGGGGG NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:8496:112900#TGACCAAT 0 chr2 132770552 25 47M * TGGTTTTCCCACATTCCTTTCCTATCTCTCTGCGCCTTCAGTTTGGC EEEEGGGGGGGGGGGGGGGGGGEGGFGGGGGGGEEGFGGGGFGFDEB NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:8144:112893#TGACCAAT 0 chr3 27838344 25 47M * TTTGTTTGAAACAGTCTTCTGTAGCTCAGGCTGCACTCAAAGGCTAT GGGGGGGFGGGFGGGFGGDDDEFGGDFEFDEE?EEBEBDEEAFEEDA NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:12358:112894#TGACCAAT 0 chr4 134222144 25 47M * TGTTAGCCTTGGTTTCTGTTCCCGGCCATTCACACACAGCCCACCTC GGGFGGGGGGDFFGFGGFGFGDGGGFGGGEGEG:CCCCEEGG?EE?E NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:16734:112899#TGACCAAT 16 chr2 29721733 25 47M * GGTCGAGATCCAGAAGATCTGCTGTCTGGTGAGGACCTGTTCCTCAC ECBCGDGEGDFCFFCEBAEAGGGGEEGGGGFGGGEEGGGGEFGGGGG NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:16019:112892#TGACCAAT 16 chr16 85400695 25 47M * AGCAAACACCAGGAAAATAGCAGGATACTGTTGCTAAGGAAATGGGA GDFGFFFBFFFGFGGFGAGEGGGGFFFFEEGGGGEGGFGGEGFGGGG NM:i:0 X0:i:1 MD:Z:47
A204RKABXX:3:26:14189:112898#TGACCAAT 0 chr9 54525873 25 47M * CATTGGTATTTTGACTGCATGTCTGTCTGTGTTAGATCCCCTGGAAC EBFFFEFDEDEEAEDEE?DDCDCFFFBFEEAEECEAEEFFEEEEFEE NM:i:0 X0:i:1 MD:Z:47

7.3 years ago

(Not tested.) You can create a dict file using Picard's CreateSequenceDictionary (https://broadinstitute.github.io/picard/command-line-overview.html#CreateSequenceDictionary) and then create the BAM file:

cat new.dict Documents/ChIP/STAT3.txt | samtools view -Sb -o out.bam -
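
If installing Picard is a hurdle, the `@SQ` lines can also be written by hand from a FASTA index, since each line only needs a sequence name (`SN`) and a length (`LN`). The following is a minimal sketch, not tested on real data: the file names and the two toy lengths are placeholders made up so that it runs on its own, and a real index would come from running `samtools faidx` on the full mm9 FASTA.

```shell
# Sketch: build @SQ header lines from a FASTA index (.fai) instead of a Picard .dict.
# Assumption: toy.fa.fai is a made-up stand-in; the names and lengths are NOT real mm9 values.
workdir=$(mktemp -d)
printf 'chr1\t1000\t6\t50\t51\nchr2\t800\t1032\t50\t51\n' > "$workdir/toy.fa.fai"

# Columns 1 and 2 of a .fai file are the sequence name and its length.
awk 'BEGIN { OFS = "\t" } { print "@SQ", "SN:" $1, "LN:" $2 }' \
    "$workdir/toy.fa.fai" > "$workdir/header.sam"

cat "$workdir/header.sam"
```

With a header built from a real index, `header.sam` could take the place of `new.dict` in the `cat ... | samtools view` command above.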

Thanks for the suggestion, but setting up Picard goes way over my head when it comes to UNIX. I'm barely scraping by as it is, and setting up an environment variable to access Picard requires a level of understanding that I just don't have.


You can always use a dict from the web: http://dldcc-web.brc.bcm.edu/lilab/deqiangs/ref/bowtie/Mm9/mm9.dict, but you need to be sure that it is the very same reference genome (the UR and M5 tags are not required or checked).


Thanks, how would I go about checking it? It's definitely against mm9 but that's all I know.


just try....
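
A less trial-and-error check, as a rough sketch only: list the reference names the reads actually use (column 3 of each alignment line) and compare them with the `SN:` names in the dict. The file names and contents below are toy stand-ins invented for illustration, and the sketch assumes `SN:` is the first tag on each `@SQ` line, which is how Picard writes it.

```shell
# Sketch: find reference names used by the reads that the dictionary does not declare.
# Assumption: toy.dict and toy_reads.txt are hypothetical stand-ins, not the real files.
workdir=$(mktemp -d)
printf '@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:800\n' > "$workdir/toy.dict"
printf 'r1\t0\tchr1\t10\nr2\t16\tchr2\t20\nr3\t0\tchr15\t30\n' > "$workdir/toy_reads.txt"

# Names declared in the dictionary (assumes SN: is the first tag of each @SQ line).
grep '^@SQ' "$workdir/toy.dict" | cut -f 2 | sed 's/^SN://' | sort -u > "$workdir/dict_names.txt"

# Names actually used by the reads (column 3 of each non-header line).
grep -v '^@' "$workdir/toy_reads.txt" | cut -f 3 | sort -u > "$workdir/read_names.txt"

# Lines unique to the second file: names the reads use but the dictionary lacks.
comm -13 "$workdir/dict_names.txt" "$workdir/read_names.txt" > "$workdir/missing.txt"
cat "$workdir/missing.txt"
```

An empty `missing.txt` would mean every name used by the reads is declared in the dictionary. It does not prove the sequence lengths match the same genome build.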


Hmm, no, I get another error:

[W::sam_read1] Parse error at line 24
[main_samview] truncated file.
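
One thing that might be worth checking for a parse error like this: the SAM format requires 11 mandatory tab-separated fields on every alignment line (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL). In the `head` output as pasted earlier in the thread, each record seems to go straight from `*` to the sequence, although that could be an artifact of copying. The sketch below counts fields per line; the file `toy.sam` and its reads are made up for illustration.

```shell
# Sketch: report alignment lines with fewer than the 11 mandatory SAM fields.
# Assumption: toy.sam is a hypothetical file; replace it with the real SAM to check.
workdir=$(mktemp -d)
printf '@SQ\tSN:chr1\tLN:1000\n' > "$workdir/toy.sam"
# A complete record with all 11 mandatory fields.
printf 'read1\t0\tchr1\t10\t25\t4M\t*\t0\t0\tACGT\tGGGG\n' >> "$workdir/toy.sam"
# A record missing PNEXT and TLEN (only 9 fields).
printf 'read2\t0\tchr1\t20\t25\t4M\t*\tACGT\tGGGG\n' >> "$workdir/toy.sam"

# Skip header lines (they start with @) and print any line that is too short.
awk -F '\t' '!/^@/ && NF < 11 { print "line " NR ": " NF " fields" }' \
    "$workdir/toy.sam" > "$workdir/short_lines.txt"

cat "$workdir/short_lines.txt"
```

If the real file reports short lines, the conversion from Bowtie output to SAM probably dropped columns, and adding a header alone would not make it parse.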
