Chipseq Fastq Files
12.9 years ago
Hamilton ▴ 290

Hi,

I have a raw sequencing file from Illumina that looks like the excerpt below. It is ChIP-seq data.

HWUSI-EAS491_2:1:1:1696:1814:ATTGCCGGACGCCAAGCTAGCCCGTGGGTGGGGGGG:hhS@EB?IEDBGA?>GE?B?CBBQ=CEG?FE@B?EN
HWUSI-EAS491_2:1:1:1369:1245:ATTGACAGTTGGCTCGCCAAATAGGAGGGCCGGCAG:hhhLIVOhAGIcD?ER@\F?DB??IDDMPFDHPF>G
HWUSI-EAS491_2:1:1:1572:1204:ATTTTGCGATGGCGGAAATTCTTGAGTTTAGTTGAT:hIC@GLBSHCOZ@FB?HCBADA@CA=B@@?FB?=C?
HWUSI-EAS491_2:1:1:1514:1273:AGAATAGCGGGGAGTTAAACGGTGCGGGCCAGGATC:hI[E??NN?]GPDE?>I?A?@>DC@DFDEB?DGA?C
HWUSI-EAS491_2:1:1:343:824:GATTAGAATCAGCCGTCGCTCGTTGTCCAGCTGTAA:hXFA?BE==BDFFA?>?FAA@?A?DEHA@=BB@B>C
HWUSI-EAS491_2:1:1:1612:1422:GAAAATTGCAAATCACGGAAAAAGAGCAATACACAG:hhXPAQJN=GLDIJIIF=DB@C@ACIA?C??N?G?@
HWUSI-EAS491_2:1:1:1410:1308:GAAATATGGCGACTAGAACCCAACAAGGGAGAAGCG:hg^IDMGLCDLIA?J@HKCAAGA@F?>X>@HJG=DG
HWUSI-EAS491_2:1:1:1723:1168:ACATGGGCGGCCGCCGCCGGCAGTGGCCGGTCGCGC:hQh@=?LC>AHD?BCBB?>?D?N?GDG?NB=AJFAA
HWUSI-EAS491_2:1:1:1586:1332:ATTATAGTCCTGAAGCTTTCCTCGAAGGTATATGCC:hhhJFMKMLNHL@A?EBD>MIB@EBI?BE?@DCCCN
HWUSI-EAS491_2:1:1:1686:1141:GATGGACGAGGGCTGGTACTGAGGGGAGGGGCTGCG:hORGFBAM>ECIID?@D=FFCDILE?GHBB>D>>A?
HWUSI-EAS491_2:1:1:1443:1504:GTTTTGTTTTGTTATTTGTCGATATATGTTTCCTTT:hJJUVGCAFINXOCEG@A?B>CBA??C>E?H@>@?I
HWUSI-EAS491_2:1:1:1559:1277:GACATTTGTCCGCCTCTTTCCATGCACTACGACATT:hO[E=BAJF?AA?=>F@?>BD?AK=?I==DB@GHAH>AM=??B>G

I am trying to map the reads in this file with bowtie.

Command line: bowtie -r -S bowtieindex/mm9 input file output.sam

But only 2 percent of the reads mapped; the other 98% did not.

I have no idea why.

Any comments or suggestions?

Thanks.

fastq fastq filter chip-seq bowtie • 6.2k views
12.9 years ago

According to what you write above, you are using the -r switch to Bowtie.

This is what the Bowtie manual says about the -r option:

-r

The query input files (...) are Raw files: one sequence per line, without quality values or names. All quality values are assumed to be 40 on the Phred quality scale.

So each line is supposed to be just a sequence: no quality, no name. Your example is oddly formatted, but each line clearly starts with the read name and ends with a quality string, so you can't use -r.

Another issue that could give this kind of result is using the wrong quality score scale. For the "phred+64" quality scale that Illumina data sometimes uses, you have to add the --solexa1.3-quals option (equivalently, --phred64-quals).
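One rough way to check which scale applies is to look at the range of ASCII codes in the quality strings. The following is a minimal sketch, assuming the colon-delimited layout shown in the question, where the quality string is the last field of each line. The function names `quality_range` and `guess_scale` are hypothetical helpers, not part of Bowtie, and the thresholds are only a heuristic.

```python
def quality_range(lines):
    """Return (min, max) ASCII codes seen in the last colon-separated field."""
    lo, hi = 255, 0
    for line in lines:
        qual = line.rstrip("\n").rsplit(":", 1)[-1]
        if not qual:
            continue  # skip blank lines or lines with an empty last field
        codes = [ord(c) for c in qual]
        lo = min(lo, min(codes))
        hi = max(hi, max(codes))
    return lo, hi


def guess_scale(lo, hi):
    """Heuristic guess based on the lowest code observed.

    Assumed lower bounds: Phred+33 starts at '!' (33), Solexa+64 at ';' (59),
    Phred+64 at '@' (64). A small sample may not contain the lowest values,
    so treat the answer as a hint only.
    """
    if lo < 59:
        return "phred+33"
    if lo < 64:
        return "solexa+64"
    return "phred+64"


if __name__ == "__main__":
    # One line copied from the question, used as a tiny sample.
    sample = [
        "HWUSI-EAS491_2:1:1:343:824:GATTAGAATCAGCCGTCGCTCGTTGTCCAGCTGTAA:"
        "hXFA?BE==BDFFA?>?FAA@?A?DEHA@=BB@B>C",
    ]
    lo, hi = quality_range(sample)
    print(lo, hi, guess_scale(lo, hi))
```

In the reads pasted in the question, characters such as `=`, `>` and `?` fall below `@`, which hints at the older Solexa-scaled qualities (Bowtie's --solexa-quals) rather than Phred+64. It is worth running a check like this on the whole file before choosing the option.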

12.9 years ago
brentp 24k

In addition to what @Mikael has said...

It looks like your input is FASTQ format--except messed up. Can you paste 8 lines directly from the file into your question? From your input above, somehow it looks like the 4 lines are merged into a single line.

Did bowtie give an error when you tried to run it? It should have been able to at least attempt to align 25% of the rows in your FASTQ file.

Maybe you'll also need to trim the tag from the left end of the read. You can do this with the --trim5 flag and specify the number of bases to skip in each read--likely 6 or so.

EDIT:

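Your data is not in FASTQ format. Save this into a file named tofq.py:

import sys

# Each input line is "name:sequence:quality". The read name contains colons too,
# so split from the right to isolate the last two fields.
with open(sys.argv[1]) as reads:
    for line in reads:
        header, seq, qual = line.rstrip().rsplit(":", 2)
        print("@%s\n%s\n+\n%s" % (header, seq, qual))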

then run it as:

python tofq.py yourfile.not.fastq > output.fastq

where yourfile.not.fastq is the name of the file where you got the reads above. Then run output.fastq through bowtie.
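Before running bowtie, it may be worth checking that the converted file really is well-formed FASTQ. At least one line in the pasted sample appears to have a quality string longer than its sequence, and a record like that would survive the conversion unchanged. The sketch below is one way to do that check; `check_fastq` is a hypothetical helper written for this thread, and it assumes plain 4-line records with no wrapped sequences.

```python
import io
from itertools import zip_longest


def check_fastq(handle):
    """Yield (record_number, problem) for each malformed 4-line FASTQ record."""
    stripped = (line.rstrip("\n") for line in handle)
    # Pull four consecutive lines at a time from the same iterator.
    records = zip_longest(*[stripped] * 4)
    for number, (name, seq, plus, qual) in enumerate(records, start=1):
        if qual is None:
            yield number, "incomplete record at end of file"
            continue
        if not name.startswith("@"):
            yield number, "header does not start with '@'"
        if not plus.startswith("+"):
            yield number, "separator line does not start with '+'"
        if len(seq) != len(qual):
            yield number, "sequence length %d != quality length %d" % (
                len(seq), len(qual))


if __name__ == "__main__":
    # Tiny in-memory example: the second record has one quality value too many.
    demo = io.StringIO("@r1\nACGT\n+\nhhhh\n@r2\nACGT\n+\nhhhhh\n")
    for number, problem in check_fastq(demo):
        print("record %d: %s" % (number, problem))
```

To check a real file, pass an open file handle instead of the `io.StringIO` object, for example `check_fastq(open("output.fastq"))`.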


If I try bowtie --solexa1.3-quals --trim5 -S, I get: Error: reads file does not look like a FASTQ file


If I run this file with bowtie -p 8 -S, I get an error:

Error: reads file does not look like a FASTQ file
terminate called after throwing an instance of 'int'
Abort trap

If I try the -r option, there is no error, but the mapping result is poor:

# reads processed: 5821424

reads with at least one reported alignment: 1449 (0.02%)

reads that failed to align: 5819975 (99.98%)

Reported 1449 alignments to 1 output stream(s)


I just edited my question to add more information about the file.


If I run bowtie --phred64-quals -5 6 -3 6 -S, I get: Error: reads file does not look like a FASTQ file


Any suggestions?


See my EDIT above.
