Question

Regarding Bowtie-1.0.0


12.0 years ago

Raghav ▴ 100

Hello everyone, I have 454 sequencing data that I want to map to a reference genome. For that I have built the database in the bowtie folder (to map my reads), and my data file (a .fna file) contains reads with unique IDs, like this:

>read1
ACGATCGCAGATC----hindIII site-----CGATCGCATCGCATCAGCAT.. 
>read2
ACGATCGCAGATC----AAGCTT with addition of 1 or 2 nts in between cut position-----CGATCGCATCGCATCAGCAT..

and so on. I want to map this type of sequence to my reference genome, but I don't know how to choose the parameters in bowtie. If anyone has experience with this type of mapping, please help me out. You may also suggest better mapping tools.

Waiting for your valuable comments.

bowtie 454 • 2.4k views

updated 10.9 years ago by Biostar 20 • written 12.0 years ago by Raghav ▴ 100

score 0 · Answer 1 · 2013-05-01

It's not completely clear to me what your reads look like, but I think you have some sort of fixed 5' and 3' adaptor. Or are the ---- your insert sequences, with a HindIII site somewhere in read1 but not in read2?

With bowtie 1 you would have to trim the reads. If it's an exact number of nucleotides and the data is in FASTA/FASTQ format, this is easy with, e.g., fastx trimmer, where you can enter the first base and the last base to keep. Or, because it's 454 data (so not a lot of it) and you seem inexperienced with bioinformatics, upload it to Galaxy and use the tools there (fastx trimmer is there, IIRC).
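To make the fixed-position trimming idea concrete, here is a minimal Python sketch of what "first base and last base to keep" means. It is not fastx trimmer itself; the function names, the toy reads, and the coordinates 5 and 12 are all made up for illustration.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def trim_read(seq, first, last):
    """Keep bases first..last (1-based, inclusive)."""
    return seq[first - 1:last]


# Toy input; real reads would come from the .fna file.
example = ">read1\nACGTACGTACGT\n>read2\nTTTTGGGGCCCCAAAA\n"

# Hypothetical coordinates: keep bases 5 through 12 of every read.
trimmed = [(name, trim_read(seq, 5, 12)) for name, seq in parse_fasta(example)]

for name, seq in trimmed:
    print(f">{name}\n{seq}")
```

This only works when the part to remove has the same length in every read, which is the condition stated above.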

With bowtie 2 you could align in local mode without needing to trim, but that's suboptimal.
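As a rough sketch of what a local-mode run could look like, the Python snippet below only assembles a bowtie2 command line. The index prefix and file names are placeholders, not values from this thread; the options used are --local (local alignment), -f (FASTA input), -x (index prefix), -U (unpaired reads) and -S (SAM output).

```python
import shlex


def bowtie2_local_command(index_prefix, reads_fasta, output_sam):
    """Build a bowtie2 command line for local alignment of unpaired FASTA reads."""
    return [
        "bowtie2",
        "--local",          # soft-clip read ends instead of forcing end-to-end alignment
        "-f",               # input reads are FASTA, not FASTQ
        "-x", index_prefix, # index built beforehand with bowtie2-build
        "-U", reads_fasta,  # unpaired reads
        "-S", output_sam,   # SAM output file
    ]


# Placeholder names, assumed for illustration only.
cmd = bowtie2_local_command("ref_index", "reads.fna", "aligned.sam")
print(shlex.join(cmd))

# To actually run it (requires bowtie2 on PATH and an existing index):
# import subprocess
# subprocess.run(cmd, check=True)
```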

score 0 · Answer 2 · 2013-05-01

Dear Tamir, thank you for your kind reply. My reads look like this:

>G6H7M2K02GSJ6M rank=0000448 x=2668.5 y=2204.0 length=101
GTGTTGGGTGTGTTTGGTGTGTTGTTTTCTAACAAGGATACACTACTTAGGCTTTTAAGATCGGGTTGCGGTTTAAGTTCTTATACTCAATCATACACAAT
>G6H7M2K02IOEII rank=0001256 x=3441.0 y=1256.0 length=47
GTGTTGGTGTGTTGTTGGTGTTGTATCAGTCAGCACACAGGGAGTAG
>G6H7M2K02I46FY rank=0001789 x=3632.0 y=1612.0 length=61
TATAGTAGATGAGGTCTAGTCCTAAACTCGTCTCGTCTAACACCTATATAAATAGGTTTAC

In each read, a HindIII cut site is present somewhere in the middle, and theoretically each read should map to two different locations (either on the same chromosome or on different chromosomes). I am new to this field (NGS data analysis). I am now attempting the same task in bowtie2 but cannot work out the parameters. I used this command for bowtie1 and did not get the desired output: ./bowtie2 < reference data base> -f <query> --all <outputfile>
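One way to check the "two locations per read" idea, which nobody in this thread proposed, is to cut each read at the HindIII site and look at the two halves separately. The sketch below assumes an exact AAGCTT match and toy reads; it does not handle the case with 1 or 2 extra nucleotides at the cut position, and real 454 reads may not contain the site exactly.

```python
HINDIII_SITE = "AAGCTT"


def split_at_site(seq, site=HINDIII_SITE):
    """Split a read at the first exact occurrence of the site.

    Returns (left, right) with the site itself removed,
    or None when the site is not found.
    """
    pos = seq.upper().find(site)
    if pos == -1:
        return None
    return seq[:pos], seq[pos + len(site):]


# Toy reads, invented for illustration.
reads = {
    "toy_read1": "ACGATCGCAGATCAAGCTTCGATCGCATCG",  # contains AAGCTT
    "toy_read2": "ACGATCGCAGATCCGATCGCATCG",        # no site
}

for name, seq in reads.items():
    print(name, split_at_site(seq))
```

Each half could then be written out as its own FASTA record and mapped independently, provided it is long enough to align.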

I am expecting a result something like this:

G6H7M2K01C5MVF, 75..19 of 345 and Chr4, 5564653..5564709 of 18585056 (52/57 ident)

75 TTCCCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGGGTCGAAAAACCCAA 19

5564653 TTCTCCCATCAAGAAATAGAACTGACTAATCCTAAGTCAAAGAGTCAAGAAACTCAA 5564709

Explanation: positions 75 to 19 of read G6H7M2K01C5MVF (length 345) map to chromosome 4 at 5564653..5564709, and the rest of the read matches the same chromosome at 5564536..5564622:

G6H7M2K01C5MVF, 212..125 of 345 and Chr4, 5564536..5564622 of 18585056 (82/89 ident)

212 TATCCATTCTTATTCGATCACAGCGAGGGAGCAAGTCAAAATAGAAAAACTCACATTCATTGGGTTTAGGGATAATCAGGCTCGA-ACT 125

5564536 TATCTATTCTTATTCGATCACAGCGAGGAAGCAAGTTAAAATAGAAAAACTCACATTTATTGGGTTTAGGGATAATCAGGC--GACACT 5564622
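The hit summary lines in the expected output above follow a regular pattern, so they can be pulled apart with a small parser. This is only a sketch: the field names are invented here, and reading "start greater than end" on the read as a reverse-strand match is an assumption, not something stated in the thread.

```python
import re

# Matches lines such as:
# G6H7M2K01C5MVF, 75..19 of 345 and Chr4, 5564653..5564709 of 18585056 (52/57 ident)
HIT_RE = re.compile(
    r"(?P<read>[^,\s]+), (?P<read_start>\d+)\.\.(?P<read_end>\d+) of (?P<read_len>\d+) and "
    r"(?P<chrom>[^,\s]+), (?P<ref_start>\d+)\.\.(?P<ref_end>\d+) of (?P<ref_len>\d+) "
    r"\((?P<matches>\d+)/(?P<aligned>\d+) ident\)"
)

INT_FIELDS = ("read_start", "read_end", "read_len",
              "ref_start", "ref_end", "ref_len", "matches", "aligned")


def parse_hit(line):
    """Parse one hit summary line into a dict, or return None if it does not match."""
    m = HIT_RE.search(line)
    if m is None:
        return None
    hit = m.groupdict()
    for key in INT_FIELDS:
        hit[key] = int(hit[key])
    # Assumption: descending read coordinates mean a reverse-strand match.
    hit["strand"] = "-" if hit["read_start"] > hit["read_end"] else "+"
    hit["identity"] = hit["matches"] / hit["aligned"]
    return hit


lines = [
    "G6H7M2K01C5MVF, 75..19 of 345 and Chr4, 5564653..5564709 of 18585056 (52/57 ident)",
    "G6H7M2K01C5MVF, 212..125 of 345 and Chr4, 5564536..5564622 of 18585056 (82/89 ident)",
]
hits = [parse_hit(line) for line in lines]
for hit in hits:
    print(hit["read"], hit["chrom"], hit["ref_start"], hit["ref_end"],
          hit["strand"], round(hit["identity"], 3))
```

Grouping the parsed hits by read ID would then show which reads map to two separate locations.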

Waiting for your kind reply.

score 0 · Answer 3 · 2013-05-01

Why do you want to use bowtie 1? That's close to the worst choice you could make for 454 data.

454 data is characterized by short indels stemming both from homopolymer repeat length errors and the general error structure of pyrosequencing.
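To illustrate why homopolymer length errors show up as short indels, here is a small Python sketch with made-up sequences. A read that undercalls a run of T's by one base differs from the reference only in run length, which an aligner has to represent as a one-base gap.

```python
from itertools import groupby


def homopolymer_runs(seq):
    """Return the sequence as a list of (base, run_length) pairs."""
    return [(base, len(list(group))) for base, group in groupby(seq)]


def collapse_homopolymers(seq):
    """Reduce every run of identical bases to a single base."""
    return "".join(base for base, _ in groupby(seq))


# Invented example: the read has TTT where the reference has TTTT.
reference = "ACGTTTTAGC"
read = "ACGTTTAGC"

print(homopolymer_runs(reference))
print(homopolymer_runs(read))

# Same base order once run lengths are ignored, so the only difference is an indel.
print(collapse_homopolymers(reference) == collapse_homopolymers(read))
```

An aligner that does not allow gaps would have to treat such a read as a series of mismatches after the run, or fail to align it.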

Try searching this forum (and other sites) for better tools to use with 454 data:

What's The Best Reads Aligner For 454 Data Now???

What Aligner To Use For Paired-End 454 Reads?