Question

getfasta command requires a bedfile template of some sort

0

8.2 years ago

chrisclarkson100 ▴ 160

Apologies, but I am completely new to bedtools and FASTA.

I want to get a fasta file from a full genome annotation as follows:

bedtools getfasta -fi /home/cc16956/mm9.fa  -fo nucs_gained_099_v5_occup.fasta

However, having looked at the documentation, it seems that I require the following template instead, where I use a BED file to specify the size of each chromosome with the 'fa' file...

From a previous post, I gathered that I could make one such file with a command like this:

mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from mm9.chromInfo"  > mm9.genome

chrom   size
chr1    197195432
chr2    181748087
chr3    159599783
chr4    155630120
chr5    152537259
chr6    149517037
chr7    152524553
chr8    131738871
chr9    124076172
chr10   129993255
chr11   121843856
chr12   121257530
chr13   120284312
chr14   125194864
chr15   103494974
chr16   98319150
chr17   95272651
chr18   90772031
chr19   61342430
chrX    166650296
chrY    15902555
chrM    16299
chr13_random    400311
chr16_random    3994
chr17_random    628739
chr1_random     1231697
chr3_random     41899
chr4_random     160594
chr5_random     357350
chr7_random     362490
chr8_random     849593
chr9_random     449403
chrUn_random    5900358
chrX_random     1785075
chrY_random     58682461

Having re-tried as follows:

bedtools getfasta -fi /home/cc16956/mm9.fa -bed /storage/projects/teif/mm9_generic_data/mm9.genome  -fo nucs_gained_099_v5_occup.fasta

I got this error:

It looks as though you have less than 3 columns at line: 1.  Are you sure your files are tab-delimited?

Can anyone tell me what the required format of the BED file is? I am completely new to this.

getfasta bedtools • 5.8k views

updated 8.2 years ago by harold.smith.tarheel ★ 5.0k • written 8.2 years ago by chrisclarkson100 ▴ 160

GenoMax · Answer 1 · 2016-11-03

1

8.2 years ago

Manvendra Singh ★ 2.2k

You would have to put your file in "bed" format, with at least 3 columns and no header, viz.

# Chr start end

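Add a second column containing the number "0" (BED start positions are zero-based, so a start of 0 covers each chromosome from its first base), so that the size becomes the third column, and save the result as mm9.bed. Then use this command: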

bedtools getfasta  -fi /home/cc16956/mm9.fa -bed /storage/projects/teif/mm9_generic_data/mm9.bed -fo nucs_gained_099_v5_occup.fa
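The conversion from the chromosome-sizes file to a three-column BED file can be sketched with awk. This is a minimal sketch, assuming the sizes file is tab-separated with a single header line (as the mysql client normally writes when its output is redirected). The file names example.genome and example.bed are illustrative stand-ins, and the start column is written as 0 because BED starts are zero-based; a start of 1 would skip the first base of each chromosome.

```shell
# Small stand-in for mm9.genome: the header line plus two rows from the question.
# The names example.genome and example.bed are illustrative only.
printf 'chrom\tsize\nchr1\t197195432\nchrM\t16299\n' > example.genome

# Skip the header (NR > 1) and print chrom, start 0, end = chromosome size,
# tab-separated, which is the minimal three-column BED layout.
awk 'BEGIN { FS = OFS = "\t" } NR > 1 { print $1, 0, $2 }' example.genome > example.bed

cat example.bed
```

Run against the real mm9.genome, the same awk line would produce one whole-chromosome interval per row, which can then be passed to getfasta with -bed.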

hth

updated 8.2 years ago by GenoMax 148k • written 8.2 years ago by Manvendra Singh ★ 2.2k

score 0 · Answer 2 · 2016-11-03

0

8.2 years ago

harold.smith.tarheel ★ 5.0k

See here for file format. Note that BED positions are zero-based.
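To make the zero-based convention concrete: a BED start counts from 0 and the end is exclusive, so the first 100 bases of chr1 are written as chr1, 0, 100. A minimal sketch, assuming a hypothetical tab-separated file regions_1based.txt that holds 1-based, inclusive coordinates, converts to BED by subtracting 1 from each start and leaving the end unchanged:

```shell
# Hypothetical input with 1-based, inclusive coordinates (chrom, start, end).
printf 'chr1\t1\t100\nchr2\t501\t600\n' > regions_1based.txt

# BED is 0-based and half-open: subtract 1 from the start, keep the end as is.
awk 'BEGIN { FS = OFS = "\t" } { print $1, $2 - 1, $3 }' regions_1based.txt > regions.bed

cat regions.bed
```

With this convention the length of an interval is simply end minus start, so both example rows describe 100 bases.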

8.2 years ago by harold.smith.tarheel ★ 5.0k