Apologies, but I am completely new to bedtools and FASTA files.
I want to get a FASTA file from a full genome annotation, as follows:
bedtools getfasta -fi /home/cc16956/mm9.fa -fo nucs_gained_099_v5_occup.fasta
However, having looked at the documentation, it seems that the command also requires a BED file (the -bed option) alongside the 'fa' file, which I took to specify the size of each chromosome.
From a previous post, I gathered that I could make one such file with a command like the one below, which gave me the file contents listed after it:
mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -e "select chrom, size from mm9.chromInfo" > mm9.genome
chrom size
chr1 197195432
chr2 181748087
chr3 159599783
chr4 155630120
chr5 152537259
chr6 149517037
chr7 152524553
chr8 131738871
chr9 124076172
chr10 129993255
chr11 121843856
chr12 121257530
chr13 120284312
chr14 125194864
chr15 103494974
chr16 98319150
chr17 95272651
chr18 90772031
chr19 61342430
chrX 166650296
chrY 15902555
chrM 16299
chr13_random 400311
chr16_random 3994
chr17_random 628739
chr1_random 1231697
chr3_random 41899
chr4_random 160594
chr5_random 357350
chr7_random 362490
chr8_random 849593
chr9_random 449403
chrUn_random 5900358
chrX_random 1785075
chrY_random 58682461
Having re-tried as follows:
bedtools getfasta -fi /home/cc16956/mm9.fa -bed /storage/projects/teif/mm9_generic_data/mm9.genome -fo nucs_gained_099_v5_occup.fasta
I got this error:
It looks as though you have less than 3 columns at line: 1. Are you sure your files are tab-delimited?
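To check the tab-delimiter part of that message, a quick sketch like the following counts how many tab-separated fields each line has. The file name check.genome is a hypothetical stand-in holding the first few lines of my mm9.genome, so this only illustrates the check rather than reproducing my real file.

```shell
# Hypothetical stand-in for the first lines of mm9.genome (tab-separated).
printf 'chrom\tsize\nchr1\t197195432\nchr2\t181748087\n' > check.genome

# For each distinct tab-separated field count, print how many lines have it.
awk -F'\t' '{ print NF }' check.genome | sort -n | uniq -c
```

For this sample every line has two fields, so the file is tab-delimited but has fewer than the three columns the error message mentions.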
Can anyone tell me what the required format of the BED file is? I am completely new to this.
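In case it helps pin down the problem: as far as I understand it, a BED file needs at least three tab-separated columns (chrom, start, end), with a 0-based start. Below is a minimal sketch of how the two-column chrom/size table could be turned into such a file, with one interval spanning each whole chromosome. The file names sample.genome and sample.bed are hypothetical, and I am assuming whole-chromosome intervals are what is wanted, which may not be the case.

```shell
# Hypothetical two-column chrom/size table (header plus two rows from the listing above).
printf 'chrom\tsize\nchr1\t197195432\nchrM\t16299\n' > sample.genome

# Skip the header line, then write chrom, start (0) and end (chromosome size), tab-separated.
awk 'BEGIN { OFS = "\t" } NR > 1 { print $1, 0, $2 }' sample.genome > sample.bed

# Show the resulting three-column file.
cat sample.bed
```

If specific regions are wanted rather than whole chromosomes, the start and end columns would hold those coordinates instead.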