AFAIK bedtools only allows for '>chrom:start-stop' (default) or '>name' (-name)
Is it possible to have both, i.e. for a .bed feature
chr1 100 110 name . +
i want to have an output like
>name:chr:100-110(+)
NNNNNNNNNN
AFAIK bedtools only allows for '>chrom:start-stop' (default) or '>name' (-name)
Is it possible to have both, i.e. for a .bed feature
chr1 100 110 name . +
i want to have an output like
>name:chr:100-110(+)
NNNNNNNNNN
Does name come from bed or from fasta header or fasta file name? From OP, it seems it comes from bed. If so you can create a fifth column with desired string.
$ awk -v OFS="\t" '{$5=$4":"$1":"$2"-"$3}1' test.bed | bedtools getfasta -fi test.fa -bed - -name
>Name:chr1:5-10
AAACC
Hello,
you could first use the -name
option. And than change the id with a little python script using Biopython like this:
from Bio import SeqIO
bed_file = "./regions.bed"
original_file = "./original.fasta"
renamed_file = "./renamed.fasta"
regions = dict()
records = []
with open(bed_file, "r") as bed:
for line in bed:
data = line.strip().split("\t")
regions[data[3]] = data
for record in SeqIO.parse(original_file, 'fasta'):
record.id = "{3}:{0}:{1}-{2}({5})".format(*regions[record.id])
record.description = record.id
records.append(record)
SeqIO.write(records, renamed_file, 'fasta')
fin swimmer
EDIT: Code correction. The original has overwritten the renamed.fasta in every iteration.
You could use samtools faidx
with bed2faidxsta.pl
:
$ bed2faidxsta.pl < in.bed > out.fa
If you want the header to have a custom format, just edit line 67:
my $header = ">".join(":",($chr, $start, $stop, $id, $score, $strand));
You could move $id
to the front of the list, for instance.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.