Filter Genome for Specific Sites
6 months ago
Anita • 0

Hello,

I have a list of ~1,300 single bp sites and a fully annotated genome. I'd like to create a fasta file with only the 1,300 sites (with ±300 bp on each side). My sites are in an Excel file right now with chromosome, position, strand, correlation, p-value as 5 separate columns. Does anyone know of a program/function that will subset a genome into the 1,300 sites/regions?

Thank you in advance!

bedtools • 494 views

My sites are in an Excel file right now



This is unhelpful and doesn't answer my question.

6 months ago
GenoMax 147k

I have a list of ~1,300 single bp sites

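Convert your file to plain-text BED format (three fields required: chromosome, start, end). BED coordinates are 0-based and half-open, so assuming your positions are 1-based, a site at position p becomes start = p - 1, end = p. Add name, score, and strand columns as well if you want to keep the strand.

Use bedtools slop to extend the range by 300 bp on each side (see: Extend The Coordinates Entries Within A Bed File). It needs a genome file listing the chromosome sizes.
Then use bedtools getfasta or samtools faidx to extract the sequence.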
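
As a rough illustration of the first two steps, here is a minimal Python sketch that reads the sites from a CSV export of the spreadsheet and writes padded BED6 intervals. It is not part of the answer above: the function names, the column headers (chromosome, position, strand), and the assumption that positions are 1-based are all hypothetical and should be adjusted to the real file. It does the padding itself rather than calling bedtools slop, clamping at 0 and, if chromosome sizes are supplied, at the chromosome end.

```python
import csv
import io


def site_to_bed(chrom, pos, strand, flank=300, chrom_size=None):
    """Turn one 1-based site into a 0-based, half-open BED6 interval."""
    start = max(0, pos - 1 - flank)  # BED start is 0-based; never below 0
    end = pos + flank                # BED end is exclusive
    if chrom_size is not None:
        end = min(end, chrom_size)   # do not run past the chromosome end
    name = f"{chrom}:{pos}"          # keep the original site as the name
    return (chrom, start, end, name, 0, strand)


def csv_to_bed(handle, flank=300, chrom_sizes=None):
    """Read rows with 'chromosome', 'position', 'strand' columns.

    The header names are assumptions; extra columns are ignored.
    """
    sizes = chrom_sizes or {}
    records = []
    for row in csv.DictReader(handle):
        chrom = row["chromosome"]
        records.append(
            site_to_bed(chrom, int(row["position"]), row["strand"],
                        flank, sizes.get(chrom))
        )
    return records


if __name__ == "__main__":
    # Two made-up sites standing in for the real ~1,300 rows.
    demo = io.StringIO(
        "chromosome,position,strand,correlation,p-value\n"
        "chr1,1000,+,0.8,0.001\n"
        "chr2,100,-,-0.5,0.02\n"
    )
    for rec in csv_to_bed(demo, chrom_sizes={"chr1": 5000, "chr2": 350}):
        print("\t".join(map(str, rec)))
```

The tab-separated output can be saved as a BED file and passed to bedtools getfasta (-fi for the genome FASTA, -bed for the intervals, -fo for the output). Adding -s should reverse-complement the minus-strand sites, which is why the sketch writes all six BED columns instead of only three.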


Great, thank you!
