How to replace the sequence in the genome.fasta with reverse complement in specific chromosome and location.
1
0
Entering edit mode
10 months ago
Ap1438 ▴ 50
    #!/bin/bash

# Input BED file with regions from multiple chromosomes
input_bed="wall_region.bed"
# genome file
genome="Wall_10_chr.fasta"

# Iterate through unique chromosomes in the BED file
for chr in $(cut -f1 "$input_bed" | sort -u); do
    # Create a temporary BED file for the current chromosome
    temp_bed="${chr}_temp.bed"

    # Extract regions for the current chromosome and save them to the temporary BED file
    awk -v chromosome="$chr" '$1 == chromosome' "$input_bed" > "$temp_bed"

    # Run bedtools getfasta for the current chromosome and save to region.fasta
    bedtools getfasta -fi "$genome" -bed "$temp_bed" > "${chr}_region.fasta"

    # Reverse complement the sequences using seqtk and save to inverted_region.fasta
    /prj/pflaphy-pacbio/Softwares/seqtk/seqtk seq -r "${chr}_region.fasta" > "${chr}_inverted_region.fasta"

    # Remove the temporary BED file
    rm "$temp_bed"
done

I want to replace original sequence in the genome with reverse complement in chromosome3 ,4 and 5, at specific location . And i have 2 genomes to do so. I made a bed file with the chromosome number and location like chrL.3 0 1000 and extracted the regions using the bedtools and reverse complement using seqtk. But now i am not able to replace them in the genome.fasta. Can anyone please help me do it. Thanks in advance.

genome awk edditing bash python • 766 views
ADD COMMENT
0
Entering edit mode

I made a bed file with the chromosome number and location like chrL.3 0 1000 and extracted the regions using the bedtools and reverse complement using seqtk.

Now grab the sequences from those regions that complement your original BED file (use https://bedtools.readthedocs.io/en/latest/content/tools/complement.html ) and then cat those with the reverse complemented pieces you made to create the sequence you want.

ADD REPLY
0
Entering edit mode

I did not understand completely. But as far as i understand complement doesn't replace. It only gives region which are not in the input_bed file. My question is how to replace it?

ADD REPLY
1
Entering edit mode
10 months ago
GenoMax 147k

If I understand correctly you want to replace a specific area on a chromosome with reverse complement.

Let us say that area was (entire chromosome was 100 bp long, be sure to use correct intervals in your real data)

chr3        20   30

You have already extracted this sequence and reverse complemented it based on what you said above. Now you want to place it back in the original sequence.

So now you need to extract the flanking sequences out so you can now replace the sequence in it. To do that you need the coordinates of the pieces so you can use bedtools to extract the sequence. You get those using bedtools complement.

Complement of the above bed would be

chr3     0  20
chr3    30 100

Now get the sequences for these two bed intervals. Now you can cat the sequence pieces to get the replacement you need in following order.

chr3_0_20_your_RC'ed sequence_chr3_30_100
ADD COMMENT
0
Entering edit mode

Thanks for the explanation.

ADD REPLY

Login before adding your answer.

Traffic: 1877 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6