5.5 years ago
Kumar
I have two FASTA files, as shown below.
File:1
>Contig_1:90600-91187
AAGGCCATCAAGGACGTGGATGAGGTCGTCAAGGGCAAGGAACAGGAATTGATGACGGTC
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
>Contig_24:26615-28387
GCTGCGGCGCTGATCCTGGCGGCCCGCGCCGAGGAGATCGCCCGTTTGGAGCGCGGCGAA
File:2
>Contig_1:90600-91187
GACCGTCATCAATTCCTGTTCCTTGCCCTTGACGACCTCATCCACGTCCTTGATGGCCTT
>Contig_24:26615-28387
TTCGCCGCGCTCCAAACGGGCGATCTCCTCGGCGCGGGCCGCCAGGATCAGCGCCG
Both files use the same FASTA headers but differ in their sequences. I need to replace the sequences in File:1 with the matching sequences from File:2, as shown below.
Expected outcome:
>Contig_1:90600-91187
GACCGTCATCAATTCCTGTTCCTTGCCCTTGACGACCTCATCCACGTCCTTGATGGCCTT
>Contig_98:35323-35886
GACGAAGCGCTCGCCAAGGCCGAAGAAGAAGGCCTGGATCTGGTCGAAATCCAGCCGCAG
>Contig_24:26615-28387
TTCGCCGCGCTCCAAACGGGCGATCTCCTCGGCGCGGGCCGCCAGGATCAGCGCCG
I tried the cat command, but it concatenates all the sequences instead of replacing them as described above.
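For readers who want a header-based replacement without extra tools, here is a minimal Python sketch. It is not the approach suggested in the replies; the function names `read_fasta` and `replace_sequences` and the file names are hypothetical. It assumes headers are unique within each file, matches them exactly, and writes each sequence on a single line.

```python
def read_fasta(path):
    """Return a dict mapping each header (without '>') to its full sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:]
                records[header] = []
            elif header is not None:
                # Sequence lines may be wrapped, so collect them piece by piece.
                records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def replace_sequences(target_path, source_path, out_path):
    """Copy target_path to out_path, taking the sequence from source_path
    whenever the same header exists there; other records are kept as they are."""
    target = read_fasta(target_path)
    source = read_fasta(source_path)
    with open(out_path, "w") as out:
        # Dicts preserve insertion order, so the order of File 1 is retained.
        for name, seq in target.items():
            out.write(f">{name}\n{source.get(name, seq)}\n")


if __name__ == "__main__":
    import os
    # Hypothetical file names; adjust to your own paths.
    if os.path.exists("file1.fasta") and os.path.exists("file2.fasta"):
        replace_sequences("file1.fasta", "file2.fasta", "merged.fasta")
```

Because both files are loaded into memory, this is only a sketch for files that fit in RAM; a very large File 1 would be better streamed record by record.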
Thank you, Corentin. However, I have a large FASTA file, so it would be tedious to use samtools faidx.
There is an option to give a list of regions to faidx:
You could create two files, one with a list of regions from file1 and the other with a list from file2, and run two faidx commands.
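One way to read this suggestion is to collect the header names of each FASTA file into a plain-text list, one region per line. The sketch below does only that step; the function name `write_region_list` and the file names are hypothetical, and it assumes the region name is the first whitespace-delimited token of each header.

```python
def write_region_list(fasta_path, regions_path):
    """Write the name of every FASTA record in fasta_path to regions_path,
    one per line, and return how many names were written."""
    count = 0
    with open(fasta_path) as fasta, open(regions_path, "w") as regions:
        for line in fasta:
            if not line.startswith(">"):
                continue
            fields = line[1:].split()
            if fields:  # skip a bare '>' with no name
                regions.write(fields[0] + "\n")
                count += 1
    return count


if __name__ == "__main__":
    import os
    # Hypothetical file names; adjust to your own paths.
    for fasta, listing in (("file1.fasta", "regions1.txt"),
                           ("file2.fasta", "regions2.txt")):
        if os.path.exists(fasta):
            print(fasta, write_region_list(fasta, listing))
```

Each resulting list can then be handed to faidx through its region-file option; check the samtools documentation for the exact flag in your version.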
Thank you, Corentin. Is it possible to extract sequences with samtools faidx when the coordinates are in reverse order? For example, instead of the proper order,
if it is like this,
will it work?
I am not exactly sure what you want to do. If you want the reverse complement, faidx also has an option for that:
You can see the documentation here: http://www.htslib.org/doc/samtools.html
Dear Corentin, if the coordinates are in reverse order, will samtools work? Let me put it this way: I have a series of coordinates to be extracted from the given data,
The coordinates above are well formatted except for the last one, Contig_91:690-450, because it is in reverse order. In that case, will samtools work?
Apparently faidx gives this error if the coordinates are reversed:
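Since reversed coordinates are rejected, one workaround is to normalize the region list before running faidx. The sketch below swaps start and end when they are reversed and reports that it did so. The function name `normalize_region` is hypothetical, and it assumes regions are written as name:start-end with the coordinates after the last colon.

```python
def normalize_region(region):
    """Return (region, was_reversed), where region has start <= end.

    The name is everything before the LAST colon, so names that themselves
    contain colons are left intact.
    """
    name, sep, span = region.strip().rpartition(":")
    start_text, dash, end_text = span.partition("-")
    if not sep or not dash:
        raise ValueError(f"not a name:start-end region: {region!r}")
    start, end = int(start_text), int(end_text)
    if start > end:
        return f"{name}:{end}-{start}", True
    return f"{name}:{start}-{end}", False


if __name__ == "__main__":
    for text in ("Contig_1:90600-91187", "Contig_91:690-450"):
        print(normalize_region(text))
```

Whether a region flagged as reversed should also be reverse-complemented after extraction depends on what the reversed coordinates are meant to express, so the sketch only reports the flag and leaves that decision to the caller.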
Thank you, Corentin, for your time and help. I have implemented your suggestions and everything works well.