4.6 years ago
ovariohisterectomia
Hi! I'm generating a .fa file with the sequences of a list of genes that I got from an RNA-seq analysis. I did it with this:
library(fastaR)
genelist <- scan("genelist.txt", what="character", sep=NULL)
fa_some_records(gene_list=genelist, fasta_file="genoma.fna", outfile="sc_myGenelist.fa")
The bad news is that afterwards the file sc_myGenelist.fa is empty (when I open it with gedit) and its size is 0 kB.
I think the problem is that the IDs in genelist.txt look like EGR_09983, while the headers in genoma.fna look like NW_020171337.1.
Is that a problem?
Thank you all!
Here are the first lines of the files:
genelist.txt
EGR_01804
EGR_05374
EGR_07251
EGR_10508
EGR_07266
EGR_05599
EGR_01271
EGR_11080
EGR_10767
EGR_03982
genoma.fna
>NW_020171337.1 Echinococcus granulosus chromosome Unknown EG_S00967, whole genome shotgun sequence
AAATCAAATGACTGGCAAGACTACATCACTGCTGAAGTTGCTGCATCTAGTGTGagctaaaaggcaaaaattggaaagTA
CGCATTTCTGACAGGGTCATCTATTTGCTATCCTTTAGTAGGGAACATGCACACAGGTTTTAATTGCTTTAGGGAAAATG
GTTAGATTTAGTACAGTTTCACAAAATCCACATCTCCAATAATTCATCGGTTTCCTAATGAAATTGTACCTGATGCGTAG
ATTGTAATGGTCAAAGTTTATGGGATCCTGAGTGCTACAAGTGCACCAGAAcaaaattaaaggcaaaagcaaaatgcGCT
TTATTGACTGTCAACACATGGCAATACTGTGCTGGTGTGCACCCGAATTAGGGCTTCTAGTAATACGATTGTGGGGTTCT
CTGTTACCtccaaattcaacaaaatcatTACTGGCTGCCGTTCAGTATATTAAAAGTGGACACCATTCTACattggattt
tacaaaatataaatcTATTTTATCATAATTAACGataaaaggttttatttttattccattgtTTGGTATTTGCGTCATAG
CTTGCGTTAAAGTGCAGTAGTTCAAACCAAAGTAAACATTAATACCAAATGTAGAGGGCGATATCTGACGTTCTTCcttg
attattttctttttttaaaagaaaaaccatATTGTTAGGGAGTATAAGTCTCCCATGGCAGCAAAAAAAGTGTGTCGAAC
TTGAAACTGTGCAAATGAAGagagtttgcaaaaatacacCAAGCTAAGTGTGTTTCAAGGGGGGATTAAAGATGTGGAAA
GTGTGCTTAAGTTGTGATAAATCACGTTTGCGATATTAGCGCCACAAACTCCTTTATCAAAGTGCCTCTCATCGACAATC
AAAAGTGTTCGATTTAGTTCTAAACCTTTGAAGGTCCATCGCCCGCCCATCCACCAATTTACCCACACATTCACCTCGAT
CATCTCGCTTTCTTTACTAATATTGATAAAGAGATTCCTCTAA
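I assume each query ID (genelist.txt) has to match exactly a header in the multi-FASTA file (genoma.fna). Your list holds gene IDs (EGR_...) while the FASTA headers are scaffold accessions (NW_...), so nothing matches and the output is empty.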
Switch to a better-documented, more widely used tool than a random GitHub repository. Try seqkit, maybe.
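Before switching tools, it may help to confirm the mismatch directly. Below is a minimal base-R sketch, assuming the record ID is the first whitespace-delimited token after `>` in each header. The helper names `fasta_ids` and `check_ids` are hypothetical and not part of fastaR or seqkit. It reads the whole FASTA into memory, so it is only meant as a quick check.

```r
# Hypothetical helper: return the ID (first token after ">") of every FASTA record.
fasta_ids <- function(fasta_file) {
  lines <- readLines(fasta_file)
  headers <- lines[startsWith(lines, ">")]
  sub("^>([^[:space:]]+).*$", "\\1", headers)
}

# Hypothetical helper: split the query IDs into those present in the FASTA
# headers and those that are missing.
check_ids <- function(query_ids, fasta_file) {
  ids <- fasta_ids(fasta_file)
  present <- query_ids %in% ids
  list(found = query_ids[present], missing = query_ids[!present])
}

# Usage with the file names from the question (guarded so the sketch
# also runs where those files do not exist).
if (file.exists("genelist.txt") && file.exists("genoma.fna")) {
  genelist <- scan("genelist.txt", what = "character", quiet = TRUE)
  res <- check_ids(genelist, "genoma.fna")
  cat(length(res$found), "of", length(genelist), "IDs match a FASTA header\n")
}
```

If none of the IDs are found, any tool that selects records by header ID will return nothing for this gene list and this FASTA file.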