7.5 years ago
This is a blastx output file. I want to select all the hits on one chromosome (chromosome 4, for example). I tried this command, but it does not work:
grep "^>" .chromosome:TAIR10:4 file >reslt_grep
> AT4G23310.2 pep chromosome:TAIR10:4:12185369:12188961:1 gene:AT4G23310
transcript:AT4G23310.2 gene_biotype:protein_coding
transcript_biotype:protein_coding gene_symbol:CRK23 description:Putative
cysteine-rich receptor-like protein kinase 23
[Source:UniProtKB/Swiss-Prot;Acc:O65482]
Length=790
Score = 27.3 bits (59), Expect = 1.7, Method: Composition-based stats.
Identities = 12/38 (32%), Positives = 21/38 (55%), Gaps = 0/38 (0%)
Frame = +3
Query 42 SLIAAASKMIPISILHCCHSYLLHPRTNTKHYSLISVL 155
S + S + P + H C S+ PR++T +LI++L
Sbjct 129 SNLVVTSALDPTYVYHVCPSWATFPRSSTYMTNLITLL 166
> AT1G31420.2 pep chromosome:TAIR10:1:11249600:11253915:1 gene:AT1G31420
transcript:AT1G31420.2 gene_biotype:protein_coding
transcript_biotype:protein_coding gene_symbol:FEI1 description:Leucine-rich
repeat receptor kinase [Source:UniProtKB/TrEMBL;Acc:F4I9D5]
Length=591
Score = 124 bits (312), Expect = 3e-34, Method: Compositional matrix adjust.
Identities = 63/88 (72%), Positives = 68/88 (77%), Gaps = 3/88 (3%)
Frame = -1
Query 265 GFGTVYKLIMDDNSAFAVKKILNNGVRSDRLFERELEILGSIKHRNLVNLRGYCNSPSAK 86
GFGTVYKL MDD FA+K+IL DR FERELEILGSIKHR LVNLRGYCNSP++K
Sbjct 316 GFGTVYKLAMDDGKVFALKRILKLNEGFDRFFERELEILGSIKHRYLVNLRGYCNSPTSK 375
Query 85 LLIYDYLPLGSLDELLHEHRETDSTLDW 2
LL+YDYLP GSLDE LHE E LDW
Sbjct 376 LLLYDYLPGGSLDEALHERGE---QLDW 400
How about:
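If you don't need to see the alignment, using the flag
-outfmt 6
(tab-delimited text) will make the output much easier to parse. The default tabular columns hold only the subject ID, so request the subject title as well (for example -outfmt "6 std stitle"); otherwise the chromosome field is not in the file. Then you can get all the lines whose subject is on chromosome 4:
grep -e 'chromosome:TAIR10:4:' blastout.txt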
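If you need to keep the pairwise output shown in the question, here is a rough sketch of one way to pull out whole hit records for a single chromosome. The original command probably failed because grep takes only the first argument as the pattern, so `.chromosome:TAIR10:4` was read as a file name. The function name `hits_on_chr` and the file names are made up for illustration, and the sketch assumes the chromosome field sits on the first line of each `>` header, as it does in the sample above.

```shell
# Hypothetical helper: print every hit record (header plus alignment lines)
# whose ">" header line mentions the requested TAIR10 chromosome.
# Assumes pairwise blastx output where each hit starts with a ">" line
# and each new query section starts with "Query=".
hits_on_chr() {
    chr="$1"
    file="$2"
    awk -v pat="chromosome:TAIR10:${chr}:" '
        /^Query=/ { keep = 0 }                     # new query section: stop printing
        /^>/      { keep = (index($0, pat) > 0) }  # decide once per hit header
        keep                                       # print lines belonging to kept hits
    ' "$file"
}

# Usage (file names are placeholders):
#   hits_on_chr 4 blastout.txt > chr4_hits.txt
#
# If only the header lines are wanted, a single grep pattern is enough:
#   grep '^>.*chromosome:TAIR10:4:' blastout.txt > result_grep
```

The trailing colon in the pattern keeps the match to the chromosome field itself. Trailing statistics after the last hit of a query would still be printed if that last hit is kept, so check the end of each section.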