Hi,
I am working on some programming projects for a biology lab. I need help extracting SNP names from a file that contains SNP names along with some quality information. One file, x.txt, has snp_name, but the other file, y.txt, has ilm_name. In y.txt, ilm_name follows the pattern [SnpName]-[0-9]_[A-Z]_[A-Z]_[0-9]. How do I separate out only the SNP name from this pattern?
x.txt:
snp_name
1:10002775-GA
1:100152282-CT
1:100154376-GA
1:100154844-CA
1:100155035-AC
1:100155084-CT
1:100316615-CAG-C
1:10032154-AC
1:100336041-TAGAC-T
1:100340360-CT
y.txt:
ilm_name
1:10002775-GA-0_T_F_2299176856 1 10002775 G + 0 0 0 0
1:100152282-CT-0_T_R_2299204377 1 100152282 G - 0 0 0 0
1:100154376-GA-0_B_R_2299204383 1 100154376 G + 0 0 0 0
1:100154844-CA-0_B_R_2299204393 1 100154844 C + 0 0 0 0
1:100155035-AC-0_T_F_2299204394 1 100155035 A + 0 0 0 0
1:100155084-CT-0_B_F_2299204396 1 100155084 G - 0 0 0 0
1:100182985-CA-0_T_F_2299204412 1 100182985 C + 0 0 0 0
1:100183042-AG-0_B_R_2299204415 1 100183042 A + 0 0 0 0
1:100316615-CAG-C-0_P_F_2304230872 1 100316615 AG + 0 0 0 0
1:10032154-AC-0_B_R_2299176859 1 10032154 A + 0 0 0 0
It would help if you formatted the content carefully with the "101010" button.
I formatted the text block for readability. It's always advisable (to make everything clear) to also show an example of what you aim to obtain.
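One possible approach, sketched in Python: take the first whitespace-separated field of each line in y.txt and strip the trailing suffix with a regular expression. This assumes the suffix is always the last part of the first column and that each [0-9] in the pattern may be one or more digits, as in the sample rows. The function names snp_name and snp_names_from_lines are made up for this illustration.

```python
import re

# Assumed suffix appended to the SNP name in y.txt:
# -<digits>_<uppercase letter>_<uppercase letter>_<digits>, at the end of the field.
SUFFIX = re.compile(r"-[0-9]+_[A-Z]_[A-Z]_[0-9]+$")


def snp_name(ilm_name):
    """Return ilm_name with the trailing suffix removed (unchanged if no suffix)."""
    return SUFFIX.sub("", ilm_name)


def snp_names_from_lines(lines):
    """Yield the SNP name for each data line, skipping the header and blank lines."""
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "ilm_name":
            continue
        yield snp_name(fields[0])


# Two rows copied from the y.txt sample in the question.
sample = [
    "ilm_name",
    "1:10002775-GA-0_T_F_2299176856 1 10002775 G + 0 0 0 0",
    "1:100316615-CAG-C-0_P_F_2304230872 1 100316615 AG + 0 0 0 0",
]
print(list(snp_names_from_lines(sample)))
# prints ['1:10002775-GA', '1:100316615-CAG-C']
```

To run it on the real file, pass an open file handle instead of the sample list, for example `with open("y.txt") as fh: names = list(snp_names_from_lines(fh))`. The resulting names can then be compared with the snp_name column of x.txt.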