Hi, I have small RNA-seq data that contain UMI and index sequences. However, I don't know what those sequences are or where they sit in the reads. What should I do to deduplicate the reads and remove the UMI and index sequences? I'd appreciate any help!
@NS500318:1217:HY5LKBGXT:1:11101:14039:1050 1:N:0:CTCTGATGGC+NTTATGAGGC
GACTCNTAGCGGTGGATCACTCGGCAACTGTAGGCACCNTCAATTTTATTAAGGCTAGATCGGAAGAGCACA
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@NS500318:1217:HY5LKBGXT:1:11101:8110:1051 1:N:0:CTCTGATGGC+NTTATGAGGC
GACTCNTAGCGGTGGATCACTCGGCAACTGTAGGCACCATCAATAAAACACTAGGCAGATCGGAAGAGCACA
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@NS500318:1217:HY5LKBGXT:1:11101:24435:1052 1:N:0:CTCTGATGGC+NTTATGAGGC
GACTCNTAGCGGTGGATCACTCGGCAACTGTAGGCACCATCAATCTCAGACCACCAAGATCGGAAGAGCACA
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE6EEEEEEE
@NS500318:1217:HY5LKBGXT:1:11101:20765:1054 1:N:0:CTCTGATGGC+NTTATGAGGC
AGAGGNCGTGAAACCGTTAAGAGGTAACTGTAGGCACCATCAATCCATGTAACCCGAGATCGGAAGAGCAAC
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
@NS500318:1217:HY5LKBGXT:1:11101:20220:1058 1:N:0:CTCTGATGGC+NTTATGAGGC
GTTTCNGTAGTGTAGTGGTTATCACGTTCGCCTAACTGTAGGCACCATCAATACGGGAGCCACCAGATCGGA
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
You will probably have better luck troubleshooting this issue by getting in contact with the person or company who prepared your libraries.
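While you wait for an answer from them, the five reads you posted do suggest a layout you could test. The last field of each header (CTCTGATGGC+NTTATGAGGC) is where Illumina software writes the sample index reads, so the indexes are probably already out of the sequence itself. Within the sequences, every read contains the same stretch AACTGTAGGCACCATCAAT (with one N in the first read), then 12 variable bases, then AGATCGGAAGAGC..., which is the start of the common Illumina adapter. The sketch below assumes that the shared stretch is a 3' adapter and that the 12 variable bases are the UMI. That is a guess from five reads, not something the kit documentation has confirmed, and the names CONSTANT, UMI_LEN, ADAPTER_PREFIX, split_read and dedup are made up for illustration.

```python
import re

# Assumed read layout, inferred only from the five example reads:
#   insert + CONSTANT + 12 nt UMI + start of the Illumina adapter
CONSTANT = "AACTGTAGGCACCATCAAT"   # assumption: 3' adapter shared by all reads
UMI_LEN = 12                       # assumption: UMI length
ADAPTER_PREFIX = "AGATCGGA"        # first bases of the common Illumina adapter


def _allow_n(seq):
    """Build a regex fragment that tolerates an N at any position."""
    return "".join(f"[{base}N]" for base in seq)


PATTERN = re.compile(
    rf"^(?P<insert>[ACGTN]+?){_allow_n(CONSTANT)}"
    rf"(?P<umi>[ACGTN]{{{UMI_LEN}}}){_allow_n(ADAPTER_PREFIX)}"
)


def split_read(seq):
    """Return (insert, umi) if the read fits the assumed layout, else None."""
    match = PATTERN.match(seq)
    if match is None:
        return None
    return match.group("insert"), match.group("umi")


def dedup(seqs):
    """Keep the first read seen for each exact (insert, umi) pair.

    Reads that do not fit the assumed layout are dropped.
    """
    seen = set()
    unique = []
    for seq in seqs:
        parts = split_read(seq)
        if parts is None or parts in seen:
            continue
        seen.add(parts)
        unique.append(parts)
    return unique


if __name__ == "__main__":
    reads = [
        "GACTCNTAGCGGTGGATCACTCGGCAACTGTAGGCACCNTCAATTTTATTAAGGCTAGATCGGAAGAGCACA",
        "GACTCNTAGCGGTGGATCACTCGGCAACTGTAGGCACCATCAATAAAACACTAGGCAGATCGGAAGAGCACA",
    ]
    for insert, umi in dedup(reads):
        print(insert, umi)
```

Exact matching on (insert, UMI) is a simplification: it ignores sequencing errors in the UMI and does not use mapping positions. If the layout is confirmed, a dedicated tool such as UMI-tools would be a better choice for the real analysis, since it moves the UMI into the read name before alignment and deduplicates on aligned reads.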