5.0 years ago
Ati
I have paired-end bulk RNA-seq data. An example of a read header in my data is:
@A00379:151:HW2HVDSXX:4:1101:21748:1078_CGCTA 3:N:0:CAGTTCTG+TGACTGAC
Based on the format @Instrument:RunID:FlowCellID:Lane:Tile:X:Y:UMI ReadNum:FilterFlag:0:IndexSequence (or SampleNumber), CGCTA is the UMI. But what is the number before it, 1078_? Is this the read ID?
It should be the Y coordinate of the particular cluster. In this case, a _ (rather than a colon) has been used to append the UMI. The read ID should be this entire string:
@A00379:151:HW2HVDSXX:4:1101:21748:1078
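To make the field boundaries concrete, here is a minimal Python sketch that splits a header like the one in the question into its parts. The regular expression and the `parse_header` function are hypothetical names written for this illustration, not part of any library, and the pattern assumes the UMI (if present) is attached to the Y coordinate with either `_` or `:` and consists only of A, C, G, T or N.

```python
import re

# Hypothetical pattern for illustration: seven colon-separated fields,
# an optional UMI joined by "_" or ":", then an optional comment after whitespace.
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>[^:\s]+):(?P<flowcell>[^:\s]+):"
    r"(?P<lane>\d+):(?P<tile>\d+):(?P<x>\d+):(?P<y>\d+)"
    r"(?:[_:](?P<umi>[ACGTN]+))?"
    r"(?:\s+(?P<comment>.*))?$"
)

def parse_header(header):
    """Split a FASTQ header line into named fields (sketch, not a validator)."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"Unrecognised header layout: {header!r}")
    fields = match.groupdict()
    # The read ID is everything up to and including the Y coordinate.
    keys = ("instrument", "run", "flowcell", "lane", "tile", "x", "y")
    fields["read_id"] = "@" + ":".join(fields[k] for k in keys)
    return fields

if __name__ == "__main__":
    example = "@A00379:151:HW2HVDSXX:4:1101:21748:1078_CGCTA 3:N:0:CAGTTCTG+TGACTGAC"
    parsed = parse_header(example)
    print(parsed["y"], parsed["umi"], parsed["read_id"])
```

For the header in the question, this separates 1078 (the Y coordinate) from CGCTA (the UMI) and leaves the part after the space as the comment field.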