Hi,
I extracted gene counts with featureCounts, but the output is a little awkward to handle. How do I get it into a proper format? The "Chr" name is repeated across several columns. Any suggestions?
Geneid Chr Start End Strand Length 13.bam
LOC117736 NC_046966.1 NC_046966.1 18662 18662 46417 46417 - - 27756 910
naf1 NC_046966.1 NC_046966.1 51440 51440 57561 57359 - - 6122 558
vegfc NC_046966.1 82235 184669 + 102435 127
LOC1177 NC_046966.1 186959 189305 - 2347 4
tenm3 NC_046966.1 1017035 1114474 + 97440 471
dctd NC_046966.1 NC_046966.1 NC_046966.1 1117869 1121679 1121679 1133921 1134718 1133908 - - - 16850 478
cep44 NC_046966.1 NC_046966.1 NC_046966.1 1136953 1136953 1136953 1154202 1154202 1154202 - - - 17250 94
fbxo8 NC_046966.1 NC_046966.1 NC_046966.1 NC_046966.1 NC_046966.1 NC_046966.1 1153392 1153444 1154267 1154282 1154407 1154957 1165631 1165631
hand2 NC_046966.1 NC_046966.1 1250592 1250592 1256478 1256478 + + 5887 0
This is the right format. It is a tab-separated file: the gene ID is in column 1, followed by a few columns of annotation (generally 5; you seem to have 3, since you used a SAF-format annotation?), followed by one column of counts per sample.
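To make that layout concrete, here is a minimal Python sketch that pulls the gene IDs and per-sample counts out of such a table. It assumes the usual layout of a leading "#" comment line, then a header with Geneid, Chr, Start, End, Strand, Length, then one column per BAM file. It also assumes that the repeated values in the question are per-exon entries joined by ";" inside a single tab-delimited field. The function names and the example text are hypothetical; the two data rows reuse values from the table in the question.

```python
import csv
import io


def read_featurecounts(handle):
    """Return (sample_names, {gene_id: [counts]}) from featureCounts-style text."""
    # Skip comment lines (assumed to start with '#').
    data_lines = (line for line in handle if not line.startswith("#"))
    rows = csv.reader(data_lines, delimiter="\t")
    header = next(rows)
    # Assumption: six annotation columns come before the sample columns.
    samples = header[6:]
    counts = {}
    for row in rows:
        if not row:
            continue  # ignore blank lines
        # Assumption: counts are whole numbers (no fractional counting).
        counts[row[0]] = [int(value) for value in row[6:]]
    return samples, counts


def collapse(field):
    """Collapse 'a;a;a' to 'a' when every entry agrees; otherwise return it unchanged."""
    values = field.split(";")
    return values[0] if len(set(values)) == 1 else field


# Hypothetical input: tabs between columns, ';' inside multi-exon fields.
example = (
    "# example comment line\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\t13.bam\n"
    "naf1\tNC_046966.1;NC_046966.1\t51440;51440\t57561;57359\t-;-\t6122\t558\n"
    "vegfc\tNC_046966.1\t82235\t184669\t+\t102435\t127\n"
)

samples, counts = read_featurecounts(io.StringIO(example))
print(samples)
print(counts)
print(collapse("NC_046966.1;NC_046966.1"))
```

If the real file uses a different separator inside the annotation fields, only the `split(";")` call in `collapse` would need to change.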
Tip: You should supply all BAM files in the same featureCounts command to get the complete matrix you need.
I totally agree; I used the same command for all the BAM files. Then why is the chromosome name "NC_046966.1" repeated in other columns more than once? The table seems to be unstructured.
If you look at the file using
less -S
you will see that there is a definite tab-separated structure to the file.
The output for the lines with repeating sequence names seems incorrect.
As I recall, the expected output of featureCounts is simple and straightforward: the chromosome name is listed only once, and it should not require the type of cleanup you seem to need.
Is it possible that this file was created in a different way?
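One way to check the tab structure without a pager is to count the tab-separated fields on each line. If every line has the same number of fields, the table is structured and the repeats sit inside single fields. The following is a minimal Python sketch of that check. The function name is hypothetical, and the example rows are taken from the table in the question with tabs assumed between columns.

```python
from collections import Counter


def field_counts(lines):
    """Tally how many lines have each number of tab-separated fields."""
    tally = Counter()
    for line in lines:
        # Skip comment lines (assumed to start with '#') and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        tally[len(line.rstrip("\n").split("\t"))] += 1
    return tally


# Hypothetical input built from single-exon rows in the question.
example = [
    "# example comment line\n",
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\t13.bam\n",
    "vegfc\tNC_046966.1\t82235\t184669\t+\t102435\t127\n",
    "tenm3\tNC_046966.1\t1017035\t1114474\t+\t97440\t471\n",
]

print(dict(field_counts(example)))
```

On a real file, pass the open file object instead of the list. More than one key in the result would mean that some lines have a different number of columns and are worth inspecting.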
I used the same standard command. Why is the chromosome name repeated in other columns? Any suggestions?
Can you post the code you used and which SAF or GTF you used? Did you make the SAF yourself?