I have downloaded 10,000 SNVs from the COSMIC database. These are all cancer-causing variants. Each mutation has strand information attached to it. The problem is that every mutation is flagged as belonging to the plus strand. Why is that? Is it a convention? For example, consider the mutation COSM1099572. It is listed as a plus-strand mutation (G>A) when you download the database and view it in R. But when I look at the Ensembl contig information, the base on the plus strand is C and the base on the minus strand is G. So my question is: is there a convention to report everything on the plus strand?
Yes, the convention is to report everything relative to the top (plus) strand.
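If it helps to see how the same substitution reads on each strand, here is a minimal Python sketch. The helper names (`reverse_complement`, `to_other_strand`) are made up for illustration and are not part of any COSMIC or Ensembl tooling. It only shows that an allele pair written on one strand corresponds to the complemented pair on the other, so the G>A in your example would read as C>T on the opposite strand.

```python
from typing import Tuple

# Translation table for complementing DNA bases (hypothetical helper,
# not taken from COSMIC or Ensembl code).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def to_other_strand(ref: str, alt: str) -> Tuple[str, str]:
    """Express a ref>alt substitution as it reads on the opposite strand."""
    return reverse_complement(ref), reverse_complement(alt)


# The G>A substitution from the question, viewed from the opposite strand.
ref, alt = to_other_strand("G", "A")
print(f"{ref}>{alt}")  # C>T
```

For a single-base change this is just the complement; the reversal only matters for multi-base alleles.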