9.1 years ago
frcamacho
I start out with this pandas dataframe:
  sampleID    scaffoldID  Type Program    Breadth  \
3   G38791    scaffold_7     4       A  73.558964
0   G38791  scaffold_388     3       B   0.000000
1   G38791  scaffold_777     2       B   0.000000
2   G38791  scaffold_787     0       B   0.000000
3   G38791    scaffold_7     4       B  73.558964
How can I conditionally merge columns? For rows where df['Type'] == 4, I want to replace the Type value with the string "partial" joined to that row's Program and Breadth values, so that Type becomes, for example, partial_A_73.558964.
New dataframe should be:
  sampleID    scaffoldID                 Type Program    Breadth  \
3   G38791    scaffold_7  partial_A_73.558964       A  73.558964
0   G38791  scaffold_388                    3       B   0.000000
1   G38791  scaffold_777                    2       B   0.000000
2   G38791  scaffold_787                    0       B   0.000000
3   G38791    scaffold_7  partial_B_73.558964       B  73.558964
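One possible way to get from the first frame to the second, as a minimal sketch rather than a definitive answer: build the label for every row, then keep it only where Type is 4. The frame below is rebuilt by hand from the rows shown above (any columns hidden by the wrapped output are left out), and the names `mask` and `label` are only illustrative. The six-decimal format is an assumption chosen to match the Breadth values as printed.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (visible columns only).
df = pd.DataFrame(
    {
        "sampleID": ["G38791"] * 5,
        "scaffoldID": ["scaffold_7", "scaffold_388", "scaffold_777",
                       "scaffold_787", "scaffold_7"],
        "Type": [4, 3, 2, 0, 4],
        "Program": ["A", "B", "B", "B", "B"],
        "Breadth": [73.558964, 0.0, 0.0, 0.0, 73.558964],
    },
    index=[3, 0, 1, 2, 3],  # the index label 3 is repeated, as in the question
)

# Rows whose Type should be rewritten.
mask = df["Type"] == 4

# Candidate label for every row: "partial_<Program>_<Breadth>".
# Breadth is formatted with six decimals (assumption, matches the printout).
label = "partial_" + df["Program"] + "_" + df["Breadth"].map("{:.6f}".format)

# np.where works by position, so the repeated index label causes no
# alignment problems. The resulting column holds both ints and strings.
df["Type"] = np.where(mask, label, df["Type"].astype(object))

print(df)
```

If mixing integers and strings in Type turns out to be a problem later, the same `np.where` result could be written to a separate column instead, leaving Type numeric.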
Quick question - I think mixing data types in a column is a bad idea. Is pursuing an alternative that maintains data integrity a viable option?
I will be dropping both the Program and Breadth columns, which is why I need to concatenate their values into Type. I agree that in general you should not mix data types, but for the analysis I will be conducting, the Type column tells us a lot about each row, especially the partials.
Hmmm. It's just a matter of personal choice, I guess. Personally, in a storage vs maintainability contest, I'd pick maintainability (especially when atomicity is at stake), but that's me being excessively obsessed with data integrity.