Returning column names based on condition

Zoozoo Source

I have a dataset with 100+ columns and need to return only the names of the columns in which at least one row contains the string 'palm oil'. I have seen some variations of this question and have tried every combination I could, but none of them returns the column names whose rows contain the string of interest. Can someone please help? This is my code:

str_cols = []
for col in df.select_dtypes([np.object]).columns[7:45]:
    if df[col].str.lower().str.contains("palm", na=False):
    str_cols.append(col)
print (str_cols)
Tags: python, python-3.x, pandas

Answers

answered 5 days ago jezrael #1

To return the column names, test each column with str.contains, passing case=False for a case-insensitive match, and keep the columns where any row matches:

import pandas as pd

df = pd.DataFrame({'A':list('abcdef'),
                   'B':['palm oil',5,4,5,'palm oil 5',4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,'Palm oil',1,0],
                   'E':[5,3,6,9,2,'palm OIL'],
                   'F':list('aaabbb')}).astype(str)

print (df)
   A           B  C         D         E  F
0  a    palm oil  7         1         5  a
1  b           5  8         3         3  a
2  c           4  9         5         6  a
3  d           5  4  Palm oil         9  b
4  e  palm oil 5  2         1         2  b
5  f           4  3         0  palm OIL  b

m = df.astype(str).apply(lambda x: x.str.contains("palm oil", case=False, na=False)).any()

c = df.columns[m]
print (c)
Index(['B', 'D', 'E'], dtype='object')
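
For comparison, here is a minimal sketch of how the loop from the question could be repaired rather than replaced. It rests on two observations: str.contains returns a boolean Series, and using a Series directly in an if statement raises a ValueError, so the result has to be reduced with .any() first; and the np.object alias has been removed from recent NumPy releases, so the string "object" is passed to select_dtypes instead. The sample data is the same as above but left un-cast, and the question's [7:45] column slice is dropped because the sample has only six columns; both are assumptions made for illustration.

```python
import pandas as pd

# Same sample data as above, without the astype(str) cast, so that
# 'C' stays numeric and is skipped by select_dtypes.
df = pd.DataFrame({'A': list('abcdef'),
                   'B': ['palm oil', 5, 4, 5, 'palm oil 5', 4],
                   'C': [7, 8, 9, 4, 2, 3],
                   'D': [1, 3, 5, 'Palm oil', 1, 0],
                   'E': [5, 3, 6, 9, 2, 'palm OIL'],
                   'F': list('aaabbb')})

str_cols = []
for col in df.select_dtypes(include="object").columns:
    # Cast to str so mixed int/str columns can be searched, then
    # build a boolean Series marking the rows that match.
    hits = df[col].astype(str).str.contains("palm oil", case=False, na=False)
    # Reduce the Series to a single bool before using it in `if`.
    if hits.any():
        str_cols.append(col)

print(str_cols)
```

On real data the column slice from the question can be added back after .columns if only part of the frame should be searched.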
