How to make assignment to hierarchical column of pandas dataframe using boolean mask?
Date : March 29 2020, 07:55 AM
You can select the column with a list, which keeps the hierarchical column labels on the resulting mask, and then assign through that mask:
idx = df[['val1']] > 20
idx
Out[39]:
       val1
site      a     b
time
1     False  True
2     False  True
df[idx] = 50
df
Out[41]:
     val1     val2
site    a   b    a    b
time
1      11  50  101  201
2      12  50  102  202
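The question's input frame is not shown above, so the following is only a minimal sketch under assumptions: the `val1`/`b` values (21 and 22) are hypothetical, chosen so that they exceed 20 as the mask output suggests. Instead of assigning through the partial mask directly, it expands the mask to the full column set with `reindex` and uses `DataFrame.mask`, which is a slightly different route to the same result.

```python
import pandas as pd

# Hypothetical input: the val1/b values (21, 22) are assumed, not taken from the question.
columns = pd.MultiIndex.from_product(
    [["val1", "val2"], ["a", "b"]], names=[None, "site"]
)
df = pd.DataFrame(
    [[11, 21, 101, 201], [12, 22, 102, 202]],
    index=pd.Index([1, 2], name="time"),
    columns=columns,
)

# Selecting with a list keeps both column levels on the mask.
idx = df[["val1"]] > 20

# Expand the mask to every column; columns outside val1 are never replaced.
full_mask = idx.reindex(columns=df.columns, fill_value=False)

# Replace the masked cells with 50 and leave everything else untouched.
result = df.mask(full_mask, 50)
print(result)
```

Building the full-width mask makes it explicit that the `val2` columns are excluded, rather than relying on how pandas fills in the missing columns during assignment.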
|
pandas: Calculate new values in the new column from multiple criteria apply from multiple columns without looping
Date : March 29 2020, 07:55 AM
You can use the following. Deal with the rows whose Status is Closed first by copying Years across unchanged, then give each Active row the maximum Years among the Active rows that share its AccID:
df.loc[df.Status == 'Closed', 'ActiveYears'] = df.loc[df.Status == 'Closed', 'Years']
df.loc[df.Status == 'Active', 'ActiveYears'] = df[df.Status == 'Active'].groupby('AccID')['Years'].transform('max')
print(df)
  AccID AccTypes  Status  Years  ActiveYears
0   001        A  Closed      5          5.0
1   001        B  Active     15         15.0
2   001        C  Active     10         15.0
3   002        A  Active     20         20.0
4   002        B  Closed     25         25.0
5   003        C  Active     30         30.0
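For reference, here is a self-contained sketch of the same two-step assignment. The input frame is rebuilt from the printed output above, and the names `closed` and `active` are introduced only for readability.

```python
import pandas as pd

# Input rebuilt from the printed result above (without the ActiveYears column).
df = pd.DataFrame({
    "AccID": ["001", "001", "001", "002", "002", "003"],
    "AccTypes": ["A", "B", "C", "A", "B", "C"],
    "Status": ["Closed", "Active", "Active", "Active", "Closed", "Active"],
    "Years": [5, 15, 10, 20, 25, 30],
})

closed = df["Status"] == "Closed"
active = df["Status"] == "Active"

# Closed accounts keep their own Years value.
df.loc[closed, "ActiveYears"] = df.loc[closed, "Years"]

# Active accounts get the largest Years among the active rows of the same AccID.
df.loc[active, "ActiveYears"] = (
    df[active].groupby("AccID")["Years"].transform("max")
)
print(df)
```

Because `transform` returns a Series indexed like the filtered rows, the assignment lines up by index label and no explicit loop is needed.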
|
Pandas dataframe boolean mask on multiple columns
Tag : python , By : Yolanda N. Ceron
Date : March 29 2020, 07:55 AM
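You can use the result of your apply statement as a boolean index into the original dataframe. results.A filters on column A alone, results.any(axis=1) keeps rows where at least one column passes, and results.all(axis=1) keeps rows where every column passes:
results = df[["A","B"]].apply(lambda x: x.abs()-5*df['d'+x.name] > 0)
results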
       A      B
0  False   True
1   True   True
2   True   True
3   True  False
df[results.A]
   A    B     dA     dB
1  2 -4.0  0.263  0.357
2  5  5.0  0.382  0.397
3 -4 -0.5  0.330  0.115
df[results.any(axis=1)]
   A    B     dA     dB
0 -1  3.0  0.310  0.080
1  2 -4.0  0.263  0.357
2  5  5.0  0.382  0.397
3 -4 -0.5  0.330  0.115
df[results.all(axis=1)]
   A    B     dA     dB
1  2 -4.0  0.263  0.357
2  5  5.0  0.382  0.397
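The session above can be reproduced end to end with a small sketch. The input values are read off the printed rows; the names `only_a`, `any_col` and `all_cols` are introduced here purely for illustration.

```python
import pandas as pd

# Input values taken from the rows printed above.
df = pd.DataFrame({
    "A": [-1, 2, 5, -4],
    "B": [3.0, -4.0, 5.0, -0.5],
    "dA": [0.310, 0.263, 0.382, 0.330],
    "dB": [0.080, 0.357, 0.397, 0.115],
})

# For each of A and B, test |value| - 5 * (matching d-column) > 0.
results = df[["A", "B"]].apply(
    lambda col: col.abs() - 5 * df["d" + col.name] > 0
)

only_a = df[results["A"]]             # rows where column A passes
any_col = df[results.any(axis=1)]     # rows where A or B passes
all_cols = df[results.all(axis=1)]    # rows where both A and B pass
print(all_cols)
```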
|
Pandas - on each column apply a function returning multiple values
Tag : python , By : glisignoli
Date : March 29 2020, 07:55 AM
The key issue is that your function returns a tuple, so res_1, res_2 = my_fun(df['a']) simply unpacks that tuple into res_1 and res_2, each of which is a Series. Applied to the whole dataframe, the function yields one tuple per column (or one per cell with applymap) rather than two dataframes. To illustrate:
df.apply(my_fun)
# a ([11, 12, 13], [3, 6, 9])
# b ([14, 15, 16], [12, 15, 18])
# c ([20, 21, 22], [30, 33, 36])
# dtype: object
df.applymap(my_fun)
# a b c
# 0 (11, 3) (14, 12) (20, 30)
# 1 (12, 6) (15, 15) (21, 33)
# 2 (13, 9) (16, 18) (22, 36)
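# Take the first element of each per-column tuple, then transpose so the original columns are columns again
df1 = df.apply(my_fun, axis = 0).apply(lambda x: x[0]).transpose()
# Same for the second element of each tuple
df2 = df.apply(my_fun, axis = 0).apply(lambda x: x[1]).transpose()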
df1
# a b c
# 0 11 14 20
# 1 12 15 21
# 2 13 16 22
df2
# a b c
# 0 3 12 30
# 1 6 15 33
# 2 9 18 36
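Neither `my_fun` nor the input frame is shown in the answer, so the sketch below uses hypothetical stand-ins inferred from the printed outputs (a function returning `col + 10` and `col * 3`, and columns `a`, `b`, `c`). It also swaps the chained `apply` calls for a plain dictionary comprehension, an alternative that does not depend on how `DataFrame.apply` wraps tuple results.

```python
import pandas as pd

# Hypothetical input, inferred from the outputs shown above.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [10, 11, 12]})


def my_fun(col):
    """Hypothetical stand-in: return two Series derived from one column."""
    return col + 10, col * 3


# Call the function once per column and keep both returned Series.
per_column = {name: my_fun(df[name]) for name in df.columns}

# Assemble one DataFrame per tuple position.
df1 = pd.DataFrame({name: pair[0] for name, pair in per_column.items()})
df2 = pd.DataFrame({name: pair[1] for name, pair in per_column.items()})
print(df1)
print(df2)
```

Calling the function once per column also avoids evaluating it twice, which the two chained `apply` lines do.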
|
Setting dataframe by using both iloc and a boolean mask (mask at multiple different index (row) values in the dataframe)
Date : March 29 2020, 07:55 AM
I believe you need DataFrame.iloc together with DataFrame.mask, which by default replaces the values where the boolean mask is True with NaN. The mask only needs the same number of rows and columns as the selected slice of df. The df2_null mask is also converted to a NumPy array to avoid alignment on index labels:
df.iloc[20:24] = df.iloc[20:24].mask(df2_null.values)
print (df.iloc[15:30])
       A     B
15  15.0  15.0
16  16.0  16.0
17  17.0  17.0
18  18.0  18.0
19  19.0  19.0
20  20.0   NaN
21   NaN  21.0
22  22.0   NaN
23  23.0  23.0
24  24.0  24.0
25  25.0  25.0
26  26.0  26.0
27  27.0  27.0
28  28.0  28.0
29  29.0  29.0
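The answer never shows how `df2_null` is built, so this sketch assumes a hypothetical four-row frame `df2` whose missing values match the NaN pattern printed above. The main frame is created with float columns so that writing NaN does not change its dtype.

```python
import numpy as np
import pandas as pd

# Float columns, so NaN can be stored without changing the dtype.
df = pd.DataFrame({"A": np.arange(30, dtype=float),
                   "B": np.arange(30, dtype=float)})

# Hypothetical df2: its NaN positions are assumed from the output above.
df2 = pd.DataFrame({"A": [1.0, np.nan, 1.0, 1.0],
                    "B": [np.nan, 1.0, np.nan, 1.0]})
df2_null = df2.isnull()

# .values drops df2's own index (0..3), so the mask is applied by position
# to rows 20..23 instead of being aligned on labels.
df.iloc[20:24] = df.iloc[20:24].mask(df2_null.values)
print(df.iloc[15:30])
```

Without `.values`, the mask would be aligned on the labels 0 to 3, which do not exist in the selected slice, so it could not line up with rows 20 to 23.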
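Alternatively, do the replacement on the underlying NumPy array with numpy.where:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': list(range(0,30)), 'B': list(range(0,30))})
arr = df.values.astype(float)
# Put NaN where the mask is True and keep the original values elsewhere
arr[20:24] = np.where(df2_null.values, np.nan, arr[20:24])
print (arr)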
[[ 0. 0.]
[ 1. 1.]
[ 2. 2.]
[ 3. 3.]
[ 4. 4.]
[ 5. 5.]
[ 6. 6.]
[ 7. 7.]
[ 8. 8.]
[ 9. 9.]
[10. 10.]
[11. 11.]
[12. 12.]
[13. 13.]
[14. 14.]
[15. 15.]
[16. 16.]
[17. 17.]
[18. 18.]
[19. 19.]
[20. nan]
[nan 21.]
[22. nan]
[23. 23.]
[24. 24.]
[25. 25.]
[26. 26.]
[27. 27.]
[28. 28.]
[29. 29.]]
|