How to assign the label of one column to the new one based on group maximum in pandas


Tag : python , By : Hadley
Date : November 29 2020, 12:01 PM

You could try this, which assigns each Id group's most frequent label, or 'R' when the group has no single most frequent label:
df['assigned_label'] = df.groupby('Id')['label']\
                         .transform(lambda x: x.mode()[0] if len(x.mode()) == 1 else 'R')
   Id_hour Id hour label assigned_label
0      A_1  A    1     H              L
1      A_2  A    2     L              L
2      A_3  A    3     L              L
3      A_4  A    4     L              L
4      B_1  B    1     H              H
5      B_2  B    2     H              H
6      B_3  B    3     H              H
7      B_4  B    4     L              H
8      C_1  C    1     H              R
9      C_2  C    2     H              R
10     C_3  C    3     L              R
11     C_4  C    4     L              R
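
A self-contained sketch of the same idea, rebuilding the sample frame from the output above. The helper name majority_or_tie and its tie_value parameter are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the printed output above.
df = pd.DataFrame({
    "Id": list("AAAABBBBCCCC"),
    "hour": [1, 2, 3, 4] * 3,
    "label": ["H", "L", "L", "L", "H", "H", "H", "L", "H", "H", "L", "L"],
})
df["Id_hour"] = df["Id"] + "_" + df["hour"].astype(str)


def majority_or_tie(labels, tie_value="R"):
    """Return the single most frequent label, or tie_value when several labels tie."""
    modes = labels.mode()  # computed once instead of twice per group
    return modes.iloc[0] if len(modes) == 1 else tie_value


# transform broadcasts the scalar result back to every row of the group.
df["assigned_label"] = df.groupby("Id")["label"].transform(majority_or_tie)
print(df[["Id_hour", "Id", "hour", "label", "assigned_label"]])
```

Computing the mode once per group avoids the duplicate x.mode() call in the one-line lambda; the result should be the same.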


Group data based on column label in pandas dataframe


Tag : python , By : Brian Cupps
Date : March 29 2020, 07:55 AM
You can use a column-wise (axis=1) groupby on the DEPTH level and take the mean:
In [11]: df = pd.DataFrame(np.random.randn(4, 3), columns=[[1, 2, 3], ['d', 's', 'd']])

In [12]: df.columns.names = ['PLOT', 'DEPTH']

In [13]: df
Out[13]:
PLOT          1         2         3
DEPTH         d         s         d
0     -0.557490 -1.231495 -0.333703
1      0.513394  1.046577  0.596306
2     -0.404606 -1.615080 -0.694562
3     -0.078497 -0.683405  0.056857

In [14]: df.groupby(level='DEPTH', axis=1).mean()
Out[14]:
DEPTH         d         s
0     -0.445596 -1.231495
1      0.554850  1.046577
2     -0.549584 -1.615080
3     -0.010820 -0.683405
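
Recent pandas releases deprecate the axis=1 argument of DataFrame.groupby, so the call above may emit a warning. One possible alternative, sketched here with freshly generated random data (so the numbers will differ from the output above), is to transpose, group by the column level, and transpose back.

```python
import numpy as np
import pandas as pd

# Two-level column index with the same level names as in the answer above.
columns = pd.MultiIndex.from_arrays(
    [[1, 2, 3], ["d", "s", "d"]], names=["PLOT", "DEPTH"]
)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 3)), columns=columns)

# Transpose so the column levels become row levels, group on DEPTH,
# average, then transpose back to the original orientation.
by_depth = df.T.groupby(level="DEPTH").mean().T
print(by_depth)
```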

Pandas assign label based on index value


Tag : python , By : Nick Coats
Date : March 29 2020, 07:55 AM
I think you need a nested numpy.where with Index.isin:
df['2_name'] = np.where(df.index.isin(random_m + model_m), 'A',
               np.where(df.index.isin(random_y + model_y), 'B', 'not_assigned'))
np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10,1)), columns=['A'])
#print (df)

random_m = [0,1]
random_y =  [2,3]
model_m = [7,4]
model_y = [5,6]

print (type(random_m))
<class 'list'>

print (random_m + model_m)
[0, 1, 7, 4]

print (random_y + model_y)
[2, 3, 5, 6]

df['2_name'] = np.where(df.index.isin(random_m + model_m), 'A',
               np.where(df.index.isin(random_y + model_y), 'B', 'not_assigned'))
print (df)
   A        2_name
0  8             A
1  8             A
2  3             B
3  7             B
4  7             A
5  0             B
6  4             B
7  2             A
8  5  not_assigned
9  2  not_assigned
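
When more than two label sets are involved, nested numpy.where calls get hard to read. A sketch of the same assignment with numpy.select, reusing the lists from the example above; this is an alternative to the answer's approach, not what it originally used.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10, 1)), columns=["A"])

random_m = [0, 1]
random_y = [2, 3]
model_m = [7, 4]
model_y = [5, 6]

# Conditions are checked in order; the first one that matches wins.
conditions = [
    df.index.isin(random_m + model_m),
    df.index.isin(random_y + model_y),
]
choices = ["A", "B"]

df["2_name"] = np.select(conditions, choices, default="not_assigned")
print(df)
```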

DB2 SQL: within group, assign value of column 2 when column 1 is at maximum, to each row


Tag : development , By : user186012
Date : March 29 2020, 07:55 AM
The way I understood your question differs from the relationship column in the result you provide. Here is an example; you may have to adjust the calculation in case I misunderstood you:
SELECT group
     , year
     , age
     , factor
     , factor / first_value(factor) OVER(PARTITION BY group  ORDER BY year desc) as relationship
  FROM test_r
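
For readers who want to check the window logic outside DB2, here is a rough pandas sketch of the same calculation. The column names come from the query above; the rows are invented purely for illustration.

```python
import pandas as pd

# Hypothetical rows; only the column names are taken from the query.
test_r = pd.DataFrame({
    "group": ["g1", "g1", "g1", "g2", "g2"],
    "year": [2018, 2019, 2020, 2019, 2020],
    "age": [1, 2, 3, 1, 2],
    "factor": [2.0, 4.0, 8.0, 3.0, 6.0],
})

# first_value(factor) OVER (PARTITION BY group ORDER BY year DESC):
# sort by year descending, then take the first factor within each group.
latest_factor = (
    test_r.sort_values("year", ascending=False)
          .groupby("group")["factor"]
          .transform("first")
)

# Assignment aligns on the index, so the original row order is preserved.
test_r["relationship"] = test_r["factor"] / latest_factor
print(test_r)
```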

How to assign the label of one column to a new one based on the per-group maximum value of another column? pandas transform


Tag : python , By : James Dio
Date : December 10 2020, 07:12 AM
Set the label column as the index with DataFrame.set_index, so that DataFrameGroupBy.idxmax, called through GroupBy.transform, returns the label of each group's maximum. Because the resulting index differs from the original one, assign the result as a NumPy array:
#convert column to numeric
df['label_weight'] = df['label_weight'].astype(int)
#pandas 0.24+
df['assigned_label'] = (df.set_index('label')
                          .groupby('Id')['label_weight']
                          .transform('idxmax')
                          .to_numpy())

#pandas below 0.24
df['assigned_label'] = (df.set_index('label')
                          .groupby('Id')['label_weight']
                          .transform('idxmax')
                          .values)

print (df)
   Id  label_weight label assigned_label
0   A            30     H              H
1   A            30     H              H
2   A            30     H              H
3   A            28     M              H
4   B            29     H              M
5   B            31     M              M
6   B            31     M              M
7   B            30     L              M
8   C            26     H              L
9   C            26     H              L
10  C            28     L              L
11  C            28     L              L
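
A self-contained sketch of a variant that avoids moving label into the index: look up each group's idxmax row, then map the winning label back through Id. The data is rebuilt from the printed output above; the variable name best_rows is made up for illustration.

```python
import pandas as pd

# Sample data reconstructed from the printed output above.
df = pd.DataFrame({
    "Id": list("AAAABBBBCCCC"),
    "label_weight": [30, 30, 30, 28, 29, 31, 31, 30, 26, 26, 28, 28],
    "label": ["H", "H", "H", "M", "H", "M", "M", "L", "H", "H", "L", "L"],
})

# idxmax returns the index of the first row holding each group's maximum.
best_rows = df.loc[df.groupby("Id")["label_weight"].idxmax()]

# Build an Id -> label lookup and broadcast it to every row.
df["assigned_label"] = df["Id"].map(best_rows.set_index("Id")["label"])
print(df)
```

When several rows share the group maximum, this picks the label of the first such row.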

Assign groups based on a group and a maximum sum


Tag : sql , By : nhuser
Date : March 29 2020, 07:55 AM
I need to group rows by some columns and by a running sum until it reaches a threshold. The closest I got was a query based on this answer, but that solution is not precise enough, because the sum has to reset and restart when it reaches the threshold.
You can do this using a recursive CTE:
with tt as (
      select t.*, row_number() over (order by id) as seqnum
      from t
     ),
     cte as (
      select id, groupid, code, total, total as totaltotal, 1 as grp, tt.seqnum
      from tt
      where seqnum = 1
      union all
      select tt.id, tt.groupid, tt.code, tt.total,
             (case when cte.totaltotal + tt.total > 100 or cte.groupid <> tt.groupid or cte.code <> tt.code
                   then tt.total else totaltotal + tt.total
              end),
             (case when cte.totaltotal + tt.total > 100 or cte.groupid <> tt.groupid or cte.code <> tt.code
                   then cte.grp + 1 else cte.grp
              end),
             tt.seqnum
      from cte join
           tt
           on tt.seqnum = cte.seqnum + 1
     )
select *
from cte
order by id;
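
The reset rule in the recursive CTE can be hard to follow at a glance. A plain Python sketch of the same logic, useful for checking expected results on a small sample: the function name assign_groups and the sample rows are hypothetical, while the column names and the threshold of 100 come from the query.

```python
import pandas as pd


def assign_groups(rows, threshold=100):
    """Walk rows in id order and start a new grp whenever the running total
    would exceed the threshold or the (groupid, code) pair changes."""
    rows = rows.sort_values("id").reset_index(drop=True)
    running_totals, groups = [], []
    running, grp, prev_key = 0, 0, None
    for row in rows.itertuples(index=False):
        key = (row.groupid, row.code)
        if prev_key is None or key != prev_key or running + row.total > threshold:
            grp += 1             # open a new group
            running = row.total  # restart the running sum
        else:
            running += row.total
        prev_key = key
        running_totals.append(running)
        groups.append(grp)
    return rows.assign(totaltotal=running_totals, grp=groups)


# Hypothetical sample rows.
t = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "groupid": [1, 1, 1, 1, 2, 2],
    "code": ["a"] * 6,
    "total": [40, 50, 20, 30, 60, 50],
})
result = assign_groups(t)
print(result)
```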