
Read data with NAs into python and calculate mean row-wise



Read data with NAs into python and calculate mean row-wise
Tag : python , By : Rb.Ridge
Date : November 23 2020, 01:01 AM

There are a number of different ways you could read the data from the file using NumPy. Here's one way, using np.genfromtxt. The names in the first column become NumPy nan values, as do any other non-float strings in the file:
>>> arr = np.genfromtxt(input_file, delimiter=';', dtype=np.float64)
>>> arr
array([[             nan,   1.60000000e+01,   3.03125000e+01,
          6.77830307e-03,   4.91988890e-04,   2.79672875e-01,
          3.71057514e-03,   6.67111408e-04,   1.77896375e-03],
       [             nan,   6.00000000e+00,   3.35000000e+01,
          3.29180051e-02,   3.12809941e-03,   3.08224812e-01,
          1.24857680e-02,   6.44874361e-03,   6.67111408e-04],
       [             nan,   1.00000000e+00,              nan,
                     nan,              nan,              nan,
                     nan,              nan,              nan]])
>>> np.nanmean(arr, axis=1)
array([ 5.82569998,  4.98298407,  1.        ])


How to calculate an element-wise quotient of two data frames?


Tag : r , By : David Marchant
Date : March 29 2020, 07:55 AM
Simple: since division is element-wise in R, just do A/B:
R> C <- A/B
R> C
  x   y       z
1 1 2.0 2.33333
2 2 2.5 2.66667
3 3 3.0 3.00000
R> 
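For readers working in pandas rather than R, the same element-wise division applies (a sketch with hypothetical frames chosen to reproduce the R result above):

```python
import pandas as pd

# Hypothetical frames matching the R example (C <- A/B).
A = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})
B = pd.DataFrame({"x": [1, 1, 1], "y": [2, 2, 2], "z": [3, 3, 3]})

# Division aligns on index and columns and is element-wise.
C = A / B
print(C)
```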

In Python, calculate the softmax of an array column-wise using numpy


Tag : python , By : Tom
Date : March 29 2020, 07:55 AM
The reason for the "zeros" lies in the data type of the input, which is of integer type. Converting the input to float solves the problem:
import numpy as np

#scores=np.array([1.0,2.0,3.0])

scores=np.array([[1,2,3,6],
                [2,4,5,6],
                [3,8,7,6]])

def softmax(x):
    x=x.astype(float)
    if x.ndim==1:
        S=np.sum(np.exp(x))
        return np.exp(x)/S
    elif x.ndim==2:
        result=np.zeros_like(x)
        M,N=x.shape
        for n in range(N):
            S=np.sum(np.exp(x[:,n]))
            result[:,n]=np.exp(x[:,n])/S
        return result
    else:
        print("The input array is not 1- or 2-dimensional.")

s=softmax(scores)
print(s)
[[ 0.09003057  0.00242826  0.01587624  0.33333333]
 [ 0.24472847  0.01794253  0.11731043  0.33333333]
 [ 0.66524096  0.97962921  0.86681333  0.33333333]]
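A vectorized alternative (a sketch, not part of the original answer): since np.exp promotes integer input to float, the explicit astype is unnecessary, and the column loop can be replaced by broadcasting. Subtracting the per-column maximum is a standard numerical-stability trick that does not change the result:

```python
import numpy as np

scores = np.array([[1, 2, 3, 6],
                   [2, 4, 5, 6],
                   [3, 8, 7, 6]])

def softmax_cols(x):
    # np.exp returns float even for int input, so no astype is needed.
    # Subtracting the per-column max avoids overflow for large scores.
    e = np.exp(x - x.max(axis=0))
    return e / e.sum(axis=0)

s = softmax_cols(scores)
print(s)  # each column sums to 1
```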

Calculate ratio row wise in a data frame


Tag : r , By : Myatus
Date : March 29 2020, 07:55 AM
As / is vectorized, we can do this without looping:
df[7:9,-1]/df[4:6, -1]
#      Var1      Var2     Var3 
#7  0.57692308 1.0444444 1.888889
#8  0.08988764 0.4677419 1.548387
#9  0.61797753 2.6363636 0.281250
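A pandas sketch of the same idea (the frame is hypothetical; note that the R indices 7:9 and 4:6 are 1-based and inclusive):

```python
import numpy as np
import pandas as pd

# Hypothetical 9-row frame; the first column is non-numeric, as in the
# R example, where df[7:9, -1] / df[4:6, -1] drops the first column.
df = pd.DataFrame({"grp": list("abcdefghi"),
                   "Var1": np.arange(1.0, 10.0),
                   "Var2": np.arange(10.0, 19.0)})

# R rows 7:9 over rows 4:6 become positions 6:9 over 3:6 here.
# .to_numpy() avoids pandas index alignment during the division.
ratio = df.iloc[6:9, 1:].to_numpy() / df.iloc[3:6, 1:].to_numpy()
print(ratio)
```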

How to calculate quarterly wise churn and retention rate using python


Tag : python , By : Val
Date : March 29 2020, 07:55 AM
To calculate quarterly churn and retention rate from a date column, group the data by year and quarter. Start with sample data:
import pandas as pd

out = pd.DataFrame({
    'Date': pd.to_datetime(['2015-01-01', '2015-05-01', '2015-07-01',
                            '2015-10-01', '2015-04-01', '2015-12-01',
                            '2016-01-01', '2016-02-01', '2015-05-01',
                            '2015-10-01']),
    'Churn': ['Yes'] * 8 + ['No'] * 2})
print(out)
  Churn       Date
0   Yes 2015-01-01
1   Yes 2015-05-01
2   Yes 2015-07-01
3   Yes 2015-10-01
4   Yes 2015-04-01
5   Yes 2015-12-01
6   Yes 2016-01-01
7   Yes 2016-02-01
8    No 2015-05-01
9    No 2015-10-01
df = (out.loc[out['Churn'] == 'Yes']
         .groupby([out["Date"].dt.year,out["Date"].dt.quarter])["Churn"]
         .count()
         .rename_axis(('year','quarter'))
         .reset_index(name='count'))

print(df)
   year  quarter  count
0  2015        1      1
1  2015        2      2
2  2015        3      1
3  2015        4      2
4  2016        1      2
dfs = dict(tuple(out.groupby(out['Date'].dt.year)))
print (dfs)
{2016:   Churn       Date
6   Yes 2016-01-01
7   Yes 2016-02-01, 2015:   Churn       Date
0   Yes 2015-01-01
1   Yes 2015-05-01
2   Yes 2015-07-01
3   Yes 2015-10-01
4   Yes 2015-04-01
5   Yes 2015-12-01
8    No 2015-05-01
9    No 2015-10-01}

print (dfs.keys())
dict_keys([2016, 2015])

print (dfs[2015])
  Churn       Date
0   Yes 2015-01-01
1   Yes 2015-05-01
2   Yes 2015-07-01
3   Yes 2015-10-01
4   Yes 2015-04-01
5   Yes 2015-12-01
8    No 2015-05-01
9    No 2015-10-01


The tenure column looks like this:

out["tenure"].unique() 
Out[14]: 
array([ 8, 15, 32,  9, 48, 58, 10, 29,  1, 66, 24, 68,  4, 53,  6, 20, 52,
       49, 71,  2, 65, 67, 27, 18, 47, 45, 43, 59, 13, 17, 72, 61, 34, 11,
       35, 69, 63, 30, 19, 39,  3, 46, 54, 36, 12, 41, 50, 40, 28, 44, 51,
       33, 21, 70, 23, 16, 56, 14, 62,  7, 25, 31, 60,  5, 42, 22, 37, 64,
       57, 38, 26, 55])
like 1 to 18  --> range 1
     19 to 36 --> range 2
     37 to 54 --> range 3, and so on
# total_churn was undefined in the original snippet; here it is the
# total number of churned customers.
total_churn = (out["Churn"] == "Yes").sum()

quarterly_churn_yes = (out.loc[out["Churn"] == "Yes"]
                          .groupby([out["Date"].dt.year, out["Date"].dt.quarter])
                          .count()
                          .rename_axis(("year", "quarter")))
quarterly_churn_yes["Churn"]

quarterly_churn_rate = quarterly_churn_yes["Churn"] / total_churn
print(quarterly_churn_rate)
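The range bucketing described above (1 to 18 -> range 1, 19 to 36 -> range 2, ...) can be sketched with pd.cut; the sample tenure values below are hypothetical:

```python
import pandas as pd

# Hypothetical tenure values like those in the array above (1-72 months).
tenure = pd.Series([8, 15, 32, 9, 48, 58, 72])

# Bins: 1-18 -> 1, 19-36 -> 2, 37-54 -> 3, 55-72 -> 4.
tenure_range = pd.cut(tenure, bins=[0, 18, 36, 54, 72], labels=[1, 2, 3, 4])
print(tenure_range.tolist())
```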

How to calculate a row-wise count of duplicates based on (element-wise) selected adjacent columns


Tag : r , By : inquiringmind
Date : March 29 2020, 07:55 AM
A base R solution, assuming you have an equal number of "conf" and "chall" columns:
#Find indexes of "conf" column
conf_col <- grep("conf", names(test))

#Find indexes of "chall" column
chall_col <- grep("chall", names(test))

#compare element wise and take row wise sum
test$Final <- rowSums(test[conf_col] == test[chall_col])


test
#  group userID A_conf A_chall B_conf B_chall Final
#1     1    220      1       1      1       2     1
#2     1    222      4       6      4       4     1
#3     2    223      6       5      3       2     0
#4     1    224      1       5      4       4     1
#5     2    228      4       4      4       4     2
Or, as a one-liner:
rowSums(test[grep("conf", names(test))] == test[grep("chall", names(test))])
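A pandas sketch of the same approach, using a frame that mirrors the R data above:

```python
import pandas as pd

# Hypothetical frame mirroring the R example above.
test = pd.DataFrame({
    "group":  [1, 1, 2, 1, 2],
    "userID": [220, 222, 223, 224, 228],
    "A_conf": [1, 4, 6, 1, 4], "A_chall": [1, 6, 5, 5, 4],
    "B_conf": [1, 4, 3, 4, 4], "B_chall": [2, 4, 2, 4, 4],
})

conf = test.filter(like="conf")
chall = test.filter(like="chall")

# Compare positionally (not by label) and count matches per row.
test["Final"] = (conf.to_numpy() == chall.to_numpy()).sum(axis=1)
print(test["Final"].tolist())
```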