How to merge a multi-row header of a pandas DataFrame into a single-cell header?
Tag : python , By : Moe Skeeto
Date : March 29 2020, 07:55 AM
You can first select the header rows with loc, replace NaN with empty strings using fillna, and then apply join to each column. If necessary, remove leading and trailing whitespace with str.strip, and then drop the header rows by selecting df.loc[10:]:
df.columns = df.loc[5:9].fillna('').apply(' '.join).str.strip()
# if a monotonic index (0, 1, 2, ...) is needed, add reset_index
print (df.loc[10:].reset_index(drop=True))
Planting date YYYY.DDD Harvest date YYYY.DDD(kg/ha) Yield YYYY.DDD \
0 1999.26 2000.21 5669.46
1 2000.27 2001.22 10282.5
2 2001.27 2002.22 8210.09
Flowering date YYYY.DDD Maturity date YYYY.DDD Maturity date YYYY.DDD \
0 2000.14 2000.19 2000.19
1 2001.15 2001.2 2001.2
2 2002.15 2002.2 2002.2
Maturity date (kg/ha) Above ground biomass
0 2000.19 11626.7
1 2001.2 20565
2 2002.2 16509
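The steps above can be tried on a small self-contained frame. This is only a sketch: the frame `raw`, its three header rows, and the row labels 0 to 2 are made up for illustration (the answer's own data uses rows 5 to 9), so adjust the slices to wherever the header rows sit in your data.

```python
import pandas as pd

# Hypothetical input: the first three rows hold pieces of the header,
# the remaining rows hold the data.
raw = pd.DataFrame([
    ["Planting", "Harvest", None],
    ["date", "date", "Yield"],
    ["YYYY.DDD", "YYYY.DDD", "(kg/ha)"],
    [1999.26, 2000.21, 5669.46],
    [2000.27, 2001.22, 10282.5],
])

# Join the header rows column by column; missing pieces become empty
# strings, and strip() removes the resulting leading/trailing spaces.
header = raw.loc[0:2].fillna("").apply(" ".join).str.strip()

# Keep only the data rows and renumber them from 0.
data = raw.loc[3:].reset_index(drop=True)
data.columns = list(header)

print(data)
```

Note that `loc` slices by label and includes both endpoints, so `raw.loc[0:2]` selects three rows.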
|
Create a DataFrame as a subset of another DataFrame, with extra columns holding blank values, using a lookup table for the header
Date : March 29 2020, 07:55 AM
Use rename and reindex:
In [124]: (df1.rename(columns=df2.set_index('Header2')['Header1'])
             .reindex(columns=df2['Header1'].values))
Out[124]:
ticket JOB_ID virtual
0 4 abc NaN
1 6 cde NaN
2 7 kde NaN
3 8 mde NaN
In [125]: df1
Out[125]:
id job fname lname
0 4 abc james frank
1 6 cde bob altin
2 7 kde kevin mchon
3 8 mde george fndes
In [126]: df2
Out[126]:
Header1 Header2
0 ticket id
1 JOB_ID job
2 virtual NaN
In [127]: df2.set_index('Header2')['Header1']
Out[127]:
Header2
id ticket
job JOB_ID
NaN virtual
Name: Header1, dtype: object
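For readers who want to run this outside an IPython session, here is a minimal sketch that rebuilds shortened versions of df1 and df2 from the output shown above. The intermediate name `mapping` is introduced here for illustration, and dropping the NaN lookup row before building the dictionary is an extra precaution that the answer itself does not take.

```python
import pandas as pd

# Shortened copies of the frames shown above.
df1 = pd.DataFrame({
    "id": [4, 6],
    "job": ["abc", "cde"],
    "fname": ["james", "bob"],
})
df2 = pd.DataFrame({
    "Header1": ["ticket", "JOB_ID", "virtual"],
    "Header2": ["id", "job", None],
})

# Old name -> new name, skipping lookup rows that have no source column.
mapping = df2.dropna(subset=["Header2"]).set_index("Header2")["Header1"].to_dict()

# rename() relabels the matching columns; reindex() keeps only the
# columns listed in Header1 and creates the missing ones filled with NaN.
out = df1.rename(columns=mapping).reindex(columns=df2["Header1"].tolist())

print(out)
```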
|
In a multi header dataframe, how to select a column with one of the tuple values of the header, and the rest being anyth
Tag : python , By : Novi Indrayani
Date : March 29 2020, 07:55 AM
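I don't know if this is the "best" solution, but what ended up working for me was to first sort the headers:
df.sort_index(axis=1, inplace=True)
and then select with one slice per header level, leaving the levels that can be anything as slice(None):
df.loc[:, (slice(None), slice(None), "MyHeader3")]
Alternatively, a list comprehension keeps every column whose header tuple contains the value (here 'header3'):
df[[x for x in df.columns if 'header3' in x]]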
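Both selections can be compared on a toy frame. This is a sketch under assumptions: the three-level columns and the labels "A", "B", "x", "y" and "Other" are invented for the example, and `pd.IndexSlice` is shown as an equivalent spelling of the `slice(None)` tuple rather than something the answer uses.

```python
import pandas as pd

# Hypothetical frame with a three-level column header.
cols = pd.MultiIndex.from_tuples([
    ("A", "x", "MyHeader3"),
    ("A", "y", "Other"),
    ("B", "x", "MyHeader3"),
])
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=cols)

# Slicing a MultiIndex with .loc works reliably on a sorted index.
df = df.sort_index(axis=1)

# Any value on levels 0 and 1, "MyHeader3" on level 2.
by_slices = df.loc[:, (slice(None), slice(None), "MyHeader3")]

# Same selection written with pd.IndexSlice.
by_index_slice = df.loc[:, pd.IndexSlice[:, :, "MyHeader3"]]

# Membership test on each column tuple; matches the label on any level.
by_membership = df[[c for c in df.columns if "MyHeader3" in c]]

print(by_slices)
```

The membership test differs from the slice-based forms in one respect: it also matches columns where the label appears on another level, which may or may not be what you want.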
|
Pandas: rename a DataFrame's multilevel header according to the name of the first-level header
Tag : python , By : Sergio Rudenko
Date : March 29 2020, 07:55 AM
I cannot find a function that does this directly, so flatten the columns to plain tuples, rename the tuple you need, and then rebuild the MultiIndex:
df.columns=df.columns.values
df
Out[110]:
(X, a) (X, b) (Y, a) (Y, b)
0 1 3 4 2
1 5 7 8 6
df.rename(columns={('Y', 'b'):('Y', 'b1')})
Out[111]:
(X, a) (X, b) (Y, a) (Y, b1)
0 1 3 4 2
1 5 7 8 6
df=df.rename(columns={('Y', 'b'):('Y', 'b1')})
df.columns=pd.MultiIndex.from_tuples(df.columns)
df
Out[114]:
X Y
a b a b1
0 1 3 4 2
1 5 7 8 6
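A variation on the same idea, sketched here as an assumption rather than as the answer's method, skips the flattening step: build the new list of tuples directly and pass it to `pd.MultiIndex.from_tuples`. The frame below reproduces the one shown above; the name `renamed` is introduced for illustration.

```python
import pandas as pd

# Frame with the same two-level header as in the output above.
df = pd.DataFrame(
    [[1, 3, 4, 2], [5, 7, 8, 6]],
    columns=pd.MultiIndex.from_product([["X", "Y"], ["a", "b"]]),
)

# Replace only the ('Y', 'b') tuple; every other column tuple is kept.
new_tuples = [("Y", "b1") if col == ("Y", "b") else col for col in df.columns]

renamed = df.copy()
renamed.columns = pd.MultiIndex.from_tuples(new_tuples)

print(renamed)
```

Because the condition compares the whole tuple, the 'b' under 'X' is left alone, which is the point of renaming according to the first-level name.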
|
Convert one level mixed header dataframe to vertical dataframe in Pandas
Date : March 29 2020, 07:55 AM
You can first move the columns whose names do not contain : into the index with DataFrame.set_index, then create a MultiIndex in the columns by splitting the remaining names on : with str.split, and finally reshape with DataFrame.stack:
df = df.set_index('date')
df.columns = df.columns.str.split(':', expand=True)
df = df.stack().rename_axis(('date','district')).reset_index()
print (df)
date district price ratio
0 2017 cy 14 0.3
1 2017 dc 12 0.1
2 2017 xc 11 0.1
3 2018 cy 15 0.6
4 2018 dc 14 0.2
5 2018 xc 12 0.7
6 2019 cy 16 0.8
7 2019 dc 13 0.5
8 2019 xc 13 -0.2
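To keep the districts in their original column order instead of alphabetical order, start again from the original df and convert the second column level to an ordered CategoricalIndex before stacking:
df = df.set_index('date')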
df.columns = df.columns.str.split(':', expand=True)
lvl = pd.CategoricalIndex(df.columns.levels[1],
ordered=True,
categories=df.columns.get_level_values(1).drop_duplicates())
df.columns = df.columns.set_levels(lvl, level=1)
df = df.stack().sort_index(level=[1,0]).rename_axis(('date','district')).reset_index()
print (df)
date district price ratio
0 2017 dc 12 0.1
1 2018 dc 14 0.2
2 2019 dc 13 0.5
3 2017 xc 11 0.1
4 2018 xc 12 0.7
5 2019 xc 13 -0.2
6 2017 cy 14 0.3
7 2018 cy 15 0.6
8 2019 cy 16 0.8
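The first variant can be reproduced end to end with a small input. This is a sketch: the wide frame below is a reduced guess at the question's data (two dates, two districts), and the names `wide` and `long_df` are introduced here. Recent pandas versions may emit a FutureWarning from `stack()` about its new implementation, and the row order of the result can differ between versions.

```python
import pandas as pd

# Hypothetical wide input: one plain column plus "measure:district" columns.
df = pd.DataFrame({
    "date": [2017, 2018],
    "price:cy": [14, 15],
    "price:dc": [12, 14],
    "ratio:cy": [0.3, 0.6],
    "ratio:dc": [0.1, 0.2],
})

# Move the plain column into the index so only "a:b" names remain.
wide = df.set_index("date")

# Split "price:cy" into ("price", "cy"); expand=True yields a MultiIndex.
wide.columns = wide.columns.str.split(":", expand=True)

# stack() moves the last column level (the district) into the row index.
long_df = wide.stack().rename_axis(["date", "district"]).reset_index()

print(long_df)
```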
|