Pandas: Rotates column headers to row labels… here is a solution to the problem.
Pandas: Rotates column headers to row labels
I extract data from websites using the following method:
import pandas as pd
import matplotlib.pyplot as plt
stat_dict={'Disposals' : 0,
'Kicks' : 1,
'Marks' : 2,
'Handballs' : 3,
'Goals' : 4,
'Behinds' : 5,
'Hitouts' : 6,
'Tackles' : 7,
'Rebounds' : 8,
'Inside50s' : 9,
'Clearances': 9,
'Clangers' : 10,
'FreesFor' : 11,
'FreesAgainst' : 12,
'ContestedPosessions' : 13,
'UncontestedPosesseions' : 14,
'ContestedMarks' : 15,
'MarksInside50' : 16,
'OnePercenters' : 17,
'Bounces' : 18,
'GoalAssists' : 19,
'Timeplayed' : 20}
team_lower_case='fremantle'
player1="Fyfe, Nat"
stat_required='Disposals'
rounds=8
tables = pd.read_html("https://afltables.com/afl/stats/teams/" +str(team_lower_case)+"/2018_gbg.html")
for df in tables:
df.drop(df.columns[rounds+1:], axis=1, inplace=True) # remove unwanted columns
df.columns = df.columns.droplevel(0) # remove extra index level
stat_table=tables[stat_dict[stat_required]]
player_stat=stat_table[stat_table["Player"]==player1]
Produces the following:
Player R1 R2 R3 R4 R5 R6 R7 R8
8 Fyfe, Nat 22.0 29.0 38.0 25.0 43.0 27.0 33.0 36.0
How do I index data in the Player column?
How to rotate column headers to row labels or vice versa?
The output I’m looking for is as follows:
Round Fyfe, Nat Neale,Lachie
R1 22 37
R2 29 28
The final output I want is a scatterplot that plots column headers on the x-axis, e.g. R1, R2, etc… and row data on the y-axis.
I
feel like I should be able to draw the data frame directly, but the only way I can succeed is to do the following:
for round, disp in player_stat.iterrows():
player1_list=[]
player1_list.append(disp)
plt.style.use('ggplot')
plt.scatter(range(1,rounds+1), player1_list, label=player1)
plt.legend(loc="lower right")
plt.title("Disposals per round")
plt.xlabel("Rounds")
plt.ylabel("Disposals")
plt.ylim(ymin=0)
Looks like I should use .transpose, which is close to what I want and outputs the following:
8
Player Fyfe, Nat
R1 22
R2 29
R3 38
R4 25
R5 43
R6 27
R7 33
R8 36
Solution
Use > set_index Transpose via T
transpose:
stat_table=tables[stat_dict[stat_required]].set_index('Player'). T
print (stat_table)
Player Ballantyne, Hayden Banfield, Bailey Blakely, Connor \
R1 10.0 12.0 17.0
R2 11.0 14.0 30.0
R3 18.0 11.0 21.0
R4 9.0 17.0 16.0
R5 14.0 18.0 23.0
R6 6.0 14.0 31.0
R7 14.0 20.0 21.0
R8 11.0 12.0 35.0
Player Brayshaw, Andrew Cerra, Adam Cox, Brennan Crowden, Mitch \
R1 12.0 NaN NaN NaN
R2 16.0 9.0 NaN 11.0
R3 7.0 10.0 NaN 13.0
R4 11.0 7.0 NaN 15.0
R5 17.0 15.0 NaN 17.0
R6 14.0 11.0 NaN 11.0
R7 14.0 18.0 8.0 8.0
R8 16.0 14.0 12.0 10.0
Player Duman, Taylin Fyfe, Nat Hamling, Joel ... Pearce, Danyle \
R1 NaN 22.0 NaN ... 12.0
R2 NaN 29.0 12.0 ... NaN
R3 NaN 38.0 11.0 ... NaN
R4 NaN 25.0 16.0 ... NaN
R5 15.0 43.0 11.0 ... NaN
R6 12.0 27.0 9.0 ... NaN
R7 8.0 33.0 12.0 ... NaN
R8 15.0 36.0 18.0 ... 18.0
Player Ryan, Luke Sandilands, Aaron Sheridan, Tom Sutcliffe, Cameron \
R1 16.0 16.0 NaN 17.0
R2 22.0 10.0 NaN NaN
R3 21.0 14.0 14.0 NaN
R4 14.0 15.0 11.0 NaN
R5 21.0 9.0 NaN NaN
R6 18.0 10.0 NaN NaN
R7 13.0 11.0 NaN NaN
R8 27.0 13.0 NaN NaN
Player Taberner, Matthew Tucker, Darcy Walters, Michael Wilson, Nathan \
R1 20.0 12.0 18.0 15.0
R2 16.0 NaN 26.0 23.0
R3 18.0 NaN 23.0 18.0
R4 19.0 15.0 21.0 14.0
R5 6.0 17.0 18.0 27.0
R6 NaN 17.0 2.0 15.0
R7 NaN 15.0 NaN 19.0
R8 NaN 13.0 NaN NaN
Player Totals
R1 358.0
R2 399.0
R3 387.0
R4 362.0
R5 407.0
R6 356.0
R7 346.0
R8 404.0
[8 rows x 32 columns]
You can then select the column via player1
:
player_stat=stat_table[player1]
print (player_stat)
R1 22.0
R2 29.0
R3 38.0
R4 25.0
R5 43.0
R6 27.0
R7 33.0
R8 36.0
Name: Fyfe, Nat, dtype: float64
Last plot
:
plt.scatter((range(1,rounds+1)), stat_table[player1])