import pandas as pd
%matplotlib inline
pd.options.display.max_rows = 10
df = pd.read_table('http://bit.ly/chiporders')
df.head()
Note that I used aggfunc='sum', which tells the method to sum the quantity values for each item_name.
You can pass almost any function as aggfunc, including user-defined functions, but some of the more common ones are:
- "sum" for adding the values
- "mean" for the average
- "std" for the standard deviation
- "var" for the variance
- "min" for the minimum
- "max" for the maximum
- "median" for the median
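To illustrate the point about user-defined functions, here is a minimal sketch on a small made-up table. The `orders` frame and the `quantity_range` helper are hypothetical and exist only for this example; they are not part of the Chipotle dataset.

```python
import pandas as pd

# Toy data standing in for the orders table (values are made up for illustration).
orders = pd.DataFrame({
    "item_name": ["Chips", "Chips", "Izze", "Izze", "Izze"],
    "quantity": [1, 3, 2, 1, 4],
})

def quantity_range(values):
    """Hypothetical helper: gap between the largest and smallest quantity."""
    return values.max() - values.min()

# aggfunc accepts a callable; it is applied to the quantities of each item_name.
spread = orders.pivot_table(values="quantity", index="item_name", aggfunc=quantity_range)
print(spread)
```

The function receives the group of quantity values for one item_name at a time and should return a single number.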
df.pivot_table(values="quantity", index = "item_name",aggfunc ="sum")
50 rows × 1 columns
# First, access the string methods via the str accessor and use replace() to strip the "$" sign.
# regex=False treats "$" as a literal character rather than a regex end-of-string anchor.
df.item_price = df.item_price.str.replace("$", "", regex=False)
# Now item_price can be converted to float (decimal number).
df.item_price = df.item_price.astype("float")
# To confirm the conversion worked, check the dtypes attribute to see the column types.
df.dtypes
# Sure enough, item_price is now a float.
order_id                int64
quantity                int64
item_name              object
choice_description     object
item_price            float64
dtype: object
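The same cleaning step can be tried in isolation on a few made-up price strings. This is only a sketch: the `prices` series is hypothetical, and `regex=False` is passed so that "$" is treated as a literal character whichever pandas version is installed.

```python
import pandas as pd

# Made-up price strings in the same "$x.xx" shape as item_price.
prices = pd.Series(["$2.39", "$16.98", "$10.98"])

# Strip the literal dollar sign, then convert the remaining text to float.
cleaned = prices.str.replace("$", "", regex=False).astype("float")

print(cleaned.dtype)
print(cleaned.sum())
```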
df.pivot_table(values = "item_price", index="choice_description", columns = "quantity", aggfunc = "sum")
1043 rows × 4 columns
There we have it: a pivot table of our Chipotle orders based on the choice description and the quantity ordered. We can do one more thing to make this pivot table look a little better, which is to replace the missing values (NaN) with 0.
df.pivot_table(values = "item_price", index="choice_description", columns = "quantity", aggfunc = "sum").fillna(0)
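As a side note, pivot_table() also has a fill_value parameter that fills missing cells while the table is built. The sketch below compares it with the fillna(0) approach on a tiny made-up frame (`toy` is hypothetical, not the real orders data); both leave 0 where a combination never occurs.

```python
import pandas as pd

# Made-up rows: "[Sprite]" was never ordered with quantity 2, so that cell is empty.
toy = pd.DataFrame({
    "choice_description": ["[Coke]", "[Coke]", "[Sprite]"],
    "quantity": [1, 2, 1],
    "item_price": [1.25, 2.50, 1.25],
})

# Approach used above: build the pivot table, then replace NaN with 0.
with_fillna = toy.pivot_table(
    values="item_price", index="choice_description",
    columns="quantity", aggfunc="sum",
).fillna(0)

# Alternative: let pivot_table fill the empty cells itself.
with_fill_value = toy.pivot_table(
    values="item_price", index="choice_description",
    columns="quantity", aggfunc="sum", fill_value=0,
)

print(with_fillna)
print(with_fill_value)
```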
In the cell below, we first create the pivot table with the pivot_table() method, then sort it by quantity with sort_values(), and finally call plot() to draw the result as a bar chart.
df.pivot_table(values="quantity", index = "item_name",aggfunc ="sum").sort_values("quantity").plot(kind = "bar",figsize = (18,5))