
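Pandas is an essential library for anyone starting out with data analysis in Python. This article collects ten sets of exercises that help readers get hands-on with Python code and work through exploring a dataset.

The content was translated and compiled from GitHub by Kesci (科賽網). Readers are advised to complete Kesci's tutorials "Getting Started with Key Python Code from Scratch" and "Pandas Basic Commands Cheat Sheet" before debugging and studying the code in this tutorial.

[Tip: the datasets used in this article can be downloaded from: DATA | TRAIN practice datasets]

Previous part: 10 Exercise Sets to Learn Data Analysis with Pandas [1-5]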

Exercise 6 - Statistics

Exploring wind speed data

Dataset: wind.data

Step 1 Import the necessary libraries

# Run the following code
import pandas as pd
import datetime

Step 2 Import the data from the following path

# Run the following code
path6 = "../input/pandas_exercise/exercise_data/wind.data"  # wind.data

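Step 3 Store the data and set the first three columns as an appropriate index

# Run the following code
# the raw string r"\s+" splits on runs of whitespace; parse_dates = [[0,1,2]]
# merges the first three columns into a single 'Yr_Mo_Dy' date column
# (this nested-list form is deprecated in recent pandas releases)
data = pd.read_table(path6, sep = r"\s+", parse_dates = [[0,1,2]])
data.head()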
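Because the nested-list form of parse_dates is deprecated in recent pandas releases, it may help to see the date column assembled explicitly. The following is only a sketch of one alternative, not the original author's method. It reads a hypothetical two-row inline sample standing in for wind.data, assumes the header names Yr, Mo and Dy (inferred from the 'Yr_Mo_Dy' column name), and assumes every two-digit year belongs to the 1900s.

```python
import io

import pandas as pd

# Hypothetical inline sample standing in for wind.data (values are illustrative).
raw = io.StringIO(
    "Yr Mo Dy RPT VAL\n"
    "61 1 1 15.04 14.96\n"
    "61 1 2 14.71 16.88\n"
)
df = pd.read_csv(raw, sep=r"\s+")

# Assumption: all two-digit years are in the 1900s, so add 1900 before converting.
parts = pd.DataFrame({"year": df["Yr"] + 1900, "month": df["Mo"], "day": df["Dy"]})

# pd.to_datetime accepts a DataFrame with year/month/day columns.
df.insert(0, "Yr_Mo_Dy", pd.to_datetime(parts))
df = df.drop(columns=["Yr", "Mo", "Dy"])
print(df)
```

With this approach the years are already in the right century, so no later correction is needed for this sample.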

Step 4 The year 2061? Do we really have data for that year? Create a function and use it to fix this bug

# Run the following code
def fix_century(x):
    # two-digit years were parsed into the 2000s; move any year after 1989 back a century
    year = x.year - 100 if x.year > 1989 else x.year
    return datetime.date(year, x.month, x.day)

# apply the function fix_century to the column and replace the values with the corrected ones
data['Yr_Mo_Dy'] = data['Yr_Mo_Dy'].apply(fix_century)

# data.info()
data.head()
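The same century correction can also be written without a row-by-row function. The sketch below is a possible vectorized variant on a small hypothetical Series of dates (the values are made up for illustration), keeping the same rule that any year after 1989 is moved back by 100 years.

```python
import pandas as pd

# Hypothetical sample: two-digit years 61 and 78 were parsed as 2061 and 2078.
dates = pd.Series(pd.to_datetime(["2061-01-01", "2078-12-31", "1975-06-15"]))

# Keep dates whose year is 1989 or earlier; replace the others with the date 100 years before.
fixed = dates.where(dates.dt.year <= 1989, dates - pd.DateOffset(years=100))
print(fixed.dt.year.tolist())
```

The result keeps a datetime dtype, so no separate conversion back from datetime.date objects is required in this variant.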

Step 5 Set the date as the index; note that its data type should be datetime64[ns]

# Run the following code
# convert Yr_Mo_Dy to the datetime64 type
data["Yr_Mo_Dy"] = pd.to_datetime(data["Yr_Mo_Dy"])

# set 'Yr_Mo_Dy' as the index
data = data.set_index('Yr_Mo_Dy')
data.head()
# data.info()

Step 6 For each location, how many values are missing?

# Run the following code
data.isnull().sum()

Step 7 For each location, how many non-missing values are there?

# Run the following code
data.shape[0] - data.isnull().sum()
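The relationship between the missing and non-missing counts can be checked on a tiny example. This sketch uses a hypothetical two-station frame with one missing reading and shows that count() gives the same per-column result as subtracting the null count from the number of rows.

```python
import numpy as np
import pandas as pd

# Hypothetical two-station sample with one missing reading in RPT.
sample = pd.DataFrame({"RPT": [15.04, np.nan, 18.5], "VAL": [14.96, 13.17, 16.88]})

missing = sample.isnull().sum()   # missing values per column
complete = sample.count()         # non-missing values per column
print(missing.to_dict(), complete.to_dict())
```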

Step 8 Compute the mean wind speed over the whole dataset

# Run the following code

data.mean().mean()

out[298]:

10.227982360836924

Step 9 Create a DataFrame named loc_stats to compute and store the minimum, maximum, mean and standard deviation of the wind speed at each location

# Run the following code
loc_stats = pd.DataFrame()
loc_stats['min'] = data.min() # min
loc_stats['max'] = data.max() # max
loc_stats['mean'] = data.mean() # mean
loc_stats['std'] = data.std() # standard deviations
loc_stats
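The four per-location statistics can also be gathered in a single call. The sketch below is one possible shorthand using DataFrame.agg on a hypothetical two-station frame; the transpose puts one location per row, matching the layout built column by column above.

```python
import pandas as pd

# Hypothetical wind speeds for two stations over three days.
sample = pd.DataFrame({"RPT": [1.0, 2.0, 3.0], "VAL": [2.0, 4.0, 6.0]})

# agg with a list of function names returns one row per statistic; .T flips it
# so each location becomes a row and each statistic a column.
stats = sample.agg(["min", "max", "mean", "std"]).T
print(stats)
```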

Step 10 Create a DataFrame named day_stats to compute and store, for each day, the minimum, maximum, mean and standard deviation of the wind speed across all locations

# Run the following code
# create the dataframe
day_stats = pd.DataFrame()

# this time we set axis equal to 1 so each statistic is computed per row
day_stats['min'] = data.min(axis = 1) # min
day_stats['max'] = data.max(axis = 1) # max
day_stats['mean'] = data.mean(axis = 1) # mean
day_stats['std'] = data.std(axis = 1) # standard deviations
day_stats.head()

Step 11 For each location, compute the mean wind speed in January

(Note: January 1961 and January 1962 should be treated separately)

# Run the following code
# creates a new column 'date' and gets the values from the index
data['date'] = data.index

# creates a column for each component of the date
data['month'] = data['date'].apply(lambda date: date.month)
data['year'] = data['date'].apply(lambda date: date.year)
data['day'] = data['date'].apply(lambda date: date.day)

# gets all rows from month 1 and assigns them to january_winds
january_winds = data.query('month == 1')

# gets the mean from january_winds, using .loc to leave out the mean of month, year and day
january_winds.loc[:,'RPT':"MAL"].mean()
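The note in this step asks for each year's January to be treated separately. One way to get a separate January mean per year is to group the January rows by year. The sketch below illustrates this on hypothetical readings for a single station; it is an illustration of the idea, not the original solution.

```python
import pandas as pd

# Hypothetical daily readings for one station, covering two Januaries and one February day.
idx = pd.to_datetime(
    ["1961-01-01", "1961-01-02", "1961-02-01", "1962-01-01", "1962-01-02"]
)
sample = pd.DataFrame({"RPT": [10.0, 20.0, 99.0, 30.0, 50.0]}, index=idx)

# Keep only January rows, then average them year by year.
january = sample[sample.index.month == 1]
january_by_year = january.groupby(january.index.year).mean()
print(january_by_year)
```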

Step 12 Sample the records at a yearly frequency

# Run the following code
data.query('month == 1 and day == 1')

Step 13 Sample the records at a monthly frequency

# Run the following code
data.query('day == 1')
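The yearly and monthly selections can also be made directly from a DatetimeIndex, without helper columns for month and day. The sketch below assumes a hypothetical daily index and uses the is_year_start and is_month_start attributes to pick the first day of each year and month.

```python
import pandas as pd

# Hypothetical daily series from 1961-01-01 to 1962-03-01.
idx = pd.date_range("1961-01-01", "1962-03-01", freq="D")
sample = pd.DataFrame({"RPT": range(len(idx))}, index=idx)

yearly = sample[sample.index.is_year_start]    # rows dated 1 January
monthly = sample[sample.index.is_month_start]  # rows dated the 1st of any month
print(len(yearly), len(monthly))
```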

Exercise 7 - Visualization

Exploring the Titanic disaster data

Dataset: train.csv

Step 1 Import the necessary libraries

# Run the following code
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

%matplotlib inline

Step 2 Import the data from the following path

# Run the following code
path7 = '../input/pandas_exercise/exercise_data/train.csv'  # train.csv

Step 3 Name the DataFrame titanic

# Run the following code
titanic = pd.read_csv(path7)
titanic.head()

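Step 4 Set PassengerId as the index

# Run the following code
# set_index returns a new DataFrame, so assign it back to keep the new index
titanic = titanic.set_index('PassengerId')
titanic.head()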
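It is worth noting that set_index returns a new DataFrame and leaves the original untouched unless the result is assigned. The sketch below shows this on a hypothetical three-passenger frame.

```python
import pandas as pd

# Hypothetical three-passenger sample.
sample = pd.DataFrame({"PassengerId": [1, 2, 3], "Survived": [0, 1, 1]})

# Calling set_index without assignment only produces a temporary result.
sample.set_index("PassengerId").head()

# Assigning the result keeps the new index.
indexed = sample.set_index("PassengerId")
print(indexed)
```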

Step 5 Draw a pie chart showing the proportion of male and female passengers

# Run the following code
# sum the instances of males and females
males = (titanic['Sex'] == 'male').sum()
females = (titanic['Sex'] == 'female').sum()

# put them into a list called proportions
proportions = [males, females]

# Create a pie chart
plt.pie(
    # using proportions
    proportions,
    # with one label per sex
    labels = ['Males', 'Females'],
    # with no shadows
    shadow = False,
    # with colors
    colors = ['blue','red'],
    # with one slice exploded out
    explode = (0.15 , 0),
    # with the start angle at 90 degrees
    startangle = 90,
    # with the share of each slice printed as a percentage
    autopct = '%1.1f%%'
    )

# Use an equal aspect ratio so the pie is drawn as a circle
plt.axis('equal')

# Set the title
plt.title("Sex Proportion")

# View the plot
plt.tight_layout()
plt.show()
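The counts that feed the pie chart can also be obtained with value_counts, which avoids one comparison per category. The sketch below uses a hypothetical four-passenger 'Sex' column and also shows the normalized shares that autopct would print as percentages.

```python
import pandas as pd

# Hypothetical 'Sex' column with three males and one female.
sex = pd.Series(["male", "female", "male", "male"], name="Sex")

counts = sex.value_counts()                 # absolute counts per category
shares = sex.value_counts(normalize=True)   # fractions that sum to 1
print(counts.to_dict(), shares.to_dict())
```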

Step 6 Draw a scatter plot of the ticket Fare against passenger age, colored by sex

# Run the following code
# creates the plot; fit_reg=False leaves out the regression line
lm = sns.lmplot(x = 'Age', y = 'Fare', data = titanic, hue = 'Sex', fit_reg=False)

# set title
lm.set(title = 'Fare x Age')

# get the axes object and tweak it
axes = lm.axes
axes[0,0].set_ylim(-5,)
axes[0,0].set_xlim(-5,85)

out[309]:

(-5, 85)

Step 7 How many people survived?

# Run the following code

titanic.Survived.sum()

out[310]:

342

Step 8 Draw a histogram of the ticket fares

# Run the following code
# sort the fares from the highest to the lowest value
df = titanic.Fare.sort_values(ascending = False)

# create the bin edges using numpy
binsVal = np.arange(0,600,10)
binsVal

# create the plot
plt.hist(df, bins = binsVal)

# Set the title and labels
plt.xlabel('Fare')
plt.ylabel('Frequency')
plt.title('Fare Paid Histogram')

# show the plot
plt.show()
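To see what the bin edges from np.arange(0, 600, 10) do to the data, the counts per bin can be computed without plotting. The sketch below applies np.histogram to five hypothetical fares; the fare values are made up for illustration.

```python
import numpy as np

# Hypothetical fares; three fall below 10, one in the 70s and one above 500.
fares = np.array([7.25, 71.28, 8.05, 512.33, 9.5])

# Edges 0, 10, ..., 590 give 59 bins, each 10 units wide.
bins = np.arange(0, 600, 10)
counts, edges = np.histogram(fares, bins=bins)
print(counts[0], counts[7], counts[51])
```

Any fare at or above the last edge (590 here) would fall outside the histogram, so the upper limit should be chosen with the largest fare in mind.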

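Exercise 8 - Creating DataFrames

Exploring the Pokemon data

Dataset: data entered by hand within the exercise

Step 1 Import the necessary libraries

# Run the following code
import pandas as pd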

Step 2 Create a data dictionary

# Run the following code
raw_data = {"name": ['Bulbasaur', 'Charmander','Squirtle','Caterpie'],
            "evolution": ['Ivysaur','Charmeleon','Wartortle','Metapod'],
            "type": ['grass', 'fire', 'water', 'bug'],
            "hp": [45, 39, 44, 45],
            "pokedex": ['yes', 'no','yes','no']
            }

Step 3 Store the data dictionary in a DataFrame named pokemon

# Run the following code
pokemon = pd.DataFrame(raw_data)
pokemon.head()

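Step 4 The columns of the DataFrame come out in alphabetical order (this happens in older pandas versions); reorder them to name, type, hp, evolution, pokedex

# Run the following code
pokemon = pokemon[['name', 'type', 'hp', 'evolution','pokedex']]
pokemon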
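The column order can also be fixed at construction time. The sketch below passes the columns argument of pd.DataFrame on a hypothetical one-row dictionary, which is one possible alternative to reindexing afterwards.

```python
import pandas as pd

# Hypothetical one-row dictionary whose keys are not in the desired order.
raw = {"name": ["Bulbasaur"], "hp": [45], "type": ["grass"]}

# The columns argument selects and orders the dictionary keys.
ordered = pd.DataFrame(raw, columns=["name", "type", "hp"])
print(list(ordered.columns))
```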

Step 5 Add a column named place

# Run the following code
pokemon['place'] = ['park','street','lake','forest']
pokemon

Step 6 Check the data type of each column

# Run the following code

pokemon.dtypes

out[317]:

name object

type object

hp int64

evolution object

pokedex object

place object

dtype: object

Exercise 9 - Time Series

Exploring Apple's stock price data

Dataset: Apple_stock.csv

Step 1 Import the necessary libraries

# Run the following code
import pandas as pd
import numpy as np

# visualization
import matplotlib.pyplot as plt

%matplotlib inline

Step 2 Dataset path

# Run the following code
path9 = '../input/pandas_exercise/exercise_data/Apple_stock.csv'  # Apple_stock.csv

Step 3 Read the data and store it in a DataFrame named apple

# Run the following code
apple = pd.read_csv(path9)
apple.head()

Step 4 Check the data type of each column

# Run the following code

apple.dtypes

out[321]:

Date object

Open float64

High float64

Low float64

Close float64

Volume int64

Adj Close float64

dtype: object

Step 5 Convert the Date column to the datetime type

# Run the following code

apple.Date = pd.to_datetime(apple.Date)

apple['Date'].head()

out[322]:

0 2014-07-08

1 2014-07-07

2 2014-07-03

3 2014-07-02

4 2014-07-01

Name: Date, dtype: datetime64[ns]

Step 6 Set Date as the index

# Run the following code
apple = apple.set_index('Date')
apple.head()

Step 7 Are there any duplicate dates?

# Run the following code

apple.index.is_unique

out[324]:

True

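Step 8 Sort the index in ascending order

# Run the following code
# sort_index returns a new DataFrame, so assign it back to keep the sorted order
apple = apple.sort_index(ascending = True)
apple.head()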
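Like most pandas methods, sort_index returns a sorted copy rather than reordering the frame in place. The sketch below demonstrates this on a hypothetical three-row price frame listed newest first; the prices are illustrative.

```python
import pandas as pd

# Hypothetical closing prices, listed from newest to oldest.
idx = pd.to_datetime(["2014-07-08", "2014-07-07", "2014-07-03"])
sample = pd.DataFrame({"Close": [95.35, 95.97, 94.03]}, index=idx)

# The original frame stays in descending order; the returned frame is ascending.
sorted_sample = sample.sort_index(ascending=True)
print(sample.index.is_monotonic_increasing, sorted_sample.index.is_monotonic_increasing)
```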

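Step 9 Find the last business day of each month

# Run the following code
# resample returns a Resampler object, so aggregate it (here with mean) before calling head();
# 'BM' is the business-month-end alias (recent pandas releases spell it 'BME')
apple_month = apple.resample('BM').mean()
apple_month.head()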

Step 10 How many days lie between the earliest and the latest date in the dataset?

# Run the following code

(apple.index.max() - apple.index.min()).days

out[327]:

12261

Step 11 How many months are there in the data?

# Run the following code

apple_months = apple.resample('BM').mean()

len(apple_months.index)

out[328]:

404
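Resampling labels each bin with the calendar business month end, whether or not a trade happened on that exact day. As a complementary sketch, the code below counts distinct months with to_period and picks the last row actually present in each month. It runs on a hypothetical four-row frame sorted by date and is not taken from the original exercise.

```python
import pandas as pd

# Hypothetical trading days spread over three calendar months, in ascending order.
idx = pd.to_datetime(["2014-05-30", "2014-06-02", "2014-06-30", "2014-07-01"])
sample = pd.DataFrame({"Close": [90.4, 89.8, 92.9, 93.5]}, index=idx)

# Number of distinct calendar months that appear in the index.
n_months = sample.index.to_period("M").nunique()

# Last available row of each (year, month) group.
last_days = sample.groupby([sample.index.year, sample.index.month]).tail(1)
print(n_months, list(last_days.index.strftime("%Y-%m-%d")))
```

Unlike the resample count, this counts only months that contain at least one record.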

Step 12 Plot the Adj Close values in chronological order

# Run the following code
# makes the plot and assigns it to a variable
appl_open = apple['Adj Close'].plot(title = "Apple Stock")

# changes the size of the graph
fig = appl_open.get_figure()
fig.set_size_inches(13.5, 9)

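Exercise 10 - Deleting Data

Exploring the Iris flower data

Dataset: iris.csv

Step 1 Import the necessary libraries

# Run the following code
import pandas as pd
import numpy as np  # needed for np.nan in the later steps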

Step 2 Dataset path

# Run the following code
path10 ='../input/pandas_exercise/exercise_data/iris.csv'  # iris.csv

Step 3 Store the dataset in a variable named iris

# Run the following code
iris = pd.read_csv(path10)
iris.head()

Step 4 Create the column names for the DataFrame

# Run the following code
iris = pd.read_csv(path10,names = ['sepal_length','sepal_width', 'petal_length', 'petal_width', 'class'])
iris.head()

Step 5 Are there any missing values in the DataFrame?

# Run the following code

pd.isnull(iris).sum()

out[334]:

sepal_length 0

sepal_width 0

petal_length 0

petal_width 0

class 0

dtype: int64

Step 6 Set rows 10 to 19 of the petal_length column to missing values

# Run the following code
iris.iloc[10:20,2:3] = np.nan
iris.head(20)

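Step 7 Replace all missing values with 1.0

# Run the following code
# assign the result back rather than calling fillna(inplace = True) on a column selection
iris['petal_length'] = iris['petal_length'].fillna(1.0)
iris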
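Steps 6 and 7 can be tried end to end on a small frame. The sketch below uses a hypothetical four-row sample, blanks out two petal_length values with iloc, and fills them by assigning the result of fillna back to the column.

```python
import numpy as np
import pandas as pd

# Hypothetical four-row sample with two measurement columns.
sample = pd.DataFrame(
    {"petal_length": [1.4, 1.3, 1.5, 1.7], "petal_width": [0.2, 0.2, 0.2, 0.4]}
)

# Set rows 1 and 2 of the first column to missing values.
sample.iloc[1:3, 0:1] = np.nan

# Fill the gaps and store the result back in the column.
sample["petal_length"] = sample["petal_length"].fillna(1.0)
print(sample["petal_length"].tolist())
```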

Step 8 Delete the class column

# Run the following code
del iris['class']
iris.head()

Step 9 Set the first three rows of the DataFrame to missing values

# Run the following code
iris.iloc[0:3 ,:] = np.nan
iris.head()

Step 10 Drop the rows that contain missing values

# Run the following code
iris = iris.dropna(how='any')
iris.head()

Step 11 Reset the index

# Run the following code
iris = iris.reset_index(drop = True)
iris.head()
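The how argument of dropna decides how strict the row filter is. The sketch below contrasts how='any' with how='all' on a hypothetical three-row frame and resets the index afterwards so the remaining rows are numbered from zero.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 0 is entirely missing, row 2 is partly missing.
sample = pd.DataFrame({"a": [np.nan, 1.0, 2.0], "b": [np.nan, 3.0, np.nan]})

# how='any' drops a row if at least one value is missing.
any_dropped = sample.dropna(how="any").reset_index(drop=True)

# how='all' drops a row only if every value is missing.
all_dropped = sample.dropna(how="all").reset_index(drop=True)
print(len(any_dropped), len(all_dropped))
```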

To reprint this article, please contact Kesci (科賽網, kesci.com) for authorization.


10 Exercise Sets to Learn Data Analysis with Pandas [6-10]
