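While writing some code recently, I ran into a strange list out-of-range problem: IndexError: list index out of range. I stared at the failing piece of code over and over and could not spot anything wrong.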
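The only option left was to trace through it step by step, and that turned up something interesting: after every row of data, the original csv file contained an empty line.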
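Here is a minimal sketch of why a stray blank line can end in an IndexError. The sample rows and the helper first_fields are made up for illustration and are not my original code; the point is only that csv.reader hands back an empty list for a blank line, so indexing into that row fails.

```python
import csv

# Hypothetical file contents: two data rows separated by a stray blank line.
lines = ["天津,/tianjin\r\n", "\r\n", "北京,/beijing\r\n"]

# csv.reader accepts any iterable of strings, so no real file is needed here.
rows = list(csv.reader(lines))


def first_fields(parsed_rows):
    """Collect the first field of every row, skipping empty rows."""
    return [row[0] for row in parsed_rows if row]


error = None
try:
    # The blank line was parsed as [], so row[0] fails on it.
    names = [row[0] for row in rows]
except IndexError as exc:
    error = str(exc)

# Skipping empty rows avoids the crash, but it only hides the symptom.
safe_names = first_fields(rows)
```

Skipping empty rows on the reading side works as a stopgap, but the cleaner fix is to stop writing the blank lines in the first place.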
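In the end I found a fairly classic explanation on stackoverflow: Python 3 draws a strict line between the str and bytes types, unlike Python 2, where some functions let you mix the two. So when calling writerow in Python 3, do not open the file in wb mode; just use w mode and pass newline=''.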
In Python 2.X, it was required to open the csvfile with 'b' because the csv module does its own line termination handling.
In Python 3.X, the csv module still does its own line termination handling, but still needs to know an encoding for Unicode strings. The correct way to open a csv file for writing is:
outputfile=open("out.csv",'w',encoding='utf8',newline='')
encoding can be whatever you require, but newline='' suppresses text mode newline handling. On Windows, failing to do this will write \r\r\n file line endings instead of the correct \r\n. This is mentioned in the 3.X csv.reader documentation only, but csv.writer requires it as well.
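The \r\r\n effect can be reproduced on any platform with a small sketch. The assumption here is that opening the file with newline="\r\n" is a fair stand-in for the default text-mode translation on Windows; the helper write_and_read_bytes and the sample row are hypothetical.

```python
import csv
import os
import tempfile


def write_and_read_bytes(newline):
    """Write one CSV row in text mode with the given newline setting
    and return the raw bytes that ended up on disk."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf8", newline=newline) as f:
            csv.writer(f).writerow(("a", "b"))
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)


# newline="\r\n" translates every "\n" written into "\r\n", which is
# what text mode does by default on Windows.
translated = write_and_read_bytes("\r\n")

# newline="" turns translation off, so the writer's own "\r\n" line
# terminator is stored unchanged.
untranslated = write_and_read_bytes("")
```

The csv writer already ends each row with \r\n, so any extra translation of the \n doubles the carriage return, and that shows up as a blank line after every row.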
So the earlier code for writing the CSV file needs to be changed to the following, after which it runs successfully:
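import csv

# Extract each record and save it to the citys.csv file.
# city_tags is the collection of <a> tags gathered earlier in the script.
with open("./citys.csv", "w", newline='') as f:
    writ = csv.writer(f)
    for city_tag in city_tags:
        # Get the href link of the <a> tag
        city_url = city_tag.get("href")
        # Get the text of the <a> tag, e.g. 天津
        city_name = city_tag.get_text()
        writ.writerow((city_name, city_url))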
After running it, the resulting file comes out normal.
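A quick way to check the result is to write a few rows with newline='' and read them straight back. This is only a sketch: the cities list is a hypothetical stand-in for the scraped (city_name, city_url) pairs, and it writes to a temporary directory rather than to ./citys.csv.

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the scraped (city_name, city_url) pairs.
cities = [("天津", "/tianjin"), ("北京", "/beijing")]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "citys.csv")

    # Write with newline='' so the csv module controls line endings.
    with open(path, "w", encoding="utf8", newline="") as f:
        csv.writer(f).writerows(cities)

    # Read back the same way; no empty rows should appear between records.
    with open(path, "r", encoding="utf8", newline="") as f:
        rows_back = [tuple(row) for row in csv.reader(f)]
```

If rows_back matches the rows that were written, the file has no stray blank lines.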
Noting it down here for future reference.