
Installation

Press Win+X and choose Command Prompt (launch the console with administrator privileges).
Enter the install command:

pip install beautifulsoup4

Quick test of the Beautiful Soup installation

Demo HTML page: http://python123.io/ws/demo.html

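import requests
from bs4 import BeautifulSoup

# Fetch the demo page, then parse it with the 'html.parser' parser
r = requests.get("http://python123.io/ws/demo.html")
demo = r.text
soup = BeautifulSoup(demo, "html.parser")
print(soup.prettify())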
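
If the demo URL is unreachable, the installation can still be checked offline. The sketch below is a minimal example that parses a small hand-written page; the markup and the variable names are assumptions for illustration, not part of the demo page.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in page, so the check needs no network access
markup = "<html><head><title>hello</title></head><body><p>bs4 works</p></body></html>"

soup = BeautifulSoup(markup, "html.parser")
title_text = soup.title.string  # text inside the <title> tag
print(title_text)
```

If the import fails, the package is not installed in the interpreter that is running the script.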

Basic elements of the Beautiful Soup library

Importing the Beautiful Soup library
The Beautiful Soup library is also known as beautifulsoup4 or bs4.

from bs4 import BeautifulSoup

from bs4 import BeautifulSoup
# Parse markup passed in as a string
soup = BeautifulSoup("<html>data</html>", "html.parser")
# Parse markup read from an open file handle
soup2 = BeautifulSoup(open("D://demo.html"), "html.parser")

A BeautifulSoup object corresponds to the entire content of an HTML/XML document.
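
As a minimal sketch of that idea, the example below builds the same document once from a string and once from a file, then compares the results. The temporary file is a hypothetical stand-in for a local demo.html, and the markup is made up for illustration.

```python
import os
import tempfile

from bs4 import BeautifulSoup

markup = "<html><body><p>data</p></body></html>"

# Build the object from a string
soup_from_str = BeautifulSoup(markup, "html.parser")

# Build the object from a file handle (temporary file used as a stand-in)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.html")
    with open(path, "w", encoding="utf-8") as f:
        f.write(markup)
    with open(path, encoding="utf-8") as f:
        soup_from_file = BeautifulSoup(f, "html.parser")

# Both objects represent the whole document
same = str(soup_from_str) == str(soup_from_file)
print(same)
print(soup_from_str.name)  # the root object reports the name '[document]'
```

Opening the file in a with block closes the handle once parsing is done.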

Beautiful Soup parsers

Parser	Usage	Requirement
bs4's HTML parser	BeautifulSoup(mk, 'html.parser')	Install the bs4 library
lxml's HTML parser	BeautifulSoup(mk, 'lxml')	pip install lxml
lxml's XML parser	BeautifulSoup(mk, 'xml')	pip install lxml
html5lib's parser	BeautifulSoup(mk, 'html5lib')	pip install html5lib
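
The parser names in the table can be tried one by one. The sketch below assumes only that bs4 is installed; the helper try_parser and the sample markup are hypothetical. A parser whose package is missing is reported rather than treated as an error.

```python
from bs4 import BeautifulSoup, FeatureNotFound

markup = "<p>unclosed paragraph"


def try_parser(name):
    """Return the parsed markup as a string, or None if the parser is unavailable."""
    try:
        return str(BeautifulSoup(markup, name))
    except FeatureNotFound:
        # Raised by bs4 when the requested parser is not installed
        return None


results = {name: try_parser(name) for name in ("html.parser", "lxml", "xml", "html5lib")}
for name, out in results.items():
    print(name, "->", out if out is not None else "not installed")
```

Different parsers may repair the same broken markup in different ways, so it helps to name the parser explicitly, as the examples here do.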

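Basic elements of the BeautifulSoup class

Element	Description
Tag	A tag, the most basic unit for organizing information; its start and end are marked with <> and </>
Name	The tag's name; the name of <p>...</p> is 'p'. Format: <tag>.name
Attributes	The tag's attributes, organized as a dictionary. Format: <tag>.attrs
NavigableString	The non-attribute string inside a tag, i.e. the string in <>...</>. Format: <tag>.string
Comment	The comment portion of a string inside a tag; a special string type, Comment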
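
The elements in the table can be inspected directly. The following is a minimal sketch on made-up markup (not the demo page) that touches each one.

```python
from bs4 import BeautifulSoup
from bs4.element import Comment, NavigableString

# Hypothetical markup: one tag with an attribute, one tag holding only a comment
markup = '<p class="title"><b>The demo python</b></p><p><!--This is a comment--></p>'
soup = BeautifulSoup(markup, "html.parser")

tag = soup.p                    # Tag: the first <p> in the document
print(tag.name)                 # Name of the tag
print(tag.attrs)                # Attributes as a dictionary
text = soup.b.string            # NavigableString inside <b>
print(type(text))

comment = soup.find_all("p")[1].string  # the second <p> holds only a comment
print(type(comment))
print(isinstance(comment, Comment))
```

Because .string returns comments and ordinary text alike, checking the type is one way to tell them apart.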

Traversing HTML content with the bs4 library

Revisiting demo.html

>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> demo = r.text
>>> demo
'<html><head><title>This is a python demo page</title></head><body><p class="title"><b>The demo python introduces several python courses.</b></p><p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:<a  class="py1" id="link1">Basic Python</a> and <a  class="py2" id="link2">Advanced Python</a>.</p></body></html>'

Basic HTML structure

<html>
    <head>
        <title>This is a python demo page</title>
    </head>
    <body>
        <p class="title">
            <b>The demo python introduces several python courses.</b>
        </p>
        <p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
            <a  class="py1" id="link1">Basic Python</a>
             and 
            <a  class="py2" id="link2">Advanced Python</a>
            .
        </p>
    </body>
</html>

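Downward traversal of the tag tree

Attribute	Description
.contents	List of child nodes; stores all of <tag>'s direct children in a list
.children	Iterator over child nodes; like .contents, used to loop over direct children
.descendants	Iterator over descendant nodes; covers all descendants, used for looping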

for child in soup.body.children:
    print(child)
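
To make the difference between the three attributes concrete, here is a minimal sketch on a shortened, hypothetical version of the demo markup. Note that children and descendants can be strings as well as tags, so the example filters with isinstance.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Shortened stand-in for the demo page (an assumption for illustration)
markup = ('<html><body><p class="title"><b>Bold text</b></p>'
          '<p class="course">Intro <a id="link1">Basic Python</a> and '
          '<a id="link2">Advanced Python</a>.</p></body></html>')
soup = BeautifulSoup(markup, "html.parser")

# .contents is a list, so it supports len() and indexing
n_children = len(soup.body.contents)

# .children iterates over direct children only
child_names = [child.name for child in soup.body.children]

# .descendants walks the whole subtree; keep only Tag nodes here
descendant_tags = [d.name for d in soup.body.descendants if isinstance(d, Tag)]

print(n_children, child_names, descendant_tags)
```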

Upward traversal of the tag tree

Attribute	Description
.parent	The node's parent tag
.parents	Iterator over the node's ancestor tags, used to loop over ancestors

>>> soup = BeautifulSoup(demo, "html.parser")
>>> for parent in soup.a.parents:
...     if parent is None:
...         print(parent)
...     else:
...         print(parent.name)
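
The same upward walk can be collected into a list, which makes the ancestor chain easy to see. This is a minimal sketch on hypothetical markup rather than the live demo page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with a single link nested three levels deep
markup = ('<html><head><title>t</title></head>'
          '<body><p class="course"><a id="link1">Basic Python</a></p></body></html>')
soup = BeautifulSoup(markup, "html.parser")

direct_parent = soup.a.parent.name                      # nearest enclosing tag
ancestor_names = [parent.name for parent in soup.a.parents]
root_parent = soup.parent                               # the document object has no parent

print(direct_parent)
print(ancestor_names)
print(root_parent)
```

The last ancestor is the BeautifulSoup object itself, whose name is '[document]' and whose own .parent is None.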

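Sibling traversal of the tag tree

Attribute	Description
.next_sibling	Returns the next sibling node in HTML document order
.previous_sibling	Returns the previous sibling node in HTML document order
.next_siblings	Iterator; returns all following sibling nodes in HTML document order
.previous_siblings	Iterator; returns all preceding sibling nodes in HTML document order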

for sibling in soup.a.next_siblings:
    print(sibling)
for sibling in soup.a.previous_siblings:
    print(sibling)
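
One detail worth illustrating is that a sibling is not necessarily a tag: text between tags counts as a node too. The sketch below uses a hypothetical fragment modeled on the course paragraph and filters for tags with isinstance.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hypothetical fragment: two links separated by plain text
markup = ('<p>Intro <a id="link1">Basic Python</a> and '
          '<a id="link2">Advanced Python</a>.</p>')
soup = BeautifulSoup(markup, "html.parser")

first_link = soup.a
next_node = first_link.next_sibling        # the text node after the first link
prev_node = first_link.previous_sibling    # the text node before the first link

# Keep only the siblings that are tags
next_tags = [s for s in first_link.next_siblings if isinstance(s, Tag)]
all_next = list(first_link.next_siblings)

print(repr(next_node), repr(prev_node))
print([t["id"] for t in next_tags])
```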

Formatted HTML output with the bs4 library

The prettify() method of the bs4 library
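
As a minimal sketch, prettify() can be called on the whole document or on a single tag. It returns a string in which each tag and each piece of text sits on its own line. The markup below is made up for illustration.

```python
from bs4 import BeautifulSoup

soup = BeautifulSoup("<p><b>bold</b></p>", "html.parser")

pretty = soup.prettify()          # format the whole document
pretty_tag = soup.b.prettify()    # format a single tag

print(pretty)
print(pretty_tag)

# Strip indentation to look at the line structure alone
lines = [line.strip() for line in pretty.splitlines()]
```

The added line breaks and indentation are meant for reading, so the result is better suited to inspection than to further processing.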
