2016-11-19

BeautifulSoup: getting label and span value content as pairs

I am trying to extract "Slutrengøring alm. (DKK 750,00)" from this HTML:

<div id="bookingpartoptionalitems" class="paddingLeft paddingRight"> 
<div class="title paddingTop">Valgfrie tilkøb:</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"> <input id="fvF3625F31BE0A4F0A8DCD3F59477CD535" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvF3625F31BE0A4F0A8DCD3F59477CD535">Håndklæder (leje)</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">112,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fvC7796D75FE6D429187EB9705D87B0289" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvC7796D75FE6D429187EB9705D87B0289">Slutrengøring alm.</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">750,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fv64F0EAE9857F4D219BB3EDE247ED6EA8" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fv64F0EAE9857F4D219BB3EDE247ED6EA8">Leje Sengelinnede </label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">112,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 
<div class="dots dotsHeight alignment-line last-item"> 
    <div class="alignment-container optional-items-controlarea"><span class="control-area checkboxArea paddingRight negMarginTop"><input id="fvF418ABD7452A45C2B22F98AE5348B13F" type="checkbox" class="checkbox" value="1"></span> 
    </div> 
    <div class="alignment-container optional-items-namearea"><span class="BookingDataItemName paddingRight"><label for="fvF418ABD7452A45C2B22F98AE5348B13F">Internet</label> <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span> <span class="value">149,00</span>)</span> 
     </span> 
    </div> 
    <div class="alignment-container"><span class="BookingDataItemTotalPrice paddingLeft"><span class="currency">DKK</span> <span class="value">0,00</span></span> 
    </div> 
    <div class="alignment-container"></div> 
</div> 

I tried bsObj.select("#bookingpartoptionalitems label"), which outputs:

[<label for="fvEC6D027BF92643FB915F1B3D40C2ADAC">Sengetøjspakke</label>, <label for="fv4C0AAC0318FC408C9D42A6EC152AE878">Barnestol</label>, <label for="fv1B2B8ADFBAA74CE094B55514FF02674F">Barneseng</label>, <label for="fvCA3BB2602AD44C07A1F38B430A73D699">Ekstra Fryser (100L) inkl. levering</label>, <label for="fv7F8D503E6BE84A78A54C92001C195DCA">Levering/afhentning tilkøbte varer</label>, <label for="fv62D7E7BCC1914FBB82802AF9A0D10B27">Trækvogn</label>, <label for="fvF3D92DC8F8BC43F48525A9D032A6130F">Afbestillingsforsikring (ingen selvrisiko)</label>, <label for="fv3CED5B2C3ADC4309A3B7EEA11BBC924D">Kombiforsikring (ingen selvrisiko)</label>, <label for="fv5BC0B453EA5A42E19BFCAC87739CC515">Beach Bowl Key2Activity</label>] 

I also tried bsObj.select("#bookingpartoptionalitems .value"), which outputs:

[<span class="value">105,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">0,00</span>, <span class="value">300,00</span>, <span class="value">0,00</span>, <span class="value">140,00</span>, <span class="value">0,00</span>, <span class="value">125,00</span>, <span class="value">0,00</span>, <span class="value">243,00</span>, <span class="value">0,00</span>, <span class="value">360,00</span>, <span class="value">0,00</span>, <span class="value">119,00</span>, <span class="value">0,00</span>] 

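Is there a way to get the labels and values as pairs? The label attribute for="fvC7796D75FE6D429187EB9705D87B0289" appears to be dynamically generated, so I cannot rely on it.

I hope someone can help.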

Answers

1

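So you want to get all the label/value pairs? One approach is to run the two queries you already tried and merge the results, since I believe both come back in document order. Alternatively, you could do something like this: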

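# Each name area holds one label together with its unit price
items = bsObj.find_all('div', class_='optional-items-namearea')

for item in items:
    # strip=True trims stray whitespace such as the trailing space in "Leje Sengelinnede "
    print(item.label.get_text(strip=True), item.find('span', class_='value').get_text())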

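This finds all elements with the class "optional-items-namearea", then iterates over them and extracts the text of the label inside each one. The value has to be located with find, because it sits inside another element and is identified by its class rather than by its tag name alone.

For the sample data, the output will be: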

Håndklæder (leje) 112,00 
Slutrengøring alm. 750,00 
Leje Sengelinnede 112,00 
Internet 149,00 
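
The first approach mentioned above (merging the two queries) could look roughly like the sketch below. Note that the question's ".value" query returns two spans per item, the unit price and the total price, so this sketch narrows the selector to ".BookingDataItemUnitPrice .value" before pairing. The trimmed-down html string and the for="a" / for="b" attribute values are placeholders made up for illustration, and the pairing assumes every item has exactly one label and one unit price; if an item is missing either, the pairs shift silently.

```python
from bs4 import BeautifulSoup

# Trimmed-down sample with the same structure as the HTML in the question.
# The for="a" / for="b" values are placeholders, not the real generated ids.
html = """
<div id="bookingpartoptionalitems">
  <div class="dots">
    <div class="alignment-container optional-items-namearea">
      <span class="BookingDataItemName"><label for="a">Slutrengøring alm.</label>
        <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span>
          <span class="value">750,00</span>)</span></span>
    </div>
    <div class="alignment-container">
      <span class="BookingDataItemTotalPrice"><span class="currency">DKK</span>
        <span class="value">0,00</span></span>
    </div>
  </div>
  <div class="dots">
    <div class="alignment-container optional-items-namearea">
      <span class="BookingDataItemName"><label for="b">Internet</label>
        <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span>
          <span class="value">149,00</span>)</span></span>
    </div>
    <div class="alignment-container">
      <span class="BookingDataItemTotalPrice"><span class="currency">DKK</span>
        <span class="value">0,00</span></span>
    </div>
  </div>
</div>
"""

bsObj = BeautifulSoup(html, "html.parser")

labels = bsObj.select("#bookingpartoptionalitems label")
# Only unit prices: a bare ".value" would also match the total-price spans.
values = bsObj.select("#bookingpartoptionalitems .BookingDataItemUnitPrice .value")

# select() returns matches in document order, so zip pairs them by position.
pairs = [
    (label.get_text(strip=True), value.get_text(strip=True))
    for label, value in zip(labels, values)
]

for name, price in pairs:
    print(name, price)
```

Because the pairing is purely positional, iterating over each "optional-items-namearea" container, as in the loop above, is the more robust of the two approaches.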

1

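from bs4 import BeautifulSoup 

# html is assumed to hold the markup from the question as a string
soup = BeautifulSoup(html, 'lxml') 
# A multi-class string matches only elements whose class attribute is exactly this value
divs = soup.find_all(class_="alignment-container optional-items-namearea") 

for div in divs: 
    # Join all text fragments inside the div, stripping whitespace around each one
    pair = div.get_text(strip=True) 
    print(pair) 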

Output:

Håndklæder (leje)(DKK112,00) 
Slutrengøring alm.(DKK750,00) 
Leje Sengelinnede(DKK112,00) 
Internet(DKK149,00)
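
The output above fuses the name, currency and amount into one string. If separate fields are needed, for example to look up a single item such as "Slutrengøring alm." by name, one possible sketch is shown below. The helper name parse_items, the shortened html sample and the for="a" / for="b" attribute values are hypothetical, and the number conversion assumes Danish formatting ("." as thousands separator, "," as decimal separator), which the sample data suggests but does not fully confirm.

```python
from decimal import Decimal

from bs4 import BeautifulSoup

# Shortened sample mirroring the structure of the HTML in the question.
# The for="a" / for="b" values are placeholders, not the real generated ids.
html = """
<div id="bookingpartoptionalitems">
  <div class="alignment-container optional-items-namearea">
    <span class="BookingDataItemName"><label for="a">Slutrengøring alm.</label>
      <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span>
        <span class="value">750,00</span>)</span></span>
  </div>
  <div class="alignment-container optional-items-namearea">
    <span class="BookingDataItemName"><label for="b">Internet</label>
      <span class="BookingDataItemUnitPrice">(<span class="currency">DKK</span>
        <span class="value">149,00</span>)</span></span>
  </div>
</div>
"""


def parse_items(markup):
    """Return {item name: (currency, unit price)} for each optional item."""
    soup = BeautifulSoup(markup, "html.parser")
    items = {}
    for area in soup.select("#bookingpartoptionalitems .optional-items-namearea"):
        label = area.select_one("label")
        currency = area.select_one(".BookingDataItemUnitPrice .currency")
        value = area.select_one(".BookingDataItemUnitPrice .value")
        # Skip containers that lack any of the three parts.
        if label is None or currency is None or value is None:
            continue
        # Assumed Danish number format: "1.250,00" becomes "1250.00".
        text = value.get_text(strip=True).replace(".", "").replace(",", ".")
        items[label.get_text(strip=True)] = (currency.get_text(strip=True), Decimal(text))
    return items


items = parse_items(html)
currency, amount = items["Slutrengøring alm."]
print(f"Slutrengøring alm. ({currency} {amount})")
```

Keying the result by the visible label text sidesteps the dynamically generated for="..." ids mentioned in the question, at the cost of assuming that item names are unique on the page.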