2015-07-20 48 views
-1

BeautifulSoup - table - getting rid of those \n

I put the contents of a table into a list with the code below, and this is what I get:

['\n    Cendres brutes (%)\n  ', '\n    7.4\n  ', '\n    Cellulose brute (%)\n  ', '\n    1.6\n  ', '\n    Fibres alimentaires (%)\n  ', '\n    6.6\n  ', '\n    Matière grasse (%)\n  ', '\n    16.0\n  ', '\n    Acide linoléique (%)\n  ', '\n    3.1\n  ', '\n    Energie métabolisable (calculée selon NRC85) (kcal/kg)\n  ', '\n    3652.5\n  ', '\n    Energie métabolisable (mesurée) (kcal/kg)\n  ', '\n    3900.0\n  ', '\n    Humidité (%)\n  ', '\n    9.5\n  ', '\n    Extrait non azoté (%)\n  ', '\n    40.5\n  ', '\n    Oméga 6 (%)\n  ', '\n    3.18\n  ', '\n    Protéine brute (%)\n  ', '\n    25.0\n  ', '\n    Amidon (%)\n  ', '\n    35.5\n  ', '\n    Chlore (%)\n  ', '\n    1.43\n  ', '\n    Cuivre (mg/kg)\n  ', '\n    15.0\n  ', '\n    Iode (mg/kg)\n  ', '\n    2.9\n  ', '\n    Fer (mg/kg)\n  ', '\n    167.0\n  ', '\n    Manganèse (mg/kg)\n  ', '\n    68.0\n  ', '\n    Zinc (mg/kg)\n  ', '\n    242.0\n  ', '\n    Biotine (mg/kg)\n  ', '\n    3.13\n  ', '\n    Choline (mg/kg)\n  ', '\n    1600.0\n  ', '\n    Acide folique (mg/kg)\n  ', '\n    13.9\n  ', '\n    Vitamine A (UI/kg)\n  ', '\n    32000.0\n  ', '\n    Vitamine B1 Thiamine (mg/kg)\n  ', '\n    27.5\n  ', '\n    Vitamine B2 Riboflavine (mg/kg)\n  ', '\n    49.6\n  ', '\n    Vitamine B3 Niacine (mg/kg)\n  ', '\n    490.0\n  ', '\n    Vitamine B5 Acide pantothénique (mg/kg)\n  ', '\n    147.8\n  ', '\n    Vitamine B6 Pyridoxine (mg/kg)\n  ', '\n    77.1\n  ', '\n    Vitamine C (mg/kg)\n  ', '\n    200.0\n  ', '\n    Vitamine D3 (UI/kg)\n  ', '\n    800.0\n  ', '\n    Vitamine E (mg/kg)\n  ', '\n    600.0\n  ', '\n    Arginine (%)\n  ', '\n    1.53\n  ', '\n    Lutéine (mg/kg)\n  ', '\n    5.0\n  ', '\n    Méthionine Cystine (%)\n  ', '\n    1.18\n  ', '\n    Taurine (mg/kg)\n  ', '\n    2900.0\n  '] 

from bs4 import BeautifulSoup

# html_doc holds the HTML source of the page
soup = BeautifulSoup(html_doc, "html.parser")

for h1 in soup.find_all('h1'):
    print(h1.get_text())

for h2 in soup.find_all('h2'):
    print(h2.get_text())

restricted_webpage = soup.find("div", {"id": "ingredients"})
readable_restricted = str(restricted_webpage)

soup2 = BeautifulSoup(readable_restricted, "html.parser")

rows = list()
for td in soup2.find_all('td'):
    rows.append(str(td.get_text()))

print(rows)

The result is cluttered with those \n characters. The html_doc can be found here

+1

'td.get_text()' is already a string, so 'td.get_text().strip()' will remove the newlines and the extra spaces

+0

Thanks. It works perfectly. – BoobaGump

Answers

0

The following should solve your problem:

Use a list comprehension to strip each entry of the list.
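A minimal sketch of the list-comprehension approach, assuming `rows` is the list of raw cell strings from the question. The four sample values are copied from the question's output, and the name `cleaned` is only illustrative:

```python
# Sample values copied from the question's output.
rows = [
    '\n    Cendres brutes (%)\n  ',
    '\n    7.4\n  ',
    '\n    Cellulose brute (%)\n  ',
    '\n    1.6\n  ',
]

# str.strip() with no arguments removes leading and trailing whitespace,
# which includes the newlines and the indentation spaces.
cleaned = [row.strip() for row in rows]

print(cleaned)
```

Note that `strip()` only touches the ends of each string, so spaces inside a label such as 'Cendres brutes (%)' are left alone.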

Another alternative:

rows = list(map(str.strip, rows))  # list() is needed on Python 3, where map() returns a lazy iterator

As Padraic Cunningham said, you can also call the str.strip method directly on the result of your td.get_text() call:

rows = [td.get_text().strip() for td in soup2.find_all('td')]
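For reference, here is a self-contained sketch of that last approach. The HTML fragment is a hypothetical stand-in for the question's html_doc (the real page is not reproduced in this thread), and the names `container` and `rows_alt` are illustrative. It also shows `get_text(strip=True)`, a BeautifulSoup option that is not mentioned above but should give the same result for cells that hold a single piece of text:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the question's html_doc.
html_doc = """
<div id="ingredients">
  <table>
    <tr>
      <td>
    Cendres brutes (%)
  </td>
      <td>
    7.4
  </td>
    </tr>
  </table>
</div>
"""

soup = BeautifulSoup(html_doc, "html.parser")
container = soup.find("div", {"id": "ingredients"})

# find_all() can be called on the found tag directly,
# so converting it to a string and parsing it again is not required.
rows = [td.get_text().strip() for td in container.find_all("td")]

# Alternative: let BeautifulSoup strip the text itself.
rows_alt = [td.get_text(strip=True) for td in container.find_all("td")]

print(rows)
```

Searching within the found tag skips the second parse used in the question, but the stripping itself works the same way with either `soup2` or `container`.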