Problem scraping all row data with Beautiful Soup

I have the following HTML (truncated for brevity, with fake URLs):

<tbody> 
       <tr> 
       <th >Part1</th> 
       <td> 
        <a href="http://somewebpage.com">87</a> 
       </td> 
       <td> 
        <a href="http://somewebpage.com">7</a> 
       </td> 
       <th>Part2</th> 
       <td> 
        <a href="http://somewebpage.com">68</a> 
       </td>........

I am using the following code:

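`from pprint import pprint
from bs4 import BeautifulSoup

# page is the response fetched earlier (not shown in the question)
soup = BeautifulSoup(page['content'], "html.parser")
table = soup.find("table")
table_data = [[cell.text for cell in row("td")]
              for row in table("tr")]
pprint(table_data)`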

table_data looks like this:

[[], 
[u'87', u'7'], 
[u'68'],

How do I get 'Part1' and 'Part2' to appear in the same lists as their values?

Sorry for the trouble ;-)

Expected output:

[[], 
    [u'Part1',u'87', u'7'], 
    [u'Part2', u'68'],

Source

2017-07-11 Matt A

Please use the [edit](https://stackoverflow.com/posts/45037330/edit) link on your question to add the expected output – styvane

Use this line: `row(["td", "th"])`
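A minimal sketch of that suggestion, assuming each `<tr>` holds one `<th>` label followed by its `<td>` cells. The `html` string is a hypothetical, well-formed stand-in for the truncated markup in the question, not the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the truncated HTML in the question
html = """
<table><tbody>
  <tr>
    <th>Part1</th>
    <td><a href="http://somewebpage.com">87</a></td>
    <td><a href="http://somewebpage.com">7</a></td>
  </tr>
  <tr>
    <th>Part2</th>
    <td><a href="http://somewebpage.com">68</a></td>
  </tr>
</tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table")

# Passing a list of tag names returns matching cells in document order,
# so the <th> label comes first in each row's list.
table_data = [
    [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
    for row in table.find_all("tr")
]
print(table_data)  # [['Part1', '87', '7'], ['Part2', '68']] on Python 3
```

`row(["td", "th"])` is shorthand for `row.find_all(["td", "th"])`; the order of names in the list does not affect the order of the results.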

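Thanks, but I am looking for the expected output now shown in the question (it would have helped if I had included it the first time, sorry)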

Your table is not structured correctly. It would be easier to parse if it were structured like the example at https://www.w3schools.com/tags/tag_thead.asp

Consider:

content = """<table> 
<thead> 
    <tr> 
    <th>Month</th> 
    <th>Savings</th> 
    </tr> 
</thead> 
<tfoot> 
    <tr> 
    <td>Sum</td> 
    <td>$180</td> 
    </tr> 
</tfoot> 
<tbody> 
    <tr> 
    <td>January</td> 
    <td>$100</td> 
    </tr> 
    <tr> 
    <td>February</td> 
    <td>$80</td> 
    </tr> 
</tbody> 
</table>""" 

from bs4 import BeautifulSoup 

soup = BeautifulSoup(content, "html.parser") 
table = soup.find("table") 

print([header.text for header in table.find("thead").find_all("th")])

for row in table.find("tbody").find_all("tr"):
    print([data.text for data in row.find_all("td")])

print([footer.text for footer in table.find("tfoot").find_all("td")])

Output

['Month', 'Savings'] 
['January', '$100'] 
['February', '$80'] 
['Sum', '$180']
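Once the headers and body rows are separated like this, they can be paired up. The following is only a sketch on a shortened, hypothetical version of the table above; the names `headers` and `records` are illustrative:

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical version of the example table
content = """<table>
<thead><tr><th>Month</th><th>Savings</th></tr></thead>
<tbody>
  <tr><td>January</td><td>$100</td></tr>
  <tr><td>February</td><td>$80</td></tr>
</tbody>
</table>"""

table = BeautifulSoup(content, "html.parser").find("table")
headers = [th.get_text(strip=True) for th in table.find("thead").find_all("th")]

# Build one dict per body row, keyed by the header text
records = [
    dict(zip(headers, (td.get_text(strip=True) for td in row.find_all("td"))))
    for row in table.find("tbody").find_all("tr")
]
print(records)
```

This only works when every body row has the same number of cells as the header row; `zip` silently drops any extra cells.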

Source

2017-07-11 14:39:20

Hi, I don't control the structure of the table

If the list under "table_data looks like this:" already holds the values you want and you just want to flatten it, try:

# Python identifiers cannot start with a digit, so 2d_list is not a valid name
nested_list = [[], [u'87', u'7'], [u'68']] 

flat_list = [x for y in nested_list for x in y]

Which yields: [u'87', u'7', u'68']
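The same flattening can also be written with the standard library's `itertools.chain.from_iterable`. This is a small sketch using a sample list that mirrors the question's `table_data`; the names `nested` and `flat` are illustrative:

```python
from itertools import chain

# Sample nested list mirroring the table_data shown in the question
nested = [[], ["87", "7"], ["68"]]

# chain.from_iterable walks each inner list in turn; empty lists contribute nothing
flat = list(chain.from_iterable(nested))
print(flat)  # ['87', '7', '68']
```

Note that flattening discards the row boundaries, so it does not keep each label grouped with its own values.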

Source

2017-07-11 14:41:22 flevinkelming

I am looking to achieve: ['Part1', u'87', u'7', u'68'], ['part2', u'68'] ......
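If the markup really is as posted, with several `<th>` labels inside a single `<tr>`, one possible approach is to walk the cells in document order and start a new list at each `<th>`. This is a sketch under that assumption; the `html` string is a hypothetical completion of the truncated snippet, and `groups` is an illustrative name:

```python
from bs4 import BeautifulSoup

# Hypothetical completion of the truncated HTML: one <tr> holding several <th> labels
html = """
<table><tbody><tr>
  <th>Part1</th>
  <td><a href="http://somewebpage.com">87</a></td>
  <td><a href="http://somewebpage.com">7</a></td>
  <th>Part2</th>
  <td><a href="http://somewebpage.com">68</a></td>
</tr></tbody></table>
"""

soup = BeautifulSoup(html, "html.parser")

groups = []
for cell in soup.find("table").find_all(["th", "td"]):
    text = cell.get_text(strip=True)
    if cell.name == "th" or not groups:
        groups.append([text])     # each <th> starts a new group
    else:
        groups[-1].append(text)   # <td> values join the most recent group
print(groups)  # [['Part1', '87', '7'], ['Part2', '68']] on Python 3
```

Because the grouping keys off the `<th>` cells rather than the `<tr>` rows, it gives the same result whether or not each label sits in its own row.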
