python
  • web-scraping
  • beautifulsoup
  • 2017-07-19

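    Python - BeautifulSoup: get a string from player_data

    I'm working on a simple project and have run into a problem. I want to get a string out of the player_data attribute of a div. Here is the div: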

    <div id="mediaplayer60597053" 
        player_data='{ 
         "id": "mediaplayer60597053", 
         "ads": { 
         "schedule": [{ 
          "enabled": true, 
          "counter": false, 
          "skip": true, 
          "click": true, 
          "key": "", 
          "tag": "https:\/\/www.cda.pl\/xml.php?type=g_embed&get=pool&ts=1500453286", 
          "repeat": 1, 
          "time": 0, 
          "type": "pool", 
          "displayAs": "prerol" 
         }] 
         }, 
         "video": { 
         "id": "60597053", 
         "file": "http:\/\/vrbx072.cda.pl\/dYXEHM8Nw3y_TZTmTs4e0g\/1500496486\/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4", 
         "manifest": null, 
         "duration": "6115", 
         "durationFull": "01:41:55", 
         "poster": "\/\/static.cda.pl\/v001\/img\/mobile\/poster16x9.png", 
         "type": "plain", 
         "width": 1920, 
         "height": 816, 
         "content_rating": null, 
         "quality": "vl", 
         "ts": 1500453286, 
         "hash": "26be0bc36e8575c32ff32f4329a301889d1f6f7a" 
         }, 
         "nextVideo": null, 
         "autoplay": false, 
         "seekTo": 0, 
         "premium": false, 
         "api": { 
         "client": "json_client", 
         "ts": "1500453286_60686", 
         "key": "9a3859a86e909430bd379badfa68d0d712603626", 
         "method": "" 
         }, 
         "user": { 
         "role": "guest" 
         } 
        }' 
        tabindex="1"> 
    </div> 
    

    I want this string:

    http:\/\/vrbx072.cda.pl\/dYXEHM8Nw3y_TZTmTs4e0g\/1500496486\/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4
    

    Thanks for your help.

    Answer

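    It looks like you need to get the div and then extract the JSON object from it. You can use soup.find to locate the div, then json.loads to convert the JSON string in its player_data attribute into a Python dictionary.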

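    import json
    from bs4 import BeautifulSoup

    # `html` is assumed to hold the page source that contains the div above
    soup = BeautifulSoup(html, 'html.parser')

    # Locate the player div by id, then parse its player_data attribute as JSON
    div = soup.find('div', {'id': 'mediaplayer60597053'})
    data = json.loads(div['player_data'])

    # json.loads turns the escaped "\/" sequences into plain "/"
    print(data['video']['file'])
    # http://vrbx072.cda.pl/dYXEHM8Nw3y_TZTmTs4e0g/1500496486/vl9afb2190473cc908d0c33cdb15bb212994083ca30c797154058bc8717c4ca746.mp4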
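
The same idea as a self-contained sketch, in case the numeric part of the div id changes from page to page. It assumes beautifulsoup4 is installed. The HTML snippet, the example.com URL, and the helper name extract_video_file are made up for illustration and are not from the original page. It matches any div that carries a player_data attribute instead of hardcoding the id, and returns None when nothing is found.

```python
import json
from bs4 import BeautifulSoup

# Shortened stand-in for the page source; the id and URL are placeholders.
html = r'''
<div id="mediaplayer123"
     player_data='{"id": "mediaplayer123",
                   "video": {"id": "123",
                             "file": "http:\/\/example.com\/path\/video.mp4"}}'
     tabindex="1">
</div>
'''

def extract_video_file(page_source):
    """Return the video file URL from the first div with player_data, or None."""
    soup = BeautifulSoup(page_source, "html.parser")
    # attrs={"player_data": True} matches any div that has the attribute,
    # so the id does not need to be known in advance.
    div = soup.find("div", attrs={"player_data": True})
    if div is None:
        return None
    data = json.loads(div["player_data"])
    # Fall back to an empty dict if "video" is missing or null.
    return (data.get("video") or {}).get("file")

print(extract_video_file(html))
# http://example.com/path/video.mp4
```

The backslashes disappear because "\/" is a valid JSON escape for "/", so json.loads returns the plain URL without any extra string replacement.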
    
    Thanks for the answer, but it gives me this: 'uggcf://ieok056.pqn.cy/0r_FFJVYyyttw9jq-BHXmD/1500497686/uq9nso2190473pp908q0p33pqo15oo212994083pn30p797154058op8717p4pn746nqp.zc4 ' – jestembotem

    @jestembotem I've made a correction. Check it now. –

    This code is correct. I made a mistake. Thanks – jestembotem
