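Question: Using rvest, I can't seem to find the block of information I need in an HTML page that I rendered with PhantomJS. I have tried nearly every selector I can think of, but I can't get html_node to select the right block.
Reading PhantomJS-rendered HTML into R
The HTML as rendered by PhantomJS: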
<div class="page">
<div class="main-header">
</script>
<div id="listing-703036966" class="shop-srp-listings__listing">
<div class="card listing-row--search hide-fade">
<div class="listing-row__main">
<div class="listing-row__image">
<div class="media-count shadowed">
<a href="/vehicledetail/detail/703036966/overview/" target="_self" class="media-count--photo" data-goto-vdp="703036966" data-standard-link="md-thumb">
25 Photos
</a>
<a href="/vehicledetail/detail/703036966/overview/" target="_self" class="media-count--video" data-goto-vdp="703036966" data-standard-link="md-thumb">
1 Video
</a>
</div>
<a href="/vehicledetail/detail/703036966/overview/" target="_self" class="gray-bg listing-row__photo" data-goto-vdp="703036966" data-standard-link="md-thumb">
<img alt="New 2018 BMW 750 i" src="https://www.cstatic-images.com/phototab/e/1/4/e2/f87fb57ec51cab4f57cbaeb9f9f.jpg" onload="window.performance.mark('serverSideFirstPhotoLoaded')">
</a>
<div class="compare-srp">
<div class="listing-row__save">
<a id="703036966" class="switch-favorite unsaved saveVehicleHeart compare-switch-favorite" savedfeatureinstance="" vehicle="{"listingId":703036966,"mkId":20005,"mkNm":"BMW","mdId":20536,"mdNm":"750","trimId":25905,"trimName":"i","modelYearId":35797618,"modelYear":2018,"stkTyp":"New","state":"NC","zipcode":"27107"}" cars-common-omniture-custom="" omniture-events="">
<div class="save-icon-wrapper">
<div class="cui-icon icon-heart-line">
<svg width="16" height="16" class="icon-image">
<use xlink:href="#cui-icon-heart-outline"></use>
</svg>
</div>
<div class="cui-icon icon-heart">
<svg width="16" height="16" class="icon-image">
<use xlink:href="#cui-icon-heart-fill"></use>
</svg>
</div>
</div>
<p class="saved-label">Save</p>
</a>
</div>
<div class="compare-button" data-compare-listing="703036966">
<div class="compare-icon-wrapper">
<div class="cui-icon icon-plus-sign">
<svg width="16" height="16" class="icon-plus-sign">
<use xlink:href="#cui-icon-plus-sign"></use>
</svg>
</div>
<div class="cui-icon icon-checkmark">
<svg width="16" height="16" class="icon-checkmark">
<use xlink:href="#cui-icon-checkmark"></use>
</svg>
</div>
</div>
<p class="compare-button__label compare">Compare</p>
<p class="compare-button__label added">Added</p>
</div>
</div>
</div>
etc.
What I have done in R so far:
library(rvest)
library(stringr)
library(plyr)
library(dplyr)
library(ggvis)
library(knitr)
library(tidyverse)
cars <- read_html("my file.html") %>%
html_nodes("div") %>%
html_text()
However, when I inspect the cars vector, the block I need is missing entirely. It is:
<a id="703036966" class="switch-favorite unsaved saveVehicleHeart compare-switch-favorite" savedfeatureinstance="" vehicle=". {"listingId":703036966,"mkId":20005,"mkNm":"BMW","mdId":20536,"mdNm":"750","trimId":25905,"trimName":"i","modelYearId":35797618,"modelYear":2018,"stkTyp":"New","state":"NC","zipcode":"27107"}" cars-common-omniture-custom="" omniture-events="">
But it never ends up in a usable form, and every node type I have tried (div, p, span) misses it.
Any ideas?
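One likely reason the block never shows up: html_text() returns only the text between tags, while the listing data sits inside the vehicle attribute of the <a> element. Below is a minimal sketch of reading that attribute with html_attr(). It makes several assumptions: the inline HTML is a trimmed stand-in for the rendered file, the quotes inside the attribute are escaped as &quot; on the real page, and jsonlite is installed for parsing the JSON.

```r
library(rvest)

# Assumption: a trimmed stand-in for the rendered page. On the real file
# this would be read_html("my file.html") instead of an inline string.
html <- '<div class="listing-row__save">
  <a id="703036966" class="switch-favorite unsaved" vehicle="{&quot;listingId&quot;:703036966,&quot;mkNm&quot;:&quot;BMW&quot;,&quot;mdNm&quot;:&quot;750&quot;}">
    <p class="saved-label">Save</p>
  </a>
</div>'

page <- read_html(html)

# html_text() only sees text content, so the attribute value is absent here.
div_text <- page %>%
  html_nodes("div") %>%
  html_text()

# html_attr() reads the attribute itself from every matching <a> element.
vehicle_json <- page %>%
  html_nodes("a.switch-favorite") %>%
  html_attr("vehicle")

# Assumption: jsonlite is available; it turns the JSON string into a list.
vehicle <- jsonlite::fromJSON(vehicle_json[1])
```

With the JSON parsed, individual fields such as vehicle$mkNm or vehicle$listingId can be read directly.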
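By "full HTML", do you mean what you posted, or a larger HTML file with multiple car listings? –
I figured it out: html_node vs. html_nodes. Thanks again! Great answer – MDEWITT
Thanks. Glad to help. –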
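For readers landing here, the html_node vs. html_nodes difference mentioned above can be sketched as follows: html_node() returns only the first match, while html_nodes() returns every match. The two-listing HTML is made up for illustration, and the second id (703036967) is hypothetical.

```r
library(rvest)

# Assumption: two made-up listings; the second id is hypothetical.
html <- '<div>
  <a class="switch-favorite" id="703036966"></a>
  <a class="switch-favorite" id="703036967"></a>
</div>'

page <- read_html(html)

# html_node() stops at the first matching element.
first_only <- page %>%
  html_node("a.switch-favorite") %>%
  html_attr("id")

# html_nodes() collects all matching elements.
all_ids <- page %>%
  html_nodes("a.switch-favorite") %>%
  html_attr("id")
```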