Searching XML in Clojure

I have the following sample XML:

<data>
    <products>
        <product>
            <section>Red Section</section>
            <images>
                <image>img.jpg</image>
                <image>img2.jpg</image>
            </images>
        </product>
        <product>
            <section>Blue Section</section>
            <images>
                <image>img.jpg</image>
                <image>img3.jpg</image>
            </images>
        </product>
        <product>
            <section>Green Section</section>
            <images>
                <image>img.jpg</image>
                <image>img2.jpg</image>
            </images>
        </product>
    </products>
</data>

I know how to parse it in Clojure:

(require '[clojure.xml :as xml])
;; the path must be a double-quoted string; single quotes do not delimit strings in Clojure
(def x (xml/parse "location/of/that/xml"))

This returns a nested map describing the XML:

{:tag :data,
:attrs nil,
:content [
    {:tag :products,
     :attrs nil,
     :content [
      {:tag :product,
      :attrs nil,
      :content [] ..

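This structure can of course be traversed with standard Clojure functions, but that can become quite verbose, especially compared to, for instance, querying it with XPath. Is there any helper to traverse and search such a structure? How can I, for example,

get a list of all <product> elements
get only the products whose <images> tag contains an <image> with the text "img2.jpg"
get the product whose section is "Red Section"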

Thanks


2012-07-18 pistacchio

You can use a library such as clj-xpath.
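
To give a feel for what the XPath route looks like, here is a minimal sketch that runs the three queries from the question through the JDK's built-in javax.xml.xpath API via Java interop. It deliberately does not use clj-xpath's own functions, so nothing here should be read as that library's API; the helper names parse-dom and xpath-texts and the inline sample-xml string are made up for illustration.

```clojure
(import '[javax.xml.parsers DocumentBuilderFactory]
        '[javax.xml.xpath XPathFactory XPathConstants]
        '[org.w3c.dom Document NodeList]
        '[java.io ByteArrayInputStream])

;; Inline copy of the sample document (hypothetical name, for illustration only).
(def sample-xml
  (str "<data><products>"
       "<product><section>Red Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "<product><section>Blue Section</section>"
       "<images><image>img.jpg</image><image>img3.jpg</image></images></product>"
       "<product><section>Green Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "</products></data>"))

(defn parse-dom
  "Parses an XML string into a W3C DOM Document."
  ^Document [^String s]
  (-> (DocumentBuilderFactory/newInstance)
      .newDocumentBuilder
      (.parse (ByteArrayInputStream. (.getBytes s "UTF-8")))))

(defn xpath-texts
  "Evaluates an XPath expression and returns the text of every matching node."
  [^Document doc ^String expr]
  (let [^NodeList nodes (-> (XPathFactory/newInstance)
                            .newXPath
                            (.evaluate expr doc XPathConstants/NODESET))]
    (mapv #(.getTextContent (.item nodes (int %)))
          (range (.getLength nodes)))))

(def doc (parse-dom sample-xml))

;; 1. the section of every product
(def all-sections (xpath-texts doc "//product/section"))

;; 2. sections of the products that contain an image named img2.jpg
(def img2-sections (xpath-texts doc "//product[images/image='img2.jpg']/section"))

;; 3. images of the product whose section is "Red Section"
(def red-images (xpath-texts doc "//product[section='Red Section']/images/image"))
```

The predicates in square brackets do the filtering that would otherwise need nested filter calls over the parsed map.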


2012-07-18 09:17:23 Ankur

Would you be willing to edit your answer and add an example? – octopusgrabbus 2012-07-18 18:35:40

Using Zippers from data.zip, here is a solution for your second use case:

(ns core 
    (:use clojure.data.zip.xml) 
    (:require [clojure.zip :as zip] 
      [clojure.xml :as xml])) 

(def data (zip/xml-zip (xml/parse PATH))) 
(def products (xml-> data :products :product)) 

(for [product products :let [image (xml-> product :images :image)] 
         :when (some (text= "img2.jpg") image)] 
    {:section (xml1-> product :section text) 
    :images (map text image)}) 
=> ({:section "Red Section", :images ("img.jpg" "img2.jpg")} 
    {:section "Green Section", :images ("img.jpg" "img2.jpg")})
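
For readers who want to see what the zipper is doing underneath, here is a rough sketch of the same second use case using only clojure.zip, which ships with Clojure, and no data.zip. The helpers locs and tag-locs and the inline sample-xml string are invented for this example and are not part of any library.

```clojure
(require '[clojure.xml :as xml]
         '[clojure.zip :as zip])

;; Inline copy of the sample document (hypothetical name, for illustration only).
(def sample-xml
  (str "<data><products>"
       "<product><section>Red Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "<product><section>Blue Section</section>"
       "<images><image>img.jpg</image><image>img3.jpg</image></images></product>"
       "<product><section>Green Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "</products></data>"))

(def root-loc
  (zip/xml-zip
    (xml/parse (java.io.ByteArrayInputStream. (.getBytes sample-xml "UTF-8")))))

(defn locs
  "Every zipper location in depth-first order, starting at loc."
  [loc]
  (take-while (complement zip/end?) (iterate zip/next loc)))

(defn tag-locs
  "Locations whose node is an element with the given tag."
  [loc tag]
  (filter #(= tag (:tag (zip/node %))) (locs loc)))

;; For each <image> whose only content is "img2.jpg", walk up two levels
;; (image -> images -> product) and read that product's <section> text.
(def img2-sections
  (for [img   (tag-locs root-loc :image)
        :when (= ["img2.jpg"] (:content (zip/node img)))
        :let  [product (-> img zip/up zip/up zip/node)]]
    (->> (:content product)
         (filter #(= :section (:tag %)))
         first
         :content
         first)))
```

The point of the zipper is the zip/up step: a plain nested map has no parent pointer, whereas a location remembers the path back to the enclosing product.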


2012-07-18 14:59:39 ponzao

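Given Clojure's map and vector semantics, the thread-first macro is often an adequate syntax for accessing XML. There are many cases where you want something more XML-specific (such as an XPath library), but in many others the plain language is nearly as concise, without adding any dependencies.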

(pprint (-> (xml/parse "/tmp/xml") 
     :content first :content second :content first :content first)) 
"Blue Section"
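
Positional access like the above breaks as soon as the element order changes. As a rough sketch of searching by tag while still using only what ships with Clojure, the following relies on clojure.core/xml-seq; the helpers elements, child-text and product-summary and the inline sample-xml string are hypothetical names chosen for this example.

```clojure
(require '[clojure.xml :as xml])

;; Inline copy of the sample document (hypothetical name, for illustration only).
(def sample-xml
  (str "<data><products>"
       "<product><section>Red Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "<product><section>Blue Section</section>"
       "<images><image>img.jpg</image><image>img3.jpg</image></images></product>"
       "<product><section>Green Section</section>"
       "<images><image>img.jpg</image><image>img2.jpg</image></images></product>"
       "</products></data>"))

(def root
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes sample-xml "UTF-8"))))

(defn elements
  "All elements with the given tag at or below node, in document order."
  [node tag]
  (filter #(= tag (:tag %)) (xml-seq node)))

(defn child-text
  "Text content of every element with the given tag at or below node."
  [node tag]
  (->> (elements node tag)
       (mapcat :content)
       (filter string?)))

(defn product-summary
  "Reduces a <product> element to a small map of its section and image names."
  [product]
  {:section (first (child-text product :section))
   :images  (vec (child-text product :image))})

;; 1. all products
(def products (map product-summary (elements root :product)))

;; 2. products that contain an image named img2.jpg
(def with-img2 (filter #(some #{"img2.jpg"} (:images %)) products))

;; 3. the product whose section is "Red Section"
(def red (filter #(= "Red Section" (:section %)) products))
```

Once each product is reduced to a plain map, the second and third questions are ordinary filter calls.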


2012-07-18 18:34:22

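Here is an alternative version using data.zip that covers all three use cases. I have found that xml-> and xml1-> have fairly powerful navigation built in, with sub-queries expressed as vectors.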

;; [org.clojure/data.zip "0.1.1"] 

(ns example.core 
    (:require 
    [clojure.zip :as zip] 
    [clojure.xml :as xml] 
    [clojure.data.zip.xml :refer [text xml-> xml1->]])) 

(def data (zip/xml-zip (xml/parse "/tmp/products.xml"))) 

(let [all-products (xml-> data :products :product) 
     red-section (xml1-> data :products :product [:section "Red Section"]) 
     img2 (xml-> data :products :product [:images [:image "img2.jpg"]])] 
    {:all-products (map (fn [product] (xml1-> product :section text)) all-products) 
    :red-section (xml1-> red-section :section text) 
    :img2 (map (fn [product] (xml1-> product :section text)) img2)}) 

=> {:all-products ("Red Section" "Blue Section" "Green Section"), 
    :red-section "Red Section", 
    :img2 ("Red Section" "Green Section")}


2014-02-13 13:04:20

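+1 I know you answered later, but yours is the only answer that covers all 3 questions, and you nicely separate navigation from reporting the results – 2017-02-03 14:45:28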

The Tupelo library can easily solve problems like this using the tupelo.forest tree data structure. Please see this question for more information. API docs can be found here.

Here we load your XML data and convert it first into Enlive format, then into the native tree structure used by tupelo.forest. Libs & data def:

(ns tst.tupelo.forest-examples 
    (:use tupelo.forest tupelo.test) 
    (:require 
    [clojure.data.xml :as dx] 
    [clojure.java.io :as io] 
    [clojure.set :as cs] 
    [net.cgrand.enlive-html :as en-html] 
    [schema.core :as s] 
    [tupelo.core :as t] 
    [tupelo.string :as ts])) 
(t/refer-tupelo) 

(def xml-str-prod "<data> 
        <products> 
         <product> 
         <section>Red Section</section> 
         <images> 
          <image>img.jpg</image> 
          <image>img2.jpg</image> 
         </images> 
         </product> 
         <product> 
         <section>Blue Section</section> 
         <images> 
          <image>img.jpg</image> 
          <image>img3.jpg</image> 
         </images> 
         </product> 
         <product> 
         <section>Green Section</section> 
         <images> 
          <image>img.jpg</image> 
          <image>img2.jpg</image> 
         </images> 
         </product> 
        </products> 
        </data> ")

and the initialization code:

(dotest 
    (with-forest (new-forest) 
    (let [enlive-tree   (->> xml-str-prod 
           java.io.StringReader. 
           en-html/html-resource 
           first) 
      root-hid    (add-tree-enlive enlive-tree) 
      tree-1    (hid->hiccup root-hid)

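The hid suffix stands for "Hex ID", a unique hexadecimal value that acts like a pointer to a node or leaf in the tree. At this stage we have only loaded the data into the forest data structure, creating tree-1, which looks like: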

[:data 
[:tupelo.forest/raw "\n     "] 
[:products 
    [:tupelo.forest/raw "\n      "] 
    [:product 
    [:tupelo.forest/raw "\n      "] 
    [:section "Red Section"] 
    [:tupelo.forest/raw "\n      "] 
    [:images 
    [:tupelo.forest/raw "\n       "] 
    [:image "img.jpg"] 
    [:tupelo.forest/raw "\n       "] 
    [:image "img2.jpg"] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n      "] 
    [:product 
    [:tupelo.forest/raw "\n      "] 
    [:section "Blue Section"] 
    [:tupelo.forest/raw "\n      "] 
    [:images 
    [:tupelo.forest/raw "\n       "] 
    [:image "img.jpg"] 
    [:tupelo.forest/raw "\n       "] 
    [:image "img3.jpg"] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n      "] 
    [:product 
    [:tupelo.forest/raw "\n      "] 
    [:section "Green Section"] 
    [:tupelo.forest/raw "\n      "] 
    [:images 
    [:tupelo.forest/raw "\n       "] 
    [:image "img.jpg"] 
    [:tupelo.forest/raw "\n       "] 
    [:image "img2.jpg"] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n      "]] 
    [:tupelo.forest/raw "\n     "]] 
[:tupelo.forest/raw "\n     "]]

Next, we remove any blank strings with this code:

blank-leaf-hid?  (fn [hid] (and (leaf-hid? hid) ; ensure it is a leaf node 
           (let [value (hid->value hid)] 
             (and (string? value) 
             (or (zero? (count value)) ; empty string 
              (ts/whitespace? value)))))) ; all whitespace string 

blank-leaf-hids  (keep-if blank-leaf-hid? (all-hids)) 
>>     (apply remove-hid blank-leaf-hids) 
tree-2    (hid->hiccup root-hid)

which produces a much nicer result tree (hiccup format):

[:data 
[:products 
    [:product 
    [:section "Red Section"] 
    [:images [:image "img.jpg"] [:image "img2.jpg"]]] 
    [:product 
    [:section "Blue Section"] 
    [:images [:image "img.jpg"] [:image "img3.jpg"]]] 
    [:product 
    [:section "Green Section"] 
    [:images [:image "img.jpg"] [:image "img2.jpg"]]]]]

The following code then computes the answers to the three questions above:

product-hids   (find-hids root-hid [:** :product]) 
product-trees-hiccup (mapv hid->hiccup product-hids) 

img2-paths   (find-paths-leaf root-hid [:data :products :product :images :image] "img2.jpg") 
img2-prod-paths  (mapv #(drop-last 2 %) img2-paths) 
img2-prod-hids  (mapv last img2-prod-paths) 
img2-trees-hiccup (mapv hid->hiccup img2-prod-hids) 

red-sect-paths  (find-paths-leaf root-hid [:data :products :product :section] "Red Section") 
red-prod-paths  (mapv #(drop-last 1 %) red-sect-paths) 
red-prod-hids  (mapv last red-prod-paths) 
red-trees-hiccup  (mapv hid->hiccup red-prod-hids)]

with results:

(is= product-trees-hiccup 
    [[:product 
    [:section "Red Section"] 
    [:images 
     [:image "img.jpg"] 
     [:image "img2.jpg"]]] 
    [:product 
    [:section "Blue Section"] 
    [:images 
     [:image "img.jpg"] 
     [:image "img3.jpg"]]] 
    [:product 
    [:section "Green Section"] 
    [:images 
     [:image "img.jpg"] 
     [:image "img2.jpg"]]]]) 

(is= img2-trees-hiccup 
    [[:product 
    [:section "Red Section"] 
    [:images 
    [:image "img.jpg"] 
    [:image "img2.jpg"]]] 
    [:product 
    [:section "Green Section"] 
    [:images 
    [:image "img.jpg"] 
    [:image "img2.jpg"]]]]) 

(is= red-trees-hiccup 
    [[:product 
    [:section "Red Section"] 
    [:images 
    [:image "img.jpg"] 
    [:image "img2.jpg"]]]]))))

The full example can be found in the forest-examples unit test.


2017-06-08 02:50:32
