2017-10-21 137 views
3

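I am trying to replace the duplicates in a Clojure vector with empty strings. However, the only functions I can find remove duplicates rather than replace them. How can I take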

["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]

and output:

["Oct 2016" "" "Nov 2016" "" "" ""]

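Everything I can find returns ["Oct 2016" "Nov 2016"]. Currently I get the desired output with a nested doseq, but that seems inefficient. Is there a better way to achieve this? Thanks!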

Answers

5

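Here is a strategy for the solution.

  1. Loop over the items of the vector.
  2. Maintain a set of visited items. It can be used to check for uniqueness.
  3. For each item: if the set contains the current item, insert "" into the result vector.
  4. If the current item is unique, insert it into both the result vector and the set.
  5. Return the result vector when all items have been visited.
  6. Optional: use a transient result vector for better performance.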

Code:

(defn duplicate->empty [xs]
  (loop [xs     (seq xs)
         result []
         found  #{}]
    (if-let [[x & xs] (seq xs)]
      (if (contains? found x)
        (recur xs (conj result "") found)           ; already seen: emit ""
        (recur xs (conj result x) (conj found x)))  ; first occurrence: keep it
      result)))

Calling it:

(duplicate->empty ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]) 
=> ["Oct 2016" "" "Nov 2016" "" "" ""] 
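Step 6 of the strategy can be sketched as follows. This is only an illustration, not part of the original answer: the name `duplicate->empty-transient` is hypothetical, and only the result vector is made transient while the set of seen items stays persistent.

```clojure
;; Hypothetical variant of duplicate->empty that builds the result
;; in a transient vector (step 6 of the strategy above).
(defn duplicate->empty-transient [xs]
  (loop [xs     (seq xs)
         result (transient [])  ; builder for the output, updated with conj!
         found  #{}]            ; persistent set of items seen so far
    (if xs
      (let [x (first xs)]
        (if (contains? found x)
          ;; already seen: emit "" and leave the set unchanged
          (recur (next xs) (conj! result "") found)
          ;; first occurrence: emit the item and remember it
          (recur (next xs) (conj! result x) (conj found x))))
      ;; freeze the transient back into an ordinary vector
      (persistent! result))))

(duplicate->empty-transient ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016"])
;; => ["Oct 2016" "" "Nov 2016" ""]
```

The return value of `conj!` is always passed on to `recur`, since a transient should not be relied on to update in place. Whether this is measurably faster depends on the input size.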
+0

I ended up fixing the problem while initially creating the vector, but this works as well, so I am marking it as the solution. – Zaden

0

You could use iterate:

(def months ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]) 

(defn step [[[head & tail] dups res]]
  [tail
   (conj dups head)
   (conj res (if (dups head)
               ""
               head))])

(defn empty-dups [xs]
  (->> (iterate step [xs #{} []])
       (drop-while (fn [[[head] _ _]] head))
       (map #(nth % 2))
       first))

(empty-dups months) 
;; => ["Oct 2016" "" "Nov 2016" "" "" ""] 
1
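(defn eliminate-duplicates [v]
  (let [;; map each item to the index of its first occurrence
        index-of-first-occurrences (apply merge-with #(first %&)
                                          (map-indexed (fn [i x] {x i}) v))]
    ;; assoc! returns the transient to keep using, so thread it through
    ;; reduce instead of discarding it inside a for
    (persistent!
     (reduce (fn [result [s pos]] (assoc! result pos s))
             (transient (vec (repeat (count v) "")))
             index-of-first-occurrences))))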
2

A transducer version, just for completeness.

(defn empty-duplicates
  ([]
   (fn [rf]
     (let [seen (volatile! #{})]
       (fn
         ([] (rf))
         ([res] (rf res))
         ([res x]
          (if (contains? @seen x)
            (rf res "")
            (do (vswap! seen conj x)
                (rf res x))))))))
  ([coll]
   (sequence (empty-duplicates) coll)))

(comment 

    (def months ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]) 

    (into [] (empty-duplicates) months) ;=> ["Oct 2016" "" "Nov 2016" "" "" ""] 

) 
1

Basically the same as the above, but using lazy sequence generation:

(defn rdups 
    ([items] (rdups #{} items)) 
    ([found [x & xs :as items]] 
    (when (seq items) 
    (if (contains? found x) 
     (lazy-seq (cons "" (rdups found xs))) 
     (lazy-seq (cons x (rdups (conj found x) xs))))))) 

user> (rdups ["Oct 2016" "Oct 2016" "Nov 2016" "Nov 2016" "Nov 2016" "Nov 2016"]) 
;;=> ("Oct 2016" "" "Nov 2016" "" "" "") 
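One consequence of the lazy approach is that it can also consume an infinite input, as long as only a finite prefix is realized. A minimal sketch, restating the idea under the hypothetical name `lazy-empty-dups` so that the snippet runs on its own:

```clojure
;; Restated from the answer above under a hypothetical name,
;; so this snippet is self-contained.
(defn lazy-empty-dups
  ([items] (lazy-empty-dups #{} items))
  ([found items]
   (lazy-seq
    (when-let [[x & xs] (seq items)]
      (if (contains? found x)
        ;; already seen: emit "" and keep the set as it is
        (cons "" (lazy-empty-dups found xs))
        ;; first occurrence: emit the item and add it to the set
        (cons x (lazy-empty-dups (conj found x) xs)))))))

;; cycle produces an infinite sequence; take realizes only five items.
(take 5 (lazy-empty-dups (cycle ["Oct 2016" "Nov 2016"])))
;; => ("Oct 2016" "Nov 2016" "" "" "")
```

Note that the set of seen items grows with the number of distinct values, so memory use is bounded only if the input has a bounded number of distinct items.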
+0

Uses lazy-seq (like the accepted answer) plus it drops the loop. Looks like the cleanest way to me. –