2012-01-30 23 views
17

是否有一个高级API用于在Haskell中使用正则表达式进行搜索和替换?特别是,我正在查看Text.Regex.TDFAText.Regex.Posix包。我真的很喜欢类型的东西:Haskell正则表达式库替换/替换

f :: Regex -> (ResultInfo -> m String) -> String -> m String 

因此,举例来说,以取代“狗”与“猫”你可以写

runIdentity . f "dog" (return . const "cat") -- :: String -> String 

或做更多先进的东西与单子一样,计数发生次数等。

Haskell文档很缺乏。一些低级的API说明是here

回答

4

我不知道,创建此功能的任何现有的功能,但我认为我会最终使用像AllMatches [] (MatchOffset, MatchLength) instance of RegexContent东西来模拟它:

replaceAll :: RegexLike r String => r -> (String -> String) -> String -> String 
replaceAll re f s = start end 
    where (_, end, start) = foldl' go (0, s, id) $ getAllMatches $ match re s 
     go (ind,read,write) (off,len) = 
      let (skip, start) = splitAt (off - ind) read 
       (matched, remaining) = splitAt len matched 
      in (off + len, remaining, write . (skip++) . (f matched ++)) 

replaceAllM :: (Monad m, RegexLike r String) => r -> (String -> m String) -> String -> m String 
replaceAllM re f s = do 
    let go (ind,read,write) (off,len) = do 
     let (skip, start) = splitAt (off - ind) read 
     let (matched, remaining) = splitAt len matched 
     replacement <- f matched 
     return (off + len, remaining, write . (skip++) . (replacement++)) 
    (_, end, start) <- foldM go (0, s, return) $ getAllMatches $ match re s 
    start end 
28

如何在包装文字的subRegex .Regex?

Prelude Text.Regex> :t subRegex 
subRegex :: Regex -> String -> String -> String 

Prelude Text.Regex> subRegex (mkRegex "foo") "foobar" "123" 
"123bar" 
1

也许这种方法适合你。

import Data.Array (elems) 
import Text.Regex.TDFA ((=~), MatchArray) 

replaceAll :: String -> String -> String -> String   
replaceAll regex new_str str = 
    let parts = concat $ map elems $ (str =~ regex :: [MatchArray]) 
    in foldl (replace' new_str) str (reverse parts) 

    where 
    replace' :: [a] -> [a] -> (Int, Int) -> [a] 
    replace' new list (shift, l) = 
     let (pre, post) = splitAt shift list 
     in pre ++ new ++ (drop l post) 
3

基于@风铃草的答案,但有固定的错字所以它不只是<<loop>>

replaceAll :: Regex -> (String -> String) -> String -> String 
replaceAll re f s = start end 
    where (_, end, start) = foldl' go (0, s, id) $ getAllMatches $ match re s 
     go (ind,read,write) (off,len) = 
      let (skip, start) = splitAt (off - ind) read 
       (matched, remaining) = splitAt len start 
      in (off + len, remaining, write . (skip++) . (f matched ++)) 
1

您可以使用replaceAllData.Text.ICU.Replace module

Prelude> :set -XOverloadedStrings 
Prelude> import Data.Text.ICU.Replace 
Prelude Data.Text.ICU.Replace> replaceAll "cat" "dog" "Bailey is a cat, and Max is a cat too." 
"Bailey is a dog, and Max is a dog too."