2010-08-03

Rearranging a string containing a repeating pattern of variable length

I have a file with the following layout:

TABLE name_of_table

COLUMNS first_column 2nd_column [..] n-th_column

VALUES 1st_value 2nd_value [...] n-th_value

VALUES yet_another_value ... go on

Repeat, starting from another table .....

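I want this text file rearranged for me, so that I don't have to type the TABLE and COLUMNS parts in front of every VALUES line myself. The result should look like this:

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES 1st_value

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES yetanother_value

I need to rearrange several lines at once here, so reading the whole text file into a string with hGetContents seems appropriate. The resulting string looks like this:

TABLE name_of_table COLUMNS first_column [..] n-th_column VALUES 1st_value [..] n-th_value VALUES another_value [..] yet_another VALUES ...... ANOTHER TABLE .... COLUMNS .... VALUES [....] VALUES ...

I have tried doing this with nested case expressions and recursion. That leaves me with a dilemma I need help with:

1) I need recursion to avoid endlessly nested case expressions.

2) With recursion, I cannot prepend the earlier parts of the string, because the recursive call only sees the tail of my string!

To illustrate the problem: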

myStr :: [[Char]] -> [[Char]]
myStr [] = []
myStr one = case (head one) of
    "table" -> "insert into" : (head two) : columnRecursion (three) ++
        case (head four) of
            "values" -> (head four) : valueRecursion (tail three) ++ myStr (tail four)
            _ -> case head (tail four) of
                "values" -> (head (tail four)) : myStr (tail (tail four))
    _ ->
  where
    two = tail one
    three = tail two
    four = tail three

columnRecursion :: [[Char]] -> [[Char]]
columnRecursion [] = []
columnRecursion cool = case (head cool) of
    "columns" -> "(" : columnRecursion (tail cool)
    "values" -> [")"]
    _ -> (head cool) : columnRecursion (tail cool)

valueRecursion :: [[Char]] -> [[Char]]
valueRecursion foo = case head foo of
    "values" -> "insert into" : (head two) : columnRecursion (three) ++ valueRecursion (tail foo)
    "table" -> []
    "columns" -> []
    _ -> (head foo) : valueRecursion (tail foo)

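I wind up with FIRSTPART, VALUES bla bla bla VALUES bla, and I cannot fetch FIRSTPART again to create FIRSTPART, VALUES, FIRSTPART, VALUES, FIRSTPART, VALUES.

Trying to do this by referencing myStr in valueRecursion obviously fails, because those names are out of scope there.

What to do?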

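It seems you need to take a two-pronged approach: parse the input into a sane data structure, then traverse that data structure to produce the output. I'll see if I can find an elegant solution. – jrockway 2010-08-03 03:16:07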
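One way around the dilemma in the question, sketched here with only the Prelude, is to thread the current table name and column list through the recursion as extra arguments, so that every VALUES line can still see them. This is a single-pass shortcut rather than the parse-then-traverse split suggested in the comment. The names `rearrange` and `go` are made up for this illustration, and the sketch assumes uppercase, whitespace-separated keywords as in the sample layout.

```haskell
-- Hypothetical helper, not from the original post.
-- Carries the most recent TABLE name and COLUMNS list as arguments of
-- the local recursion, and emits one output line per VALUES line.
rearrange :: [String] -> [String]
rearrange = go "" []
  where
    go _ _ [] = []
    go tbl cols (l : ls) = case words l of
        -- A new table starts: remember its name and forget the old columns.
        ("TABLE" : name : _) -> go name [] ls
        -- Remember the column names for the following VALUES lines.
        ("COLUMNS" : cs) -> go tbl cs ls
        -- Prefix the row with the remembered table name and columns.
        ("VALUES" : vs) ->
            unwords (["TABLE", tbl, "COLUMNS"] ++ cols ++ ("VALUES" : vs))
                : go tbl cols ls
        -- Blank or unrecognised lines are skipped.
        _ -> go tbl cols ls
```

Something like `interact (unlines . rearrange . lines)` would then turn the file on standard input into the one-row-per-line form.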

Answers

2

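For me, this kind of problem just crosses the threshold where an actual parsing tool is worth using. Here is a quick working example with Attoparsec: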

import Control.Applicative 
import Data.Attoparsec (maybeResult) 
import Data.Attoparsec.Char8 
import qualified Data.Attoparsec.Char8 as A (takeWhile) 
import qualified Data.ByteString.Char8 as B 
import Data.Maybe (fromMaybe) 

data Entry = Entry String [String] [[String]] deriving (Show) 

entry = Entry <$> table <*> cols <*> many1 vals 
items = sepBy1 (A.takeWhile $ notInClass " \n") $ char ' ' 
table = string (B.pack "TABLE ") *> many1 (notChar '\n') <* endOfLine 
cols = string (B.pack "COLUMNS ") *> (map B.unpack <$> items) <* endOfLine 
vals = string (B.pack "VALUES ") *> (map B.unpack <$> items) <* endOfLine 

parseEntries :: B.ByteString -> Maybe [Entry] 
parseEntries = maybeResult . flip feed B.empty . parse (sepBy1 entry skipSpace) 

And a bit of machinery:

pretty :: Entry -> String 
pretty (Entry t cs vs) 
    = unwords $ ["TABLE", t, "COLUMNS"] 
    ++ cs ++ concatMap ("VALUES" :) vs 

layout :: B.ByteString -> Maybe String 
layout = (unlines . map pretty <$>) . parseEntries 

testLayout :: FilePath -> IO() 
testLayout f = putStr . fromMaybe [] =<< layout <$> B.readFile f 

Given this input:

TABLE test 
COLUMNS a b c 
VALUES 1 2 3 
VALUES 4 5 6 

TABLE another 
COLUMNS x y z q 
VALUES 7 8 9 10 
VALUES 1 2 3 4 

we get the following:

*Main> testLayout "test.dat" 
TABLE test COLUMNS a b c VALUES 1 2 3 VALUES 4 5 6 
TABLE another COLUMNS x y z q VALUES 7 8 9 10 VALUES 1 2 3 4 

Which seems to be what you want?
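The target format in the question puts each VALUES row on its own line, each prefixed with the table and column names, whereas `pretty` joins all rows of a table into one line. If the per-row form is what is needed, a variant along these lines might do. `prettyRows` is a hypothetical name, and the `Entry` type is repeated only so the sketch stands alone.

```haskell
-- Same shape as the Entry type above: table name, column names, VALUES rows.
data Entry = Entry String [String] [[String]] deriving (Show)

-- Hypothetical variant of `pretty`: one output line per VALUES row,
-- each repeating the TABLE and COLUMNS prefix.
prettyRows :: Entry -> [String]
prettyRows (Entry t cs vs) =
    [ unwords (["TABLE", t, "COLUMNS"] ++ cs ++ ("VALUES" : v)) | v <- vs ]
```

Plugging it in would presumably mean replacing `map pretty` in `layout` with `concatMap prettyRows`.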

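That is exactly what I wanted! =) I had hoped to write this parser myself, but I realize it would probably take more work than I can get my head around at my beginner's level. ^^ – 2010-08-03 10:03:01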

0

This answer is literate Haskell, so you can copy and paste it into a file named table.lhs to get a working program.

Starting with a few imports

> import Control.Arrow ((&&&)) 
> import Control.Monad (forM_) 
> import Data.List (intercalate,isPrefixOf) 
> import Data.Maybe (fromJust) 

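say we represent tables with the following record:

> data Table = Table { tblName :: String
>                    , tblCols :: [String]
>                    , tblVals :: [String]
>                    }
>   deriving (Show)

That is, we record the table's name, a list of column names, and a list of column values.

Each table in the input begins with a line starting with TABLE, so we split all lines of the input into chunks accordingly: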

> tables :: [String] -> [Table]
> tables [] = []
> tables xs = next : tables ys
>   where next = mkTable (th:tt)
>         (th:rest) = dropWhile (not . isTable) xs
>         (tt,ys) = break isTable rest
>         isTable = ("TABLE" `isPrefixOf`)
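As written, the pattern `(th:rest)` only matches when at least one TABLE line remains, so non-empty input without any TABLE line (a file of blank lines, say) would fail at run time. A minimal sketch of the same chunking step that stops cleanly in that case is shown here. `chunks` is a hypothetical name, and it returns the raw line groups rather than `Table` records so that it stands alone.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical helper: split lines into groups, each starting at a line
-- that begins with "TABLE". Lines before the first TABLE line are dropped,
-- matching the behaviour of `tables` above.
chunks :: [String] -> [[String]]
chunks xs =
    case dropWhile (not . isTable) xs of
        -- No TABLE line left: stop instead of failing on a pattern match.
        [] -> []
        -- th is the TABLE line, tt the lines up to the next TABLE line.
        (th : rest) ->
            let (tt, ys) = break isTable rest
            in (th : tt) : chunks ys
  where
    isTable = ("TABLE" `isPrefixOf`)
```

With this shape, `map mkTable . chunks` would be one way to rebuild `tables`.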

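Having chunked the input into tables, a given table's name is the first word on its TABLE line. The column names are all the words that appear on the COLUMNS line, and the column values come from the VALUES lines: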

> mkTable :: [String] -> Table
> mkTable xs = Table name cols vals
>   where name = head $ fromJust $ lookup "TABLE" tagged
>         cols = grab "COLUMNS"
>         vals = grab "VALUES"
>         grab t = concatMap snd $ filter ((== t) . fst) tagged
>         tagged = map ((head &&& tail) . words)
>                    $ filter (not . null) xs
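Two properties of `mkTable` are worth keeping in mind: `head` and `fromJust` fail at run time if a chunk has no usable TABLE line, and `grab "VALUES"` concatenates all VALUES lines into one flat list. The following is a sketch of a variant that returns `Maybe` and keeps each VALUES line as its own row. `mkTableSafe` and the `tblRows` field are assumptions made for this example, not part of the answer.

```haskell
import Data.Maybe (listToMaybe)

-- Variant of the record above with one inner list per VALUES line.
-- (tblRows is a hypothetical field replacing tblVals.)
data Table = Table
    { tblName :: String
    , tblCols :: [String]
    , tblRows :: [[String]]
    } deriving (Show, Eq)

-- Hypothetical total version of mkTable: Nothing if no TABLE line
-- with a name is present, instead of a run-time error.
mkTableSafe :: [String] -> Maybe Table
mkTableSafe xs = fmap (\name -> Table name cols rows) (listToMaybe names)
  where
    -- Blank lines become [] and match none of the patterns below.
    tagged = map words xs
    names = [n | ("TABLE" : n : _) <- tagged]
    cols = concat [cs | ("COLUMNS" : cs) <- tagged]
    rows = [vs | ("VALUES" : vs) <- tagged]
```

Keeping the rows separate makes it possible to print one line per VALUES row later, at the cost of changing the record type.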

Given a Table record, we print it by pasting the name, columns, values, and the keywords together in the appropriate order on a single line:

> main :: IO ()
> main = do
>   input <- readFile "input"
>   forM_ (tables $ lines input) $
>     \t -> do putStrLn $ intercalate " " $
>                "TABLE" : (tblName t) :
>                ("COLUMNS" : (tblCols t)) ++
>                ("VALUES" : (tblVals t))

Given the unimaginative input

TABLE name_of_table 

COLUMNS first_column 2nd_column [..] n-th_column 

VALUES 1st_value 2nd_value [...] n-th value 

VALUES yet_another_value ... go on 

TABLE name_of_table 

COLUMNS first_column 2nd_column [..] n-th_column 

VALUES 1st_value 2nd_value [...] n-th value 

VALUES yet_another_value ... go on

the output is

$ runhaskell table.lhs 
TABLE name_of_table COLUMNS first_column 2nd_column [..] n-th_column VALUES 1st_value 2nd_value [...] n-th value yet_another_value ... go on 
TABLE name_of_table COLUMNS first_column 2nd_column [..] n-th_column VALUES 1st_value 2nd_value [...] n-th value yet_another_value ... go on