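Parsing JSON in Haskell, part 2: Non-exhaustive patterns in lambda
This is a continuation of the question I asked a few days ago. I took the applicative functor route and wrote my own instances.
I need to parse a large number of JSON statements from a file, one per line. An example JSON statement looks like this: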
{"question_text": "How can NBC defend tape delaying the Olympics when everyone has
Twitter?", "context_topic": {"followers": 21, "name": "NBC Coverage of the London
Olympics (July & August 2012)"}, "topics": [{"followers": 2705,
"name": "NBC"},{"followers": 21, "name": "NBC Coverage of the London
Olympics (July & August 2012)"},
{"followers": 17828, "name": "Olympic Games"},
{"followers": 11955, "name": "2012 Summer Olympics in London"}],
"question_key": "AAEAABORnPCiXO94q0oSDqfCuMJ2jh0ThsH2dHy4ATgigZ5J",
"__ans__": true, "anonymous": false}
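Sorry about the JSON formatting; it does not come out well here.
I have about 10000 JSON statements like this and I need to parse them. The code I have written is something like this: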
parseToRecord :: B.ByteString -> Question
parseToRecord bstr = (\(Ok x) -> x) decodedObj where decodedObj = decode (B.unpack bstr) :: Result Question

main :: IO()
main = do
    -- my first line in the file tells how many json statements
    -- are there followed by a lot of other irrelevant info...
    ts <- B.getContents >>= return . fst . fromJust . B.readInteger . head . B.lines
    json_text <- B.getContents >>= return . tail . B.lines
    let training_data = take (fromIntegral ts) json_text
    let questions = map parseToRecord training_data
    print $ questions !! 8922
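This code gives me the runtime error "Non-exhaustive patterns in lambda". The error points to \(Ok x) -> x in the code. By trial and error, I concluded that the program works fine up to index 8921 and fails on the 8922nd iteration.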
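For readers following along, here is a minimal sketch of why that lambda can fail at run time. The Result type below is a local stand-in declared only for this sketch, with the same two constructors the code above matches on; it is not the library's own definition, and unsafeFromResult and toEither are hypothetical names.

```haskell
-- Local stand-in with the two constructors the question's code matches on.
-- It is declared here only so the sketch compiles without extra packages.
data Result a = Ok a | Error String
  deriving (Show, Eq)

-- Partial: only the Ok case is handled, so an Error value raises
-- "Non-exhaustive patterns in lambda" when the result is demanded.
unsafeFromResult :: Result a -> a
unsafeFromResult = \(Ok x) -> x

-- Total: both constructors are handled and the failure is returned as data.
toEither :: Result a -> Either String a
toEither (Ok x)    = Right x
toEither (Error e) = Left e
```

The partial version discards the Error message, which is why the runtime error says nothing about what went wrong in the input. The total version keeps the message so it can be printed next to the offending line.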
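I checked the corresponding JSON statement and tried parsing it on its own by calling the function on it, and that works. However, it does not work when I call map. I really don't understand what is going on. After learning a bit of Haskell from "Learn You a Haskell for Great Good", I wanted to dive into a real-world programming project, but I seem to have gotten stuck.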
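One way to see which inputs fail under map, instead of probing a single index, is to keep each line's position next to its parse result. This is only a sketch, assuming the parser is available as a function returning Either String; failingLines is a hypothetical name, and parseNonEmpty is a toy parser included purely to exercise it.

```haskell
-- Pair every input with its 0-based index, run the parser, and keep only
-- the failures together with the parser's message.
failingLines :: (a -> Either String b) -> [a] -> [(Int, String)]
failingLines parse inputs =
  [ (i, msg) | (i, Left msg) <- zip [0 ..] (map parse inputs) ]

-- Hypothetical toy parser used only to demonstrate failingLines:
-- it rejects empty lines and otherwise returns the line length.
parseNonEmpty :: String -> Either String Int
parseNonEmpty s
  | null s    = Left "empty line"
  | otherwise = Right (length s)
```

With the code in the question, the parser argument could plausibly be built from decode at type Result Question, B.unpack, and resultToEither from Text.JSON. That combination has not been run against the actual data file, so treat it as a starting point.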
Edit: the complete code is below.
{-# LANGUAGE BangPatterns #-}
{-# OPTIONS_GHC -O2 -optc-O2 #-}
{-# OPTIONS_GHC -fno-warn-incomplete-uni-patterns #-}

import qualified Data.ByteString.Lazy.Char8 as B
import Data.Maybe
import NLP.Tokenize
import Control.Applicative
import Control.Monad
import Text.JSON

data Topic = Topic
    { followers :: Integer,
      name :: String
    } deriving (Show)

data Question = Question
    { question_text :: String,
      context_topic :: Topic,
      topics :: [Topic],
      question_key :: String,
      __ans__ :: Bool,
      anonymous :: Bool
    } deriving (Show)

(!) :: (JSON a) => JSObject JSValue -> String -> Result a
(!) = flip valFromObj

instance JSON Topic where
    -- Keep the compiler quiet
    showJSON = undefined

    readJSON (JSObject obj) =
        Topic <$>
        obj ! "followers" <*>
        obj ! "name"
    readJSON _ = mzero

instance JSON Question where
    -- Keep the compiler quiet
    showJSON = undefined

    readJSON (JSObject obj) =
        Question <$>
        obj ! "question_text" <*>
        obj ! "context_topic" <*>
        obj ! "topics" <*>
        obj ! "question_key" <*>
        obj ! "__ans__" <*>
        obj ! "anonymous"
    readJSON _ = mzero

isAnswered (Question _ _ _ _ status _) = status
isAnonymous (Question _ _ _ _ _ status) = status

parseToRecord :: B.ByteString -> Question
parseToRecord bstr = handle decodedObj
    where handle (Ok k) = k
          handle (Error e) = error (e ++ "\n" ++ show bstr)
          decodedObj = decode (B.unpack bstr) :: Result Question
--parseToRecord bstr = (\(Ok x) -> x) decodedObj where decodedObj = decode (B.unpack bstr) :: Result Question

main :: IO()
main = do
    ts <- B.getContents >>= return . fst . fromJust . B.readInteger . head . B.lines
    json_text <- B.getContents >>= return . tail . B.lines
    let training_data = take (fromIntegral ts) json_text
    let questions = map parseToRecord training_data
    let correlation = foldr (\x acc -> if (isAnonymous x == isAnswered x) then (fst acc + 1, snd acc + 1) else (fst acc, snd acc + 1)) (0,0) questions
    print $ fst correlation
Here is the data that can be fed to the executable as input. I am using GHC 7.6.3. If the program is named ans.hs, I follow these steps:
$ ghc --make ans.hs
$ ./ans < path/to/the/file/sample/answered_data_10k.in
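Because the input arrives on stdin, the two B.getContents calls in main both read lazily from the same handle. As far as I understand lazy ByteString I/O, which bytes each call ends up seeing then depends on evaluation order. Whether that contributes to the failure described here is an assumption, not something established in this post. The sketch below reads the input once and splits it purely; splitInput is a hypothetical name.

```haskell
import qualified Data.ByteString.Lazy.Char8 as B

-- Split the whole input into the declared record count and the record lines.
-- Returns Nothing when the input is empty or the first line does not start
-- with an integer. Anything after the integer on the first line is ignored,
-- and at most the declared number of following lines is kept.
splitInput :: B.ByteString -> Maybe (Int, [B.ByteString])
splitInput contents =
  case B.lines contents of
    []              -> Nothing
    (header : rest) -> do
      (n, _) <- B.readInteger header
      let count = fromIntegral n
      return (count, take count rest)
```

In main this would be used after a single B.getContents, with the resulting list of lines passed on to the parser, so the header and the records come from one consistent view of the input.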
Thanks a lot!
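Yes, I tried that. It reports 'Result: MonadPlus.empty' – shashydhar
Perhaps you should include the complete code in your post so that it can be tested. It might also help to know which part of the string does not parse. So instead of just printing the error message, you could print something like 'error (e ++ "\n" ++ show bstr)' – ichistmeinname
Thanks for looking at the question. I have added the complete code for you to test. Also, here is the link to the JSON file (the file with the .in extension inside the zip): http://hr-testcases.s3.amazonaws.com/688/sample.zip – shashydhar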
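On the 'Result: MonadPlus.empty' message reported in the comments: assuming it is the message produced by the mzero fallbacks in the two readJSON definitions, it would mean that some readJSON was handed a value that is not a JSON object, without saying which instance or which value. The sketch below replaces the fallback for Topic with a descriptive Error; the wording of the message is my own choice, and showJSON is filled in only so the instance is complete.

```haskell
import Control.Applicative ((<$>), (<*>))
import Text.JSON

data Topic = Topic
  { followers :: Integer
  , name      :: String
  } deriving (Show, Eq)

instance JSON Topic where
  showJSON t = makeObj
    [ ("followers", showJSON (followers t))
    , ("name", showJSON (name t))
    ]

  readJSON (JSObject obj) =
    Topic <$> valFromObj "followers" obj <*> valFromObj "name" obj
  -- Name the instance and echo the rejected value instead of using mzero,
  -- so the failure says what was actually found in the input.
  readJSON other =
    Error ("Topic: expected a JSON object, got " ++ encode other)
```

The same change can be made in the Question instance. Combined with printing the offending line as suggested above, the error then shows both the raw input and the instance that rejected it.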