2015-01-07 30 views

Erlang: getting information out of parsed HTML

I'm still coding my project, but unfortunately I've run into another problem.

Recently I managed to fetch the HTML of a web page from Erlang and parse it with mochiweb. Here is what I have so far:

{<<"html">>,[],
 [{<<"head">>,[],[]},
  {<<"body">>,[],
   [{<<"table">>,[],
     [{<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Poniedzialek ">>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -5 ">>]},
        {<<"td">>,[{<<"id">>,<<"wiatr">>}],[<<"13 km/h">>]}]},
      {<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Wtorek ">>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -15 ">>]},
        {<<"td">>,[{<<"id">>,<<"wiatr">>}],[<<"13 km/h">>]}]},
      {<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Sroda ">>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 10 ">>]},
        {<<"td">>,[{<<"id">>,<<"wiatr">>}],[<<"13 km/h">>]}]},
      {<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Czwartek ">>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 12 ">>]},
        {<<"td">>,[{<<"id">>,<<"wiatr">>}],[<<"13 km/h">>]}]},
      {<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Piatek ">>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 20 ">>]},
        {<<"td">>,[{<<"id">>,<<"wiat"...>>}],[<<"13 km/h">>]}]},
      {<<"tr">>,[],
       [{<<"td">>,[{<<"id">>,<<"day">>}],[<<"Poniedzialek"...>>]},
        {<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -5 ">>]},
        {<<"td">>,[{<<"id">>,<<...>>}],[<<"13 k"...>>]}]}]}]}]}

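Now I want to extract the temperature and wind information. How can I write an Erlang function that pulls out just the temperature values, either as a list or for writing to a JSON file, without any of the surrounding markup?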


[Parsing the result obtained from mochiweb_html](http://stackoverflow.com/questions/16148202/parsing-the-result-obtained-from-mochiweb-html) – legoscia


Thanks for the comment, @legoscia. Much appreciated. – KonradPrg

Answers


So far I have managed to get this:

[{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -5 ">>]}, 
{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -15 ">>]}, 
{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 10 ">>]}, 
{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 12 ">>]}, 
{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" 20 ">>]}, 
{<<"td">>,[{<<"id">>,<<"temp">>}],[<<" -5 ">>]}] 
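
The post does not show how this list was produced. One possible way to get a list like it is to walk the parsed tree recursively and keep every element whose `id` attribute matches. The sketch below assumes the `{Tag, Attrs, Children}` shape shown in the question; the module name `html_pick` and the functions `by_id/2` and `texts/1` are made up for illustration and are not part of mochiweb.

```erlang
-module(html_pick).
-export([by_id/2, texts/1]).

%% Collect every element in a mochiweb_html-style tree whose "id"
%% attribute equals Id. Elements are {Tag, Attrs, Children} tuples;
%% text nodes are plain binaries. (Hypothetical helper.)
by_id(Id, {_Tag, Attrs, Children} = Node) ->
    %% Search the children first so nested matches are found too.
    Rest = lists:flatmap(fun(Child) -> by_id(Id, Child) end, Children),
    case lists:keyfind(<<"id">>, 1, Attrs) of
        {<<"id">>, Id} -> [Node | Rest];   % Id is already bound: equality test
        _ -> Rest
    end;
by_id(_Id, _Other) ->
    %% Text nodes and anything else that is not a 3-tuple have no attributes.
    [].

%% Pull the text children out of a list of elements. (Hypothetical helper.)
texts(Elements) ->
    [Text || {_Tag, _Attrs, Children} <- Elements,
             Text <- Children,
             is_binary(Text)].
```

With the parsed tree bound to `Tree`, `html_pick:by_id(<<"temp">>, Tree)` would return the matching `td` elements, and passing that result to `html_pick:texts/1` would leave only the text binaries such as `<<" -5 ">>`.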

OK, from that I can get:

[<<" -5 ">>] 

How do I extract just the -5?


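You can use [binary_to_integer/1](http://erlang.org/doc/man/erlang.html#binary_to_integer-1), but you need to strip the spaces first. One option is [binary:part/2](http://www.erlang.org/doc/man/binary.html#part-2): `binary:part(ParsedInteger, {1, byte_size(ParsedInteger) - 2})`, which drops the first and the last byte. Another is a [bit string comprehension](http://erlang.org/doc/reference_manual/expressions.html#id81778): `<< <<Char>> || <<Char>> <= ParsedInteger, Char /= $\s >>`. Here `$\s` is the ASCII code of the space character (it can also be written as `$` followed by a literal space), just as `$A` is the code of A. –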
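
Put together, the suggestion above might look like the following minimal sketch. It uses the bit string comprehension rather than `binary:part/2`, since the comprehension does not rely on there being exactly one space on each side. The module name `temp_parse` and its two functions are hypothetical names chosen for illustration.

```erlang
-module(temp_parse).
-export([to_integer/1, to_integers/1]).

%% Drop every space byte from the binary, then convert what is left.
%% binary_to_integer/1 raises badarg if the remainder is not an integer.
to_integer(Bin) when is_binary(Bin) ->
    Trimmed = << <<Char>> || <<Char>> <= Bin, Char =/= $\s >>,
    binary_to_integer(Trimmed).

%% Convert a whole list of padded binaries.
to_integers(Bins) ->
    [to_integer(Bin) || Bin <- Bins].
```

For example, `temp_parse:to_integer(<<" -5 ">>)` evaluates to `-5`. Only space bytes are removed here; tabs or newlines in the cell text would still make the conversion fail.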


Great! Thanks a lot! – KonradPrg