2013-10-30 56 views
0

现在我有这个shell代码,我需要添加HTML标签到简单的文本。换句话说,我需要将文本格式化为基本的HTML代码。SHELL text to html

现在我有这个shell脚本:

#!/bin/sh 
# t2h {$1} html-ize a text file and save as foo.htm 
NL=" 
" 
cat $1 \ 
| sed -e 's/ at/at /g' \ 
| sed -e 's/[[:cntrl:]]/ /g'\ 
| sed -e 's/^[[:space:]]*$//g' \ 
| sed -e '/^$/{'"$NL"'N'"$NL"'/^\n$/D'"$NL"'}' \ 
| sed -e 's/^$/<\/UL><P>/g' \ 
| sed -e '/<P>$/{'"$NL"'N'"$NL"'s/\n//'"$NL"'}'\ 
| sed -e 's/<P>[[:space:]]*"/<P><UL>"/' \ 
| sed -e 's/^[[:space:]]*-/<BR> -/g' \ 
| sed -e 's/http:\/\/[[:graph:]\.\/]*/<A HREF="&">[&]<\/A> /g'\ 
           > foo.htm 

这里是文本的例子:

"Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.  " 

"Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.  " 

,它需要是这样的:

<html> 
<p>Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.</p><br /> 
<br /> 
<p>Greatest properly off ham exercise all. Unsatiable invitation its possession nor off. All difficulty estimating unreserved increasing the solicitude. Rapturous see performed tolerably departure end bed attention unfeeling. On unpleasing principles alteration of. Be at performed preferred determine collected. Him nay acuteness discourse listening estimable our law. Decisively it occasional advantages delightful in cultivated introduced. Like law mean form are sang loud lady put.</p> 
</html> 
+0

也许你可以使用降价? – BeniBela

+0

你的第一个“sed”命令实际上是在做什么吗?顺便说一句,你有脚本,输入和输出,但你也有一个问题呢? – thom

回答

0

我看到你想用“sed”来替换换行符,无论出于何种原因。
这是行不通的。
“sed”对于换行符是盲目的,因为“sed”是面向工作行的。 这意味着在开始评估之前,换行符被“sed”剥离。

最简单的解决方法是在用“sed”开始之前,使用“tr”将每个换行符替换为另一个字符。 这个其他字符不应该是可以在文本文件中找到的字符(或者您必须以某种方式构建自己的转义序列)。

如果您更正了这些错误,但它仍然无效,请发布您更正的代码。
并发布您期望的每个sed命令。

P.S.用bash的本地字符串处理可能会更好。

0

嗨尝试这样的事情bash脚本数据文件与文本文件进行格式化

#!/bin/bash 
echo '<html>' >foo.htm 
cat DataFile |tr -d '"'|grep -v "^$" |\ 
while read Line 
do 
cat <<eof 
    <p>${Line}</p><br><br> 
eof 
done >>foo.htm 
echo '</html>' >>foo.htm