2014-02-24 121 views

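Convert HTML to a properly formatted attributed string

I need to convert HTML data containing <h2>..</h2>, <p>..</p> and <a href=".."><img ..></a> elements into a properly formatted attributed string. I want to assign UIFontTextStyleHeadline1 to <h2> and UIFontTextStyleBody to <p>, and store the image links. I only need the output to carry attributes for the heading and body elements; I will handle the images separately.

So far I have this code: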

NSMutableAttributedString *content = [[NSMutableAttributedString alloc]
    initWithData:[[post objectForKey:@"content"] dataUsingEncoding:NSUTF8StringEncoding]
         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                   NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
    documentAttributes:nil
             error:nil];

Which produces output like this:

Heading 
{ 
    NSColor = "UIDeviceRGBColorSpace 0 0 0 1"; 
    NSFont = "<UICTFont: 0xd47bc00> font-family: \"TimesNewRomanPS-BoldMT\"; font-weight: bold; font-style: normal; font-size: 18.00pt"; 
    NSKern = 0; 
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 14.94, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 2"; 
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1"; 
    NSStrokeWidth = 0; 
}{ 
    NSAttachment = "<NSTextAttachment: 0xd486590>"; 
    NSColor = "UIDeviceRGBColorSpace 0 0 0.933333 1"; 
    NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt"; 
    NSKern = 0; 
    NSLink = "http://www.placeholder.com/image.jpg"; 
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0"; 
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0.933333 1"; 
    NSStrokeWidth = 0; 
} 
Body text, body text, body text. Body text, body text, body text. 
{ 
    NSColor = "UIDeviceRGBColorSpace 0 0 0 1"; 
    NSFont = "<UICTFont: 0xd47cdb0> font-family: \"Times New Roman\"; font-weight: normal; font-style: normal; font-size: 12.00pt"; 
    NSKern = 0; 
    NSParagraphStyle = "Alignment 4, LineSpacing 0, ParagraphSpacing 12, ParagraphSpacingBefore 0, HeadIndent 0, TailIndent 0, FirstLineHeadIndent 0, LineHeight 0/0, LineHeightMultiple 0, LineBreakMode 0, Tabs (\n), DefaultTabInterval 36, Blocks (null), Lists (null), BaseWritingDirection 0, HyphenationFactor 0, TighteningFactor 0, HeaderLevel 0"; 
    NSStrokeColor = "UIDeviceRGBColorSpace 0 0 0 1"; 
    NSStrokeWidth = 0; 
} 

I am new to attributed strings and am looking for an efficient way to convert these attributes to the standard fonts mentioned above. Thanks.

Answer

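In case anyone is looking for something similar: I ended up using the TFHpple library to separate the images from the text elements in the HTML data, and then changed the font attributes of the attributed string as follows: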

NSString *contentString = [self parseHTMLdata:bodyString];

NSMutableAttributedString *content = [[NSMutableAttributedString alloc]
    initWithData:[contentString dataUsingEncoding:NSUTF8StringEncoding]
         options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                   NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)}
    documentAttributes:nil
             error:nil];

// Walk the attribute runs and swap the parsed fonts for the app's own fonts.
NSRange effectiveRange = NSMakeRange(0, 0);
NSDictionary *attributes;

while (NSMaxRange(effectiveRange) < [content length]) {
    // Fetch the attributes of the next run; effectiveRange is updated to cover it.
    attributes = [content attributesAtIndex:NSMaxRange(effectiveRange) effectiveRange:&effectiveRange];
    UIFont *font = [attributes objectForKey:NSFontAttributeName];

    // In the output shown in the question, the heading run is 18 pt; everything else is treated as body text.
    if (font.pointSize == 18.0f) {
        [content addAttribute:NSFontAttributeName value:self.headlineFont range:effectiveRange];
    } else {
        [content addAttribute:NSFontAttributeName value:self.bodyFont range:effectiveRange];
    }
}
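
The snippet above does not show where the headline and body fonts come from. A minimal sketch of the same font swap, assuming they are the Dynamic Type fonts named in the question and that any run larger than the parsed body size is a heading. The function name ApplyTextStyles and the bodyPointSize parameter are made up for illustration and are not part of the original answer:

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper (not from the original answer): replaces every font run
// in `content` with a Dynamic Type font. Runs whose parsed font is larger than
// `bodyPointSize` are assumed to be headings.
static void ApplyTextStyles(NSMutableAttributedString *content, CGFloat bodyPointSize)
{
    UIFont *headlineFont = [UIFont preferredFontForTextStyle:UIFontTextStyleHeadline];
    UIFont *bodyFont = [UIFont preferredFontForTextStyle:UIFontTextStyleBody];

    [content beginEditing];
    // Changing attributes inside the block is allowed as long as the change
    // stays within the range passed to the block.
    [content enumerateAttribute:NSFontAttributeName
                        inRange:NSMakeRange(0, content.length)
                        options:0
                     usingBlock:^(id value, NSRange range, BOOL *stop) {
        UIFont *parsedFont = value;  // nil for runs without a font, treated as body
        BOOL isHeading = parsedFont.pointSize > bodyPointSize;
        [content addAttribute:NSFontAttributeName
                        value:(isHeading ? headlineFont : bodyFont)
                        range:range];
    }];
    [content endEditing];
}
```

With the output shown in the question (12 pt body, 18 pt heading), this could be called as ApplyTextStyles(content, 12.0). Comparing against a threshold instead of testing for exactly 18.0 avoids an exact floating-point match, but it still depends on the sizes the HTML importer happens to choose.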

And the hpple part:

- (NSString *)parseHTMLdata:(NSString *)content
{
    NSData *data = [content dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];

    NSString *xpathQueryString = @"//body";
    NSArray *elements = [[[parser searchWithXPathQuery:xpathQueryString] firstObject] children];

    NSMutableString *textContent = [[NSMutableString alloc] init];

    for (TFHppleElement *element in elements) {
        if ([[element tagName] isEqualToString:@"h2"] || [[element tagName] isEqualToString:@"p"]) {
            if ([[[element firstChild] tagName] isEqualToString:@"a"]) {
                // image element, just save it in array
            } else {
                // pure h2 or p element
                [textContent appendString:[element raw]];
            }
        }
    }

    return textContent;
}
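
The image branch above is left as a comment. A minimal sketch of how the links might be collected, assuming every image is an <img> wrapped directly in an <a href="..">, as in the question. The function name ImageLinksInHTML is hypothetical and not part of the original answer:

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper (not from the original answer): returns the href of
// every <a> element that has an <img> child, in document order.
static NSArray *ImageLinksInHTML(NSString *html)
{
    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];

    NSMutableArray *links = [NSMutableArray array];
    // "//a[img]" matches only anchors that directly contain an image.
    for (TFHppleElement *anchor in [parser searchWithXPathQuery:@"//a[img]"]) {
        NSString *href = [anchor objectForKey:@"href"];
        if (href.length > 0) {
            [links addObject:href];
        }
    }
    return links;
}
```

Running this as a separate pass keeps parseHTMLdata: focused on the text, at the cost of parsing the HTML twice.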

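Checking the font size in the attributes may look fragile. If it causes problems, I can dig deeper into the paragraph style, which keeps track of the heading/body tags.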
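
One way to avoid depending on the font size altogether is to choose the font from the tag name while walking the parsed elements, instead of running the text through the HTML importer. This is a different technique from the one described above, shown only as a sketch: it assumes plain-text <h2> and <p> elements, it drops inline markup such as <b> or <i>, and the function name AttributedTextFromHTML is hypothetical.

```objc
#import <UIKit/UIKit.h>
#import "TFHpple.h"

// Hypothetical helper (not from the original answer): builds the attributed
// string directly from the parsed elements, choosing the font by tag name.
static NSAttributedString *AttributedTextFromHTML(NSString *html)
{
    NSDictionary *fontsByTag = @{
        @"h2": [UIFont preferredFontForTextStyle:UIFontTextStyleHeadline],
        @"p":  [UIFont preferredFontForTextStyle:UIFontTextStyleBody]
    };

    NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
    TFHpple *parser = [[TFHpple alloc] initWithHTMLData:data];
    NSMutableAttributedString *result = [[NSMutableAttributedString alloc] init];

    // Select only the <h2> and <p> children of <body>.
    NSArray *elements = [parser searchWithXPathQuery:@"//body/*[self::h2 or self::p]"];
    for (TFHppleElement *element in elements) {
        // -text is nil when the element has no text child (e.g. an image link),
        // so those paragraphs are skipped.
        NSString *text = [element text];
        if (text.length == 0) {
            continue;
        }
        NSDictionary *attributes = @{NSFontAttributeName: fontsByTag[element.tagName]};
        NSString *paragraph = [text stringByAppendingString:@"\n"];
        [result appendAttributedString:
            [[NSAttributedString alloc] initWithString:paragraph attributes:attributes]];
    }
    return result;
}
```

The trade-off is that paragraph spacing, links and any other styling the importer would have produced are lost, so this only fits content as simple as the example in the question.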
