2012-01-15 131 views
3

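Split text into words, numbers, and punctuation

I need to split a phrase into words, numbers, punctuation marks, and spaces/tabs, and I want to preserve their original order.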

NSString *text = @"The 3 quick:\"brown fox, jump's\" over.";

This is the kind of list I need to produce:

['The', ' ', '3', ' ', 'quick', ':', '"', 'brown', ' ', 'fox', ',', ' ', 'jump's', '"', ' ', 'over', '.']

Thanks!!

+2

Where did you get the space between "quick" and ":"? – 2012-01-15 13:42:57

+1

Should all-digit strings be kept whole or split up? In other words, does "333 quick" become ['333', ' ', 'quick'] or ['3', '3', '3', ' ', 'quick']? – dasblinkenlight 2012-01-15 13:47:03

+0

Numbers should be kept whole. "333" would stay 333. – 2012-01-15 13:57:09
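The requirements settled in the question and comments above (keep the order, keep whitespace, keep digit runs whole) can be sketched with a scanner that classifies characters. This is only a minimal sketch, not code from the thread: `TokenizeByCharacterClass` is a hypothetical name, ARC is assumed, and treating the apostrophe as part of a word is an assumption made so that "jump's" stays in one piece.

```objc
#import <Foundation/Foundation.h>

// Hypothetical tokenizer, not from the thread: letters (plus apostrophes) and
// digits are grouped into runs; every other character becomes its own token.
static NSArray *TokenizeByCharacterClass(NSString *text)
{
    NSMutableCharacterSet *wordSet = [NSMutableCharacterSet letterCharacterSet];
    [wordSet addCharactersInString:@"'"];   // assumption: "jump's" is one word
    NSCharacterSet *digitSet = [NSCharacterSet decimalDigitCharacterSet];

    NSScanner *scanner = [NSScanner scannerWithString:text];
    scanner.charactersToBeSkipped = nil;    // keep spaces and tabs as tokens

    NSMutableArray *tokens = [NSMutableArray array];
    while (![scanner isAtEnd]) {
        NSString *token = nil;
        if ([scanner scanCharactersFromSet:wordSet intoString:&token] ||
            [scanner scanCharactersFromSet:digitSet intoString:&token]) {
            [tokens addObject:token];       // a whole word or a whole number
        } else {
            // Anything else (space, tab, punctuation) is emitted one
            // composed character at a time.
            NSUInteger loc = scanner.scanLocation;
            NSRange r = [text rangeOfComposedCharacterSequenceAtIndex:loc];
            [tokens addObject:[text substringWithRange:r]];
            scanner.scanLocation = NSMaxRange(r);
        }
    }
    return tokens;
}
```

Calling `TokenizeByCharacterClass(text)` on the sample string should keep "3" and "jump's" as single tokens. One known limitation of the apostrophe assumption: single quotes used as quotation marks would be glued to the neighbouring word.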

Answers

2

Try this category I wrote, which uses NSScanner and NSCharacterSet:

#import <Foundation/Foundation.h>

@interface NSString (Splitting)

-(NSArray *) arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *) charSet; 

@end 

@implementation NSString(Splitting) 

// Scans exactly one character if it belongs to charSet. On success, advances
// the scanner by one and (optionally) returns the character through outStr.
static BOOL scanOneCharacterFromSetIntoString(NSScanner *scanner, NSCharacterSet *charSet, NSString **outStr)
{
    NSString *inStr = scanner.string;

    // Nothing left to scan.
    if (scanner.scanLocation >= inStr.length)
    {
        return NO;
    }

    unichar ch = [inStr characterAtIndex:scanner.scanLocation];

    if (![charSet characterIsMember:ch])
    {
        return NO;
    }

    scanner.scanLocation++;
    if (outStr)
    {
        *outStr = [NSString stringWithCharacters:&ch length:1];
    }

    return YES;
}

- (NSArray *)arrayBySeparatingComponentsInCharacterSet:(NSCharacterSet *)charSet
{
    NSScanner *scanner = [NSScanner scannerWithString:self];
    // By default NSScanner skips whitespace and newlines before each scan,
    // which can drop spaces we want to keep as tokens.
    scanner.charactersToBeSkipped = nil;

    NSMutableArray *result = [NSMutableArray array];
    NSString *token = nil;

    // Each pass yields either a run of non-delimiter characters or exactly
    // one delimiter; both scans fail once the end of the string is reached.
    while ([scanner scanUpToCharactersFromSet:charSet intoString:&token] ||
           scanOneCharacterFromSetIntoString(scanner, charSet, &token))
    {
        [result addObject:token];
    }

    return result;
}

@end 

int main(int argc, const char *argv[])
{
    @autoreleasepool {
        NSString *str = @"The 3 quick:\"brown fox, jump's\" over.";
        NSCharacterSet *delimiters = [NSCharacterSet characterSetWithCharactersInString:@" :\",'."];
        NSArray *array = [str arrayBySeparatingComponentsInCharacterSet:delimiters];
        NSLog(@"%@", array);
    }
    return 0;
}

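That should be what you need; just change the character set to whatever suits you. Also note that this was compiled with ARC enabled, so it may or may not manage memory correctly in a manual reference-counting environment.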
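As an illustration of "just change the character set", the delimiter set could also be assembled from Foundation's predefined sets rather than typed by hand. A minimal sketch, assuming ARC; `DelimiterSetKeepingApostrophes` is a hypothetical helper name, and dropping the apostrophe from the delimiters is an assumption made so that "jump's" comes out as one token, as in the question's expected list.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original answer): builds a delimiter set
// from Foundation's predefined sets instead of a hand-typed string.
static NSCharacterSet *DelimiterSetKeepingApostrophes(void)
{
    // Start with all Unicode punctuation, then add whitespace and newlines.
    NSMutableCharacterSet *set = [NSMutableCharacterSet punctuationCharacterSet];
    [set formUnionWithCharacterSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

    // Assumption: apostrophes belong to words, so "jump's" is not split.
    [set removeCharactersInString:@"'"];
    return set;
}
```

Passing `DelimiterSetKeepingApostrophes()` to `arrayBySeparatingComponentsInCharacterSet:` in place of the hand-written set should split on any punctuation or whitespace while leaving apostrophes inside words. Digits are not delimiters here, so "3" and "333" stay whole only because they are surrounded by delimiters.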

+0

Thanks! It works wonderfully. You saved me a lot of frustration, not to mention time. – 2012-01-15 14:18:23

+1

Hey, no problem, just glad to help. – 2012-01-15 14:25:35

+0

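One problem: with NSString *str = @"hello world ..."; multiple punctuation marks at the end of the sentence cause a crash. Also, any ideas on how to handle an ellipsis (three dots "...")? – 2012-01-18 15:29:01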
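For the ellipsis question, one option is to leave the splitting code alone and merge adjacent "." tokens afterwards. A minimal sketch, assuming ARC; `MergeDotRuns` is a hypothetical name, and it merges any run of dots (two, three, or more), which may or may not be what is wanted.

```objc
#import <Foundation/Foundation.h>

// Hypothetical post-processing step (not part of the original answer):
// merges runs of adjacent "." tokens so that "..." comes out as one token.
static NSArray *MergeDotRuns(NSArray *tokens)
{
    NSMutableArray *merged = [NSMutableArray arrayWithCapacity:tokens.count];
    NSMutableString *dots = [NSMutableString string];

    for (NSString *token in tokens) {
        if ([token isEqualToString:@"."]) {
            [dots appendString:token];      // extend the current run of dots
            continue;
        }
        if (dots.length > 0) {              // a run just ended: flush it
            [merged addObject:[dots copy]];
            [dots setString:@""];
        }
        [merged addObject:token];
    }
    if (dots.length > 0) {                  // flush a run at the very end
        [merged addObject:[dots copy]];
    }
    return merged;
}
```

The function takes the token array produced by the splitting step and returns a new array; a single trailing "." is passed through unchanged because a run of length one is flushed as-is.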
