2010-03-28 137 views
0

How do I split a string on a delimiter that is longer than a single character?

Suppose I have this:

"foo bar 1 and foo bar 2" 

How can I split it into:

foo bar 1 
foo bar 2 

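I tried strtok() and strsep(), but neither worked. They do not treat "and" as one delimiter; they treat 'a', 'n' and 'd' as three separate delimiter characters.

Is there a function that can help me with this, or will I have to split on whitespace and do some string manipulation myself?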

Answers

5

You can use strstr() to find the first "and", then tokenize the string "by hand": skip past that many characters and search again.
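The search, skip, repeat idea could be sketched roughly as follows. This is only an illustration, not code from the answer: the function name `print_pieces` and its return value are made up for this example, and it assumes you simply want each piece printed on its own line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print each piece of `text` that lies between
   occurrences of `delim`, one per line, and return how many pieces were
   printed. An empty delimiter is rejected (returns 0) because strstr()
   would match it at every position and the loop would never advance. */
size_t print_pieces(const char *text, const char *delim)
{
    assert(text != NULL && delim != NULL);

    size_t delim_len = strlen(delim);
    size_t count = 0;
    const char *start = text;
    const char *hit;

    if (delim_len == 0)
        return 0;

    while ((hit = strstr(start, delim)) != NULL) {
        /* "%.*s" prints exactly (hit - start) characters; no copy needed. */
        printf("%.*s\n", (int)(hit - start), start);
        start = hit + delim_len;    /* skip past the delimiter */
        ++count;
    }
    printf("%s\n", start);          /* whatever follows the last delimiter */
    return count + 1;
}
```

Calling `print_pieces("foo bar 1 and foo bar 2", " and ")` prints the two pieces on separate lines and returns 2. Nothing is copied or modified, which sidesteps the fact that strtok() writes into its input.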

2

Here is a nice short example I just wrote that shows how to use strstr to split a string at a given substring:

#include <string.h> 
#include <stdio.h> 

void split(char *phrase, char *delimiter) 
{ 
    char *loc = strstr(phrase, delimiter); 
    if (loc == NULL) 
    { 
     printf("Could not find delimiter\n"); 
    } 
    else 
    { 
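     char buf[256]; /* malloc would be much more robust here */ 
     size_t length = strlen(delimiter); 
     size_t before = (size_t)(loc - phrase); 
     if (before >= sizeof buf) 
      before = sizeof buf - 1; /* truncate rather than overflow buf */ 
     memcpy(buf, phrase, before); 
     buf[before] = '\0'; /* strncpy would have left buf unterminated */ 
     printf("Before delimiter: '%s'\n", buf); 
     printf("After delimiter: '%s'\n", loc+length); 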
    } 
} 

int main() 
{ 
    split("foo bar 1 and foo bar 2", "and"); 
    printf("-----\n"); 
    split("foo bar 1 and foo bar 2", "quux"); 
    return 0; 
} 

Output:

 
Before delimiter: 'foo bar 1 ' 
After delimiter: ' foo bar 2' 
----- 
Could not find delimiter 

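Of course, I have not tested this thoroughly, and it may well be vulnerable to most of the standard buffer overflow problems related to string lengths; but it is at least a demonstrable example.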
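The comment in the code hints that malloc would be more robust than a fixed 256-byte buffer. A minimal sketch of that variant might look like the following. The name `split_before` and the `rest` out-parameter are assumptions made for this illustration, not part of the answer above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly malloc'd copy of the part of
   `phrase` that precedes the first occurrence of `delimiter`, and store a
   pointer to the text after the delimiter in `*rest`.
   Returns NULL if the delimiter is not found or the allocation fails;
   in that case `*rest` is left untouched. The caller frees the result. */
char *split_before(const char *phrase, const char *delimiter,
                   const char **rest)
{
    assert(phrase != NULL && delimiter != NULL && rest != NULL);

    const char *loc = strstr(phrase, delimiter);
    if (loc == NULL)
        return NULL;

    size_t len = (size_t)(loc - phrase);
    char *before = malloc(len + 1);     /* sized to fit, so no overflow */
    if (before == NULL)
        return NULL;

    memcpy(before, phrase, len);
    before[len] = '\0';
    *rest = loc + strlen(delimiter);
    return before;
}
```

Because the buffer is sized from the actual distance to the delimiter, the input length no longer matters; the price is that the caller has to free() the returned string.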

5

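The main problem with splitting strings in C is that it inevitably involves some dynamic memory management, and the standard library tends to avoid that wherever possible. That is why almost none of the standard C functions deal with dynamically allocated memory; only malloc/calloc/realloc do.

But it is not too hard to do this yourself. Let me walk you through it.

We need to return a number of strings, and the easiest way to do that is to return an array of pointers to strings, terminated by a NULL item. Except for the final NULL, each element of the array points to a dynamically allocated string.

First we need a couple of helper functions to deal with such arrays. The simplest one computes the number of strings (the elements before the final NULL):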

/* Return length of a NULL-delimited array of strings. */ 
size_t str_array_len(char **array) 
{ 
    size_t len; 

    for (len = 0; array[len] != NULL; ++len) 
     continue; 
    return len; 
} 

Another simple one is the function for freeing the array:

/* Free a dynamic array of dynamic strings. */ 
void str_array_free(char **array) 
{ 
    if (array == NULL) 
     return; 
    for (size_t i = 0; array[i] != NULL; ++i) 
     free(array[i]); 
    free(array); 
} 

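Somewhat more complicated is the function that adds a copy of a string to the array. It needs to handle a few special cases, such as the array not existing yet (the whole array is NULL). It also needs to handle strings that are not terminated with '\0', so that our actual splitting function can easily append just a part of the input string.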

/* Append an item to a dynamically allocated array of strings. On failure, 
    return NULL, in which case the original array is intact. The item 
    string is dynamically copied. If the array is NULL, allocate a new 
    array. Otherwise, extend the array. Make sure the array is always 
    NULL-terminated. Input string might not be '\0'-terminated. */ 
char **str_array_append(char **array, size_t nitems, const char *item, 
         size_t itemlen) 
{ 
    /* Make a dynamic copy of the item. */ 
    char *copy; 
    if (item == NULL) 
     copy = NULL; 
    else { 
     copy = malloc(itemlen + 1); 
     if (copy == NULL) 
      return NULL; 
     memcpy(copy, item, itemlen); 
     copy[itemlen] = '\0'; 
    } 

    /* Extend array with one element. Except extend it by two elements, 
     in case it did not yet exist. This might mean it is a teeny bit 
     too big, but we don't care. */ 
    array = realloc(array, (nitems + 2) * sizeof(array[0])); 
    if (array == NULL) { 
     free(copy); 
     return NULL; 
    } 

    /* Add copy of item to array, and return it. */ 
    array[nitems] = copy; 
    array[nitems+1] = NULL; 
    return array; 
} 

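This is an interesting one. For really good style, it would be better to split the making of a dynamic copy of the input item off into its own function, but I will leave that as an exercise for the reader.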
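One possible way to approach that exercise is sketched below. The helper name `str_copy_n` is hypothetical and does not appear in the answer's code; it simply isolates the malloc/memcpy/terminate steps from the top of str_array_append.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: make a dynamically allocated, '\0'-terminated copy
   of the first `itemlen` bytes of `item`. The input does not need to be
   '\0'-terminated itself. Returns NULL if the allocation fails. */
char *str_copy_n(const char *item, size_t itemlen)
{
    assert(item != NULL);

    char *copy = malloc(itemlen + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, item, itemlen);
    copy[itemlen] = '\0';
    return copy;
}
```

With something like this in place, the first part of str_array_append would presumably shrink to a check for `item == NULL` followed by a single call to the helper, leaving that function to deal only with growing the array.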

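Finally, we have the actual splitting function. It, too, needs to handle some special cases:

  • The input string may start or end with the separator.
  • There may be separators right next to each other.
  • The input string may not contain the separator at all.

I have chosen to add an empty string to the result if a separator is right next to the start or the end of the input string, or right next to another separator. If you need something else, you will have to tweak the code.

Apart from the special cases and some error handling, the splitting is now fairly straightforward.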

/* Split a string into substrings. Return dynamic array of dynamically 
    allocated substrings, or NULL if there was an error. Caller is 
    expected to free the memory, for example with str_array_free. */ 
char **str_split(const char *input, const char *sep) 
{ 
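    size_t nitems = 0; 
    char **array = NULL; 
    const char *start = input; 
    const char *next; 
    size_t seplen = strlen(sep); 
    const char *item; 
    size_t itemlen; 

    /* An empty separator matches everywhere without advancing, so the 
     loop below would never terminate. Treat it as an error. */ 
    if (seplen == 0) 
     return NULL; 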

    for (;;) { 
     next = strstr(start, sep); 
     if (next == NULL) { 
      /* Add the remaining string (or an empty string, if the input 
       ends with the separator). */ 
      char **new = str_array_append(array, nitems, start, strlen(start)); 
      if (new == NULL) { 
       str_array_free(array); 
       return NULL; 
      } 
      array = new; 
      ++nitems; 
      break; 
     } else if (next == input) { 
      /* Input starts with separator. */ 
      item = ""; 
      itemlen = 0; 
     } else { 
      item = start; 
      itemlen = next - item; 
     } 
     char **new = str_array_append(array, nitems, item, itemlen); 
     if (new == NULL) { 
      str_array_free(array); 
      return NULL; 
     } 
     array = new; 
     ++nitems; 
     start = next + seplen; 
    } 

    if (nitems == 0) { 
     /* Input does not contain separator at all. */ 
     assert(array == NULL); 
     array = str_array_append(array, nitems, input, strlen(input)); 
    } 

    return array; 
} 

Here is the whole program in one piece. It also contains a main program that runs some test cases.

#include <assert.h> 
#include <stdbool.h> 
#include <stdio.h> 
#include <stdlib.h> 
#include <string.h> 


/* Append an item to a dynamically allocated array of strings. On failure, 
    return NULL, in which case the original array is intact. The item 
    string is dynamically copied. If the array is NULL, allocate a new 
    array. Otherwise, extend the array. Make sure the array is always 
    NULL-terminated. Input string might not be '\0'-terminated. */ 
char **str_array_append(char **array, size_t nitems, const char *item, 
         size_t itemlen) 
{ 
    /* Make a dynamic copy of the item. */ 
    char *copy; 
    if (item == NULL) 
     copy = NULL; 
    else { 
     copy = malloc(itemlen + 1); 
     if (copy == NULL) 
      return NULL; 
     memcpy(copy, item, itemlen); 
     copy[itemlen] = '\0'; 
    } 

    /* Extend array with one element. Except extend it by two elements, 
     in case it did not yet exist. This might mean it is a teeny bit 
     too big, but we don't care. */ 
    array = realloc(array, (nitems + 2) * sizeof(array[0])); 
    if (array == NULL) { 
     free(copy); 
     return NULL; 
    } 

    /* Add copy of item to array, and return it. */ 
    array[nitems] = copy; 
    array[nitems+1] = NULL; 
    return array; 
} 


/* Free a dynamic array of dynamic strings. */ 
void str_array_free(char **array) 
{ 
    if (array == NULL) 
     return; 
    for (size_t i = 0; array[i] != NULL; ++i) 
     free(array[i]); 
    free(array); 
} 


/* Split a string into substrings. Return dynamic array of dynamically 
    allocated substrings, or NULL if there was an error. Caller is 
    expected to free the memory, for example with str_array_free. */ 
char **str_split(const char *input, const char *sep) 
{ 
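    size_t nitems = 0; 
    char **array = NULL; 
    const char *start = input; 
    const char *next; 
    size_t seplen = strlen(sep); 
    const char *item; 
    size_t itemlen; 

    /* An empty separator matches everywhere without advancing, so the 
     loop below would never terminate. Treat it as an error. */ 
    if (seplen == 0) 
     return NULL; 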

    for (;;) { 
     next = strstr(start, sep); 
     if (next == NULL) { 
      /* Add the remaining string (or an empty string, if the input 
       ends with the separator). */ 
      char **new = str_array_append(array, nitems, start, strlen(start)); 
      if (new == NULL) { 
       str_array_free(array); 
       return NULL; 
      } 
      array = new; 
      ++nitems; 
      break; 
     } else if (next == input) { 
      /* Input starts with separator. */ 
      item = ""; 
      itemlen = 0; 
     } else { 
      item = start; 
      itemlen = next - item; 
     } 
     char **new = str_array_append(array, nitems, item, itemlen); 
     if (new == NULL) { 
      str_array_free(array); 
      return NULL; 
     } 
     array = new; 
     ++nitems; 
     start = next + seplen; 
    } 

    if (nitems == 0) { 
     /* Input does not contain separator at all. */ 
     assert(array == NULL); 
     array = str_array_append(array, nitems, input, strlen(input)); 
    } 

    return array; 
} 


/* Return length of a NULL-delimited array of strings. */ 
size_t str_array_len(char **array) 
{ 
    size_t len; 

    for (len = 0; array[len] != NULL; ++len) 
     continue; 
    return len; 
} 


#define MAX_OUTPUT 20 


int main(void) 
{ 
    struct { 
     const char *input; 
     const char *sep; 
     char *output[MAX_OUTPUT]; 
    } tab[] = { 
     /* Input is empty string. Output should be a list with an empty 
      string. */ 
     { 
      "", 
      "and", 
      { 
       "", 
       NULL, 
      }, 
     }, 
     /* Input is exactly the separator. Output should be two empty 
      strings. */ 
     { 
      "and", 
      "and", 
      { 
       "", 
       "", 
       NULL, 
      }, 
     }, 
     /* Input is non-empty, but does not have separator. Output should 
      be the same string. */ 
     { 
      "foo", 
      "and", 
      { 
       "foo", 
       NULL, 
      }, 
     }, 
     /* Input is non-empty, and does have separator. */ 
     { 
      "foo bar 1 and foo bar 2", 
      " and ", 
      { 
       "foo bar 1", 
       "foo bar 2", 
       NULL, 
      }, 
     }, 
    }; 
    const int tab_len = sizeof(tab)/sizeof(tab[0]); 
    bool errors; 

    errors = false; 

    for (int i = 0; i < tab_len; ++i) { 
     printf("test %d\n", i); 

     char **output = str_split(tab[i].input, tab[i].sep); 
     if (output == NULL) { 
      fprintf(stderr, "output is NULL\n"); 
      errors = true; 
      break; 
     } 
     size_t num_output = str_array_len(output); 
     printf("num_output %lu\n", (unsigned long) num_output); 

     size_t num_correct = str_array_len(tab[i].output); 
     if (num_output != num_correct) { 
      fprintf(stderr, "wrong number of outputs (%lu, not %lu)\n", 
        (unsigned long) num_output, (unsigned long) num_correct); 
      errors = true; 
     } else { 
      for (size_t j = 0; j < num_output; ++j) { 
       if (strcmp(tab[i].output[j], output[j]) != 0) { 
        fprintf(stderr, "output[%lu] is '%s' not '%s'\n", 
          (unsigned long) j, output[j], tab[i].output[j]); 
        errors = true; 
        break; 
       } 
      } 
     } 

     str_array_free(output); 
     printf("\n"); 
    } 

    if (errors) 
     return EXIT_FAILURE; 
    return 0; 
} 

Thank you very much for this amazingly in-depth, working piece of code. – 2015-05-01 03:58:28

0

If you know the type of delimiter, for example a comma or a semicolon, you can try this:

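#include <stdio.h> 

int main(void) 
{ 
    int i = 0, temp = 0, temp1 = 0, temp2 = 0; 
    char buff[12] = "123;456;789"; 

    /* First field: digits up to the first ';'. */ 
    for (i = 0; buff[i] != ';'; i++) 
    { 
     temp = temp * 10 + (buff[i] - '0'); 
    } 
    /* Second field: step over the ';' and carry on from there. */ 
    for (i++; buff[i] != ';'; i++) 
    { 
     temp1 = temp1 * 10 + (buff[i] - '0'); 
    } 
    /* Third field: runs up to the terminating '\0'. */ 
    for (i++; buff[i] != '\0'; i++) 
    { 
     temp2 = temp2 * 10 + (buff[i] - '0'); 
    } 
    printf("temp=%d temp1=%d temp2=%d\n", temp, temp1, temp2); 
    return 0; 
} 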

Output:

temp=123 temp1=456 temp2=789
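
If the number of fields is not known in advance, the same single-character approach could be written as a loop around strtol(). This is only a sketch: the function name `parse_ints` and its parameters are invented for this example, and it assumes the fields are plain decimal integers separated by exactly one separator character.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: parse up to `max` decimal integers from `text`,
   separated by the single character `sep`, into `out`.
   Returns how many values were stored. Parsing stops at the first field
   that does not start with a number, or at the end of the string. */
size_t parse_ints(const char *text, char sep, long *out, size_t max)
{
    assert(text != NULL && out != NULL);

    size_t n = 0;
    const char *p = text;

    while (n < max) {
        char *end;
        long value = strtol(p, &end, 10);
        if (end == p)           /* no digits consumed: stop */
            break;
        out[n++] = value;
        if (*end != sep)        /* end of string or unexpected character */
            break;
        p = end + 1;            /* step over the separator */
    }
    return n;
}
```

For the string "123;456;789" with ';' as the separator this stores 123, 456 and 789 and returns 3. Like the code above, it only works for single-character delimiters, so it does not address the multi-character case from the question.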