2013-06-12 100 views
0

I am doing some text extraction in C and I get some "garbage strings" when I call free() inside my loop. Here is some sample text:

Sentence #1 (34 tokens): 
The Project Gutenberg EBook of Moby Dick; or The Whale, by Herman Melville 

This eBook is for the use of anyone anywhere at no cost and with 
almost no restrictions whatsoever. 
[Text=The CharacterOffsetBegin=0 CharacterOffsetEnd=3 PartOfSpeech=DT Lemma=the]      [Text=Project CharacterOffsetBegin=4 CharacterOffsetEnd=11 PartOfSpeech=NN Lemma=project] 

Question:

1 - Can I safely reuse a pointer variable after calling free() on it?

Thanks for your help!

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define LINE_M (1024*100)

    int main(int argc, char **argv)
    {
        FILE *file;
        char buff[LINE_M];
        char *lemma;
        char *text;
        char *sentence = NULL;
        char *p, *t;
        int numSent, numTok;

        file = fopen("moby.txt.out", "r");

        while (fgets(buff, LINE_M, file))
        {
            if (sscanf(buff, "Sentence #%d (%d tokens):", &numSent, &numTok))
                continue;

            if (strstr(buff, "[Text=") == NULL)
            {
                if (sentence != NULL)
                {
                    sentence = realloc(sentence, (strlen(buff) + strlen(sentence) + 2) * sizeof(char));
                    strcat(sentence, buff);
                }
                else
                {
                    sentence = malloc(sizeof(char) * (strlen(buff) + 1));
                    strcpy(sentence, buff);
                }

                continue;
            }

            p = buff;
            while ((p = strstr(p, "Text=")) != NULL)
            {
                p += 5;
                t = strchr(p, ' ');

                text = malloc((int)(t - p));
                strncpy(text, p, (int)(t - p));

                p = strstr(t + 1, "Lemma=") + 6;
                t = strchr(p, ']');

                lemma = malloc((int)(t - p) * sizeof(char));
                strncpy(lemma, p, (int)(t - p));

                p = t + 1;

                printf("%s\n", lemma);
                free(text);
                free(lemma);

                text = NULL;
                lemma = NULL;
            }

            free(sentence);
            sentence = NULL;
        }

        fclose(file);

        return 0;
    }
+0

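Reusing the memory that a freed pointer points to is not OK, but storing a new pointer in the same variable that held the freed pointer is perfectly fine. (I'm pretty sure you are asking about the second case.) – DaoWen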
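
To illustrate the distinction, here is a minimal sketch. The names `copy_string` and `reuse_pointer_demo` are made up for this example and do not appear in the question's code. The same variable `p` is assigned a fresh allocation after the first one has been freed, which is fine; what would not be fine is reading or writing through `p` between the `free` and the new assignment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails. */
char *copy_string(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);      /* +1 for the terminating '\0' */
    if (copy != NULL)
        memcpy(copy, s, len + 1);      /* copies the terminator as well */
    return copy;
}

/* Reuses one pointer variable for two separate allocations.
   Returns the length of the second string, or 0 if allocation failed. */
size_t reuse_pointer_demo(void)
{
    char *p = copy_string("first");
    free(p);                     /* the block is gone; p now dangles */
    p = NULL;                    /* optional, makes accidental use easier to spot */

    p = copy_string("second");   /* fine: p now holds a new, valid pointer */
    size_t len = (p != NULL) ? strlen(p) : 0;
    free(p);                     /* free(NULL) is a no-op, so this is always safe */
    return len;
}
```

Setting the variable to NULL after `free` is not required, but it turns an accidental use of the stale pointer into an immediate crash on most platforms instead of silent corruption.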

Answers

1

I suspect the strings you are copying are not null-terminated, so they may contain garbage characters when printed.

man strncpy

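The strncpy() function is similar, except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.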
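
One possible way to avoid the problem is to allocate one extra byte and write the terminator explicitly. The sketch below is only an illustration under that assumption: `copy_span` and `extract_field` are hypothetical helpers, not part of the question's code, and they also return NULL when the key or the stop character is missing instead of dereferencing a NULL result from `strstr` or `strchr`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the characters in [start, end) into a new
   heap buffer and terminates it. Returns NULL if malloc fails. */
char *copy_span(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && end >= start);
    size_t len = (size_t)(end - start);
    char *out = malloc(len + 1);   /* one extra byte for '\0' */
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';               /* strncpy(out, start, len) would not add this */
    return out;
}

/* Hypothetical helper: returns a copy of the text that follows `key`,
   up to (not including) the first `stop` character.
   Returns NULL if `key` or `stop` is not found. The caller frees the result. */
char *extract_field(const char *line, const char *key, char stop)
{
    const char *p = strstr(line, key);
    if (p == NULL)
        return NULL;
    p += strlen(key);
    const char *t = strchr(p, stop);
    if (t == NULL)
        return NULL;
    return copy_span(p, t);
}
```

With a line from the sample input, `extract_field(line, "Text=", ' ')` and `extract_field(line, "Lemma=", ']')` would take the place of the two `malloc`/`strncpy` pairs in the inner loop, and each result can be passed to `printf("%s\n", ...)` safely before being freed.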

+0

Strings are not 'NULL'-terminated, they are null-terminated. –

+1

If you want to be that pedantic: they are '\0'-terminated. – Kninnug

+1

Actually, I think NUL is the most appropriate term, but I changed it. Thanks. – eyalm