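Characters mutating when written to a char * array
This should be a simple function (it counts the number of unique characters in a string), but I'm running into a strange problem. Note that my code expects only the ASCII letters a-z and A-Z.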
int unique_chars(char* my_str) {
    //printf("starting unique_chars\n");
    char seen_buffer[52]; // max 52 letters a-z & A-Z
    int seen_count = 1; // not ever expecting my_str to be NULL
    int i, j;
    char next;
    //printf("first char is %c\n", my_str[0]);
    seen_buffer[0] = my_str[0]; // first char must be unique
    for (i=1; i<strlen(my_str); i++) { // walk along the rest of my_str
        next = my_str[i];
        if (next >= 97) {
            next = next - 32; // the next char will always be capital, for convenience
        }
        for (j=0; j<seen_count; j++) { // compare next to all the unique chars seen before
            //printf("current char is %c, checking against %c\n", next, seen_buffer[j]);
            if ((next==seen_buffer[j]) || (next+32==seen_buffer[j])) {
                //printf("breaking\n");
                break; // jump to the next char in my_str if we find a match
            }
            if (j==seen_count-1) { // at this point, we're sure that next hasn't been seen yet
                //printf("new unique char is %c\n", next);
                seen_count++;
                seen_buffer[seen_count] = next;
                //printf("new char val is %c, should be %c\n", seen_buffer[seen_count], next);
                break;
            }
        }
    }
    return seen_count;
}
int main(int argc, char* argv[]){
    char* to_encode = argv[1];
    printf("unique chars: %d\n", unique_chars(to_encode));
}
When I call it with certain strings, I get incorrect results. For example, try:
./a.out gghhiijj
This produces (with the printf calls uncommented):
starting unique_chars
first char is g
current char is G, checking against g
breaking
current char is H, checking against g
new unique char is H
new char val is H, should be H
current char is H, checking against g
current char is H, checking against
new unique char is H
new char val is H, should be H
current char is I, checking against g
current char is I, checking against
current char is I, checking against H
new unique char is I
new char val is I, should be I
current char is I, checking against g
current char is I, checking against
current char is I, checking against H
current char is I, checking against H
new unique char is I
new char val is I, should be I
current char is J, checking against g
current char is J, checking against
current char is J, checking against H
current char is J, checking against H
current char is J, checking against I
new unique char is J
new char val is J, should be J
current char is J, checking against g
current char is J, checking against
current char is J, checking against H
current char is J, checking against H
current char is J, checking against I
current char is J, checking against I
new unique char is J
new char val is J, should be J
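So I keep getting duplicates in my seen_buffer, because some whitespace-looking character ends up stored where the letter should be! Yet when I compare right after writing to seen_buffer (the "new char val is %c, should be %c\n" line), the correct character is shown!
Any help is appreciated!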
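One plausible reading of the trace: seen_count++ runs before seen_buffer[seen_count] = next, so each new letter is written one slot past the next free index. That leaves an uninitialized gap (the blank in the output), and the debug printf reads back the slot that was just written, which is why it looks correct. A minimal sketch of a fix that keeps the same approach follows; the name unique_chars_fixed is hypothetical, and it assumes the input is non-NULL and contains only ASCII letters:

```c
#include <string.h>

/* Hypothetical corrected variant of unique_chars (not from the post).
   Assumes my_str is non-NULL and holds only ASCII letters a-z / A-Z. */
int unique_chars_fixed(const char *my_str) {
    char seen_buffer[26];   /* letters are folded to upper case, so 26 slots suffice */
    int seen_count = 0;
    size_t i, len = strlen(my_str);
    int j;

    for (i = 0; i < len; i++) {
        char next = my_str[i];
        if (next >= 'a') {
            next = next - ('a' - 'A');   /* fold to upper case */
        }
        for (j = 0; j < seen_count; j++) {
            if (next == seen_buffer[j]) {
                break;                   /* already seen */
            }
        }
        if (j == seen_count) {           /* inner loop ended without a match */
            seen_buffer[seen_count] = next;   /* store first ... */
            seen_count++;                     /* ... then advance the count */
        }
    }
    return seen_count;
}
```

Starting seen_count at 0 and folding every character (including the first) removes the need for the next+32 comparison, and the "not found" check moves out of the inner loop so it also works when nothing has been seen yet.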
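'if (next >= 97) {' // What is the value of 'a' in the EBCDIC character set? Look it up. What's the point of C code if it isn't portable? Look up the history of C. Why don't you use 'a' instead of 97? – Sebivor 2013-02-28 05:11:34
Suppose you want to check whether a character is lowercase: 'if (islower((unsigned char) next)) { ... }'. Now suppose you want to convert that lowercase char to an uppercase char: 'next = islower((unsigned char) next) ? toupper((unsigned char) next) : next;'. Let your compiler do the optimisation for you, because it's smart enough to perform dead code elimination and tail call optimisation. – Sebivor 2013-02-28 05:15:31
Thanks for the suggestions! I didn't know islower existed - very helpful! – David 2013-02-28 05:21:03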
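To illustrate the <ctype.h> suggestion from the comments, here is a hedged sketch that avoids hard-coded character values such as 97 and 32. The function name unique_letters and the flag-table approach are assumptions for illustration, not the asker's code; unlike the original, it skips any non-letter characters rather than assuming there are none:

```c
#include <ctype.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from the post): counts distinct letters,
   case-insensitively, ignoring non-letter characters. */
int unique_letters(const char *s) {
    bool seen[UCHAR_MAX + 1] = { false };   /* one flag per possible char value */
    int count = 0;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;   /* ctype functions need unsigned char values */
        if (!isalpha(c)) {
            continue;                          /* skip digits, punctuation, etc. */
        }
        c = (unsigned char)toupper(c);         /* fold case without magic numbers */
        if (!seen[c]) {
            seen[c] = true;
            count++;
        }
    }
    return count;
}
```

Because it relies on isalpha and toupper rather than ASCII offsets, the same code should also behave sensibly on a non-ASCII execution character set such as EBCDIC.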