2016-02-20 49 views

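C program gives unexpected output

I've been working on projects (not necessarily this one) for far too long tonight, and I'm struggling with a program I'm writing for one of my classes. I've stared at it long enough without getting anywhere, so I think it's time for another set (or several sets) of eyes.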

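Eventually, the program will take an integer as input, convert it to hex, and print the name of the language whose range of Unicode hex values contains it. The hex ranges for the languages are in a text file (Blocks.txt) that our program is required to use.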

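That said, I haven't gotten that far yet and am stuck on an intermediate problem. Right now I'm just trying to store all the language names in a char[] field of an array of structs. The struct is one I defined myself. I print all the stored languages for testing purposes, but the output isn't what I expected or wanted.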

What prints to the console is correct; the problem is what doesn't print. The program stops about 40 lines short of the last language. I have no idea why.

Another (possibly) important detail: the first thing printed to the console is about 20 (definitely fewer than 40) blank lines.

One last thing: I apologize for my messy noob code. I'm a beginner in C, and I might even say I don't know what I'm doing. Thanks to anyone who decides to help.

I'll post my code, my output, and Blocks.txt below:

My code

//UnicodeBlocks.c 

#include <stdio.h> 
#include <string.h> 

typedef struct{ 
     int max; 
     int min; 
     char lang[300]; 
     //Contains a string holding hex ranges and a language. 
     char lineinfo[300]; 
    } LoadData; 

int main() 
{ 
    LoadData data[300]; 
    char hexstring[300]; 
    int languagecount = 0; 

    FILE *blocks; 
    blocks = fopen("Blocks.txt", "r"); 
    if(blocks == NULL) 
    { 
     perror("Blocks.txt not found"); 
     return(-1); 
    } 

    //Store the hex value ranges and their affiliated 
    //languages into data.allinfo. 
    for(int i = 0; fgets(hexstring,100,blocks) != NULL; i++) 
    { 
     if(hexstring[0] != '\n' && hexstring[0] != '#') 
       { 
        strcpy(data[i].lineinfo, hexstring); 
        languagecount++; 
       } 
    } 

    for(int i = 0; i < languagecount; i++) 
    { 
     char temp1[300]; 
     memset(temp1, 0, sizeof temp1); 
     strcpy(temp1, data[i].lineinfo); 

     for(int k = 0; k < strlen(temp1); k++) 
     { 
      char temp2[300]; 
      memset(temp2, 0, sizeof temp2); 
      if(temp1[k] == ';') 
      { 
       k = k + 2; 
       for(int j = 0; k < strlen(temp1); j++) 
       { 
        temp2[j] = temp1[k]; 
        k++; 
       } 
       strcpy(data[i].lang, temp2); 
       //break; 
      } 
     } 
    } 

    for(int i = 0; i < languagecount; i++) 
    { 
     puts(data[i].lang); 
    } 

    fclose(blocks); 

} 

My output

//Insert about 20 blank lines here. I'm sure this is part of the issue, I  
//just don't know why 

Basic Latin 

Latin-1 Supplement 

Latin Extended-A 

Latin Extended-B 

IPA Extensions 

Spacing Modifier Letters 

Combining Diacritical Marks 

Greek and Coptic 

Cyrillic 

Cyrillic Supplement 

Armenian 

Hebrew 

Arabic 

Syriac 

Arabic Supplement 

Thaana 

NKo 

Samaritan 

Mandaic 

Arabic Extended-A 

Devanagari 

Bengali 

Gurmukhi 

Gujarati 

Oriya 

Tamil 

Telugu 

Kannada 

Malayalam 

Sinhala 

Thai 

Lao 

Tibetan 

Myanmar 

Georgian 

Hangul Jamo 

Ethiopic 

Ethiopic Supplement 

Cherokee 

Unified Canadian Aboriginal Syllabics 

Ogham 

Runic 

Tagalog 

Hanunoo 

Buhid 

Tagbanwa 

Khmer 

Mongolian 

Unified Canadian Aboriginal Syllabics Extended 

Limbu 

Tai Le 

New Tai Lue 

Khmer Symbols 

Buginese 

Tai Tham 

Combining Diacritical Marks Extended 

Balinese 

Sundanese 

Batak 

Lepcha 

Ol Chiki 

Sundanese Supplement 

Vedic Extensions 

Phonetic Extensions 

Phonetic Extensions Supplement 

Combining Diacritical Marks Supplement 

Latin Extended Additional 

Greek Extended 

General Punctuation 

Superscripts and Subscripts 

Currency Symbols 

Combining Diacritical Marks for Symbols 

Letterlike Symbols 

Number Forms 

Arrows 

Mathematical Operators 

Miscellaneous Technical 

Control Pictures 

Optical Character Recognition 

Enclosed Alphanumerics 

Box Drawing 

Block Elements 

Geometric Shapes 

Miscellaneous Symbols 

Dingbats 

Miscellaneous Mathematical Symbols-A 

Supplemental Arrows-A 

Braille Patterns 

Supplemental Arrows-B 

Miscellaneous Mathematical Symbols-B 

Supplemental Mathematical Operators 

Miscellaneous Symbols and Arrows 

Glagolitic 

Latin Extended-C 

Coptic 

Georgian Supplement 

Tifinagh 

Ethiopic Extended 

Cyrillic Extended-A 

Supplemental Punctuation 

CJK Radicals Supplement 

Kangxi Radicals 

Ideographic Description Characters 

CJK Symbols and Punctuation 

Hiragana 

Katakana 

Bopomofo 

Hangul Compatibility Jamo 

Kanbun 

Bopomofo Extended 

CJK Strokes 

Katakana Phonetic Extensions 

Enclosed CJK Letters and Months 

CJK Compatibility 

CJK Unified Ideographs Extension A 

Yijing Hexagram Symbols 

CJK Unified Ideographs 

Yi Syllables 

Yi Radicals 

Lisu 

Vai 

Cyrillic Extended-B 

Bamum 

Modifier Tone Letters 

Latin Extended-D 

Syloti Nagri 

Common Indic Number Forms 

Phags-pa 

Saurashtra 

Devanagari Extended 

Kayah Li 

Rejang 

Hangul Jamo Extended-A 

Javanese 

Myanmar Extended-B 

Cham 

Myanmar Extended-A 

Tai Viet 

Meetei Mayek Extensions 

Ethiopic Extended-A 

Latin Extended-E 

Cherokee Supplement 

Meetei Mayek 

Hangul Syllables 

Hangul Jamo Extended-B 

High Surrogates 

High Private Use Surrogates 

Low Surrogates 

Private Use Area 

CJK Compatibility Ideographs 

Alphabetic Presentation Forms 

Arabic Presentation Forms-A 

Variation Selectors 

Vertical Forms 

Combining Half Marks 

CJK Compatibility Forms 

Small Form Variants 

Arabic Presentation Forms-B 

Halfwidth and Fullwidth Forms 

Specials 

Linear B Syllabary 

Linear B Ideograms 

Aegean Numbers 

Ancient Greek Numbers 

Ancient Symbols 

Phaistos Disc 

Lycian 

Carian 

Coptic Epact Numbers 

Old Italic 

Gothic 

Old Permic 

Ugaritic 

Old Persian 

Deseret 

Shavian 

Osmanya 

Elbasan 

Caucasian Albanian 

Linear A 

Cypriot Syllabary 

Imperial Aramaic 

Palmyrene 

Nabataean 

Hatran 

Phoenician 

Lydian 

Meroitic Hieroglyphs 

Meroitic Cursive 

Kharoshthi 

Old South Arabian 

Old North Arabian 

Manichaean 

Avestan 

Inscriptional Parthian 

Inscriptional Pahlavi 

Psalter Pahlavi 

Old Turkic 

Old Hungarian 

Rumi Numeral Symbols 

Brahmi 

Kaithi 

Sora Sompeng 

Chakma 

Mahajani 

Sharada 

Sinhala Archaic Numbers 

Khojki 

Multani 

Khudawadi 

Grantha 

Tirhuta 

Siddham 

Modi 

Takri 

Ahom 

Warang Citi 

Pau Cin Hau 

Cuneiform 

Cuneiform Numbers and Punctuation 

Early Dynastic Cuneiform 

Egyptian Hieroglyphs 

Anatolian Hieroglyphs 

Bamum Supplement 

Mro 

Bassa Vah 

Pahawh Hmong 

Miao 

Blocks.txt:

# Blocks-8.0.0.txt 
# Date: 2014-11-10, 23:04:00 GMT [KW] 
# 
# Unicode Character Database 
# Copyright (c) 1991-2014 Unicode, Inc. 
# For terms of use, see http://www.unicode.org/terms_of_use.html 
# For documentation, see http://www.unicode.org/reports/tr44/ 
# 
# Format: 
# Start Code..End Code; Block Name 

# ================================================ 

# Note: When comparing block names, casing, whitespace, hyphens, 
#   and underbars are ignored. 
#   For example, "Latin Extended-A" and "latin extended a" are equivalent. 
#   For more information on the comparison of property values, 
#   see UAX #44: http://www.unicode.org/reports/tr44/ 
# 
# All block ranges start with a value where (cp MOD 16) = 0, 
# and end with a value where (cp MOD 16) = 15. In other words, 
# the last hexadecimal digit of the start of range is ...0 
# and the last hexadecimal digit of the end of range is ...F. 
# This constraint on block ranges guarantees that allocations 
# are done in terms of whole columns, and that code chart display 
# never involves splitting columns in the charts. 
# 
# All code points not explicitly listed for Block 
# have the value No_Block. 

# Property: Block 
# 
# @missing: 0000..10FFFF; No_Block 

0000..007F; Basic Latin 
0080..00FF; Latin-1 Supplement 
0100..017F; Latin Extended-A 
0180..024F; Latin Extended-B 
0250..02AF; IPA Extensions 
02B0..02FF; Spacing Modifier Letters 
0300..036F; Combining Diacritical Marks 
0370..03FF; Greek and Coptic 
0400..04FF; Cyrillic 
0500..052F; Cyrillic Supplement 
0530..058F; Armenian 
0590..05FF; Hebrew 
0600..06FF; Arabic 
0700..074F; Syriac 
0750..077F; Arabic Supplement 
0780..07BF; Thaana 
07C0..07FF; NKo 
0800..083F; Samaritan 
0840..085F; Mandaic 
08A0..08FF; Arabic Extended-A 
0900..097F; Devanagari 
0980..09FF; Bengali 
0A00..0A7F; Gurmukhi 
0A80..0AFF; Gujarati 
0B00..0B7F; Oriya 
0B80..0BFF; Tamil 
0C00..0C7F; Telugu 
0C80..0CFF; Kannada 
0D00..0D7F; Malayalam 
0D80..0DFF; Sinhala 
0E00..0E7F; Thai 
0E80..0EFF; Lao 
0F00..0FFF; Tibetan 
1000..109F; Myanmar 
10A0..10FF; Georgian 
1100..11FF; Hangul Jamo 
1200..137F; Ethiopic 
1380..139F; Ethiopic Supplement 
13A0..13FF; Cherokee 
1400..167F; Unified Canadian Aboriginal Syllabics 
1680..169F; Ogham 
16A0..16FF; Runic 
1700..171F; Tagalog 
1720..173F; Hanunoo 
1740..175F; Buhid 
1760..177F; Tagbanwa 
1780..17FF; Khmer 
1800..18AF; Mongolian 
18B0..18FF; Unified Canadian Aboriginal Syllabics Extended 
1900..194F; Limbu 
1950..197F; Tai Le 
1980..19DF; New Tai Lue 
19E0..19FF; Khmer Symbols 
1A00..1A1F; Buginese 
1A20..1AAF; Tai Tham 
1AB0..1AFF; Combining Diacritical Marks Extended 
1B00..1B7F; Balinese 
1B80..1BBF; Sundanese 
1BC0..1BFF; Batak 
1C00..1C4F; Lepcha 
1C50..1C7F; Ol Chiki 
1CC0..1CCF; Sundanese Supplement 
1CD0..1CFF; Vedic Extensions 
1D00..1D7F; Phonetic Extensions 
1D80..1DBF; Phonetic Extensions Supplement 
1DC0..1DFF; Combining Diacritical Marks Supplement 
1E00..1EFF; Latin Extended Additional 
1F00..1FFF; Greek Extended 
2000..206F; General Punctuation 
2070..209F; Superscripts and Subscripts 
20A0..20CF; Currency Symbols 
20D0..20FF; Combining Diacritical Marks for Symbols 
2100..214F; Letterlike Symbols 
2150..218F; Number Forms 
2190..21FF; Arrows 
2200..22FF; Mathematical Operators 
2300..23FF; Miscellaneous Technical 
2400..243F; Control Pictures 
2440..245F; Optical Character Recognition 
2460..24FF; Enclosed Alphanumerics 
2500..257F; Box Drawing 
2580..259F; Block Elements 
25A0..25FF; Geometric Shapes 
2600..26FF; Miscellaneous Symbols 
2700..27BF; Dingbats 
27C0..27EF; Miscellaneous Mathematical Symbols-A 
27F0..27FF; Supplemental Arrows-A 
2800..28FF; Braille Patterns 
2900..297F; Supplemental Arrows-B 
2980..29FF; Miscellaneous Mathematical Symbols-B 
2A00..2AFF; Supplemental Mathematical Operators 
2B00..2BFF; Miscellaneous Symbols and Arrows 
2C00..2C5F; Glagolitic 
2C60..2C7F; Latin Extended-C 
2C80..2CFF; Coptic 
2D00..2D2F; Georgian Supplement 
2D30..2D7F; Tifinagh 
2D80..2DDF; Ethiopic Extended 
2DE0..2DFF; Cyrillic Extended-A 
2E00..2E7F; Supplemental Punctuation 
2E80..2EFF; CJK Radicals Supplement 
2F00..2FDF; Kangxi Radicals 
2FF0..2FFF; Ideographic Description Characters 
3000..303F; CJK Symbols and Punctuation 
3040..309F; Hiragana 
30A0..30FF; Katakana 
3100..312F; Bopomofo 
3130..318F; Hangul Compatibility Jamo 
3190..319F; Kanbun 
31A0..31BF; Bopomofo Extended 
31C0..31EF; CJK Strokes 
31F0..31FF; Katakana Phonetic Extensions 
3200..32FF; Enclosed CJK Letters and Months 
3300..33FF; CJK Compatibility 
3400..4DBF; CJK Unified Ideographs Extension A 
4DC0..4DFF; Yijing Hexagram Symbols 
4E00..9FFF; CJK Unified Ideographs 
A000..A48F; Yi Syllables 
A490..A4CF; Yi Radicals 
A4D0..A4FF; Lisu 
A500..A63F; Vai 
A640..A69F; Cyrillic Extended-B 
A6A0..A6FF; Bamum 
A700..A71F; Modifier Tone Letters 
A720..A7FF; Latin Extended-D 
A800..A82F; Syloti Nagri 
A830..A83F; Common Indic Number Forms 
A840..A87F; Phags-pa 
A880..A8DF; Saurashtra 
A8E0..A8FF; Devanagari Extended 
A900..A92F; Kayah Li 
A930..A95F; Rejang 
A960..A97F; Hangul Jamo Extended-A 
A980..A9DF; Javanese 
A9E0..A9FF; Myanmar Extended-B 
AA00..AA5F; Cham 
AA60..AA7F; Myanmar Extended-A 
AA80..AADF; Tai Viet 
AAE0..AAFF; Meetei Mayek Extensions 
AB00..AB2F; Ethiopic Extended-A 
AB30..AB6F; Latin Extended-E 
AB70..ABBF; Cherokee Supplement 
ABC0..ABFF; Meetei Mayek 
AC00..D7AF; Hangul Syllables 
D7B0..D7FF; Hangul Jamo Extended-B 
D800..DB7F; High Surrogates 
DB80..DBFF; High Private Use Surrogates 
DC00..DFFF; Low Surrogates 
E000..F8FF; Private Use Area 
F900..FAFF; CJK Compatibility Ideographs 
FB00..FB4F; Alphabetic Presentation Forms 
FB50..FDFF; Arabic Presentation Forms-A 
FE00..FE0F; Variation Selectors 
FE10..FE1F; Vertical Forms 
FE20..FE2F; Combining Half Marks 
FE30..FE4F; CJK Compatibility Forms 
FE50..FE6F; Small Form Variants 
FE70..FEFF; Arabic Presentation Forms-B 
FF00..FFEF; Halfwidth and Fullwidth Forms 
FFF0..FFFF; Specials 
10000..1007F; Linear B Syllabary 
10080..100FF; Linear B Ideograms 
10100..1013F; Aegean Numbers 
10140..1018F; Ancient Greek Numbers 
10190..101CF; Ancient Symbols 
101D0..101FF; Phaistos Disc 
10280..1029F; Lycian 
102A0..102DF; Carian 
102E0..102FF; Coptic Epact Numbers 
10300..1032F; Old Italic 
10330..1034F; Gothic 
10350..1037F; Old Permic 
10380..1039F; Ugaritic 
103A0..103DF; Old Persian 
10400..1044F; Deseret 
10450..1047F; Shavian 
10480..104AF; Osmanya 
10500..1052F; Elbasan 
10530..1056F; Caucasian Albanian 
10600..1077F; Linear A 
10800..1083F; Cypriot Syllabary 
10840..1085F; Imperial Aramaic 
10860..1087F; Palmyrene 
10880..108AF; Nabataean 
108E0..108FF; Hatran 
10900..1091F; Phoenician 
10920..1093F; Lydian 
10980..1099F; Meroitic Hieroglyphs 
109A0..109FF; Meroitic Cursive 
10A00..10A5F; Kharoshthi 
10A60..10A7F; Old South Arabian 
10A80..10A9F; Old North Arabian 
10AC0..10AFF; Manichaean 
10B00..10B3F; Avestan 
10B40..10B5F; Inscriptional Parthian 
10B60..10B7F; Inscriptional Pahlavi 
10B80..10BAF; Psalter Pahlavi 
10C00..10C4F; Old Turkic 
10C80..10CFF; Old Hungarian 
10E60..10E7F; Rumi Numeral Symbols 
11000..1107F; Brahmi 
11080..110CF; Kaithi 
110D0..110FF; Sora Sompeng 
11100..1114F; Chakma 
11150..1117F; Mahajani 
11180..111DF; Sharada 
111E0..111FF; Sinhala Archaic Numbers 
11200..1124F; Khojki 
11280..112AF; Multani 
112B0..112FF; Khudawadi 
11300..1137F; Grantha 
11480..114DF; Tirhuta 
11580..115FF; Siddham 
11600..1165F; Modi 
11680..116CF; Takri 
11700..1173F; Ahom 
118A0..118FF; Warang Citi 
11AC0..11AFF; Pau Cin Hau 
12000..123FF; Cuneiform 
12400..1247F; Cuneiform Numbers and Punctuation 
12480..1254F; Early Dynastic Cuneiform 
13000..1342F; Egyptian Hieroglyphs 
14400..1467F; Anatolian Hieroglyphs 
16800..16A3F; Bamum Supplement 
16A40..16A6F; Mro 
16AD0..16AFF; Bassa Vah 
16B00..16B8F; Pahawh Hmong 
16F00..16F9F; Miao     //This is where the output ends. 
1B000..1B0FF; Kana Supplement 
1BC00..1BC9F; Duployan 
1BCA0..1BCAF; Shorthand Format Controls 
1D000..1D0FF; Byzantine Musical Symbols 
1D100..1D1FF; Musical Symbols 
1D200..1D24F; Ancient Greek Musical Notation 
1D300..1D35F; Tai Xuan Jing Symbols 
1D360..1D37F; Counting Rod Numerals 
1D400..1D7FF; Mathematical Alphanumeric Symbols 
1D800..1DAAF; Sutton SignWriting 
1E800..1E8DF; Mende Kikakui 
1EE00..1EEFF; Arabic Mathematical Alphabetic Symbols 
1F000..1F02F; Mahjong Tiles 
1F030..1F09F; Domino Tiles 
1F0A0..1F0FF; Playing Cards 
1F100..1F1FF; Enclosed Alphanumeric Supplement 
1F200..1F2FF; Enclosed Ideographic Supplement 
1F300..1F5FF; Miscellaneous Symbols and Pictographs 
1F600..1F64F; Emoticons 
1F650..1F67F; Ornamental Dingbats 
1F680..1F6FF; Transport and Map Symbols 
1F700..1F77F; Alchemical Symbols 
1F780..1F7FF; Geometric Shapes Extended 
1F800..1F8FF; Supplemental Arrows-C 
1F900..1F9FF; Supplemental Symbols and Pictographs 
20000..2A6DF; CJK Unified Ideographs Extension B 
2A700..2B73F; CJK Unified Ideographs Extension C 
2B740..2B81F; CJK Unified Ideographs Extension D 
2B820..2CEAF; CJK Unified Ideographs Extension E 
2F800..2FA1F; CJK Compatibility Ideographs Supplement 
E0000..E007F; Tags 
E0100..E01EF; Variation Selectors Supplement 
F0000..FFFFF; Supplementary Private Use Area-A 
100000..10FFFF; Supplementary Private Use Area-B 

# EOF 

Is 300 large enough for the number of lines being loaded? –


Could you shorten your story and then state your main question? It would be easier for us to help you. – Nik

Answer


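You have gaps in your data array. The first for loop always increments i, but strcpy(data[i].lineinfo, hexstring); is not executed on every iteration. That is, for every blank or comment line in the input, the data entry at that index is never filled in. The print loop then walks indices 0 through languagecount - 1, so it prints the unfilled entries at the start (your blank lines) and stops before it reaches the last entries that were actually stored.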

Try changing your first for loop as shown below. The key change is using languagecount as the array index instead of i:

while (fgets(hexstring,100,blocks) != NULL) 
{ 
    if(hexstring[0] != '\n' && hexstring[0] != '#') 
    { 
     strcpy(data[languagecount].lineinfo, hexstring); 
     languagecount++; 
    } 
} 

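There are other problems with your code. Notably, you need to be more careful about guarding against buffer overflows. But since your work is still in progress, I'll leave that as an exercise for you to complete.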


Ah, awesome. That makes so much sense now. Thank you very much. –