Lexical analyzer for C in Python

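I wrote a lexical analyzer for C in Python as part of developing a parser. In my code I wrote some methods that recognize keywords, numbers, operators, and so on. It shows no errors when compiled. When it runs, I can enter a .c file. The output should list all the keywords, identifiers, etc. found in the input file, but it shows nothing. Can anyone help me? The code is attached.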

import sys 
import string 
delim=['\t','\n',',',';','(',')','{','}','[',']','#','<','>'] 
oper=['+','-','*','/','%','=','!'] 
key=["int","float","char","double","bool","void","extern","unsigned","goto","static","class","struct","for","if","else","return","register","long","while","do"] 
predirect=["include","define"] 
header=["stdio.h","conio.h","malloc.h","process.h","string.h","ctype.h"] 
word_list1="" 
i=0 
j=0 
f=0 
numflag=0 
token=[0]*50 


def isdelim(c): 
    for k in range(0,14): 
     if c==delim[k]: 
      return 1 
     return 0 

def isop(c): 
    for k in range(0,7): 
     if c==oper[k]: 
      ch=word_list1[i+1] 
      i+=1 
      for j in range(0,6): 
       if ch==oper[j]: 
        fop=1 
        sop=ch 
        return 1 
       #ungetc(ch,fp); 
       return 1 
       j+=1 
     return 0; 
     k+=1 

def check(t): 
    print t 
    if numflag==1: 
     print "\n number "+str(t) 
     return 
    for k in range(0,2):#(i=0;i<2;i++) 
     if strcmp(t,predirect[k])==0: 
      print "\n preprocessor directive "+str(t) 
      return 
    for k in range(0,6): #=0;i<6;i++) 
     if strcmp(t,header[k])==0: 
      print "\n header file "+str(t) 
      return 
    for k in range(0,21): #=0;i<21;i++) 
     if strcmp(key[k],t)==0: 
      print "\n keyword "+str(key[k]) 
      return 
     print "\n identifier \t%s"+str(t) 

def skipcomment(): 
    ch=word_list[i+1] 
    i+=1 
    if ch=='/': 
     while word_list1[i]!='\0': 
      i+=1#ch=getc(fp))!='\0': 
    elif ch=='*': 
     while f==0: 
      ch=word_list1[i] 
      i+=1 
     if c=='/': 
      f=1 
    f=0 




a=raw_input("Enter the file name:") 
s=open(a,"r") 
str1=s.read() 
word_list1=str1.split() 




i=0 
#print word_list1[i] 
for word in word_list1 : 
    print word_list1[i] 
    if word_list1[i]=="/": 
     print word_list1[i] 
    elif word_list1[i]==" ": 
     print word_list1[i] 
    elif word_list1[i].isalpha(): 
     if numflag!=1: 
      token[j]=word_list1[i] 
      j+=1 
     if numflag==1: 
      token[j]='\0' 
      check(token) 
      numflag=0 
      j=0 
      f=0 
     if f==0: 
      f=1 
    elif word_list1[i].isalnum(): 
     if numflag==0: 
      numflag=1 
      token[j]=word_list1[i] 
      j+=1 
     else: 
      if isdelim(word_list1[i]): 
       if numflag==1: 
        token[j]='\0' 
        check(token) 
        numflag=0 
       if f==1: 
        token[j]='\0' 
        numflag=0 
        check(token) 
       j=0 
       f=0 
       print "\n delimiters : "+word_list1[i] 
    elif isop(word_list1[i]): 
     if numflag==1: 
      token[j]='\0' 
      check(token) 
      numflag=0 
      j=0 
      f=0 
     if f==1: 
      token[j]='\0' 
      j=0 
      f=0 
      numflag=0 
      check(token)  
     if fop==1: 
      fop=0 
      print "\n operator \t"+str(word_list1[i])+str(sop) 
     else: 
      print "\n operator \t"+str(c) 
    elif word_list1[i]=='.': 
     token[j]=word_list1[i] 
     j+=1 
    i+=1

Source

2010-10-22 Aneeshia

Wow. That is a lot of work to reinvent the wheel. Why not download 'ply' and start from an existing C language parser? Why do it this way? – 2010-10-22 10:51:07

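I don't understand why you are doing this. You got a lot of good advice on your previous question (which I assume is your motivation), http://stackoverflow.com/questions/3976665/parser-generation, including references to complete C parsers written in Python. – 2010-10-22 19:51:58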

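Your code is not good. Try splitting it into smaller functions that you can test individually. Have you tried debugging the program? Once you find the place that causes the problem, you can come back here and ask a more specific question.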
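As a rough illustration of what "smaller functions that you can test individually" might look like, here is a minimal sketch written for Python 3. The function names (`is_keyword`, `is_number`, `classify`) and the shortened keyword set are assumptions made for this example, not part of the original program.

```python
# Hypothetical helpers; the names and the reduced keyword set are illustrative only.
KEYWORDS = {"int", "float", "char", "void", "for", "if", "else", "return", "while"}

def is_keyword(word):
    """Return True if word is one of the C keywords listed above."""
    return word in KEYWORDS

def is_number(word):
    """Return True for a plain decimal integer such as '42'."""
    return word.isdigit()

def classify(word):
    """Map a single word to a token category name."""
    if is_keyword(word):
        return "keyword"
    if is_number(word):
        return "number"
    if word.isidentifier():
        return "identifier"
    return "unknown"

# Each function can be checked on its own before the pieces are combined.
assert is_keyword("while")
assert not is_keyword("count")
assert classify("42") == "number"
assert classify("count") == "identifier"
```

Because each helper takes its input as an argument and returns a value, a failing case points at one small function instead of the whole script.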

Some more hints. You can implement isdelim much more simply, like this:

def isdelim(c): 
    return c in delim

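To compare strings for equality, use string1 == string2. strcmp does not exist in Python. I don't know whether you are aware that Python is usually interpreted rather than compiled. This means that if you call a function that does not exist, you will not get a compiler error. The program will only complain at runtime.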
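A small sketch of both points, written for Python 3: strings are compared with `==`, and a call to an undefined name such as `strcmp` is only reported when that line actually runs. The helper names `is_directive` and `broken_check` are assumptions made for this illustration.

```python
PREDIRECT = ["include", "define"]

def is_directive(word):
    """Compare strings with == instead of C's strcmp."""
    for directive in PREDIRECT:
        if word == directive:
            return True
    return False

def broken_check(word):
    # strcmp is not defined anywhere; Python only notices when this line executes.
    return strcmp(word, "include") == 0

print(is_directive("include"))

try:
    broken_check("include")
except NameError as err:
    # Defining broken_check succeeded; the failure appears only at call time.
    print("runtime error:", err)
```

This may also explain why the original script appeared to "compile" cleanly: the `check` function that calls `strcmp` is only examined when it is executed.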

In your function isop you have unreachable code. The two lines j += 1 and k += 1 can never be reached because they come right after a return statement.
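One possible way to restructure `isop` so that nothing follows a `return`: the sketch below (Python 3) takes the text and a position as arguments instead of relying on the globals `word_list1` and `i`, and returns the matched operator. The name `match_operator` and its return convention are assumptions for this example; the operator characters are copied from `oper` in the question, and treating any two adjacent operator characters as one token mirrors the question's look-ahead rather than real C rules.

```python
OPER = ['+', '-', '*', '/', '%', '=', '!']

def match_operator(text, pos):
    """Return the operator starting at text[pos], or None if there is none.

    A second operator character directly after the first is treated as part
    of the same token, mirroring the look-ahead in the question's isop.
    """
    if pos >= len(text) or text[pos] not in OPER:
        return None
    if pos + 1 < len(text) and text[pos + 1] in OPER:
        return text[pos:pos + 2]
    return text[pos]

print(match_operator("a+=1", 1))  # two-character operator
print(match_operator("a+1", 1))   # single-character operator
print(match_operator("a+1", 0))   # not an operator
```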

In Python, iterating over a collection looks like this:

for item in collection: 
    # do stuff with item
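Applied to the lists in the question, the same pattern might look like the following sketch (Python 3). The sample `words` list is made up for the illustration, and `key` is shortened from the question.

```python
key = ["int", "float", "char", "return", "while"]  # shortened from the question
words = ["int", "total", "return", "x"]            # made-up sample input

# Iterate over the items directly; no index variable or range() is needed.
for word in words:
    if word in key:
        print("keyword", word)
    else:
        print("identifier", word)

# When the position is needed as well, enumerate() supplies it.
for position, word in enumerate(words):
    print(position, word)
```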

These are just a few hints. You really should read the Python Tutorial.

Source

2010-10-22 09:25:07

I'm new to Python.. anyway, thanks. – Aneeshia 2010-10-22 09:37:56

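@Aneeshia: "I'm new to Python". That means you must read the Python tutorial first. Then, after reading the tutorial, you should Google "Python lexical scanning" and read the code you find there. Starting with a piece of code this large is a terrible idea. The tutorial is a good idea. – 2010-10-22 12:51:36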

def isdelim(c): 
    if c in delim: 
     return 1 
    return 0

You should learn more about the basics of Python. At the moment, your code contains too many ifs and fors.

Try to learn it the hard way.

Source

2010-10-22 09:27:21

It seems to produce quite a lot of output for me, but the code is hard to follow. I ran it against itself and it failed like this:

Traceback (most recent call last): 
    File "C:\dev\snippets\lexical.py", line 92, in <module> 
    token[j]=word_list1[i] 
IndexError: list assignment index out of range
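The message is consistent with `token=[0]*50` in the question: that list has exactly 50 slots, and assigning to index 50 or beyond raises `IndexError` instead of growing the list. A minimal reproduction in Python 3, independent of the original program:

```python
token = [0] * 50   # fixed-size list, as in the question
j = 0

try:
    # Keep writing without ever resetting j, which is what happens when
    # more than 50 items are stored before j goes back to 0.
    for ch in "x" * 51:
        token[j] = ch
        j += 1
except IndexError as err:
    print("failed at j =", j, "->", err)
```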

Honestly, this is pretty bad code. You should give the functions better names, and avoid magic numbers like this:

for k in range(0,14)

I mean, you have already built the list, so you can use it to get the range.

for k in range(len(delim))

makes more sense.

But you only want to determine whether c is in the list delim, so just say:

if c in delim

Why return 1 and 0? What do they mean? Why not use True and False?
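A sketch of `isdelim` along those lines (Python 3). The name `is_delim` is an assumption for this example; the list is copied from the question. Note that it holds 13 entries, so a hard-coded `range(0,14)` would run one index past its end, which is one more reason to avoid the magic number.

```python
delim = ['\t', '\n', ',', ';', '(', ')', '{', '}', '[', ']', '#', '<', '>']

def is_delim(c):
    """Return True if c is a delimiter character, False otherwise."""
    return c in delim

# The boolean result reads naturally in a condition.
for c in "f(x);":
    if is_delim(c):
        print("delimiter:", repr(c))
```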

There are probably several other glaring problems, such as the entire "main" section of the code.

This is not very Pythonic:

token=[0]*50

Didn't you really just mean to say this?

token = []

Now it is just an empty list.

Then, instead of trying to use a counter like this:

token[j]=word_list1[i]

you want to append, like this:

token.append(word_list1[i])
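To show how an empty list plus `append` can replace both the fixed-size buffer and the counter `j`, here is a minimal sketch in Python 3. The function `split_words` and its rule (letters, digits and underscores form a word, everything else is dropped) are assumptions for this example and are much simpler than a real C lexer.

```python
def split_words(text):
    """Split text into runs of letters/digits/underscores, dropping everything else."""
    tokens = []
    token = []                      # characters of the token being built
    for ch in text:
        if ch.isalnum() or ch == "_":
            token.append(ch)        # grows as needed; no index counter
        elif token:
            tokens.append("".join(token))
            token = []              # start a fresh token
    if token:                       # flush the last token at end of input
        tokens.append("".join(token))
    return tokens

print(split_words("int total = count1 + 42;"))
# prints ['int', 'total', 'count1', '42']
```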

Honestly, I think you have started with a problem that is too hard.

Source

2010-10-22 09:36:03 jgritty
