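Parsing IP addresses from a file in Python

I have a file containing several IP addresses: about 900 IPs spread across 4 lines of a txt file. I would like the output to have 1 IP per line. How can I accomplish this? Based on other code, I came up with the script below, but it fails because there are multiple IPs on a single line: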

import sys
import re

try:
    if sys.argv[1:]:
        print "File: %s" % (sys.argv[1])
        logfile = sys.argv[1]
    else:
        logfile = raw_input("Please enter a log file to parse, e.g /var/log/secure: ")
    try:
        file = open(logfile, "r")
        ips = []
        for text in file.readlines():
            text = text.rstrip()
            regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})$',text)
            if regex is not None and regex not in ips:
                ips.append(regex)

        for ip in ips:
            outfile = open("/tmp/list.txt", "a")
            addy = "".join(ip)
            if addy is not '':
                print "IP: %s" % (addy)
                outfile.write(addy)
                outfile.write("\n")
    finally:
        file.close()
        outfile.close()
except IOError, (errno, strerror):
    print "I/O Error(%s) : %s" % (errno, strerror)

Source

2012-12-24 Mark Hill

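You are looking for the canonical form of IPv4 addresses. Note that even for IPv4 there are other acceptable forms. For example, try http://2130706433/ if you are running an HTTP server on port 80 of localhost (2130706433 == 0x7f000001 == 127.0.0.1). Of course, if you control the format of the file, you don't need to worry about these things... but if you can practically support IPv6, it will future-proof your script.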
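
To make the equivalence in the comment above concrete, here is a minimal Python 3 sketch using the standard `ipaddress` module. The helper names `int_to_dotted` and `dotted_to_int` are made up for illustration and do not appear in the thread.

```python
import ipaddress


def int_to_dotted(value):
    """Convert a 32-bit integer to dotted-quad IPv4 notation."""
    return str(ipaddress.IPv4Address(value))


def dotted_to_int(text):
    """Convert a dotted-quad IPv4 string to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(text))


print(int_to_dotted(2130706433))         # 127.0.0.1
print(hex(dotted_to_int("127.0.0.1")))   # 0x7f000001
```

A dotted-quad regex will not find the integer form, which is why the comment distinguishes the canonical form from other accepted spellings.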

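`re.findall()` always returns a list. It is never `None`. – jfs

The $ anchor in your expression prevents you from finding anything but the last entry on each line. Remove it, then use the list returned by .findall():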

found = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
if found:
    ips.extend(found)

Source

2012-12-24 23:52:36

The findall function returns a list of matches, and you are not iterating over each match.

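# No trailing $ here, so every address on the line is matched, not just the last one.
regex = re.findall(r'(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})\.(?:[\d]{1,3})', text)
for match in regex:  # findall() returns a list (possibly empty), never None
    if match not in ips:
        ips.append(match)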
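
As a rough illustration of why iterating matters, the Python 3 sketch below contrasts appending the whole list returned by `findall` with appending each match. The sample line and the variable names are assumptions chosen for the demonstration.

```python
import re

PATTERN = r"\d{1,3}(?:\.\d{1,3}){3}"
line = "10.0.0.1 10.0.0.2 10.0.0.1"  # hypothetical input line

matches = re.findall(PATTERN, line)

# Appending the list itself nests it; joining it later glues addresses together.
nested = []
nested.append(matches)
glued = "".join(nested[0])

# Iterating over the matches keeps one address per entry and allows de-duplication.
flat = []
for match in matches:
    if match not in flat:
        flat.append(match)

print(glued)  # 10.0.0.110.0.0.210.0.0.1
print(flat)   # ['10.0.0.1', '10.0.0.2']
```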

Source

2012-12-24 23:44:23 Walk

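Without the re.MULTILINE flag, $ matches only at the end of the string (or just before a newline that ends the string).

To make debugging easier, split the code into several parts that can be tested independently.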
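
A small Python 3 sketch of that behaviour follows; the two-line sample text is made up for illustration.

```python
import re

text = "1.1.1.1 2.2.2.2\n3.3.3.3 4.4.4.4"  # hypothetical two-line input
anchored = r"\d{1,3}(?:\.\d{1,3}){3}$"
unanchored = r"\d{1,3}(?:\.\d{1,3}){3}"

# Without re.MULTILINE, $ only matches at the very end of the string.
end_only = re.findall(anchored, text)

# With re.MULTILINE, $ also matches before each newline: last address per line.
per_line = re.findall(anchored, text, re.MULTILINE)

# Dropping the anchor finds every address.
everything = re.findall(unanchored, text)

print(end_only)    # ['4.4.4.4']
print(per_line)    # ['2.2.2.2', '4.4.4.4']
print(everything)  # ['1.1.1.1', '2.2.2.2', '3.3.3.3', '4.4.4.4']
```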

import re

def extract_ips(data):
    return re.findall(r"\d{1,3}(?:\.\d{1,3}){3}", data)

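The regex filters out some valid IPs, e.g. 2130706433 or "1::1".
Conversely, the regex matches invalid strings, e.g. 999.999.999.999. You can validate an IP string using socket.inet_aton() or the more general socket.inet_pton(). You could even split the input into pieces without searching for IPs and use these functions to keep only the valid ones.

If the input file is small and you don't need to preserve the original order of the IPs: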
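
One possible way to apply that suggestion in Python 3 is sketched below. It uses `socket.inet_pton()`, which accepts only the canonical text forms, and assumes the input is whitespace-separated; the helper names `is_valid_ip` and `keep_valid_ips` are hypothetical.

```python
import socket


def is_valid_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address string."""
    for family in (socket.AF_INET, socket.AF_INET6):
        try:
            socket.inet_pton(family, text)
            return True
        except (OSError, ValueError):
            continue  # Not valid for this family; try the next one.
    return False


def keep_valid_ips(data):
    """Split on whitespace and keep only the tokens that validate."""
    return [token for token in data.split() if is_valid_ip(token)]


print(keep_valid_ips("10.0.0.1 foo 999.999.999.999 1::1"))
```

Splitting on whitespace is only an assumption about the file layout; a file that separates addresses with commas or embeds them in log text would still need the regex (or a different split) before validation.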

with open(filename) as infile, open(outfilename, "w") as outfile: 
    outfile.write("\n".join(set(extract_ips(infile.read()))))

Otherwise:

with open(filename) as infile, open(outfilename, "w") as outfile:
    seen = set()
    for line in infile:
        for ip in extract_ips(line):
            if ip not in seen:
                seen.add(ip)
                outfile.write(ip + "\n")

Source

2012-12-25 01:08:47 jfs

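Extracting IP addresses from a file

I answered a similar question in this discussion. In short, this is based on one of my ongoing projects for extracting network and host-based indicators from different types of input data (e.g. strings, files, blog posts, etc.): https://github.com/JohnnyWachter/intel

I would import the IPAddresses and Data classes, then use them to accomplish your task in the following way: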

#!/usr/bin/env python

"""Extract IPv4 Addresses From Input File."""

from Data import CleanData  # Format and clean the input data.
from IPAddresses import ExtractIPs  # Extract IPs from input data.


def get_ip_addresses(input_file_path):
    """
    Read contents of input file and extract IPv4 Addresses.
    :param input_file_path: fully qualified path to input file. Expecting str
    :returns: dictionary of IPv4 and IPv4-like Address lists
    :rtype: dict
    """

    input_data = []  # Empty list to house formatted input data.

    input_data.extend(CleanData(input_file_path).to_list())

    results = ExtractIPs(input_data).get_ipv4_results()

    return results

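Now that you have a dictionary of lists, you can easily access the data you want and output it in any way you like. The example below uses the function above; it prints the results to the console and writes them to a specified output file: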

# Extract the desired data using the aforementioned function.
ipv4_list = get_ip_addresses('/path/to/input/file')

# Open your output file in 'append' mode.
with open('/path/to/output/file', 'a') as outfile:

    # Ensure that the list of valid IPv4 Addresses is not empty.
    if ipv4_list['valid_ips']:

        for ip_address in ipv4_list['valid_ips']:

            # Print to console
            print(ip_address)

            # Write to output file, one address per line.
            outfile.write(ip_address + '\n')

Source

2014-06-09 22:41:22 Johnny
