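Counting unique occurrences of strings in a document
I am reading a log file into Java. For each line in the log file, I check whether the line contains an IP address. If it does, I want to add 1 to the count of how many times that IP address appears in the log file. How can I do this in Java?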
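The code below successfully extracts the IP address from each line that contains one, but the part that counts how often each IP address occurs does not work.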
void read(String fileName) throws IOException {
    BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(fileName)));
    int counter = 0;
    ArrayList<IPHolder> ips = new ArrayList<IPHolder>();
    try {
        String line;
        while ((line = br.readLine()) != null) {
            if (!getIP(line).equals("0.0.0.0")) {
                if (ips.size() == 0) {
                    IPHolder newIP = new IPHolder();
                    newIP.setIp(getIP(line));
                    newIP.setCount(0);
                    ips.add(newIP);
                }
                for (int j = 0; j < ips.size(); j++) {
                    if (ips.get(j).getIp().equals(getIP(line))) {
                        ips.get(j).setCount(ips.get(j).getCount() + 1);
                    } else {
                        IPHolder newIP = new IPHolder();
                        newIP.setIp(getIP(line));
                        newIP.setCount(0);
                        ips.add(newIP);
                    }
                }
                if (counter % 1000 == 0) { System.out.println(counter + ", " + ips.size()); }
                counter += 1;
            }
        }
    } finally { br.close(); }
    for (int k = 0; k < ips.size(); k++) {
        System.out.println("ip, count: " + ips.get(k).getIp() + " , " + ips.get(k).getCount());
    }
}
public String getIP(String ipString) { // extracts an ip from a string if the string contains an ip
    String IPADDRESS_PATTERN =
        "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)";
    Pattern pattern = Pattern.compile(IPADDRESS_PATTERN);
    Matcher matcher = pattern.matcher(ipString);
    if (matcher.find()) {
        return matcher.group();
    } else {
        return "0.0.0.0";
    }
}
The holder class is:
public class IPHolder {
    private String ip;
    private int count;
    public String getIp() { return ip; }
    public void setIp(String i) { ip = i; }
    public int getCount() { return count; }
    public void setCount(int ct) { count = ct; }
}
A ['Map'](https://docs.oracle.com/javase/7/docs/api/java/util/Map.html) is probably what you need (key = ip, value = count). Guava's ['Multiset'](https://code.google.com/p/guava-libraries/wiki/NewCollectionTypesExplained#Multiset) is a fancier alternative – 2014-12-05 22:04:11
@RC. How would the code look with a Map instead? – CodeMed 2014-12-05 22:05:11
The “Multiset” link has a Map example – 2014-12-05 22:06:06
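For what it's worth, here is a minimal sketch of the Map idea from the comments (key = ip, value = count). The loop in the question runs its `else` branch once for every existing entry that does not match, so it keeps appending duplicate holders; a map lookup avoids scanning the list at all. The class name `IpCounter` and the method `countIps` are made up for illustration, the regex is copied from the question, and `Map.merge` assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of the original question.
public class IpCounter {
    // Same IPv4 pattern as in the question, compiled once instead of once per line.
    private static final Pattern IP_PATTERN = Pattern.compile(
        "(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)");

    // Counts the first IP found on each line; lines without an IP are skipped.
    static Map<String, Integer> countIps(Iterable<String> lines) {
        // LinkedHashMap keeps the IPs in first-seen order when printing.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher matcher = IP_PATTERN.matcher(line);
            if (matcher.find()) {
                // Insert 1 for a new IP, otherwise add 1 to the stored count.
                counts.merge(matcher.group(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample lines standing in for the log file.
        Map<String, Integer> counts = countIps(Arrays.asList(
            "GET /index.html from 10.0.0.1",
            "no address on this line",
            "GET /about.html from 192.168.1.5",
            "POST /login from 10.0.0.1"));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println("ip, count: " + entry.getKey() + " , " + entry.getValue());
        }
    }
}
```

To use this with the `read` method from the question, the body of the `while` loop could call `counts.merge(...)` on each line returned by `br.readLine()` instead of walking the `ArrayList`, and the `IPHolder` class would no longer be needed.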