Java SAX parser progress monitoring

9

Use javax.swing.ProgressMonitorInputStream.
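A minimal sketch of how that could look, assuming the XML comes from a plain InputStream such as a FileInputStream. The class name MonitoredParse, the method name parse, and the message text are made up for illustration; the Swing and SAX calls are the standard ones.

```java
import java.awt.Component;
import java.io.IOException;
import java.io.InputStream;
import javax.swing.ProgressMonitorInputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
class MonitoredParse {
    /**
     * Parses raw through a ProgressMonitorInputStream. The monitor takes its
     * maximum from raw.available() at construction time, which a
     * FileInputStream normally reports as the bytes remaining in the file.
     * If the user cancels the dialog, the next read throws an
     * InterruptedIOException (a subclass of IOException).
     */
    static void parse(Component parent, InputStream raw, DefaultHandler handler)
            throws IOException, ParserConfigurationException, SAXException {
        try (InputStream in = new ProgressMonitorInputStream(parent, "Parsing XML", raw)) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
    }
}
```

Call it from a worker thread rather than the Swing event thread, for example `MonitoredParse.parse(frame, new FileInputStream(file), handler)`, otherwise the dialog cannot repaint while the parser is busy.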

Source

2010-06-23 10:48:21 EJP

+0

I think this will be close enough. Thanks! – Danijel 2010-06-24 10:38:02

+0

Could any answer be simpler than this?! :) – Matthieu 2013-07-17 07:12:39

1

Assuming you know how many articles you have, can't you just keep a counter in the handler? E.g.

public void startElement(String uri, String localName,
        String qName, Attributes attributes)
        throws SAXException {
    // Constant first, so a null qName cannot cause a NullPointerException
    if ("article".equals(qName)) {
        counter++;
    }
    // ...
}

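(I don't know whether you are actually parsing "article" elements, it's just an example.)

If you don't know the number of articles in advance, you will need to count them first. Then you can print the status as tags read / total number of tags, say every 100 tags (counter % 100 == 0).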

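Or even have another thread monitor the progress. In that case you might want to synchronize access to the counter, but that isn't strictly necessary, since it doesn't need to be very accurate.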
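A rough sketch of the two-pass idea, under the assumption that the elements of interest are called "article" and that parsing the input twice is acceptable. The class ArticleCounter and its parse helper are invented for this example. An AtomicLong is used so that a second thread could read the count safely.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: counts <article> elements and reports every 100.
class ArticleCounter extends DefaultHandler {
    // AtomicLong so that a monitoring thread can read the count safely.
    final AtomicLong counter = new AtomicLong();
    private final long total; // 0 means "unknown": count only, print nothing

    ArticleCounter(long total) {
        this.total = total;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("article".equals(qName)) {
            long n = counter.incrementAndGet();
            if (total > 0 && n % 100 == 0) {
                System.out.println(n + "/" + total + " articles read");
            }
        }
    }

    /** Runs one parse over the document and returns the number of articles seen. */
    static long parse(byte[] xml, long total)
            throws IOException, ParserConfigurationException, SAXException {
        ArticleCounter handler = new ArticleCounter(total);
        SAXParserFactory.newInstance().newSAXParser().parse(new ByteArrayInputStream(xml), handler);
        return handler.counter.get();
    }

    public static void main(String[] args) throws Exception {
        StringBuilder sb = new StringBuilder("<articles>");
        for (int i = 0; i < 250; i++) sb.append("<article/>");
        byte[] xml = sb.append("</articles>").toString().getBytes(StandardCharsets.UTF_8);

        long total = parse(xml, 0); // first pass: count only
        parse(xml, total);          // second pass: reports at 100 and 200 articles
    }
}
```

With a real file you would open it twice instead of reusing a byte array, which is the cost of this approach: the first pass has to read the whole document.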

My 2 cents

Source

2010-06-23 08:37:02 ewernli

+0

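I had figured that out, but I'm looking for a way to do it without having to count the articles first. I thought there might be a way to find out the parser's position in the file, since I can easily get the file size. – Danijel 2010-06-23 09:15:41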

2

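You can get an estimate of the current line/column in your file by overriding the setDocumentLocator method of org.xml.sax.helpers.DefaultHandler/BaseHandler. The parser calls this method with an object from which you can obtain an approximation of the current line/column whenever you need it.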

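Edit: As far as I know, there is no standard way to get the absolute position. However, I believe some SAX implementations do provide this kind of information.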
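As an illustration, here is a small sketch of a handler that keeps the Locator and reads the line and column on each start tag. The class name LineTrackingHandler and the field lastLine are invented for the example. A parser is not required to supply a locator, so the null check matters.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that remembers where the parser currently is.
class LineTrackingHandler extends DefaultHandler {
    private Locator locator; // stays null if the parser supplies none
    int lastLine = -1;       // -1 means "not available"

    @Override
    public void setDocumentLocator(Locator locator) {
        // Called by the parser before startDocument, if it supports locators.
        this.locator = locator;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (locator != null) {
            lastLine = locator.getLineNumber();
            System.out.println(qName + " at line " + lastLine
                    + ", column " + locator.getColumnNumber());
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<articles>\n<article/>\n<article/>\n</articles>"
                .getBytes(StandardCharsets.UTF_8);
        LineTrackingHandler handler = new LineTrackingHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
    }
}
```

To turn the line number into a percentage you would still need the total number of lines, so this only helps when that is known or cheap to obtain.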

Source

2010-06-23 08:54:44

+0

Close, but then I'd have to know the number of lines in the file, right? – Danijel 2010-06-23 09:17:10

+0

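Indeed. Another idea might be the one pointed out by the enigmatic EJP: you can estimate the progress from how far the input stream has advanced. That is not quite the progress of the parsing itself, though, because there may be buffering and lookahead. – 2010-06-23 12:20:10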

0

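I would use the position in the input stream. Write your own trivial stream class that delegates to (or inherits from) the "real" one and keeps track of the bytes read. As you say, getting the total file size is easy. I wouldn't worry about buffering, lookahead and so on: for large files like these it's chickenfeed. On the other hand, I would cap the position at "99%".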
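Something along these lines, as a minimal sketch. The class name CountingInputStream and the percent method are invented here, and the total is assumed to come from File.length() or a similar source. Mark/reset is simply switched off to keep the counter honest.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical byte-counting wrapper around the "real" stream.
class CountingInputStream extends FilterInputStream {
    private final long total;        // expected size in bytes, e.g. from File.length()
    private volatile long bytesRead; // written by the reading thread only

    CountingInputStream(InputStream in, long total) {
        super(in);
        this.total = total;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) bytesRead++;
        return b;
    }

    // FilterInputStream.read(byte[]) delegates to this overload,
    // so overriding it covers both array variants.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) bytesRead += n;
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) bytesRead += skipped;
        return skipped;
    }

    // Rewinding is not tracked in this sketch, so advertise it as unsupported.
    @Override
    public boolean markSupported() {
        return false;
    }

    /** Percentage of the stream consumed so far, capped at 99. */
    int percent() {
        if (total <= 0) return 0;
        return (int) Math.min(99, bytesRead * 100 / total);
    }
}
```

The cap means the display never claims to be finished while the parser is still working through its buffered data. Show 100% yourself once parse() has returned.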

Source

2011-07-01 17:48:20

10

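Following EJP's suggestion of ProgressMonitorInputStream, in the end I extended FilterInputStream so that a ChangeListener can be used to monitor the current read position in bytes.

This gives you finer control, for example to show several progress bars while reading large XML files in parallel. Which is exactly what I did.

So, a simplified version of the monitored stream: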

import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.CopyOnWriteArrayList;
import javax.swing.event.ChangeEvent;
import javax.swing.event.ChangeListener;

/**
 * A class that monitors the read progress of an input stream.
 *
 * @author Hermia Yeung "Sheepy"
 * @since 2012-04-05 18:42
 */
public class MonitoredInputStream extends FilterInputStream {
    private volatile long mark = 0;
    private volatile long lastTriggeredLocation = 0;
    private volatile long location = 0;
    private final int threshold;
    // Copy-on-write, so listeners can be added or removed while an event is being dispatched.
    private final CopyOnWriteArrayList<ChangeListener> listeners = new CopyOnWriteArrayList<>();

    /**
     * Creates a MonitoredInputStream over an underlying input stream.
     * @param in Underlying input stream, should be non-null because of no public setter
     * @param threshold Min. position change (in bytes) to trigger change event.
     */
    public MonitoredInputStream(InputStream in, int threshold) {
        super(in);
        this.threshold = threshold;
    }

    /**
     * Creates a MonitoredInputStream over an underlying input stream.
     * Default threshold is 16KB; a small threshold may hurt performance on larger streams.
     * @param in Underlying input stream, should be non-null because of no public setter
     */
    public MonitoredInputStream(InputStream in) {
        this(in, 1024 * 16);
    }

    public void addChangeListener(ChangeListener l) { listeners.addIfAbsent(l); }
    public void removeChangeListener(ChangeListener l) { listeners.remove(l); }
    public long getProgress() { return location; }

    /** Notifies listeners once the position has moved at least threshold bytes since the last event. */
    protected void triggerChanged(final long location) {
        if (threshold > 0 && Math.abs(location - lastTriggeredLocation) < threshold) return;
        lastTriggeredLocation = location;
        if (listeners.isEmpty()) return;
        final ChangeEvent evt = new ChangeEvent(this);
        for (ChangeListener l : listeners) l.stateChanged(evt);
    }

    @Override public int read() throws IOException {
        final int i = super.read();
        if (i != -1) triggerChanged(++location); // report the position after this byte
        return i;
    }

    @Override public int read(byte[] b, int off, int len) throws IOException {
        final int i = super.read(b, off, len);
        if (i > 0) triggerChanged(location += i);
        return i;
    }

    @Override public long skip(long n) throws IOException {
        final long i = super.skip(n);
        if (i > 0) triggerChanged(location += i);
        return i;
    }

    @Override public void mark(int readlimit) {
        super.mark(readlimit);
        mark = location;
    }

    @Override public void reset() throws IOException {
        super.reset();
        if (location != mark) triggerChanged(location = mark);
    }
}

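It doesn't know, or care, how big the underlying stream is, so you need to get that some other way, for example from the file itself.

So, here is a simplified sample usage: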

try (
    MonitoredInputStream mis = new MonitoredInputStream(new FileInputStream(file), 65536*4)
) {

    // Setup max progress and listener to monitor read progress.
    // progressBar is not declared in this snippet; on a javax.swing.JProgressBar
    // the corresponding setters are setMaximum(int) and setValue(int).
    progressBar.setMaxProgress((int) file.length()); // Swing thread or before display please
    mis.addChangeListener(new ChangeListener() { @Override public void stateChanged(ChangeEvent e) {
        SwingUtilities.invokeLater(new Runnable() { @Override public void run() {
            progressBar.setProgress((int) mis.getProgress()); // Promise me you WILL use MVC instead of this anonymous class mess!
        }});
    }});
    // Start parsing. Listener would call Swing event thread to do the update.
    SAXParserFactory.newInstance().newSAXParser().parse(mis, this);

} catch (IOException | ParserConfigurationException | SAXException e) {

    e.printStackTrace();

} finally {

    progressBar.setVisible(false); // Again please call this in swing event thread

}

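In my case the progress bars advance nicely from left to right without abnormal jumps. Adjust the threshold for the best balance between performance and responsiveness. Too small, and reading time may double on small devices; too large, and the progress will not look smooth.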
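One possible way to pick the threshold, purely as an illustrative heuristic that is not part of the answer above: aim for a fixed number of updates over the whole file, with a lower bound so that small files do not fire an event on every read. The class ThresholdChooser, the method thresholdFor, and the numbers in main are all assumptions.

```java
// Hypothetical helper for choosing a MonitoredInputStream threshold.
class ThresholdChooser {
    /**
     * Returns a byte threshold that gives roughly `updates` change events
     * over a stream of `totalBytes`, but never less than `minBytes`.
     */
    static int thresholdFor(long totalBytes, int updates, int minBytes) {
        long perUpdate = totalBytes / Math.max(1, updates);
        // Clamp to the int range expected by the constructor.
        return (int) Math.max(minBytes, Math.min(Integer.MAX_VALUE, perUpdate));
    }

    public static void main(String[] args) {
        long fileSize = 100L * 1024 * 1024; // a 100 MB file, for illustration
        System.out.println(thresholdFor(fileSize, 200, 16 * 1024)); // 524288
        System.out.println(thresholdFor(1000, 200, 16 * 1024));     // 16384
    }
}
```

The result would be passed as the second constructor argument, for example `new MonitoredInputStream(new FileInputStream(file), ThresholdChooser.thresholdFor(file.length(), 200, 16 * 1024))`.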

Hope it helps. If you find mistakes or typos, feel free to edit, or vote up to give me some encouragement! :D

Source

2012-04-09 09:30:57 Sheepy

+0

Very nice! Exactly what I was looking for. I'll adapt it, thanks! :) – Matthieu 2013-07-17 07:15:25
