2017-01-12 69 views
0

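I have a working program, but it is a Frankenstein: pieces of other programs stitched together, some of them probably redundant. Here is what I am trying to do: find a string in a binary file and, from that position to EOF, dump the contents into a string. How can I optimize my code? - vb.net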

Here is my code:

Imports System.IO

Public Class Form1
    Dim b() As Byte = IO.File.ReadAllBytes("C:\data.bin")
    Dim encodme As New System.Text.ASCIIEncoding
    Dim SearchString As String = "xyzzy"
    Dim bSearch As Byte() = encodme.GetBytes(SearchString)
    Dim bFound As Boolean = True
    Dim oneByte As Byte
    Dim fileData As New IO.FileStream("C:\data.bin", FileMode.Open, FileAccess.Read)
    Dim strMessage As String

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        For i As Integer = 0 To b.Length - bSearch.Length - 1
            If b(i) = bSearch(0) Then
                bFound = True
                For j As Integer = 0 To bSearch.Length - 1
                    If b(i + j) <> bSearch(j) Then
                        bFound = False
                        Exit For
                    End If
                Next
                If bFound Then
                    fileData.Seek(i + 5, SeekOrigin.Begin)
                    strMessage = ""
                    For r As Integer = (i + 5) To fileData.Length() - 1
                        oneByte = fileData.ReadByte()
                        strMessage = strMessage + Chr(oneByte)
                    Next r
                    MsgBox(strMessage)
                Else
                    MsgBox("File Doesn't have string")
                    Exit Sub
                End If
            End If
        Next
    End Sub
End Class
+5

For this kind of question you would be better off visiting [CodeReview](http://codereview.stackexchange.com/).

+0

Study up at sourcemaking.com.

Answer

-1

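When looking for performance, it is best to avoid walking through this kind of thing byte by byte. Instead, you should use the facilities that .NET puts at your disposal. This example uses a regular expression to find all matches of a string in any file. Each result is a UTF-8 string containing the match and everything that follows it, up to the next match or the end of the file: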

Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions

Public Class Form1
    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        Dim matches = FindStringMatchesInFile("C:\Infinite Air\snowboarding.exe", "data")

        For Each m In matches
            ...
        Next
    End Sub

    ' Returns every occurrence of searchString together with the text that
    ' follows it, up to the next occurrence or the end of the file.
    Private Function FindStringMatchesInFile(filename As String,
                                             searchString As String) As List(Of String)
        Dim output As New List(Of String)

        ' Escape the search string so regex metacharacters in it are matched literally.
        Dim literal = Regex.Escape(searchString)
        Dim re = New Regex(String.Format("{0}(?:(?!{0}).)*", literal),
                           RegexOptions.Singleline Or RegexOptions.IgnoreCase,
                           Regex.InfiniteMatchTimeout)

        ' Using disposes the reader (and closes the file) when the block ends.
        Using reader As New StreamReader(filename, Encoding.UTF8)
            For Each m As Match In re.Matches(reader.ReadToEnd())
                output.Add(m.Value)
            Next
        End Using

        Return output
    End Function
End Class

The regex pattern is defined as follows:

{searchString} matches the characters {searchString} literally (case insensitive)
Non-capturing group (?:(?!{searchString}).)*
    * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
    Negative Lookahead (?!{searchString})
        Assert that the Regex below does not match
        {searchString} matches the characters {searchString} literally (case insensitive)
    . matches any character

Global pattern flags
    g modifier: global. All matches (don't return after first match)
    s modifier: single line. Dot matches newline characters
    i modifier: case insensitive.
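To see how the pattern behaves on a small input, here is a minimal sketch. The sample text and the module name `RegexPatternDemo` are made up for illustration. The inline `(?si)` prefix is an assumption about how one might write the same flags inside the pattern; it should be equivalent to passing `RegexOptions.Singleline Or RegexOptions.IgnoreCase`.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexPatternDemo
    Sub Main()
        ' Hypothetical sample input: the marker appears twice, on different lines.
        Dim text As String = "xx DATA one" & vbLf & "two data three"
        Dim literal As String = Regex.Escape("data")

        ' (?si) turns on Singleline and IgnoreCase inline instead of via RegexOptions.
        Dim pattern As String = "(?si)" & literal & "(?:(?!" & literal & ").)*"

        For Each m As Match In Regex.Matches(text, pattern)
            ' Show line feeds as a visible \n so each match prints on one line.
            Console.WriteLine("[{0}]", m.Value.Replace(vbLf, "\n"))
        Next
        ' Prints:
        ' [DATA one\ntwo ]
        ' [data three]
    End Sub
End Module
```

The first match runs across the line break because of the `s` flag, and it stops just before the second occurrence because of the negative lookahead. In .NET there is no `g` flag; `Regex.Matches` already returns all matches.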
+0

Could the downvoter explain what is wrong with this answer?

+0

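What is the point of posting the pattern flags when your Regex clearly doesn't use any of them? I would assume you just copied and pasted from whatever regex site you were on. You do set them as options when constructing the Regex, but they are not inline, so as posted they are irrelevant. You could show the OP the inline form later in the post so they would understand it more clearly. Also, the OP said ***"find a string in a binary file"***, and your approach will not work properly for that. – Codexer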

+0

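First, the flags are there, as options, because that is how it is normally done in .NET. Second, they are not irrelevant. If you remove the S flag, matching stops at a newline. Of course, you could remove the I flag. Third, it works on binary files with no problem at all. I just tried it on a 25MB executable and found 902 matches of the word "data". Try it.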
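The disagreement above comes down to bytes versus text. Decoding arbitrary binary data as UTF-8 replaces invalid byte sequences with U+FFFD, so the decoded string may not correspond to the file byte for byte, even though matching on ASCII words still succeeds. A minimal byte-level sketch of what the question asks for follows. The names `IndexOfBytes` and `TailAfterMarker` are hypothetical, the sample bytes are made up, and the choice of ISO-8859-1 (one character per byte) is an assumption, not something the question specifies.

```vbnet
Imports System
Imports System.Text

Module BinaryTailDemo
    ' Returns the index of the first occurrence of needle in haystack, or -1.
    Function IndexOfBytes(haystack As Byte(), needle As Byte()) As Integer
        If needle.Length = 0 Then Return 0
        For i As Integer = 0 To haystack.Length - needle.Length
            Dim j As Integer = 0
            While j < needle.Length AndAlso haystack(i + j) = needle(j)
                j += 1
            End While
            If j = needle.Length Then Return i
        Next
        Return -1
    End Function

    ' Returns everything after the first occurrence of marker, or Nothing if it is absent.
    Function TailAfterMarker(data As Byte(), marker As String) As String
        Dim needle As Byte() = Encoding.ASCII.GetBytes(marker)
        Dim pos As Integer = IndexOfBytes(data, needle)
        If pos < 0 Then Return Nothing

        ' Assumption: ISO-8859-1 maps each byte to exactly one character.
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")
        Dim start As Integer = pos + needle.Length
        Return latin1.GetString(data, start, data.Length - start)
    End Function

    Sub Main()
        ' Hypothetical sample: two binary bytes, "xyzzy", then "Hi!".
        Dim data As Byte() = {0, 255, 120, 121, 122, 122, 121, 72, 105, 33}
        Dim tail As String = TailAfterMarker(data, "xyzzy")
        Console.WriteLine(If(tail, "File doesn't have string")) ' prints Hi!
    End Sub
End Module
```

For a real file, `data` would come from `System.IO.File.ReadAllBytes("C:\data.bin")` as in the question. Reading the file once and decoding the tail in a single call avoids both the second `FileStream` and the per-byte string concatenation.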