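I came across a website that looked simple enough, and I was quite confident that I could read its data with HttpWebRequest, using both GET and POST requests. The GET request works fine. The POST request does not produce any error either, but the posted form data has no effect on the returned result. The form data contains fields that filter the data by date, yet the returned data is never filtered, even though every required field is posted. I have added every header and every form field, and I have added cookies to the request.
The URL of the page is http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0
It looks like a very ordinary website, but because it is an ASPX page that involves ViewState and EventValidation, it is not as easy as I expected.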
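My first step was to analyze the site's GET and POST requests with Fiddler, and I was surprised that Fiddler did not capture any traffic for this URL. I also tried Charles, but it did not capture this URL either. Both Fiddler and Charles capture everything else. I should also mention that when I call the URL from a console application using HttpWebRequest, both Fiddler and Charles capture it, but they do not capture it from Chrome, Firefox, or Internet Explorer 11.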
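So I analyzed the network activity in the Firefox developer tools, where everything was visible, including headers, parameters, and cookies. In Chrome, no cookies were present. When I created an HttpWebRequest, got the response, and checked for cookies, there were none either. So something odd is going on with this website.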
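One possible reason for seeing no cookies on the HttpWebRequest side: the CookieContainer property of HttpWebRequest is Nothing by default, and cookies sent by the server are only collected when a container has been assigned before the request is sent. The sketch below illustrates this under that assumption. CreateGet and FetchWithCookies are hypothetical helper names, not part of the original function, and whether this particular server sends any Set-Cookie header at all is not known (names such as __utma and _ga are normally Google Analytics cookies written by client-side script, not by the server).

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module CookieSketch
    ' Hypothetical helper: creates a GET request that records cookies in the given container.
    Function CreateGet(url As String, cookies As CookieContainer) As HttpWebRequest
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.Method = "GET"
        ' Without this assignment, Set-Cookie headers from the server are not stored anywhere.
        request.CookieContainer = cookies
        Return request
    End Function

    ' Hypothetical helper: sends the request and returns the page text.
    ' Any cookies the server sets end up in the container passed in.
    Function FetchWithCookies(url As String, cookies As CookieContainer) As String
        Dim request As HttpWebRequest = CreateGet(url, cookies)
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Dim url As String = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"

        ' A freshly created request has no cookie container.
        Dim bare As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        Console.WriteLine(bare.CookieContainer Is Nothing)

        ' Assigning one container and reusing it lets a later POST
        ' resend whatever the GET received.
        Dim jar As New CookieContainer()
        Dim request As HttpWebRequest = CreateGet(url, jar)
        Console.WriteLine(request.CookieContainer Is jar)
    End Sub
End Module
```

Main performs no network traffic; calling FetchWithCookies(url, jar) and then inspecting jar.GetCookies(New Uri(url)) would show whether the server sets any cookies.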
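Somehow I managed to write a simple function that creates the request and gets the response. I first make a GET request, read the page as a string, and extract the ViewState, EventValidation, and so on from it. I then use this information in a second HttpWebRequest, which is a POST. Everything runs and I get a response, but not the expected one. I want the records between two given dates, and I have specified those dates in the form data, but the POST request still does not return filtered data. The function I wrote is shown below, and I would greatly appreciate any suggestions on why this happens and how to deal with it. Understanding this has become a challenge for me, because I cannot see why such a simple site does not show up in Fiddler. (The page uses JavaScript postbacks.)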
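The extraction step described above can be sketched as one small helper. This is only an illustration: ExtractHiddenField is a hypothetical name, and the pattern assumes the hidden inputs are rendered as id="..." value="..." in that order, which is the same assumption the IndexOf-based code in the function makes. Unlike IndexOf, it returns Nothing when the field is missing instead of silently reading from the wrong position.

```vbnet
Imports System
Imports System.Net
Imports System.Text.RegularExpressions

Module HiddenFieldSketch
    ' Hypothetical helper: returns the value of a hidden input, or Nothing if it is absent.
    Function ExtractHiddenField(html As String, fieldId As String) As String
        ' Matches: id="<fieldId>" value="<captured>"
        Dim pattern As String = "id=""" & Regex.Escape(fieldId) & """\s+value=""([^""]*)"""
        Dim m As Match = Regex.Match(html, pattern)
        If Not m.Success Then Return Nothing
        ' Attribute values are HTML-encoded in the page source.
        Return WebUtility.HtmlDecode(m.Groups(1).Value)
    End Function

    Sub Main()
        Dim sample As String =
            "<input type=""hidden"" name=""__VIEWSTATE"" id=""__VIEWSTATE"" value=""abc123"" />"
        Console.WriteLine(ExtractHiddenField(sample, "__VIEWSTATE"))                ' abc123
        Console.WriteLine(ExtractHiddenField(sample, "__EVENTVALIDATION") Is Nothing) ' True
    End Sub
End Module
```

A Nothing result for __VIEWSTATE, __VIEWSTATEGENERATOR, or __EVENTVALIDATION would be a sign that the GET response is not the page that was expected.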
The function below may look long and scary, but it is very simple and straightforward.
Try
' First GET Request to obtain Viewstate, Eventvalidation etc
Dim objRequest2 As Net.HttpWebRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)
objRequest2.Method = "GET"
objRequest2.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
objRequest2.Headers.Add("Accept-Encoding", "gzip, deflate")
objRequest2.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
objRequest2.KeepAlive = True
objRequest2.ContentType = "application/x-www-form-urlencoded"
objRequest2.Host = "www.bseindia.com"
objRequest2.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
objRequest2.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip
Dim LoginRes2 As Net.HttpWebResponse
Dim sr2 As IO.StreamReader
LoginRes2 = objRequest2.GetResponse()
sr2 = New IO.StreamReader(LoginRes2.GetResponseStream)
Dim getString As String = sr2.ReadToEnd()
Dim getCookieCollection = objRequest2.CookieContainer
' get the page ViewState
Dim viewStateFlag As String = "id=""__VIEWSTATE"" value="""
Dim i As Integer = getString.IndexOf(viewStateFlag) + viewStateFlag.Length
Dim j As Integer = getString.IndexOf("""", i)
Dim viewState As String = getString.Substring(i, j - i)
' get page EventValidation
Dim eventValidationFlag As String = "id=""__EVENTVALIDATION"" value="""
i = getString.IndexOf(eventValidationFlag) + eventValidationFlag.Length
j = getString.IndexOf("""", i)
Dim eventValidation As String = getString.Substring(i, j - i)
' get page ViewStateGenerator
Dim viewstateGeneratorFlag As String = "id=""__VIEWSTATEGENERATOR"" value="""
i = getString.IndexOf(viewstateGeneratorFlag) + viewstateGeneratorFlag.Length
j = getString.IndexOf("""", i)
Dim viewStateGenerator As String = getString.Substring(i, j - i)
viewState = System.Web.HttpUtility.UrlEncode(viewState)
eventValidation = System.Web.HttpUtility.UrlEncode(eventValidation)
Dim LoginRes As Net.HttpWebResponse
Dim sr As IO.StreamReader
Dim objRequest As Net.HttpWebRequest
' Second POST request to post the form data along with cookies
objRequest = DirectCast(HttpWebRequest.Create("http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"), HttpWebRequest)
Dim formDataCollection As New NameValueCollection
formDataCollection.Add("__EVENTTARGET", "")
formDataCollection.Add("__EVENTARGUMENT", "")
formDataCollection.Add("__VIEWSTATE", viewState)
formDataCollection.Add("__VIEWSTATEGENERATOR", viewStateGenerator)
formDataCollection.Add("__EVENTVALIDATION", eventValidation)
formDataCollection.Add("fmdate", "20160104")
formDataCollection.Add("eddate", "20160204")
formDataCollection.Add("hidCurrentDate", "2016/02/04")
formDataCollection.Add("ctl00_ContentPlaceHolder1_hdnCode", "")
formDataCollection.Add("txtDate", "04/01/2016")
formDataCollection.Add("ddlCalMonthDiv3", "1")
formDataCollection.Add("ddlCalYearDiv3", "2016")
formDataCollection.Add("txtTodate", "04/02/2016")
formDataCollection.Add("ddlCalMonthDiv4", "2")
formDataCollection.Add("ddlCalYearDiv4", "2016")
formDataCollection.Add("Hidden1", "")
formDataCollection.Add("ctl00_ContentPlaceHolder1_GetQuote1_smartSearch", "Enter Security Name/Code/ID")
formDataCollection.Add("btnSubmit.x", "44")
formDataCollection.Add("btnSubmit.y", "2")
Dim strFormdata As String = formDataCollection.ToString()
Dim encoding As New ASCIIEncoding
Dim postBytes As Byte() = encoding.GetBytes(strFormdata)
objRequest.Method = "POST"
objRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
objRequest.Headers.Add("Accept-Encoding", "gzip, deflate")
objRequest.Headers.Add("Accept-Language", "en-GB,en-US;q=0.8,en;q=0.6,ur;q=0.4")
objRequest.Headers.Add("Cache-Control", "private, max-age=60")
objRequest.KeepAlive = True
objRequest.ContentType = "application/x-www-form-urlencoded"
objRequest.Host = "www.bseindia.com"
objRequest.Headers.Add("Origin", "http://www.bseindia.com")
objRequest.Referer = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"
objRequest.Headers.Add("Upgrade-Insecure-Requests", "1")
objRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36"
objRequest.ContentType = "text/html; charset=utf-8"
objRequest.Date = "Thu, 04 Feb 2016 13:42:04 GMT"
objRequest.Headers.Add("Server", "Microsoft-IIS/8.0")
objRequest.Headers.Add("Vary", "Accept-Encoding")
objRequest.Headers.Add("X-AspNet-Version", "2.0.50727")
objRequest.Headers.Add("ASP.NET", "ASP.NET")
objRequest.AutomaticDecompression = DecompressionMethods.Deflate Or DecompressionMethods.GZip
Dim gaCookies As New CookieContainer()
Dim cookie1 As New Cookie("__asc", "f673f0d5152a823bc335f575d34")
cookie1.Domain = ".bseindia.com"
cookie1.Path = "/"
gaCookies.Add(cookie1)
Dim cookie2 As New Cookie("__auc", "f673f0d5152a823bc335f575d34")
cookie2.Domain = ".bseindia.com"
cookie2.Path = "/"
gaCookies.Add(cookie2)
Dim cookie3 As New Cookie("__utma", "253454874.280640365.1454519857.1454519865.1454519865.1")
cookie3.Domain = ".bseindia.com"
cookie3.Path = "/"
gaCookies.Add(cookie3)
Dim cookie4 As New Cookie("__utmb", "253454874.1.10.1454519865")
cookie4.Domain = ".bseindia.com"
cookie4.Path = "/"
gaCookies.Add(cookie4)
Dim cookie5 As New Cookie("__utmc", "253454874")
cookie5.Domain = ".bseindia.com"
cookie5.Path = "/"
gaCookies.Add(cookie5)
Dim cookie6 As New Cookie("__utmt", "1")
cookie6.Domain = ".bseindia.com"
cookie6.Path = "/"
gaCookies.Add(cookie6)
Dim cookie7 As New Cookie("__utmz", "253454874.1454519865.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)")
cookie7.Domain = ".bseindia.com"
cookie7.Path = "/"
gaCookies.Add(cookie7)
Dim cookie8 As New Cookie("_ga", "GA1.2.280640365.1454519857")
cookie8.Domain = ".bseindia.com"
cookie8.Path = "/"
gaCookies.Add(cookie8)
Dim cookie9 As New Cookie("_gat", "1")
cookie9.Domain = ".bseindia.com"
cookie9.Path = "/"
gaCookies.Add(cookie9)
Dim postStream As Stream = objRequest.GetRequestStream()
postStream.Write(postBytes, 0, postBytes.Length)
postStream.Flush()
postStream.Close()
LoginRes = objRequest.GetResponse()
sr = New IO.StreamReader(LoginRes.GetResponseStream)
ReadWebsite = sr.ReadToEnd()
sr.Close()
sr = Nothing
LoginRes.Close()
LoginRes = Nothing
objRequest = Nothing
Exit Function
Catch ex As Exception
ReadWebsite = Nothing
End Try
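One detail in the function above that may be worth checking is the call formDataCollection.ToString(). A plain NameValueCollection does not override ToString(), so the call returns the type name rather than a name=value&name=value string. Only the specialised collection returned by HttpUtility.ParseQueryString produces a query string from ToString(). The sketch below shows one way the body could be built instead. BuildFormBody is a hypothetical helper, and WebUtility.UrlEncode assumes .NET Framework 4.5 or later.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Collections.Specialized
Imports System.Net

Module FormBodySketch
    ' Hypothetical helper: builds an application/x-www-form-urlencoded body
    ' from raw (not yet encoded) field values.
    Function BuildFormBody(fields As NameValueCollection) As String
        Dim pairs As New List(Of String)
        For Each key As String In fields.AllKeys
            pairs.Add(WebUtility.UrlEncode(key) & "=" & WebUtility.UrlEncode(fields(key)))
        Next
        Return String.Join("&", pairs)
    End Function

    Sub Main()
        Dim fields As New NameValueCollection()
        fields.Add("fmdate", "20160104")
        fields.Add("txtDate", "04/01/2016")

        ' Prints the type name: System.Collections.Specialized.NameValueCollection
        Console.WriteLine(fields.ToString())

        ' Prints the encoded name=value pairs joined with "&"
        Console.WriteLine(BuildFormBody(fields))
    End Sub
End Module
```

Because this helper encodes every value itself, the earlier HttpUtility.UrlEncode calls on viewState and eventValidation would have to be dropped when using it, otherwise those values would be encoded twice.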
Note: the raw form data for the dates, without the ViewState and EventValidation:
fmdate:20160130
eddate:20160205
hidCurrentDate:2016/02/05
ctl00_ContentPlaceHolder1_hdnCode:
txtDate:04/01/2016
ddlCalMonthDiv3:1
ddlCalYearDiv3:2016
txtTodate:04/02/2016
ddlCalMonthDiv4:2
ddlCalYearDiv4:2016
Hidden1:
ctl00_ContentPlaceHolder1_GetQuote1_smartSearch:Enter Security Name/Code/ID
btnSubmit.x:55
btnSubmit.y:13
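A browser sends the fields above with the content type application/x-www-form-urlencoded. In the function, ContentType is assigned twice on the POST request, and the later value (text/html; charset=utf-8) is the one that is actually sent. Server, Vary, X-AspNet-Version, and Date are also normally response headers, not request headers. As far as I know, ASP.NET only parses posted form fields when the request declares a form content type, so this may be part of why the dates are ignored. The sketch below is a minimal request setup under those assumptions. CreateFormPost and SendForm are hypothetical helpers, and the body passed to SendForm is assumed to be URL-encoded already.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module FormPostSketch
    ' Hypothetical helper: configures a form POST with request-side headers only.
    Function CreateFormPost(url As String, cookies As CookieContainer) As HttpWebRequest
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.Method = "POST"
        ' Set exactly once; a later assignment would replace this value.
        request.ContentType = "application/x-www-form-urlencoded"
        request.Referer = url
        ' The same container that was used for the GET, so cookies carry over.
        request.CookieContainer = cookies
        request.AutomaticDecompression = DecompressionMethods.GZip Or DecompressionMethods.Deflate
        Return request
    End Function

    ' Hypothetical helper: writes an already URL-encoded body and returns the response text.
    Function SendForm(request As HttpWebRequest, body As String) As String
        Dim bytes As Byte() = Encoding.ASCII.GetBytes(body)
        Using stream As Stream = request.GetRequestStream()
            stream.Write(bytes, 0, bytes.Length)
        End Using
        Using response As WebResponse = request.GetResponse()
            Using reader As New StreamReader(response.GetResponseStream())
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function

    Sub Main()
        Dim url As String = "http://www.bseindia.com/corporates/Insider_Trading_new.aspx?expandable=0"
        Dim request As HttpWebRequest = CreateFormPost(url, New CookieContainer())
        ' No network traffic happens until SendForm is called.
        Console.WriteLine(request.ContentType)
    End Sub
End Module
```

Comparing the request this produces with the one shown in the browser's developer tools, header by header and field by field, is probably the quickest way to confirm or rule out these differences.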
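It would be very helpful if someone left a comment explaining why the question was downvoted and voted to close. I know there are other questions on this particular topic, but this situation is different. To me, the purpose of this forum is to learn about things that you and the people around you do not understand. I have clearly described all my efforts and the code I wrote, so I am not asking without having done proper research. –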
I would check: 'formDataCollection.Add("fmdate", "20160104")' and the lines below it. All the other dates you use seem to be in a different format. – Jeroen
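@Jeroen Thanks for your comment. I am using the same format that I found in the inspector. Please check my updated note. I have added the raw form data copied from Chrome. –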