2017-10-18 62 views

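How can I speed up this C# HttpWebRequest?

I have a C# Script Task in an SSIS job that calls an API to geocode addresses. The API is proprietary and works like this: it receives a request, takes the address string, and tries to match that string against a huge list of addresses (millions). If it cannot find a match, it goes out to another service, such as Google, and fetches the geodata from there.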

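As you can imagine, this string matching takes a lot of time per request. Sometimes the per-minute request rate is very low, and I have 4M addresses to process. Doing any development work on the API side is not an option. To give a better picture of the process, here is what I am currently doing: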

I pull the list of addresses (about 4M) from the database, put it in a DataTable, and set the variables:

​​

GetGLFromAddress() works like this:

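It takes the variables from above and builds the JSON. It sends the JSON with a POST through HttpWebRequest. It waits for the response (this is the time-consuming part). It reads the response and sets new variables from the returned values. It uses those variables to update/insert into the database, and then the loop moves on to the next row of the original DataTable.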

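Understanding this flow is important, because I need to keep the variables for each request together so that I update the correct record in the database.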

Here is GetGLFromAddress():

private void GetGLFromAddress() 
    { 
     // Request JSON data with Payload 
     var httpWebRequest = (HttpWebRequest)WebRequest.Create("http:"); 
     httpWebRequest.Headers.Add("Authorization", ""); 
     httpWebRequest.ContentType = "application/json"; 
     httpWebRequest.Method = "POST"; 

     using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream())) 
     { 
      // this takes the variables from your c# datatable and formats them for json post 
      var jS = new JavaScriptSerializer(); 
      var newJson = jS.Serialize(new SeriesPost() 
      { 
       AddressLine1 = address, 
       City   = city, 
       StateCode = state, 
       CountryCode = country, 
       PostalCode = zip, 
       CreateSiteIfNotFound = true 
      }); 


      // Write the JSON to the debug output so you can see what is being sent
      System.Diagnostics.Debug.WriteLine(newJson); 
      streamWriter.Write(newJson); 
      streamWriter.Flush(); 
      streamWriter.Close(); 

     } 

     try 
     { 
      var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse(); 
      using (var streamReader = new StreamReader(httpResponse.GetResponseStream())) 
      { 
       var result = streamReader.ReadToEnd(); 
       // Deserialize the returned JSON so its values can be used to set the variables for the insert string
       var p1 = new JavaScriptSerializer(); 

       // After this line, obj holds the fully deserialized JSON. Notice how obj[x].fieldname is referenced below.
       // If you ever want to change the fields or bring more in, this is how you do it.
       var obj = p1.Deserialize<List<RootObject>>(result); 

       // Make sure each returned value is not null or blank before using it. When one is missing, the variable is set to the literal string "null".
       if (string.IsNullOrWhiteSpace(obj[0].MasterSiteId)) 
       { 
        retGLMID = "null"; 
       } 
       else 
       { 
        retGLMID = obj[0].MasterSiteId.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].PrecisionName)) 
       { 
        retAcc = "null"; 
       } 
       else 
       { 
        retAcc = obj[0].PrecisionName.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].PrimaryAddress.AddressLine1Combined)) 
       { 
        retAddress = "null"; 
       } 
       else 
       { 
        retAddress = obj[0].PrimaryAddress.AddressLine1Combined.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].Latitude)) 
       { 
        retLat = "null"; 
       } 
       else 
       { 
        retLat = obj[0].Latitude.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].Longitude)) 
       { 
        retLong = "null"; 
       } 
       else 
       { 
        retLong = obj[0].Longitude.ToString(); 
       } 
       retNewRecord = obj[0].IsNewRecord.ToString(); 

       // Build insert string... notice how I use the recently created variables 
       // string insertStr = retGLMID + ", '" + retAcc + "', '" + retAddress + "', '" + retLat + "', '" + retLong + "', '" + localID; 
       string insertStr = "insert into table  " + 
            "(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " + 
            " VALUES          " + 
            "('" + localID + "', '" + retGLMID + "', '" + retNewRecord + "', '" + retAcc + "')"; 


       string connectionString = "Data Source=; Initial Catalog=; Trusted_Connection=Yes"; 
       using (SqlConnection connection = new SqlConnection(connectionString)) 
       { 
        SqlCommand cmd = new SqlCommand(insertStr); 
        cmd.CommandText = insertStr; 
        cmd.CommandType = CommandType.Text; 
        cmd.Connection = connection; 
        connection.Open(); 
        cmd.ExecuteNonQuery(); 
        connection.Close(); 
       } 
      } 
     } 

     catch (Exception) // the try above had no handler; any failure is recorded as Not_Found
     {
      string insertStr2 = "insert into table " + 
           "(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " + 
           " VALUES          " + 
           "('" + localID + "', null, null, 'Not_Found')"; 
      string connectionString2 = "Data Source=; Initial Catalog=; Trusted_Connection=Yes"; 

      using (SqlConnection connection = new SqlConnection(connectionString2)) 
      { 
       SqlCommand cmd = new SqlCommand(insertStr2); 
       cmd.CommandText = insertStr2; 
       cmd.CommandType = CommandType.Text; 
       cmd.Connection = connection; 
       connection.Open(); 
       cmd.ExecuteNonQuery(); 
       connection.Close(); 
      } 
     } 
    } 

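When I tried to use Parallel.ForEach, I had problems with the variables. I want to run multiple requests at once but keep a separate instance of each variable for each request (if that makes sense). I cannot pass an identifier of my own through the API and have it returned, otherwise that would be ideal.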

Is this even possible?

How should I structure this call to achieve what I am after?

Essentially, I want to be able to send multiple calls at once to speed up the whole process.

Edit: added the code for GetGLFromAddress(). Yes, I am a newbie, so please be kind :)

Answers


Put all the data in an array so that you can issue several requests at once. It is best to call the API using multiple tasks or async methods.
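The suggestion above could be sketched roughly as follows. This is only an illustration under assumptions, not the asker's actual script: `AddressRow`, `GeoResult`, `ParallelGeocoder`, and the `callApi` delegate are hypothetical names, and `callApi` stands in for the real HTTP POST. It also assumes the Script Task targets .NET Framework 4.5 or later, where `async`/`await` is available. The point it shows is that each request keeps its own local ID in a local variable, so the ID travels with the result no matter which response arrives first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical input row: the database key plus the address to geocode.
public sealed class AddressRow
{
    public string LocalId;
    public string Address;
}

// Hypothetical result: the key is copied in, so completion order does not matter.
public sealed class GeoResult
{
    public string LocalId;
    public string GlmId;
}

public static class ParallelGeocoder
{
    // callApi is an assumed stand-in for the real HTTP call; it returns the matched ID.
    public static async Task<List<GeoResult>> RunAsync(
        IEnumerable<AddressRow> rows,
        Func<AddressRow, Task<string>> callApi,
        int maxConcurrency)
    {
        // The semaphore caps how many requests are in flight at once.
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            List<Task<GeoResult>> tasks = rows.Select(async row =>
            {
                await gate.WaitAsync();
                try
                {
                    // row is local to this lambda invocation, not a shared field.
                    string glmId = await callApi(row);
                    return new GeoResult { LocalId = row.LocalId, GlmId = glmId };
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // Waits until every request has finished.
            GeoResult[] results = await Task.WhenAll(tasks);
            return results.ToList();
        }
    }
}
```

Each returned `GeoResult` can then be written to the database using its own `LocalId`. The `maxConcurrency` value is a tuning knob; how many simultaneous requests the API tolerates is something only testing against it can tell.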


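So if I put all the request strings in an array, say just two of them, A and B, and I send both A and B and B comes back first, how do I make sure I update the correct record in the database? Does async have a way to make sure things come back in the right order? And what about multithreading? – user3486773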
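On the multithreading part of that question, one possible approach is to stop relying on return order altogether and key every result by the record's own ID. The sketch below is a hedged illustration, not a tested SSIS solution: `ThreadedGeocoder` and the `geocode` delegate are hypothetical, with `geocode` standing in for the blocking HTTP call. Variables declared inside the `Parallel.ForEach` body are private to each iteration, unlike class-level fields shared by every thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ThreadedGeocoder
{
    // geocode is an assumed stand-in for the blocking API call: address in, matched ID out.
    public static ConcurrentDictionary<string, string> Run(
        IEnumerable<KeyValuePair<string, string>> addressesByLocalId,
        Func<string, string> geocode,
        int maxThreads)
    {
        // Thread-safe map from local ID to the API's answer.
        var resultsByLocalId = new ConcurrentDictionary<string, string>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.ForEach(addressesByLocalId, options, pair =>
        {
            // These locals exist once per iteration, so threads cannot overwrite each other.
            string localId = pair.Key;
            string glmId = geocode(pair.Value);
            resultsByLocalId[localId] = glmId;
        });

        return resultsByLocalId;
    }
}
```

Because each entry is stored under its local ID, it does not matter whether A or B finishes first. For work that mostly waits on the network, an async approach usually ties up fewer threads than `Parallel.ForEach`, so this variant is mainly useful when the existing call has to stay synchronous.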