2011-09-13 95 views
0

我有一个excel电子表格和产品名称列表。我想要做的是(1)将这些产品名称分成5行,(2)设置一个从指定网站(clinicaltrials.gov)提取数据的网站搜索,并将其填充到每个电子表格下方的行中。 (2)现在对我来说更重要和更具挑战性。我知道我必须运行一个遍历所有产品名称的循环。但在我关注循环之前,我需要帮助弄清楚如何编写执行网站搜索的代码。在Excel上进行网站搜索

我接收一些帮助:

以下Excel VBA一小段代码片段将采取细胞与构造的URL的形式:

="URL;http://clinicaltrials.gov/ct2/show?term="& [Cell Reference to Drug name here] &"&rank=1" 

和输出4行,例如:

Estimated Enrollment: 40 
Study Start Date: Jan-11 
Estimated Study Completion Date: Apr-12 
Estimated Primary Completion Date: April 2012 (Final data collection date for primary outcome measure) 

 

    With ActiveSheet.QueryTables.Add(Connection:= _ 
      ActiveCell.Text, Destination:=Cells(ActiveCell.Row, ActiveCell.Column + 1)) 
      .Name = "Clinical Trials" 
      .FieldNames = True 
      .RowNumbers = False 
      .FillAdjacentFormulas = False 
      .PreserveFormatting = True 
      .RefreshOnFileOpen = False 
      .BackgroundQuery = True 
      .RefreshStyle = xlInsertDeleteCells 
      .SavePassword = False 
      .SaveData = True 
      .AdjustColumnWidth = True 
      .RefreshPeriod = 0 
      .WebSelectionType = xlSpecifiedTables 
      .WebFormatting = xlWebFormattingNone 
      .WebTables = "12" 
      .WebPreFormattedTextToColumns = True 
      .WebConsecutiveDelimitersAsOne = True 
      .WebSingleBlockTextImport = False 
      .WebDisableDateRecognition = False 
      .WebDisableRedirections = False 
      .Refresh BackgroundQuery:=False 
     End With 
+0

你能发布一个样品药物名称,以便我可以看到查询结果吗? – JimmyPena

回答

1

您提供的URL不起作用。您需要NCT ID才能进入正确的页面,而不是药物名称。假设你有A1上市的两种药物:B2和适当的NCT ID是在B列

celebrex NCT00571701 
naproxen NCT00586365 

若要使用此代码,设置对Microsoft XML 5.0库的引用和Microsoft窗体2.0库。

Sub GetClinical() 

    Dim i As Long 
    Dim lLast As Long 
    Dim oHttp As MSXML2.XMLHTTP50 
    Dim sHtml As String 
    Dim lDataStart As Long, lTblStart As Long, lTblEnd As Long 
    Dim doClip As DataObject 

    'Find the last cell in column A 
    lLast = Sheet1.Cells(Sheet1.Rows.Count, 1).End(xlUp).Row 
    Set oHttp = New MSXML2.XMLHTTP50 

    'Loop from the last cell to row 1 in column A 
    For i = lLast To 1 Step -1 
     'Insert 5 rows below 
     Sheet1.Cells(i, 1).Offset(1, 0).Resize(5).EntireRow.Insert 

     'get the web page 
     oHttp.Open "GET", "http://clinicaltrials.gov/ct2/show/" & Sheet1.Cells(i, 2).Value & "?rank=1" 
     oHttp.send 
     sHtml = oHttp.responseText 

     'Find the start and end to the table 
     lDataStart = InStr(1, sHtml, "Estimated Enrollment:") 
     lTblStart = InStr(lDataStart - 200, sHtml, "<table") 
     lTblEnd = InStr(lDataStart, sHtml, "</table>") + 8 

     'put the table in the clipboard 
     Set doClip = New DataObject 
     doClip.SetText Mid$(sHtml, lTblStart, lTblEnd - lTblStart) 
     doClip.PutInClipboard 

     'paste the table as text 
     Sheet1.Cells(i, 1).Offset(1, 0).Select 
     Sheet1.PasteSpecial "Text", , , , , , True 

    Next i 

End Sub 

如果你没有NCT号码,我认为你不能构建一个可行的URL。另请注意,我通过查找特定字符串(预计注册人: - 注意其间的两个空格)并备份200个字符来查找表。 200是任意的,但为celebrex和naproxen工作。我无法保证他们的格式一致。他们不使用表格ID,因此很难找到合适的表格。

在运行修改它的代码之前,请始终对数据进行备份。

0

如果您运行搜索并查看结果页面的底部,则会看到可以选择以各种格式下载结果。比如这个网址会下载所有的氟西汀的结果制表符分隔的格式:

http://clinicaltrials.gov/ct2/results/download?down_stds=all&down_flds=all&down_fmt=tsv&term=fluoxetine 

唯一复杂的,结果是压缩的,所以你需要保存文件,第一解压。幸运的是,我已经不得不这样做了......在工作簿的同一文件夹中创建一个名为“files”的文件夹,然后添加此代码并对其进行测试。适合我的作品。

Option Explicit 

Sub Tester() 

    FetchUnzipOpen "fluoxetine" 

End Sub 

Sub FetchUnzipOpen(DrugName As String) 
    Dim s, sz 'don't dim these as strings-must be variants! 
    s = ThisWorkbook.Path & "\files" 
    sz = s & "\test.zip" 
    FetchFile "http://clinicaltrials.gov/ct2/results/download?down_stds=all&" & _ 
       "down_flds=all&down_fmt=tsv&term=" & DrugName, sz 
    Unzip s, sz 
    'now you just need to open the data file (files/search_result.txt) 
End Sub 


Sub FetchFile(sURL As String, sPath) 
Dim oXHTTP As Object 
Dim oStream As Object 

    Set oXHTTP = CreateObject("MSXML2.XMLHTTP") 
    Set oStream = CreateObject("ADODB.Stream") 
    Application.StatusBar = "Fetching " & sURL & " as " & sPath 
    oXHTTP.Open "GET", sURL, False 
    oXHTTP.send 
    With oStream 
     .Type = 1 'adTypeBinary 
     .Open 
     .Write oXHTTP.responseBody 
     .SaveToFile sPath, 2 'adSaveCreateOverWrite 
     .Close 
    End With 
    Set oXHTTP = Nothing 
    Set oStream = Nothing 
    Application.StatusBar = False 

End Sub 

Sub Unzip(sDest, sZip) 
Dim o 
Set o = CreateObject("Shell.Application") 
o.NameSpace(sDest).CopyHere o.NameSpace(sZip).Items 
End Sub