2016-05-14 30 views
0

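C#: extract a word from a string

I should start by saying: I'm not great at programming, but it's a lot of fun! I'm working on a Siri-like program and I'm trying to implement a Wikipedia feature. To do this I ask a question such as: "tell me about Superman".

I need to extract the word "Superman", or any other arbitrary word someone might ask about, from this string. That part isn't hard, but the real problem starts when someone asks "Can you tell me about Superman": I still want to extract the word "Superman".

Here is an example of what I tried before: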

if ((c.Contains("tell me about")) || (c.Contains("Tell me about"))) 
{ 
    string query = c; 
    var part = query.Split('t').Last(); //cant search for words containing the letter t like artificial intelligence 

    string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + part + "&MaxHits=1"); 

    XmlReader reader = XmlReader.Create(url); 
    while (reader.Read()) 
     switch (reader.Name.ToString()) 
     { 
      case "Description": 
       sp(reader.ReadString()); 
       break; 

     } 
} 

I've almost solved the problem; this solution seems to work about 80% of the time. Still, it's a step in the right direction.

 if ((c.Contains("tell me about")) || (c.Contains("Tell me about"))) 
     { 
      string query = c; 
      string[] lines = Regex.Split(query, "about "); 
      foreach (string line in lines) 
      { 

      string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + line + "&MaxHits=1"); 

       XmlReader reader = XmlReader.Create(url); 
       while (reader.Read()) 

        switch (reader.Name.ToString()) 
        { 
         case "Description": 
          sp(reader.ReadString()); 
          break; 

        } 
      } 
     } 

Is there a better/simpler way to do this?

+0

You could use [parsey mcparseface](http://thenextweb.com/dd/2016/05/12/google-just-open-sourced-something-called-parsey-mcparseface-change-ai-forever/)! Language understanding is no easy task. – Tdorno

+0

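Make a dictionary of word/interpretation pairs for the words you want to interpret, and decide which match takes priority when several of them occur. For the simple case you want, you are then more or less done. – Ian

Answers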

2

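As suggested in the comments, if this is for any kind of production application, your best option is to use an existing library.

It can still be a fun exercise to do it yourself.

I'd say there are many more ways to ask about Superman: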

"what do you know about Superman" 
"let's talk about Superman" 
"who is Superman" 

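And many, many more.

All of these questions are built from some auxiliary words ("what", "who", "a", "about") plus the actual words that describe the subject of the question ("Superman"). A simplified approach is to eliminate all the auxiliaries and take whatever is left.

To quickly build a simple list of question words and question phrases, I used an English grammar site. I took the sample phrases and removed the subjects of the questions. That gave me a list of 50-60 auxiliary words.

Now all I have to do is take the sentence and remove every word that appears in the auxiliary list. The code looks like this: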

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    // All the words collected from the sample question phrases.
    private static string auxStr = @"Who is the Who are Who is that there Where is the Where do you Where are my
     When do the When is his When are we Why do we Why are they always Why does he What is What is her What is the Which
     drink did you Which Which is How do you How does he know the answer How can I learn many much often far tell say
     explain answer for from with about on me he his him her hers your yours they their theirs";

    // The verbatim string above spans several lines, so split on line breaks
    // as well as spaces and drop the empty entries left by the indentation.
    private static readonly char[] separators = { ' ', '\t', '\r', '\n' };

    private static List<string> aux = new List<string>();

    static void Main(string[] args)
    {
        // Build a list of auxiliary words.
        aux = auxStr.ToLower().Split(separators, StringSplitOptions.RemoveEmptyEntries).Distinct().ToList();

        // Test the method to get a subject.
        var subject = GetSubject("Do you know where is Poland", aux);

        foreach (var s in subject)
        {
            Console.WriteLine(s);
        }

        Console.ReadLine();
    }

    private static List<string> GetSubject(string question, List<string> auxiliaries)
    {
        // Convert the question to a list of lower-case words.
        var listQuestion = question.ToLower().Split(separators, StringSplitOptions.RemoveEmptyEntries).Distinct().ToList();

        // Remove from the question all the words
        // that are in the list of auxiliary words.
        var notAux = listQuestion.Where(w => !auxiliaries.Contains(w)).ToList();

        return notAux;
    }
}

It's quite simplistic, but with very little effort it narrows down the list of potential subjects of the question.
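A possible variation on the same idea, in case it helps: the sketch below keeps the original casing of the subject and strips trailing punctuation, so "Can you tell me about Superman?" yields "Superman" rather than "superman?". The class name `SubjectExtractor` and the short stop-word list are made up for illustration; a real list would be much longer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SubjectExtractor
{
    // Hypothetical, deliberately short stop-word list (an assumption for this sketch).
    private static readonly HashSet<string> Aux = new HashSet<string>(
        new[] { "can", "you", "tell", "me", "about", "what", "who", "is", "do", "know", "let's", "talk" },
        StringComparer.OrdinalIgnoreCase);

    // Returns the words of the question that are not stop words,
    // keeping their original order and casing.
    public static string GetSubject(string question)
    {
        var words = question
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.Trim('?', '!', '.', ',', '"'))   // drop surrounding punctuation
            .Where(w => w.Length > 0 && !Aux.Contains(w));  // case-insensitive lookup
        return string.Join(" ", words);
    }
}

class Demo
{
    static void Main()
    {
        Console.WriteLine(SubjectExtractor.GetSubject("Can you tell me about Superman?"));
        Console.WriteLine(SubjectExtractor.GetSubject("let's talk about artificial intelligence"));
    }
}
```

The `HashSet` with `StringComparer.OrdinalIgnoreCase` avoids lower-casing the input, which is why the subject comes back with its original capitalisation.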

0

I finally found the answer:

if ((c.Contains("tell me about")) || (c.Contains("Tell me about"))) 
     { 
      string query = c; 
      string[] lines = Regex.Split(query, "about "); 
      string finalquery = lines[lines.Length - 1]; 

      string url = ("http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=" + finalquery + "&MaxHits=1"); 

       XmlReader reader = XmlReader.Create(url); 
       while (reader.Read()) 

        switch (reader.Name.ToString()) 
        { 
         case "Description": 
          sp(reader.ReadString()); 
          break; 

        } 
     } 

Now it works 100% of the time! If anyone knows a better way to do this, I'd be more than happy to hear it.
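For what it's worth, here is a rough sketch of the same "take whatever follows the last 'about '" idea without `Regex.Split`. It matches "about" case-insensitively and URL-escapes the topic so that multi-word subjects such as "artificial intelligence" survive in the query string. The names `QueryBuilder`, `ExtractTopic` and `BuildUrl` are hypothetical helpers, not part of the code above; the URL is the one already used in the question.

```csharp
using System;

static class QueryBuilder
{
    // Returns the text after the last "about " (case-insensitive),
    // or null when the marker is missing or nothing follows it.
    public static string ExtractTopic(string input)
    {
        const string marker = "about ";
        int i = input.LastIndexOf(marker, StringComparison.OrdinalIgnoreCase);
        if (i < 0) return null;

        string topic = input.Substring(i + marker.Length).Trim().TrimEnd('?', '.', '!');
        return topic.Length > 0 ? topic : null;
    }

    // Builds the lookup URL with the topic escaped for use in a query string.
    public static string BuildUrl(string topic)
    {
        return "http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString="
            + Uri.EscapeDataString(topic) + "&MaxHits=1";
    }
}

class Demo
{
    static void Main()
    {
        string topic = QueryBuilder.ExtractTopic("Can you tell me about artificial intelligence?");
        if (topic != null)
        {
            Console.WriteLine(QueryBuilder.BuildUrl(topic));
        }
    }
}
```

The resulting URL could then be passed to `XmlReader.Create` exactly as in the snippet above.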