2016-09-13
0

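Exclude macros from Umbraco search

I ran into a problem while setting up the Lucene search engine in Umbraco. I am trying to search the data stored in the default index that Umbraco creates. The search method is as follows: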

 private DictionaryResult GetRowContent(
     Lucene.Net.Highlight.Highlighter highlighter, 
     Lucene.Net.Analysis.Standard.StandardAnalyzer analyzer 
     ,Lucene.Net.Documents.Document doc1, string criteria) 
    { 
     JavaScriptSerializer jsScriptSerializer = new JavaScriptSerializer(); 
     DictionaryResult controls = new DictionaryResult(); 
     Lucene.Net.Analysis.TokenStream stream = analyzer.TokenStream("", new StringReader(doc1.Get("bodyContent"))); 
     dynamic rowContentHtmlDocument = JObject.Parse(((JValue)doc1.Get("bodyContent")).ToString(CultureInfo.CurrentCulture)); 
     foreach (dynamic section in rowContentHtmlDocument.sections) 
     { 
      foreach (var row in section.rows) 
      { 
       foreach (var area in row.areas) 
       { 
        foreach (var control in area.controls) 
        { 
         if (control != null && control.editor != null) // && control.editor.view != null) 
         { 
          JObject rowContentHtml = null; 
          try 
          { 
           rowContentHtml = JObject.Parse(((JContainer)control)["value"].ToString()); 
          } 
          catch (Exception e) 
          { 
          } 
          if (rowContentHtml != null) 
          { 
           try 
           { 
            var macroParamsDictionary = JObject.Parse(((JContainer)rowContentHtml)["macroParamsDictionary"].ToString()); 
            var documentText = macroParamsDictionary.GetValue("dokument"); 
            if (documentText != null) 
            { 
             var document = documentText.ToString().Replace("&quot;", "\""); 
             dynamic documents = jsScriptSerializer.Deserialize<dynamic>(document); 
             foreach (Dictionary<string, object> doc in documents) 
             { 
              if (doc.ContainsKey("FileName") && doc.ContainsKey("DocumentId")) 
              { 
               if (doc["FileName"].ToString().Length > 0 && 
                doc["FileName"].ToString().ToLower().Contains(criteria.ToLower())) 
               { 
                controls.Add(new RowResult() 
                { 
                 Type = 0, 
                 Object = new Document() 
                 { 
                  DocumentName = doc["FileName"].ToString(),//highlighter.GetBestFragments(stream, doc["FileName"].ToString(), 1, "..."), 
                  DocId = Guid.Parse(doc["DocumentId"].ToString()) 
                 } // StringBuilder(@"<a href=" + Url.Action("DownloadDocument", "Document", new { DocumentId = doc["DocumentId"] }) + "> " + @doc["FileName"] + "</a>").ToString() 
                } 
                ); 
               } 
              } 
             } 
            } 
           } 
           catch (Exception e) 
           { 
           } 
          } 
          else 
          { 
           var text = HtmlRemoval.StripTagsRegex(((JContainer)control)["value"].ToString()).Replace("ë", "e").Replace("ç", "c"); 
           var textResultFiltered = highlighter.GetBestFragments(stream,doc1.Get("bodyContent"), 5, "..."); 
           controls.Add(new RowResult() 
           { 
            Type = 1, 
            Object = textResultFiltered 
           }); 
          } 
         } 
        } 
       } 
      } 
     } 
     return controls; 
    } 

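Here I try to separate the macro documents from the plain HTML content and render them differently. The problem is in this final part: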

var text = HtmlRemoval.StripTagsRegex(((JContainer)control)["value"].ToString()).Replace("ë", "e").Replace("ç", "c"); 
          var textResultFiltered = highlighter.GetBestFragments(stream,doc1.Get("bodyContent"), 5, "..."); 
          controls.Add(new RowResult() 
          { 
           Type = 1, 
           Object = textResultFiltered 
          }); 

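This part still includes the macros in the search. As a result, I get the document properties, but the highlighted HTML content shows macro content like the following: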

6th Edition V413HAV.pdf","FileContent"... Framework 6th Edition V413HAV.pdf","... with Java 8 - 1st Edition (2015) - Copy.pdf"... 4.5 Framework 6th Edition V413HAV.pdf","... And The NET 4.5 Framework 6th Edition V413HAV.pdf"

This comes from the JSON data of the macro. Any idea how to exclude the macros from the search, or how to customize the HTML content so that a specific macro is not searched? Thanks in advance.

I followed this link to create the Highlighter and so on: Link to Lucene example

Any idea how to prevent searching inside macros, or how to exclude them from the highlighted content?
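One way to keep macro JSON out of the highlighted snippet is to build the text to highlight from the non-macro grid controls only, instead of passing the whole `bodyContent` value to the highlighter. The following is a minimal sketch, not a tested Umbraco integration. It assumes that macro controls store their `value` as a JSON object (with `macroParamsDictionary`) while rich-text controls store an HTML string. `GridTextExtractor` and `ExtractNonMacroText` are hypothetical names, and the simple regex is only a stand-in for `HtmlRemoval.StripTagsRegex`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json.Linq;

// Hypothetical helper: collects the plain text of all non-macro grid controls.
public static class GridTextExtractor
{
    public static string ExtractNonMacroText(string gridJson)
    {
        var root = JObject.Parse(gridJson);

        // Walk sections -> rows -> areas -> controls, as the original loop does.
        var controls = ObjectsIn(root["sections"])
            .SelectMany(section => ObjectsIn(section["rows"]))
            .SelectMany(row => ObjectsIn(row["areas"]))
            .SelectMany(area => ObjectsIn(area["controls"]));

        var parts = new List<string>();
        foreach (var control in controls)
        {
            var value = control["value"];

            // Assumption: macro controls hold a JSON object here, so only
            // string values (HTML or plain text) are kept for highlighting.
            if (value == null || value.Type != JTokenType.String) continue;

            // Strip tags and collapse whitespace (stand-in for HtmlRemoval.StripTagsRegex).
            var text = Regex.Replace((string)value, "<[^>]+>", " ");
            text = Regex.Replace(text, @"\s+", " ").Trim();
            if (text.Length > 0) parts.Add(text);
        }
        return string.Join(" ", parts);
    }

    // Returns the object elements of a JSON array, or nothing if the token is not an array.
    private static IEnumerable<JObject> ObjectsIn(JToken token)
    {
        return token is JArray
            ? token.Children().OfType<JObject>()
            : Enumerable.Empty<JObject>();
    }
}
```

The resulting string would then be used for both arguments that currently receive `doc1.Get("bodyContent")`: the `StringReader` given to `analyzer.TokenStream(...)` and the text given to `highlighter.GetBestFragments(...)`. Both must be built from the same text, because the highlighter uses token offsets into it.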

Answers

0

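If you are just doing a regular search, that looks overly complicated. Did you know that Umbraco has its own "flavor" of Lucene, called Examine? It is built into Umbraco and needs very little setup to run a standard search: https://our.umbraco.org/documentation/reference/searching/examine/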

I have never seen macros or JSON markup in my search results when using Examine, so maybe give that a try?

0

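You can easily use Examine. You just need to pick the search provider you want (config/ExamineSettings.config), which lets you choose whether to skip unpublished and protected content. Then you just run the following code, where you can choose which fields to search and which document types to exclude.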

string term = "test"; 

var criteria = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"].CreateSearchCriteria(); 
var crawl = criteria.GroupedOr(new string[] { "nodeName", "pageTitle", "metaDescription", "metaKeywords" }, term) 
       .Not().Field("nodeTypeAlias", "GlobalSettings") 
       .Not().Field("nodeTypeAlias", "Error") 
       .Not().Field("nodeTypeAlias", "File") 
       .Not().Field("nodeTypeAlias", "Folder") 
       .Not().Field("nodeTypeAlias", "Image") 
       .Not().Field("excludeFromSearch", "1") 
       .Compile(); 

ISearchResults SearchResults = ExamineManager.Instance 
       .SearchProviderCollection["ExternalSearcher"] 
       .Search(crawl); 

IList<JsonSearchResult> results = new List<JsonSearchResult>(); 

Hope this makes sense.

+0

Hi Lucio, thanks. I would like to know how to do the highlighting. Could you give me a real example?

0

I also tried using Examine, as follows:

SearchQuery = string.Format("+{0}:{1}~", SearchField, criteria); 
var Criteria = ExamineManager.Instance 
        .SearchProviderCollection["ExternalSearcher"] 
        .CreateSearchCriteria(); 
var crawl = Criteria.GroupedOr(new string[] { "bodyContent", "nodeName" }, criteria) 
        .Not() 
        .Field("umbracoNaviHide", "1") 
        .Not() 
        .Field("nodeTypeAlias", "Image") 
        .Compile(); 
IEnumerable<Examine.SearchResult> SearchResults1 = ExamineManager.Instance 
        .SearchProviderCollection["ExternalSearcher"] 
        .Search(crawl); 
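A note on the raw query string built with `string.Format("+{0}:{1}~", ...)`: in the classic Lucene query syntax the field prefix applies only to the first word and the fuzzy `~` only to the last one, and characters such as `:` or `(` in the user input can make `QueryParser.Parse` throw. If that string is what later gets parsed for highlighting (an assumption, since the post does not show the call), a per-term builder may be more predictable. The sketch below uses only the .NET base library. `RawQueryBuilder` and its methods are hypothetical names, and the escaped character set is meant to mirror the special characters of the Lucene query syntax.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper: builds a fuzzy raw query with one clause per input word.
public static class RawQueryBuilder
{
    // Characters with special meaning in the classic Lucene query syntax.
    private const string SpecialChars = "\\+-!():^[]\"{}~*?|&";

    // Prefixes every special character with a backslash.
    public static string Escape(string term)
    {
        var sb = new StringBuilder(term.Length * 2);
        foreach (var c in term)
        {
            if (SpecialChars.IndexOf(c) >= 0) sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Splits the input on whitespace and emits "+field:term~" for each word.
    public static string BuildFuzzyQuery(string field, string userInput)
    {
        var clauses = (userInput ?? string.Empty)
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries)
            .Select(term => "+" + field + ":" + Escape(term) + "~");
        return string.Join(" ", clauses);
    }
}
```

With this, `SearchQuery = RawQueryBuilder.BuildFuzzyQuery(SearchField, criteria);` would replace the `string.Format` call. Requiring every word with `+` is a design choice; dropping the `+` would make the words optional instead.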

I use the two methods below to highlight the text, but they do not work very well! I get some links without any highlighted text.

 public string GetHighlight(string value, string highlightField, BaseLuceneSearcher searcher, string luceneRawQuery) 
    { 
     var query = GetQueryParser(highlightField).Parse(luceneRawQuery); 
     var scorer = new QueryScorer(searcher.GetSearcher().Rewrite(query)); 

     var highlighter = new Highlighter(HighlightFormatter, scorer); 

     var tokenStream = HighlightAnalyzer.TokenStream(highlightField, new StringReader(value)); 
     return highlighter.GetBestFragments(tokenStream, value, MaxNumHighlights, Separator); 
    } 
    protected QueryParser GetQueryParser(string highlightField) 
    { 
     if (!QueryParsers.ToString().Contains(highlightField)) 
     { 
      var temp = new QueryParser(_luceneVersion, highlightField, HighlightAnalyzer); 
      return temp; 
     } 
     return null; 
    } 
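One likely reason for results without any highlighted text: as far as I know, `GetBestFragments` returns an empty string when the field value passed in contains none of the query terms, for example when the hit was in `nodeName` rather than `bodyContent`. A common workaround is to fall back to the start of the plain text in that case, the way search engines show a plain summary. The sketch below uses only the .NET base library; `SnippetFallback` and `OrFallback` are hypothetical names, and `maxLength` is assumed to be positive.

```csharp
using System;

// Hypothetical helper: returns the highlighted snippet, or a truncated
// plain-text summary when the highlighter produced nothing.
public static class SnippetFallback
{
    public static string OrFallback(string highlighted, string plainText, int maxLength)
    {
        if (!string.IsNullOrWhiteSpace(highlighted)) return highlighted;
        if (string.IsNullOrEmpty(plainText)) return string.Empty;
        if (plainText.Length <= maxLength) return plainText;

        // Cut at the last space at or before maxLength so no word is split.
        var cut = plainText.LastIndexOf(' ', maxLength);
        if (cut <= 0) cut = maxLength;
        return plainText.Substring(0, cut) + "...";
    }
}
```

Usage would look like `SnippetFallback.OrFallback(GetHighlight(value, field, searcher, rawQuery), value, 200)`, where 200 is an arbitrary summary length.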

If you have any sample of highlighting with Examine that works well, I would really appreciate it if you could share it.

+0

I haven't tried highlighting with Examine, so I'm drawing a blank, I'm afraid. But getting links without highlighted text: isn't that a CSS issue?

+0

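The problem I have is only with the highlighted paragraph shown under each link, like Google does. The links themselves are fine. Thanks a lot for your help :)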