Terms

A query is broken up into terms and operators. There are two types of terms: Single Terms and Phrases.

A Single Term is a single word such as test or hello.

A Phrase is a group of words surrounded by double quotes such as "hello dolly".

Multiple terms can be combined together with Boolean operators to form a more complex query (see below).

Term Modifiers

Lucene supports modifying query terms to provide a wide range of searching options.

Wildcard Searches

Lucene supports single and multiple character wildcard searches. To perform a single character wildcard search use the "?" symbol. To perform a multiple character wildcard search use the "*" symbol.

The single character wildcard search looks for terms that match the pattern with the single character replaced. For example, to search for "text" or "test" you can use the search:

    te?t

Multiple character wildcard searches look for 0 or more characters. For example, to search for test, tests or tester, you can use the search:

    test*

You can also use wildcard searches in the middle of a term:

    te*t

Note: You cannot use a * or ? symbol as the first character of a search.

Fuzzy Searches

Lucene supports fuzzy searches based on the Levenshtein Distance, or Edit Distance, algorithm. To do a fuzzy search use the tilde, "~", symbol at the end of a Single word Term. For example, to search for a term similar in spelling to "roam" use the fuzzy search:

    roam~

This search will find terms like foam and roams.

Starting with Lucene 1.9 an additional (optional) parameter can specify the required similarity. The value is between 0 and 1; the closer the value is to 1, the more similar a term must be in order to match. For example:

    roam~0.8

The default that is used if the parameter is not given is 0.5.

Proximity Searches

Lucene supports finding words that are within a specific distance of each other. To do a proximity search use the tilde, "~", symbol at the end of a Phrase. For example, to search for "apache" and "jakarta" within 10 words of each other in a document use the search:

    "jakarta apache"~10

Boosting a Term

Lucene provides the relevance level of matching documents based on the terms found. To boost a term use the caret, "^", symbol with a boost factor (a number) at the end of the term you are searching. The higher the boost factor, the more relevant the term will be.

Boosting allows you to control the relevance of a document by boosting its term. For example, if you are searching for

    jakarta apache

and you want the term "jakarta" to be more relevant, boost it using the ^ symbol along with the boost factor next to the term:

    jakarta^4 apache

This will make documents with the term jakarta appear more relevant. You can also boost Phrase Terms as in the example:

    "jakarta apache"^4 "Apache Lucene"

By default, the boost factor is 1. Although the boost factor must be positive, it can be less than 1 (e.g. 0.2).

Boolean Operators

Boolean operators allow terms to be combined through logic operators. Lucene supports AND, "+", OR, NOT and "-" as Boolean operators. (Note: Boolean operators must be ALL CAPS.)

OR

The OR operator is the default conjunction operator. This means that if there is no Boolean operator between two terms, the OR operator is used. The OR operator links two terms and finds a matching document if either of the terms exists in a document. This is equivalent to a union using sets. The symbol || can be used in place of the word OR.

To search for documents that contain either "jakarta apache" or just "jakarta" use the query:

    "jakarta apache" jakarta

or

    "jakarta apache" OR jakarta

AND

The AND operator matches documents where both terms exist anywhere in the text of a single document. This is equivalent to an intersection using sets. The symbol && can be used in place of the word AND.

To search for documents that contain "jakarta apache" and "Apache Lucene" use the query:

    "jakarta apache" AND "Apache Lucene"

+

The "+" or required operator requires that the term after the "+" symbol exist somewhere in the field of a single document.

To search for documents that must contain "jakarta" and may contain "lucene" use the query:

    +jakarta lucene

NOT

The NOT operator excludes documents that contain the term after NOT. This is equivalent to a difference using sets. The symbol ! can be used in place of the word NOT.

To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query:

    "jakarta apache" NOT "Apache Lucene"

Note: The NOT operator cannot be used with just one term. For example, the following search will return no results:

    NOT "jakarta apache"

-

The "-" or prohibit operator excludes documents that contain the term after the "-" symbol.

To search for documents that contain "jakarta apache" but not "Apache Lucene" use the query:

    "jakarta apache" -"Apache Lucene"
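The wildcard, fuzzy and Boolean behaviour described on this page can be illustrated with a small Python sketch. It is an analogy only: it does not call Lucene and does not reproduce Lucene's scoring or its 0-to-1 similarity value. The names DOCS, docs_with, wildcard_match and edit_distance are hypothetical, and the three sample documents are made up for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical mini-index (document id -> text); not taken from Lucene.
DOCS = {
    1: "jakarta apache",
    2: "apache lucene",
    3: "jakarta lucene",
}


def docs_with(term):
    """Return the set of document ids whose text contains `term` as a word."""
    return {doc_id for doc_id, text in DOCS.items() if term in text.split()}


def wildcard_match(pattern, terms):
    """Return the terms matching a pattern where ? is one character and
    * is zero or more characters. fnmatchcase also understands [seq],
    which the wildcard syntax on this page does not mention."""
    return [t for t in terms if fnmatchcase(t, pattern)]


def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Wildcards: te?t and test*
    print(wildcard_match("te?t", ["test", "text", "tests"]))     # ['test', 'text']
    print(wildcard_match("test*", ["test", "tests", "tester"]))  # all three match

    # Fuzzy: foam and roams are each one edit away from roam
    print(edit_distance("roam", "foam"), edit_distance("roam", "roams"))  # 1 1

    # Boolean operators as set operations
    print(docs_with("jakarta") | docs_with("apache"))  # OR  -> union: {1, 2, 3}
    print(docs_with("jakarta") & docs_with("apache"))  # AND -> intersection: {1}
    print(docs_with("jakarta") - docs_with("lucene"))  # NOT -> difference: {1}
```

The set operations mirror the union, intersection and difference wording used for OR, AND and NOT. The edit-distance function shows why a fuzzy search for roam can reach foam and roams with a single edit each.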