Implementation and experimental evaluation

A prototype of SemSearch has been implemented; it uses Sesame and Lucene.
Sesame provides a query language and a query engine for semantic data
represented in RDF. Lucene provides a fast text search engine, which is used to
build the semantic entity index engine and the semantic entity search engine
contained in the Text Search Layer of SemSearch.
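
To make the role of the semantic entity index concrete, the following is a minimal sketch of a keyword-to-entity index. It is not the prototype's code: it uses a plain in-memory inverted index in Python as a stand-in for Lucene, and the class name EntityIndex, the "ex:" identifiers, the labels, and the class/property/instance kinds are all assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a label and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class EntityIndex:
    """Toy inverted index from label tokens to semantic entities.

    Stand-in for a Lucene index; no stemming or scoring is performed.
    """

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, uri, kind, label):
        # kind is assumed to be "class", "property", or "instance".
        for token in tokenize(label):
            self._postings[token].add((uri, kind, label))

    def search(self, keyword):
        """Return entities whose label contains every token of the keyword."""
        tokens = tokenize(keyword)
        if not tokens:
            return []
        hits = set.intersection(
            *(self._postings.get(token, set()) for token in tokens)
        )
        return sorted(hits)


# Hypothetical entities; a real index would be built from the RDF repository.
index = EntityIndex()
index.add("ex:PhD-Student", "class", "PhD student")
index.add("ex:News-Item", "class", "news item")
index.add("ex:has-research-interest", "property", "has research interest")

matches = index.search("phd student")
```

Because the sketch does no stemming, a keyword such as "phd students" would not match the label "PhD student"; a real text search engine would typically handle such variants through its analyzer.
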
The prototype has been applied to the semantic web portal of our lab. Figure 5
shows a screenshot of the search results for the example query "news:phd
students". As described earlier, the search engine returns not only the
information that the user is looking for but also explanations, which makes the
search results much more understandable than those of state-of-the-art tools.
The search results are ranked according to their closeness to the keywords
specified by the user. The search engine takes two factors into account when
ranking: the matching distance between each keyword and its semantic matches,
and the number of keywords that a search result satisfies. The search engine
also supports search refinement: a web form lets the user choose the intended
meaning of each keyword and thus helps the user formulate a better search.
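
The two ranking factors can be sketched as follows. The text does not say how the prototype measures matching distance or how it combines the two factors, so both choices here are assumptions: distance is approximated with the similarity ratio from Python's difflib, and results are ordered first by the number of keywords satisfied and then by mean distance. The result records and their labels are hypothetical.

```python
from difflib import SequenceMatcher


def match_distance(keyword, label):
    """Distance in [0, 1]; 0.0 means keyword and label are equal ignoring case."""
    ratio = SequenceMatcher(None, keyword.lower(), label.lower()).ratio()
    return 1.0 - ratio


def rank(results, keywords):
    """Order results by keywords satisfied (more first), then mean distance.

    Each result has a "matches" dict mapping a keyword to the label of the
    entity that matched it; keywords absent from the dict are unsatisfied.
    """

    def sort_key(result):
        matched = [k for k in keywords if k in result["matches"]]
        if matched:
            total = sum(match_distance(k, result["matches"][k]) for k in matched)
            mean_distance = total / len(matched)
        else:
            mean_distance = 1.0
        return (-len(matched), mean_distance)

    return sorted(results, key=sort_key)


keywords = ["news", "phd students"]
results = [
    {"id": "r1", "matches": {"news": "news item"}},
    {"id": "r2", "matches": {"news": "news", "phd students": "phd student"}},
    {"id": "r3", "matches": {"news": "news", "phd students": "phd students"}},
]
ranked_ids = [r["id"] for r in rank(results, keywords)]
```

Sorting on a tuple keeps the two factors separate, which avoids having to invent weights for a combined score; a weighted sum would be an equally plausible reading of the description.
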

Fig. 4. A fragment of Java code for SeRQL query construction.

To assess the performance of the semantic search engine, we carried out an
initial study in the context of the KMi semantic web portal. As the basis for
the experimental evaluation, we used the questions that had been gathered to
evaluate AquaLog (a question answering tool developed in our lab). Where
necessary, several reformulations of each search were attempted. For each
search, we assigned a score that describes the performance of the search
engine: i) 0 - no result; ii) 1 - a result could be obtained with heavy
analysis; iii) 2 - a result could be obtained with moderate analysis; and
iv) 3 - good result.
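
As an illustration of how such scores can be summarized, the sketch below encodes the four-point scale and averages only over questions for which an answer is possible. The function name, the use of None to mark unanswerable questions, and the example scores are assumptions; the numbers are invented and are not the data from the study.

```python
from statistics import mean

# The four-point scale used to score each search.
SCALE = {
    0: "no result",
    1: "result obtainable with heavy analysis",
    2: "result obtainable with moderate analysis",
    3: "good result",
}


def average_score(scores):
    """Mean score over answerable questions.

    None marks a question that cannot be answered from the available data;
    such questions are excluded from the average.
    """
    answerable = [s for s in scores if s is not None]
    if not answerable:
        raise ValueError("no answerable questions to average")
    for score in answerable:
        if score not in SCALE:
            raise ValueError(f"score outside the 0-3 scale: {score!r}")
    return mean(answerable)


# Invented scores for six hypothetical questions, two of them unanswerable.
example_scores = [3, 1, None, 0, 3, None]
example_average = average_score(example_scores)
```
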
Taking into account only the questions for which an answer is possible, the
average score was 2.1. The performance scores are only a qualitative assessment
of how well we felt the system answered the questions, so these results are
biased. However, based on these rudimentary performance measures, the semantic
search answers questions reasonably well where data is available. In
particular, SemSearch was able to answer a high proportion of the questions
despite its simplicity. It is intuitive and simple to learn. The user does not
need a full grasp of the ontology to get started (though some knowledge of it
is needed). This is an affordance of the interface, which presents results in a
way that informs the user about the terms in the ontology. That information can
be used to help with search refinement. For example, the user might not
remember relations like generic-area-of-interest but might remember that the
word "area" is involved in many relations about research topics; by browsing
through the output of a search for just "area", the user can gather the
information about the ontology vocabulary needed to formulate a better search.

Fig. 5. A screenshot of the search results of the query example news:phd students.
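
The vocabulary-discovery step in the example above, where a search for a single remembered word surfaces the relation names that contain it, can be mimicked with a short sketch. This is only an analogy for what the user does by browsing the result page, not a description of the prototype. Apart from generic-area-of-interest, which is named in the text, the relation names are hypothetical.

```python
def terms_containing(word, vocabulary):
    """Return ontology terms that have `word` as one hyphen-separated part."""
    word = word.lower()
    return sorted(
        term for term in vocabulary if word in term.lower().split("-")
    )


# Only the first relation name comes from the text; the others are invented.
vocabulary = [
    "generic-area-of-interest",
    "has-research-area",
    "has-author",
    "works-in-project",
]

area_terms = terms_containing("area", vocabulary)
```

Matching on whole hyphen-separated parts rather than on substrings keeps unrelated terms out of the list, at the cost of missing inflected forms such as "areas".
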