Type: This topic is well suited as a project (B.Sc. or M.Sc.) and also offers ample opportunity for continuation with a thesis (B.Sc. or M.Sc.). You should be fond of good user interfaces and have good taste in layout, colors, and the like. You should also like knowledge bases, big datasets, and searching in them.

Background info: Broccoli is our current system for convenient semantic search on text combined with a knowledge base. Here is a demo. The demo is described in [2]. Broccoli works with a special backend, which supports only a particular subset of queries (in a nutshell: tree-like SPARQL queries with a text-search component). The backend also has its own peculiar, non-standard syntax. The backend is described in [4]. In the meantime, we have developed a full-featured SPARQL+Text engine called QLever [1]. QLever supports full SPARQL queries with a text-search component. However, its current UI is very simplistic and hard to use: it is essentially a text field in which one enters a SPARQL query and then gets a table with the result tuples. Here is a link to an internal demo (only works with an IP address from the technical faculty), which searches the Freebase knowledge base in combination with the ClueWeb12 web corpus. An example query for copy&paste is given under [5].

Goal: Design a front-end for QLever that makes it easy to ask SPARQL+Text queries, even without prior knowledge of the entity and predicate names of the underlying knowledge base (in the demos above: Freebase). The user interface should be simpler than that of Broccoli.

Subgoal 1: Broccoli does not know synonyms. For example, when you type "occupation", there is no relation match because the relation is named "profession" in Freebase. Likewise, when you type "acted in", there is no relation match because the relation is named "Film performance" in Freebase. Basic synonyms are very important for usability.
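
One possible way to handle such synonyms is a lookup table that maps the user's wording to the relation name used in the knowledge base. The following is only a minimal sketch under assumptions: the table SYNONYMS, the list RELATION_NAMES, and the function match_relations are hypothetical names, and only the two synonym pairs mentioned above are taken from this page. How a real synonym table would be obtained is left open here.

```python
# Hypothetical hand-made synonym table: user wording -> relation name.
# Only these two pairs come from the project description.
SYNONYMS = {
    "occupation": "profession",
    "acted in": "Film performance",
}

# Illustrative list of relation names (an assumption, not the real Freebase list).
RELATION_NAMES = ["profession", "Film performance", "gender"]


def normalize(text):
    """Lowercase and collapse whitespace, so 'Acted  In' equals 'acted in'."""
    return " ".join(text.lower().split())


def match_relations(user_input):
    """Return relation names that match the input directly or via a synonym."""
    query = normalize(user_input)
    if not query:
        return []
    matches = []
    # Direct prefix matches on the relation names themselves.
    for name in RELATION_NAMES:
        if normalize(name).startswith(query):
            matches.append(name)
    # Prefix matches on synonyms, mapped back to the relation name.
    for synonym, name in SYNONYMS.items():
        if synonym.startswith(query) and name not in matches:
            matches.append(name)
    return matches
```

With this table, match_relations("occupation") yields the relation "profession", although the typed word does not occur in the relation name.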

Subgoal 2: Broccoli has four boxes with suggestions (see the demo linked above). This confused users a lot. One of the main tasks will be to devise a simpler interface that is still powerful enough to formulate all queries.

Subgoal 3: Broccoli currently shows only a list of entities with text snippets. A result from a general SPARQL query can consist of an arbitrary number of columns (for example, a person, their profession, their gender, and some text snippets where they are mentioned together with certain words).
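
To illustrate the difference to a plain entity list, here is a rough sketch that renders result tuples with an arbitrary number of columns as a plain-text table. It assumes the result is available as a list of column names plus a list of rows. This is an assumption made for illustration, not the actual response format of QLever, and format_result_table is a hypothetical helper.

```python
def format_result_table(columns, rows):
    """Render result tuples as a plain-text table, one column per query variable."""
    cells = [[str(value) for value in row] for row in rows]
    for row in cells:
        if len(row) != len(columns):
            raise ValueError("each row needs exactly one value per column")

    # Each column is as wide as its longest entry (header included).
    widths = [len(name) for name in columns]
    for row in cells:
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(value))

    def format_row(values):
        padded = (value.ljust(width) for value, width in zip(values, widths))
        return " | ".join(padded).rstrip()

    lines = [format_row(columns), "-+-".join("-" * width for width in widths)]
    lines.extend(format_row(row) for row in cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample row, only to show the layout.
    print(format_result_table(
        ["?person", "?profession", "?gender", "?snippet"],
        [["Neil Armstrong", "Astronaut", "Male", "... walked on the moon ..."]],
    ))
```

A real front-end would render such a table in HTML, but the same idea applies: the number of columns follows from the query, not from the UI.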

Subgoal 4: One problem with the current Broccoli interface is that the user has to know what is a predicate and what is an object. However, this depends a lot on how the statements are represented in the knowledge base, which is not something the user can or should be expected to know a priori. TODO: give an example. It would be better to provide completions for a predicate combined with an object at the same time.
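
The idea of completing predicate and object together could look roughly like the sketch below. It is only an illustration under assumptions: the list PAIRS, its names, and its counts are invented, and complete is a hypothetical function. The point is that the typed prefix is matched against the words of the predicate and of the object alike, so the user does not have to decide which of the two they are typing.

```python
# Hypothetical (predicate, object, count) entries; names and counts are invented.
PAIRS = [
    ("profession", "Astronaut", 550),
    ("profession", "Actor", 90000),
    ("gender", "Female", 400000),
    ("Film performance", "Apollo 13", 40),
]


def complete(prefix, pairs=PAIRS, limit=5):
    """Suggest 'predicate: object' strings where any word of the predicate
    or of the object starts with the typed prefix."""
    query = prefix.lower().strip()
    if not query:
        return []
    hits = []
    for predicate, obj, count in pairs:
        words = (predicate + " " + obj).lower().split()
        if any(word.startswith(query) for word in words):
            hits.append((count, predicate, obj))
    # Most frequent combinations first; ties are broken alphabetically.
    hits.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return [f"{predicate}: {obj}" for _, predicate, obj in hits[:limit]]
```

For example, both complete("astro") and complete("prof") suggest "profession: Astronaut", whether the user started with the object or with the predicate.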

[1] H. Bast and B. Buchhold. QLever: a Query Engine for Efficient SPARQL+Text Search. Paper currently under review. Code and setup instructions on GitHub

[2] H. Bast, F. Bäurle, B. Buchhold, E. Haußmann. Semantic Full-Text Search with Broccoli (Demo paper). SIGIR 2014: 1265-1266

[3] H. Bast, F. Bäurle, B. Buchhold, E. Haußmann. Broccoli: Semantic Full-Text Search at your Fingertips (System and Evaluation Paper). CoRR abs/1207.2615 (2012)

[4] H. Bast and B. Buchhold. An index for efficient semantic full-text search. CIKM 2013: 369-378

[5] Example SPARQL+Text query (astronauts who walked on the moon) for copy&paste in the text field of our internal QLever demo:

PREFIX fb: <http://rdf.freebase.com/ns/>
SELECT ?person ?name TEXT(?t) SCORE(?t) WHERE {
  ?person fb:people.person.profession ?profession .
  ?profession fb:type.object.name.en "Astronaut" .
  ?person fb:type.object.name.en ?name .
  ?person <in-text> ?t .
  ?t <in-text> "walk* moon"
}
LIMIT 100
ORDER BY DESC(SCORE(?t))
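
A front-end would typically assemble such a query from the user's selections instead of having the user type it. The following sketch shows this for the query shape above only. The function build_query and its parameters are hypothetical, and the generated text simply copies the syntax of the example query, without any claim about which other forms the engine accepts.

```python
def build_query(profession, text_words, limit=100):
    """Assemble a SPARQL+Text query with the same shape as the example under [5]."""
    # Keep the sketch simple: no escaping, so reject embedded double quotes.
    if '"' in profession or '"' in text_words:
        raise ValueError("double quotes are not supported in this sketch")
    lines = [
        "PREFIX fb: <http://rdf.freebase.com/ns/>",
        "SELECT ?person ?name TEXT(?t) SCORE(?t) WHERE {",
        "  ?person fb:people.person.profession ?profession .",
        f'  ?profession fb:type.object.name.en "{profession}" .',
        "  ?person fb:type.object.name.en ?name .",
        "  ?person <in-text> ?t .",
        f'  ?t <in-text> "{text_words}"',
        "}",
        f"LIMIT {int(limit)}",
        "ORDER BY DESC(SCORE(?t))",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Reproduces the example query (astronauts who walked on the moon).
    print(build_query("Astronaut", "walk* moon"))
```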

AD Teaching Wiki: BachelorAndMasterProjectsAndTheses/SparqlTextUI (last edited 2020-08-18 15:53:06 by Natalie Prange)