Broccoli
Code
The code is in https://ad-websvn.informatik.uni-freiburg.de/broccoli/. The code for CSD is in the subfolder https://ad-websvn.informatik.uni-freiburg.de/broccoli/nlp/
Current version
Built by Björn at the beginning of August 2016 (TODO: copy to elba). Wikipedia version from August 2016 (2.8B postings); latest Freebase dump (freebase-rdf-latest, 372M statements extracted à la Freebase Easy).
UIMA Chain
For an explanation see Broccoli Uima
Building an Index
First, obtain a Wikipedia XML dump and a Freebase dump (usually stored in /nfs/raid5/broccoli/...) and make sure the correct files are referenced in broccoli/Makefile and broccoli/freebase/Makefile. Give your index a proper name using the variable DBTAIL in broccoli/Makefile.
Create ontology.txt
If you use an existing RDF3X DB (as is usually the case), make sure it is referenced correctly in broccoli/freebase/Makefile and run only the following (inside the broccoli folder):
make get-freebase-ontology
Otherwise run:
make -C freebase/ build-db
make get-freebase-ontology
Create cas0.zip
Make sure all paths are set correctly in paths.mak.
On filicudi (important!), run the following in the broccoli folder:
make deploy-broker
Then, on the server that should host the reader and writer, run:
make deploy-reader
To get things going, run the following on as many PCs (and servers) as possible:
make deploy-senna
Create a broccoli index
Make sure to copy or move the cas0.zip you built, or to reference it in broccoli/Makefile. Then run:
make build-txt build-index
Start the server
make start PORT=<PORT>
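The sequence of make targets above can be summarised as a checklist. The following is only a sketch: the function name build_steps and its parameters are hypothetical, and it deliberately prints the commands instead of executing them, because the deploy steps run on different machines (broker on filicudi, reader on the reader/writer server, senna on many PCs).

```python
def build_steps(existing_rdf3x_db=True, port="<PORT>"):
    """Return the make invocations from this page, in order, as argument lists.

    existing_rdf3x_db and port are illustrative parameters, not Makefile options.
    """
    steps = []
    # Create ontology.txt: build-db is only needed without an existing RDF3X DB.
    if not existing_rdf3x_db:
        steps.append(["make", "-C", "freebase/", "build-db"])
    steps.append(["make", "get-freebase-ontology"])
    # Create cas0.zip: these three targets run on different machines.
    steps.append(["make", "deploy-broker"])   # on filicudi
    steps.append(["make", "deploy-reader"])   # on the reader/writer server
    steps.append(["make", "deploy-senna"])    # on as many machines as possible
    # Create the index, then start the server.
    steps.append(["make", "build-txt", "build-index"])
    steps.append(["make", "start", "PORT={}".format(port)])
    return steps


if __name__ == "__main__":
    # Print the checklist; nothing is executed.
    for step in build_steps():
        print(" ".join(step))
```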
Image Cache
To add individual images (for demos), run the command below. It needs access to the raid so that it can write to the cache folder, and the image URL has to have a file extension for convert to work. Tested on filicudi; it does not work on stromboli because the code requires Python version >= 3.3:
python3 ~/broccoli/img-hack/image_to_cache.py --mid <MID> --img 'http://...'
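Since the script fails when the image has no file extension, it may help to check the URL before calling it. The helper below is a sketch under assumptions: the function names are hypothetical, only the --mid and --img options are taken from the command above, and the check looks at the URL path only (whether convert copes with query strings is not verified here).

```python
import os
from urllib.parse import urlparse


def has_file_extension(img_url):
    """True if the path part of the URL ends in something like .jpg or .png."""
    path = urlparse(img_url).path
    return os.path.splitext(path)[1] != ""


def image_to_cache_command(mid, img_url,
                           script="~/broccoli/img-hack/image_to_cache.py"):
    """Build the argument list for image_to_cache.py, refusing URLs without extension."""
    if not has_file_extension(img_url):
        raise ValueError("image URL needs a file extension: " + img_url)
    return ["python3", os.path.expanduser(script), "--mid", mid, "--img", img_url]


if __name__ == "__main__":
    # example.org is a placeholder, not a real image source.
    print(image_to_cache_command("<MID>", "http://example.org/picture.jpg"))
```

The resulting list can be passed to subprocess.check_call on a machine with raid access.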
Image Service
HiWi project Kai Haase: see Google Doc.
TODO: Florian's code has a mechanism for removing outdated images. It also removes cached images whose source now returns 404 Not Found, which effectively removes all images from the cache after the shutdown of the Freebase API. This should be corrected. Here is the guilty piece of code from https://ad-websvn.informatik.uni-freiburg.de/broccoli/freebase-imgsvc/fbthumbsvc.php:
// If no image could be found (404 error) create a 404 cache file for the
// current id, return a 404 error and end the script.
if ($return_status_code == 404) {
  // If there still was an expired cache file then remove it now!
  if ($cachefile_exists) {
    unlink($cachefile_path);
  }
  // Create a 404 cache file for the current id.
  touch($cachefile_path . '_404');
  returnMissingError();
}
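One possible direction for the correction is to keep an expired cache file when the upstream source answers 404, and to create the 404 marker only when nothing is cached. The following Python sketch only illustrates that logic; it is not the PHP fix itself, the function name on_upstream_404 is hypothetical, and whether expired images should be served indefinitely is an assumption that still needs to be decided.

```python
import os
import tempfile


def on_upstream_404(cachefile_path):
    """Handle a 404 from the upstream image source for one cache entry.

    Returns True if a (possibly expired) cached image was kept and can still
    be served, False if a '<path>_404' marker file was created instead.
    """
    if os.path.exists(cachefile_path):
        # Unlike the PHP snippet, do NOT unlink the expired cache file.
        return True
    # Nothing cached: remember the miss, like touch($cachefile_path . '_404').
    with open(cachefile_path + "_404", "a"):
        pass
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as cache_dir:
        cached = os.path.join(cache_dir, "cached_id")
        with open(cached, "w") as f:
            f.write("image bytes")
        print(on_upstream_404(cached))                              # cached image is kept
        print(on_upstream_404(os.path.join(cache_dir, "new_id")))   # marker is created
```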
Mediator Only Index (CIKM)
An index that contains mediators (used for the CIKM presentation) is available in /nfs/raid5/haussmae/demos/broccoli_mediators_no_text
To start it (on filicudi, port 7099; this should work as any user that can read the files):
/home/haussmae/demos/broccoli_mediators_no_text/ServerMain -p 7099 \
  -o /home/haussmae/demos/broccoli_mediators_no_text/semantic-wikipedia-scientists-ontology \
  -s /home/haussmae/demos/broccoli_mediators_no_text/semantic-wikipedia.stop-words \
  /home/haussmae/demos/broccoli_mediators_no_text/semantic-wikipedia-scientists \
  -m /home/haussmae/demos/broccoli_mediators_no_text/semantic-wikipedia-scientists-ontology.url-mapping
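All paths in the command above share one base directory and one index name, so the command can be assembled from those two values. This is a sketch only: server_main_command and its parameters are hypothetical names, and the option order simply mirrors the command above without claiming anything about how ServerMain interprets each option.

```python
import posixpath


def server_main_command(base_dir, port,
                        index="semantic-wikipedia-scientists",
                        stop_words="semantic-wikipedia.stop-words"):
    """Assemble the ServerMain argument list in the same order as on this page."""
    def in_base(name):
        return posixpath.join(base_dir, name)

    return [
        in_base("ServerMain"),
        "-p", str(port),
        "-o", in_base(index + "-ontology"),
        "-s", in_base(stop_words),
        in_base(index),
        "-m", in_base(index + "-ontology.url-mapping"),
    ]


if __name__ == "__main__":
    cmd = server_main_command(
        "/home/haussmae/demos/broccoli_mediators_no_text", 7099)
    print(" ".join(cmd))
```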
The user interface for the backend filicudi:7099 is available at http://filicudi.informatik.uni-freiburg.de:6222/BroccoliCIKM (no UI hack) and http://filicudi.informatik.uni-freiburg.de:6222/BroccoliCIKM2 (UI hack). The UI hack makes specific mediator names readable in the query graph (and only there). The hack adjusts the nameLabel variable in the file src/de/uni/freiburg/broccoli/client/ui/BreadcrumbLabel.java of userinterface (in the broccoli repository).