AD Research Wiki:

Broccoli

Code

The code is in https://ad-websvn.informatik.uni-freiburg.de/broccoli/. The code for CSD is in the subfolder https://ad-websvn.informatik.uni-freiburg.de/broccoli/nlp/

Current version

Built by Björn at the beginning of August 2016 (TODO: copy to elba). It uses the Wikipedia version from August 2016 (2.8B postings) and the latest Freebase dump (freebase-rdf-latest, 372M statements extracted à la Freebase Easy).

Image Cache

To add individual images (for demos), run the command below. The user needs access to raid so that the script can write to the cache folder, and the image URL has to end in a file extension for convert to work. Tested on filicudi; it does not work on stromboli because the code requires Python version >= 3.3:

python3 ~/broccoli/img-hack/image_to_cache.py --mid <MID> --img 'http://...'
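The two preconditions mentioned above (Python version >= 3.3 and an image URL with a file extension) can be checked before calling the script. The following is only a minimal sketch: the helper names has_file_extension and python_is_recent_enough are hypothetical and are not part of image_to_cache.py, and the check assumes that the extension is taken from the path part of the URL.

```python
import posixpath
import sys
from urllib.parse import urlparse


def has_file_extension(img_url):
    """Return True if the path part of the URL ends in a file extension.

    Hypothetical helper: assumes the extension is read from the URL path,
    so a query string such as "?size=large" is ignored.
    """
    path = urlparse(img_url).path
    _, ext = posixpath.splitext(path)
    # splitext returns "" (no extension) or "." (bare trailing dot) otherwise.
    return len(ext) > 1


def python_is_recent_enough(version_info=None):
    """Return True if the given (or running) interpreter is Python >= 3.3."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= (3, 3)
```

For example, has_file_extension('http://example.org/pics/cat') is False, so convert would presumably fail for that URL, while python_is_recent_enough() reports whether the current machine's interpreter meets the version requirement.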

HiWi project Kai Haase: see Google Doc.

TODO: Florian's code has a mechanism for removing outdated images. This mechanism also removes cached images whose lookup now returns a 404 Not Found, which effectively removes all images from the cache after the shutdown of the Freebase API. This should be corrected. Here is the guilty piece of code from https://ad-websvn.informatik.uni-freiburg.de/broccoli/freebase-imgsvc/fbthumbsvc.php:

// If no image could be found (404 error) create a 404 cache file for the
// current id, return a 404 error and end the script.
if ($return_status_code == 404)
{
  // If there still was an expired cache file then remove it now!
  if ($cachefile_exists)
  {
    unlink($cachefile_path);
  }

  // Create a 404 cache file for the current id.
  touch($cachefile_path . '_404');
 
  returnMissingError();
}
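One possible correction is to keep an expired cache file when the lookup returns a 404, and to create the 404 marker only when nothing is cached. The following is a minimal sketch of that decision in Python, not a patch for fbthumbsvc.php: the function name handle_upstream_404 and its return values are hypothetical, and only the "_404" suffix is taken from the code above.

```python
import os


def handle_upstream_404(cachefile_path):
    """Decide what to do when the image lookup returns a 404.

    Returns "serve_cached" if a (possibly expired) cache file exists and
    is kept, or "missing" if nothing is cached and a 404 marker was written.
    """
    if os.path.exists(cachefile_path):
        # Assumed fix: keep the expired image instead of unlinking it.
        return "serve_cached"
    # Nothing cached: record the miss with an empty "<path>_404" marker,
    # mirroring touch($cachefile_path . '_404') in the PHP code.
    with open(cachefile_path + "_404", "a"):
        pass
    return "missing"
```

With this behavior, images that are already in the cache survive even though every new lookup fails, while ids without a cached image are still marked as missing.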

AD Research Wiki: Projects/Broccoli (last edited 2016-08-16 15:27:19 by Hannah Bast)