AD Research Wiki:

Broccoli UIMA

This page lists Broccoli-specific decisions. It does not repeat what can be found in the official UIMA documentation. There are two ways we use UIMA:

End-to-end chain

This runs locally on the server where you start it. It usually makes sense if no parse is required. These chains usually start with the Wikipedia XML Reader component.

For one of many examples, see the target run-personsentences in broccoli/Makefile.

Two UIMA chains on a dataset

This is how we usually build a Broccoli index. The first part uses the asynchronous scaleout to perform a (SENNA) parse. See the targets start-broker, deploy-reader, and deploy-senna in broccoli/Makefile. This chain writes a cas0.zip that contains binary data of the UIMA objects with the parse done.

This data is then picked up by further components. For example, the target build-txt performs entity recognition and CSD and writes the words- and docsfiles.
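Before starting the second chain, it can be useful to check that the first chain actually wrote something into cas0.zip. The following is a minimal sketch in Python that relies only on cas0.zip being an ordinary zip archive. The function name list_cas_entries and the limit parameter are made up for illustration, the naming of the entries inside the archive is not documented here, and the sketch does not deserialize the UIMA objects themselves.

```python
import zipfile


def list_cas_entries(zip_path, limit=10):
    """Return (entry_name, uncompressed_size_in_bytes) for up to `limit` entries.

    Hypothetical helper: it only reads the zip directory and never decodes
    the binary UIMA data stored in the entries.
    """
    with zipfile.ZipFile(zip_path) as archive:
        return [(info.filename, info.file_size)
                for info in archive.infolist()[:limit]]


# Example call (the path is an assumption, adjust it to where the chain wrote its output):
# for name, size in list_cas_entries("cas0.zip"):
#     print(f"{size:>12}  {name}")
```

An empty result, or entries of size 0, would suggest that the reader or parser deployment did not process any documents.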

AD Research Wiki: Broccoli Uima (last edited 2016-08-31 13:04:36 by Björn Buchhold)