Broccoli UIMA
This page lists broccoli-specific decisions. It does not repeat what can be found in the official UIMA documentation. There are two ways we use UIMA:
End-to-end chain
This runs locally on the server where you start it. It usually makes sense when no parse is required. These chains usually start with the Wikipedia XML Reader component.
For one of many examples, see the target run-personsentences in broccoli/Makefile.
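As a rough sketch, such a target could be started from a small Python wrapper. The target name run-personsentences comes from the text above. The helper name run_make_target, the assumption that the Makefile sits in a directory called broccoli relative to the current working directory, and the assumption that the target needs no extra variables are all illustrative and not confirmed by the Makefile itself.

```python
import subprocess


def run_make_target(target, makefile_dir="broccoli", dry_run=False):
    """Run one Makefile target and return the command that was used.

    Hypothetical helper: makefile_dir is an assumed location of the Makefile.
    """
    # Equivalent to typing: make -C broccoli <target>
    command = ["make", "-C", makefile_dir, target]
    if not dry_run:
        # check=True raises CalledProcessError if make exits with an error.
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Only print the command here; pass dry_run=False to actually start the chain.
    print(" ".join(run_make_target("run-personsentences", dry_run=True)))
```

With dry_run=True nothing is executed, which makes it easy to check the command before starting a long-running chain on the server.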
Two UIMA chains on a dataset
This is how we usually build a broccoli index. The first part uses the asynchronous scaleout to perform a (senna) parse. See the targets start-broker, deploy-reader and deploy-senna in broccoli/Makefile. This chain writes a cas0.zip that contains binary data of the UIMA objects with the parse done.
This data is then picked up by further components. For example, the target build-txt performs entity recognition and CSD and writes the words and docs files.
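Before starting the second chain, it can help to confirm that the first chain actually left a readable archive behind. The following is a minimal sketch using only Python's standard zipfile module. The function name cas_archive_entries is hypothetical, the location of cas0.zip is an assumption (here: the current directory), and nothing is assumed about the names or format of the entries inside the archive.

```python
import zipfile
from pathlib import Path


def cas_archive_entries(path):
    """Return the entry names of a CAS archive, or None if it is missing or not a zip file."""
    path = Path(path)
    if not path.is_file() or not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as archive:
        # Only the names are listed; the binary UIMA data itself is not decoded.
        return archive.namelist()


if __name__ == "__main__":
    # Assumed location: cas0.zip in the current directory.
    entries = cas_archive_entries("cas0.zip")
    if entries is None:
        print("cas0.zip is missing or unreadable; rerun the parse chain first")
    else:
        print(f"cas0.zip contains {len(entries)} entries")
```

This only checks that the archive exists and can be opened. It says nothing about whether the parse inside is complete.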