Revision 3 as of 2016-08-31 13:04:36
Broccoli UIMA
This page lists broccoli-specific decisions. It does not repeat what can be found in the official UIMA documentation. There are two ways we use UIMA:
End-to-end chain
This runs locally on the server where you start it. It usually makes sense if no parse is required. These chains usually start with the Wikipedia XML Reader component.
For one of many examples, see the target run-personsentences in broccoli/Makefile.
Two UIMA chains on a dataset
This is how we usually build a broccoli index. The first part uses the asynchronous scaleout to perform a (senna) parse. See targets start-broker, deploy-reader, deploy-senna in broccoli/Makefile. This chain writes a cas0.zip that contains binary data of the UIMA objects with the parse done.
Then this data gets picked up by further components, e.g. the target build-txt performs entity recognition and CSD and writes words- and docsfiles.
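The two-chain build can be sketched as a small driver script. This is only an illustration, not part of the repository: the target names come from the text above, but the order in which they are run, the `broccoli` directory passed to `make -C`, and the helper names `make_commands` and `run_chain` are assumptions. The targets may also need variables that are not shown here, and start-broker may keep running in the foreground, in which case the targets have to be started in separate shells. For that reason the sketch defaults to a dry run that only prints the commands.

```python
import subprocess

# Target names are taken from the text above. Their order and the
# Makefile directory are assumptions made for illustration only.
PARSE_TARGETS = ["start-broker", "deploy-reader", "deploy-senna"]
INDEX_TARGETS = ["build-txt"]


def make_commands(targets, makefile_dir="broccoli"):
    """Return one `make` argument list per target."""
    return [["make", "-C", makefile_dir, target] for target in targets]


def run_chain(targets, makefile_dir="broccoli", dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order."""
    commands = make_commands(targets, makefile_dir)
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the chain at the first failing target.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_chain(PARSE_TARGETS)  # first chain: the parse that writes cas0.zip
    run_chain(INDEX_TARGETS)  # second chain: picks up the parsed data
```

Calling `run_chain(PARSE_TARGETS, dry_run=False)` would actually invoke make, so check the Makefile for the real prerequisites before doing that.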