
h3. Semantic Annotation of UML Domain Models
Proper semantic integration requires that each class and its attributes in the UML domain model be mapped to appropriate concepts in a controlled terminology. The caCORE SDK uses the NCI Thesaurus as its primary terminology source, but any well-structured, concept-based description logics terminology should in principle be suitable. The concept selection process can be entirely manual, or it can be partially automated using the Semantic Connector, a tool supplied with the caCORE SDK. The Semantic Connector takes the UML domain model expressed in XMI as input and uses the caCORE EVS APIs hosted at the NCI to search the NCI Thesaurus for appropriate concepts. Semantic annotations for classes and attributes are specified using tagged values in the UML domain model.
h3. UML Domain Model Loader
The UML domain model, annotated with semantic concept codes, contains a considerable amount of metadata about the ultimate system -- both data and analytical services -- that will be deployed to the grid. However, it is not in a form that is amenable to query and retrieval in a runtime environment, nor can it easily be queried by humans who want to use this information for other purposes. The UML domain model loader addresses these limitations by transforming the models and loading them into the caDSR, which provides APIs that support runtime access to metadata. The UML domain model, annotated with semantic concept information, is first exported to XMI format using a UML modeling tool such as Enterprise Architect. The XMI export is then used as input to the UML domain model loader, which applies a set of mapping rules to load the metadata represented by Classes, Attributes and Associations into entities of caDSR. The following section contains the details of the UML to caDSR mapping rules.
h3. UML to caDSR Mapping
Metadata represented in the UML domain model is mapped to caDSR administered component types using the following mapping rules:
* A UML Class is mapped to an Object Class, which according to the ISO 11179 specification represents a thing in the real world.
* An attribute of a UML Class is mapped to a Property, which according to the ISO 11179 specification represents an attribute of a real-world thing.
* The combination of a UML Class and one of its attributes is mapped to a Data Element Concept.
* The combination of a UML Class, one of its attributes and the data type of that attribute is mapped to a Data Element, commonly referred to as a Common Data Element (CDE).
* The Project to which the UML domain model belongs is mapped to a Classification Scheme.
* Packages in the UML model -- which may represent sub-projects within a project -- are mapped to Classification Scheme Items.
* An Association between two classes is mapped to an Object Class Relationship.

Refer to the "Registration of Metadata in the caDSR" chapter of the caCORE SDK Programmer's Guide for complete details on loading UML domain models into caDSR.
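The mapping rules above can be illustrated with a minimal Python sketch. It is not the UML domain model loader itself and does not call any caDSR API: the {{UmlClass}}, {{UmlAttribute}} and {{UmlAssociation}} types, the {{map_model}} function and the colon-separated naming of the results are all hypothetical, chosen only to show which caDSR administered component type each UML element produces.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


# Hypothetical in-memory stand-ins for UML elements; a real loader reads XMI.
@dataclass
class UmlAttribute:
    name: str
    data_type: str


@dataclass
class UmlClass:
    name: str
    package: str
    attributes: List[UmlAttribute] = field(default_factory=list)


@dataclass
class UmlAssociation:
    source: str
    target: str


def map_model(project: str,
              classes: List[UmlClass],
              associations: List[UmlAssociation]) -> List[Tuple[str, str]]:
    """Return (caDSR component type, illustrative name) pairs for a model."""
    # Project -> Classification Scheme
    components = [("Classification Scheme", project)]
    # Each distinct package -> Classification Scheme Item
    for package in sorted({c.package for c in classes}):
        components.append(("Classification Scheme Item", package))
    for cls in classes:
        # Class -> Object Class
        components.append(("Object Class", cls.name))
        for attr in cls.attributes:
            # Attribute -> Property (no de-duplication in this sketch)
            components.append(("Property", attr.name))
            # Class + attribute -> Data Element Concept
            components.append(
                ("Data Element Concept", f"{cls.name}:{attr.name}"))
            # Class + attribute + data type -> Data Element (CDE)
            components.append(
                ("Data Element", f"{cls.name}:{attr.name}:{attr.data_type}"))
    # Association -> Object Class Relationship
    for assoc in associations:
        components.append(
            ("Object Class Relationship", f"{assoc.source}->{assoc.target}"))
    return components


# Example usage with made-up model content.
gene = UmlClass("Gene", "domain", [UmlAttribute("symbol", "String")])
taxon = UmlClass("Taxon", "domain")
for kind, name in map_model("ExampleProject", [gene, taxon],
                            [UmlAssociation("Gene", "Taxon")]):
    print(f"{kind}: {name}")
```

The sketch deliberately ignores semantic concept codes, versioning and curation, all of which the real loading process handles.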
h3. caGrid Reliance on caDSR
After a UML domain model is transformed, loaded and curated in caDSR, the model is ready to be used as the basis of an object-oriented grid client and service. All data movement in caGrid between client and service uses instances of Classes registered in the caDSR. caGrid requires that all data types used in the grid are registered in caDSR and come from a given Project version. That is, even though Attributes and other items in caDSR can be versioned individually, they must be associated with a specific Project version before they can be used on the grid. Several components of caGrid make use of the wealth of information in the caDSR. As mentioned above, grid services use registered data models as their information model. By doing so, they are able to advertise both the syntax and semantics of the model by exposing an export of the relevant caDSR information as service metadata. The details of the model used to expose this information are shown in the section below. Once the information is exposed in this model, caGrid leverages it for grid service advertisement and discovery. These processes are described in the [discovery section|metadata13:Discovery]. Finally, the information models registered in caDSR are used as the conceptual foundation for the actual communication format used to exchange data on the grid. This process of serializing and deserializing data instances on the grid is detailed in the [serialization overview|knowledgebase:caGrid Serialization].
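The Project version requirement described above can be pictured as a simple membership check. The following Python sketch is only an illustration under stated assumptions: the {{registry}} dictionary stands in for information that would really come from caDSR, and the function name and the (project, version) tuple layout are hypothetical, not part of any caGrid or caDSR API.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation of a Project version: (project name, version).
ProjectVersion = Tuple[str, str]


def types_outside_project_version(
        used_types: Iterable[str],
        registry: Dict[str, Set[ProjectVersion]],
        project_version: ProjectVersion) -> List[str]:
    """Return the used types that are not registered under project_version.

    registry maps a class name to the Project versions it is associated
    with; it is a stand-in for a lookup against caDSR.
    """
    return [t for t in used_types
            if project_version not in registry.get(t, set())]


# Example usage with made-up registrations.
registry = {
    "Gene": {("ExampleProject", "1.0")},
    "Taxon": {("ExampleProject", "1.0"), ("ExampleProject", "2.0")},
}
missing = types_outside_project_version(
    ["Gene", "Taxon"], registry, ("ExampleProject", "1.0"))
print(missing)
```

An empty result means every type the service uses belongs to the chosen Project version; any names returned would need to be registered under that version before the types could be used on the grid.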