Note: The previous page may have more information regarding this series.
Once registration is complete, several batch processes run on a regular schedule to extract, enrich, and link the data with currently available information.
- mass calculation
For every submitted structure, each component and linkage has a known mass. From this information the mass of the entire structure is straightforward to calculate, and it is recorded for all newly registered sequences.
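As a rough illustration of the idea, the sketch below sums the monoisotopic masses of the residues in a structure and subtracts one water molecule for each glycosidic linkage. It is not the GlyTouCan implementation: the residue names (`Hex`, `HexNAc`, `dHex`), the composition table, and the function names are assumptions made for this example, and substituents, ambiguous linkages, and repeating units are ignored.

```python
# Monoisotopic atomic masses in daltons.
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Elemental composition of free (unlinked) monosaccharides.
# Hypothetical lookup table for this sketch only.
COMPOSITION = {
    "Hex": {"C": 6, "H": 12, "O": 6},             # e.g. glucose, mannose
    "HexNAc": {"C": 8, "H": 15, "N": 1, "O": 6},  # e.g. GlcNAc
    "dHex": {"C": 6, "H": 12, "O": 5},            # e.g. fucose
}
WATER = {"H": 2, "O": 1}


def formula_mass(formula):
    """Return the monoisotopic mass of an elemental composition."""
    return sum(ATOM_MASS[element] * count for element, count in formula.items())


def glycan_mass(residues, linkage_count):
    """Sum the residue masses and remove one water per glycosidic linkage."""
    total = sum(formula_mass(COMPOSITION[name]) for name in residues)
    return total - linkage_count * formula_mass(WATER)


# Example: the N-glycan core (3 hexoses, 2 HexNAc, 4 linkages).
core_mass = glycan_mass(["Hex", "Hex", "Hex", "HexNAc", "HexNAc"], linkage_count=4)
```

With these assumptions, `core_mass` comes out at about 910.33 Da, which matches the commonly quoted monoisotopic mass of the Man3GlcNAc2 core.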
- conversion to RDF
The sequence information is converted into RDF and made accessible from the GlyTouCan endpoint. Because the data uses the GlycoRDF and GlyTouCanRDF ontologies, information about the sequence and any linked data can be extracted using standard SPARQL.
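To give a feel for what such a query might look like, the sketch below builds a SPARQL query for the sequence attached to one accession and encodes it as a GET request URL in the style of the SPARQL protocol. Everything specific here is an assumption for illustration: the endpoint URL is a placeholder, the function names are hypothetical, and the prefixes and predicates (`glytoucan:has_primary_id`, `glycan:has_glycosequence`, `glycan:has_sequence`) should be checked against the published ontologies before use.

```python
from urllib.parse import urlencode

# Placeholder only; substitute the real SPARQL endpoint address.
ENDPOINT = "https://example.org/sparql"

# Prefixes and predicates are assumed, not verified against the live data.
QUERY_TEMPLATE = """\
PREFIX glycan: <http://purl.jcggdb.jp/ro/glycan#>
PREFIX glytoucan: <http://www.glytoucan.org/glyco/owl/glytoucan#>
SELECT ?sequence
WHERE {
  ?saccharide glytoucan:has_primary_id "$ACCESSION" .
  ?saccharide glycan:has_glycosequence ?glycosequence .
  ?glycosequence glycan:has_sequence ?sequence .
}
"""


def build_sequence_query(accession):
    """Return a SPARQL query string for the given accession number."""
    # Accept only letters and digits so the value cannot alter the query.
    if not accession.isalnum():
        raise ValueError("accession must be alphanumeric")
    return QUERY_TEMPLATE.replace("$ACCESSION", accession)


def request_url(accession):
    """Return a GET URL that carries the query in the 'query' parameter."""
    return ENDPOINT + "?" + urlencode({"query": build_sequence_query(accession)})


url = request_url("G00051MO")
```

The resulting `url` can be fetched with any HTTP client. The query is built by plain string substitution so the braces in the SPARQL text do not need escaping.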
- conversion to WURCS
To enrich the data using the logic available in the WURCS libraries, the sequences first had to be converted from GlycoCT. A batch process was created that converts the GlycoCT and inserts the resulting WURCS into RDF.
- motif relationship search
Using the WURCS format, it was possible to search through the structural data for substructure relationships with the specifically defined motif structures. The batch process then inserts these relationships back into RDF.
If you would like to know more, please feel free to read the next page in this series.