May 15, 2015

Registering Structures

    Reviewed by Kiyoko Kinoshita

Note: The previous page may have more information regarding this series.

Registration Processing

Once registration is complete, several batch processes run periodically to extract, enrich, and link the data with currently available information.

  • mass calculation

For every submitted structure, the components and the linkage information each have a known mass. From these, calculating the mass of the entire structure is straightforward, and the result is recorded for all newly registered sequences.
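As a rough illustration of this kind of calculation, the sketch below sums monoisotopic masses of free monosaccharides and subtracts one water per glycosidic linkage. It is not the GlyTouCan batch code: the residue class names, the `RESIDUE_MASS` table, and the `glycan_mass` function are hypothetical, and the sketch assumes a tree-shaped glycan with no substituents, so n residues are joined by n - 1 linkages.

```python
# Monoisotopic masses (Da) of free monosaccharides, by residue class.
# The table and its keys are illustrative, not GlyTouCan's own data.
RESIDUE_MASS = {
    "Hex": 180.063388,     # C6H12O6, e.g. glucose, galactose, mannose
    "HexNAc": 221.089937,  # C8H15NO6, e.g. GlcNAc, GalNAc
    "dHex": 164.068473,    # C6H12O5, e.g. fucose
    "NeuAc": 309.105981,   # C11H19NO9
}
WATER = 18.010565  # H2O lost when a glycosidic bond forms


def glycan_mass(residues):
    """Return the monoisotopic mass of a glycan from its residue classes.

    Assumes a tree-shaped structure: n residues are joined by n - 1
    glycosidic linkages, each of which releases one water molecule.
    """
    if not residues:
        raise ValueError("a glycan needs at least one residue")
    total = sum(RESIDUE_MASS[name] for name in residues)
    linkages = len(residues) - 1
    return total - linkages * WATER


# A disaccharide of two hexoses (lactose, for instance).
lactose = glycan_mass(["Hex", "Hex"])
```

A production version would also need to handle substituents, repeating units, and ambiguous compositions, none of which this sketch covers.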

  • conversion to RDF

The information from the sequence is converted into RDF and made accessible from the GlyTouCan endpoint. Because the data is expressed in the GlycoRDF and GlyTouCanRDF ontologies, information about the sequence and any linked data can be extracted using standard SPARQL.
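A query against the endpoint might look something like the sketch below. Everything specific in it is an assumption to be checked against the published ontologies and the live service: the endpoint URL, the prefix IRIs, and the predicate names (`glytoucan:has_primary_id`, `glycan:has_glycosequence`, `glycan:has_sequence`, `glycan:in_carbohydrate_format`). The helper names and the accession value are purely illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint URL; confirm against the GlyTouCan documentation.
ENDPOINT = "https://ts.glytoucan.org/sparql"

# Prefixes and predicates are assumptions based on GlycoRDF / GlyTouCanRDF.
QUERY_TEMPLATE = """\
PREFIX glycan: <http://purl.jp/bio/12/glyco/glycan#>
PREFIX glytoucan: <http://www.glytoucan.org/glyco/owl/glytoucan#>
SELECT ?format ?sequence
WHERE {{
  ?saccharide glytoucan:has_primary_id "{accession}" .
  ?saccharide glycan:has_glycosequence ?gseq .
  ?gseq glycan:has_sequence ?sequence .
  ?gseq glycan:in_carbohydrate_format ?format .
}}
"""


def build_query(accession):
    """Return a SPARQL query listing every stored sequence for one accession."""
    if not accession.isalnum():
        # Refuse anything that could break out of the string literal.
        raise ValueError("accession must be alphanumeric")
    return QUERY_TEMPLATE.format(accession=accession)


def build_request(query, endpoint=ENDPOINT):
    """Wrap the query in a GET request asking for JSON results."""
    params = urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the decoded JSON result (needs network)."""
    with urllib.request.urlopen(build_request(query, endpoint)) as response:
        return json.load(response)
```

Calling `run_query(build_query("G00051MO"))` would then return the standard SPARQL JSON result structure, with one binding per stored sequence format, provided the assumptions above hold.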

  • conversion to WURCS

To enrich the data using the logic available in the WURCS libraries, the registered GlycoCT sequences first have to be converted. A batch process was created to convert the GlycoCT and insert the resulting WURCS into RDF.

  • motif relationship search

Using the WURCS format, the structural data can be searched for substructure relationships with a set of specifically defined motif structures. The batch process then inserts these relationships back into RDF.
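The idea of a substructure relationship can be sketched on a toy model. The code below does not use WURCS or the WURCS libraries: it represents each glycan as a simple tree of residue names, ignores linkage positions and anomers, and checks whether a motif tree occurs anywhere inside a glycan tree. The names `node`, `contains_motif`, and `find_motif_relationships` are hypothetical.

```python
def node(name, *children):
    """Build a toy glycan tree node: (residue name, tuple of child nodes)."""
    return (name, children)


def _assign(motif_children, glycan_children):
    """Match every motif child to a distinct glycan child, with backtracking."""
    if not motif_children:
        return True
    first, rest = motif_children[0], motif_children[1:]
    for i, candidate in enumerate(glycan_children):
        remaining = glycan_children[:i] + glycan_children[i + 1:]
        if _matches_at(candidate, first) and _assign(rest, remaining):
            return True
    return False


def _matches_at(glycan, motif):
    """True if the motif matches with its root placed on this glycan node."""
    if glycan[0] != motif[0]:
        return False
    return _assign(list(motif[1]), list(glycan[1]))


def contains_motif(glycan, motif):
    """True if the motif occurs rooted at any node of the glycan tree."""
    if _matches_at(glycan, motif):
        return True
    return any(contains_motif(child, motif) for child in glycan[1])


def find_motif_relationships(glycans, motifs):
    """Return (glycan_id, motif_id) pairs, ready to be written back as triples."""
    return [
        (glycan_id, motif_id)
        for glycan_id, glycan in glycans.items()
        for motif_id, motif in motifs.items()
        if contains_motif(glycan, motif)
    ]


# Toy data: a chitobiose-trimannose core and two candidate motifs.
core = node("GlcNAc", node("GlcNAc", node("Man", node("Man"), node("Man"))))
trimannose = node("Man", node("Man"), node("Man"))
galactose = node("Gal")
pairs = find_motif_relationships(
    {"glycan-1": core}, {"motif-A": trimannose, "motif-B": galactose}
)
```

Each resulting pair corresponds to one relationship that a batch process could insert into RDF. A real implementation also has to compare linkage positions, anomeric state, and ambiguous residues, which is the work the WURCS-based logic handles.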

If you would like to know more, please feel free to read the next page in this series.

About the author

Nobuyuki Aoki is a research assistant at Soka University. He is the technical architect, Random Programmer, and Official Plant Watcher of the glycan repository project.