Embedding Methods for Representation Learning: Application to Structured Biomedical Databases and Ontology
Representation learning is an emerging field of research in artificial intelligence (AI). Embedding methods are designed to learn the semantics of knowledge graphs constructed from unstructured or semi-structured sources (e.g. webpages) in order to complete tasks such as link prediction, knowledge graph completion and classification, which are sometimes seen in question-answering systems such as Google Search and Siri.
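As an illustration of what link prediction with an embedding method involves, the sketch below uses a TransE-style translational score, which is one well-known embedding approach and not necessarily one of the methods this project will evaluate. The entity names, the relation and the 2-D vectors are hand-picked assumptions for illustration only; a real model would learn them from data.

```python
import math

# Hypothetical hand-picked 2-D embeddings; a real model would learn these.
entity_vecs = {
    "aspirin": [0.0, 0.0],
    "headache": [1.0, 0.0],
    "fever": [1.0, 0.2],
    "insulin": [0.0, 1.0],
    "diabetes": [1.0, 1.0],
}
relation_vecs = {"treats": [1.0, 0.0]}


def transe_score(head, relation, tail):
    """Negative Euclidean distance between (head + relation) and tail.

    A higher (less negative) score means the triple is more plausible.
    """
    h, r, t = entity_vecs[head], relation_vecs[relation], entity_vecs[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation):
    """Rank every other entity as a candidate tail for (head, relation, ?)."""
    candidates = [e for e in entity_vecs if e != head]
    return sorted(
        candidates,
        key=lambda tail: transe_score(head, relation, tail),
        reverse=True,
    )


if __name__ == "__main__":
    # Link prediction: which entity best completes (insulin, treats, ?)
    print(rank_tails("insulin", "treats"))
```

Under this scoring, a triple is considered plausible when the head vector translated by the relation vector lands close to the tail vector, so ranking candidate tails by score is one simple way to propose missing links.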
However, these methods have not been extended to structured, relatively complete and well-categorised datasets such as electronic health records (EHRs), where they may have a much wider impact on biomedical research and healthcare support. Domain-specific ontologies support the organisation and management of biological and medical knowledge, partly by indicating the underlying class relationships, but have not been widely adopted for typing categorised data in AI training. The semantic complexity and high dimensionality of ontologies resemble those of knowledge bases and graphs, but ontologies are much more structured.
In my MSc dissertation project, supervised by Dr Hegler Tissot, I will evaluate existing knowledge embedding methods (designed for incomplete knowledge graphs built from less structured data sources) based on their theoretical design, performance and hyperparameters. I will then discuss how these methods can be adapted for application to ontologies.
To be updated. Details are subject to change.