The 5th International Workshop on Semantics-Powered Data Mining and Analytics (SEPDA 2020)

in Conjunction with the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2020)


Biomedical ontologies and controlled terminologies provide structured domain knowledge to a variety of health information systems. These rich thesauri, with concepts linked by semantic relationships, have been widely used in natural language processing, data mining, machine learning, semantic annotation, and automated reasoning. The dramatically increasing amount of health-related data presents unprecedented opportunities for mining previously unknown knowledge with semantics-powered data mining and analytics methods. However, because data sources are heterogeneous, it is challenging to exploit multiple sources to solve real-world problems such as designing cost-effective treatment plans for patients, designing generalizable clinical trials, drug repurposing, and clinical phenotyping. The goal of this workshop is to bring together people in the fields of ontologies, data mining, knowledge representation, knowledge management, and data analytics to discuss innovative semantic methods, applications, and data analytics that address problems in healthcare, biomedicine, public health, and clinical research using biomedical, clinical, behavioral, and social web data. We invite original research submissions as well as work in progress. Authors of selected full papers will be invited to submit an extended version to a special issue of a premier journal in biomedical informatics.

SEPDA has been established as a key venue for disseminating research on health data analytics using semantic web technologies such as ontologies. In the past few years, we have seen increasing interest in using semantic web technologies for health data analysis, with more and more submissions presenting novel methods and applications for linked open data, information extraction, semantic-web-based knowledge bases, and deep learning. The NIH Data Science Strategic Plan explicitly commits to ensuring that all data-science activities and products supported by the agency adhere to the FAIR principles, meaning that data should be Findable, Accessible, Interoperable, and Reusable. Semantic web technologies play a crucial role in addressing the FAIR principles. With infrastructure support such as NCBO’s BioPortal for ontology maintenance and the CEDAR software for metadata creation and validation, more and more researchers are using ontologies and semantic web technologies for knowledge representation, semantic inference, natural language processing, and data analytics. Meanwhile, we have received submissions that use semantics-based methods to tackle critical problems in biomedical informatics, such as extracting drug-drug interactions, repurposing drugs, detecting adverse drug reactions, detecting early signals of cognitive impairment, and visualizing dietary supplement knowledge. It is thus critical to hold the SEPDA workshop annually to continue our momentum and allow researchers to present and discuss novel methods and applications in this fast-growing field. In the previous four years, we held SEPDA at the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2016, BIBM 2017, BIBM 2018) and the International Semantic Web Conference (ISWC 2019).


Topics of interest include, but are not limited to:

  • Semantics-based Data Mining and Analytics
    • Ontology-based text mining and natural language processing
    • Semantics-powered data mining and machine learning from biomedical, clinical, or social web data
    • Information extraction from biomedical, clinical, or social web data
    • Semantic annotation of biomedical, clinical, or social web data
  • Ontologies and Controlled Terminologies
    • Ontology development and enrichment
    • Quality assurance of ontologies and controlled terminologies
    • Semantic harmonization and ontology alignment
    • Knowledge representation and reasoning
    • Knowledge graphs
  • Data Integration
    • Linked open data
    • Novel approaches for data integration of heterogeneous data sources
    • Large scale data integration
  • Applications
    • Novel symbolic and semantic methods for pandemic surveillance
    • Novel tools and ontologies for data interpretation and visualization
    • Pharmacovigilance
    • Drug repurposing
    • Clinical trial generalizability assessment using ontologies
    • Algorithmic phenotyping and cohort identification using ontologies
    • Improving the literacy of health information consumers

Important Dates

  • Oct 22, 2020 (Extended): Due date for full workshop paper submissions
  • Nov 16, 2020: Notification of paper acceptance to authors
  • Nov 26, 2020: Camera-ready versions of accepted papers due
  • Dec 16-19, 2020: Workshops


Please submit a 2-page abstract, a short paper (up to 4 pages in IEEE 2-column format), or a full-length paper (up to 8 pages in IEEE 2-column format) through the online submission system (you can download the format instructions here). Electronic submissions (in PDF or PostScript format) are required. Selected participants will be asked to submit their revised papers in a format to be specified at the time of acceptance.

Link to the submission site

  1. To submit a 2-page abstract, please click on "Submit a New Abstract"
  2. To submit a 4-page short paper or 8-page full paper, please click on "Submit a New Full Paper"

In case you cannot open the hyperlink, the link to IEEE template is:

Submission website is:


  • Workshop Co-chairs
    • Zhe He, PhD, School of Information, Florida State University, USA
    • Cui Tao, PhD, School of Biomedical Informatics, University of Texas Health Science Center at Houston, USA
    • Jiang Bian, PhD, Department of Health Outcomes & Biomedical Informatics, University of Florida, USA
    • Rui Zhang, PhD, Institute for Health Informatics and College of Pharmacy, University of Minnesota, USA
    • Xia Hu, PhD, Department of Computer Science, Texas A&M University, USA
  • Program Committee Members
    • Muhammad Amith, University of Texas Health Science Center at Houston
    • James Geller, New Jersey Institute of Technology
    • Zhengxing Huang, Zhejiang University
    • Xia Jing, Clemson University
    • Xiao Luo, Indiana University-Purdue University Indianapolis
    • Yanshan Wang, Mayo Clinic
    • Yaoyun Zhang, Melax Tech / University of Texas Health Science Center at Houston

Workshop Schedule

The date and time for the workshop:

9:00 am - 1:20 pm on Dec 16, 2020 (Seoul Local Time)
7:00 pm - 11:20 pm on Dec 15, 2020 (US Eastern Time)
The 5th International Workshop on Semantics-Powered Data Mining and Analytics
Workshop Chairs: Zhe He, Jiang Bian, Cui Tao, Rui Zhang, Xia Hu
Time | Title | Presenter/Author
9:00 am – 9:05 am | Introduction to SEPDA 2020 | Zhe He
9:05 am – 10:00 am | Keynote | Rui Zhang
10:00 am – 10:10 am | Coffee Break |
10:10 am – 11:30 am | Session 1: Ontology Research (Session Chair: Zhe He) |
10:10 am – 10:30 am | A Lexical-based Formal Concept Analysis Method to Identify Missing Concepts in the NCI Thesaurus (S08201) | Fengbo Zheng and Licong Cui
10:30 am – 10:50 am | A Health Consumer Ontology of Fast Food Information (S08205) | Muhammad Amith, Jing Wang, Grace Xiong, Kirk Roberts, and Cui Tao
10:50 am – 11:10 am | NCCD – RxNorm: Linking Chinese Clinical Drugs to International Drug Vocabulary (S08207) | Yaoyun Zhang, Jing Li, and Mui Zandt
11:10 am – 11:30 am | Generating Training Data for Concept-Mining for an 'Interface Terminology' Annotating Cardiology EHRs (B723) | Vipina Keloth, Shuxin Zhou, Andrew Einstein, Gai Elhanan, Yan Chen, James Geller, and Yehoshua Perl
11:30 am – 11:50 am | Coffee Break |
11:50 am – 1:10 pm | Session 2: Ontology-Based Analytics and Applications (Session Chair: Jiang Bian) |
11:50 am – 12:10 pm | A Methodology to Develop Knowledge Graphs for Indication Expansion: An Exploratory Study (S08202) | Ozge Gurbuz, Miao Sun, and Nathan Lawless
12:10 pm – 12:30 pm | Deep Learning Identification of Asthma Inhaler Techniques in Clinical Notes (S08203) | Bhavani Singh Agnikula Kshatriya, Elham Sagheb, Chung-Il Wi, Jungwon Yoon, Hee Yun Seol, Young Juhn, and Sunghwan Sohn
12:30 pm – 12:50 pm | KEoG: A knowledge-aware edge-oriented graph neural network for document-level relation extraction (S08206) | Tao Li, Weihua Peng, Qingcai Chen, Xiaolong Wang, and Buzhou Tang
12:50 pm – 1:10 pm | Opioid2FHIR: A system for extracting FHIR-compatible opioid prescriptions from clinical text (S08208) | Jingqi Wang, William Mathews, Huy Pham, Hua Xu, and Yaoyun Zhang
1:10 pm – 1:20 pm | Closing Remarks |

Past Workshops

  • The 1st SEPDA Workshop (SEPDA 2016) was held in conjunction with IEEE BIBM 2016 in Shenzhen, China. Dr. Hua Xu, a leading expert in the field of Biomedical Informatics, gave the keynote speech on the bioCADDIE project. Eight high-quality papers were published and presented. About 20 people attended the workshop. Authors of selected papers were invited to publish extended versions in the special issue on “Semantics-Powered Healthcare Engineering and Data Analytics” in the Journal of Healthcare Engineering.
  • The 2nd SEPDA Workshop (SEPDA 2017) was held in conjunction with IEEE BIBM 2017 in Kansas City, Missouri, USA. Seventeen high-quality papers were published and presented. About 30 people attended the workshop. Extended journal versions of the papers have been published in BMC Medical Informatics & Decision Making.
  • The 3rd SEPDA Workshop (SEPDA 2018) was held in conjunction with IEEE BIBM 2018 in Madrid, Spain. Fourteen high-quality papers were published and presented. Around 40 people attended the workshop. Extended journal versions of the papers have been published in BMC Medical Informatics & Decision Making.
  • The 4th SEPDA Workshop (SEPDA 2019) was held in conjunction with the International Semantic Web Conference (ISWC 2019) in Auckland, New Zealand. Eleven high-quality papers were published and presented. Around 40 people attended the workshop. Six extended journal versions of the papers will be published in BMC Medical Informatics & Decision Making in July 2020.