Seongyong Park - Machine Learning, Bioinformatics

Research Scientist in the Cancer Data Science Lab at NCI, Bethesda, MD, USA.

Design of machine learning and statistical models to discover predictive signatures of immunotherapy response in cancer.

Keywords

Machine Learning – Statistics – Bioinformatics – Scientific computing – Data management

Experience

Education

  • 2015-2022: Ph.D. in Bio and Brain Engineering. Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea.
  • 2010-2012: M.S. in Mechanical Engineering. Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.
  • 2002-2010: B.S. in Mechanical Engineering. Pukyong National University (PKNU), Busan, South Korea.

Bio

I have over 8 years of experience in bioinformatics and machine learning. During that time, I managed the "Multi-Component and Multi-Target (MCMT) drug discovery from Traditional Korean Medicine" project and initiated the "Developing AI model for cancer drug discovery" project for the Synergistic Bioinformatics Lab at KAIST. I managed databases for bioinformatic and chemoinformatic discovery and worked closely with members of experimental labs who verified the predicted biomarkers and candidate drugs. I designed computational methods to harmonize prior knowledge (disease genes, biological networks, and gene sets associated with disease mechanisms) with gene expression datasets from patients and model organisms. I also managed CPU/GPU clusters for high-performance computing (HPC).

My major research topic for the MCMT project is developing computational methods that optimize gene expression marker sets to explain clinical and model-organism responses at the same time, using prior knowledge of the disease mechanisms. I have co-authored several patents and peer-reviewed articles on bioinformatics and machine/deep learning. For bioinformatic applications, I have studied gene expression-based prognostic model development, assay-oriented marker set optimization, drug-target interaction prediction, and drug repurposing. I am also interested in applying statistical, machine, and deep learning to other topics, such as applications of overlap statistics, sequence-based protein classification, protein-protein interaction prediction, and spectral signal processing, including electroencephalography (EEG) and surface-enhanced Raman spectroscopy (SERS).

Selected Publications

Boosting Prior Knowledge in Machine Learning for Biomarker Discovery

  • Seongyong Park, Gwansu Yi. Development of Gene Expression-Based Random Forest Model for Predicting Neoadjuvant Chemotherapy Response in Triple-Negative Breast Cancer. Cancers, February, 2022 [online]
  • Yoon Hyeok Lee, Hojae Choi, Seongyong Park, Boah Lee, Gwansu Yi. Drug repositioning for enzyme modulator based on human metabolite-likeness. BMC Bioinformatics, May, 2017 [online]

Self-Supervised Learning for reproducible research

  • Seongyong Park, Jaeseok Lee, Shujaat Khan, Abdul Wahab, Minseok Kim. SERSNet: Surface Enhanced Raman Spectroscopy based Bio-Molecule Detection using Deep Neural Network. Biosensors, November, 2021 [online]
  • Seongyong Park, Jaeseok Lee, Shujaat Khan, Abdul Wahab, Minseok Kim. Machine Learning-based Heavy Metal Ion Detection Using Surface-Enhanced Raman Spectroscopy. Sensors, January, 2022 [online]

Sparse Representation Learning

  • Muhammad Usman, Shujaat Khan, Seongyong Park, Abdul Wahab. AFP-SRC: Identification of Antifreeze Proteins Using Sparse Representation Classifier. Neural Computing and Applications (NCA), September, 2021 [online]
  • Muhammad Usman, Shujaat Khan, Seongyong Park, Jeong-A Lee. AoP-LSE: Antioxidant Proteins Classification Using Deep Latent Space Encoding of Sequence Features. Current Issues in Molecular Biology (Bioinformatics and Systems Biology section), October, 2021 [online]

Overlap Statistics and its applications [Video]

  • Seongyong Park, Shujaat Khan, Muhammad Moinuddin, Ubaid M. Al-Saggaf. GSSMD: A new standardized effect size measure to improve robustness and interpretability in biological applications. 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Seoul, Korea (South), 2020, pp. 1096-1099 [online]

Patents

  • Bumki Min, Gwansu Yi and Seongyong Park. System and method for disease prediction based on group marker consisting of genes having similar function. KR Patent 1022361940000, App 1020200145453, issued Mar 30, 2021 [online]

  • Taesung Kim and Seongyong Park. Microfluidic concentrator array for observing predation behavior of microbes. KR Patent 1012385560000, App 1020100110945, issued Feb 22, 2013 [online]

Presentations

BIBM, Online, 2020 [Publication]

Biofusion Seminar, KAIST, Daejeon, 2019

Biopharmaceutical Seminar, KAIST, Daejeon, 2016