Leveraging Artificial Intelligence for Protein-Based Drug Target Prediction in Pseudomonas aeruginosa
Keywords:
Drug discovery, Machine learning, Target identification, Pseudomonas aeruginosa
Abstract
Protein-based drug target identification has become an important approach for countering emerging antimicrobial resistance, especially in opportunistic pathogens such as Pseudomonas aeruginosa. Recent advances in artificial intelligence (AI) and machine learning (ML) have made it possible to analyze large-scale protein data sets, facilitating protein identification and functional annotation. This article proposes a hybrid computational approach that combines unsupervised and supervised machine learning methods to analyze the physicochemical properties of P. aeruginosa proteins. Unsupervised methods, namely K-Means clustering and Principal Component Analysis (PCA), are used to identify intrinsic patterns among the proteins, and Support Vector Machines (SVMs) are used to classify protein functions. The authors demonstrate that the model, drawing on additional protein physicochemical properties, yields biologically relevant results on protein virulence and resistance, supporting protein analysis as a viable approach in drug discovery.
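As a rough illustration of the hybrid workflow described in the abstract, the following Python sketch chains standardization, PCA, K-Means clustering, and an SVM classifier using scikit-learn. It is a minimal sketch under stated assumptions, not the authors' implementation: the choice of scikit-learn, the descriptor names, the number of clusters, the SVM settings, and the data are all hypothetical. The feature matrix is synthetic placeholder data generated only so the example runs, and it says nothing about real P. aeruginosa proteins.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical descriptor names; the abstract does not list the exact features.
FEATURES = ["molecular_weight", "isoelectric_point", "gravy", "aromaticity"]


def make_placeholder_data(n_per_class=60, seed=0):
    """Return a synthetic (X, y) pair standing in for real protein descriptors."""
    rng = np.random.default_rng(seed)
    # Three made-up class centers and per-feature spreads (assumptions).
    centers = np.array([
        [30000.0, 5.5, -0.4, 0.07],
        [55000.0, 7.0, 0.1, 0.10],
        [80000.0, 9.0, 0.6, 0.13],
    ])
    spread = np.array([4000.0, 0.4, 0.15, 0.01])
    X = np.vstack([
        center + spread * rng.standard_normal((n_per_class, len(FEATURES)))
        for center in centers
    ])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y


def explore_unsupervised(X, n_clusters=3, seed=0):
    """Standardize, project onto two principal components, then run K-Means."""
    X_scaled = StandardScaler().fit_transform(X)
    pca = PCA(n_components=2, random_state=seed)
    X_2d = pca.fit_transform(X_scaled)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    clusters = kmeans.fit_predict(X_2d)
    return X_2d, clusters, pca.explained_variance_ratio_


def train_classifier(X, y, seed=0):
    """Fit a scaled RBF-kernel SVM and report accuracy on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)


X, y = make_placeholder_data()
X_2d, clusters, var_ratio = explore_unsupervised(X)
model, accuracy = train_classifier(X, y)

print("explained variance ratio:", var_ratio.round(3))
print("cluster sizes:", np.bincount(clusters))
print("held-out accuracy:", round(accuracy, 3))
```

With real data, the placeholder generator would be replaced by descriptors computed from the proteome, and the class labels would come from curated functional annotations. Scaling is placed inside the SVM pipeline so that the scaler is fitted on the training split only.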
License
Copyright (c) 2026 Computational Discovery and Intelligent Systems (CDIS)

This work is licensed under a Creative Commons Attribution 4.0 International License.
Computational Discovery and Intelligent Systems (CDIS) content is published under a Creative Commons Attribution (CC BY) license. Content is freely available to all readers upon publication and is published as soon as production is complete.