MACHINE LEARNING ALGORITHMS IMPLEMENTATION IN THE HEALTHCARE SYSTEM AS A PROSPECTIVE AREA FOR SCIENCE, HEALTHCARE, AND BUSINESS

Relevance. The current state of medicine is imperfect, as in every other field. Several distinct problems can be singled out in diagnostics and disease management. Difficulties in handling biomedical data are a serious limiting factor in solving crucial healthcare problems, which are represented in statistically significant groups of diseases. The accumulation of life science data creates both opportunities and challenges for using it effectively in clinical practice. Machine learning-based tools are necessary for generating new insights and discovering hidden patterns, especially in big datasets. AI-based decisions may be successfully used for diagnosis of diseases, monitoring of general health, prediction of risks, treatment solutions, and biomedical knowledge generation. Objective. To analyze the potential of machine learning algorithms in healthcare with respect to specific existing problems and to forecast their development in the near future. Method. An analytical review of the literature by keyword in the scientometric databases Scopus, PubMed, and Wiley. Search depth: 7 years, from 2013 to 2020. Results. Analyzing the current general state of the healthcare system, we singled out the most relevant problems linked to diagnostics, treatment, and systemic management: diagnostic errors, delayed diagnostics (including during emergencies), overdiagnosis, bureaucracy, communication issues, and “handoff” difficulties. We examined the details of the conventional decision-making process in the clinical environment in order to define the exact points that may be significantly improved by AI-based decisions, among them: diagnosis of diseases, monitoring of general health, prediction of risks, treatment solutions, and biomedical knowledge generation. We identified machine learning algorithms as a promising tool for disease diagnostics and management, as well as for generating new usable insights and processing big data. Conclusion. Machine learning is a group of technologies that can become a cornerstone for dealing with various medical problems. However, some problems remain to be solved before such tools can be implemented intensively in the healthcare system.

Relevance. Nowadays, information technologies (IT) play a critical role in almost every field of our lives, including healthcare. The development of biomedical data science is interconnected with the accumulation of a significant mass of open electronic health records, the development of new algorithms, and increasing computing power. Moreover, biomedicine has been one of the fastest-growing areas of knowledge over the past 30 years. As PubMed statistics show, in most areas of biology and medicine the number of scientific articles doubles every few years. Such a large amount of data creates both opportunities and challenges, while the development of data science allows this information to be structured and, therefore, accessed faster, used more efficiently, and mined for new insights. This makes it possible to solve a wide range of existing problems, which provides opportunities for both healthcare providers and patients. Diagnostic uncertainty is a weighty problem for healthcare providers. This phenomenon is defined as a "subjective perception of an inability to provide an accurate explanation of the patient's health problem" [1]. According to a meta-analysis published in 2020, at least 0.7% of adult admissions involve a harmful diagnostic error [2]. Diagnostic uncertainty can lead to diagnostic delays, over-testing, and diagnostic errors, which can result in inadequate treatment prescription [3]. According to the Global Health Data Exchange, 105,788 people died in 2019 from adverse effects of medical treatment, an increase of more than 1 percent compared to 1990. The dynamics show slow but persistent growth of iatrogenic harm to patients. We can interpret this fact as an "adverse effect of healthcare system development": data on new details and methods of diagnostics and treatment accumulate so quickly that doctors cannot process them successfully on their own in such short periods.
The real incidence of iatrogenesis may be much higher than these figures, because proving the adverse effects of treatment requires that full information about diagnostics and treatment be stored and freely accessible [4]. The Institute of Medicine (IOM) estimates that the United States wastes $750 billion annually (approximately 30% of health care spending). The IOM identified 6 waste domains: unnecessary services ($210 billion annually); inefficient delivery of care ($130 billion); excess administrative costs ($190 billion); inflated prices ($105 billion); prevention failures ($55 billion); and fraud ($75 billion) (www.theatlantic.com/health/archive/2012/09/how-the-us-health-care-system-wastes-750-billion-annually/262106/). Improving the quality of diagnostics and treatment through AI assistance in diagnosis of diseases, monitoring of general health, prediction of risks, treatment solutions, and biomedical knowledge generation may decrease costs in these domains to varying degrees. The current state of the health care system may improve with integrated diagnostics. Integrated diagnostics is a combination of three independent diagnostic disciplines (radiology, pathology, and laboratory medicine) for therapeutic and diagnostic purposes using advanced information technology [5]. Machine learning (ML) algorithms may be an excellent tool for information collection and structuring as well as deep data analysis. Accurate diagnostics and treatment require analysis of both personal and general scientific data. A doctor can successfully analyze personal medical data, while the interpretation of large amounts of scientific data may improve significantly with artificial intelligence (AI). AI would also influence data storage, processing, and security, as well as bring some economic benefits. In this article, we focus on existing medical problems and their causes, as well as possible solutions using ML.

METHOD
We conducted an analytical review of the literature by keyword in the scientometric databases Scopus, PubMed, and Wiley. The search depth was 7 years, from 2013 to 2020.

Relevant healthcare problems
As in any other area of our lives, the healthcare system evolves rapidly, but its development is incomplete. Many difficulties have been resolved recently, while some still exist. Analyzing previous research, we have highlighted some of the most significant problems that can be partially solved using ML algorithms as well as other methods of biomedical data science. Among them are diagnostic errors, delayed diagnostics (including during emergencies), overdiagnosis, bureaucracy, communication issues, and "handoff" difficulties, all of which have a significant impact on the quality of healthcare. We describe some of them using the examples of cardiovascular diseases, neurological disorders, oncology, and kidney disease.

1. Cardiovascular problems
According to the World Health Organization (WHO), cardiovascular diseases (CVDs) are the number 1 cause of death globally. Heart attacks and strokes cause four out of five cardiovascular disease deaths. Non-communicable diseases cause 41 million deaths annually, or 71% of all deaths. Among them, cardiovascular disease is a global problem, killing 17.9 million people annually, or 31% of all deaths each year [6]. Early diagnosis of atherosclerosis and blood clots may prevent complications such as stroke, heart attack, or pulmonary embolism. Late detection of atherosclerotic plaques can also lead to the development of aneurysms and problems with the coronary, carotid, peripheral, and renal arteries [7]. Early diagnostics has a significant effect on survival: the earlier it is carried out, the higher the chance of recovery [8]. Thus, monitoring biomarkers and other predictors of CVDs that indicate the development of blood clots and atherosclerotic plaques can significantly improve the situation and reduce the total number of deaths related to the cardiovascular system. The rate of cholesterol testing two or more times in 3 years has been growing exponentially for the last twenty years. This tendency correlates positively with the accumulation of patient data, which gives more possibilities for efficient monitoring of lab test dynamics. Such an approach can be useful for any chronic disease. For example, cholesterol level monitoring may be helpful as a predictor of cardiovascular events. As a result, we can enhance prophylaxis and prescribe a preventive treatment, such as lipid-lowering drugs, before consequences develop [9].
According to WHO, CVDs and two other groups of diseases (cancer and infectious diseases) are the main categories of diagnostic errors in primary health care. Errors may occur when minor warning symptoms are missed or ignored in primary care [10]. Cross-analysis of a large population sample yields an estimate of 15,000 to 165,000 misdiagnosed cerebrovascular events annually in United States emergency departments (EDs), disproportionately among patients presenting with headache or dizziness. Some cerebrovascular diseases are not diagnosed immediately, which can lead to the death or disability of the patient [11]. Approximately 9% of cerebrovascular diseases go unnoticed at the initial ED presentation. The risk of misdiagnosis is higher if the patient's complaints are minor and the symptoms are mild, non-specific, or transient [12]. Another important condition is deep vein thrombosis (DVT), which carries a high risk of missed diagnosis. According to Yuhong Zhang, the rate of missed diagnosis of lower-extremity DVT by ultrasound is about 50% in patients without DVT symptoms [13]. The most significant complication of DVT is pulmonary embolism (PE), a life-threatening condition. Without proper management PE is likely to lead to death, and its diagnosis is often delayed in clinical practice [14]. There are also many errors related to cardiovascular medications. EDs and acute care hospitals are the most common locations at high risk for medication errors [15]. According to a 2016 study, cardiovascular drugs are associated with 24.7% of medical errors. Anticoagulants lead among them, accounting for 11.3% of the errors [16]. In addition, incorrect doses and unnecessary drugs in the treatment of cardiovascular diseases can lead to the development of thrombosis [17].

2. Neurological disorders
2.1 Parkinson's disease.
According to the statistics, about 1% of people over age 60 have Parkinson's disease (PD), and this percentage rises with age [18]. With early diagnosis, the efficiency of pharmacological management increases, and non-pharmacological management is also possible. The combination of these two approaches helps to manage present symptoms and prolongs an active and healthy life [19]. Early diagnostics may be conducted with the help of neurochemical biomarkers, such as orexin, dopamine, dopamine receptors, dopamine transporter activity, α-synuclein, and others [20]. PD is misdiagnosed in about 30% of all cases [21]. An accurate diagnosis of PD is essential both for patient care and for research associated with epidemiology, genetics, medical imaging, neurochemical biomarkers, and symptomatic and disease-modifying treatments [22].

2.2 Multiple sclerosis.
Multiple sclerosis (MS) is a potentially disabling autoimmune disease without efficient treatment or known exact etiology. It is characterized by immune-mediated attacks on the central nervous system (CNS) followed by demyelination and reversible or relapsing neurological symptoms [23]. Hence, it is expedient to generate new insights into early laboratory diagnostics of MS by analyzing big patient data. According to the statistics, MS affects approximately 900,000 people in the United States and 2.5 million people worldwide [24]. According to a UK study, peak incidence occurred between ages 40 and 50 years and maximum prevalence between ages 55 and 60 years [25]. Early diagnosis of MS is possible through a combination of symptoms (lasting at least 24 hours) and clinical tests, including early biomarkers: oligoclonal bands, anti-MOG antibodies, and antinuclear antibodies [26]. It helps to decrease the possibility of disability and to lower the secondary relapse rate [27]. Misdiagnosis of MS brings certain risks associated with not receiving early-stage treatment [28]. Diagnostic errors usually occur when disorders that are not associated with demyelination and inflammation present with symptoms typical of MS [29]. Alternative conditions are frequently suggested by the presence of "red flags" in the clinical presentation. These are signs, symptoms, or findings atypical for MS that should be detected and investigated by radiographic, clinical, or laboratory methods to reduce the possibility of MS misdiagnosis [29].

2.3 Alzheimer's disease.
Alzheimer's disease (AD) is a damaging worldwide social problem. It is estimated that the number of AD patients in the USA alone will rise from 5 million to 14 million by 2050 [30]. According to WHO, the total number of patients with dementia may reach 82 million in 2030 and 152 million in 2050 [31]. AD should be diagnosed in the preclinical phase or at the stage of AD-induced mild cognitive impairment (MCI) to decrease the possibility of irreparable brain damage. Some people with MCI have returned to normal cognition without AD-related dementia thanks to timely diagnostics and treatment [30]. Diagnosing AD with the help of biomarker measurement may prevent a significant number of false-positive diagnoses compared with guideline-based diagnostics alone [32]. In the future, early biomarkers will become a required part of monitoring the effects of AD treatment [30]. Detection and measurement of biomarkers are also a significant part of AD drug development, as they allow a better match between the designed drug and the patient to be identified for clinical trials [33].
That is why the biomarker monitoring process should be simplified at all stages, from diagnostics to treatment.
Patients and people involved in the diagnostics and treatment of AD should be able to get fast access to correct and relevant biomarker measurements.

3. Oncology
3.1. Thyroid cancer.
Thyroid cancer (TC) is the most common endocrine cancer [34]. TC is often overdiagnosed in the USA and South Korea, mainly by ultrasonography [35] [36]. The most common way of managing a thyroid tumor is radical thyroidectomy, which has led to an increase in hypoparathyroidism incidence in South Korea [36]. In 2015, the Korean Committee for National Cancer Screening Guidelines issued a recommendation against thyroid cancer screening with ultrasonography for healthy individuals [37]. Thyroid cancer has an estimated 5-year survival of 98.1% overall: 99.9% for localized disease and 55.5% for distant disease [34]. This suggests that thyroidectomy may be unjustified in some cases, so that the harm from disease management may exceed the harm from the tumor itself. Fine-needle aspiration is the most common method in the diagnosis of TC. When it is performed, ∼70% of all thyroid tumors are classified as benign, 4.0% as malignant, and 10% as suspicious or indeterminate, while 17% yield an insufficient sample [38]. Some non-specific biomarkers may indicate the presence of a thyroid tumor [39]; taken together and combined with other existing diagnostic methods, they can indicate the possibility of TC more accurately. This approach would allow thyroidectomy to be prescribed more accurately.
3.2. Prostate cancer.
Prostate cancer (PC) is the second most common cancer in men. In the USA, 33,330 deaths occur from prostate cancer [34]. The implementation of the prostate-specific antigen (PSA) test increased the level of prostate cancer detection, resulting in overdiagnosis and overtreatment [40]. Undergoing radical prostatectomy or radiation therapy may lead to complications (urinary symptoms, operative mortality) as well as long-term sequelae (urinary incontinence, impotence, and bowel dysfunction) [41]. In our opinion, a PSA blood test alone is not enough for the diagnosis, so more specific biomarkers (BMs) are needed [42].

4. Kidney disease
Chronic kidney disease (CKD) is a significant disorder that affects many people around the world. Over $1 trillion is spent globally on end-stage renal disease care [43]. Unfortunately, it is often recognizable only by laboratory abnormalities in the latest stages. Late diagnosis has hindered the development of effective kidney disease treatment [44] and may limit the number of BMs available for early disease detection. Measuring the glomerular filtration rate (GFR) is the "gold standard" for CKD assessment, but it is not specific enough, especially during the early stages of the disease [45]. A large number of BMs are associated with kidney damage and/or loss of function and can be put to use via ML methods for CKD management [46]. Factors such as age, gender, race, and family history are crucial for CKD. Moreover, hypertension, smoking, diabetes mellitus, and obesity can also lead to kidney disease. It is critical for patients and doctors to be aware of all aspects of effective diagnostics and treatment. Normal renal senescence and physiological loss of GFR should be recognized and differentiated from life-threatening signs of CKD. Complete analysis of the patient's data, including age and comorbidity together with albuminuria, GFR, and early biomarkers of kidney damage, is a potentially efficient tool in CKD diagnostics and management [47].

Healthcare problems and data processing
Tools such as electronic health records gave a push to the development of biomedical data science, but it still does not look as if we can use such vast amounts of data fluently. According to The Joint Commission Center for Transforming Healthcare Hand-off Communications Project, "hand-off is a transfer and acceptance of patient care responsibility achieved through effective communication". The hand-off is the process of transmitting medical information from one health care provider to another for treatment or diagnostic purposes. There are more than 4,000 hand-offs in a typical teaching hospital every day (https://psnet.ahrq.gov/web-mm/triple-handoff). A lot of essential information may get lost when a patient's data is transferred between healthcare providers. The electronic health record system was a major invention that solved this problem to a large extent, but the problem still partially exists. Another serious difficulty is obtaining essential medical data for urgent patient management when the patient is unidentified. As modern devices show, this may be partially solved by storing data on smartphones or other devices and properly linking these local systems to the electronic health record system. It would also be valuable to let patients participate in replenishing their own biomedical data, but only under a doctor's control.
We have mentioned many BMs that may be used successfully to improve the diagnostics and management of particular diseases. To do so, however, a lot of scientific information must be identified, analyzed, and verified in detail before the BMs are implemented in clinical practice. With the help of AI, we can gather a large amount of data about potential BMs from scientific resources like PubMed to improve diagnostics. ML algorithms can become an excellent instrument for assessing the significance of such BMs and defining their role in disease diagnostics and management. Such an approach can potentially solve the problems of overdiagnosis shown in the section on oncology, help predict and detect chronic diseases at early stages and, more generally, partially solve the problems listed earlier.
Comprehensive analysis of the patient's data, including age, sex, race, and comorbidity, combined with suitable diagnostic methods and general scientific data, is a promising tool for the future of healthcare.

Machine learning algorithms and biomedical data processing
Decision-making in medicine is a responsible and complex task that requires taking a huge number of factors into account. These factors and their number vary with the field of medicine, but even so, we can identify the most fundamental among them:
• Patient laboratory data / clinical history / genomics data
• Psychological state / human conditions
• Consumption of pharmaceuticals
Modern science is trying to describe and digitize these factors. Assessing them requires both general information and personal data. Since the human body is a complex object, it is difficult to draw deductive conclusions about its nature; therefore, personal information is valuable. On the other hand, personal data are rather difficult to interpret; therefore, the best strategy for making an informed decision is to rely on both patient data and global information.
In the current technological situation, artificial intelligence is not able to take over decision-making, but modern machine learning algorithms can be an excellent tool for medical professionals. Such systems should have the following properties (Table 1):
• Good accuracy: machine learning algorithms should have accuracy close to or higher than that of physicians.
• Good explanation ability: the output of a system should be interpretable for physicians.
• Ability to work with missing and noisy data: missing or noisy data are widespread in medicine, so algorithms should be as insensitive as possible to this situation.
• Ability to use different input data while minimizing their number: using many input variables is expensive in time and cost, so algorithms should be able to work with different data of minimal volume.
• High performance: algorithms must be able to train fast and efficiently.
• Large coverage of output variables: the more output solutions a system has to offer, the higher its value.
• High differentiation power: some diseases have many subtypes, so it is important to create very detailed systems for their classification.
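The requirement to tolerate missing data can be illustrated with a minimal sketch. It assumes column-mean imputation, which is only one simple strategy among many and is not a method named in this article; the function name and the patient values are hypothetical.

```python
from statistics import mean


def impute_column_means(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 when a column has no observed values at all.
        col_means.append(mean(observed) if observed else 0.0)
    return [
        [col_means[j] if value is None else value for j, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical records: [total cholesterol (mmol/L), systolic pressure (mmHg)].
records = [
    [5.0, 120.0],
    [None, 140.0],
    [7.0, None],
]
completed = impute_column_means(records)
```

In practice the choice of imputation strategy affects downstream accuracy and interpretability, so it would need to be validated for each clinical dataset.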
We can highlight the following areas of medicine in which machine learning is actively implemented: diagnosis of diseases, monitoring of general health, prediction of risks, treatment solutions, and biomedical knowledge generation. In general, the spread of machine learning in clinical practice lags behind its potential, given the opportunities already available.
Diagnosis of diseases is the classic task that people associate with machine learning in medicine. Formally, this task can be described as predicting a certain class from the entire pool of classes based on input data in the form of attributes for each patient. Let $P = \{p_1, p_2, \dots, p_k\}$, $|P| = k$, be a set of patients and $A = \{a_1, a_2, \dots, a_n\}$, $|A| = n$, a set of attributes. The patients and their attributes form a $k \times n$ matrix $M_{AP}$ indexed by $P \times A$.
Let $C = \{c_1, c_2, \dots, c_m\}$ be a set of classes. Each row of $M_{AP}$ corresponds to some class; in the case of disease classification, each class is a specific disease. The task is to predict the class of a new patient from its attributes, based on patients whose classes are known. The most popular classifiers are Support Vector Machine (SVM), Naïve Bayes classifiers, Random Forest, K-Nearest-Neighbour (KNN), and neural network classifiers. In Table 1, we briefly describe them together with their advantages and disadvantages.
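The classification task formulated above can be sketched with one of the listed classifiers, K-Nearest-Neighbour. This is a minimal illustration under stated assumptions, not an implementation from the reviewed literature: the attribute values, the class labels `c1` and `c2`, and the function name `knn_predict` are hypothetical, and the attributes are assumed to be numeric and already on comparable scales.

```python
from collections import Counter
from math import dist


def knn_predict(train_rows, train_labels, new_row, k=3):
    """Predict the class of new_row by majority vote among its k nearest training rows."""
    # Sort training indices by Euclidean distance to the new patient.
    order = sorted(range(len(train_rows)), key=lambda i: dist(train_rows[i], new_row))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical matrix M_AP: one row per patient, one column per attribute.
M_AP = [
    [1.0, 1.2],
    [0.9, 1.0],
    [1.1, 0.8],
    [3.0, 3.2],
    [3.1, 2.9],
    [2.9, 3.1],
]
# Known class for each row of M_AP.
classes = ["c1", "c1", "c1", "c2", "c2", "c2"]

prediction = knn_predict(M_AP, classes, [3.0, 3.0], k=3)
```

KNN is used here only because it fits in a few lines; real clinical attributes usually need scaling and careful validation before any distance-based method is meaningful.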
A very promising field is the use of external knowledge graphs to improve medical diagnosis results [48]. A knowledge graph $g = \{N, R\}$ consists of a set of medical entities as nodes ($N$) and a set of relations between these entities ($R$).
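As a data structure, such a graph can be sketched as a set of nodes plus a set of (head, relation, tail) triples. The class below is an illustrative assumption, not the representation used in [48]; the relation names `may_cause` and `predictor_of` are hypothetical, and the example entities are taken from the cardiovascular section of this article.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Graph g = {N, R}: N is a set of entities, R a set of (head, relation, tail) triples."""

    def __init__(self):
        self.nodes = set()
        self.relations = set()
        self._outgoing = defaultdict(set)

    def add_relation(self, head, relation, tail):
        # Adding a triple also registers both of its entities as nodes.
        self.nodes.update((head, tail))
        self.relations.add((head, relation, tail))
        self._outgoing[head].add((relation, tail))

    def related(self, head, relation):
        """Return the tails linked to head by the given relation, sorted for stable output."""
        return sorted(t for r, t in self._outgoing[head] if r == relation)


g = KnowledgeGraph()
g.add_relation("deep vein thrombosis", "may_cause", "pulmonary embolism")
g.add_relation("atherosclerosis", "may_cause", "stroke")
g.add_relation("atherosclerosis", "may_cause", "heart attack")
g.add_relation("cholesterol level", "predictor_of", "cardiovascular event")

complications = g.related("atherosclerosis", "may_cause")
```

A diagnostic model could query such a structure for entities related to a patient's findings and use them as additional input features.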
Sometimes we do not have a set of classes but still want to subdivide and categorize sets of attributes; this is the task of clustering. Examples of clustering algorithms are K-means Clustering, Agglomerative Clustering, Multikernel Learning algorithms, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and Ordering Points To Identify the Clustering Structure (OPTICS).
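The first of these algorithms, K-means, can be sketched in a few lines. The version below is a simplified illustration: it assumes numeric attributes, a fixed number of iterations, and centroids initialized at the first k points (production implementations use better initialization). The function name and the data points are hypothetical.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Cluster points into k groups; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical patients described by two numeric attributes, with no class labels.
points = [
    [1.0, 1.0],
    [5.0, 5.0],
    [1.2, 0.8],
    [0.8, 1.2],
    [5.2, 4.8],
    [4.8, 5.2],
]
labels, centroids = k_means(points, k=2)
```

Unlike the classification setting, no known classes are supplied; the groups emerge from the attribute values alone and still need clinical interpretation.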

Biomedical data security
The accumulation of patients' personal medical information and the creation of new services for archiving and processing it have given rise to an essential problem with data security. According to the Digital Guardian analysis published in 2019, the average total cost of a data breach by industry was highest in healthcare. The reason is that personal medical information, unlike credit card or social insurance data, cannot be changed once it has been stolen. This is supported by the annual report of IBM and the Ponemon Institute, which states that the average cost of a data breach in 2019 was 3.92 million dollars, while for healthcare it was 6.45 million dollars, about 65% above the overall average. From 2005 to 2019, 249.09 million people were affected by healthcare data breaches, and the number of hacking incidents grew consistently during this period [49].
The data breach at the insurance company Premera Blue Cross is a good example of hacking. In May 2014, an employee of this company received a phishing email with a link to a document containing malware, which allowed an attacker to enter the company's internal network and steal the medical data of 11 million people (https://cutt.ly/6zDBjZG). It is also important to protect portable devices from third-party access, as shown by the incident in Chicago in 2013, when 4 unencrypted computers were stolen from Advocate Medical Group. 4 million patients were affected by this incident, and the financial loss was assessed at billions of dollars (www.healthcareitnews.com/news/Advocate-Health-slapped-with-lawsuit-after-massive-data-breach). It should be noted that data breaches can be caused not only by external attacks but also from the inside.
It is crucially important to constantly upgrade systems that work with biomedical data, because their malfunction can lead to wrong treatment and lethal outcomes, as happened in a hospital in Germany in 2020 (www.wired.com/story/a-patient-dies-after-a-ransomware-attack-hits-a-hospital/). AI-based systems, which we review in this article as a promising accessory diagnostic instrument, require supporting mechanisms to confirm the authenticity of the data they operate on. AI implementation in the healthcare system requires standardization of the data sets it works with to prevent inaccurate results. Such systems, and especially their data sets, must be protected from all types of data breaches. It is important not only to introduce technical innovations in data security but also to vet the employees who have access to this information and to implement multi-layer review of all changes. Both for simply storing patients' information and for operating on it with AI-based systems, we can list some general recommendations for biomedical data security: use of anti-malware solutions and network protection with effective firewalls; use of multi-factor authentication; use of security patch management and of anti-social-engineering and anti-phishing programs; investing in cybersecurity insurance; data encryption; creating reliable backups; investing in employee training; and conducting frequent audits of the cybersecurity system.
Contributors. All the authors performed the literature search and read and approved the final manuscript.
Declaration of interests. We declare no competing interests.

CONCLUSION
In this article, we have identified some severe healthcare system problems and reviewed their possible solutions using ML algorithms. Based on our analysis, the healthcare system has been upgraded significantly by inventions like EHRs, but there are still target points for its improvement. Machine learning is a group of technologies that can become a cornerstone for dealing with various medical problems. Using various types of AI will be useful for understanding risk factors, behavioral patterns, and features of the therapeutic pathway in order to provide adequate and timely treatment. To implement this approach, healthcare should focus on accumulating and structuring big data, creating the substrate for further investigations and market development. An interdisciplinary approach is necessary for building universal AI, concentrating on technologies for global scientific data mining and processing. This strategy will make it possible to use AI as a full-fledged accessory diagnostic instrument and to address many medical issues. In the future, the application of ML algorithms can provide a tool for comparing the outputs of laboratory and clinical studies with existing healthcare standards, helping to develop more advanced diagnostic methods and personalized treatment by generating new insights and detecting hidden correlations. All of the mechanisms described above will allow large amounts of data to be gathered, setting the stage for the future development of biomedical data science. The development of AI-based systems requires control over the data they work with in order to provide accurate and reliable results. Weighty economic benefits are to be expected for patients and the healthcare system in general. We believe that in the near future AI will become a fully functional diagnostic instrument, collaborating with physicians to provide the best quality of medical services.
We believe that future AI systems will be able to substantially optimize healthcare, making the lives of both patients and healthcare providers a bit better.