The World Health Organization (WHO) describes coronaviruses as a group of viruses, several of which infect humans and usually cause respiratory illness ranging from the common cold to severe acute respiratory distress syndrome (ARDS) [1]. The most recently discovered coronavirus, SARS-CoV-2, causes the infectious disease COVID-19, a pandemic affecting many countries around the world [1]. COVID-19 typically presents clinically as fever, dry cough, dyspnea, and fatigue, but signs and symptoms vary widely [1]. Reports are emerging of patients presenting with cardiovascular symptoms, including chest tightness and heart palpitations [2, 3]. Some patients present exclusively with cardiac complaints and have no respiratory complaints [3]. Underlying cardiovascular disease increases the risk of severe COVID-19 disease and death [4, 5], but does COVID-19 infection increase the risk of, or cause, cardiovascular complications or cardiovascular disease in patients without underlying cardiovascular disease?
Machine learning and data mining techniques have been used in numerous real-world applications. Traditional machine learning methodologies assume that the training data and testing data are taken from the same domain, such that the input feature space and data distribution characteristics are the same. However, in some real-world machine learning scenarios, this assumption does not hold. There are cases where training data is expensive or difficult to collect. Therefore, there is a need to create high-performance learners trained with more easily obtained data from different domains. This methodology is referred to as transfer learning. This survey paper formally defines transfer learning, presents information on current solutions, and reviews applications of transfer learning. Lastly, it lists software downloads for various transfer learning solutions and discusses possible future research work. The transfer learning solutions surveyed are independent of data size and can be applied to big data environments.
This section presents surveyed papers covering homogeneous transfer learning solutions and is divided into subsections that correspond to the transfer categories of instance-based, feature-based (both asymmetric and symmetric), parameter-based, and relational-based. Recall that homogeneous transfer learning is the case where \(\mathcal{X}_{\mathcal{S}} = \mathcal{X}_{\mathcal{T}}\). The algorithms surveyed are summarized in Table 2.
The paper by Yao [138] first presents an instance-based transfer learning approach followed by a separate parameter-based transfer learning approach. In the transfer learning process, if the source and target domains are not related enough, negative transfer can occur. Since it is difficult to measure the relatedness between any particular source and target domain, Yao [138] proposes to transfer knowledge from multiple source domains using a boosting method in an attempt to minimize the effects of negative transfer from a single unrelated source domain. The boosting process requires some amount of labeled target data. Yao [138] effectively extends the work of Dai [21] (TrAdaBoost) by expanding the transfer boosting algorithm to multiple source domains. In the TrAdaBoost algorithm, during every boosting iteration, a so-called weak classifier is built using weighted instance data from the previous iteration. Then, the misclassified source instances are lowered in importance and the misclassified target instances are raised in importance. In the multi-source TrAdaBoost algorithm (called MsTrAdaBoost), each iteration step first finds a weak classifier for each source and target combination, and then the final weak classifier is selected for that iteration by finding the one that minimizes the target classification error. The instance reweighting step remains the same as in the TrAdaBoost. An alternative multi-source boosting method (TaskTrAdaBoost) is proposed that transfers internal learner parameter information from the source to the target. The TaskTrAdaBoost algorithm first finds candidate weak classifiers from each individual source by performing an AdaBoost process on each source domain. Then an AdaBoost process is performed on the labeled target data, and at every boosting iteration, the weak classifier used is selected from the candidate weak source classifiers (found in the previous step) that has the lowest classification error using the labeled target data. 
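The reweighting and selection steps described above can be illustrated with a minimal sketch. It assumes the commonly cited TrAdaBoost update, in which a fixed factor shrinks the weights of misclassified source instances and an error-dependent factor grows the weights of misclassified target instances. The function names `tradaboost_reweight` and `select_candidate` are hypothetical and do not come from the surveyed papers, and the weak learners themselves are left out: the code only consumes their predictions.

```python
import numpy as np

def tradaboost_reweight(weights, is_source, misclassified, n_rounds):
    """One TrAdaBoost-style weight update (illustrative sketch).

    weights       : current instance weights (source and target together)
    is_source     : boolean mask, True for source instances (at least one)
    misclassified : 1 where the weak classifier was wrong, else 0
    n_rounds      : total number of boosting iterations planned
    """
    weights = np.asarray(weights, dtype=float)
    is_source = np.asarray(is_source, dtype=bool)
    miss = np.asarray(misclassified, dtype=float)
    is_target = ~is_source

    # Weighted error is measured on the labeled target instances only.
    eps = np.sum(weights[is_target] * miss[is_target]) / np.sum(weights[is_target])
    eps = np.clip(eps, 1e-10, 0.499)  # keep the error strictly below 1/2

    beta_target = eps / (1.0 - eps)
    n_source = int(is_source.sum())
    beta_source = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_source) / n_rounds))

    new_weights = weights.copy()
    # Misclassified source instances are lowered in importance.
    new_weights[is_source] *= beta_source ** miss[is_source]
    # Misclassified target instances are raised in importance.
    new_weights[is_target] *= beta_target ** (-miss[is_target])
    return new_weights / new_weights.sum()

def select_candidate(candidate_preds, y_target, target_weights):
    """Pick the candidate weak classifier with the lowest weighted
    error on the labeled target data (the selection step used in the
    multi-source variants). Returns (index, error)."""
    w = np.asarray(target_weights, dtype=float)
    w = w / w.sum()
    y = np.asarray(y_target)
    errors = [float(np.sum(w * (np.asarray(p) != y))) for p in candidate_preds]
    best = int(np.argmin(errors))
    return best, errors[best]
```

In a multi-source loop, `select_candidate` would be called once per boosting iteration with the predictions of one weak classifier per source, and `tradaboost_reweight` would then be applied using the selected classifier's mistakes.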
Experiments are performed for the application of object category recognition where the area under the curve (AUC) is measured as the performance metric. An AdaBoost baseline approach using only the limited labeled target data is measured along with a TrAdaBoost approach using a single source (the multiple sources are combined into one) and the limited labeled target data. Linear SVM learners are used as the base classifiers in all approaches. Both the MsTrAdaBoost and TaskTrAdaBoost approaches outperform the baseline approach and the TrAdaBoost approach. MsTrAdaBoost and TaskTrAdaBoost demonstrate similar performance.
The algorithm by Wang [121], referred to as the domain adaptation manifold alignment (DAMA) algorithm, proposes using a manifold alignment [45] process to perform a symmetric transformation of the domain input spaces. In this solution, there are multiple labeled source domains and a limited labeled target domain for a total of K domains where all K domains share the same output label space. The approach is to create a separate mapping function for each domain to transform the heterogeneous input space to a common latent input space while preserving the underlying structure of each domain. Each domain is modeled as a manifold. To create the latent input space, a larger matrix model is created that represents and captures the joint manifold union of all input domains. In this manifold model, each domain is represented by a Laplacian matrix that captures the closeness to other instances sharing the same label. The instances with the same labels are forced to be neighbors while separating the instances with different labels. A dimensionality reduction step is performed through a generalized eigenvalue decomposition process to eliminate feature redundancy. The final learner is built in two stages. The first stage is a linear regression model trained on the source data using the latent feature space. The second stage is also a linear regression model that is summed with the first stage. The second stage uses a manifold regularization [4] process to ensure the prediction error is minimized when using the labeled target data. The first stage is trained only using the source data and the second stage compensates for the domain differences caused by the first stage to achieve enhanced target predictions. The experiments are focused on the application of document text classification where classification accuracy is measured as the performance metric. 
The methods tested against include a canonical correlation analysis approach and a manifold regularization approach, which is considered the baseline method. The baseline method uses the limited labeled target domain data and does not use source domain information. The approach presented in this paper substantially outperforms the canonical correlation analysis and baseline approaches; however, the paper does not directly reference these approaches, so it is difficult to understand the significance of the test results. A unique aspect of this paper is the modeling of multiple source domains in a heterogeneous solution.
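The two-stage learner described for DAMA can be sketched in simplified form. The sketch assumes the instances have already been mapped into the common latent space, and it swaps the manifold regularization of the second stage for plain ridge regularization, so it illustrates only the idea of a source-trained model plus a target-fitted correction, not the DAMA method itself. The names `fit_two_stage` and `predict_two_stage` and the `ridge` parameter are hypothetical.

```python
import numpy as np

def fit_two_stage(Z_src, y_src, Z_tgt, y_tgt, ridge=1.0):
    """Fit a source model, then a correction on the target residuals.

    Z_src, Z_tgt : latent-space features (rows are instances)
    y_src, y_tgt : regression targets (e.g. numerically encoded labels)
    ridge        : regularization strength for the second stage
                   (an assumption standing in for manifold regularization)
    """
    # Stage 1: ordinary least squares using the source data only.
    w1, *_ = np.linalg.lstsq(Z_src, y_src, rcond=None)

    # Stage 2: ridge regression on what stage 1 gets wrong on the
    # limited labeled target data.
    residual = y_tgt - Z_tgt @ w1
    d = Z_tgt.shape[1]
    w2 = np.linalg.solve(Z_tgt.T @ Z_tgt + ridge * np.eye(d), Z_tgt.T @ residual)
    return w1, w2

def predict_two_stage(Z, w1, w2):
    """The final prediction is the sum of the two stages."""
    return Z @ w1 + Z @ w2
```

Because both stages here are linear in the same features, the sum collapses to a single linear model. The split still shows how a small amount of labeled target data can compensate for the domain difference left over by the source-only stage.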
The subject of transfer learning is a well-researched area, as evidenced by more than 700 academic papers addressing the topic in the last 5 years. This survey paper presents solutions from the literature representing current trends in transfer learning. Homogeneous transfer learning papers are surveyed that demonstrate instance-based, feature-based, parameter-based, and relational-based information transfer techniques. Solutions having various requirements for labeled and unlabeled data are also presented as a key attribute. The relatively new area of heterogeneous transfer learning is surveyed, showing that the two dominant approaches for domain adaptation are asymmetric and symmetric transformations. Many real-world applications of transfer learning are listed and discussed in this survey paper. In some cases, the proposed transfer learning solutions are very specific to the underlying application and cannot be generically used for other applications. A list of software downloads implementing a portion of the solutions surveyed is presented in the appendix of this paper. Having software available from previous solutions is a great benefit to researchers, because experiments can be performed more efficiently and more reliably. A single open-source software repository for published transfer learning solutions would be a great asset to the research community.