Abstract | Advances in detecting circulating tumor cells (CTCs) and cell-free DNA (cfDNA) in blood, serum, and plasma have made non-invasive cancer diagnosis promising. A few studies have reported good correlations between signals from tumor tissues and those from CTCs or cfDNA, making it possible to detect cancers using CTCs and cfDNA. However, such detection cannot determine which type of cancer a person has. To address this challenge, we developed an algorithm, eTumorType, to identify cancer types based on copy number variations (CNVs) of the cancer founding clone. eTumorType integrates cancer hallmark concepts with several computational techniques, including stochastic gradient boosting, voting, centroid, and leading patterns. eTumorType was trained and validated on a large dataset comprising 18 common cancer types and 5327 tumor samples. It achieved high accuracies (0.86–0.96) and high recall rates (0.79–0.92) for predicting colon, brain, prostate, and kidney cancers. In addition, it achieved relatively high accuracies (0.78–0.92) and recall rates (0.58–0.95) for predicting ovarian, breast luminal, lung, endometrial, stomach, head and neck, leukemia, and skin cancers. These results suggest that eTumorType could be used in non-invasive diagnosis to determine cancer types based on the CNVs of CTCs and cfDNA. |
---|
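
The abstract reports accuracy and recall separately for each cancer type. As a minimal sketch of how the two can diverge, the snippet below computes both for one cancer type treated as the positive class against all others. This one-vs-rest reading is an assumption, since the abstract does not state how the per-type figures were derived. The function name `one_vs_rest_metrics` and the toy labels are hypothetical and are not taken from eTumorType or its dataset.

```python
def one_vs_rest_metrics(true_labels, predicted_labels, target):
    """Return (accuracy, recall) with `target` treated as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target and pred == target:
            tp += 1  # correctly called as the target type
        elif truth == target:
            fn += 1  # target sample assigned to another type
        elif pred == target:
            fp += 1  # other sample wrongly called as the target type
        else:
            tn += 1  # other sample correctly not called as the target type
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Made-up toy labels for illustration only (not real data).
truth = ["colon", "colon", "colon", "brain", "brain",
         "kidney", "kidney", "skin", "skin", "skin"]
pred = ["colon", "colon", "brain", "brain", "brain",
        "kidney", "skin", "skin", "skin", "colon"]

for cancer_type in ("colon", "skin"):
    acc, rec = one_vs_rest_metrics(truth, pred, cancer_type)
    print(f"{cancer_type}: accuracy={acc:.2f}, recall={rec:.2f}")
```

Under this reading, true negatives from the other cancer types count toward accuracy but not toward recall. With many classes, accuracy for a single type can therefore stay high even when a sizeable share of that type's samples is missed.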