Lung cancer: Lung cancer, also known as lung carcinoma, is a malignant tumor that begins in the lung. It results from genetic damage to the DNA of cells in the airways, often caused by cigarette smoking or inhaling damaging chemicals. Damaged airway cells gain the ability to multiply unchecked, causing the growth of a tumor. Without treatment, tumors spread throughout the lung, damaging lung function. Eventually lung tumors metastasize, spreading to other parts of the body.
Lung nodule: A lung nodule or pulmonary nodule is a relatively small focal density in the lung. A solitary pulmonary nodule (SPN), or coin lesion, is a mass in the lung smaller than three centimeters in diameter. A pulmonary micronodule has a diameter of less than three millimeters. There may also be multiple nodules. One or more lung nodules can be an incidental finding, seen in up to 0.2% of chest X-rays and around 1% of CT scans.
Survival analysis: Survival analysis is a branch of statistics for analyzing the expected duration of time until one event occurs, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology.
Thyroid nodule: Thyroid nodules are nodules (raised areas of tissue or fluid) which commonly arise within an otherwise normal thyroid gland. They may be hyperplastic or tumorous, but only a small percentage of thyroid tumors are malignant. Small, asymptomatic nodules are common, and often go unnoticed. Nodules that grow larger or produce symptoms may eventually need medical care. A goitre may have one nodule (uninodular), multiple nodules (multinodular), or be diffuse.
Training, validation, and test data sets: In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.
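The three-way division described above can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the function name `split_dataset`, the default fractions, and the fixed seed are all assumptions made for the example.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle items and split them into (train, validation, test) lists.

    The fractions and seed are illustrative defaults (assumptions),
    not values taken from the text above.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty share for the test set")

    shuffled = list(items)
    # A private Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, validation, test


if __name__ == "__main__":
    train, validation, test = split_dataset(range(100), 0.5, 0.25)
    print(len(train), len(validation), len(test))
```

The training set is used to fit the model, the validation set to compare candidate models or tune settings, and the test set is held back for a final performance estimate.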
Survival function: The survival function is a function that gives the probability that a patient, device, or other object of interest will survive past a certain time. The survival function is also known as the survivor function or reliability function. The term reliability function is common in engineering while the term survival function is used in a broader range of applications, including human mortality. The survival function is the complementary cumulative distribution function of the lifetime.
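As a rough illustration of the complementary relationship, writing T for the lifetime and F for its cumulative distribution function, the survival function is S(t) = P(T > t) = 1 - F(t). The sketch below estimates both from a sample of fully observed lifetimes. It assumes no censoring (every lifetime is known exactly), and the function names are hypothetical.

```python
def empirical_cdf(lifetimes, t):
    """Fraction of observed lifetimes that are less than or equal to t."""
    if not lifetimes:
        raise ValueError("need at least one observed lifetime")
    return sum(1 for x in lifetimes if x <= t) / len(lifetimes)


def empirical_survival(lifetimes, t):
    """Fraction of observed lifetimes strictly greater than t.

    Assumes every lifetime is fully observed (no censoring); censored
    data would need a dedicated estimator instead.
    """
    if not lifetimes:
        raise ValueError("need at least one observed lifetime")
    return sum(1 for x in lifetimes if x > t) / len(lifetimes)


if __name__ == "__main__":
    lifetimes = [2, 4, 4, 7, 10]  # made-up example data
    for t in (0, 4, 10):
        s = empirical_survival(lifetimes, t)
        f = empirical_cdf(lifetimes, t)
        print(t, s, f, s + f)  # s + f is 1 up to floating-point rounding
```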
Adenocarcinoma of the lung: Adenocarcinoma of the lung is the most common type of lung cancer, and like other forms of lung cancer, it is characterized by distinct cellular and molecular features. It is classified as one of several non-small cell lung cancers (NSCLC), to distinguish it from small cell lung cancer, which has a different behavior and prognosis. Lung adenocarcinoma is further classified into several subtypes and variants. The signs and symptoms of this specific type of lung cancer are similar to those of other forms of lung cancer, and patients most commonly complain of persistent cough and shortness of breath.
Malignancy: Malignancy is the tendency of a medical condition to become progressively worse; the term is most familiar as a characterization of cancer. A malignant tumor contrasts with a non-cancerous benign tumor in that a malignancy is not self-limited in its growth, is capable of invading into adjacent tissues, and may be capable of spreading to distant tissues. A benign tumor has none of those properties, but may be harmful to health. The term benign in more general medical use characterises a condition or growth that is not cancerous.
Homogeneity and heterogeneity: Homogeneity and heterogeneity are concepts relating to the uniformity of a substance, process or image. A homogeneous feature is uniform in composition or character (e.g. color, shape, size, weight, height, distribution, texture, language, income, disease, temperature, radioactivity, or architectural design); one that is heterogeneous is distinctly nonuniform in at least one of these qualities.
Cross-validation (statistics): Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set. Cross-validation is a resampling method that uses different portions of the data to test and train a model on different iterations. It is mainly used in settings where the goal is prediction, and one wants to estimate how accurately a predictive model will perform in practice.
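One common variant, k-fold cross-validation, can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, the folds are contiguous and unshuffled, and the "model" is a deliberately trivial predictor that always outputs the mean of its training values.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n).

    Each index appears in exactly one test fold. Fold sizes differ by at
    most one when n is not divisible by k.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, test_idx
        start += size


def cross_validate_mean_predictor(values, k):
    """Average held-out squared error of a predictor that outputs the training mean.

    The mean predictor stands in for any model; it is an assumption made
    to keep the example self-contained.
    """
    fold_errors = []
    for train_idx, test_idx in k_fold_indices(len(values), k):
        train_mean = sum(values[i] for i in train_idx) / len(train_idx)
        mse = sum((values[i] - train_mean) ** 2 for i in test_idx) / len(test_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / len(fold_errors)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up example data
    print(cross_validate_mean_predictor(data, k=3))
```

Because each fold is scored by a model that never saw it during fitting, the averaged error approximates performance on independent data. In practice the data are usually shuffled before folding.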