Cancer is a dangerous disease caused by abnormal cell division and
uncontrolled, exponential growth of cells. Cancer cells usually behave
differently from normal cells and can spread to other parts of the
body. This spreading of cancer cells to other parts of the body is called
metastasis [1]. Cancer arises from the conversion of normal cells into cancerous
cells in a multistage process that generally progresses from pre-cancerous
cells to a malignant tumor.
Cancer is the second-leading cause of death worldwide, and approximately
9.6 million people die from it every year according to the Union for International
Cancer Control (UICC), Switzerland
(https://www.worldcancerday.org/what-cancer). Early classification of cancer
subtypes is of great importance for providing a better diagnosis to patients.
Therefore, predicting cancer subtypes (classes) at an initial stage has become
a vital area of research for researchers and scientists in machine learning
and medical science worldwide. There exist different clinical approaches to
the diagnosis of cancer, which are described.
Apart from the clinical approaches to predicting cancer, computational biologists
suggest complementary and relatively inexpensive solutions for cancer prediction
and primary (early) diagnosis that apply modern technologies such as machine
learning [3] and soft computing [4] to microarray gene expression
data [5]. Machine learning [3] provides a set of computer models that
automatically learn from data and experience, whereas soft computing [6] is
a collection of methodologies that exploit the tolerance for imprecision and
uncertainty to achieve tractability, robustness, and low solution cost. Microarray
technology [5] records the expression levels of thousands of genes simultaneously.
The number of genes present in microarray data is normally very large compared
to the number of samples [7], and clinically labeled samples are very few.
Moreover, the cancer subtypes present in microarray gene expression data are
often vague, indiscernible, ambiguous, and overlapping in nature [8]. Therefore,
it is important to construct robust classifiers for this complex (vague,
indiscernible, ambiguous) scenario that achieve high accuracy in classifying
cancerous samples [9] in the presence of limited training samples. Detailed
descriptions of machine learning, soft computing, and microarray technology
are provided.
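
To make the "many genes, few labeled samples" setting concrete, the following
is a minimal illustrative sketch, not a method taken from this work. It assumes
purely synthetic data (Gaussian noise with a small block of shifted genes for
one class) and uses a simple nearest-centroid classifier, chosen here only for
brevity, evaluated by leave-one-out validation. All function names, sizes, and
parameters are hypothetical.

```python
import random


def make_synthetic_expression(n_samples=20, n_genes=500, n_informative=10,
                              shift=1.5, seed=0):
    """Build a synthetic samples-by-genes matrix with far more genes than samples.

    Assumption: only the first `n_informative` genes differ between the two
    classes; every other gene is pure noise.
    """
    rng = random.Random(seed)
    labels = [i % 2 for i in range(n_samples)]  # two balanced classes: 0 and 1
    rows = []
    for label in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if label == 1:
            for g in range(n_informative):
                row[g] += shift
        rows.append(row)
    return rows, labels


def centroid(rows):
    """Per-gene mean over a list of samples."""
    n = len(rows)
    return [sum(column) / n for column in zip(*rows)]


def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_centroid_predict(train_rows, train_labels, sample):
    """Assign `sample` to the class whose training centroid is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_labels)):
        members = [r for r, l in zip(train_rows, train_labels) if l == label]
        dist = squared_distance(sample, centroid(members))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(rows, labels):
    """Hold out each sample once, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nearest_centroid_predict(train_rows, train_labels, rows[i]) == labels[i]:
            correct += 1
    return correct / len(rows)


expression, subtype = make_synthetic_expression()
accuracy = leave_one_out_accuracy(expression, subtype)
print(f"{len(expression)} samples x {len(expression[0])} genes, "
      f"leave-one-out accuracy = {accuracy:.2f}")
```

Leave-one-out validation is used in this sketch because, with so few labeled
samples, setting aside a separate test set would leave too little data for
training. Real microarray data would additionally exhibit the overlapping and
ambiguous class structure discussed above, which this synthetic example does
not attempt to model.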