Text Mining Approach to Analyse the Relation between Obesity and Breast Cancer Data

Abstract:

Biomedical research needs to exploit the large amount of information reported in scientific publications. Literature data collected from publications must be managed so that information can be extracted and transformed into an understandable structure using text mining approaches. Text mining refers to the process of deriving high-quality information from text by finding relationships between entities that do not show direct associations. As an example of this approach, we present the link between two diseases, breast cancer and obesity. Obesity is known to be associated with cancer mortality, but little is known about the link between lifetime changes in the BMI of obese persons and cancer mortality in both males and females. In this article, literature data for obesity and breast cancer were obtained from the PubMed database. We then applied methodologies that use groups of common genes and keywords, together with their frequency of occurrence in the data, to establish a relation between obesity and breast cancer, which we visualized using pie charts and bar graphs. From the data analysis, we obtained one gene that links the two diseases and validated it using statistical analysis and the DiseaseConnect web server. We also propose eight common high-frequency keywords that could be used for indexing when searching the literature for obesity and breast cancer in combination.
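The idea of common genes and keywords ranked by frequency of occurrence can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the toy abstracts, the gene names (`GENEA`, `GENEB`, `GENEC`), the stopword list, and the function names are all hypothetical placeholders, and a real analysis would start from abstracts retrieved from PubMed.

```python
import re
from collections import Counter

# Hypothetical toy corpora standing in for PubMed abstracts.
OBESITY_ABSTRACTS = [
    "GENEA expression and inflammation in obesity.",
    "Obesity, inflammation and GENEB levels in adipose tissue.",
]
BREAST_CANCER_ABSTRACTS = [
    "GENEA promotes proliferation of breast cancer cells.",
    "GENEC mutation, inflammation and breast cancer risk.",
]

# Hypothetical gene lexicon and stopword list (assumptions for illustration).
GENE_SYMBOLS = {"genea", "geneb", "genec"}
STOPWORDS = {"and", "in", "of", "the", "a"}


def term_frequencies(abstracts):
    """Count how often each non-stopword token occurs across the abstracts."""
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts


def common_terms(freq_a, freq_b):
    """Return terms present in both corpora, sorted by combined frequency."""
    shared = {t: freq_a[t] + freq_b[t] for t in freq_a.keys() & freq_b.keys()}
    return sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))


def common_genes(freq_a, freq_b, gene_symbols):
    """Return the gene symbols mentioned in both corpora."""
    return sorted(freq_a.keys() & freq_b.keys() & gene_symbols)


if __name__ == "__main__":
    obesity_freq = term_frequencies(OBESITY_ABSTRACTS)
    cancer_freq = term_frequencies(BREAST_CANCER_ABSTRACTS)
    print(common_terms(obesity_freq, cancer_freq))
    print(common_genes(obesity_freq, cancer_freq, GENE_SYMBOLS))
```

The combined counts returned by `common_terms` are the kind of values that could then be plotted as a bar graph or pie chart, and the intersection returned by `common_genes` plays the role of the candidate linking genes.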

Pages: 1-9

Online since: July 2015
