Summary
Apr 2006, Vol. 7, No. 3, Pages 455-465, DOI 10.2217/14622416.7.3.455
Collaborative Study: chronic fatigue syndrome – Research Report
Linear data mining the Wichita clinical matrix suggests sleep and allostatic load involvement in chronic fatigue syndrome
Brian M Gurbaxani 1†, James F Jones 1, Benjamin N Goertzel 2 & Elizabeth M Maloney 1
1 Centers for Disease Control and Prevention, 600 Clifton Road, MS A-15, Atlanta, GA 30333, USA. buw8@cdc.gov
† Author for correspondence
Objectives: To provide a mathematical introduction to the Wichita (KS, USA) clinical dataset, which comprises all of the nongenetic data (no microarray or single nucleotide polymorphism data) from the 2-day clinical evaluation, and to show the preliminary findings and limitations of popular, matrix algebra-based data mining techniques.
Methods: An initial matrix of 440 variables by 227 human subjects was reduced to 183 variables by 164 subjects. Variables were excluded if they strongly correlated with chronic fatigue syndrome (CFS) case classification by design (for example, the multidimensional fatigue inventory [MFI] data), were otherwise self-reported and also tended to correlate strongly with CFS classification, or were sparse or nonvarying between case and control. Subjects were excluded if they did not clearly fall into well-defined CFS classifications, had comorbid depression with melancholic features, or had other medical or psychiatric exclusions. Two popular data mining techniques, principal component analysis (PCA) and linear discriminant analysis (LDA), were used to determine how well the data separated into groups. Two different feature selection methods helped identify the most discriminating parameters.
Results: Although purely biological features (variables) were found to separate CFS cases from controls, including many allostatic load and sleep-related variables, most parameters were not statistically significant individually. However, biological correlates of CFS, such as heart rate and heart rate variability, require further investigation.
Conclusions: Feature selection of a limited number of variables from the purely biological dataset produced better separation between groups than a PCA of the entire dataset. Feature selection highlighted the importance of many of the allostatic load variables studied in more detail by Maloney and colleagues in this issue [1], as well as some sleep-related variables. Nonetheless, matrix linear algebra-based data mining approaches appeared to be of limited utility when compared with more sophisticated nonlinear analyses on richer data types, such as those found in Maloney and colleagues [1] and Goertzel and colleagues [2] in this issue.
Cited by
Judith K. Sluiter, Alida M. Guijt, Monique H. Frings-Dresen. (2009) Reproducibility and validity of heart rate variability and respiration rate measurements in participants with prolonged fatigue complaints. International Archives of Occupational and Environmental Health 82:5, 623-630. Online publication date: 1-May-2009.
Benjamin N Goertzel, Cassio Pennachin, Lucio de Souza Coelho, Brian Gurbaxani, Elizabeth M Maloney, James F Jones. (2006) Combinations of single nucleotide polymorphisms in neuroendocrine effector and receptor genes predict chronic fatigue syndrome. Pharmacogenomics 7:3, 475-483. Online publication date: 1-Apr-2006.
Benjamin N Goertzel, Cassio Pennachin, Lucio de Souza Coelho, Elizabeth M Maloney, James F Jones, Brian Gurbaxani. (2006) Allostatic load is associated with symptoms in chronic fatigue syndrome patients. Pharmacogenomics 7:3, 485-494. Online publication date: 1-Apr-2006.