General Neurology

Updated 07.08.2021
Released 07.13.1999
Expires For CME 07.08.2024

Statistics for neurologists

Author: K K Jain MD†

Introduction

Overview

This article describes the role of statistics in the practice of medicine and, particularly, in neurology. Most of the applications are in clinical trials. Statistical methods are important for evaluating results of diagnostic tests and epidemiology of disease. They are also important in evaluating results of clinical trials, making diagnoses, and choosing appropriate treatment. Logistic regression can be used for outcome studies and estimation of risk factors as predictors of disease. Bayes’ theorem has been applied to determine if a patent foramen ovale is incidental or causal in patients with cryptogenic stroke.

Key points

	• Statistics has a broad application in medicine, including neurology.
	• Statistical issues are important in evaluating results of clinical trials, making diagnoses, and choosing appropriate treatment.
	• Bayesian analysis, a commonly used approach, begins with the observed differences between treatment A and treatment B and then asks how probable it is that treatment A is superior to treatment B.

Historical note and terminology

"Statistics" is defined as methodology for learning from experience, usually in the form of numbers derived from several separate measurements with individual variations. The scope of application in medicine is broad. Practicing physicians, including neurologists, encounter statistics in audits, resource allocations, publications, and hospital-utility data. Two basic terms used in describing disease epidemiology are “incidence” and “prevalence.” Incidence is the rate of new cases of the disease occurring within a period, eg, per month or per year, and should not be confused with prevalence, which is the proportion of cases in the population at a given time. Incidence conveys information about the risk of contracting the disease, whereas prevalence indicates how widespread the disease is. Consideration of incidence of a disease may be important for making decisions about recommending prophylactic therapy. Inadequate understanding of basic statistics may lead to errors, and an excellent example was given in article on straight and crooked thinking in medicine as follows (05):

An investigator published an article showing that of 200 epileptic subjects, 24% had had infantile convulsions in the first two years of life, whereas of 200 normal subjects only 2% had had infantile convulsions. He went on to argue that the pronounced difference between the 24% in the epileptics and in the 2% in the control group made it clear that convulsions within the first two years of life ought to be taken as a manifestation of epilepsy and that any child having convulsions in infancy should be treated with anticonvulsant drugs for several years. The argument appears most convincing, doesn't it? The fallacy is not at all obvious. The incidence of epilepsy in the population has been left out. Now, this is about 1 in 400. So among 40,000 people there would be 100 epileptics, 24 of whom had infantile convulsions. But among 40,000 people there would be 800 normal people (2% of 40,000) who had suffered from infantile convulsions. So it would mean treating 800 people, of whom only 24 had epilepsy, in other words, submitting 32 normal children to prolonged anticonvulsant therapy in order to make sure of treating one epileptic early in life.
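The arithmetic of this example can be checked with a short Python sketch. The inputs (a frequency of epilepsy of 1 in 400, and infantile convulsions in 24% of epileptic and 2% of normal subjects) come from the passage above; the function name is illustrative, and counting normal subjects as the population minus the epileptics is an assumption, so the totals differ slightly from the rounded figures in the quotation.

```python
def treated_counts(population, prevalence, p_conv_epileptic, p_conv_normal):
    """Count who would be treated if every child with infantile convulsions
    received anticonvulsant drugs."""
    epileptics = population * prevalence
    normals = population - epileptics
    true_pos = epileptics * p_conv_epileptic   # epileptics with infantile convulsions
    false_pos = normals * p_conv_normal        # normal subjects with infantile convulsions
    # Probability of epilepsy given a history of infantile convulsions
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv

true_pos, false_pos, ppv = treated_counts(40_000, 1 / 400, 0.24, 0.02)
print(f"epileptics treated: {true_pos:.0f}")
print(f"normal children treated: {false_pos:.0f}")
print(f"normal children treated per epileptic: {false_pos / true_pos:.1f}")
print(f"probability of epilepsy given infantile convulsions: {ppv:.3f}")
```

With these inputs, roughly 33 normal children would be treated for each epileptic child, which is in line with the approximate figure quoted above.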

The most useful statistical procedure for assessing diagnostic tests is based on a theorem named after Reverend Thomas Bayes, whose work on it was published posthumously in 1763 (06). The ideas incorporated in this theorem are familiar to all clinicians making clinical diagnoses. The likelihood of a disease being present depends not only on the signs and symptoms, but also on the frequency of the disease in the community. The latter probability is termed the "prior probability." Application of statistics in clinical trials did not start until the mid-twentieth century (20). Statistical issues are now important in evaluating results of clinical trials, making diagnoses, and choosing appropriate treatment. For those involved in clinical research, statistics cover all stages from planning and design of studies to data analysis and interpretation. This article will deal with a few important basics of statistics that are useful to neurologists in the interpretation of data from clinical trials and for evaluation of diagnostic procedures. Statistical programs can be installed on personal computers.

Description

	• Some of the important tools of statistics with applications in clinical studies in neurology are hazard ratio, hypothesis testing, confounding variables, P value, confidence interval, etc.
	• Bayesian analysis requires the establishment of a prior probability, eg, to show that treatment A is superior to treatment B after a trial has been conducted, it is necessary to specify the probability of treatment A's superiority based on evidence available before the trial.

Some of the concepts and techniques used to evaluate the results of clinical trials are as follows:

Hazard ratio. This represents the odds that an individual in the group with the higher hazard reaches the endpoint first. In a therapeutic trial examining time to disease resolution with a drug, it represents the odds that a treated patient will resolve symptoms before a control patient. In a trial to evaluate preventive effect, it describes the likelihood of progression of disease in the treatment group compared to the control group. For example, a hazard ratio of 0.66 in a clinical trial of use of statins or fibrates to lower serum cholesterol was associated with a one-third lower risk of stroke (01). With a hazard ratio of 1.12, no association was found between lipid-lowering drug use and coronary heart disease.
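Because the hazard ratio is described above as an odds, it can be converted to a probability in the usual way, odds / (1 + odds). The following Python sketch illustrates that conversion for the hazard ratio of 0.66 quoted above. It assumes proportional hazards, and the function names are illustrative rather than taken from any cited study.

```python
def prob_reaches_endpoint_first(hazard_ratio):
    """Probability that a subject from the numerator (eg, treated) group reaches
    the endpoint before a subject from the comparison group.
    Assumption: hazards are proportional over time."""
    return hazard_ratio / (1.0 + hazard_ratio)

def relative_hazard_reduction(hazard_ratio):
    """Relative reduction in hazard implied by a hazard ratio below 1."""
    return 1.0 - hazard_ratio

hr = 0.66
print(f"relative reduction in hazard: {relative_hazard_reduction(hr):.0%}")
print(f"probability treated subject has the event first: {prob_reaches_endpoint_first(hr):.3f}")
```

A hazard ratio of 1 corresponds to a probability of 0.5, ie, neither group is more likely to reach the endpoint first.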

Hypothesis testing. A statistical procedure can be designed to test and possibly reject a "null hypothesis," a term used to state that no difference exists between outcomes resulting from treatments that are being compared. The results of such comparisons are seldom identical; there will usually be some difference in the outcomes between the experimental and the control groups. If the difference is statistically significant, the null hypothesis is rejected. Chi-square, a statistical test commonly used to compare observed data with data we would expect to obtain according to a specific hypothesis, is used to test the null hypothesis.
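As an illustration of a chi-square test of the null hypothesis, the sketch below applies it to a 2 x 2 table. The counts are derived from the infantile convulsions example in the historical note (24% of 200 epileptic subjects and 2% of 200 normal subjects); the function name is illustrative, and the calculation is the plain Pearson chi-square without continuity correction, which is an assumption about the variant used.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in each cell if the null hypothesis were true
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability has a closed form
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Rows: epileptic, normal; columns: infantile convulsions yes, no
chi2, p_value = chi_square_2x2(48, 152, 4, 196)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2g}")
```

The very small p value shows that the difference between the groups is statistically significant; as the historical example makes clear, this says nothing about whether treating every child with convulsions would be sensible.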

Confounding variables. These are defined as the variables correlated (positively or negatively) with both the dependent variable and the independent variable so that the results do not reflect the actual relationship between the variables under study. For example, search for the causes of diseases in epidemiological studies is based on associations with various risk factors, but there may be other factors that are associated with the exposure and affect the risk of developing the disease that will distort the observed association between the disease and exposure under study. Various methods to modify a study design to exclude or control confounding variables, besides randomization, include restriction and matching based on age and sex.

Sample size. In a clinical trial, an important question is the number of subjects needed for a statistically significant result. There are several methods for sample size calculation, which are usually based on the statistics used in the analysis of the data. According to a statistical "rule of thumb," an important component of sample size calculations is the power associated with a certain statistical procedure (34). Switching to a different statistical procedure during the data analysis may alter the anticipated power. Moreover, the estimated treatment effect may no longer be meaningful on the scale of the new statistic.
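One widely used calculation, shown here as a sketch rather than as the method of reference 34, gives the number of subjects per group needed to compare 2 means using the normal approximation. The function name, the default alpha of 0.05, the default power of 0.80, and the example difference and standard deviation are all illustrative assumptions.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed in each of 2 equal groups to detect a difference in means
    of `delta`, given a common standard deviation `sd` (normal approximation,
    2-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for type 1 error
    z_beta = NormalDist().inv_cdf(power)            # value corresponding to power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Hypothetical trial: detect a 5-point difference when the SD is 10 points
print(n_per_group(delta=5, sd=10))               # 80% power
print(n_per_group(delta=5, sd=10, power=0.90))   # 90% power needs more subjects
```

The sketch makes the dependence on power explicit: raising the power from 80% to 90% with the same alpha and effect size increases the required sample size.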

Errors. Errors occur when one incorrectly evaluates the difference in outcomes between the placebo and the treatment groups. Type 1 errors are the erroneous conclusion of difference when in fact no difference exists. The probability of a type 1 error (alpha) is usually set at 0.05. Type 2 errors occur when a false conclusion is made that the 2 outcomes are not significantly different when they actually are; in other words, the null hypothesis is erroneously not rejected. The probability of a type 2 error (beta) decreases, and the statistical power (1-beta) increases, as the sample gets larger.

P values. A P value is the probability of obtaining results at least as extreme as those observed if the null hypothesis were true; it is not the probability that the results were obtained by chance, and it should be distinguished from the significance level (alpha), which is the threshold chosen in advance. P values are used to assess the degree of dissimilarity between 2 or more sets of measurements or between 1 set of measurements and a standard. P values measure the strength of the evidence against the null hypothesis. The smaller the p value, the stronger the evidence against the null hypothesis. When calculating the p value, one chooses a measure of the quantity of interest and a range of measurements that will be computed.

The p value is usually reported as smaller than 0.05 (p < 0.05) or smaller than 0.01 (p < 0.01). When the p value is between 0.05 and 0.01, the result is usually considered to be statistically significant; if the p value is less than 0.01, the result is often considered to be highly statistically significant. The advantage of p values is that they provide a specific, objectively chosen level for the investigator to keep in mind. Furthermore, it is simpler to determine if the p value is larger or smaller than 0.05 than to compute the exact probability. The main disadvantage is that p values suggest a rather meaningless cutoff point that is not relevant to the investigation. P values convey meaningful information only if they are put into a clinical context. A confidence interval can be used to indicate the clinical significance of a result. This is important because a small clinical difference may be statistically significant because of a large sample size, whereas a clinically important effect may appear statistically nonsignificant if the number of subjects studied is too small.

The P value remains controversial, and the idea that a single number can capture both the long-range outcomes of an experiment and the evidential meaning of a single result has been questioned. An arbitrary division of results into "significant" or "nonsignificant" according to the p value was not the intention of the founders of statistical inference. Some investigators suggest using the Bayesian approach instead because it enables the integration of background knowledge with statistical findings. Correlation of P values with Bayes factors suggests that 1 reason for lack of reproducibility of scientific studies is the conduct of significance tests at unjustifiably high levels of significance. To address this problem, evidence thresholds required for the declaration of a “significant” finding should be increased to 25–50:1, and to 100–200:1 for a highly significant finding, ie, conduct of tests at P values of 0.005 or 0.001 (22). In view of the prevalent misuses of and misconceptions concerning P values, some statisticians prefer to supplement or even replace P values with other approaches, including methods that emphasize estimation over testing, such as confidence intervals (38).

Confidence interval. A confidence interval is a range of values, calculated from the sample, that is likely to include the true value. The confidence interval covers a large proportion of the sampling distribution of the statistic of interest. The 95% confidence interval for a sample mean is the range of values from mean - 1.96 x standard error to mean + 1.96 x standard error. It is usually interpreted as a range of values that contains the true population mean with a probability of 0.95; more precisely, 95% of intervals constructed in this way from repeated samples would contain the true population mean. The true effect of the treatment may be greater or lesser than what is observed, and confidence intervals tell us how much greater or smaller the true effect is likely to be. Confidence intervals can be calculated for various measures of association. Confidence intervals are recommended for better communication of clinical trial results and for evaluation of diagnostic tests.
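The calculation of a 95% confidence interval for a mean can be sketched in Python as follows. The measurements are hypothetical, the function name is illustrative, and the standard error is taken as the sample standard deviation divided by the square root of the sample size. The multiplier 1.96 follows the text; for a sample as small as this one, a larger multiplier from the t distribution would normally be used.

```python
import math
import statistics

def mean_ci95(values):
    """Return the mean, SD, standard error, and 95% confidence interval of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, sd, se, (mean - 1.96 * se, mean + 1.96 * se)

scores = [10, 12, 14, 16, 18]       # hypothetical measurements
mean, sd, se, (low, high) = mean_ci95(scores)
print(f"mean {mean:.1f}, SD {sd:.2f}, SE {se:.2f}, 95% CI {low:.2f} to {high:.2f}")
```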

The t test. The t test, also called the "Student's t test," is used for measured variables when comparing 2 means. The t test, although valuable, does not give the probability that the 2 random samples have, in fact, come from the same population (14). The unpaired t test compares the means of 2 independent samples. The unpaired t test is parametric, and its nonparametric equivalent is the Mann-Whitney U test. The paired t test compares 2 paired observations on the same individuals or on matched individuals. Its nonparametric equivalent is the Wilcoxon matched pair test. The values for t are calculated as follows:

t (unpaired) =	(Difference between means) / (Standard error of difference)
t (paired) =	(Mean difference) / (Standard error of difference)
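Both ratios can be computed directly, as in the following Python sketch. The data are hypothetical and the function names are illustrative; for the unpaired test the standard error of the difference is derived from the pooled variance, which assumes similar variances in the 2 groups.

```python
import math
import statistics

def unpaired_t(x, y):
    """t for 2 independent samples: difference between means divided by the
    standard error of the difference (pooled-variance form)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * statistics.variance(x) +
                  (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    se_diff = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / se_diff

def paired_t(before, after):
    """t for paired observations: mean of the within-pair differences divided
    by the standard error of those differences."""
    diffs = [a - b for a, b in zip(after, before)]
    se_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se_diff

group_a = [1, 2, 3, 4, 5]   # hypothetical independent samples
group_b = [2, 3, 4, 5, 6]
before = [1, 2, 3, 4, 5]    # hypothetical paired measurements
after = [2, 4, 3, 6, 5]
print(f"unpaired t = {unpaired_t(group_a, group_b):.3f}")
print(f"paired t = {paired_t(before, after):.3f}")
```

The resulting t value is then compared with the t distribution, using nx + ny - 2 degrees of freedom for the unpaired test and the number of pairs minus 1 for the paired test.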

Analysis of variance (ANOVA) or multivariate analysis. This enables comparison among more than 2 sample means. A 1-way analysis of variance deals with a single categorical independent variable, whereas factorial analysis of variance can deal with multiple factors in several different configurations. Several statistical packages are available for ANOVA. Analysis of variance has an advantage over the t test when several drugs in different groups are being compared and numerous possible comparisons can be made. In this situation, the use of multiple t tests to do pairwise comparisons would be inappropriate, as each additional test raises the overall probability of a type 1 error and leads to the loss of any interpretable level of significance. A 1-way analysis of variance is presented as a table that includes the sums of squares between groups and sums of squares within groups. It also contains the value of the F ratio, which is equal to the mean square (between) divided by mean square (within). The larger the F ratio is, the more significant the results are. An extension of this approach called "factorial analysis of variance" enables the inclusion of any number of factors in a single experiment and looks at the independent effect of each factor without distorting the overall probability of chance difference.
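The entries of a 1-way table (sums of squares, mean squares, and the F ratio) can be computed as in the sketch below. The 3 groups of values are hypothetical and the function name is illustrative; obtaining a P value from F would additionally require the F distribution, which is not included here.

```python
def one_way_anova(groups):
    """Return the sums of squares, degrees of freedom, and F ratio for a
    1-way analysis of variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual values around their own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# Hypothetical outcome scores for 3 treatment groups
ss_b, ss_w, df_b, df_w, f_ratio = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"SS between = {ss_b:.1f} (df {df_b}), SS within = {ss_w:.1f} (df {df_w}), F = {f_ratio:.1f}")
```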

Standard deviation as a measure of variability. The standard deviation, usually depicted by the abbreviation SD, is a measure of variability. When the standard deviation of a sample is calculated, it is an estimate of the variability of the population from which the sample was drawn. It should not be confused with standard error, which depends on both the standard deviation and the sample size (the standard error of a mean is the standard deviation divided by the square root of the sample size). This forms the basis for calculation of sample size for a controlled trial. Although standard error falls as sample size increases, standard deviation tends not to change with the increase.

Measures of association. Measures of association refer to the use of statistical analysis to study the association between 2 variables within a group of subjects in a clinical trial. Correlation (with calculation of correlation coefficient) is the method used to study possible association between 2 continuous variables. Regression analysis enables the value of 1 variable to be predicted from any known value of the other variable. Traditionally linear regression methods have been preferred to nonlinear regression methods because of their inherent simplicity. With the widespread availability of computers and statistical software packages, nonlinear regression, an often more appropriate analysis, should be considered. Most computer programs for nonlinear regression analysis provide information necessary to perform all the calculations.
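For 2 continuous variables, the correlation coefficient and the least-squares regression line can be obtained from the same sums of squares, as this Python sketch shows. The paired values are hypothetical and the function name is illustrative; only the linear case is covered.

```python
import math

def correlation_and_line(x, y):
    """Pearson correlation coefficient plus the slope and intercept of the
    least-squares line that predicts y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept

dose = [1, 2, 3, 4, 5]        # hypothetical independent variable
response = [2, 4, 5, 4, 5]    # hypothetical dependent variable
r, slope, intercept = correlation_and_line(dose, response)
print(f"r = {r:.3f}; predicted response = {intercept:.1f} + {slope:.1f} x dose")
```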

Odds ratio. The odds of an event are the ratio of the probability that the event of interest occurs to the probability that it does not. This is usually estimated by the ratio of the number of times that the event of interest occurs to the number of times that it does not. The odds ratio is the ratio of the odds in one group to the odds in another and is useful as a measure of association. It was originally used in analysis of case-control epidemiological studies for measuring relative risk, but it is now applied to randomized trials and meta-analysis. Supposing the odds of occupational exposure to aluminum were 1.3 times as high in cases with dementia as in controls, the odds ratio is said to be 1.3. An odds ratio greater than 1 indicates a positive association between exposure and disease; an odds ratio equal to 1 indicates no association; and an odds ratio less than 1 indicates a negative association. However, association is not a proof of causation. The calculation to find the approximate 95% confidence interval for an odds ratio is not difficult. The use of odds ratios has increased in medical reports for the following reasons (08):

	• They provide an estimate (with confidence interval) of the relationship between 2 binary ("yes or no") variables.
	• They enable examination of the effects of other variables on that relationship, using logistic regression.
	• They provide a convenient interpretation in case-control studies.
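The approximate 95% confidence interval for an odds ratio can be sketched as follows, using the standard error of the log odds ratio (the square root of the sum of the reciprocals of the 4 cell counts). The case-control counts are hypothetical and the function name is illustrative; this is one common approximation, not necessarily the method of reference 08.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% confidence interval for a 2 x 2 table:
    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls."""
    odds_ratio = (a / b) / (c / d)            # odds of exposure in cases / in controls
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical study: 30 of 100 cases and 20 of 100 controls were exposed
odds_ratio, lower, upper = odds_ratio_ci(30, 70, 20, 80)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

In this hypothetical example the interval includes 1, so the association would not be statistically significant at the 5% level.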

Comparison of survival rates. Survival rates of 2 groups of patients treated by different methods in clinical trials may need to be compared. A traditional method is to plot a curve for each of the 2 groups, using time from the start of observation and the survival rate at various points in time. The limitation of this approach is that a difference between the survival curves is not sufficient for the investigator to conclude that the survival in 1 group is worse than in the other. It gives a comparison at some arbitrary point in time but does not provide a comparison of the total survival experience of the 2 groups. The Kaplan-Meier survival curve rests on the assumptions that censoring is unrelated to prognosis, that the survival probabilities are the same for subjects recruited early and late in the study, and that the events happen at the times specified (07).

The logrank test is used to test the null hypothesis that there is no difference between the groups in the probability of event (death) at any time point and the analysis is based on the times of deaths (09). For example, in a clinical trial of 50 patients, 1 patient in group I (20 patients) dies in week 3, so the risk of death in this week is 1/50. If the null hypothesis were true, the expected number of deaths in group I is 20 x 1/50 = 0.4. Similarly, in group II (30 patients) the expected number of deaths is 30 x 1/50 = 0.6. The same calculations are performed each time an event occurs. This way of handling censored observations is the same as for the Kaplan-Meier survival curve. The logrank test is most likely to detect a difference between groups when the risk of an event is consistently greater for 1 group than the other. It is unlikely to detect a difference when survival curves cross, a situation that may occur when comparing a medical with a surgical intervention.
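The expected numbers of deaths in the worked example can be reproduced with the Python sketch below. The first row (20 and 30 patients at risk, 1 death in group I) comes from the text; the function name and the input layout, one tuple per event time, are illustrative assumptions. The statistic shown is the simple approximate form that sums (observed - expected) squared over expected for the 2 groups.

```python
def logrank(rows):
    """Accumulate observed and expected deaths over all event times.
    Each row is (at_risk_1, at_risk_2, deaths_1, deaths_2) at one event time."""
    obs1 = obs2 = 0
    exp1 = exp2 = 0.0
    for at_risk_1, at_risk_2, deaths_1, deaths_2 in rows:
        # Risk of death at this time if the null hypothesis were true
        risk = (deaths_1 + deaths_2) / (at_risk_1 + at_risk_2)
        exp1 += at_risk_1 * risk
        exp2 += at_risk_2 * risk
        obs1 += deaths_1
        obs2 += deaths_2
    chi2 = (obs1 - exp1) ** 2 / exp1 + (obs2 - exp2) ** 2 / exp2
    return obs1, exp1, obs2, exp2, chi2

# Week 3 of the example: 1 death in group I (20 at risk), none in group II (30 at risk)
obs1, exp1, obs2, exp2, chi2 = logrank([(20, 30, 1, 0)])
print(f"expected deaths: group I {exp1:.1f}, group II {exp2:.1f}")
```

In a full analysis, one row would be supplied for every time at which a death occurs, with the numbers at risk reduced for earlier deaths and censored observations, and the statistic would be compared with the chi-square distribution with 1 degree of freedom.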

Logistic regression. This is usually carried out by professional statisticians and is playing an increasing role in clinical studies. It is relevant to studies when only 2 possible outcomes are of interest: recovery or death; improvement or no improvement; and presence or absence of side effects. Logistic regression can also be used to analyze the role of risk factors at the onset of a disease to test the independence of various parameters, such as discontinuation of antiepileptic medications and generalized background abnormalities on EEG as predictors of risk of status epilepticus.

Standard regression-based methods have been applied to statistically measure individual rates of impairment at several time points after concussion in college football players to decide when an athlete can return to competition (27). The data suggest that use of neuropsychological testing to detect subtle cognitive impairment is most useful once postconcussive symptoms have resolved.

Artificial neural networks are now being designed as an alternative to logistic regression to predict evolution of some events during a disease. In a retrospective study of a database of patients with confirmed aneurysmal subarachnoid hemorrhage, a simple artificial neural network model was more sensitive and specific than multiple logistic regression models for prediction of cerebral vasospasm (15).

Scientific inference from big data. Current research activity and collection of information is generating big data, ie, data sets whose heterogeneity, complexity, and size (measured in terabytes or petabytes) exceed the capability of traditional approaches to data processing, storage, and analysis. Analysis of big data is needed to identify complex patterns hidden inside volumes of data that could accelerate scientific discovery and development of beneficial technologies as well as products. For example, analysis of big data combined from a patient’s electronic health records, environmental exposure, activities, and genetic and proteomic information is expected to help guide the development of personalized medicine (28).

Bayesian statistical methods. In comparing results of treatment, Bayesian analysis begins with the observed differences between treatment A and treatment B and then asks how likely it is that treatment A is superior to treatment B. In other words, the Bayesian method induces the probability of the existence of the true, but as yet unknown, underlying state. However, the Bayesian analysis requires the establishment of a prior probability. To obtain a Bayesian probability that treatment A is superior to treatment B after a trial has been conducted, it is necessary to specify the probability of treatment A's superiority based on evidence available before the trial. The first application of Bayesian methodology was in diagnostic medicine. As applied to diagnosis, Bayes’ theorem can be stated as follows:

Prob(disease | test positive)

= [Prob(test positive | disease) x Prob(disease)] / Prob(test positive)

= [Prob(test positive | disease) x Prob(disease)] / [Prob(test positive | disease) x Prob(disease) + Prob(test positive | no disease) x Prob(no disease)]

Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modeling, longitudinal modeling, survival modeling, molecular genetics, and decision making in respect of new technologies.

Statistical evaluation of clinical measurements. Several clinical measurements are not precise and may vary according to the examiner and the technique used. Studies comparing 2 methods are common. The aim of these studies is to see if the results of the methods agree well enough for 1 method to replace the other. Another objective is to see if the results of 2 studies conducted by different observers using the same method agree with each other; this is termed "inter-rater agreement." Here, the problem is one of estimation, rather than any hypothesis testing. A simple approach to assessing agreement is to see how many exact agreements are observed. The weakness of this approach is that it does not take into account where the agreement is or what agreements would take place by chance. One measure of agreement, called kappa, has a value of 1.0 when the agreement is perfect, a value of 0 when agreement is no better than chance, and negative values when agreement is worse than chance.
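Kappa can be calculated from a table of paired ratings, as in the following Python sketch, which corrects the observed proportion of exact agreements for the agreement expected by chance. The counts for the 2 raters are hypothetical and the function name is illustrative.

```python
def cohen_kappa(table):
    """Kappa for a square table in which table[i][j] is the number of subjects
    placed in category i by rater 1 and in category j by rater 2."""
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n      # exact agreements
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of 50 patients as "abnormal" or "normal" by 2 examiners
ratings = [[20, 5],
           [10, 15]]
print(f"kappa = {cohen_kappa(ratings):.2f}")
```

Here the raters agree on 70% of patients, but half of that agreement would be expected by chance, so kappa is only 0.4.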

Statistical evaluation of diagnostic tests. Considerable clinical research is being done to evaluate and improve methods of diagnosis of disease. The terms "positive" and "negative" refer to the presence or absence of the condition of interest as confirmed by a definitive examination. An example is the polymerase chain reaction-based rapid test for the detection of enteroviral RNA in the cerebrospinal fluid, where the results are later verified by culture of the cerebrospinal fluid. In a study, 96.3% of the patients who tested positive with polymerase chain reaction were later positive with culture as well, whereas 99.0% of those that had a negative polymerase chain reaction test were shown to have a negative culture. The 2 terms most often used to describe the performance of a test are "sensitivity" and "specificity." Sensitivity is the proportion of positives correctly identified by the test, and specificity is the proportion of negatives correctly identified by the test.

Sensitivity and specificity do not tell us the probability of the test resulting in a correct diagnosis, whether it is positive or negative. For this we use positive and negative predictive values. Positive predictive value is the proportion of patients with positive test results who are correctly diagnosed; negative predictive value is the proportion of patients with negative test results who are correctly diagnosed. Positive and negative predictive values give a direct assessment of the usefulness of the test in practice. The 4 possibilities are as follows:

(a) True positive: The test is positive, and the disease is present.
(b) False positive: The test is positive, but the disease is absent.
(c) False negative: The test is negative, but the disease is present.
(d) True negative: The test is negative, and the disease is absent.

These quantities can be represented as follows:

• Sensitivity = a/(a + c)
• Specificity = d/(b + d)
• Positive predictive value (PPV) = a/(a + b)
• Negative predictive value (NPV) = d/(c + d)

Prevalence of the disease in the study can be calculated as (a + c)/n, where n = a + b + c + d is the total number of patients studied, if the study is carried out in a definable group of patients.
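The quantities defined above follow directly from the 4 counts, as the Python sketch below shows. The counts are hypothetical and the function name is illustrative.

```python
def diagnostic_metrics(a, b, c, d):
    """Test performance from the 4 counts:
    a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    n = a + b + c + d
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "ppv": a / (a + b),
        "npv": d / (c + d),
        "prevalence": (a + c) / n,
    }

# Hypothetical study of 300 patients, 100 of whom have the disease
metrics = diagnostic_metrics(a=90, b=20, c=10, d=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```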

With the above information, positive predictive value and negative predictive value can be calculated as follows:

PPV =	(sensitivity x prevalence) / [sensitivity x prevalence + (1 - specificity) x (1 - prevalence)]
NPV =	[specificity x (1 - prevalence)] / [(1 - sensitivity) x prevalence + specificity x (1 - prevalence)]

If sensitivity and specificity estimates are reported without a measure of precision, clinicians cannot know the range wherein the true values of the indices are likely to lie. Therefore, evaluations of diagnostic accuracy should be qualified with confidence intervals.
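A confidence interval for a sensitivity or specificity estimate can be sketched as follows. The text does not specify a method; the Wilson score interval used here is one common choice for proportions and is an assumption, as are the function name and the example counts.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Approximate 95% confidence interval for a proportion (Wilson score method)."""
    p = successes / total
    z2 = z * z
    denom = 1 + z2 / total
    center = (p + z2 / (2 * total)) / denom
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total ** 2)) / denom
    return center - margin, center + margin

# Hypothetical: the test detects 90 of 100 patients who have the disease
lower, upper = wilson_interval(90, 100)
print(f"sensitivity 0.90, 95% CI {lower:.3f} to {upper:.3f}")
```

With the same observed sensitivity, a smaller study would give a wider interval, which is why the estimate alone is not enough to judge a test.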

An example of application of the statistical evaluation is critical review of a publication claiming that MRI-assisted diagnosis of autistic spectrum disorder can be carried out with a sensitivity and specificity of up to 90% and 80%, respectively (16). A common misconception would be that the test is 90% (sensitivity) accurate. For autistic spectrum disorder, if the prevalence is 1 in 100, the positive predictive value would be less than 5%, ie, fewer than 5 in every 100 with a positive test would have autistic spectrum disorder. Of those who do not have the disease, 80% (specificity) will test negative, but the problem is with the 20% who do not have the disease and yet test positive. This approach would not be worthwhile for screening a population with low prevalence of autistic spectrum disorder.
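The figures in this example can be verified with a short Python sketch that derives the predictive values from sensitivity, specificity, and prevalence. The inputs (90%, 80%, and 1 in 100) come from the paragraph above; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values, expressed as proportions of the
    screened population falling into each of the 4 outcomes."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ppv, npv = predictive_values(sensitivity=0.90, specificity=0.80, prevalence=0.01)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

The positive predictive value is about 4.3%: the false positives among the 99% of people without the disorder far outnumber the true positives.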

Statistical significance. The term “statistical significance” has been used for decades in various publications. A proposal with support from other authors suggests retaining P values but abandoning ambiguous statements (significant/nonsignificant), suggests discussing “compatible” effect sizes, and points out that many effects are refuted on discovery or replication (03). There is considerable criticism of this proposal. Although the statistics of scientific work requires improvement, banning of statistical significance while retaining P values (or confidence intervals) will not improve the situation and may foster statistical confusion and create problematic issues with study interpretation. Therefore, the term “statistical significance” should be retained (21).

Clinical applications

	• Basic tools of statistics have applications in neurology, including neuroepidemiology, and evaluation of therapies in practice as well as in clinical trials.
	• Odds ratio is a useful tool in neurologic investigations as its 95% confidence interval estimates the likely degree of sampling error and provides a test of significance at the 5% level.
	• Bayesian statistics is used in causality assessment of drug-induced neurologic disorders, determining accuracy of diagnostic tests, and projecting outcome of diseases and their treatment.

Statistics has practical applications in epidemiology of neurologic disorders, evaluation of therapies in practice, and analysis of data from clinical trials, including those relevant to neurology. Some examples are given here.

Neuroepidemiology. Statistical approaches can be used to determine if there is a difference in the rate of a neurologic disease or disease characteristic among subgroups of patients. Measures of prevalence describe the proportion of population that has the disease at a specific point in time. Measures of incidence, on the other hand, describe the frequency of occurrence of new cases during a time period. Cumulative incidence (CI) is defined as the following fraction:

CI =	(Number of individuals who get a disease during a certain period) / (Number of individuals in the population at the beginning of the period)

Both the numerator and the denominator consist of individuals who are free from the disease at the beginning of the period. Cumulative incidence is the proportion of healthy individuals who get a disease during this period. Measures such as odds ratios and multivariate analysis are used in epidemiological studies.

An instrument, the Quality of Longitudinal AD Studies, was developed and tested to evaluate the quality of longitudinal statistical applications in published studies on Alzheimer disease (40). Item-specific inter-rater reliability coefficients were high, indicating the applications of this tool for reliable assessment of original publications on Alzheimer disease.

Odds ratio. This has been found to be a useful tool in neurologic investigations. Its 95% confidence interval estimates the likely degree of sampling error and provides a test of significance at the 5% level. Odds ratios have been computed to estimate the probability of concurrent diagnosis of Alzheimer disease with a given neuropsychological performance level. Odds ratio has been used to study the association between tau genotype H1H1 and frontotemporal dementia (36). The H1H1 genotype was significantly over-represented in patients with frontotemporal dementia compared with controls (62% vs. 46%; P = 0.01, OR = 1.95). After stratification according to apolipoprotein E genotype, a significant interaction was found between apolipoprotein E and tau genotypes (P = 0.03). Although this study confirms the primary role of tau in frontotemporal dementia, the mechanism that increases the risk of frontotemporal dementia remains unknown.

A prospective, multicenter, cohort study has compared dedicated stroke wards versus specialist stroke-team care at general hospital wards (29). The 28-day case-fatality rate was 12.6% with stroke-ward care versus 15.2% with stroke-team care for all patients (P = 0.002), and stroke-ward care also predicted better outcome when analyzed with a multivariate logistic regression model (odds ratio 1.701; confidence interval: 1.025-2.822).

Multivariate analysis. This method has been used for the analysis of intracerebral hemorrhage in the elderly. Distinctive clinical features of intracerebral hemorrhage in elderly people were assessed by multivariate analysis, which showed that female sex (OR = 3.2, 95% CI = 1.27 to 7.99) and moderate or severe neurologic deficit at hospital discharge (OR = 4.75, 95% CI = 1.36 to 16.55) were independent clinical factors associated with intracerebral hemorrhage in elderly people (04).

Multivariate analysis has been used for determining the association between baseline patient characteristics, history variables, and variables from the neurologic examination for prediction of intracranial metastases in cancer patients presenting with headache. Headache duration of 10 weeks or longer (OR of 11.0; 95% CI), emesis (OR of 4.0; 95% CI), and pain (OR of 6.7; 95% CI) were predictive of metastases (11). No variable from the neurologic examination was found to add information to the prediction model, but MRI examination could not be excluded from this prediction model.

In one study, the survival of amyotrophic lateral sclerosis patients who received their care at a multidisciplinary care facility was compared with the survival of similar patients in general neurology clinics using a prospective population-based registry of patients with this disease (43). Multivariate analysis showed that management by multidisciplinary care was associated with only a 10% increase in survival probability at 12 months, which was not statistically significant (95% CI = 0.44-1.89; p = 0.9). In this population-based series, management of amyotrophic lateral sclerosis by multidisciplinary clinics did not improve survival, regardless of site of symptom onset.

Multivariate analysis has been used to study the natural history of multiple sclerosis in a population-based cohort to identify predictive clinical factors of disability (12). The analysis showed that for relapsing-remitting multiple sclerosis, a shorter time to assignment of an Expanded Disability Status Scale score of 4 was associated with an older age of onset of multiple sclerosis and incomplete recovery from the first relapse.

Multivariate analyses can be used in cerebrovascular research for the comparison of more than 2 experimental groups where continuous measures such as infarct volume, cerebral blood flow, or vessel diameter are the primary variables of interest (32).

Multivariate analyses of cortical thickness enable clear distinction between control subjects and patients and moderate distinction between patients with posterior cortical atrophy and logopenic progressive aphasia, whereas patients with Alzheimer disease who have early-onset of amnesia are distributed along a continuum between these extremes (31).

The multisystem nature of autism spectrum disorder requires the use of multivariate methods of statistical analysis, rather than common univariate approaches, for discovering clinical biomarkers relevant to diagnosis (35). Multivariate statistical analyses can be used to quantify the value of behavioral and physiological biomarkers for autism spectrum disorder diagnosis.

Multivariate statistical analysis of heart rate variations can be used for predicting seizures because excessive neuronal activities in the preictal period of epilepsy affect the autonomic nervous system function, which results in heart rate variations (17). Results of clinical application of this method demonstrated that seizures could be predicted in 10 out of 11 preictal episodes prior to the seizure onset, ie, its sensitivity was 91%. This method can be used in daily life because the heart rate can be measured easily by using a wearable sensor.
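The sensitivity quoted above is a simple proportion, and with only 11 episodes it carries considerable uncertainty. The following sketch computes the sensitivity together with a Wilson score interval; the choice of the Wilson method is an assumption made for illustration and is not taken from the cited study.

```python
import math

def sensitivity_wilson(true_pos, false_neg, z=1.96):
    """Sensitivity with a Wilson score confidence interval."""
    n = true_pos + false_neg
    p = true_pos / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half, centre + half

# 10 of 11 preictal episodes detected, as reported in the text.
sens, lower, upper = sensitivity_wilson(true_pos=10, false_neg=1)
print(f"sensitivity = {sens:.0%}, 95% CI {lower:.0%} to {upper:.0%}")
```

The point estimate is 91%, but the interval is wide because the sample is small, so the result should be confirmed in larger series before clinical use.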

Multivariate algorithms can be used to predict treatment response to different antidepressant therapies based on patients' personal data, which may enable clinicians to make personalized treatment decisions to provide more effective as well as more tolerable medications to the right patient (18).

Multivariate statistical analyses using IBM’s SPSS Statistics software indicated that increasing age, longer duration of diabetes, and male sex contributed significantly to abnormal nerve conduction velocity in diabetic neuropathy (19). Neurologic signs in the nerves of the lower limbs were generally higher in patients with statistically significant nerve conduction abnormalities, eg, in the peroneal motor nerve (P value = 0.003) and the sural sensory nerve (P value = 0.003), even if diabetes was subclinical.

Bayesian statistics in neurology. Applications of Bayes’ theorem in neurology are listed in Table 1:

Table 1. Application of Bayesian Statistics in Neurology

	• Causality assessment of drug-induced neurologic disorders
	• Modeling the natural history of diseases where it would be unethical to deny treatment
	• Measuring diagnostic accuracy of neurologists' own clinical diagnoses in comparison with neuropathologic diagnoses
	• Integration of brain-imaging data
	• Preprocessing and statistical analysis of structural MRI data (33)
	• Neuroepidemiological studies
	• Discriminant analysis to improve the distinction between myopathic, neuropathic, and unclassifiable motor unit potentials over that obtained by conventional quantitative motor unit potential analysis
	• Prediction of clinical outcome in stroke trials according to stratification based on the observed frequency of worsening
	• Comparison of outcome rates of carotid endarterectomy between surgeons
	• Comparative evaluation of cost-effectiveness of 2 treatments

Causality assessment in drug-induced neurologic disorders. A Bayesian approach is used to assess the probability that an adverse reaction is drug induced. This approach combines background information on the relative incidence of the adverse event in exposed and nonexposed patients, which gives the prior odds, with likelihood ratios derived from case-specific information. The product of the two gives the posterior odds, from which the posterior probability that the drug caused the adverse event is obtained. Computerized models are available for the evaluation of some adverse events. Bayesian data mining in the US Vaccine Adverse Event Reporting System prospectively detected the safety signal for febrile seizures after the 2010 to 2011 seasonal influenza virus vaccine in young children (26). Bayesian methodology has been found to be useful during interim analyses of clinical trials and in making treatment decisions.
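The odds form of Bayes' theorem used in this kind of causality assessment can be sketched in a few lines. All numbers below are hypothetical, the function names are illustrative, and two simplifying assumptions are made: the prior odds are taken as the excess incidence in exposed patients divided by the background incidence (one common convention), and the case-specific findings are treated as conditionally independent so that their likelihood ratios can be multiplied.

```python
def prior_odds_from_incidence(incidence_exposed, incidence_unexposed):
    """Prior odds of drug causation: excess incidence over background incidence."""
    return (incidence_exposed - incidence_unexposed) / incidence_unexposed

def posterior_probability(prior_odds, likelihood_ratios):
    """Posterior probability from prior odds and case-specific likelihood ratios."""
    posterior_odds = prior_odds
    for lr in likelihood_ratios:
        posterior_odds *= lr  # assumes conditionally independent findings
    return posterior_odds / (1 + posterior_odds)

# Hypothetical background data: 5 per 1000 exposed, 4 per 1000 unexposed.
prior = prior_odds_from_incidence(0.005, 0.004)   # prior odds of about 0.25
# Hypothetical case findings: timing (LR 3.0) and positive dechallenge (LR 2.0).
prob = posterior_probability(prior, [3.0, 2.0])
print(f"posterior probability of drug causation = {prob:.2f}")
```

In this example, weak background evidence (prior odds of about 1 to 4) is converted by the case details into a posterior probability of about 0.6.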

Modeling the natural history of diabetic retinopathy. A Bayesian approach has been used to model the natural history of diabetic retinopathy. This approach helps to show the natural progression of the disease without medical intervention, which could not be observed directly because denying treatment to an actual patient would be unethical.

Improving accuracy of clinical diagnosis. An example of improving accuracy of clinical diagnosis was demonstrated in a study using Discrete Bayesian Network analysis, where odors of orange, cinnamon, peppermint, and pineapple, combined with age and Mini-Mental State Examination, achieved a high predictive ability for incident dementia (13).

Predicting progression of amyotrophic lateral sclerosis. For improving management of amyotrophic lateral sclerosis, there is a need for better tools to estimate disease progression from clinical trial data. Crowdsourcing, in which a problem-solving task is distributed to a large group of persons and the collected solutions are assessed for quality and processed in parallel, yielded algorithms that were shown to classify amyotrophic lateral sclerosis patients according to disease characteristics, including progression (24).

Identifying new risk factors for dementia. A study has identified new risk factors for dementia from nationwide longitudinal population-based data by using Bayesian statistics (39). Hearing loss and senile cataract were associated with an increased risk of dementia.

Comparison of outcome rates of carotid endarterectomy between surgeons. A Bayesian hierarchical modeling approach has been used to analyze outcome data (stroke and death within 30 days) together with information on possible risk factors specific to carotid endarterectomy. Although this approach can detect diabetes, stroke, and heart disease as significant risk factors, there is no significant difference between the predicted and observed outcome rates for any surgeon.
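The central idea of a hierarchical comparison, that each surgeon's observed rate is pulled toward the overall rate in proportion to how few cases that surgeon has, can be illustrated with a much simpler beta-binomial calculation. This is a simplified stand-in and not the model of any cited study: the data are hypothetical, the prior strength is fixed by hand rather than estimated, and no risk-factor adjustment is made.

```python
def shrunken_rates(events, cases, prior_mean, prior_strength):
    """Posterior mean event rate per surgeon under a shared Beta prior.

    prior_strength acts like a number of 'pseudo-cases' at the prior mean.
    """
    a = prior_mean * prior_strength
    b = (1 - prior_mean) * prior_strength
    return [(a + e) / (a + b + n) for e, n in zip(events, cases)]

# Hypothetical 30-day stroke or death counts for three surgeons.
events = [1, 6, 0]
cases = [20, 100, 10]
pooled = sum(events) / sum(cases)
rates = shrunken_rates(events, cases, prior_mean=pooled, prior_strength=50)

for e, n, r in zip(events, cases, rates):
    print(f"observed {e}/{n} = {e / n:.3f}, shrunken estimate = {r:.3f}")
```

The surgeon with 0 events in 10 cases is not credited with a rate of zero, because 10 cases carry too little information. This is why small apparent differences between surgeons often disappear under hierarchical modeling.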

Significance of patent foramen ovale in cryptogenic stroke. Bayes’ theorem has been applied to calculate the probability that a patent foramen ovale is incidental or causal in patients with cryptogenic stroke (02). The results show that in patients with otherwise cryptogenic stroke, approximately one third of the patent foramen ovale findings are likely to be incidental and do not require closure.
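The reasoning can be sketched under a simple mixture assumption: every stroke caused by a patent foramen ovale occurs in a patient who has one, and all other patients with cryptogenic stroke carry a patent foramen ovale at the same rate as controls. The prevalences below are illustrative round numbers, not the figures of the cited analysis, and the function name is hypothetical.

```python
def prob_pfo_incidental(prev_controls, prev_cryptogenic):
    """Probability that a PFO found in a cryptogenic stroke patient is incidental.

    Mixture assumption: prev_cryptogenic = f + (1 - f) * prev_controls,
    where f is the fraction of cryptogenic strokes caused by the PFO.
    """
    attributable = (prev_cryptogenic - prev_controls) / (1 - prev_controls)
    # Incidental PFOs are those found in patients whose stroke had another cause.
    return (1 - attributable) * prev_controls / prev_cryptogenic

# Illustrative prevalences: 25% in controls, 50% in cryptogenic stroke.
p_incidental = prob_pfo_incidental(0.25, 0.50)
odds_ratio = (0.50 / 0.50) / (0.25 / 0.75)
print(f"P(incidental) = {p_incidental:.2f}, 1/OR = {1 / odds_ratio:.2f}")
```

Under this assumption the probability that the finding is incidental equals the reciprocal of the odds ratio linking patent foramen ovale to cryptogenic stroke. The stronger the association in a given patient subgroup, the less likely the finding is incidental.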

Evaluation of neuroprotective effect of endovascular cooling in cardiac arrest. Randomized clinical trials in comatose survivors of cardiac arrest have shown that therapeutic hypothermia improves neurologic recovery. The narrow inclusion criteria of these trials, however, support cooling in only a limited group of primary cardiac arrest survivors. A Bayesian approach has been applied to show the efficacy and safety of endovascular cooling in unselected survivors of cardiac arrest.

Comparative evaluation of cost-effectiveness of 2 treatments. A Bayesian approach starts with the proportionality assumption, which states that the differences in healthcare expenditure (less the direct cost of therapy) are directly proportional to the differences in effectiveness. The method has been applied to data from published fixed dosage, parallel-design studies comparing both topiramate and lamotrigine with placebo, with a conclusion favoring topiramate. However, the method needs to be interpreted with care and is not intended to replace good, comparative, pharmacoeconomic research.

Application of statistics in brain imaging studies. Some examples of the role of statistics in brain imaging studies are:

Meta-analyses of brain imaging studies. Meta-analyses are essential to summarize the results of the increasing number of neuroimaging studies in neurology and psychiatry but are not always feasible as full image information is rarely available. Activation likelihood estimation or multilevel kernel density analysis are more feasible as they only need reported peak coordinates. Signed differential mapping is based on the positive features of existing peak-probability methods, which enables meta-analyses of studies comparing patients with controls, and the performance might be enhanced by including statistical parametric maps (30).

Integration of brain-imaging data. Bayesian inference using Markov Chain Monte Carlo techniques enables the combination of disparate information from various imaging techniques while dealing with the uncertainty in each. Bayesian inference has been used to integrate different forms of brain imaging data from magnetoencephalography and functional magnetic resonance imaging in a probabilistic framework (23).
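A toy example may help convey how Markov chain Monte Carlo combines sources of differing reliability. The sketch below is not the method of the cited study: it assumes two hypothetical modalities that each give a noisy Gaussian measurement of the same scalar quantity, a flat prior, and a basic random-walk Metropolis sampler.

```python
import math
import random

def log_posterior(theta, measurements):
    """Log posterior (up to a constant) under a flat prior.

    Each (value, sd) pair contributes an independent Gaussian likelihood.
    """
    return sum(-0.5 * ((theta - value) / sd) ** 2 for value, sd in measurements)

def metropolis(measurements, n_samples=20000, step=1.0, seed=0):
    """Random-walk Metropolis sampler; returns samples after burn-in."""
    rng = random.Random(seed)
    theta = measurements[0][0]          # start at the first measurement
    current = log_posterior(theta, measurements)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, measurements)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples[n_samples // 5:]     # discard the first 20% as burn-in

# Hypothetical measurements: modality A (10.0, sd 2.0), modality B (14.0, sd 1.0).
samples = metropolis([(10.0, 2.0), (14.0, 1.0)])
print(f"posterior mean = {sum(samples) / len(samples):.2f}")
```

For this simple case the exact answer is the precision-weighted mean, 13.2, which lies closer to the more reliable measurement, and the sampler should reproduce it approximately. Sampling becomes necessary in realistic imaging models, where no such closed-form answer exists.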

Diagnosis and assessment of Alzheimer disease by imaging. PET has been widely used to guide clinicians in the diagnosis of Alzheimer disease, but the subjectivity of its evaluation has favored the development of computer-aided diagnosis systems. An algorithm based on a combination of feature extraction techniques has been proposed as a tool for the early detection of Alzheimer disease and evaluated in terms of accuracy, sensitivity, and specificity (10). A fully Bayesian method with an adjusted spike-and-slab least absolute shrinkage and selection operator (lasso) procedure has been developed for the estimation and selection of influential features or images. Simulation studies have shown satisfactory performance of this method, which has been applied to the Alzheimer's Disease Neuroimaging Initiative data set (37).

A study has explored the heterogeneity of atrophy patterns in late-onset Alzheimer disease using a data-driven Bayesian framework that accounted for and estimated latent Alzheimer disease atrophy factors derived from structural MRI data (42). The results showed 3 latent Alzheimer disease atrophy factors with distinct memory and executive function patterns. The cortical atrophy factor was associated with the worst executive function performance, whereas the temporal atrophy factor was associated with the worst memory performance. This approach enabled assessment of multiple atrophy factors expressed to various degrees in an individual rather than assigning the individual to a single subtype. Prediction of individual-specific cognitive decline trajectories has potential applications for personalized prevention and monitoring of Alzheimer disease progression.

A Bayesian model for prediction of status of Parkinson disease using imaging data. A Bayesian hierarchical model using posterior predictive probabilities has been proposed for predicting status of Parkinson disease by incorporating information from both functional and structural brain imaging scans (41). This model correctly identifies key regions that are highly associated with disease. This model provides useful information for the diagnosis of Parkinson disease, neurophysiological basis of the disease, early premotor alterations, and effective strategies for designing studies to test potential neuroprotective therapies.

A Bayesian model for prediction of motor progression in Parkinson disease. A Bayesian multivariate predictive inference platform has been applied to data from the Parkinson's Progression Markers Initiative study (NCT01141023) to construct models for prediction of the annual rate of change in combined scores from the Movement Disorder Society-Unified Parkinson's Disease Rating Scale (25). Genetic variation was the most useful predictive marker of motor progression. This model confirmed the validated predictors of Parkinson disease motor progression and would be useful for clinical disease monitoring and treatment as well as for clinical trials.

References

01: Alpérovitch A, Kurth T, Bertrand M, et al. Primary prevention with lipid lowering drug and long term risk of vascular events in older people: population based cohort study. BMJ 2015;350:h2335. PMID 25989805
02: Alsheikh-Ali AA, Thaler DE, Kent DM. Patent foramen ovale in cryptogenic stroke: incidental or pathogenic. Stroke 2009;40(7):2349-55. PMID 25581302
03: Amrhein V, Greenland S, McShane B. Scientists rise up against statistical significance. Nature 2019;567(7748):305-7. PMID 30894741
04: Arboix A, Vall-Llosera A, Garcia-Eroles L, Massons J, Oliveres M, Targa C. Clinical features and functional outcome of intracerebral hemorrhage in patients aged 85 and older. J Am Geriatr Soc 2002;50(3):449-54. PMID 11943039
05: Asher R. Straight and crooked thinking in medicine. Br Med J 1954;2(4885):460-2.** PMID 13182247
06: Bayes T. An essay towards solving a problem in the doctrine of chances. Philos Trans R Soc Lond B Biol Sci 1763;53:370-418.
07: Bland JM, Altman DG. Survival probabilities (the Kaplan-Meier method). BMJ 1998;317(7172):1572. PMID 9836663
08: Bland JM, Altman DG. Statistics notes. The odds ratio. BMJ 2000;320(7247):1468. PMID 10827061
09: Bland JM, Altman DG. The logrank test. BMJ 2004;328(7447):1073. PMID 15117797
10: Chaves R, Ramirez J, Gorriz JM, Illan IA, Gomez-Rio M, Carnero C. Effective diagnosis of Alzheimer's disease by means of large margin-based methodology. BMC Med Inform Decis Mak 2012;12(1):79. PMID 22849649
11: Christiaans MH, Kelder JC, Arnoldus EP, Tijssen CC. Prediction of intracranial metastases in cancer patients with headache. Cancer 2002;94:2063-8. PMID 11932910
12: Debouverie M, Pittion-Vouyovitch S, Louis S, Guillemin F; LORSEP Group. Natural history of multiple sclerosis in a population-based cohort. Eur J Neurol 2008;15(9):916-21. PMID 18637953
13: Ding D, Liang X, Xiao Z, Wu W, Zhao Q, Cao Y. Can dementia be predicted using olfactory identification test in the elderly? A Bayesian network analysis. Brain Behav 2020;10(11):e01822. PMID 32864870
14: Drummond GB, Tom BD. Statistics, probability, significance, likelihood: words mean what we define them to mean. Adv Physiol Educ 2011;35(4):361-4. PMID 22139771
15: Dumont TM, Rughani AI, Tranmer BI. Prediction of symptomatic cerebral vasospasm after aneurysmal subarachnoid hemorrhage with an artificial neural network: feasibility and comparison with logistic regression models. World Neurosurg 2011;75(1):57-63; discussion 25-8. PMID 21492664
16: Ecker C, Marquand A, Mourão-Miranda J, et al. Describing the brain in autism in five dimensions--magnetic resonance imaging-assisted diagnosis of autism spectrum disorder using a multiparameter classification approach. J Neurosci 2010;30(32):10612-23. PMID 20702694
17: Fujiwara K, Miyajima M, Yamakawa T, et al. Epileptic seizure prediction based on multivariate statistical process control of heart rate variability features. IEEE Trans Biomed Eng 2016;63(6):1321-32.** PMID 26841385
18: Gillett G, Tomlinson A, Efthimiou O, Cipriani A. Predicting treatment effects in unipolar depression: a meta-review. Pharmacol Ther 2020;212:107557.** PMID 32437828
19: Haji Naghi Tehrani K. A study of nerve conduction velocity in diabetic patients and its relationship with tendon reflexes (T-Reflex). Acta Biomed 2020;91(3):e2020066. PMID 32921766
20: Hill AB. The clinical trial. Brit Med Bull 1951;7:278-82. PMID 14879025
21: Ioannidis JPA. The importance of predefined rules and prespecified statistical analyses: do not abandon significance. JAMA 2019;321(21):2067-8. PMID 30946431
22: Johnson VE. Revised standards for statistical evidence. Proc Natl Acad Sci U S A 2013;110(48):19313-7. PMID 24218581
23: Jun SC, George JS, Kim W, et al. Bayesian brain source imaging based on combined MEG/EEG and fMRI using MCMC. Neuroimage 2008;40(4):1581-94. PMID 18314351
24: Küffner R, Zach N, Norel R, et al. Crowdsourced analysis of clinical trial data to predict amyotrophic lateral sclerosis progression. Nat Biotechnol 2015;33(1):51-7.** PMID 25362243
25: Latourelle JC, Beste MT, Hadzi TC, et al. Large-scale identification of clinical and genetic predictors of motor progression in patients with newly diagnosed Parkinson's disease: a longitudinal cohort study and validation. Lancet Neurol 2017;16(11):908-16.** PMID 28958801
26: Martin D, Menschik D, Bryant-Genevier M, Ball R. Data mining for prospective early detection of safety signals in the Vaccine Adverse Event Reporting System (VAERS): a case study of febrile seizures after a 2010-2011 seasonal influenza virus vaccine. Drug Saf 2013;36(7):547-56. PMID 23657824
27: McCrea M, Barr WB, Guskiewicz K, et al. Standard regression-based methods for measuring recovery after sport-related concussion. J Int Neuropsychol Soc 2005;11:58-69. PMID 15686609
28: National Academies of Sciences, Engineering, and Medicine. Refining the concept of scientific inference when working with big data: proceedings of a workshop. Washington, DC: The National Academies Press, 2017.
29: Ovary C, Szegedi N, May Z, Gubucz I, Nagy Z. Comparison of stroke ward care versus mobile stroke teams in the Hungarian stroke database project. Eur J Neurol 2007;14(7):757-61. PMID 17594331
30: Radua J, Mataix-Cols D, Phillips ML, et al. A new meta-analytic method for neuroimaging studies that combines reported peak coordinates and statistical parametric maps. Eur Psychiatry 2012;27(8):605-11. PMID 21658917
31: Ridgway GR, Lehmann M, Barnes J, et al. Early-onset Alzheimer disease clinical variants: multivariate analyses of cortical thickness. Neurology 2012;79(1):80-4. PMID 22722624
32: Schlattmann P, Dirnagl U. Statistics in experimental cerebrovascular research: comparison of more than two groups with a continuous outcome variable. J Cereb Blood Flow Metab 2010;30(9):1558-63. PMID 20571520
33: Schmidt P, Schmid VJ, Gaser C, et al. Fully Bayesian inference for structural MRI: application to segmentation and statistical analysis of T2-hypointensities. PLoS One 2013;8(7):e68196. PMID 23874537
34: van Belle G. Statistical Rules of Thumb. http://www.vanbelle.org/chapters/webchapter2.pdf. New York, NY: Wiley Interscience, 2008.
35: Vargason T, Grivas G, Hollowood-Jones KL, Hahn J. Towards a multivariate biomarker-based diagnosis of autism spectrum disorder: review and discussion of recent advancements. Semin Pediatr Neurol 2020;34:100803. PMID 32446437
36: Verpillat P, Camuzat A, Hannequin D, et al. Association between the extended tau haplotype and frontotemporal dementia. Arch Neurol 2002;59(6):935-9. PMID 12056929
37: Wang X, Song X, Zhu H. Bayesian latent factor on image regression with nonignorable missing data. Stat Med 2021;40(4):920-32. PMID 33169396
38: Wasserstein RL, Lazar NA. The ASA's statement on p-values: context, process, and purpose. The American Statistician 2016;70:129-33.
39: Wen YH, Wu SS, Lin CH, et al. A Bayesian approach to identifying new risk factors for dementia: a nationwide population-based study. Medicine (Baltimore) 2016;95(21):e3658. PMID 27227925
40: Xiong C, Tang Y, van Belle G, Miller JP, Launer JL, Morris JC. Evaluating the quality of longitudinal statistical applications in original publications on Alzheimer's disease. Neuroepidemiology 2008;30(2):112-9. PMID 18334827
41: Xue W, Bowman FD, Kang J. A Bayesian spatial model to predict disease status using imaging data from various modalities. Front Neurosci 2018;12:184.** PMID 29632471
42: Zhang X, Mormino EC, Sun N, et al. Bayesian model reveals latent atrophy factors with dissociable cognitive trajectories in Alzheimer’s disease. Proc Natl Acad Sci U S A 2016;113(42):E6535-44.** PMID 27702899
43: Zoccolella S, Beghi E, Palagano G, et al. ALS multidisciplinary clinic and survival: Results from a population-based study in Southern Italy. J Neurol 2007;254(8):1107-12. PMID 17431705

MedLink®, LLC

3525 Del Mar Heights Rd, Ste 304
San Diego, CA 92130-2122

Toll Free (U.S. + Canada): 800-452-2400

US Number: +1-619-640-4660

Support: service@medlink.com

Editor: editor@medlink.com

ISSN: 2831-9125
