  • richardcwangrcw

Data in Medicine: An Interview with a Physician

Statistics is essential to medicine because analyzing and interpreting data gives doctors and other practitioners evidence on which to base their decisions. As high-tech medical tools become more widely available, they generate more and more data, intensifying the need for proper statistical analysis.

Dr. Hao Wang is an emergency medicine physician. In the E.R., much of the information about a patient's condition comes from electronic medical records, so proper data analysis is paramount to Dr. Wang’s ability to treat patients and save lives. “Since we have so much data, it is hard to interpret them without the knowledge of statistics,” Wang said. Statistics is generally understood as the study and manipulation of data, including gathering, reviewing, analyzing, and drawing conclusions from it. Statisticians analyze data with statistical software and interpret the results, which can then be used to predict outcomes. These interpretations allow doctors to make more informed decisions about a patient’s condition and potential treatments.

“Statistics can perceive patterns. Patients with a specific disease have their clinical presentation, lab values, and clinical courses fall into specific patterns. Using statistics, doctors can discover common patterns and risk factors affecting patient outcomes, which will help us to find suitable managements that will result in the best clinical outcome,” Dr. Wang explained. He sees many patients with a wide variety of problems, and his job is to recognize each patient’s problem and provide the proper treatment: “Using statistics, we're analyzing the patients we saw and then predicting…the future patients’ patterns.”

Without modern statistical software, medical experts would struggle to extract the in-depth information needed for diagnosis and treatment. Medical professionals and researchers use statistics daily, through hypothesis testing, clinical prediction models, and machine learning, to improve patients' chances of survival.

Hypothesis testing

Statistics can be used to test hypotheses in the medical field. This process, called hypothesis testing, is essential to researchers for drawing conclusions about their claims and assessing the significance of their results. Various statistical methods and measures are available to provide evidence for or against a particular hypothesis, and the appropriate method depends on the type of hypothesis being tested.

In medicine, different statistical methods are used to test different hypotheses. While the chi-square test tells us whether two categorical variables are independent of one another, Student's t-test can be used to test whether the means of two populations differ. “For example, [if] the average score in a biology class is 90 for one group of students and 89 for another, can we say that the group with a score of 90 is better than the group with a score of 89? It depends on the statistical analysis and whether the difference is statistically significant,” Dr. Wang said. Without these hypothesis-testing tools, it would be challenging to determine whether the difference between two data sets is statistically significant. Another commonly used statistical method is multivariate logistic regression analysis, which estimates the probability of an event from multiple variables at once. “For example, when you want to identify risk factors for cancer, such as age, gender, education level, and smoking status, you would use [multivariate logistic regression analysis] to analyze all these factors together and identify which factor(s) predict the likelihood of developing cancer in the future,” Dr. Wang explained.
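To make the two tests mentioned above concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a method described in the interview: the scores and the 2x2 table are invented, the function names are hypothetical, and the t statistic is the Welch variant, which does not assume equal variances. In practice a statistics package would also report a p-value.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Invented scores for two groups of students (means 90 and 89).
group_a = [92, 88, 95, 85, 90]
group_b = [91, 86, 93, 84, 91]
t, df = welch_t(group_a, group_b)
print(f"t = {t:.3f}, df = {df:.1f}")

# Invented counts: rows = exposed / not exposed, columns = disease / no disease.
counts = [[30, 70], [10, 90]]
print(f"chi-square = {chi_square_2x2(counts):.2f}")
```

With these made-up numbers the t statistic is about 0.42 on 8 degrees of freedom, well below the two-sided 5% critical value of roughly 2.31, so a one-point gap in averages would not count as statistically significant here. The chi-square statistic of 12.5 exceeds the 5% critical value of 3.84 for one degree of freedom, which would be evidence against independence in the invented table.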

Clinical Prediction Models

Clinical prediction models are tools that compute the risk of an outcome given a particular set of patient characteristics. They can be used in many ways, such as predicting whether a patient may have a heart attack or a stroke in the future. These models have had a major impact on medical practice, helping to improve outcomes for high-risk patients. “For a binary outcome, such as a ‘yes or no’ for a disease, we use a classifier, an algorithm that maps the input data to a specific category,” Wang said. An ensemble classifier, which combines several individual classifiers, is often more accurate than any one of its members. “Some commonly used machine learning algorithms to predict the binary outcome include random forest, decision tree, and XGBoost.”

Sometimes the outcome is a continuous variable rather than a binary one. Such a variable has no “yes or no” options and instead takes values along a continuous spectrum. An example is the length of time a person waits in the waiting room to see a doctor. “You may wait for five minutes, or you may wait for ten minutes. You must use other methods (to predict specific outcomes), such as linear regressions, to make predictions,” Dr. Wang explained. “So, it's hard to say what statistical method you use for particular models. It all depends on what kind of hypothesis and variables you choose.” Clearly, a statistician needs strong background knowledge to choose the right statistical method for a particular disease or patient.

Besides clinical prediction models, statistics can also be used for other purposes, such as determining the effectiveness of different medications. “For instance, when treating a patient, it is important to determine which medication is more effective than others and the differences in the side effects of these medications. Statistics can be utilized to assess the severity of side effects,” Dr. Wang stated. Based on the statistical data, doctors can decide to avoid medications with severe side effects and opt for those with milder ones. “Predictions in medicine have become more accurate due to the big data and advanced statistical software availability,” Dr. Wang said.
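The difference between predicting a binary outcome and a continuous one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a simple logistic model stands in for the classifiers Dr. Wang names (random forest, decision tree, XGBoost), the coefficients and waiting-room figures are invented and have no clinical meaning, and all function and variable names are hypothetical.

```python
import math


def logistic_risk(features, coefficients, intercept):
    """Turn patient characteristics into a probability via the logistic function."""
    score = intercept + sum(coefficients[name] * value
                            for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


def classify(risk, threshold=0.5):
    """Binary outcome: map a predicted risk to 'yes' or 'no'."""
    return "yes" if risk >= threshold else "no"


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


# Invented coefficients for illustration only; not derived from any real data.
COEFFICIENTS = {"age_decades": 0.4, "smoker": 1.0}
INTERCEPT = -4.0

patient = {"age_decades": 6.0, "smoker": 1}
risk = logistic_risk(patient, COEFFICIENTS, INTERCEPT)
print(f"predicted risk = {risk:.3f}, classified as {classify(risk)}")

# Continuous outcome: invented data on patients ahead in line vs. minutes waited.
patients_ahead = [0, 1, 2, 3, 4]
minutes_waited = [5, 10, 15, 20, 25]
slope, intercept = fit_line(patients_ahead, minutes_waited)
print(f"predicted wait with 6 patients ahead = {intercept + slope * 6:.0f} minutes")
```

The classifier answers a “yes or no” question by thresholding a probability (about 0.35 for this invented patient, so “no”), while the regression returns a number on a continuous scale (35 minutes for the invented waiting-room data). The 0.5 threshold is an arbitrary choice for the sketch; a real model would choose it by weighing the cost of missed cases against false alarms.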

AI and Advancements

The advent of AI algorithms and machine learning has completely changed the medical statistics playing field. According to Dr. Wang, “Artificial intelligence and machine learning are [both] growing in popularity. So now, doctors need more and more background in statistics. Artificial intelligence and machine learning algorithms can be used to predict a model. Those models can help physicians to predict the patient's outcome. In the next 5-10 years, with more advanced machine learning and artificial intelligence, these technologies can help physicians recognize disease patterns. Especially when disease[s] are rare, and aren’t commonly diagnosed, using artificial intelligence to identify the patterns will help the physician recognize them. I think this will be the future trend. Physicians use machine learning algorithms to read samples, including pathological slides and CT or MRI results.”
