Surviving Sickness
Visualizing Percentile Scores with Machine Learning
Surviving Sickness: Can You Beat the Flu?
Have you ever wondered what your chances are of surviving a nasty case of influenza? In this project, we set out to answer that question using data science and machine learning. Our model doesn’t just guess — it makes predictions based on real-world data and statistical analysis. Let’s break down how it works.
The Big Idea
The goal was to build a tool that could predict the probability of surviving influenza based on key symptoms and patient information. Using a logistic regression model trained on historical health data, we can now estimate someone’s survival odds with just a few inputs.
-Data Formats: Uses text and numeric data. Age is entered as a number, and each symptom is entered as a yes/no text answer. The output is also returned as text.
-Different Use Cases: This feature can be used in healthcare and education. Because it estimates disease survivability, it could be adapted into a survey that patients take to estimate their own survivability for different diseases. Self-diagnosis is risky, however, so the tool is better suited to an educational setting. The data comes from previous cases, so medical students can use it to study how variables such as age and other symptoms affect survivability.
-Pipeline Stages: (link to the documentation blog will go here)
-Tooling/Libraries: This feature imports LogisticRegression and LabelEncoder from scikit-learn, along with Pandas. It was not built in Jupyter notebooks; it has a full Python backend and a JavaScript frontend. It also reads a CSV file that stores the data being analyzed.
-Data Source Type: The data stored in the CSV file was taken from an open-source dataset created by UC Irvine. Link Here
The Data
We used a dataset of anonymized health records of people who had influenza. It comes from a reputable source, the UCI data repository. Each row represents a patient, and each column represents a symptom or factor:
| Feature | Description |
|---|---|
| Age | Patient’s age in years |
| Fever | Did the patient have a fever? (Yes/No) |
| Cough | Was a cough present? (Yes/No) |
| Shortness of Breath | Difficulty breathing? (Yes/No) |
| Survived | Outcome (1 = survived, 0 = did not survive) |
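
Because the symptom columns hold Yes/No text, they have to be converted to numbers before a model can use them. The project does this with Pandas and scikit-learn's LabelEncoder; the sketch below shows the same idea using only Python's standard library so it runs without extra installs. The column names follow the table above, while the three sample rows and the `load_rows` helper are made up for illustration and are not taken from the real dataset.

```python
import csv
import io

# Made-up sample rows in the column layout described above; not real patient data.
SAMPLE_CSV = """Age,Fever,Cough,Shortness of Breath,Survived
34,Yes,No,No,1
71,Yes,Yes,Yes,0
52,No,Yes,No,1
"""

# Same mapping LabelEncoder would produce for these two labels (No -> 0, Yes -> 1).
YES_NO = {"no": 0, "yes": 1}


def load_rows(handle):
    """Read patient rows and convert every column to a number."""
    features, labels = [], []
    for row in csv.DictReader(handle):
        features.append([
            int(row["Age"]),
            YES_NO[row["Fever"].strip().lower()],
            YES_NO[row["Cough"].strip().lower()],
            YES_NO[row["Shortness of Breath"].strip().lower()],
        ])
        labels.append(int(row["Survived"]))
    return features, labels


features, labels = load_rows(io.StringIO(SAMPLE_CSV))
print(features)
print(labels)
```

In the real project, `io.StringIO(SAMPLE_CSV)` would be replaced by an open handle to the CSV file.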
The Model
To train our model, we used logistic regression — a statistical technique ideal for binary classification problems (like survive or not survive). Here’s why it works well:
Logistic regression predicts probabilities between 0 and 1.
It’s interpretable — you can see which features affect the outcome.
It works well on smaller datasets with binary outcomes.
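To make the first two points concrete, here is a minimal sketch of what a fitted logistic regression computes: a weighted sum of the inputs passed through the sigmoid function, which always lands between 0 and 1. The intercept and weights below are hypothetical values picked for illustration; the trained model's actual coefficients are not given in this post.

```python
import math

# Hypothetical coefficients for illustration only; NOT the trained model's values.
INTERCEPT = 4.0
WEIGHTS = {"age": -0.03, "fever": -0.4, "cough": -0.2, "sob": -1.1}


def survival_probability(age, fever, cough, sob):
    """Weighted sum of the inputs, squashed into (0, 1) by the sigmoid."""
    z = (
        INTERCEPT
        + WEIGHTS["age"] * age
        + WEIGHTS["fever"] * fever
        + WEIGHTS["cough"] * cough
        + WEIGHTS["sob"] * sob
    )
    return 1.0 / (1.0 + math.exp(-z))


younger = survival_probability(age=25, fever=1, cough=1, sob=0)
older = survival_probability(age=80, fever=1, cough=1, sob=1)
print(f"younger patient: {younger:.2f}, older patient: {older:.2f}")
```

Each weight shows the direction and strength of one feature: with these made-up numbers, a negative weight on age means the predicted survival probability drops as age rises. That is what makes the model easy to interpret.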
Making a Prediction
When you use the web form, you input your:
Age
Whether or not you have a fever
Whether or not you’re coughing
If you have shortness of breath
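Before these answers reach the model, they have to become a row of numbers. The sketch below shows one way the backend might validate and convert them. The `parse_form` helper, the field names (`age`, `fever`, `cough`, `sob`), and the 0 to 120 age range are assumptions made for this example, not details taken from the project's code.

```python
def parse_form(form):
    """Turn raw form strings into the numeric row [age, fever, cough, sob]."""
    age = int(form["age"])  # raises ValueError if the text is not a whole number
    if not 0 <= age <= 120:  # assumed plausible range, not from the project
        raise ValueError("age must be between 0 and 120")

    row = [age]
    for field in ("fever", "cough", "sob"):
        answer = form[field].strip().lower()
        if answer not in ("yes", "no"):
            raise ValueError(f"{field} must be yes or no")
        row.append(1 if answer == "yes" else 0)
    return row


example_row = parse_form({"age": "42", "fever": "Yes", "cough": "no", "sob": " NO "})
print(example_row)
```

Rejecting anything other than yes or no up front keeps malformed input from silently turning into a prediction.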
Your inputs are sent to the backend, where we feed them into our model:
input_features = [[age, fever, cough, sob]]
survival_prob = model.predict_proba(input_features)[0][1]
The model calculates your survival probability, and we return two values:
Survival chance (e.g. 87%)
Death risk (e.g. 13%)
You’ll then see this information clearly on the page, alongside a popup explaining what it means.
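The two returned values can be derived from the single probability the model produces. Here is a minimal sketch, assuming the backend sends back a small dictionary; the `build_response` name and the key names are hypothetical.

```python
def build_response(survival_prob):
    """Convert a probability in [0, 1] into the two percentages shown to the user."""
    if not 0.0 <= survival_prob <= 1.0:
        raise ValueError("survival_prob must be between 0 and 1")

    survival_pct = round(survival_prob * 100)
    # Derive the death risk from the rounded survival figure so the two always sum to 100.
    return {
        "survival_chance": f"{survival_pct}%",
        "death_risk": f"{100 - survival_pct}%",
    }


result = build_response(0.87)
print(result)
```

Computing the death risk from the already rounded survival percentage avoids displaying a pair such as 87% and 14% when each value is rounded separately.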
User Interface
We designed the frontend with simplicity and clarity in mind. Key features include:
A clean form with labeled fields
Instant results after submission
A “What does this mean?” tooltip to help interpret your score
Future Improvements
While our current model is functional, there’s always room to grow:
✅ Add more features like sex, vaccination status, or comorbidities
📈 Visualize the output with graphs or survival curves
🧠 Try more complex models (Random Forests, Gradient Boosting)
This project is just the beginning — it shows how machine learning can support healthcare decisions by giving people insight into their conditions.
| Aspect | Detail |
|---|---|
| ML Model | Logistic Regression |
| Prediction Goal | Probability of surviving influenza |
| Inputs Used | Age, Fever, Cough, Shortness of Breath |
| Tech Stack | Python, Flask, JavaScript, HTML/CSS |