Regression Analysis
Applied statistical analysis project using the Auto MPG dataset (StatLib) to model and predict vehicle fuel efficiency from technical characteristics.
Applied Methodology:
- Complete ETL (Extract, Transform, Load) of the dataset, including handling of missing values and outliers
- Exploratory Data Analysis (EDA) to identify correlations and patterns
- Multiple regression model construction with feature selection
- Statistical validation through hypothesis testing and confidence intervals
- Residual diagnostics to verify model assumptions
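
The steps above can be sketched roughly as follows in base R. This is an illustrative outline under stated assumptions, not the project's actual code: the data frame is synthetic stand-in data (not the real Auto MPG values), the column names `cylinders`, `weight`, and `model_year` are hypothetical, and the 1.5 × IQR outlier rule, AIC-based `step()` selection, and Shapiro-Wilk test are assumed choices for steps the list does not specify. Base R is used instead of Tidyverse only to keep the sketch dependency-free.

```r
# Synthetic stand-in data for illustration only (NOT the real Auto MPG values).
set.seed(42)
n <- 200
cars <- data.frame(
  cylinders  = sample(c(4, 6, 8), n, replace = TRUE),
  weight     = round(runif(n, 1600, 5000)),
  model_year = sample(70:82, n, replace = TRUE)
)
cars$mpg <- 45 - 0.5 * cars$cylinders - 0.006 * cars$weight +
  0.75 * (cars$model_year - 70) + rnorm(n, sd = 2)
cars$weight[c(5, 17)] <- NA  # inject two missing values to exercise the cleaning step

# Transform: drop rows with missing values.
clean <- cars[complete.cases(cars), ]

# Flag (not remove) outliers in mpg with the 1.5 * IQR rule (assumed rule).
q <- quantile(clean$mpg, c(0.25, 0.75))
fence <- 1.5 * diff(q)
is_outlier <- clean$mpg < q[1] - fence | clean$mpg > q[2] + fence

# EDA: pairwise correlations between the response and the predictors.
cor_matrix <- cor(clean[, c("mpg", "cylinders", "weight", "model_year")])

# Multiple regression, then backward selection by AIC (assumed selection method).
fit <- lm(mpg ~ cylinders + weight + model_year, data = clean)
selected <- step(fit, trace = 0)

# Validation: t-tests on each coefficient and 95% confidence intervals.
coef_table <- summary(fit)$coefficients
ci <- confint(fit, level = 0.95)

# Residual diagnostics: Shapiro-Wilk test for normality of residuals.
normality <- shapiro.test(residuals(fit))

print(round(cor_matrix, 2))
print(coef_table)
print(ci)
print(normality$p.value)
```

With real data, the synthetic block would be replaced by loading the dataset, and `plot(fit, which = 1:2)` would give the usual residuals-vs-fitted and Q-Q plots for a visual check of the model assumptions.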
Technologies:
- R and Tidyverse for data manipulation and analysis
- Rigorous statistical inference techniques
- Data visualization for insight communication
Results: The model identified number of cylinders, weight, and model year as significant predictors of fuel efficiency. The analysis shows how automotive design factors affect fuel consumption and illustrates how statistical methods apply to practical engineering problems.
This project strengthened my ability to work with real data, apply statistical rigor in analyses, and communicate technical results clearly.