This project analyzes a dataset of dinosaurs, exploring their Period, Diet, and Country of discovery. The goal is to clean the data, standardize values, and create visualizations that uncover patterns, using R in RStudio.
- File: dinosaur.csv
- Columns:
  - Name: Dinosaur's name
  - Period: Time period (e.g., Jurassic, Cretaceous)
  - Diet: Type of diet (herbivore, carnivore, etc.)
  - Country: Location where the fossil was found
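
A minimal sketch of loading a file with this layout, using base R only. The sample rows are placeholders written to a temporary file so the example runs on its own; they are not the project's data, and the real script would point `read.csv()` at dinosaur.csv instead. The object name `data` is chosen to match the plotting code further down.

```r
# Illustrative only: write a tiny file with the same four columns.
# These rows are placeholders, not the real dataset.
csv_path <- file.path(tempdir(), "dinosaur.csv")
writeLines(c(
  "Name,Period,Diet,Country",
  "Tyrannosaurus,Late Cretaceous,carnivore,USA",
  "Stegosaurus,Late Jurassic,herbivore,USA",
  "Velociraptor,Late Cretaceous,carnivore,Mongolia"
), csv_path)

# Read the file, keeping text columns as plain character vectors.
data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick structural checks before any cleaning.
str(data)
colSums(is.na(data))
```

Inspecting the structure and the per-column count of missing values first makes it easier to see which columns need cleaning.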
Make sure to install these packages first (only once):

    install.packages(c("dplyr", "tidyr", "ggplot2", "scales", "lubridate", "stringr", "data.table", "skimr", "maps"))

Then load them in the script:

    library(dplyr)
    library(tidyr)
    library(ggplot2)
    library(scales)
    library(lubridate)
    library(stringr)
    library(data.table)
    library(skimr)
    library(maps)

Data cleaning steps:

- Removed extra spaces in the Period and Country columns
- Standardized text to lowercase
- Fixed inconsistent values (e.g., merging "early-late cretaceous" into "cretaceous")
- Checked for missing values and handled unknown data
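
The cleaning steps above could be sketched roughly as follows. This is an illustration in base R, not the project's actual script: the sample rows are placeholders, `clean_text()` is a hypothetical helper, and labelling missing or empty entries as "unknown" is an assumption about how unknown data was handled. `stringr::str_trim()` and `stringr::str_to_lower()` would do the same job as `trimws()` and `tolower()`.

```r
# Placeholder rows with the kinds of problems described above.
data <- data.frame(
  Name    = c("Tyrannosaurus", "Velociraptor", "Stegosaurus"),
  Period  = c("  Late Cretaceous", "Early-Late Cretaceous ", NA),
  Country = c(" USA", "mongolia  ", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label missing values, trim spaces, lowercase.
clean_text <- function(x) {
  x[is.na(x)] <- "unknown"      # assumption: missing values become "unknown"
  x <- tolower(trimws(x))       # remove extra spaces, standardize case
  x[x == ""] <- "unknown"       # empty strings are treated as missing too
  x
}

data$Period  <- clean_text(data$Period)
data$Country <- clean_text(data$Country)

# Merge the inconsistent value named in the list above.
data$Period[data$Period == "early-late cretaceous"] <- "cretaceous"

# Confirm that no missing values remain in any column.
colSums(is.na(data))
```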
The cleaned data was then visualized with ggplot2.

Number of dinosaurs by period:

    ggplot(data, aes(x = Period)) +
      geom_bar(fill = "lightblue") +
      labs(title = "Number of Dinosaurs by Period", x = "Period", y = "Count") +
      theme_minimal()

Relationship between period and diet (heatmap of counts):

    # Cross-tabulate Period against Diet; as.data.frame() names the columns Var1, Var2, Freq
    period_diet_table <- table(data$Period, data$Diet)
    ggplot(as.data.frame(period_diet_table), aes(x = Var1, y = Var2, fill = Freq)) +
      geom_tile() +
      scale_fill_gradient(low = "lightblue", high = "darkblue") +
      labs(title = "Relationship Between Period and Diet", x = "Period", y = "Diet") +
      theme_minimal()

Dinosaur diets across periods:

    ggplot(data, aes(x = Diet, fill = Period)) +
      geom_bar(position = "dodge") +
      labs(title = "Dinosaur Diets Across Periods", x = "Diet", y = "Count") +
      theme_minimal()

Check the images/ folder for the saved visualizations.
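
The script does not show how the plots were written to images/. One possible approach, assuming ggplot2's `ggsave()`, is sketched below; the file name, dimensions, and sample rows are all placeholders rather than details taken from this project.

```r
library(ggplot2)

# Placeholder rows standing in for the cleaned dataset (not real results).
data <- data.frame(
  Period = c("cretaceous", "cretaceous", "jurassic"),
  Diet   = c("carnivore", "herbivore", "herbivore")
)

# Store the plot in a variable so it can be passed to ggsave().
period_plot <- ggplot(data, aes(x = Period)) +
  geom_bar(fill = "lightblue") +
  labs(title = "Number of Dinosaurs by Period", x = "Period", y = "Count") +
  theme_minimal()

# Create the output folder if it does not exist yet.
dir.create("images", showWarnings = FALSE)

# Hypothetical file name and size; adjust to taste.
out_path <- file.path("images", "dinosaurs_by_period.png")
ggsave(out_path, plot = period_plot, width = 8, height = 5, dpi = 300)
```

Passing the plot object explicitly avoids relying on whichever plot happened to be displayed last.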
💡 Made with R and 🦖 dinosaurs!


