# Define the function
<- function(variable = "height", top = 10) {
top_starwars # Load necessary libraries
require(dplyr)
# Ensure the dataset is loaded
if (!exists("starwars")) {
<- dplyr::starwars
starwars
}
# Check if the specified variable exists in the dataframe
if (!variable %in% names(starwars)) {
stop("The specified variable does not exist in the starwars dataset.")
}
# Sort the dataframe based on the specified variable and handle missing values
<- starwars %>%
ranked_df filter(!is.na(variable)) %>%
arrange(desc(across(all_of(variable)))) %>%
select(name, all_of(variable)) %>%
slice_head(n = top)
# Return the top n entries
return(ranked_df)
}
# Example usage:
# top_starwars("mass", 5)
Introduction
This section covers the basics of handling and analysing the starwars
dataset from the dplyr
package in R. You can use these example functions and dataset to create and use custom functions to explore this dataset for your mini-package.
The Star Wars Dataset
- Is part of the
dplyr
package in R. - Includes details about characters from the Star Wars universe.
- Variables include: name, height, mass, hair_color, skin_color, eye_color, birth_year, gender, homeworld, species.
- Some variables are lists and need special handling. If you load the
my_stawars.csv
dataset here, these columns are dropped for easier handling.
Function: top_starwars
Purpose
Returns a data frame ranked by a user-specified variable. Useful for sorting characters by attributes like height or mass.
Code
Function: plot_relationship
Purpose
Plots the relationship between two variables in the starwars dataset, showcasing the potential correlations or distributions.
Code
# Define the function
<- function(x, y) {
plot_relationship
# Load necessary libraries
require(ggplot2)
require(dplyr)
# Ensure the dataset is loaded
if (!exists("starwars")) {
<- dplyr::starwars
starwars
}
# Check if the specified variables exist in the dataframe
if (!(x %in% names(starwars) && y %in% names(starwars))) {
stop("One or both of the specified variables do not exist in the starwars dataset.")
}
# Create the plot
<- ggplot(data = starwars, aes(x = !!sym(x), y = !!sym(y))) +
plot geom_point(aes(color = species)) +
labs(title = paste("Relationship between", x, "and", y),
x = x,
y = y) +
theme_minimal(base_family = "Helvetica") +
theme(plot.background = element_rect(fill = "black",
colour = "black"),
panel.background = element_rect(fill = "black"),
text = element_text(colour = "white"),
axis.text = element_text(colour = "white"),
plot.title = element_text(size = 20, face = "bold",
hjust = 0.5, colour = "white"),
legend.position = "bottom",
legend.text = element_text(size = 6), # Smaller legend text
legend.key.size = unit(0.1, "cm")) # Smaller legend keys)
# Print the plot
print(plot)
}
# Example usage:
# plot_relationship(x = "height", y = "mass")
Next steps
Use the provided dataset (here) and at least one of these functions to build your own mini-package.
Remember that to provide a meaningful documentation for your functions which should include:
- a description of the function,
- the input parameters,
- the output,
- and an example of usage.