Assignment #9: Visualization in R – Base Graphics, Lattice, and ggplot2

 


Base Graphics, Lattice, and ggplot2



This scatter plot shows the relationship between vehicle weight (wt) and fuel efficiency (mpg) using base R graphics. Each point represents a car in the dataset. The plot reveals a clear negative relationship, where MPG decreases as weight increases. Lighter cars tend to have much higher fuel efficiency, while heavier cars fall into lower MPG ranges. This suggests that vehicle weight is a strong predictor of fuel economy.
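A minimal sketch of how a plot like this could be produced in base R. The post does not name its dataset; the code assumes R's built-in mtcars data frame, whose wt, mpg, and cyl columns match the variables described here. The axis labels, title, and the variable name wt_mpg_cor are illustrative choices, not taken from the original.

```r
# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# Base R scatter plot of fuel efficiency against weight.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "MPG vs. Weight",
     pch  = 19)

# Pearson correlation as a numeric check on the direction of the trend.
wt_mpg_cor <- cor(mtcars$wt, mtcars$mpg)
print(wt_mpg_cor)
```

A negative correlation here would agree with the downward trend visible in the plot.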








This histogram displays the distribution of fuel efficiency (mpg) across all cars in the dataset. The x-axis represents MPG ranges, while the y-axis shows the number of cars in each range. Most vehicles fall between 15 and 25 MPG, so moderate fuel efficiency is the most common. The distribution is right-skewed: only a few cars reach very high MPG values, which forms a tail at the upper end. Overall, the dataset is concentrated in the lower-to-mid MPG range.
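The histogram could be drawn along these lines, again assuming the built-in mtcars data. The bin edges (every 5 MPG from 10 to 35) and the names mpg_hist and share_15_25 are assumptions made for this sketch; the original plot may have used R's default breaks.

```r
# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# Base R histogram; hist() also returns the bin counts it plotted.
mpg_hist <- hist(mtcars$mpg,
                 breaks = seq(10, 35, by = 5),
                 xlab   = "Miles per gallon",
                 main   = "Distribution of MPG")

# Number of cars in each bin.
print(mpg_hist$counts)

# Share of cars with MPG between 15 and 25 (inclusive).
share_15_25 <- mean(mtcars$mpg >= 15 & mtcars$mpg <= 25)
print(share_15_25)
```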







This lattice scatter plot shows the relationship between weight and MPG, separated by the number of cylinders. Each panel represents a different cylinder group (4, 6, or 8). Across all panels, there is a consistent negative relationship between weight and fuel efficiency. However, 4-cylinder cars are generally lighter and more efficient, while 8-cylinder cars are heavier and less efficient. This highlights how both weight and cylinder count are related to MPG.
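A conditioned plot of this kind can be sketched with lattice's formula interface, where the part after the | symbol defines the panels. The dataset (mtcars), the one-row layout, and the name cyl_panel_plot are assumptions for illustration.

```r
library(lattice)

# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# mpg ~ wt | factor(cyl): plot mpg against wt, one panel per cylinder count.
cyl_panel_plot <- xyplot(mpg ~ wt | factor(cyl),
                         data   = mtcars,
                         layout = c(3, 1),
                         xlab   = "Weight (1000 lbs)",
                         ylab   = "Miles per gallon",
                         main   = "MPG vs. Weight by Cylinders")

# lattice returns a trellis object; printing it draws the plot.
print(cyl_panel_plot)
```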


This boxplot compares fuel efficiency (mpg) across different cylinder groups. Each box represents the distribution of MPG for 4-, 6-, and 8-cylinder cars. The median MPG decreases as the number of cylinders increases, showing a clear inverse relationship. The 4-cylinder group has the highest and most variable MPG, while 8-cylinder cars have consistently lower efficiency. This is consistent with cylinder count being strongly associated with fuel economy.
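The post does not say which system produced this boxplot. The sketch below uses lattice's bwplot() as one plausible option (base R's boxplot() would work similarly) and assumes the mtcars data. The name mpg_medians is illustrative.

```r
library(lattice)

# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# Box-and-whisker plot of mpg for each cylinder group.
cyl_boxplot <- bwplot(mpg ~ factor(cyl),
                      data = mtcars,
                      xlab = "Number of cylinders",
                      ylab = "Miles per gallon",
                      main = "MPG by Cylinders")
print(cyl_boxplot)

# Median MPG per cylinder group, in the order 4, 6, 8.
mpg_medians <- tapply(mtcars$mpg, mtcars$cyl, median)
print(mpg_medians)
```

Printing the medians gives a numeric check on the ordering that the boxes show visually.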




This ggplot2 scatter plot shows MPG versus weight, with points colored by cylinder count and regression lines added for each group. The negative relationship between weight and MPG is clearly visible across all groups. The regression lines help emphasize the trend within each cylinder category. 4-cylinder cars appear at higher MPG levels, while 8-cylinder cars cluster at lower values. This visualization effectively combines grouping and trend analysis in a single plot.
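A layered ggplot2 version of this plot might look like the following, assuming the mtcars data. Points and per-group linear fits are added as separate layers. The labels and the names wt_mpg_plot and group_slopes are assumptions for this sketch.

```r
library(ggplot2)

# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# Mapping color to factor(cyl) groups both the points and the fitted lines.
wt_mpg_plot <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 2) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       color = "Cylinders",
       title = "MPG vs. Weight by Cylinders")
print(wt_mpg_plot)

# Slope of a least-squares line of mpg on wt within each cylinder group,
# computed separately with lm() as a numeric companion to the plot.
group_slopes <- sapply(split(mtcars, mtcars$cyl),
                       function(d) coef(lm(mpg ~ wt, data = d))[["wt"]])
print(group_slopes)
```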




This faceted histogram shows the distribution of MPG for each cylinder group separately. Each panel represents a different number of cylinders, allowing for easy comparison. The 4-cylinder group has the highest MPG values, while the 8-cylinder group is concentrated at much lower values. The 6-cylinder group falls in between with a narrower range. This reinforces the pattern that more cylinders are associated with lower fuel efficiency.
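One way to build a faceted histogram like this is ggplot2's facet_wrap(), sketched below under the assumption that the data are mtcars. The bin width, colors, single-column layout, and the names mpg_facets and mpg_spread are illustrative choices; the original may have used different settings.

```r
library(ggplot2)

# Assumption: the data are R's built-in mtcars data frame.
data(mtcars)

# One histogram panel per cylinder count, stacked so the x-axes line up.
mpg_facets <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2.5, fill = "steelblue", color = "white") +
  facet_wrap(~ cyl, ncol = 1) +
  labs(x = "Miles per gallon",
       y = "Number of cars",
       title = "MPG Distribution by Cylinders")
print(mpg_facets)

# Width of the MPG range (max minus min) within each cylinder group.
mpg_spread <- tapply(mtcars$mpg, mtcars$cyl, function(x) diff(range(x)))
print(mpg_spread)
```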


Comparison of Base R, Lattice, and ggplot2

Syntax and Workflow

One of the main differences between the three visualization systems is their syntax and overall workflow. Base R graphics use a direct, function-based approach. Functions like plot() and hist() are quick and simple, but the code can become cluttered as plots grow more complex, because each customization needs extra arguments or follow-up calls.

The lattice package uses a formula-based syntax, which makes it well suited for grouped or conditioned plots. However, its syntax can feel less intuitive, and customizing a plot beyond the defaults is less flexible than in ggplot2.

ggplot2 follows the “grammar of graphics,” building plots layer by layer. While it may take more effort to learn, it provides a consistent structure and makes it easy to add elements like colors, regression lines, and facets.

Control and Output Quality

ggplot2 produced the most polished and professional-looking visualizations with minimal effort once the syntax was understood. Base R was fastest for simple plots but required more manual adjustments, while lattice worked well for grouped data but was less flexible overall.

Challenges

Switching between systems required adapting to different approaches. Base R is immediate, lattice uses conditioning, and ggplot2 relies on layering. Although ggplot2 was initially the most difficult to learn, it proved to be the most powerful once understood.

Conclusion

Overall, ggplot2 was the preferred system due to its flexibility and visual quality. Base R is useful for quick exploration, and lattice is effective for structured comparisons.






