Data Wrangling, Cleaning, and Exploring

Once you have identified a dataset, the next step is to clean (or "wrangle") any messy data so that it is usable. You can then decide what you want to visualize, which often involves analyzing different subsets of the data and thinking in terms of both the forest and the trees: the high-level view and the close reading. How you combine or work with different subsets will determine what your viewers can take away.

This section offers a few gentle, introductory ways into working with data for those with little to no experience. It covers the basics of understanding CSV files, building pivot tables, and cleaning data with OpenRefine, a free tool designed for this purpose. If you are just starting out, you might also try working with a simpler dataset and giving it a few test runs.

Note: While this section focuses on working with spreadsheets, you might also consider adding Python or R to your list of skills in the future. A few materials on those tools will be included in the Additional Resources section.
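
If you are curious what cleaning a CSV file and building a pivot-table-style summary might look like in Python, here is a minimal sketch using only the standard library. The sample data, the column names (city, year, visitors), and the helper functions clean_rows and pivot_sum are all made up for illustration; a real dataset will need its own cleaning rules.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data. Stray spaces, inconsistent capitalization,
# and missing values are typical of "messy" spreadsheets.
RAW_CSV = """city,year,visitors
 Boston ,2020,100
boston,2021,150
Chicago,2020,200
CHICAGO,2021,
"""


def clean_rows(text):
    """Parse CSV text, trim whitespace, normalize city names,
    and skip rows that have no visitor count."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        visitors = row["visitors"].strip()
        if not visitors:
            continue  # drop rows with a missing count
        rows.append({
            "city": row["city"].strip().title(),
            "year": row["year"].strip(),
            "visitors": int(visitors),
        })
    return rows


def pivot_sum(rows, index, values):
    """Sum the `values` column for each distinct entry in the `index`
    column, much like a one-dimensional spreadsheet pivot table."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[index]] += row[values]
    return dict(totals)


rows = clean_rows(RAW_CSV)
totals = pivot_sum(rows, "city", "visitors")
print(totals)
```

Here " Boston " and "boston" are treated as the same city once the text is cleaned, which is the same kind of fix you would make by hand in a spreadsheet or with OpenRefine before summarizing.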


Read Further

  • Miriam Posner has created many helpful tutorials that I would encourage you to explore.