Data Wrangling, Cleaning, and Exploring
Once you have identified a dataset, the next step is to clean or wrangle any messy data so that it is usable. You can then move on to deciding what you want to visualize, which often involves analyzing different subsets of the data and thinking in terms of both the forest and the trees (the high-level view and the close reading). How you combine or work with different subsets will determine what your viewers can take away.
This section points to a few gentle, introductory ways into working with data for those with little to no experience in this space. It covers the basics of understanding CSV files, building pivot tables, and cleaning data with OpenRefine, a free tool built for this purpose. If you are just starting out, you might also try working with a simpler dataset and giving it a few test runs.
Note: While this section focuses on working with spreadsheets, you might also consider adding Python or R to your list of skills in the future. A few materials on those tools will be included in the Additional Resources section.
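For readers curious about what the Python route might look like, here is a minimal sketch that uses only Python's standard library to take a first look at a CSV file: for each column, it counts how many cells are filled in and how many distinct values appear. The sample data and the function name `summarize_columns` are made up for illustration; with a real file you would read its contents instead of the inline text.

```python
import csv
import io

# Hypothetical sample data standing in for a real CSV file.
SAMPLE = """name,city,year
Ada,London,1843
Grace,New York,1952
Alan,,1936
Ada,London,1843
"""

def summarize_columns(text):
    """Return {column: (filled_count, distinct_count)} for CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    summary = {}
    for column in reader.fieldnames or []:
        # Treat missing or whitespace-only cells as empty.
        values = [(row[column] or "").strip() for row in rows]
        filled = [value for value in values if value]
        summary[column] = (len(filled), len(set(filled)))
    return summary

summary = summarize_columns(SAMPLE)
for column, (filled, distinct) in summary.items():
    print(f"{column}: {filled} filled, {distinct} distinct")
```

A column with fewer filled cells than rows points to gaps, and a column with fewer distinct values than filled cells points to repeats. Both are good starting questions to ask of a dataset.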
WTF CSV by DataBasic is a great first resource for figuring out what a CSV file contains and how to start asking questions of that material. It is a very basic but useful tool to get you thinking about your data, though ultimately you will probably work with something like Excel, Google Sheets, or LibreOffice Calc.
“Gentle Introduction to Cleaning Data,” made by Tactical Tech for School of Data, covers cleaning with LibreOffice Calc or a similar tool.
“Getting Started with OpenRefine” by Miriam Posner will get you started with OpenRefine, from installing it to working with the tool.
“Make A Pivot Table with Excel” by Miriam Posner covers the basics of pivot tables and turning them into simple charts in Excel or a similar tool.
“Gentle Introduction to Exploring and Understanding Your Data,” made by Tactical Tech for School of Data, covers making and adding to pivot tables with LibreOffice Calc or a similar tool.
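The two ideas these tutorials return to most often, cleaning inconsistent values and summarizing with a pivot table, can also be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a substitute for the spreadsheet or OpenRefine workflows above: the sample data and the `clean` helper are hypothetical, and title-casing every value is a deliberately blunt rule that would mangle some real names.

```python
import csv
import io
from collections import Counter

# Hypothetical messy data: stray spaces and inconsistent capitalization.
MESSY = """city,category
 London ,Museum
london,Library
New York,museum
LONDON,Museum
"""

def clean(value):
    """Trim surrounding whitespace and normalize capitalization."""
    return value.strip().title()

# Apply the same cleaning rule to every cell in every row.
rows = [
    {key: clean(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(MESSY))
]

# Pivot-style summary: count rows for each (city, category) pair.
pivot = Counter((row["city"], row["category"]) for row in rows)

for (city, category), count in sorted(pivot.items()):
    print(f"{city:10} {category:10} {count}")
```

Before cleaning, the three spellings of London would be counted as three separate cities. After cleaning, they collapse into one, which is why cleaning usually comes before any summary.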