Mon 07 March 2022 2 minutes

One of the weird things about ZIP codes is that they’re numbers, yet if you add or subtract them you get no meaningful data. There’s a reason for this: ZIP codes are qualitative data rather than quantitative.

Qualitative data (you’ll also find it referred to as “nominal” or “categorical” data) represents, as its name suggests, the quality and categorization of variables. This differs from quantitative data, which measures something. Qualitative data is like your social security number or the key code you enter to get into a building: you use a specific pattern for a specific reason, and changing that pattern makes the data meaningless.

It’s for this reason that in Censtats data sets you’ll likely see that the ZCTA column is formatted as text. We recommend keeping it that way, so that whatever program you’re using doesn’t erase key information such as leading zeroes. You may run into this problem if you load our CSVs into Microsoft Excel or an alternative without changing the formatting of the column. To change it, see the screenshot and steps below:
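For readers who load the CSVs with a script rather than a spreadsheet, here is a minimal Python sketch of the same leading-zero problem. The sample rows and the "population" column are made up for illustration and are not taken from an actual Censtats file; only the ZCTA column name comes from the text above.

```python
import csv
import io

# Hypothetical CSV contents; the values and the "population" column are made up.
SAMPLE_CSV = "ZCTA,population\n02108,4000\n10001,25000\n"


def read_rows(text):
    """Read CSV text; the csv module keeps every field as a string."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(SAMPLE_CSV)

# Kept as text, the leading zero survives.
zcta_as_text = rows[0]["ZCTA"]

# Converted to a number, the leading zero is lost.
zcta_as_number = int(rows[0]["ZCTA"])

# A code that has already lost its zero can be padded back to five digits.
restored = str(zcta_as_number).zfill(5)

print(zcta_as_text, zcta_as_number, restored)
```

The point of the sketch is simply that the code should stay a string from the moment it is read; padding with zfill is a repair step for data that was already converted.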

NOTE: This brief tutorial uses LibreOffice, so things may look slightly different depending on your program of choice.

Highlight the column in question (in this case, the ZCTA column), then open the "Format" menu and select "Cells...". The dialog that appears offers several options. Whichever option changes the cells from "numbers" to "text" is the correct one for the ZIP code column.

Ultimately, if you’re ever unsure whether a variable is qualitative or quantitative, remember this: Does adding these values together yield meaningful information? If yes, then it’s quantitative; if no, then it’s qualitative.
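As a rough illustration of that test, the Python sketch below uses a handful of made-up survey records (the ZIP codes and household sizes are hypothetical values chosen for the example). Adding the household sizes gives a real total, adding the ZIP codes gives a number that means nothing, and the useful operation on ZIP codes is counting or grouping.

```python
from collections import Counter

# Made-up records: each respondent's ZIP code and household size.
zip_codes = ["02108", "10001", "10001"]
household_sizes = [2, 4, 3]

# Quantitative: the sum is the total number of people covered.
total_people = sum(household_sizes)

# Qualitative: the sum runs without error but describes nothing real.
zip_sum = sum(int(z) for z in zip_codes)

# What does make sense for categories is counting how often each one occurs.
respondents_per_zip = Counter(zip_codes)

print(total_people, zip_sum, respondents_per_zip["10001"])
```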

