<section class="red" data-type="chapter">
<header>
<div class="icon"><img src="../images/sections/05/checkbox.png" /></div>
<p>Chapter 13</p>
<h1>Graphing the Results of Checkbox Responses</h1>
<p data-type="author">By Ellen Cooper</p>
</header>
<section data-type="sect1">
<p>This chapter focuses on checkbox responses, also called multiple-response questions, in which a respondent can select more than one answer.</p>
<h2>Checkboxes Vs. Radio Buttons</h2>
<p>Let’s say you’re doing a survey and you’re interested in what multimedia devices your respondents have used over the last six months. You would use a checkbox response question if you wanted to find out every device that people used over the six-month period. A radio button allows respondents to select only a single answer, so you could use it only to find out, for example, which one device a person used most often during that same six-month period. Each type of question has merit; which one you should use depends on the purpose of your question and how you are going to use the results.</p>
<h2>What a Checkbox Question Really Is</h2>
<p>So here’s the most important thing to know about checkbox questions, and it’s why you have to consider how you graph the results of checkbox questions differently than you do the results of other types of questions. Checkbox questions aren’t really their own question type! They’re actually just a shorthand way to write a <em>series</em> of yes/no questions. A respondent checks a box if an answer choice applies and leaves it blank if it doesn’t.</p>
<p>We have the checkbox format because it makes surveys more streamlined and easier to understand. In the example below, we asked, “Which of the following electronic devices have you used in the past 6 months? Please select all that apply.” The premise behind the question is that it’s likely that a respondent could use more than one electronic device over a 6-month period, such as a cell phone and a tablet.</p>
<p>If we were to pose this as a series of yes/no questions, it would read something like this:</p>
<table>
<tbody>
<tr>
<th colspan="2">In the last 6 months, have you used a/an:</th>
</tr>
<tr>
<td>Desktop PC?</td>
<td>Y / N</td>
</tr>
<tr>
<td>Desktop Mac?</td>
<td>Y / N</td>
</tr>
<tr>
<td>iPad?</td>
<td>Y / N</td>
</tr>
<tr>
<td>Tablet (other than an iPad)?</td>
<td>Y / N</td>
</tr>
<tr>
<td>Laptop (Mac or PC)?</td>
<td>Y / N</td>
</tr>
<tr>
<td>Cell phone?</td>
<td>Y / N</td>
</tr>
</tbody>
</table>
<p>With the checkbox question, survey respondents only need to check the answers that apply to them, while in a series of yes/no questions, they would need to respond to every question, even if all their answers were “No”. With a checkbox question, you can simply provide a “None” option at the bottom of your choice list to handle this situation. When several yes/no questions are related, checkbox questions also prevent repetition of instructions, since all the questions are grouped into one.</p>
<p>These changes can help improve survey readability, flow, length, and overall response rates. However, if you want to handle the resulting data correctly, it is very important to remember that the underlying structure of a checkbox question is actually a series of dichotomous questions.</p>
<h2>How Checkbox Answers Are Received</h2>
<p>How your results or raw data are compiled will, of course, depend on the program you are using to design and distribute your survey. One of the more common formats is shown in the table below; this particular data structure reflects how a checkbox question serves as a quick way to represent a series of yes/no questions. A “1” is shown when a device was selected and a “0” if a device was not selected.</p>
<table>
<tbody>
<tr>
<th>Date</th>
<th>Q1_PC</th>
<th>Q1_Mac</th>
<th>Q1_Tablet</th>
<th>Q1_iPad</th>
<th>Q1_Laptop</th>
<th>Q1_Cellphone</th>
<th>Q1_None</th>
</tr>
<tr>
<td>10/02/2013</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>10/01/2013</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/27/2013</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/26/2013</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>09/26/2013</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>Total</td>
<td>6</td>
<td>3</td>
<td>3</td>
<td>2</td>
<td>5</td>
<td>9</td>
<td>1</td>
</tr>
</tbody>
</table>
<p>You might also receive results like this:</p>
<table>
<tbody>
<tr>
<th>Date</th>
<th>Response</th>
</tr>
<tr>
<td>10/02/2013</td>
<td>PC, Tablet, Cellphone</td>
</tr>
<tr>
<td>10/01/2013</td>
<td>Mac, iPad, Laptop, Cellphone</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>PC, iPad, Cellphone</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>PC, Tablet, Cellphone</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>Mac, Laptop, Cellphone</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>Mac, Tablet, Cellphone</td>
</tr>
<tr>
<td>09/30/2013</td>
<td>PC, Laptop, Cellphone</td>
</tr>
<tr>
<td>09/27/2013</td>
<td>PC, Laptop, Cellphone</td>
</tr>
<tr>
<td>09/26/2013</td>
<td>PC, Laptop, Cellphone</td>
</tr>
<tr>
<td>09/26/2013</td>
<td>None</td>
</tr>
</tbody>
</table>
<p>Or like this:</p>
<table>
<tbody>
<tr>
<th>Date</th>
<th>Q1_PC</th>
<th>Q1_Mac</th>
<th>Q1_Tablet</th>
<th>Q1_iPad</th>
<th>Q1_Laptop</th>
<th>Q1_Cellphone</th>
<th>Q1_None</th>
</tr>
<tr>
<td>10/02/2013</td>
<td>Q1_PC</td>
<td> </td>
<td>Q1_Tablet</td>
<td> </td>
<td> </td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>10/01/2013</td>
<td> </td>
<td>Q1_Mac</td>
<td> </td>
<td>Q1_iPad</td>
<td>Q1_Laptop</td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/30/2013</td>
<td>Q1_PC</td>
<td> </td>
<td> </td>
<td>Q1_iPad</td>
<td> </td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/30/2013</td>
<td>Q1_PC</td>
<td> </td>
<td>Q1_Tablet</td>
<td> </td>
<td> </td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/30/2013</td>
<td> </td>
<td>Q1_Mac</td>
<td> </td>
<td> </td>
<td>Q1_Laptop</td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/30/2013</td>
<td> </td>
<td>Q1_Mac</td>
<td>Q1_Tablet</td>
<td> </td>
<td> </td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/30/2013</td>
<td>Q1_PC</td>
<td> </td>
<td> </td>
<td> </td>
<td>Q1_Laptop</td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/27/2013</td>
<td>Q1_PC</td>
<td> </td>
<td> </td>
<td> </td>
<td>Q1_Laptop</td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/26/2013</td>
<td>Q1_PC</td>
<td> </td>
<td> </td>
<td> </td>
<td>Q1_Laptop</td>
<td>Q1_Cellphone</td>
<td> </td>
</tr>
<tr>
<td>09/26/2013</td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td>Q1_None</td>
</tr>
</tbody>
</table>
<p>All three of the above examples represent the same answers, but they’re formatted in different ways. Since different survey collection tools format checkbox responses in different ways, you may need to reformat your data to match the specific format required by the visualization software you are using.</p>
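<p>As a rough illustration of that reformatting step, the sketch below turns comma-separated responses into the 0/1 layout and then sums each column. It uses only the Python standard library. The names <code>OPTIONS</code>, <code>RESPONSES</code>, <code>to_indicators</code>, and <code>column_totals</code> are invented for this example, and the option labels are assumed to match the answer text exactly; a real export may need extra cleanup.</p>

```python
# Answer options, in the column order used by the 0/1 table in this chapter.
OPTIONS = ["PC", "Mac", "Tablet", "iPad", "Laptop", "Cellphone", "None"]

# The ten sample responses, written in the comma-separated format.
RESPONSES = [
    "PC, Tablet, Cellphone",
    "Mac, iPad, Laptop, Cellphone",
    "PC, iPad, Cellphone",
    "PC, Tablet, Cellphone",
    "Mac, Laptop, Cellphone",
    "Mac, Tablet, Cellphone",
    "PC, Laptop, Cellphone",
    "PC, Laptop, Cellphone",
    "PC, Laptop, Cellphone",
    "None",
]


def to_indicators(response, options=OPTIONS):
    """Turn one comma-separated response into a list of 0/1 flags."""
    chosen = {item.strip() for item in response.split(",") if item.strip()}
    unknown = chosen - set(options)
    if unknown:
        # Fail loudly instead of silently dropping a misspelled answer.
        raise ValueError(f"unrecognized answer(s): {sorted(unknown)}")
    return [1 if option in chosen else 0 for option in options]


def column_totals(rows):
    """Sum each 0/1 column: how many respondents chose each option."""
    return [sum(column) for column in zip(*rows)]


indicator_rows = [to_indicators(response) for response in RESPONSES]
totals = dict(zip(OPTIONS, column_totals(indicator_rows)))
print(totals)
```

<p>For these ten responses, the column sums come out to 6, 3, 3, 2, 5, 9, and 1, matching the totals row of the 0/1 table.</p>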
<p>Let’s take a look at a summary of possible responses to the checkbox question posed above.</p>
<table class="tableizer-table">
<tbody>
<tr class="tableizer-firstrow">
<th>Table 1 Electronic Devices Used</th>
<th>Total</th>
</tr>
<tr>
<td>PC</td>
<td>421 (84%)</td>
</tr>
<tr>
<td>Mac (desktop)</td>
<td>300 (60%)</td>
</tr>
<tr>
<td>Tablet (any kind)</td>
<td>285 (57%)</td>
</tr>
<tr>
<td>iPad</td>
<td>185 (37%)</td>
</tr>
<tr>
<td>Laptop</td>
<td>200 (40%)</td>
</tr>
<tr>
<td>Cell phone (any kind)</td>
<td>450 (90%)</td>
</tr>
</tbody>
</table>
<p>You may notice that the total number of responses (1,841) is greater than the number of people who took the survey (N=500)! Why? It’s because of the whole “a checkbox is really a bunch of yes/no questions rolled into one” thing. The total possible number of checked boxes in a checkbox question? It’s the (# of “real” answer options) × (# of respondents). (Here, a “real” answer option means one that isn’t “None,” “N/A” or “Prefer not to Answer,” since selecting one of those options would prevent a person from choosing any other answers in addition to it.) For this survey, there were 6 device options (aside from “None”) that a person could select and there were 500 people who answered the survey. So the total number of boxes that had the potential to get checked during the survey was 3,000 (6 × 500), not just 500.</p>
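<p>The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the counts are copied from Table 1, while the variable names and the average-per-person figure are additions made for illustration.</p>

```python
# Counts from Table 1: respondents (out of 500) who checked each device.
COUNTS = {
    "PC": 421,
    "Mac (desktop)": 300,
    "Tablet (any kind)": 285,
    "iPad": 185,
    "Laptop": 200,
    "Cell phone (any kind)": 450,
}
RESPONDENTS = 500

total_mentions = sum(COUNTS.values())       # boxes that were actually checked
max_possible = len(COUNTS) * RESPONDENTS    # boxes that could have been checked
average_per_person = total_mentions / RESPONDENTS

print(total_mentions, max_possible, round(average_per_person, 2))
```

<p>The script reports 1,841 checked boxes out of a possible 3,000, or roughly 3.7 devices per respondent on average.</p>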
<h2>Displaying Your Results</h2>
<p>Since the total number of responses is greater than the number of respondents, you need to use some caution when creating graphs based on these data. There are a few different ways to illustrate your results, depending on what your overall question of interest is.</p>
<h3>Bar Charts</h3>
<p>One way is to construct a bar graph and base the percentages on the number of respondents who selected each answer choice, as in the graph below. Clearly, cellphones (90%) and PCs (84%) are the most commonly cited electronic devices used in the past six months.</p>
<figure><img alt="Electronic devices used" src="../images/sections/05/electronic-devices-used-1a.png" /></figure>
<p>However, the fact that the total adds to more than 100% can be unsettling for some. An alternative way to illustrate the results is to base the percentages on the total mentions (1,841). Analyzing results based on mentions is useful when you want the percentages to total 100%.</p>
<figure><img alt="Electronic devices used" src="../images/sections/05/electronic-devices-used-1b.png" /></figure>
<p>Keep in mind that this way of displaying data is based on the number of mentions of a device, not the number of consumers who use that device. While you may be tempted to say, “24% of consumers used a cellphone in the past six months,” the bar chart above isn’t actually displaying that information.</p>
<p>Rather than the number of respondents (500), the chart shows the number of responses (1,841) to our survey on electronic devices. So, it would be better to say, “Based on all devices mentioned, cellphones were mentioned approximately one-quarter (24%) of the time, followed closely by PCs (23%).” This percentage represents the number of cellphone mentions (450) out of the total mentions (1,841) and accurately reflects the information on display.</p>
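<p>To make the difference between the two bases concrete, here is a minimal sketch that computes both sets of percentages from the Table 1 counts. The helper name <code>percentages</code> and the variable names are assumptions made for this example; they are not part of any survey tool.</p>

```python
# Counts from Table 1: respondents (out of 500) who checked each device.
COUNTS = {
    "PC": 421,
    "Mac (desktop)": 300,
    "Tablet (any kind)": 285,
    "iPad": 185,
    "Laptop": 200,
    "Cell phone (any kind)": 450,
}
RESPONDENTS = 500


def percentages(counts, base):
    """Express each count as a percentage of the chosen base."""
    return {label: 100 * n / base for label, n in counts.items()}


by_respondents = percentages(COUNTS, RESPONDENTS)        # base = people
by_mentions = percentages(COUNTS, sum(COUNTS.values()))  # base = checked boxes

for label in COUNTS:
    print(f"{label:22} {by_respondents[label]:5.1f}% of respondents   "
          f"{by_mentions[label]:5.1f}% of mentions")

# Only the mention-based percentages form a part-to-whole breakdown.
print(round(sum(by_respondents.values()), 1), round(sum(by_mentions.values()), 1))
```

<p>The respondent-based percentages add up to about 368%, while the mention-based percentages add up to 100%. That is why only the mention-based figures describe parts of a single whole.</p>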
<p>Depending on your question of interest, you can also group your data. Maybe you’re more interested in reporting how many devices people used than in exactly which devices they used. You could make a column chart like the one below.</p>
<figure><img alt="Percent by grouping" src="../images/sections/05/devices-reported.png" /></figure>
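<p>One possible way to prepare this kind of grouping is to count the checked boxes in each respondent’s row and then tally how many respondents fall at each count. The sketch below does this for the ten sample rows from the 0/1 table earlier in the chapter. The variable names are illustrative, and the groups used in the chart above may be defined differently.</p>

```python
from collections import Counter

# 0/1 rows from the sample table, device columns only, in the order
# PC, Mac, Tablet, iPad, Laptop, Cellphone (the "None" column is left out).
ROWS = [
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Number of devices each respondent checked.
devices_per_person = [sum(row) for row in ROWS]

# How many respondents checked 0 devices, 1 device, 2 devices, and so on.
distribution = Counter(devices_per_person)

for n_devices in sorted(distribution):
    share = 100 * distribution[n_devices] / len(ROWS)
    print(f"devices used = {n_devices}: "
          f"{distribution[n_devices]} respondents ({share:.0f}%)")
```

<p>In this small sample, eight respondents used three devices, one used four, and one used none. Because each respondent lands in exactly one group, these shares total 100%.</p>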
<div data-type="warning"><h3>Warning about pie charts and checkbox questions</h3>
<p>Don’t use pie charts if you’re basing your percentages on the number of respondents that selected each answer choice! Pie charts are used to represent part-to-whole relationships and the total percentage of all the groups has to equal 100%. Since the possible sum of the percentages is greater than 100% when you base these calculations on the number of respondents, pie charts are an incorrect choice for displaying these results.</p>
</div>
<h3>Over Time</h3>
<p>If your checkbox responses have been collected over a period of time, say 2009–2013, you could display the responses in a line chart as shown below.</p>
<figure><img alt="Line chart for checkboxes" src="../images/sections/05/checkbox-line.png" /></figure>
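<p>A line chart of this kind needs one percentage per time period for each answer choice. As a hedged sketch of that preparation step, the code below groups the ten sample rows from the 0/1 table by month and computes the share of respondents who checked “PC” in each month. The function name <code>percent_by_period</code> is made up for this example. The dates are assumed to be in MM/DD/YYYY form, and grouping by month rather than by year is a choice made only because the sample covers a few days.</p>

```python
from collections import defaultdict
from datetime import datetime

# (date, PC flag) pairs taken from the Date and Q1_PC columns of the 0/1 table.
ROWS = [
    ("10/02/2013", 1),
    ("10/01/2013", 0),
    ("09/30/2013", 1),
    ("09/30/2013", 1),
    ("09/30/2013", 0),
    ("09/30/2013", 0),
    ("09/30/2013", 1),
    ("09/27/2013", 1),
    ("09/26/2013", 1),
    ("09/26/2013", 0),
]


def percent_by_period(rows, period_format="%Y-%m"):
    """Percent of respondents in each period who checked the box."""
    checked = defaultdict(int)   # boxes checked per period
    answered = defaultdict(int)  # respondents per period
    for date_text, flag in rows:
        period = datetime.strptime(date_text, "%m/%d/%Y").strftime(period_format)
        checked[period] += flag
        answered[period] += 1
    return {p: 100 * checked[p] / answered[p] for p in sorted(answered)}


print(percent_by_period(ROWS))
```

<p>Passing <code>period_format="%Y"</code> would group by year instead. Each period’s percentage is based on the respondents in that period, so the values for different answer choices can still sum to more than 100% within a period.</p>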
<h2>Attitudinal Measurements</h2>
<p>So far, we’ve been looking at the use of checkbox questions to gather data on basic counts (e.g. electronic devices used). Checkbox responses can also be used to assess agreement with <a class="glossterm" href="glossary01.html#statement-attitudinal" target="_blank">attitudinal statements</a>. The graph below shows attitudes of homeowners towards owning their home. In this case, since the statistic of interest is what percentage of homeowners agree with each statement, it is probably best to keep the graph as is, with the total exceeding 100%.</p>
<figure><img alt="Home ownership bar graph" src="../images/sections/05/home-ownership.png" /></figure>
</section>
</section>