
Mitchell, E. (2006). A Proposed Methodological Tool for Social Surveys. PHILICA.COM Article number 60.

ISSN 1751-3030

A Proposed Methodological Tool for Social Surveys

Ethan Mitchell (Independent Researcher)

Published in socio.philica.com

A proposal for the creation of rubrics to code write-in answers to common identifying questions on surveys. The use of a standard rubric would allow surveyors to capture a wider range of possible answers, without reactive bias. This could be useful both in studying minority populations, and also in ensuring accuracy when identifying non-minority populations. An example of such a rubric, dealing with gender, is appended.

Article body

A Proposed Methodological Tool for Social Surveys


                        The name that can be named is not the eternal name

                                                                                                -Lao Tsu


            Survey questionnaires have long tried to find a happy medium between multiple-choice and write-in questions.  Multiple-choice question formats offer an ease of use and a degree of reliability that write-in questions do not.  Yet they also create a “complex question” fallacy on the part of the surveyor, which is only slightly mitigated by allowing “Other” or “None of the Above” as responses.  Moreover, multiple-choice questions show the respondent something about the researcher’s thought processes, which may bias their responses to the rest of the survey.  Simply by printing the phrase “African-Americans” rather than “Blacks,” or “Palestinians” rather than “Arabs,” the researcher is conveying biasing information to the person filling out the survey.  These shortcomings have been recognized to a large extent.  Gallup no longer asks whether gentlemen prefer blondes or brunettes, and the US Census has expanded their list of racial/ethnic categories from 10 possibilities in 1990 to 126 in 2000{1}.

            While write-in questions allow more freedom for the subject to express themselves precisely, they are almost certain to capture a greater range of responses than the surveyor has anticipated.  Without any pre-existing method to code these responses, the surveyor faces two non-trivial problems.  First, they may need to conduct secondary research to inform themselves as to what a response means.  If someone lists their first language as “Konjo,” it is likely that most researchers in the US or Europe will have no idea what language family the respondent belongs to.  Secondly, the surveyor must make a long series of categorization choices in order to collapse the data for analysis.

            Decisions about coding such results are inevitably reactive.  For example, a surveyor may ask a write-in question about religion, intending to collapse the results into “Christian,” “Jewish,” “Other Religions,” and “No Religion.”  Some responses, however, could be intuitively placed in more than one of these categories:  Mormons, Secular Humanists, Jews for Jesus.  For the surveyor to make these taxonomic decisions after the fact introduces a potentially severe reactive bias.  This is true even if the surveyor has taken steps to be blind to the remainder of the data.  For example, if the above survey is conducted in the hopes of showing that Christianity is declining in the American West, then it will be obvious that the coding of “Mormons” as Christian or Other will significantly change the results.

            In practice, such post-survey work is even more problematic, since it often involves multiple people handling the data, inconsistent coding methods, and non-blind coding.  For all of these reasons, multiple-choice questions have been preferred to write-in answers for a range of questions, including basic identifiers.

            The problems with write-in questions are not, however, insurmountable.  They can be addressed by the creation of rubrics for systematically coding the answers to given questions.  Such rubrics should be comprehensive enough to capture nearly all responses, so that the level of unusable answers is little different from the level of unusable answers that multiple-choice questions generate{2}.  Moreover, such rubrics should have a fixed system for data collapse, so that the researcher can reduce the results to 20, or 10, or 5, or 2 categories, without any opportunity for reactive bias.

            Importantly, such a rubric might have value even if the data is ultimately collapsed down as far as the original multiple-choice question.  For example, it is likely that at least 98% of respondents are “cisgendered”: that is, they identify as male or female.  On all but very large surveys, the transgendered respondents will be so few in number that it will be hard to draw statistically meaningful conclusions about them as a population.  However, a write-in system at least segregates those responses, so that they do not get erroneously included in male or female categories.

            Rubrics of this nature cannot be a substitute for in-depth questions on a topic.  They may, however, be able to achieve a much greater level of precision in identifying questions that are incidental to the major aim of a survey.  My aim, then, is to provide a toolbox for coding commonly asked questions, so that the surveyor is neither blind to minority responses, nor dazzled by their own biases in coding them.





            With this end in mind, I have published the first of what I hope will be a family of rubrics.  They can be collectively referred to as “Write-in Survey Standards Published on Philica,” or WISS-POP, which is certainly pronounced “wizpop.”  The remainder of this section describes how to use this rubric, and any that follow.

            First, it must be understood that the initial versions of these rubrics are quite provisional.  The range of possible answers has been compiled from conversations and literature, but not yet from field testing.  Since these rubrics will be superseded, it is important for the researcher who wishes to use them to refer to the version in question.  The first to be published is “WISS-POP Gender 1.0,” and others will follow this format.  The decision to use a particular rubric should be made before receipt of the survey results, or once again there is a potential for reaction to the data.

            Once the data is in hand, each of the rubrics contains a search-and-replace list to collapse variant spellings, synonyms, and so forth.  Once this list has been applied, either electronically or manually, the data is ready for coding.
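As a minimal sketch of how this step might be done electronically, the fragment below applies a few entries from the WISS-POP Gender 1.0 search-and-replace list (given later in this paper) as case-insensitive regular expressions. The patterns and the function name are illustrative assumptions, not part of the published rubric; the gender symbols and the Pre-Op/Post-Op deletion rule are omitted.

```python
import re

# Illustrative subset of the WISS-POP Gender 1.0 search-and-replace list.
# Each pattern collapses variant spellings and synonyms to one token.
SEARCH_AND_REPLACE = [
    (r"\b(cross-gendered|trans(?:s?exual|gender(?:ed)?)?|trann?(?:y|ie)|tran)\b", "Tran."),
    (r"\b(girl|grrl|gurl|dyke|lesbian|wom[ao]n|butch)\b", "female"),
    (r"\b(boy|boi|fag|gay|man|daddy)\b", "male"),
    (r"\b(transvestite|cross-dresser|cd)\b", "TV"),
]

def normalize(response: str) -> str:
    """Collapse variant spellings and synonyms before coding."""
    out = response.strip()
    for pattern, replacement in SEARCH_AND_REPLACE:
        out = re.sub(pattern, replacement, out, flags=re.IGNORECASE)
    return out
```

Applied to a response like “Transgendered woman,” this yields “Tran. female,” which can then be looked up in the alphabetical index.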

            Each rubric contains an alphabetical index of possible responses (as seen after the search-and-replace function).  In general, the responses are coded with a particular number.  In a very few cases, the researcher may be asked to make an outside distinction between two possibilities.  For example, the language “Konjo,” mentioned above, might refer either to a language spoken in Uganda or a language spoken in Indonesia.  In such cases, a third code is provided if the surveyor feels they cannot make this distinction.

            Results that are not listed in the index should be assigned to the most reasonable categories.  The majority of these cases will involve extraneous additions to answers that are found in the index, such as “Baptist!!!” for “Baptist.”  More problematic cases should be annotated in the documentation for the survey results, or simply left out.  Please consider informing the author of such cases for inclusion in future versions of WISS-POP.
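The coding step itself can be sketched as a simple lookup table. Only a handful of entries from the alphabetical index that follows are reproduced here, and the automatic fall-through to code 2 is an illustrative shortcut: the rubric asks the surveyor to assign unlisted answers to the most reasonable category by hand first, and to use code 2 only when no assignment is made.

```python
# Illustrative subset of the WISS-POP Gender 1.0 alphabetical index,
# keyed on the normalized (post search-and-replace) response.
INDEX = {
    "": 1,                       # [blank]
    "?": 2,
    "female": 3,
    "male": 4,
    "neither": 5,
    "intersex": 6,
    "genderqueer": 7,
    "tran. female": 8,
    "tran. male": 9,
    "none of your business": 1,
}

UNLISTED = 2  # [answer not found in the index and no assignment made]

def code(normalized_response: str) -> int:
    """Map a normalized write-in answer to its numerical code."""
    return INDEX.get(normalized_response.strip().lower(), UNLISTED)
```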

            Finally, with all answers coded, the surveyor can collapse the numerical codes to any degree, using a dendrogram provided with the rubric.
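One way to implement this collapse, assuming the “A” dendrogram defined later in this paper, is to represent each collapsed node by the set of its child codes and substitute children by their parent. The function name and the bottom-up ordering requirement are illustrative details of this sketch, not prescriptions of the rubric.

```python
# "A" dendrogram from WISS-POP Gender 1.0 (nodes #14-#22, including the
# shared nodes #14 and #15).  Each parent node maps to its child codes.
DENDROGRAM_A = {
    14: {1, 2},    # No information
    15: {6, 7},    # Maleness and femaleness, or non-polar transformation
    16: {8, 10},   # All transformations to femaleness
    17: {9, 11},   # All transformations to maleness
    18: {13, 16},  # All transgendered femaleness
    19: {12, 17},  # All transgendered maleness
    20: {3, 18},   # All femaleness
    21: {4, 19},   # All maleness
    22: {5, 15},   # All other
}

def collapse(codes, nodes):
    """Replace child codes by each requested parent node.

    Nodes must be listed bottom-up (e.g. 16 before 18, 18 before 20),
    since higher nodes are built from lower ones.
    """
    out = list(codes)
    for parent in nodes:
        children = DENDROGRAM_A[parent]
        out = [parent if c in children else c for c in out]
    return out
```

Collapsing all the way to the male/female poles, for instance, means applying nodes 16 through 21 in order, with no opportunity for the researcher to make case-by-case taxonomic choices.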

            Without any question, this method of data collection and coding presents certain unique challenges.  It also reduces the “emic” answers submitted by respondents to “etic” form, losing a great range of connotations and secondary information.  But though these fine subtleties are lost, WISS-POP may be able to provide a window on a universe of nuances that would otherwise be invisible, and can do so with relatively little effort.




WISS-POP Gender 1.0


            “But there are more than five sexes…”

                                                                        -Lawrence Durrell



            The remainder of this paper is a rubric for coding write-in answers to the question “Gender?” in the WISS-POP system.  This rubric can be referenced as WISS-POP Gender 1.0.  It was created as an example of the type of tool I am describing, but I hope it is useful in its own right.

            Gender is possibly the most controversial identifying question on social surveys.  As with race or other demographic groupings, there is ambiguity and controversy involved in the classification.  However, the ambiguities in regard to gender are often unknown or denied.  For this reason, an expanded multiple-choice question (Female/Male/Other) would seem extraordinary to a large majority of respondents, and possibly introduce bias to their other answers.

            By capitulating to this state of affairs, our lack of knowledge about gender has become self-sustaining.  In a widely cited meta-study, Blackless et al. (2000) have proposed that as many as 2% of live human births may be non-sexually-dimorphic.  Using that figure, there are about as many physiologically intersex people as there are German speakers.  To this we must add the overlapping layers of gender construction, presentation, identity, alteration, and performance.  And yet all of these nuances are systematically rendered invisible to the social scientist who is not directly studying gender construction{3}.  Moreover, insofar as respondents are categorized as male or female when this is not how they identify, their experience is not simply unrecorded, it actually injures the data we do recover.





            Gender identifiers are a vocabulary in flux, and the political dynamics are such that taxonomic clarity may not always be viewed as advantageous by the affected communities.  Nevertheless, certain patterns have developed which this paper follows.  The most basic of these involves binary gender phrases, in conjunction with expressions of transformation (e.g. “transman,” “transgendered female”).  These are typically used in reference to people whose identity is moving (or has moved) towards the specified gender.  Thus “transwoman” refers to an individual transitioning towards femaleness, as opposed to an individual transitioning towards maleness.  I have followed that convention here.  It does not capture every use of the terms, and perhaps no rubric will be able to do that for quite some time{4}.

            This paper, as described above, aims to provide a rubric for expanding the range of possible identities that a survey can record in a practical and non-reactive fashion.  It is emphatically not meant as a substitute for discussing gender identity.  Most if not all of the terms below carry specific connotations that transcend the ways I have categorized them.  Even the variant spellings “trans(s)exual” or the contraction “trans” have specific connotations, and are the source of conflict (Cromwell 1999).  Other terms included in this list are widely considered offensive or anachronistic, or are simply unlikely to be used as self-identifiers (some surveys may be asking for nonself-identifiers, as well). 

            On a different note, this classification should not be used to infer chromosomal sex or any other physiological state.  This is not simply a taxonomic choice on my part; almost none of the terms currently in use for gender provide adequate information on this topic.  This is even true for cisgender responses.  Some individuals who have had sex reassignment surgery no longer consider themselves trans(s)exuals, and simply refer to themselves with a cissexual descriptor, such as “male.”  While I believe that a rubric of this sort can be a valuable tool for medical research, it is not a substitute for medical inquiry.

            Finally, and perhaps paradoxically, this rubric should not be used for surveys that specifically target transgendered subjects.  For such purposes, it is not sensitive enough.  The intention is to allow any survey to capture a wide range of gender identities, not to provide a framework for the study of gender itself.



Search-and-Replace List


Cross-Gendered, Transexual, Transsexual, Transgendered, Transgender, Trans, Tran, Tranny, Trannie → Tran.

Girl, grrl, gurl, dyke, lesbian, woman, womon, butch, [female symbol] → female

Boy, boi, fag, gay, man, daddy, [male symbol] → male

Transvestite, cross-dresser, CD → TV

Pre-Op, Post-Op, No-Op, pre-operative, post-operative → [delete unless this is the entire entry, otherwise “Op”]


Alphabetical Index of Possible Responses


This index lists responses as read after using the search-and-replace list above.  The numbers in parentheses are the codes assigned for a given response.


[Hermaphrodite symbol] (7)

? (2)

[answer not found in the index and no assignment made] (2)

[blank] (1)

[illegible response] (2)

[response invalidated] (2)

Agender (5)

AIS (6)

All (6)

Androgen insensitive (6)

Androgyne (5)

Asexual (5)


Bigender (6)

Both (6)

CAIS (6)

Classified (1)

Complete androgen insensitivity syndrome (6)

Confidential (1)

Doesn’t matter (1)

Drag king (11)

Drag queen (10)

Eunuch (12)

Ex-female (13)

Ex-male (12)

F (3)

F2f (8)

F2m (9)

Female (3)

Female in a male’s body (8)

Female male (8)

Female TV (11)

Female-bodied male (9)

Femaling male (8)

Ferm (6)

Ftf (8)

Ftm (9)

F-t-m (9)

F-to-f (8)

F-to-m (9)

Gender bender (7)

Gender bending (7)

Gender blended (7)

Gender blender  (7)

Gender blending (7)

Gender outlaw (7)

Gender queer (7)

Genderfuck (7)

Genderfucker (7)

Genderfucking (7)

Genderless female (13)

Genderless male (12)

Genderqueer (7)

Gynandroid (5)

Herm (6)

Hermaphrodite (6)

Hermaphroditen (6)

I can’t say (1)

I won’t say (1)

I won’t tell you (1)

I’m not saying (1)

I’m not telling (1)

Intergender (6)

Intersex (6)

Intersexual (6)

Irrelevant (1)

It doesn’t matter (1)

It’s irrelevant (1)

It’s not important (1)

Klinefelter (6)

Klinefelter’s syndrome (6)

M (4)

M to f (8)

M2f (8)

M2m (9)

Male (4)

Male female (9)

Male in a female’s body (9)

Male pseudohermaphrodite (6)

Male TV (10)

Male-bodied female (8)

Maling female (9)

Merm (6)

Mind your business (1)

Mind your own business (1)

MPH (6)

Mtf (8)

M-t-f (8)

Mtm (9)

Multiple (6)

MYOB (1)

M-t-m (9)

M-to-f (8)

M-to-m (9)

N/a (1)

Na (1)

Need to know basis (1)

Need to know only (1)

Neither (5)

Neuter (5)

Neutrois (5)

New female  (8)

New male (9)

No (5)

No gender (5)

No response (1)

No sex (5)

None (5)

None of your business (1)

Not important (1)

Not saying (1)

Not telling (1)

Nullo (5)

Nullo female (13)

Nullo male (12)

Op. (7)

Other (7)

Other gender (7)

Other sex (7)

Outlaw (7)

Ovotestes (6)

PAIS  (6)

Pangender (7)

Partial androgen insensitivity syndrome (6)

PH (6)

Pseudohermaphrodite (6)

Queer (7)

Secret (1)

She-male (6)

Tf (8)

Third gender (7)

Tm (9)

Tran. (7)

Tran. f (8)

Tran. Female (8)

Tran. M (9)

Tran. male (9)

Transf (8)

Transm (9)

Transmale (9)

Transman (9)

Transqueer (7)

Transfemale (8)

Transwoman (8)

True hermaphrodite (6)

Tryke (8)

TV female (11)

TV male (10)

Two-spirit (7)

Ungendered female (13)

Ungendered male (12)

Unimportant (1)

Unsexed female (13)

Unsexed male (12)

Won’t tell (1)

XO (6)

XX (3)

XX male (6)

XXY (6)

XY (4)

XY female (6)

You don’t need to know (1)

You don’t need to know that (1)




            There are two dendrograms provided for this rubric.  Surveyors wishing to collapse categories primarily on the basis of a male-female gender continuum should use the “A” chart.  Surveyors looking primarily at the ways in which gender is presented or constructed should use the “B” chart.  These distinctions are presented in the code list and dendrograms below.



#1        No Response

#2        Ambiguous

#3        Expressing femaleness

#4        Expressing maleness

#5        Expressing no gender

#6        Expressing combined male and female

#7        Expressing nonspecific or ambiguous transformation or rejection of binary

#8        Expressing maleness transformed to femaleness

#9        Expressing femaleness transformed to maleness

#10      Expressing maleness garbed as female

#11      Expressing femaleness garbed as male

#12      Expressing maleness negated

#13      Expressing femaleness negated



Both Dendrograms


#14      (#1, #2)            No information

#15      (#6, #7)            Expressing maleness and femaleness, or non-polar gender transformation




“A”  Dendrogram


#16      (#8, #10)          All transformations to femaleness

#17      (#9, #11)          All transformations to maleness

#18      (#13, #16)        All transgendered femaleness

#19      (#12,#17)         All transgendered maleness

#20      (#3, #18)          All femaleness

#21      (#4, #19)          All maleness

#22      (#5, #15)          All other


“B” Dendrogram


#23      (#3, #4)            All cisgender  

#24      (#5, #12, #13)  All genderlessness

#25      (#8, #9)            All polar gender transformation

#26      (#10, #11)        All drag

#27      (#15, #24, #25, #26)    All transgender




{1}      Brewer and Suchan (2001)


{2}      Multiple-choice questions can fail in numerous ways: answers are graphically illegible; more than one box is checked on an “OR” question; no boxes are checked on a list that includes “other” or “none of the above”; the question is annotated.  A sufficiently complex rubric can accommodate these results, but in that case there is no need for a multiple-choice question.


{3}      This discourse of invisibility makes up one of the core concerns in transgender studies today. See the anthology edited by Namaste (2000) for several perspectives.


{4}      Cf. Cromwell (1999), pp. 21-23.



Bibliography and Text Sources Used


Blackless, Melanie, Anthony Charuvastra, Amanda Derryck, Anne Fausto-Sterling, Karl Lauzanne, and Ellen Lee. 2000. “How sexually dimorphic are we? Review and synthesis.” American Journal of Human Biology 12:151-166.


Brewer, Cynthia A., Trudy A. Suchan (2001) Mapping Census 2000: The Geography of U.S. Diversity.  ESRI Press, Redlands, California.


Cromwell, Jason (1999) Transmen and FTMs; Identities, Bodies, Genders, and Sexualities.  University of Illinois Press, Urbana


Griggs, Claudine (1998) S/He.  Berg.  Oxford. 


Namaste, Viviane K., ed. (2000) Invisible Lives: The Erasure of Transsexual and Transgendered People.  University of Chicago Press, Chicago.


Raphael (?)  The Angel’s Dictionary.  http://www.ncf.ca/ip/sigs/life/gay/manner/andre, Accessed Nov. 18, 2006.




Information about this Article
Peer-review ratings (from 1 review, where a score of 100 represents the ‘average’ level):
Originality = 25.00, importance = 6.25, overall quality = 6.25
This Article was published on 24th November, 2006 at 22:36:14 and has been viewed 8373 times.

This work is licensed under a Creative Commons Attribution 2.5 License.

1 Peer review [reviewer #82402] added 25th November, 2006 at 04:04:18

This article addresses an important topic in survey research. The author’s view is that post-coding of responses to open-ended questions is fraught with bias. However, no concrete evidence is cited to indicate that this is a common problem in actual research. Some of the hypothetical problems are just that: such errors, if they occurred, would be quickly caught by others in the field (if not the original investigators themselves).

The author’s example of coding gender is puzzling to me. The Blackless et al. article is interesting, but does not show that up to 2% of the general population _identifies_ as intersex. Many of the physiological anomalies noted in that article likely would not prompt a person to think of themselves as trans/other-gendered. If survey researchers were attempting to ascertain biological sex, then survey responses would be sufficient in most cases, even if 2% of respondents perceived themselves as trans/other-gendered in some way or were intersex (disregarding the fact that when such response options are given in forced-choice questions or coded in free response questions, much less than 1% of respondents will be classified as such, even in “high-risk” populations such as sexually transmitted disease patients). Trans/other-gendered persons would in such circumstances most probably be excluded from analysis. And if it were crucial to identify gender precisely in a particular research project, self-reports would have to be supplemented by physical and other direct examination.

The article doesn’t seem to describe how the author’s rubrics were developed other than by his own opinion. Is there an empirical dataset on which they are based? Was there any systematic procedure used in developing them? As presented, the article simply isn’t a scientific contribution.

Originality: 2, Importance: 1, Overall quality: 1

2 Author comment added 30th April, 2009 at 03:30:22

In retrospect, I am to a considerable extent in agreement with the previous reviewer. However, I think the problems created by enforcing multiple-choice categories are self-evident; in all events they have been discussed at great length theoretically, especially in queer theory, whether or not they have been empirically grounded. This proposal originated, in fact, in a queer-theory critique of a survey draft.

Recently, I’ve had the opportunity to utilize this rubric in a survey that was not targeted at a transgendered population. By opening up the gender binary as a write-in, I found that 1.4% of the survey population used that opportunity to identify as something other than male, female, or synonyms thereof.

I agree that the rubric I have suggested is seriously flawed—in particular, it is based largely on etic terminology, and the problem of keeping a rubric abreast of changing emic terms is not addressed. Moreover, the rubric breakdown is merely intuitive.

All that said, it seems clear to me that if a particular measurement tool costs us 1.4% of our data in a given category, it is worth some effort to improve it.
