Fresa, Naco, or Neither?: The Interaction between Language and Visual Information: Rebeca Martínez Gómez
Rebeca is a PhD student in the Department of Linguistics at UNM.
This research was made possible in part by funding from the Latin American & Iberian Institute and Tinker Foundation Field Research Grant (FRG). For more information about the FRG, please visit the LAII website.
It is well known that the way you speak (how you pronounce sounds, the words you use) signals your identity (age, class, gender). We all use language to represent ourselves to others as we want them to perceive us. Studies in sociolinguistics have shown that when we try to portray ourselves in a certain way, we can even produce the sounds that we associate, consciously or unconsciously, with the type of person who has that characteristic. For example, if we think that relaxed people “drop their G’s,” we would probably pronounce running as runnin’ in a situation where we want to be perceived as relaxed. The social evaluation attached to certain forms of language is known in linguistics as social meaning (Coupland 2007).
Social meaning has been studied first in terms of what linguistic forms people produce to come across in a certain way (e.g., Eckert 1989, Mendoza-Denton 2008, Labov 1972) and second in terms of how people perceive the use of some linguistic forms (e.g., Campbell-Kibler 2010, Podesva et al. to appear). These studies show that the social meanings that linguistic forms acquire are determined by several factors. For example, the group membership of the speaker (Johnstone & Kiesling 2008), the linguistic style in which a form is embedded (Eckert 2008), the level of awareness of the form (Labov 1972), or our general knowledge about a person (Podesva et al. to appear) might affect the social meaning of a linguistic form. One thing is clear so far: the social meanings that a linguistic form can take are not fixed; there are dimensions of possible meanings in which variants can be construed (Eckert 2008).
One factor overlooked in studies of linguistic social meaning is the effect of the social characteristics we perceive from speakers through the visual channel. Research on speech perception has found an interaction between the social characteristics we perceive visually and the linguistic category we perceive (e.g., Hay et al. 2006, Koops 2011, Strand & Johnson 1996). For instance, listeners would report hearing vowel X when they saw a young speaker, whereas they would report hearing vowel Y when the speaker looked older, even though the same sound was played to them in both instances. This implies a shift in the perception of language due to the social characteristics we attribute to a speaker. Could visual information also affect the social meaning we perceive?
In order to study this interaction between language and visual information in the construction of social meaning, my research examines what happens when our expectations about how people sound, given how they look, are fulfilled versus when they are not. In other words, what are the perceived social meanings when we think that a person looks and sounds as we expect versus when we think there is a discrepancy between their looks and their use of language? This matching between traits (in this case, looks and language use) is called category congruency and has been found to have an effect in other realms, such as racial categorization (Bartholow & Dickter 2008), social relationships (Zebrowitz & Lee 1999), trait inferences (Maass et al. 2005), neural activity (White et al. 2009), interpersonal judgments (Costrich et al. 1975, Krueger et al. 1995), and even memory (Araya et al. 2003, Stangor & McMillan 1992). Thus, the research question of my dissertation is: what is the effect of category congruency on the social meaning of a linguistic form? In my project, I define social meaning specifically as social category. In other words, how will congruency affect how we socially categorize a person? Or, how do we socially categorize someone who “looks” like one category but “sounds” like another? In order to explore this issue, I am studying a specific case that takes place in Mexico.
II. Case Study: Fresas and Nacos in Mexico
In Mexico, there is a stereotype of a group known as fresas (lit., “strawberries”). They are perceived as the privileged youth of the Mexican upper class, predominantly European-descended elite (Mendoza-Denton 2008), who are assimilated into the American lifestyle and behave pretentiously. Furthermore, they are known for speaking differently from the rest of the Mexican population. For instance, people associate fresas with specific intonation patterns, elongation of vowels at the end of their sentences, or use of words in English that have not been integrated by the rest of the community (Martínez Gómez 2014). Therefore, we could say that someone who uses these linguistic forms will likely be socially categorized as a fresa.
On the other hand, the Mexican imaginary also includes a social category that is opposed to fresas: nacos. As defined by the Dictionary of Mexican Spanish, a naco is a person (or a thing) that is Indian or indigenous to Mexico, that is ignorant and dumb, that lacks education, or that has bad taste. Someone perceived to belong to this category would not be expected to sound like a fresa. However, as I noted at the beginning, people tend to use the language of the group with which they want to be associated. In this case, although the fresa style is somewhat stigmatized, some speakers might use those linguistic forms to be perceived as someone with high social status. The question is, would these speakers come across in the way they wish in the absence of category congruency? That is, would listeners categorize them as fresas even if they do not look like one, or if they look more like the category of nacos? Would they be categorized as fresas, as nacos, or neither?
My dissertation will focus on answering these questions by testing a specific linguistic form in two conditions: category congruent and category incongruent. In order to do this, I needed to construct auditory and visual stimuli that resemble both the fresa and naco categories. I did this on my field trip during the summer of 2014.
III. Field Trip to Guadalajara
Thanks to the support of a LAII Field Research Grant and a Graduate Research Project and Travel Grant, I traveled to Guadalajara, Mexico for four weeks during the summer of 2014. The purpose of this trip was to obtain material to construct audio and visual stimuli for the main experiment of my dissertation. Specifically, I needed to 1) collect pictures of young Mexicans, 2) ask different participants to rate the pictures, and 3) ask them to rate audio excerpts extracted from a conversational corpus from Guadalajara as well (previously collected through another LAII FRG in the summer of 2011).
The first part of the field research was to create a portfolio of pictures of Mexican youth. In previous research, pictures of speakers have been obtained in different ways: from databases (e.g., Koops et al. 2008), from yearbooks (e.g., Pulos & Spilka 1961), or from photographs taken by the researchers themselves, controlling for the appearance of people (e.g., Hay et al. 2006, Strand 2000). Taking into account that many visual cues could affect the way a person is perceived (e.g., gestures, background), I decided to create my own portfolio of pictures for the specific purpose of this research. In order to have a wide variety of young Mexicans who physically and socially appeared to live a Mexican lifestyle (versus Mexicans living in the US), and to match the audio previously collected, this part of my research had to take place in Guadalajara, Mexico.
Participants were recruited by word of mouth and email; I also issued personal invitations through three different social networks, centered on a public university, a private university, and the biggest street market in the city. The only requirement for participation was to be a Mexican between 18 and 30 years old. All pictures were taken with a digital Fuji FinePix S1800 camera in three different locations, always using black fabric as a background. I obtained three pictures of each participant from the waist up. The participants were instructed to pose in this order: trying to look serious in the first picture, smiling in the second, and posing in whichever way they liked in the last one. These strategies were used to control for background and gestures. Participants were offered a movie ticket for their collaboration.
A total of 96 young Mexicans participated, yielding a total of 288 pictures (three per participant). Some of these pictures will be used in my dissertation and also in future similar research. A total of 153 pictures (51 participants, three per person) will remain as a corpus of Mexican youth visual stimuli for future similar investigations.
The next step was to find out which of those pictures and previously selected audio clips best fit the fresa category (to match as congruent) and which best fit the opposite category, nacos. For about a week, I worked on preparing the activities for this second stage and piloting them before conducting the actual sessions. I promoted this final phase at the Department of Applied Psychology at the University of Guadalajara and at CETI (Centro de Enseñanza Técnica Industrial). A total of 36 Mexicans from Guadalajara, 22 females and 14 males, with an average age of 21, participated. All of them were also compensated with a movie ticket for their collaboration. Participants had to socially categorize 40 audio clips and 40 pictures in two tasks: 1) a speeded dichotomous classification task and 2) a rating scale on how fresa or naco/a the person looked or sounded.
In the first task, participants had to decide as quickly as they could whether they would categorize the person (in the audio clip or picture) as fresa or naco/a. I recorded responses and latency in this task using the software SuperLab and an RB-830 response pad. The second task collected participants’ explicit rating of each picture and audio excerpt on a scale from 0 to 10, as well as their reasons for giving that answer. The pictures and audio clips were numbered, and participants looked at and listened to them on a computer. Participants completed this task at their own pace. Both tasks were conducted in a quiet room and in a single session, in a space provided either by the University of Guadalajara or by CETI.
The analysis of the categorization task consists of looking at the frequency with which participants categorized each picture and audio clip as fresa and at their mean reaction times. For the rating-scale task, I calculated the mean rating for each audio clip and picture, and organized the reasons participants gave for their answers into different categories. Finally, in order to combine the three different measurements (response and latency in the first task, and the mean in the rating scale), I created an index: the higher the index, the better the picture or audio clip fit the category fresa.
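The report does not spell out how the three measurements were combined, so the following is only a minimal sketch of one way such an index could be computed for a single stimulus. The function name `fresa_index`, the equal weighting of the three components, the `max_latency_ms` cutoff, and the assumption that 10 on the rating scale means "most fresa" are all hypothetical choices made for illustration, not the procedure actually used in the study.

```python
from statistics import mean


def fresa_index(choices, latencies_ms, ratings, max_latency_ms=3000.0):
    """Combine three measurements for ONE stimulus into a 0-1 index.

    choices        -- "fresa" or "naco" response per participant (speeded task)
    latencies_ms   -- reaction time of each of those responses, in milliseconds
    ratings        -- 0-10 ratings from the rating task (ASSUMED: 10 = most fresa)
    max_latency_ms -- hypothetical cutoff; responses this slow or slower add no
                      speed evidence in either direction
    """
    # Component 1: share of participants who chose "fresa".
    prop_fresa = mean(1.0 if c == "fresa" else 0.0 for c in choices)

    # Component 2: signed speed. A fast "fresa" response pushes the score up,
    # a fast "naco" response pushes it down, slow responses count for little.
    signed_speed = mean(
        (1.0 if c == "fresa" else -1.0) * max(0.0, 1.0 - t / max_latency_ms)
        for c, t in zip(choices, latencies_ms)
    )
    speed_component = (signed_speed + 1.0) / 2.0  # rescale from [-1, 1] to [0, 1]

    # Component 3: mean explicit rating, rescaled from 0-10 to 0-1.
    rating_component = mean(ratings) / 10.0

    # ASSUMPTION: the three components are weighted equally.
    return mean([prop_fresa, speed_component, rating_component])


# Made-up responses from three participants to one picture.
score = fresa_index(["fresa", "fresa", "naco"], [600, 700, 2500], [8, 9, 7])
print(round(score, 3))
```

Under these assumptions a stimulus that everyone quickly labels fresa and rates highly approaches 1, and one that everyone quickly labels naco and rates low approaches 0, which matches the direction described above (higher index, better fit to fresa).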
Based on this final measurement, I will be able to select the pictures and audio clips that best represent both social categories. As I mentioned before, my dissertation will test how category congruency displayed through visual and auditory channels affects social categorization (i.e., social meaning). Thus, the results of this stage will be used to create congruent and incongruent stimuli (i.e., audio and pictures associated with the same social category, and audio and pictures associated with opposite social categories) and to observe how that affects the way people categorize the speakers. While the results of this stage are not very interesting in themselves (i.e., knowing the rating of each audio clip and picture does not say anything about my research question), a couple of things are worth noting.
One result is that not all stimuli were categorized similarly in both tasks. That is, some pictures and audio clips were categorized in drastically different ways in each activity (i.e., mainly as fresa in one task, and mainly as naco in the other). One possible explanation for this result is that responses in the categorization task are automatic, that is, people are using their stereotypes, while the rating task is a controlled process, where low-prejudice participants might be inhibiting stereotypes (as in Devine 1989). It is also interesting that a couple of pictures were never categorized as fresa. This tells us that although these social categories might be for the most part ambiguous (as opposed to other categories, e.g., sex) and depend a lot on where the observer is situated in the social space, some consensus still exists within a community.
It is also interesting to observe the reasons participants provided for their answers in the rating-scale task. For instance, participants reported that they categorized people based on their clothing, accessories, hairstyle, or even gestures and posture. This is noteworthy because all of the participants in the pictures received precisely the same instructions; however, the bodily hexis, in Bourdieu’s terms, shows even through small details such as the way people smile (all pictures used were selected from the ones where participants were instructed to smile). It also recalls ethnographic studies in sociolinguistics where authors find that even “minor” aspects such as makeup and even eyebrows are important when it comes to the portrayal of our persona. Finally, one of the most interesting bases for rating mentioned by participants was facial and physical features. This is important if we remember that the category naco was originally associated with being indigenous. As has been reported in other research, there is within-group racial discrimination among the Latino population (Chávez-Dueñas et al. 2014), specifically in Mexico (Moreno Figueroa 2012). Thus, the fact that participants pointed to physical appearance as a way to determine social categorization tells us that this element is indeed used to differentiate groups within the larger community and that it matters to the categories of fresas and nacos. Although this issue is not the main one in my dissertation, I believe that it can indirectly contribute to the current conversation about discrimination in Latin America.
I would like to thank Professors Daniel Barragán Trejo and Israel Huerta Solano from the University of Guadalajara; administrators Teresa Quijas Ibarra from the University of Guadalajara and Maricarmen Díaz de Sandi Gómez from the University of the Atemajac Valley; instructor Alfredo Orozco from CETI; and Mr. Gilberto Silva Collazo and Mrs. Jessica García Macías, who work at the street market. I could not have conducted this part of my research without their help. I am also extremely grateful to the Latin American and Iberian Institute for their gracious financial support of my doctoral studies at the University of New Mexico.
 Participants had two options in their consent: 1) allowing the researcher to use their pictures in the present and in future similar research, or 2) allowing the researcher to use the pictures exclusively in this project.
 Although the research will not use audio classified as naco, the objective of knowing which speakers best resemble fresa style is to present the studied linguistic form in a context where listeners would predict its occurrence (e.g., to capture voice quality, levels of dynamism, speed, etc., that are congruous with fresa style). In other words, it is a way to avoid incongruence within the auditory information.
 It is assumed that the latency of a response reflects how close two pieces of information are in the cognitive representation. Thus, the stimuli that elicit the fastest reactions are assumed to be the most prototypical of the chosen category.
Araya, T., Akrami, N., & Ekehammar, B. (2003). Forgetting Congruent and Incongruent Stereotypical Information.
Journal Of Social Psychology, 143(4), 433-449.
Bartholow, B. D., & Dickter, C. L. (2008). A Response Conflict Account of the Effects of Stereotypes on Racial
Categorization. Social Cognition, 26(3), 314-332.
Campbell-Kibler, K. (2010). Sociolinguistics and Perception. Language and Linguistics Compass, 4, 377–389.
Chávez-Dueñas, N. Y., Adames, H. Y., & Organista, K. C. (2014). “Skin-Color Prejudice and Within-Group Racial
Discrimination: Historical and Current Impact on Latino/a Populations.” Hispanic Journal of Behavioral
Sciences 36(1), 3-26.
Coupland, N. (2007). Style: Language Variation and Identity. Cambridge: Cambridge University Press.
Costrich, N., Feinstein, J., Kidder, L., Marecek, J., & Pascale, L. (1975). When stereotypes hurt: Three studies of
penalties for sex-role reversals. Journal of Experimental Social Psychology, 11, 520-530.
Devine, P. G. (1989). Stereotypes and prejudice: their automatic and controlled components. Journal of personality
and social psychology, 56(1), 5.
Dictionary of Mexican Spanish. (2012). El Colegio de México. Visited on April 30 from http://dem.colmex.mx
Eckert, P. (1989). Jocks and burnouts: Social categories and identity in the high school. New York: Teachers
Eckert, P. (2008). Variation and the indexical field. Journal of Sociolinguistics, 12, 453-476.
Hay, J., Warren, P. & Drager, K. (2006). “Factors influencing speech perception in the context of a merger-in-
progress.” Journal of Phonetics 34, 4:458-84.
Johnstone, B., & Kiesling, S.F. (2008). Indexicality and experience: Exploring the meanings of /aw/-
monophthongization in Pittsburgh. Journal of Sociolinguistics 12/1, 2008: 5–33
Koops, C. (2011). Local Sociophonetic Knowledge in Speech Perception (Doctoral dissertation, RICE
Krueger, J., Heckhausen, J., & Hundertmark, J. (1995). Perceiving middle-aged adults: effects of stereotype-
congruent and incongruent information. The Journals Of Gerontology. Series B, Psychological Sciences And
Social Sciences, 50(2), 82-93.
Labov, W. (1972). Sociolinguistic Patterns. Philadelphia: University of Pennsylvania Press.
Maass, A., Cadinu, M., Boni, M., & Borini, C. (2005). Converting Verbs into Adjectives: Asymmetrical Memory
Distortions for Stereotypic and Counterstereotypic Information. Group Processes & Intergroup Relations,
Martínez Gómez, R. (2014). Language Ideology in Mexico: The Case of Fresa Style in Mexican Spanish. Texas
Linguistic Forum (Proceedings of the Symposium About Language and Society – Austin 15) 57.
Mendoza-Denton, N. (2008). Homegirls. Language and Cultural Practice among Latina Youth Gangs. Malden,
Oxford, Carlton: Blackwell.
Moreno Figueroa, M. G. (2012) '‘Linda Morenita’: Skin Colour, Beauty and the Politics of Mestizaje in Mexico' in C.
Horrocks (ed.) Cultures of Colour: Visual, Material, Textual (Oxford and New York: BerghahnBooks); 167-
Podesva, R. J., J. Jamsu, P. Callier & J. Heitman. (mss.) To appear 2015. Constraints on the social meaning of
released /t/: A production and perception study of U.S. politicians. Language Variation and Change.
Pulos, L., & Spilka, B. (1961). Perceptual selectivity, memory, and anti-Semitism. Journal of Abnormal and Social
Psychology, 62, 690-692.
Stangor, C., & McMillan, D. (1992). Memory for expectancy-congruent and expectancy-incongruent information: A
review of the social and social developmental literatures. Psychological Bulletin, 111, 42–61.
Strand, E. (2000). Gender stereotype effects in speech processing. Doctoral dissertation. The Ohio State University.
Strand, E., & Johnson, K. (1996). Gradient and visual speaker normalization in the perception of fricatives. In D.
Gibbon (ed.) Natural Language Processing and Speech Technology. Results of the 3rd KONVENS
Conference, pp. 14-26.
White, K., Crites, S., Taylor, J., & Corral, G. (2009). Wait, what? Assessing stereotype incongruities using the N400
ERP component. Social Cognitive And Affective Neuroscience, 4(2), 191-198.
Zebrowitz, L., & Lee, S. (1999). Appearance, stereotype-incongruent behavior, and social relationships. Personality
And Social Psychology Bulletin, 25(5), 569-584.