I am a PhD Graduand

In short:

I have successfully defended my dissertation in Psychology at the University of Oregon, and am a PhD graduand (i.e., awaiting the conferral of the degree). My dissertation and its code and supporting files are freely available.

On November 22nd, 2016, I successfully defended my dissertation, "The Axiology of Necrologies: Using Natural Language Processing to Examine Values in Obituaries," making me a graduand for my PhD in Personality and Social Psychology from the University of Oregon (i.e., having completed all requirements for the degree, and awaiting its conferral). (Writing this post marks the first time I've used the phrase "my PhD;" previously, and despite the possible motivating effect of referring to it in that way, I felt uncomfortable with claiming it before having earned it.)

A summary of my dissertation project

A researcher named Shalom Schwartz proposed that when it comes to values, cultures across the world all have the same 10 basic "taste buds." These are (grouped into 4 overarching categories):

  1. "Self-transcendence:" Universalism and benevolence
  2. "Conservation:" Security, tradition, and conformity
  3. "Openness to change:" Stimulation, self-direction, and hedonism
  4. "Self-enhancement:" Achievement, power, and hedonism

Schwartz's theory is that all cultures and subcultures value these same basic things, but that they value them to different degrees (similarly, humans around the world have the same basic taste buds, while different cultures prefer different flavors).

Thanks to the generosity of Legacy.com, which publishes obituaries from 1,500+ newspapers internationally and whose officers allowed me to scrape obituary text from their website, I analyzed 140,599 obituaries from 832 newspapers in the USA to understand how much obituaries in different areas talk about those 10 values. Specifically, I analyzed every word in each of those obituaries to see how distant in a thesaurus it was from prototype words for each of those 10 values. I then created or added to computer programs to read each obituary and attempt to categorize the gender and age at death of the deceased, so that I could understand whether values are talked about differently based on those characteristics. I also used US Census data to look at whether those values were talked about differently based on places' income levels, education levels, and ethnic demographics.
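As a rough illustration of the "distance in a thesaurus" idea described above, here is a minimal sketch. It assumes the thesaurus can be treated as a graph of synonym links and that distance means the fewest links between a word and a value's prototype word. The tiny thesaurus, the prototype words, and the function names are all invented for this example; they are not the dissertation's actual word lists or code (which is available at the DOI given further down).

```python
from collections import deque

# Toy synonym graph, invented for illustration; a real thesaurus is far larger.
THESAURUS = {
    "power": ["authority", "strength"],
    "authority": ["power", "leadership"],
    "leadership": ["authority", "guidance"],
    "strength": ["power"],
    "guidance": ["leadership"],
    "tradition": ["custom", "heritage"],
    "custom": ["tradition", "habit"],
    "heritage": ["tradition"],
    "habit": ["custom"],
}

# Hypothetical prototype word for each value (only two of the ten shown).
PROTOTYPES = {"power": "power", "tradition": "tradition"}


def thesaurus_distance(graph, start, target):
    """Return the fewest synonym links from start to target, or None if unreachable."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    # Breadth-first search: the first time the target is reached, the path is shortest.
    while queue:
        word, steps = queue.popleft()
        for neighbor in graph.get(word, []):
            if neighbor == target:
                return steps + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


def distances_to_values(graph, word, prototypes):
    """Map each value name to the word's distance from that value's prototype."""
    return {
        value: thesaurus_distance(graph, word, prototype)
        for value, prototype in prototypes.items()
    }


print(distances_to_values(THESAURUS, "guidance", PROTOTYPES))
```

In this toy graph, "guidance" is three links from "power" and has no path to "tradition", so it would count as a (weak) indicator of power only. Scoring a whole obituary would then mean combining these per-word distances in some way, which this sketch leaves out.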

A summary of some of the findings

To my surprise, of those 10 Schwartz values, the authors of the obituaries I analyzed most often indicated power, conformity, and security (I expected power to be least talked about, alongside hedonism). However, in line with what I expected, universalism, hedonism, and stimulation were least indicated. Unexpectedly, it turned out that when hedonism was talked about, conformity was often also talked about; I wonder whether this means that obituary authors "compensate" whenever describing a hedonistic quality of the deceased, since across the USA, hedonism is often seen as a less-desirable value (since it's linked with indulgence). By far (more than any individual-level or community-level demographic predictors), the largest variance component in my statistical models was the newspaper in which an obituary had been published, indicating that newspapers "clump together" in the values-expressions of the obituaries they publish. This "clumping" may be attributable both to shared values among the authors of obituaries in a given community, and to templating exhibited by obituary authors (either explicitly, as "Fill in these blanks," or implicitly, as "Other obituaries I've read used this phrase, so I'll use it, too").
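To make the idea of newspapers "clumping together" more concrete, here is a small sketch of how much of the spread in a score can lie between groups rather than within them. It is a deliberately simplified stand-in (a one-way sum-of-squares split), not the statistical models used in the dissertation, and the newspaper names and scores are invented for illustration.

```python
from statistics import mean

# Invented value-expression scores for obituaries in three hypothetical newspapers.
SCORES = {
    "Newspaper A": [0.8, 0.9, 1.0],
    "Newspaper B": [0.2, 0.3, 0.4],
    "Newspaper C": [0.5, 0.6, 0.7],
}


def between_group_share(groups):
    """Return the fraction of the total sum of squares that lies between groups."""
    all_scores = [score for scores in groups.values() for score in scores]
    grand_mean = mean(all_scores)
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    if ss_total == 0:
        return 0.0  # every score is identical, so there is nothing to split
    # Each group contributes its size times the squared gap between its mean and the grand mean.
    ss_between = sum(
        len(scores) * (mean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    return ss_between / ss_total


print(round(between_group_share(SCORES), 2))  # prints 0.9
```

In this toy data, about 90% of the variation sits between newspapers, which is what strong "clumping" would look like: obituaries from the same paper resemble each other far more than they resemble obituaries from other papers.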

The dissertation and its supporting files

The dissertation itself can be found here (until it gets a permanent address from the University of Oregon's libraries). It is released under a Creative Commons Attribution 4.0 (i.e., CC-BY 4.0) License.

The dissertation presentation can be found here. The presentation uses Reveal.js, and can be navigated with the arrow keys -- vertical for within-topic slide changes, horizontal for across-topic slide changes. Pressing the m key will display a menu of slides, while pressing o will show a slide overview. Pressing the s key will show speaker notes.

In addition, I've posted much of my code and some derived datasets (for example, the list of words found in the obituaries corpus) especially for other researchers to use at http://doi.org/10.7264/D3WC7S. The code can be cited at that DOI, and is openly licensed (license terms are provided alongside the files).

This project felt like a very special way to finish my graduate program, since it involved not just performing the research, but also working consciously to be ethical with the texts and respectful to the authors and subjects of the obituaries I was stewarding. The dissertation document includes a discussion about the privacy implications of this type of "morality mining" (a phrase coined by Christen, Alfano, Bangerter, and Lapsley [2013] for a type of data mining that specifically seeks to understand moral or value-relevant properties of its subjects); I welcome further discussion about that aspect especially, and about the use of similar methods for other ethical data analysis applications.

The image at the top of this post is a diagram of a medieval labyrinth, published in Nordisk Familjebok (cf. this original image). To me, it accurately symbolizes the nature of the dissertation-writing process as I experienced it. Unlike a maze, it moves toward a single goal on a single path; but it is convoluted in its progression, requiring patience not only to reach its center, but also to then traverse back to its starting point (with something new having been gained on the journey). I'll be writing more about this metaphor soon.
