
AD·VNVM·DATVM: Down to a single bit of data

Hypomnemata: Things to reference and remember.

Small updates and thoughts, because I'm not on Twitter.

This is a small tip, but it took me four years of active thinking to come to it. I started using it two years ago, just before I authored Markdown Mapper. Markdown Mapper is an R script for reverse-engineering network maps from notes written in Markdown format, incorporating the hierarchical structure of outlines, metadata about the notes, and tags (like #hashtags) in the notes' text.

When I started my graduate career (which I recently concluded), I struggled to find a way to sustainably incorporate handwritten notes from meetings, classes, and brainstorming sessions with typed notes that I would alternatively create on those same types of occasions. Inevitably, my handwritten notes, lacking a coherent indexing strategy, would become forgotten or remain unconsidered as I reviewed the notes that I had typed.

My answer to this: Rather than focusing on constructing an index per se of my handwritten notes, I now just "tag" the notes. I follow this approach when taking handwritten notes (including, and especially, in mindmap format):

  1. Rather than using #hashtags, I double-underline any word or phrase I want to "tag" in my notes.
  2. Periodically, I digitize (i.e., save a digital copy of) my handwritten notebook. I most recently digitized a notebook by placing it on a tabletop, taking a photograph of each page, and sending the photographs to Voussoir, as I described here. Simply photographing each page with a camera or camera-phone can also work well. Digitizing periodically also reduces worry about losing the physical notebook, because the notebook is "backed up" in a workflow like this. I save the photographs (either separately or combined as a PDF) alongside my typed notes.
  3. Once per week (or month, etc.), for each page of un-processed handwritten notes, I type the title of that page, the date, and a list of the double-underlined "tags" from that page. I save this list alongside my typed notes.
  4. At this point, I have a computer-searchable index of my handwritten notes alongside my computer-searchable typed notes. The full directory of notes files can be searched together, combined, etc. The presence of the digitized copy of the handwritten notebook allows looking up any page of notes regardless of whether the original, physical notebook is available.
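As a rough illustration of steps 3 and 4, the typed index could be kept as one small plain-text entry per page. The sketch below is only one possible layout, not a description of any particular tool: the `PageEntry` fields, the scan filenames, and the choice to write the double-underlined phrases out as #hashtags are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PageEntry:
    """Index entry for one page of handwritten notes (hypothetical layout)."""
    title: str
    written_on: date
    scan_file: str  # assumed filename of the digitized page
    tags: list[str] = field(default_factory=list)


def format_entry(entry: PageEntry) -> str:
    """Render an entry as plain text, writing each tag as a #hashtag."""
    hashtags = " ".join("#" + tag.replace(" ", "_") for tag in entry.tags)
    return (
        f"# {entry.title}\n"
        f"Date: {entry.written_on.isoformat()}\n"
        f"Scan: {entry.scan_file}\n"
        f"Tags: {hashtags}\n"
    )


def pages_with_tag(entries: list[PageEntry], tag: str) -> list[PageEntry]:
    """Return the entries whose tags include `tag`, ignoring case."""
    wanted = tag.lower()
    return [e for e in entries if wanted in (t.lower() for t in e.tags)]


if __name__ == "__main__":
    notebook = [
        PageEntry("Lab meeting", date(2016, 5, 2), "notebook3_p12.jpg",
                  ["data ethics", "obituaries"]),
        PageEntry("Seminar", date(2016, 5, 9), "notebook3_p13.jpg",
                  ["version control"]),
    ]
    # Print the typed index, then look up which pages mention a tag.
    for page in notebook:
        print(format_entry(page))
    for page in pages_with_tag(notebook, "Version Control"):
        print("Found on:", page.scan_file)
```

Because each entry is plain text, the same files can be searched with any ordinary text-search tool alongside the typed notes.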

For me, the utility of this approach is that the index arises naturally out of the text I write, and that the indexing process (typing up the "tags") can happen as frequently or infrequently as my projects require.

Following my post announcing my new position, the University of Pennsylvania, where I began work earlier this month, has very kindly issued a Press Release about my appointment.

The release can be found here, and describes the types of projects on which I'll be working in the coming two years. The original position description can be found here. In addition, I've archived the full text below:

Monday, January 9, 2017

It is with great excitement that The Penn Libraries announces the selection of recent University of Oregon PhD Jacob Levernier as our new Bollinger Fellow in Library Innovation. The Bollinger Fellowship in Library Innovation helps the Penn Libraries think creatively about our future and recruit some of the most talented recent graduates with interests in a wide-array of topics that intersect with libraries. The appointment of Jacob is no exception.

Levernier is a psychologist well versed in the study of cognition, with undergraduate minors in neuroscience, philosophy, and classical studies. He has a vested interest in the future of libraries and data management. Levernier’s current research interests include morality mining, data management education, moral advancement throughout the lifespan, open-source and open-access development and education, and the evolution of imagination.

Levernier will carry these interests into his work with the Libraries’ Technology Services division. Here, he will collaborate with collection curators, metadata specialists, business analysts, and IT developers to study fundamental challenges in discovery, content delivery, assessment, and information presentation.

Joe Zucca, Penn Libraries’ Director of Strategic Initiatives & Library Technology Services describes the scope of the Bollinger Fellow’s contribution in this way, "The Penn Libraries host a vast archive of data that reveal the interactions of scholars with information, that provide a unique lens on the research interests and behaviors of information consumers. The job of the Bollinger Fellow will be to mine these data for signals and patterns that inform acquisitions, service provision, and new strategic directions for the Library."

During his two year fellowship, Levernier will be uniquely positioned to interact with Penn Libraries’ systems, users, and vast archives of data. These interactions will lead to applied research that may influence catalogs and cataloging practice, researcher profiling systems, human interface design, repository tools, and the use of social media to understand information-seeking behavior and the use of data.

I am very excited to announce that I will be beginning work in Philadelphia at the University of Pennsylvania in January 2017 as the University Libraries' Bollinger Fellow in Library Innovation through the Council on Library and Information Resources (CLIR). CLIR's press release is here.

I am joining the 13th cohort of CLIR postdoctoral fellows, who include other social and data scientists (including recent doctoral graduates in psychology, anthropology, sociology, neuroscience, and related disciplines), medievalists, information scientists, and others. CLIR's mission includes the development of a generation of "scholar-librarians," who have earned PhDs and can talk with university faculty from a common background, effecting development and adoption of best practices in data ethics and management, software curation, and digital scholarship in academic institutions, using academic libraries as central hubs of change. CLIR released a summary of the types of work in which I will be collaborating at the University of Pennsylvania here:

Jacob Levernier (University of Pennsylvania) received his Ph.D in Psychology from the University of Oregon. Based in Penn’s Library Technology Services Division, he will collaborate with collection curators, metadata specialists, business analysts, information technology developers, and a computer scientist to study fundamental challenges in discovery, content delivery, assessment, and information presentation. As Bollinger Fellow in Library Innovation, he will work with library systems, users, and troves of data to explore and help design new forms of data-driven support for research and learning that may influence catalogs and cataloging practice, researcher profiling systems, human interface design, repository tools, and the use of social media in understanding information seeking behavior and the use of data. In addition, he will have input into cooperative initiatives of the Penn Libraries and its peers as they work to unlock the benefits of linked data and shared discovery networks.

This opportunity fills me not only with excitement but also with gratitude; I feel tremendous admiration for the work of both the University's Libraries and CLIR, and appreciate that this interdisciplinary position is precisely what I trained for during my time as a graduate student. I look forward with optimism to completing my move to Philadelphia, and to reporting here on future projects.

I recently presented a one-hour tutorial to introduce Version Control using Git to an interdisciplinary group of researchers (mostly Psychologists and Biologists) at the University of Oregon's Quantitative Methods Laboratory. The presentation introduced both Git's Command Line Interface and the graphical interface that I currently prefer, GitEye; however, the presentation was created for an audience with no command line or version control experience.

I titled the presentation "Time-Travel for Academics: Get your digital life in order, and protect yourself from yourself." The presentation's abstract helped to motivate interest for users who might not have heard of version control:

If you've ever been working on a manuscript, statistical analysis, or notes on your reading, you might have started saving versions of your work with names like "Manuscript_good_3_a", "Manuscript_after_edits_good", "Manuscript_Use_This", and "Manuscript_Use_This_Final". Not only for your advisor or collaborators, but also for yourself a few months in the future, this approach to managing versions of your work can be confusing at best and misleading at worst, causing you to forget which version is the most up-to-date and, as a result, to re-do or lose work.

"Version control" is a type of free software that you can use to manage your work — not only to remember which versions are from when, but also to see exactly what you changed between versions, and why. Like a time machine, version control software lets you move back and forth between versions without clogging your hard drive with multiple copies of the same files.

We will be discussing the "why" and "how" of using Git, a popular and free version control system that is also the foundation for GitHub, which software developers and academics alike are using to share and collaborate on their work.

This talk will use both the command-line (the Terminal app in Mac OSX and Linux, and Command Prompt or Cygwin (https://www.cygwin.com/) in Windows — no experience assumed) and a point-and-click program called GitEye (http://www.collab.net/downloads/giteye).

Technical information

The presentation's source code files are here. The entire presentation was written in plain text using Markdown and then converted into HTML with Pandoc to be used with Reveal.js, an open-source browser-based presentation framework (i.e., nothing needed to run the presentation except for a web browser). It was (fittingly) version controlled using Git. Early in the development of the presentation, I also experimented with auto-generating a demonstration Git repository from scratch for live demonstrations, using RMarkdown. Instructions for converting from Markdown to final slide format can be found at the top of the Slides.Rmd file (which, following those early experiments, continues to contain instructions on invoking R to run any commands you might add).
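For readers who have not used Pandoc this way, the Markdown-to-Reveal.js step can be sketched roughly as below. This is not the exact invocation from the Slides.Rmd instructions; the filenames are placeholders, and only the general Pandoc options `--to revealjs`, `--standalone`, and `--output` are used. How the Reveal.js assets get bundled into one self-contained file depends on the Pandoc version and is left out here.

```python
import shutil
import subprocess


def pandoc_revealjs_command(source: str, output: str) -> list[str]:
    """Build a Pandoc command that turns Markdown into Reveal.js HTML slides."""
    return [
        "pandoc",
        "--to", "revealjs",   # write Reveal.js slide markup
        "--standalone",       # emit a complete HTML document, not a fragment
        "--output", output,
        source,
    ]


def build_slides(source: str, output: str) -> bool:
    """Run Pandoc if it is installed; return False when it is not available."""
    if shutil.which("pandoc") is None:
        return False
    subprocess.run(pandoc_revealjs_command(source, output), check=True)
    return True


if __name__ == "__main__":
    # Placeholder filenames for illustration only.
    print(" ".join(pandoc_revealjs_command("Slides.md", "Slides.html")))
```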

A copy of the compiled slides

A copy of the complete presentation is here. It is a single, self-contained, 18.6 MB HTML file. Since the presentation uses Reveal.js, the slides can be advanced using the arrow keys on the keyboard or on the screen (left-right for major section breaks, up-down for minor slide breaks). An overview of all slides can be seen using the o key on the keyboard. The screen can be made black temporarily with the b key. Speaker notes (which are few, as the slides' text is sufficiently useful here for prompting the presentation) are available with the s key.

Although I am a registered instructor for Software Carpentry, which also teaches on this topic, this presentation was written from scratch in order to assume no background knowledge (especially on the command line).

I have released the presentation files under a Creative Commons BY-NC-SA 4.0 license. If you would like to use other license terms, please contact me to ask about an alternative arrangement.

The video recording was commissioned by and is © the University of Oregon Department of Psychology. The Department has kindly agreed to release it under a Creative Commons BY-NC-ND license.

Embedded Video - Click to view

As previously described, I wrote a privacy-enhancing plugin for Pelican, the static site generator that this website uses, which prevents tracking cookies from being placed on users' computers by websites like YouTube until the user explicitly "opts in" to watching the video by clicking on the thumbnail. This plugin is now included with the free Pelican Plugins repository.

I've now updated the plugin with two major changes (UPDATE: The Pull Request to update the version in the main Pelican Plugins repo has now been approved!):

  1. In addition to YouTube, the video sharing service Vimeo is now supported.
  2. The plugin code is now much easier to update to support additional video services. The documentation that comes with the plugin has also been updated to make this process straightforward for any users who would like to contribute.

Thus, using Pelican, writing

!youtube(hVqrW-fPOQ0)

embeds the video from http://youtu.be/hVqrW-fPOQ0 like this (click to view):

Embedded Video - Click to view

while writing

!vimeo(38588291)

embeds the video from https://vimeo.com/38588291 (click to view):

Embedded Video - Click to view
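To make the idea concrete, here is a minimal sketch of how shortcodes like the two above could be turned into click-to-view placeholders. It is not the plugin's actual code: the function name, the `video-placeholder` class, and the `data-embed-url` attribute are invented for the example. The sketch only stores the embed URL in the page, so nothing would be requested from the video service until a separate script (not shown) loads the player after a click.

```python
import re
from html import escape

# Embed URL pattern per service. Supporting another service means adding one entry.
EMBED_URLS = {
    "youtube": "https://www.youtube.com/embed/{video_id}",
    "vimeo": "https://player.vimeo.com/video/{video_id}",
}

# Matches shortcodes such as !youtube(hVqrW-fPOQ0) or !vimeo(38588291).
SHORTCODE = re.compile(r"!(\w+)\(([\w-]+)\)")


def replace_shortcodes(text: str) -> str:
    """Swap each known video shortcode for a click-to-view placeholder."""

    def to_placeholder(match: re.Match) -> str:
        service, video_id = match.group(1), match.group(2)
        template = EMBED_URLS.get(service)
        if template is None:
            return match.group(0)  # unknown service: leave the text untouched
        url = escape(template.format(video_id=video_id), quote=True)
        # The URL sits in a data attribute, so the browser does not fetch it.
        return (
            f'<div class="video-placeholder" data-embed-url="{url}">'
            "Embedded Video - Click to view</div>"
        )

    return SHORTCODE.sub(to_placeholder, text)


if __name__ == "__main__":
    print(replace_shortcodes("!youtube(hVqrW-fPOQ0)"))
    print(replace_shortcodes("!vimeo(38588291)"))
```

Keeping the per-service details in a single dictionary mirrors the second change described above, where adding a video service should require only a small, local edit.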

The Markdown Mapper script on which I've been working has been updated to v0.1.2, with two primary changes:

A direct link to the updated version on GitHub is here.

I recently created a poster presentation on my ongoing collaboration with Mark Alfano and Andrew Higgins to extract the virtues of different communities by looking at the obituaries that they write. Cf. here. The poster, and links to related resources, can be found here.

As part of my doctoral work at the University of Oregon, I completed an interdisciplinary certificate program last year in "New Media and Culture," the intersection of social- and computer-science-based quantitative methods with humanities-based critical commentary. The program recently asked to interview me about my thoughts on that intersection. They have published that interview here, as part of a series that they call "Shelfies" ("What's on your bookshelf?").

I'm reprinting an excerpt from the interview here (including a few phrases that weren't included in the published version), since it (like the Software Carpentry Team page summary that I posted several months ago) nicely summarizes where I see myself, and my interests:

O'Neil and Schutt (2013) popularized the term "datafication" to capture the trend of seeing and gathering data where before it would have been infeasible or uninteresting. Fitbits, for example, "datafy" individuals' health decisions, from activity levels to food intake. In 2014, Facebook introduced a feature in their mobile app that would remotely activate users' microphones at various times to "identify TV and music" that the user might be enjoying. Where before there would have been no or only very limited data on these topics (before, one might record one's food choices in a daily journal, or perhaps pay an assistant to watch one throughout the day and make records), there is now persistent data, which often enable incidental usage (although Facebook stated that it "can't identify background noise or conversation," it would not be unreasonable to expect that conversations could be recorded and stored with such an app feature).

This is the context in which I see the study of new media, and where I see their relation to culture: from a research perspective, new media enable new types of inquiry on large scales, both of data quantity and of time; however, they also bring new ethical issues, with which legislation has not yet been able to begin to catch up. In my mind, researchers, as the curators and stewards of these data, carry a moral responsibility to understand these issues and act with them in mind. Understanding and maintaining literacy in the advantages and disadvantages of data storage and usage decisions is complex and requires a creativity that likely is best engendered by cross-disciplinary study, in which historical and philosophical perspectives meet technical ones.

As I've mentioned previously, I've become involved as a volunteer Instructor for Software Carpentry, an organization that facilitates "boot camps" for researchers and others to increase programming and data management literacy. I'm excited to have recently finished the Instructor training seminar, and am now listed on the organization's Team webpage.

I wrote a short summary of what I do for the page, and am reprinting it here, since it nicely archives how I currently see my professional self:

Jacob Levernier is a PhD student in Psychology at the University of Oregon, studying moral development, research ethics with digital data, and applied statistics. Jacob works between the disciplines of Psychology, Philosophy, and Computer Science. His interest in scientific computing centers on data management and workflow automation, both in the social sciences and related disciplines, including library science.

To date, I have written two plugins for Pelican, the static site generator that this website uses:

  1. A video privacy enhancer that limits what data is leaked to organizations like Google when a user visits a site with embedded videos.
  2. A plugin for increasing support for line numbering in code blocks such as this:
This is line 1.
This is a long line of code, which might spill over several lines if it goes on for long enough. Thanks to this plugin, even if the code spills onto another line, the line number next to it (as well as the number for the next line of code, etc.) won't become mismatched.
This is line 3.

As of today, both plugins have been accepted and merged into the main Pelican Plugins repository!
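One general way to keep line numbers aligned with wrapped code, sketched below, is to wrap every logical line in its own element and let a CSS counter number those elements. This is an illustration of the technique only and is not taken from the plugin's source; the class names and the stylesheet are assumptions.

```python
from html import escape

# Assumed stylesheet: the counter advances once per line element, so a line
# that wraps visually still receives exactly one number.
LINE_NUMBER_CSS = """
pre.numbered { counter-reset: line; white-space: pre-wrap; }
pre.numbered span.line { display: block; counter-increment: line; }
pre.numbered span.line::before { content: counter(line) " "; }
"""


def number_lines(code: str) -> str:
    """Wrap each logical line of `code` in its own span inside a <pre>."""
    spans = "".join(
        f'<span class="line">{escape(line)}</span>' for line in code.split("\n")
    )
    return f'<pre class="numbered">{spans}</pre>'


if __name__ == "__main__":
    print(number_lines("This is line 1.\nThis is a long line of code.\nThis is line 3."))
```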
