This mirrors an earlier post about using the reference manager JabRef with plaintext notes. As noted in that previous post, a reference manager keeps track of citations, documents, and annotations, facilitating academic writing. Since writing that post, I've increasingly been using Zotero, which I think is more featureful than JabRef, and which similarly is open source (making it, in my opinion, better-suited to long-term academic work than Mendeley, another popular reference manager, which is now owned by the publishing company Elsevier). This is a long post, so I've included a Table of Contents.
Benefits of Zotero
Zotero has fostered an impressive and exciting community of developers designing new and re-usable elements, including:
- Web scrapers (A web scraper is a program that looks through the code of a webpage, searches for relevant pieces, and then extracts and turns them into ordered data, allowing, for example, Zotero to import a citation directly from a Google Scholar or Amazon.com search listing);
- Export translators (Zotero's term for programs that export citations in other storage formats, making the citations more easily able to be used with other programs);
- Citation Style Language (CSL) files (formatting instructions, which are standardized and thus also work in some other programs, including Mendeley. These instructions allow exporting bibliographies in different formats, such as those defined by the American Psychological Association (APA) and Modern Language Association (MLA)); and
- Plugins (I especially use the Better Bib(La)TeX and AutoZotBib plugins to allow easier use of the BibTeX standard, about which I wrote in the earlier post on JabRef).
Zotero can even read embedded metadata on webpages that include it. Since learning this, I've added metadata to each page of this website, such that if you import a page using Zotero or a similar tool, the importer should be able to automatically extract the page title, author name, date, and website title.
Features lacking in Zotero for a plaintext- and privacy-centered workflow
Through this process, I've found only two features lacking in Zotero for my preferred workflow, which especially requires easy use of plaintext notes attached to citations rather than use of Zotero itself to take notes (this opens the notes to the power of the Unix command line and my Markdown Mapper script):
- First, the ready ability to deploy and use a self-hosted syncing solution. I've temporarily solved this for myself with a bash script.
- Second, easy access to the filepaths of attachments. As I've incorporated Markdown Mapper and its style of (hash)tagging notes, having access to the filepaths of my notes has become increasingly important. (A future post will include several helper functions that I've written to facilitate this process.) This is, I think, a need particularly nicely and understatedly met by JabRef, which has a menu to easily view and edit filepath attachments.
Zotero allows attaching files (PDFs, plaintext files, and whatever else) either by dragging a file from a file manager onto a reference within Zotero, or by using an equivalent menu. Zotero can also automatically rename files using their metadata and move them into a storage directory of the user's choice. Once files are attached to a reference, however, they become difficult to find or change in bulk from within Zotero.
The filepath on my system for the attachment in the screenshot above is:
/home/myusername/Documents/Document_Library/Zotero_Database/storage/CGVAE8I4/Alfano-- character as moral fiction.pdf
By default, Zotero copies each attachment into a separate directory with a purposely random name; thus, although most of my Zotero attachments are stored in the same storage directory, the attachment's location is unclear from this display. To my understanding, Zotero's use of random directory names is to avoid problems that might arise from two attachments using the same filename. Zotero does allow users to right-click on an attachment and choose "Show File," opening the system's file browser to the relevant directory. In addition, the Zutilo plugin adds an option to see a pop-up message with the location of the attachment:
Unfortunately, the Zutilo pop-up will only show one filepath at a time; if a user selects two attachments and clicks to see their filepaths, two pop-ups will be shown, visible only in succession. Both the "Show File" and Zutilo options are useful for looking up single attachments, but are impractical for looking up the locations of more than a few attachments at a time, as one might wish to do when running command line scripts (e.g., Markdown Mapper, or even basic manipulation tools) on attachments.
An "Export Translator" for exporting attachments' filepaths
To address this issue, I've written an export translator for Zotero. The translator adds an "Attachment Filepaths" entry to the drop-down list of formats in Zotero's Export menu. It is based on the CSV export translator by Philipp Zumstein and Aurimas Vinckevicius, which is included by default in Zotero. Like Drs. Zumstein's and Vinckevicius' CSV translator, this new Attachment Filepaths translator is released under the AGPLv3 license.
How it works
Once the translator is installed (see below), a user can highlight one or more references in Zotero, click "Export Item(s)...", and select "Attachment Filepaths" in the "Format:" drop-down menu.
Clicking "OK" in the screenshot above would produce a text file containing the paths of all files attached to the highlighted references:
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/CGVAE8I4/Alfano-- character as moral fiction.pdf"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/HWGP3QMB/Alfano_plaintext_notes_attachment.mkd"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/RTDIS8T4/Chung and Pennebaker - 2008 - Revealing dimensions of thinking in open-ended sel.pdf"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/4S2TPD85/Saucier et al. - 2015 - Cross-Cultural Differences in a Global “Survey of .pdf"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/6KAXW8K7/Saucier_plaintext_notes_attachment.txt"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/8DZ7XVFW/Saucier_second_plaintext_notes_attachment.markdown"
Each line comprises one attachment filepath, and is contained in quotes, making it ready for use in a command-line script or application, even if any of the filepaths contain spaces.
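Because each path is wrapped in double quotes, a script can recover the individual paths without being confused by spaces. As a minimal sketch (the function name parseExportedPaths is hypothetical and not part of the translator, and it assumes that no path itself contains a double-quote character), the exported text could be split back into paths like this:

```javascript
// Hypothetical helper, not part of the translator itself.
// Assumes that no path contains a literal double-quote character.
function parseExportedPaths(exportedText) {
  // Capture everything between each pair of double quotes.
  const matches = exportedText.matchAll(/"([^"]*)"/g);
  return Array.from(matches, (match) => match[1]);
}

// Two of the exported lines shown above, joined as they would appear in the file.
const sample = [
  '"/home/myusername/Documents/Document_Library/Zotero_Database/storage/CGVAE8I4/Alfano-- character as moral fiction.pdf"',
  '"/home/myusername/Documents/Document_Library/Zotero_Database/storage/HWGP3QMB/Alfano_plaintext_notes_attachment.mkd"',
].join("\n");

for (const filepath of parseExportedPaths(sample)) {
  console.log(filepath);
}
```

Matching on the quotes rather than splitting on whitespace is what keeps a filename such as "Alfano-- character as moral fiction.pdf" in one piece.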
The translator currently allows two options: the first is to export only the paths of plaintext attachments (e.g., for use with a tool like Markdown Mapper):
Selecting this option would produce the output below. Note that the PDF attachments are now not included:
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/HWGP3QMB/Alfano_plaintext_notes_attachment.mkd"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/6KAXW8K7/Saucier_plaintext_notes_attachment.txt"
"/home/myusername/Documents/Document_Library/Zotero_Database/storage/8DZ7XVFW/Saucier_second_plaintext_notes_attachment.markdown"
I have noticed that Zotero export translators (not just the Attachment Filepath translator that I'm introducing here) sometimes produce an error if the attachments themselves are highlighted, like this:
I'm not sure yet why this is.
The second option takes any of the output above and prints it all on a single line, making it easier to use with many command line utilities (e.g., in a bash command like grep "some phrase" [list of files] | wc --lines, which would show the number of lines across the files that contain "some phrase").
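The two output shapes described above differ only in the separator placed between the quoted paths. A rough sketch of that idea follows; formatPaths and its singleLine parameter are names invented here for illustration and may not match how the translator's code is actually organized:

```javascript
// Illustrative only: these names are assumptions, not the translator's API.
function formatPaths(filepaths, singleLine = false) {
  // Wrap each path in double quotes so that paths containing spaces stay intact.
  const quoted = filepaths.map((filepath) => `"${filepath}"`);
  // One path per line by default; space-separated on a single line otherwise.
  return quoted.join(singleLine ? " " : "\n");
}

const examplePaths = [
  "/home/myusername/Documents/Document_Library/Zotero_Database/storage/HWGP3QMB/Alfano_plaintext_notes_attachment.mkd",
  "/home/myusername/Documents/Document_Library/Zotero_Database/storage/6KAXW8K7/Saucier_plaintext_notes_attachment.txt",
];

console.log(formatPaths(examplePaths)); // one quoted path per line
console.log(formatPaths(examplePaths, true)); // all quoted paths on one line
```

The single-line form can be pasted directly in place of [list of files] in a command like the grep example.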
Updating the exporter's definition of "plaintext"
The translator considers "Plaintext files" to be any files that end in one of a list of file formats (e.g., "example.txt"). This list is defined near the top of the export translator code, and is straightforward to change or expand. It currently includes popular file extensions for text, markdown, and reStructuredText documents:
// The list of file extensions that are whitelisted if "Only Export Paths of Plaintext Files" is checked:
// NOTE: These should all be lowercase, without a leading dot (e.g., "txt" instead of ".TXT")
var plainTextFileExtensions = [
    "txt",
    "mkd",
    "md",
    "markdown",
    "rmkd",
    "rmd",
    "rmarkdown",
    "rst"
];
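To make the matching rule concrete, here is a minimal sketch of how a filepath could be checked against that list. The function name hasPlainTextExtension is hypothetical, and the actual translator may perform the check differently:

```javascript
// The same extension list as in the translator (lowercase, no leading dot).
const plainTextExtensions = [
  "txt", "mkd", "md", "markdown", "rmkd", "rmd", "rmarkdown", "rst",
];

// Hypothetical helper: returns true if the file's extension is in the list.
function hasPlainTextExtension(filepath) {
  // Keep only the filename, accepting either "/" or "\" as a separator.
  const filename = filepath.split(/[\\/]/).pop();
  const lastDot = filename.lastIndexOf(".");
  // No dot at all, or a leading dot only (e.g., ".txt"), means no extension.
  if (lastDot <= 0) {
    return false;
  }
  // Lowercase the extension so that "Notes.TXT" still matches "txt".
  const extension = filename.slice(lastDot + 1).toLowerCase();
  return plainTextExtensions.includes(extension);
}

console.log(hasPlainTextExtension("/storage/HWGP3QMB/Alfano_plaintext_notes_attachment.mkd"));
console.log(hasPlainTextExtension("/storage/CGVAE8I4/Alfano-- character as moral fiction.pdf"));
```

Lowercasing the extension before the comparison is why the list itself only needs lowercase entries.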
Unfortunately, the translator was not selected for inclusion in the Zotero codebase. However, it's easy to install.
The most up-to-date version of the translator can be found here. For archival purposes, a version of the code as of this writing is also locally hosted here. To install it, download it to your computer and save it in the "translators" directory within your Zotero data directory (it should save with a .js file extension). The Zotero data directory for your system can be found under Preferences... (in the Cogwheel icon menu) -> Advanced -> Files and Folders -> Data Directory Location -> Show Data Directory.
YAML Metadata for Attachment Annotations
In addition to writing the export translator above, I've adapted the APA 6th Edition CSL file to produce YAML-style metadata about a reference, to be used when taking notes on that reference. This mirrors the "export filter" that I included in my earlier JabRef post. CSL is a standardized language for defining bibliographic styles. In addition to working with Zotero, this file should also work in Mendeley, Papers, and other reference managers that have also adopted the standard.
Like the file on which it is based, this CSL file is released under a Creative Commons Attribution-ShareAlike 3.0 license. Adding my name to the list of contributors, the file is © 2015 Jacob G. Levernier; portions © 2015 Simon Kornblith, Bruce D'Arcus, Curtis M. Humphrey, Richard Karnesky, and Sebastian Karcher.
How it works
Installing a new CSL file creates an additional option when a user right-clicks on a source in Zotero and chooses "Create Bibliography from Item..."
Using this new option, the reference
Alfano, M. (2013). Character as moral fiction. Cambridge: Cambridge University Press. Retrieved from http://public.eblib.com/choice/publicfullrecord.aspx?p=1099907
...would be transformed by the CSL file into this:
INSTRUCTIONS: Use your text editor to find-and-replace all instances of (without the spaces) '| | |' with '\r\n' (this means 'new line' in most text editors). Then delete this line.||||||---|||Author: Alfano, M.|||Issued: 2013|||Title: Character as moral fiction|||Publisher-place: Cambridge|||Publisher: Cambridge University Press|||URL: http://public.eblib.com/choice/publicfullrecord.aspx?p=1099907|||---||||||
...which, when its instructions are followed, becomes this:
---
Author: Alfano, M.
Issued: 2013
Title: Character as moral fiction
Publisher-place: Cambridge
Publisher: Cambridge University Press
URL: http://public.eblib.com/choice/publicfullrecord.aspx?p=1099907
---
(The ||| line delimiters are a workaround to allow use of the CSL Visual Editor, which I've found deletes XML linebreaks.)
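For anyone who would rather script the find-and-replace step than do it in a text editor, here is a minimal sketch. The function name cslOutputToYaml is hypothetical, and the sketch assumes that the output has the shape shown above: an instruction line first, followed by fields separated by |||. It inserts a plain '\n' newline rather than '\r\n'.

```javascript
// Hypothetical helper: turns the |||-delimited CSL output into a YAML block.
function cslOutputToYaml(cslOutput) {
  // Each "|||" marks a line break; consecutive delimiters yield empty lines.
  const lines = cslOutput.split("|||");
  // The first segment is the INSTRUCTIONS line, which is meant to be deleted.
  const withoutInstructions = lines.slice(1);
  // Trim the surrounding blank lines and end with a single newline.
  return withoutInstructions.join("\n").trim() + "\n";
}

// The instruction text is abbreviated here; the fields come from the example above.
const cslExample =
  "INSTRUCTIONS: (abbreviated) Then delete this line.||||||---" +
  "|||Author: Alfano, M.|||Issued: 2013|||Title: Character as moral fiction" +
  "|||---||||||";

console.log(cslOutputToYaml(cslExample));
```

Because the instruction line writes the delimiter with spaces ('| | |'), it contains no literal ||| of its own and so stays in one segment until it is dropped.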
The most up-to-date version of the file can be found here. As above, for archival purposes, I've also self-hosted a copy that is current as of this writing, here. This code can be validated using csl-validator.js.
To install it in Firefox, simply drag the file onto the Firefox window. It can also be installed directly into Zotero (including the Zotero Standalone version) under Preferences... (in the Cogwheel icon menu) -> Cite -> Styles -> + (next to "Get additional styles...").
With these additions, Zotero works wonderfully for plaintext-based workflows. As always, I welcome constructive ideas, feedback, and contributions, through any of my means of contact.