Document Processing

Document manager (reference manager)

I am an academic. That means, in part, that I read for a living. This is not the type of reading you do to put yourself to sleep, but the type of reading you do so you know how things are done or what is presently the best information in a field that changes quickly. Reading becomes the basis for teaching and for research.

Books are seldom the focus of this work. Books are secondary sources. I may write books, but I read books to keep up with the competition and seldom to learn something new. Academics read periodicals (journals) that describe the most recent research and thinking. This material is written in a specific way with specific characteristics. The writing is not intended to entertain by expressing ideas in creative ways. It follows a pattern intended to communicate clearly and to establish the basis for every claim made.

I am constantly looking for tools and techniques that would improve my workflow as I do this kind of reading. This was true even before the days of personal computers, and I have explored many “systems” in the time I have had access to technology.

My most recent set of expectations would look something like this:

  • Store and organize pdf files – Journal articles are now provided as pdf files. A system I use to deal with this content should allow me a way to keep track of the documents and the document-related content (notes, highlights) that I create in reaction. Organization is mostly about retrieval. A year or so later, will I be able to locate something I know I read when I need the original to check on specifics and to retrieve details?
  • Highlight and annotate individual files – I like to think of what I do as “processing information inputs”. Reading allows some mental processing, but it helps me if I also do things like highlight and take notes. To some extent, this allows thinking beyond the information given and it also externalizes and stores ideas for future access.
  • Synchronize files across devices or with cloud storage for purposes of backup and flexible access – When I was younger, I nearly always went to the office to work. I spent a lot of time there in the evening and on weekends. One of the reasons for this was that the office was where my stuff was. It was also distraction-free. As I became more secure in my career, this lifestyle became less attractive. So, I would haul a large briefcase between my home and office each day. Moving to a more digital and online approach changed this. What I need now is internet access and online storage. My huge time investment in this process also encourages concern for backup. I was never that concerned that I would lose my file cabinets, but I am concerned that resources I store on a specific computer may disappear.
  • Export selected files, annotations, and citation lists – One of my concerns when committing to any digital service involves the ease of getting my content out of the system. What happens if the system is discontinued or the conditions of use change in a way I find unacceptable? I want the opportunity to extract my content from the system.

I have two recommendations. There is a free version (level) for both and this may meet the needs of most. I have found that I need to “hack” both in a way to meet my interests, but I have found workarounds that allow me to adjust the systems.

Mendeley

Mendeley allows you to upload pdfs to the cloud and to download pdfs from this site. You can access the cloud resources from multiple devices (including the iPad). The system includes a pdf reader allowing annotation and highlighting. Annotations and highlights are not automatically moved across devices, but you can export the pdf with your highlights. I must admit to being confused about why the tool would be designed in this way – when at home I want to see the highlights I added at work.

[Image: Mendeley Desktop]

The free version now comes with 2 gigabytes of storage (this was just upgraded from 1 gigabyte of storage when Mendeley was purchased by Elsevier). I must admit I am uneasy when conglomerates buy up tools (Elsevier publishes many academic journals) because I can imagine self-interests eventually imposing limitations.

To give you a sense of what 2 gigabytes provides, I presently have 315 pdfs stored and I am using approximately 250 MB. My one complaint related to storage space concerns the gap between free and the lowest paid version. Two gigabytes is free and 5 gigabytes is $5 a month ($60 a year). Consider what purchasing 25 gigabytes of storage through, say, Google would cost ($30 for 25 gigabytes).
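
For anyone who wants to check the arithmetic behind these numbers, here is a rough sketch in Python. It rests on two assumptions that are not in the figures above: 1 gigabyte is treated as 1000 MB, and the $30 Google price is treated as an annual fee so that it can be compared with Mendeley's $60 a year. The variable names are my own and are only for illustration.

```python
# Figures quoted in the text.
stored_pdfs = 315          # pdfs currently stored
used_mb = 250              # approximate space they occupy, in MB
free_quota_gb = 2          # Mendeley free tier

# Assumption: 1 GB = 1000 MB, to keep the estimate simple.
MB_PER_GB = 1000

# Average size of one pdf in this particular library.
avg_mb_per_pdf = used_mb / stored_pdfs

# How many pdfs of that average size the free tier could hold.
# Integer arithmetic avoids floating-point rounding surprises.
free_capacity_pdfs = free_quota_gb * MB_PER_GB * stored_pdfs // used_mb

# Cost per gigabyte per year.
# Assumption: the $30 Google price is a yearly fee.
mendeley_per_gb_year = 60 / 5
google_per_gb_year = 30 / 25

print(f"Average pdf size: {avg_mb_per_pdf:.2f} MB")
print(f"Free tier holds about {free_capacity_pdfs} pdfs")
print(f"Mendeley: ${mendeley_per_gb_year:.2f} per GB per year")
print(f"Google:   ${google_per_gb_year:.2f} per GB per year")
```

Under those assumptions the free tier would hold roughly 2,500 pdfs of the size I tend to collect, and the paid Mendeley tier works out to about ten times the per-gigabyte price of the Google storage.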

ReadCube 

ReadCube offers similar features, but saves pdfs to your computer. I found a workaround by saving to a folder I sync through Box. I have 50 gigabytes of storage through Box so this eliminates the storage and synchronization issue.

 

[Image: ReadCube desktop]

 

ReadCube is different from Mendeley in that it is also a discovery and download tool. The system recognizes that I have access through my institution and allows me to download pdfs because of this affiliation. I can search Google Scholar or PubMed (Google Scholar for me), read the abstract, and download the pdfs for most recent citations I identify. In some cases, there is also a feature called “enhanced pdf” that accesses the reference section of an article and allows me then to use this list to identify cited sources I can also download.

[Image: ReadCube enhanced pdf]

The ReadCube funding model is kind of interesting (if I understand it correctly). The model is described as iTunes-like, which I think means a fee is charged for the download of individual articles. The price of what this article describes as the iTunes model would be difficult for most of us to cover. The work to review articles for a paper could easily run to several hundred dollars for a given piece I might be working on. It almost assumes grant funding. In addition, while students read far fewer articles, the cost of what they do read would have a larger impact. At the level of the institution, I would be very surprised if subscriptions through the library do not continue to be the most efficient model. If ReadCube could negotiate a more moderate rate from publishing companies based on a greater volume of business, perhaps something close to an iTunes model will work. At present, ReadCube recognizes my affiliation with a university and allows me to access resources because the University has committed to the subscription model. So, it is not my situation that concerns me, but rather the survival of a company whose business model assumes customer payments.
