A florescence of note taking tools
Over the past three or so years there has been a florescence of digital note taking tools and platforms.
Some of these include:
This brief list doesn’t take into account a sea of other mobile apps and platforms in addition to a broad array of social media platforms that people use for similar note taking or annotations.
My particular interest in some of this note taking field comes in the growing number of people who are working in public and sharing their notes in online settings with others. This has been happening organically since the rise of the internet and has happened on blogs within the blogosphere and on personal and communal wikis.
As was highlighted (pun intended) at the recent I Annotate 2021 conference, the note taking space seems to have been coming to a new boil. With the expansion of the ideas of keeping a zettelkasten or a digital garden, these versions of notebooks seem to be a significant part of this new note taking craze.
One thing I have noticed, however, is a dramatic lack of continuity in the history of note taking within the longue durée of Western civilization. (Other cultures including oral cultures have similar traditions, but for our purposes here, I won’t go into them except to say that they’re highly valuable, spectacularly rich, and something of which we should all be aware.)
Many of these products are selling themselves based on ideas or philosophies which sound and even feel solid, but they’re completely ignoring their predecessors to the tune of feeling like they’re trying to reinvent the wheel. As a result, some of the pitches for these products sound like they’re selling snake oil rather than tried and true methods that go back over 2,000 years of intellectual history. I can only presume that modern education is failing us all dramatically. People are “taught” (maybe told is the better verb) to take notes in school, but they’re never told why, what to do with them, or how to leverage them for maximum efficiency. Perhaps the idea has been so heavily imbued into our culture we’ve honestly forgotten the basic parts and reasoning behind it?
Even Vannevar Bush’s dream of the Memex as stated in his article As We May Think (The Atlantic, 1945), which many of these note taking applications might point to as an inspiration, ignores this same tradition and background, so perhaps these app creators and users aren’t all to blame?
Delineating Online Forms
I’ve been doing some serious reading and research into these traditions to help uncover our missing shared history. I’ll write something longer and more specific about them at a later date.
In the meanwhile, I want to outline just a bit about the various flavors as they relate to some of the more public online versions that I see in the related internet spaces. I hope to help better delineate what they have in common, how they differ, and what they may still add to the mix to get us to a more robust version of Bush’s dream.
Others’ thoughts and comments about these various incarnations and their forms and functions are both encouraged and appreciated.
Historically, commonplace books are one of the oldest and most influential structures in the note taking, writing, and thinking space. They have generally been physical books written by hand that contain notes which are categorized by headings (or, in a modern context, categories or tags). Often they’re created with an index to help their creators find and organize their notes.
They originated in ancient Greece and Rome out of the thought of Aristotle and Cicero as a tool for thinking and writing and have generally enjoyed a solid place in history since. A huge variety of commonplaces have been either copied by hand or published in print book form over the centuries.
Most significant thinkers, writers, and creators throughout history have kept something resembling a commonplace book. While many may want to attribute the output of historical figures like Erasmus, Newton, Darwin, Leibniz, Locke, or Emerson to sheer genius (and many often do), I might suggest that their works were the result of the sustained work of creating personal commonplace books, somewhat like a portable Google search engine for their day, but honed to their particular interests. (One naturally can’t ignore their many other privileges like wealth, education, and time to do this work, which were also certainly a significant factor in their success.)
Many people over the past quarter of a century have used a variety of digital forms to keep digital commonplace books including public versions on blogs, wikis, and other software for either public or private consumption.
Florilegia are a subcategory of commonplace book starting around 900 CE but flourishing in the 12th and 13th centuries and primarily kept by theologians and preachers. The first were a series of short excerpted passages often arranged in order of their appearance in a single text, but eventually they were arranged systematically under discrete headings. Medieval florilegia were overwhelmingly, and often exclusively, concerned with religious topics from the works of scriptures, the moral dicta of the Doctors of the Church, and, less frequently, the teachings of approved, classical moral philosophers. The idea and form of florilegium generally merged back into the idea of the commonplace book which had renewed interest and wide popularity during the Renaissance.
These didn’t add any new or innovative features over what had come before. Perhaps, if anything, they were a regression because they so heavily focused only on religion as a topic.
Few (if any) examples of florilegia can be found in modern digital contexts. Though I have seen some people talk about using digital note taking tools for religious study, I have yet to see public versions online.
Born out of the commonplace tradition with modifications by Conrad Gessner (1516-1565) and descriptions by Johann Jacob Moser (1701–1785), the Zettelkasten, a German word translated as “slip box”, is generally a collection of highly curated atomic notes collected on slips of paper or index cards. Zettelkasten were made simpler to create and maintain with the introduction of the mass manufacture of index cards (and card boxes and furniture) in the early 20th century. Slips of paper which were moveable within books or files and later on index cards were a significant innovation in terms of storing and organizing a commonplace book.
Generally zettels (or cards) are organized by topics and often contain dates and other taxonomies or serialized numbers as a means of linking them to other cards within the system. The cross linking of these cards (and thus ideas) was certainly a historical physical precursor of the internet we have today, simply in digital form.
Almost all the current references I’ve seen online to Zettelkasten mention Niklas Luhmann as their inspiration, but none of them reference any other well-known historical examples despite the fact the idea has been around and evolving for several centuries now.
This productivity system and sets of digital tools around it came to greater attention in Germany in 2013 with the exhibition “Zettelkästen: Machines of Fantasy” at the Museum of Modern Literature, Marbach am Neckar and in 2014 with the launch of the zettelkasten.de website. A subsequent boost in the English speaking world occurred following the publication of Sönke Ahrens’s book How to Take Smart Notes – One Simple Technique to Boost Writing, Learning and Thinking – for Students, Academics and Nonfiction Book Writers in February 2017. The recent ability to use platforms like Roam Research, Obsidian, Notion, et al. has helped to fan the flames of their popularization.
More often than not, most of these digital tools (like their card-based predecessors) are geared toward private personal use rather than an open public model. Roam Research and Obsidian Publish have features which allow public publishing. TiddlyWiki is also an excellent tool for this as its so-called Tiddlers have a card-based appearance and can be placed in custom orders as well as transcluded, but again not many are available to the online public.
This sub-genre of notebooks comes out of the tradition of double-entry book keeping where accountants often kept a daily diary of all transactions in chronological order. These temporary notes were then later moved into a more permanent accounting ledger and the remaining book was considered “waste”.
In the commonplace book tradition, these books for temporary notes (or fleeting notes in a Zettelkasten framing) might eventually be copied over, expanded, and indexed into one’s permanent commonplace collection.
In modern digital settings, one might consider some of the ephemeral social media stream platforms like Twitter to be a digital version of a waste book, though to my knowledge I may be the first person to suggest this connection. (To be clear, others have certainly mentioned Twitter as being a waste and even a wasteland.)
Inspired, in part, by Apple’s HyperCard, Ward Cunningham created the first public wiki on his website on March 25, 1995. Apple had designed a system allowing users to create virtual “card stacks” supporting links among the various cards (sound familiar?). HyperCard was designed as a single user system.
Wikis allowed multiple users to author and edit pages on the web with a basic web browser. They were also able to create meaningful links and associations between pages, whether they existed or not using [[WikiLinks]]. They were meant to allow the average visitor to participate in an ongoing process of creation and collaboration.
Here there is some innovative user interface as well as the ability to collaborate with others in keeping a commonplace book. Transclusion of one page into another is a useful feature here.
Personal wikis have been used (as have many blogs) for information aggregation and dissemination over the years in a manner similar to their historical predecessors.
Second brain is a marketing term which stands in for the idea of the original commonplace book. It popped up in the note taking context in early 2017 to promote the use of commonplace book techniques through Tiago Forte’s expensive online course Building a Second Brain, which focused on capturing, organizing, and sharing your knowledge using (digital) notes. It is a platform agnostic method for improving productivity that relies wholly on the commonplace underpinning.
Google searches for this term will be heavily mixed in with results about the gastrointestinal system being the body’s “second brain”, the enteric nervous system, second brain tumors, a debunked theory that dinosaurs had two brains, and other general health-related topics.
Some websites, personal wikis and other online versions will use the phrase second brain, but they generally have no innovative features that are missing from prior efforts. Again, I view the phrase simply as marketing with no additional substance.
Informed heavily by their cultural predecessors in commonplace books, zettelkasten, and wikis, digital gardens are digital first note collections which are primarily public by default and encourage the idea of working in public.
Digital Gardens arose more formally in 2019 and 2020 out of the work and influence of Mark Bernstein’s 1998 essay Hypertext Gardens: Delightful Vistas, Ward Cunningham’s Smallest Federated Wiki (which just celebrated its 10th anniversary), Mike Caulfield’s essays including The Garden and the Stream: A Technopastoral as well as some influence from the broader IndieWeb Community and their focus on design and user interface.
Digital garden design can often use the gardening metaphor to focus attention on an active tending and care of one’s personal knowledge base and building toward new knowledge or creations. The idea of planting a knowledge “seed” (a note), tending it gradually over time with regular watering and feeding in a progression of 🌱 Seedlings → 🌿 Budding → 🌳 Evergreen is a common feature.
There are a growing number of people with personal digital gardens in public. Many are built on pre-existing wiki software like MediaWiki, the Smallest Federated Wiki, or TiddlyWiki, static site generators like Jekyll, note taking platforms like Obsidian Publish and Roam Research, or even out of common blogging software like WordPress. A growing common feature of these platforms is that they not only link out to resources on the open web, but contain bidirectional links within themselves using either custom code (in a wiki-like manner) or using the W3C Webmention specification.
With luck, application and platform designers and users will come to know more about the traditions, uses, and workflows of our rich cultural note taking history. Beyond this there are a few innovations, particularly in the public-facing arena which could be useful, but which aren’t broadly seen or available yet.
Still missing from the overall personal knowledge and note taking space is a more tightly integrated version of both a garden and a stream (in Mike Caulfield’s excellent framing) that easily allows interaction between the two arenas. Some of the more blog-based sites with notes, bookmarks, articles and IndieWeb friendly building blocks like Webmention, feeds (RSS, JSON Feed, h-feed), Micropub, and Microsub integrations may come the closest to this ideal.
One of the most fascinating recent entrants on the scene is Flancian’s Anagora which he uses as a personal commonplace book in a wiki-esque style. Over other incarnations it also has the ability to pull in and aggregate the notes of other digital commonplace books to create a larger marketplace of ideas. It also includes collaborative note taking space using Etherpad, which I’ve seen as a standalone tool, but never integrated into a digital commonplace book.
Ultimately, my dream, similar to Bush’s, is for individual commonplace books to be able to communicate not only with their users in the Luhmann-esque sense, but also with each other.
Niklas Luhmann apparently said:
Ohne zu schreiben, kann man nicht denken; jedenfalls nicht in anspruchsvoller, anschlussfähiger Weise.
(Translation) You cannot think without writing; at least not in a sophisticated, connectable way.
I think his conceptualization of “connectable” was much more limited and limiting than he might have guessed. Vannevar Bush, as the academic advisor of Claude Shannon, the godfather of the modern digital age, was more prepared to envision it.
(Luhmann’s “you” in his quote is obviously only a Western cultural referent which erases the existence of oral based cultures which have other ways to do their sophisticated thinking. His ignorant framing on the topic shouldn’t be a shared one.)
This post has grown out of my own personal commonplace book, portions of which are housed on my blog, in a wiki, and in a private repository which I hope to make more public soon. Further thoughts, ideas and expansions of it are more than welcome.
I’ve slowly been updating pieces of the history along with examples on shared commonplaces in both the IndieWeb Wiki and Wikipedia under the appropriate headings. Feel free to browse those or contribute to them as you would, at least until our digital commonplace books can communicate with each other.
I’d also invite those who are interested in this topic and who have or want online spaces to do this sort of thing to join us at the proposed upcoming Gardens and Streams II IndieWebCamp Pop up session which is being planned for later this Summer or early Fall. Comment below, stop by the page or chat to indicate your interest in attending.
It’s too painful to quickly get frequent notes into note taking and related platforms. Hypothes.is has an open API and a great UI that can be leveraged to simplify note taking processes.
Note taking tools
I’ve been keeping notes in systems like OneNote and Evernote for ages, but for my memory-related research and work in combination with my commonplace book for the last year, I’ve been alternately using TiddlyWiki (with TiddlyBlink) and WordPress (it’s way more than a blog.)
I’ve also dabbled significantly enough with related systems like Roam Research, Obsidian, Org mode/Org Roam, MediaWiki, DokuWiki, and many others to know what I’m looking for.
Many of these, particularly those that can be used alternately as commonplace books and zettelkasten appeal to me greatly when they include the idea of backlinks. (I’ve been using Webmention to leverage that functionality in WordPress settings, and MediaWiki gives it grudgingly with the “what links to this page” basic functionality that can be leveraged into better transclusion if necessary.)
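To make the backlink idea concrete, here is a minimal sketch in Python of how a “what links to this page” index could be built from notes that use [[WikiLinks]]. The note titles and bodies are made up for illustration, and the link syntax (including the optional [[Target|label]] form) is an assumption; real tools each have their own parsing rules.

```python
import re
from collections import defaultdict

# Matches [[Target]] and [[Target|display text]]; only the target is captured.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]")


def build_backlinks(notes):
    """Map each linked-to title to the sorted titles of the notes linking to it.

    `notes` is a dict of {title: body text}. Targets with no note of their
    own still get an entry, mirroring how wikis allow links to pages that
    do not exist yet.
    """
    backlinks = defaultdict(set)
    for title, body in notes.items():
        for match in WIKILINK.finditer(body):
            target = match.group(1).strip()
            if target != title:  # ignore self-links
                backlinks[target].add(title)
    return {target: sorted(sources) for target, sources in backlinks.items()}


# Hypothetical notes, purely for illustration.
notes = {
    "Commonplace book": "See also [[Zettelkasten]] and [[Florilegium]].",
    "Zettelkasten": "A descendant of the [[Commonplace book|commonplace]] tradition.",
    "Digital garden": "Draws on [[Zettelkasten]] and [[Commonplace book]].",
}
index = build_backlinks(notes)
```

Looking up index["Zettelkasten"] then lists every note that points at that page, which is the raw material a backlinks pane or a transclusion feature would work from.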
The major problem with most note taking tools
The final remaining problem I’ve found with almost all of these platforms is being able to quickly and easily get data into them so that I can work with or manipulate it. For me the worst part of note taking is the actual taking of notes. Once I’ve got them, I can do some generally useful things with them—it’s literally the physical method of getting data from a web page, book, or other platform into the actual digital notebook that is the most painful, mindless, and useless thing for me.
Evernote and OneNote
Older note taking services like Evernote and OneNote come with browser bookmarklets or mobile share functionality that make taking notes and extracting data from web sources simple and straightforward. Then once the data is in your notebook you can actually do some work with it. Sadly neither of these services has the backlinking functionality that I find has become de rigueur for my note taking or knowledge wrangling needs.
My WordPress solutions are pretty well set since that workflow is entirely web-based and because WordPress has both bookmarklet and Micropub support. There I’m primarily using a variety of feeds and services to format data into a usable form that I can use to ping my Micropub endpoint. The Micropub plugin handles the post and most of the meta data I care about.
It would be great if other web services had support for Micropub this way too, as I could see some massive benefits to MediaWiki, Roam Research, and TiddlyWiki if they had this sort of support. The idea of Micropub has such great potential for great user interfaces. I could also see many of these services modifying projects like Omnibear to extend themselves to create highlighting (quoting) and annotating functionality with a browser extension.
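For anyone who hasn’t seen what “pinging a Micropub endpoint” involves, here is a rough sketch in Python of the form-encoded request for creating a simple note. The endpoint URL, the token, and the function name are placeholders of my own; a real site advertises its own endpoint and issues tokens via IndieAuth. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_micropub_request(endpoint, token, content, categories=()):
    """Build (but do not send) a form-encoded Micropub request for a new entry."""
    fields = [("h", "entry"), ("content", content)]
    # Multiple values use the category[] form described in the Micropub spec.
    fields += [("category[]", category) for category in categories]
    return Request(
        endpoint,
        data=urlencode(fields).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


req = build_micropub_request(
    "https://example.com/micropub",  # hypothetical endpoint
    "XXXX",                          # placeholder access token
    "Highlights from today's reading",
    categories=["note taking", "zettelkasten"],
)
# Passing req to urllib.request.urlopen would actually send the post.
```

The appeal is that any client able to make this one small request can post into any notebook that accepts it, which is why I’d love to see it in more note taking tools.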
With this said, I’m finding that the user interface piece that I’m missing for almost all of these note taking tools is raw data collection.
I’m not the sort of person whose learning style (or memory) is benefited by writing or typing out notes into my notebooks. I’d far rather just have it magically happen. Even copying and pasting data from a web browser into my digital notebook is a painful and annoying process, especially when you’re reading and collecting/curating as many notes as I tend to. I’d rather be able to highlight, type some thoughts and have it appear in my notebook. This would prevent the flow of my reading, thinking, and short annotations from being subverted by the note collection process.
Different modalities for content consumption and note taking
Based on my general experience there are only a handful of different spaces where I’m typically making notes.
A large portion of my reading these days is done in online settings. From newspapers, magazines, journal articles and more, I’m usually reading them online and taking notes from them there.
Some texts I want to read (often books and journal articles) only live in .pdf form. While reading them in an app-specific setting has previously been my preference, I’ve taken to reading them from within browsers. I’ll explain why in just a moment, but it has to do with a tool that treats this method the same as the general online modality. I’ll note that most of the .pdf specific apps have dreadful data export—if any.
Reading e-books (Kindle, e-readers, etc.)
If it’s not online or in .pdf format, I’m usually reading books within a Kindle or other e-reading device. These are usually fairly easy to add highlights, annotations, and notes to. While there are some paid apps that can extract these notes, I don’t find it too difficult to find the raw file and cut and paste the data into my notebook of choice. Once there, going through my notes, reformatting them (if necessary), tagging them and expanding on them is not only relatively straightforward, but it also serves as a simple method for doing a first pass of spaced repetition and review for better long term recall.
Naturally taking notes from live lectures, audiobooks, and other spoken events occurs, but more often in these cases, I’m typically able to type them directly into my notebook of preference or I’m using something like my digital Livescribe pen for notes which get converted by OCR and are easy enough to convert in bulk into a digital notebook. I won’t belabor this part further, though if others have quick methods, I’d love to hear them.
While I love a physical book 10x more than the next 100 people, I’ve been trying to stay away from them because I find that though they’re easy to highlight, underline, and annotate the margins, it takes too much time and effort (generally useless for memory purposes for me) to transfer these notes into a digital notebook setting. And after all, it’s the time saving piece I’m after here, so my preference is to read in some digital format if at all possible.
A potential solution for most of these modalities
For several years now, I’ve been enamored of the online Hypothes.is annotation tool. It’s open source, allows me reasonable access to my data from the (free) hosted version, and has a simple, beautiful, and fast process for bookmarking, highlighting, and annotating online texts on desktop and mobile. It works exceptionally well for both web pages and when reading .pdf texts within a browser window.
I’ve used it daily to make several thousand annotations on 800+ online web pages and documents. I’m not sure how I managed without it before. It’s the note taking tool I wished I’d always had. It’s a fun and welcome part of my daily life. It does exactly what I want it to and generally stays out of the way otherwise. I love it and recommend it unreservedly. It’s helped me to think more deeply and interact more directly with countless texts.
When reading on desktop or mobile platforms, it’s very simple to tap a browser extension and have all their functionality immediately available. I can quickly highlight a section of a text and their UI pops open to allow me to annotate, tag it, and publish. I feel like it’s even faster than posting something to Twitter. It is fantastically elegant.
The one problem I have with it is that while it’s great for collecting and aggregating my note data into my Hypothes.is account, there’s not much I can do with it once it’s there. It’s missing the notebook functionality some of these other services provide. I wish I could plug all my annotation and highlight content into spaced repetition systems or move it around and modify it within a notebook where it might be more interactive and cross linked for the long term. Sadly I don’t think that any of this sort of functionality is on Hypothes.is’ roadmap any time soon.
There is some great news however! Hypothes.is is open source and has a reasonable API. This portends some exciting things! This means that any of these wiki, zettelkasten, note taking, or spaced repetition services could leverage the UI for collecting data and pipe it into their interfaces for direct use.
As an example, what if I could quickly tell Obsidian to import all my pre-existing and future Hypothes.is data directly into my Obsidian vault for manipulating as notes? (And wouldn’t you know, the small atomic notes I get by highlighting and annotating are just the sort that one would like in a zettelkasten!) What if I could pick and choose specific course-related data from my reading and note taking in Hypothes.is (perhaps by tag or group) for import into Anki to quickly create some flash cards for spaced repetition review? For me, this combination would be my dream application!
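As a sketch of what such an integration might look like, the Python below turns one annotation, in the JSON shape I understand the Hypothes.is search API to return, into a Markdown note and into a tab-separated row that Anki can import. The field names reflect my reading of the API and should be checked against its documentation; the function names and the sample annotation are invented for illustration, and the network call is defined but not run.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_SEARCH = "https://api.hypothes.is/api/search"


def fetch_annotations(user, tag=None, limit=50):
    """Fetch a user's public annotations (network call; not executed here)."""
    params = {"user": f"acct:{user}@hypothes.is", "limit": limit}
    if tag:
        params["tag"] = tag
    with urlopen(f"{API_SEARCH}?{urlencode(params)}") as response:
        return json.load(response)["rows"]


def squash(text):
    """Collapse runs of whitespace so a value fits on one line."""
    return " ".join(text.split())


def quoted_text(annotation):
    """Return the highlighted passage, or '' for page-level notes."""
    for target in annotation.get("target", []):
        for selector in target.get("selector", []):
            if selector.get("type") == "TextQuoteSelector":
                return squash(selector.get("exact", ""))
    return ""


def to_markdown(annotation):
    """Render one annotation as a small, atomic Markdown note."""
    titles = annotation.get("document", {}).get("title") or [annotation["uri"]]
    lines = [f"# {titles[0]}", "", f"Source: {annotation['uri']}", ""]
    quote = quoted_text(annotation)
    if quote:
        lines += [f"> {quote}", ""]
    if annotation.get("text"):
        lines += [annotation["text"], ""]
    tags = annotation.get("tags", [])
    if tags:
        lines.append(" ".join("#" + tag.replace(" ", "-") for tag in tags))
    return "\n".join(lines).rstrip() + "\n"


def to_anki_row(annotation):
    """Tab-separated flash card: highlight on the front, comment on the back."""
    return f"{quoted_text(annotation)}\t{squash(annotation.get('text', ''))}"


# Hand-written sample annotation, not real data.
sample = {
    "uri": "https://example.com/article",
    "text": "Compare with Luhmann's slip box.",
    "tags": ["zettelkasten", "note taking"],
    "document": {"title": ["An Example Article"]},
    "target": [{
        "source": "https://example.com/article",
        "selector": [{"type": "TextQuoteSelector", "exact": "Notes are atomic."}],
    }],
}
note = to_markdown(sample)
card = to_anki_row(sample)
```

Passing a tag to fetch_annotations is how the “pick and choose by tag” idea for course-related flash cards could work; writing each note to a file in a vault folder would cover the Obsidian side.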
These small pieces, loosely joined can provide some awesome opportunities for knowledge workers, students, researchers, and others. The education focused direction that Hypothes.is, many of these note taking platforms, and spaced repetition systems are all facing positions them to make a super-product that we all want and need.
So today, as a somewhat limited experiment, I played around with my Hypothes.is atom feed (https://hypothes.is/stream.atom?user=chrisaldrich, because you know you want to subscribe to this) and piped it into IFTTT. Each post creates a new document in a OneDrive file which I can convert to a markdown .md file that can be picked up by my Obsidian client. While I can’t easily get the tags the way I’d like (because they’re not included in the feed) and the formatting is incredibly close, but not quite there, the result is actually quite nice.
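The same feed-to-files step could also be done without IFTTT. Below is a minimal sketch in Python that parses an Atom feed and produces one Markdown file body per entry. The sample feed is hand-written and much simpler than the real Hypothes.is feed, whose exact elements I haven’t verified here; the file naming scheme and function names are my own assumptions.

```python
import re
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def slugify(title):
    """Lowercase, file-name-safe slug; falls back to 'untitled'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"


def feed_to_notes(atom_xml):
    """Return {filename: markdown text} for each entry in an Atom feed."""
    notes = {}
    root = ET.fromstring(atom_xml)
    for entry in root.findall("atom:entry", ATOM):
        title = entry.findtext("atom:title", default="", namespaces=ATOM).strip()
        updated = entry.findtext("atom:updated", default="", namespaces=ATOM).strip()
        body = entry.findtext("atom:content", default="", namespaces=ATOM).strip()
        link = entry.find("atom:link", ATOM)
        href = link.get("href", "") if link is not None else ""
        # Date prefix keeps files in rough chronological order in a folder.
        filename = f"{updated[:10]}-{slugify(title)}.md"
        notes[filename] = f"# {title}\n\nSource: {href}\nSaved: {updated}\n\n{body}\n"
    return notes


# Hand-written sample feed for illustration only.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example annotation stream</title>
  <entry>
    <title>Annotation on An Example Article</title>
    <link href="https://example.com/article"/>
    <updated>2020-08-01T12:00:00Z</updated>
    <content type="text">Notes are atomic.</content>
  </entry>
</feed>"""
notes = feed_to_notes(sample_feed)
```

Each filename and body pair can then be written into an inbox folder inside the vault. Note that two entries with the same date and title would overwrite one another in this sketch, and that tags would still be missing for the reason mentioned above.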
Since I can “drop” all my new notes into a particular folder, I can easily process them all at a later date/time if necessary. In fact, I find that the fact that I might want to revisit all my notes to do quick tweaks or adding links or additional thoughts provides the added benefit of a first round of spaced repetition for the notes I took.
Some notes may end up being deleted or reshuffled, but one thing is clear: I’ve never been able to so simply highlight, annotate, and take notes on documents online and get them into my notebook so quickly. And when I want to do something with them, there they are, already sitting in my notebook for manipulation, cross-linking, spaced repetition, and review.
So if the developers of any of these platforms are paying attention, I (and I’m sure others) really can’t wait for plugin integrations using the full power of the Hypothes.is API that allow us to all leverage Hypothes.is’ user interface to make our workflows seamlessly simple.
Over the past several weeks I’ve been thinking more and more about productivity solutions, bullet journals, and to do lists. This morning I serendipitously came back across a reply Paul Jacobson made about lab books on a post relating to bullet journals and thought I’d sketch out a few ideas.
I like the lab book metaphor! That’s probably why a notebook-note analogy appeals to me for my productivity tools. Paul Jacobson on .
I’m honestly a bit surprised that no one has created a bullet journal plugin for WordPress yet. Or maybe someone comes up with a bullet journal standalone product a bit like Automattic’s Simplenote? Last week after a talk I attended, someone came up to me who had self-published 400+ copies of a custom made bullet journal that they wanted to sell/market. I’ve also been looking at some bullet journal apps, but my very first thoughts were “Who owns this data? What will they do with it? What happens if the company goes out of business? Is there a useful data export functionality?” For one of the ones I looked at my immediate impression was “This is a really painful and unintuitive UI.”
Naturally my next thought was “how would the IndieWeb build such a thing?”
Perhaps there’s a lot of code to write, though I can imagine that simply creating Archive views of pre-existing data may be a good first start. In fact some good archive views would be particularly helpful if one is using a plugin like David Shanske’s Post Kinds which dramatically extends the idea behind Post Formats. This would make tracking things like eating, drinking, reading, etc. a lot easier to present visually as well as to track/journal. One could easily extend the functionality of Post Kinds to create “to do” items and then have archive views that could be sorted by date, date due, tags/categories for easier daily use. Since it’s all web-based, it’s backed up and available almost everywhere including desktop and mobile.
I know a few people like Jonathan LaCour and Eddie Hinkle have been tinkering around with monthly, weekly, or annual recaps on their websites (see also: https://indieweb.org/monthly_recap). Isn’t this what a lot of bullet journals are doing, but in reverse order? You put in data quickly so you can have an overview to better plan and live in the future? If you’re already using Micropub tools like teacup (for food/drink), OwnYourSwarm (for location), or a variety of others for bookmarking things (which could be added to one’s to-do list), then creating a handful of bullet journal-type views on that data should be fairly easy. I also remember that Beau Lebens had his Keyring project for WordPress that was pulling in a lot of data from various places that could be leveraged in much the same way.
In some sense I’m already using my own WP-based website as a commonplace book (or as Jamie Todd Rubin mentions on Paul’s post a (lab) notebook), so how much nicer/easier would it be if I could (privately) track to do lists as well?
Of course the hard part now is building it all…
Additional notes and ideas
I started thinking about some of this ages ago when I prototyped making “itches” for my own website. And isn’t this just a public-facing to-do list? I don’t immediately see a to-do list entry on the IndieWeb wiki though I know that people have talked about it in the past. There’s also definitely no bullet journal or productivity entries, but that doesn’t mean we couldn’t build them.
There are a lot of preexisting silos on the web that do to-do lists or which have productivity related personal data (Google notes, Evernote, Microsoft OneNote, etc.), so there are definitely many UI examples of good and bad display. For distributed group task management I could easily see things being marked done or undone and webmentions handling notifications for these. I suspect for this to take off on a wide, distributed scale for company-wide project management however, more work would need to exist on the ideas of audience and private or semi-private posts. The smaller personal side is certainly much more easily handled.
As another useful sub-case for study, I’ll note that several within the IndieWeb are able to post issues on their own websites, syndicate to GitHub’s issue queue, and get replies back, and isn’t this just a simple example workflow of a to-do list as well?
Greg McVerry has also mentioned he’s tinkered around in this area before primarily using pre-existing functionality in WithKnown. In his case, he’s been utilizing the related idea of the Pomodoro Technique which is widely known in productivity circles.
I’d be thrilled to hear ideas, thoughts, additional brainstorming, or even prior art examples of this sort of stuff. Feel free to add your thoughts below.
There’s so much great material out there to read and not nearly enough time. The question becomes: “How to best organize it all, so you can read even more?”
I just came across a tweet from Michael Nielsen about the topic, which is far deeper than even a few tweets could do justice to, so I thought I’d sketch out a few basic ideas about how I’ve been approaching it over the last decade or so. Ideally I’d like to circle back around to this and better document more of the individual aspects or maybe even make a short video, but for now this will hopefully suffice to add to the conversation Michael has started.
Lots of good insights in the responses. One thing stands out: this is a real pain point for many, & I don’t think anyone feels like they’ve nailed it (or how they organize information in general). It’d be great to have more ideas added to the thread! https://t.co/6KfhO5aVU3
— michael_nielsen (@michael_nielsen) March 8, 2018
How do people organize their reading? Perennially frustrated by this. I want one system that lets me trivially add books, papers, webpages, etc, re-organize very easily, search & filter. What works for you?
— michael_nielsen (@michael_nielsen) March 8, 2018
Keep in mind that this is an evolving system which I still haven’t completely perfected (and may never), but to a great extent it works relatively well and I still easily have the ability to modify and improve it.
The first piece of the overarching puzzle is to have a general structure for finding, collecting, triaging, and then processing all of the data. I’ve essentially built a simple funnel system for collecting all the basic data in the quickest manner possible. With the basics down, I can later skim through various portions to pick out the things I think are the most valuable and move them along to the next step. Ultimately I end up reading the best pieces on which I make copious notes and highlights. I’m still slowly trying to perfect the system for best keeping all this additional data as well.
Since I’ve seen so many apps and websites come and go over the years and lost lots of data to them, I far prefer to use my own personal website for doing a lot of the basic collection, particularly for online material. Toward this end, I use a variety of web services, RSS feeds, and bookmarklets to quickly accumulate the important pieces into my personal website which I use like a modern day commonplace book.
In general, I’ve been using the Inoreader feed reader to track a large variety of RSS feeds from various clearinghouse sources (including things like ProQuest custom searches) down to individual researcher’s blogs as a means of quickly pulling in large amounts of research material. It’s one of the more flexible readers out there with a huge number of useful features including the ability to subscribe to OPML files, which many readers don’t support.
As a simple example, arXiv.org has an RSS feed for the topic of “information theory” at http://arxiv.org/rss/math.IT which I subscribe to. I can browse through the feed and, based on titles and/or abstracts, quickly “star” the items I find most interesting within the reader. I have a custom recipe set up for the IFTTT.com service that pulls in all these starred articles and creates new posts for them on my WordPress blog. To these posts I can add a variety of metadata, including top level categories and lower level tags, in addition to any other metadata I’m interested in.
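For readers who would rather script this step than rely on a reader plus IFTTT, here is a minimal sketch of the same idea in Python. It is not how Inoreader or IFTTT work internally. The sample feed, the keyword-based "starring", and the draft post fields are all assumptions made up for illustration, and it assumes a plain RSS 2.0 layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a downloaded feed; a real one would be fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Hypothetical feed</title>
    <item>
      <title>Polar codes for channels with memory</title>
      <link>https://example.org/a</link>
      <description>A study of polar codes.</description>
    </item>
    <item>
      <title>A note on graph colorings</title>
      <link>https://example.org/b</link>
      <description>Unrelated combinatorics.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item> with its title, URL, and summary."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "url": (item.findtext("link") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


def star(items, keywords):
    """Keep items whose title or summary mentions any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    return [
        i for i in items
        if any(k in (i["title"] + " " + i["summary"]).lower() for k in wanted)
    ]


def to_draft_post(item, category, tags):
    """Shape a starred item into a draft bookmark post (field names are made up)."""
    return {
        "status": "draft",
        "title": item["title"],
        "bookmark_of": item["url"],
        "category": category,
        "tags": list(tags),
    }


starred = star(parse_items(SAMPLE_FEED), ["polar codes"])
drafts = [to_draft_post(i, "Information Theory", ["to-read"]) for i in starred]
```

Starring by hand in a reader is still a better filter than keyword matching. The sketch is only meant to show the shape of the data moving through the funnel.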
I have similar funnel entry points via many other web services. On platforms like Twitter, I use services like IFTTT.com or Zapier to push URLs easily to my website: I can quickly “like” a tweet and a background process will suck that tweet and any URLs within it into my system for future processing. This type of workflow extends to a variety of sites where I might come across material I want to read and process. (Think academic social services like Mendeley, Academia.edu, Diigo, or even less academic ones like Twitter, LinkedIn, etc.) Many of these services have their own storage and simple browser bookmarklets that allow me to add material to them. So with a quick click, it’s saved to the service and then automatically ported into my website almost without friction.
My WordPress-based site uses the Post Kinds Plugin which takes incoming website URLs and does a very solid job of parsing those pages to extract much of the primary metadata I’d like to have without requiring a lot of work. For well structured web pages, it’ll pull in the page title, authors, date published, date updated, synopsis of the page, categories and tags, and other bits of data automatically. All these fields are also editable and searchable. Further, the plugin allows me to configure simple browser bookmarklets so that with a simple click on a web page, I can pull its URL and associated metadata into my website almost instantaneously. I can then add a note or two about what made me interested in the piece and save it for later.
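To give a feel for what this kind of parsing involves, here is a minimal sketch using only Python's standard library. It is not the Post Kinds Plugin's actual implementation (which runs inside WordPress). The sample page, the choice of meta tags, and the output field names are assumptions for illustration.

```python
from html.parser import HTMLParser

# Map meta tag names/properties to the (made-up) field names used here.
FIELD_FOR_META = {
    "author": "author",
    "description": "synopsis",
    "keywords": "tags",
    "article:published_time": "published",
}


class MetaExtractor(HTMLParser):
    """Collect the <title> text and a few common <meta> values from a page."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or attrs.get("property") or "").lower()
            content = attrs.get("content")
            if content and name in FIELD_FOR_META:
                self.meta[FIELD_FOR_META[name]] = content.strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = self.meta.get("title", "") + data


def extract_metadata(page_html):
    """Return a dict of whatever metadata the page exposes."""
    parser = MetaExtractor()
    parser.feed(page_html)
    if "title" in parser.meta:
        parser.meta["title"] = parser.meta["title"].strip()
    return parser.meta


# Hypothetical page used only to exercise the sketch.
SAMPLE_PAGE = (
    "<html><head><title> An Example Page </title>"
    '<meta name="author" content="Jane Doe">'
    '<meta property="article:published_time" content="2018-03-08">'
    '<meta name="description" content="A short synopsis.">'
    "</head><body></body></html>"
)
metadata = extract_metadata(SAMPLE_PAGE)
```

Real pages are much messier than this, and well structured ones often carry richer markup (microformats, for instance) that a proper parser would prefer over bare meta tags.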
Note here that I’m usually more interested in saving material for later as quickly as I possibly can. In this part of the process, I’m rarely ever interested in reading anything immediately. I’m most interested in finding it, collecting it for later, and moving on to the next thing. This is also highly useful for things I find during my busy day that I can’t make time for at the moment.
As an example, here’s a book I’ve bookmarked to read simply by clicking “like” on a tweet I came across late last year. You’ll notice at the bottom of the post, I’ve optionally syndicated copies of the post to other platforms to “spread the wealth” as it were. Perhaps others following me via other means may see it and find it useful as well?
At regular intervals during the week I’ll sit down for an hour or two to triage all the papers and material I’ve been sucking into my website. This typically involves reading through lots of abstracts in a bit more detail to better figure out what I want to read now and what I’d like to read at a later date. I can delete out the irrelevant material if I choose, or I can add follow up dates to custom fields for later reminders.
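As a rough illustration of the follow-up dates mentioned above, here is a small sketch of how a triage session might surface what is due. The `follow_up` field name and the sample records are hypothetical. On a WordPress site this would more likely be a query against a custom field.

```python
from datetime import date


def due_for_review(items, today):
    """Return items whose follow-up date has arrived, oldest first.

    Items without a follow-up date are left alone for a later triage pass.
    """
    due = [i for i in items if i.get("follow_up") and i["follow_up"] <= today]
    return sorted(due, key=lambda i: i["follow_up"])


# Made-up saved items standing in for bookmarked posts.
SAVED = [
    {"title": "Paper A", "follow_up": date(2018, 3, 5)},
    {"title": "Paper B", "follow_up": date(2018, 3, 1)},
    {"title": "Paper C", "follow_up": date(2018, 4, 1)},
    {"title": "Paper D", "follow_up": None},
]

due = due_for_review(SAVED, today=date(2018, 3, 8))
```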
Slowly but surely I’m funneling down a tremendous amount of potential material into a smaller, more manageable amount that I’m truly interested in reading on a more in-depth basis.
Calibre with GoodReads sync
Even for things I’ve winnowed down, there is still a relatively large amount of material, much of it I’ll want to save and personally archive. For a lot of this function I rely on the free multi-platform desktop application Calibre. It’s essentially an iTunes-like interface, but it’s built specifically for e-books and other documents.
Within it I maintain a small handful of libraries: one for personal e-books, one for research related textbooks/e-books, and another for journal articles. It has a very solid interface and is extremely flexible in terms of configuration and customization. You can create a large number of custom libraries and create your own searchable and sortable fields with a huge variety of metadata. It often does a reasonable job of importing e-books, .pdf files, and other digital media and parsing out their metadata, which saves one from needing to do some of that work manually. With some well maintained metadata, one can very quickly search and sort a huge number of documents as well as quickly prioritize them for action. Additionally, the system does a pretty solid job of converting files from one format to another, so that things like converting an .epub file into a .mobi format for Kindle are automatic.
Calibre stores the physical documents either in local computer storage, or even better, in the cloud using any of a variety of services including Dropbox, OneDrive, etc. so that one can keep one’s documents in the cloud and view them from a variety of locations (home, work, travel, tablet, etc.)
I’ve been a very heavy user of GoodReads.com for years to bookmark and organize my physical and e-book library and anti-libraries. Calibre has an exceptional plugin for GoodReads that syncs data across the two. This plugin (and a few others) is exceptionally good at pulling in missing metadata to minimize the amount that must be entered by hand, which can be tedious.
Within Calibre I can manage my physical books, e-books, journal articles, and a huge variety of other document related forms and formats. I can also use it to further triage the things I intend to read and order them to the nth degree. My current Calibre libraries have over 10,000 documents in them including over 2,500 textbooks as well as records of most of my 1,000+ physical books. Calibre can also be used to add records for documents that one would ultimately like to acquire but doesn’t currently have access to.
BibTeX and reference management
In addition to everything else, Calibre has some well customized pieces for dovetailing all its metadata into a reference management system. It’ll allow one to export data in a variety of formats for document publishing and reference management, including BibTeX formats amongst many others.
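For those unfamiliar with BibTeX, the sketch below shows roughly what such an export produces from a metadata record. It is not Calibre's exporter. The record layout, the citation key, and the sample book are placeholders invented for illustration.

```python
def to_bibtex(key, record):
    """Format a simple book record as a BibTeX @book entry.

    `record` is a plain dict here; empty fields are skipped.
    """
    fields = [
        ("author", " and ".join(record.get("authors", []))),
        ("title", record.get("title", "")),
        ("publisher", record.get("publisher", "")),
        ("year", str(record.get("year", ""))),
    ]
    lines = ["@book{%s," % key]
    for name, value in fields:
        if value:
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Placeholder record, not a real reference.
EXAMPLE_RECORD = {
    "authors": ["A. Author", "B. Writer"],
    "title": "An Example Title",
    "publisher": "Example Press",
    "year": 2018,
}

entry = to_bibtex("example2018", EXAMPLE_RECORD)
```

The resulting entry can be pasted into a .bib file and cited from LaTeX by its key.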
Reading, Annotations, Highlights
Once I’ve winnowed down the material I’m interested in it’s time to start actually reading. I’ll often use Calibre to directly send my documents to my Kindle or other e-reading device, but one can also read them on one’s desktop with a variety of readers, or even from within Calibre itself. With a click or two, I can automatically email documents to my Kindle and Calibre will also auto-format them appropriately before doing so.
Typically I’ll send them to my Kindle which allows me a variety of easy methods for adding highlights and marginalia. Sometimes I’ll read .pdf files via desktop and use Adobe to add highlights and marginalia as well. When I’m done with a .pdf file, I’ll just resave it (with all the additions) back into my Calibre library.
Exporting highlights/marginalia to my website
For Kindle related documents, once I’m finished, I’ll use direct text file export or tools like clippings.io to export my highlights and marginalia for a particular text into simple HTML and import it into my website system along with all my other data. I’ve briefly written about some of this before, though I ought to better document it. All of this then becomes very easily searchable and sort-able for future potential use as well.
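As a hedged sketch of the text file route, the code below turns a Kindle-style clippings file into simple HTML. It assumes the commonly seen layout of a title line, a metadata line, the passage, and a line of ten equals signs between entries. Your device's export may differ, and the sample text is invented.

```python
import html

SEPARATOR = "=========="  # assumed entry delimiter in the clippings file


def parse_clippings(text):
    """Split a clippings export into dicts with title, kind, and text."""
    entries = []
    for chunk in text.lstrip("\ufeff").split(SEPARATOR):
        lines = [ln.strip() for ln in chunk.strip().splitlines()]
        lines = [ln for ln in lines if ln]
        if len(lines) < 3:
            continue  # skip empty or incomplete entries
        title, meta, body = lines[0], lines[1], " ".join(lines[2:])
        kind = "note" if "Your Note" in meta else "highlight"
        entries.append({"title": title, "kind": kind, "text": body})
    return entries


def to_html(entries, title):
    """Render one book's highlights as blockquotes and its notes as paragraphs."""
    parts = [f"<h2>{html.escape(title)}</h2>"]
    for entry in entries:
        if entry["title"] != title:
            continue
        text = html.escape(entry["text"])
        if entry["kind"] == "highlight":
            parts.append(f"<blockquote>{text}</blockquote>")
        else:
            parts.append(f"<p><em>Note:</em> {text}</p>")
    return "\n".join(parts)


# Invented sample in the assumed layout.
SAMPLE_CLIPPINGS = """An Example Book (Jane Doe)
- Your Highlight on page 12 | Location 180-182 | Added on Thursday, March 8, 2018 9:15:00 AM

A passage worth keeping & revisiting.
==========
An Example Book (Jane Doe)
- Your Note on page 12 | Location 182 | Added on Thursday, March 8, 2018 9:16:00 AM

Compare with the earlier chapter.
==========
"""

clippings = parse_clippings(SAMPLE_CLIPPINGS)
page = to_html(clippings, "An Example Book (Jane Doe)")
```

The HTML that comes out can be pasted into a post alongside the rest of the metadata for that text.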
Here’s an example of some public notes, highlights, and other marginalia I’ve posted in the past.
Over time, I’ve built up a huge amount of research related data in my personal online commonplace book that is highly searchable and sortable! I also have the option to make these posts and pages public, private, or even password protected. I can create accounts on my site for collaborators to use and view private material that isn’t publicly available. I can also share posts via social media and use standards like webmention and tools like brid.gy so that comments and interactions with these pieces on platforms like Facebook, Twitter, Google+, and others are imported back to the relevant portions of my site as comments. (I’m doing it with this post, so feel free to try it out yourself by commenting on one of the syndicated copies.)
Now when I’m ready to begin writing something about what I’ve read, I’ve got all the relevant pieces, notes, and metadata in one centralized location on my website. Synthesis becomes much easier. I can even have open drafts of things as I’m reading and begin laying things out there directly if I choose. Because it’s all stored online, it’s immediately available from almost anywhere I can connect to the web. As an example, I used a few portions of this workflow to actually write this post.
Naturally, not all of this is static and it continues to improve and evolve over time. In particular, I’m doing continued work on my personal website so that I’m able to own as much of the workflow and data there as possible. Ideally I’d love to have all of the Calibre related pieces on my website as well.
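For the curious, the core of a webmention is small: the sender finds the target page's advertised endpoint and POSTs the source and target URLs to it as form-encoded data. The sketch below covers only discovery from HTML (endpoints can also be advertised in an HTTP Link header) and builds the request without sending it. The URLs and helper names are hypothetical, and it isn't how brid.gy or any particular plugin is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin
from urllib.request import Request


class EndpointFinder(HTMLParser):
    """Find the first <link> or <a> element whose rel includes 'webmention'."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").lower().split()
        if "webmention" in rels and attrs.get("href") is not None:
            self.endpoint = attrs["href"]


def find_endpoint(target_url, page_html):
    """Return the absolute webmention endpoint for a page, or None."""
    finder = EndpointFinder()
    finder.feed(page_html)
    if finder.endpoint is None:
        return None
    return urljoin(target_url, finder.endpoint)


def build_webmention(endpoint, source, target):
    """Build (but do not send) the form-encoded POST notifying the endpoint."""
    body = urlencode({"source": source, "target": target}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(endpoint, data=body, headers=headers, method="POST")


# Hypothetical URLs and markup for illustration only.
TARGET = "https://example.org/post"
SOURCE = "https://example.com/reply"
TARGET_HTML = '<html><head><link rel="webmention" href="/webmention"></head></html>'

endpoint = find_endpoint(TARGET, TARGET_HTML)
request = build_webmention(endpoint, SOURCE, TARGET)
# Passing `request` to urllib.request.urlopen would actually send it.
```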
Earlier this week I even had conversations about creating new post types on my website related to things that I want to read to potentially better display and document them explicitly. When I can I try to document some of these pieces either here on my own website or on various places on the IndieWeb wiki. In fact, the IndieWeb for Education page might be a good place to start browsing for those interested.
One of the added benefits of having a lot of this data on my own website is that it not only serves as my research/data platform, but it also has the traditional ability to serve as a publishing and distribution platform!
Currently, I’m doing most of my research related work in private or draft form on the back end of my website, so it’s not always publicly available, though I often think I should make more of it public for the value of the aggregation nature it has as well as the benefit it might provide to improving scientific communication. Just think, if you were interested in some of the obscure topics I am and you could have a pre-curated RSS feed of all the things I’ve filtered through piped into your own system… now multiply this across hundreds of thousands of other scientists? Michael Nielsen posts some useful things to his Twitter feed and his website, but what I wouldn’t give to see far more of who and what he’s following, bookmarking, and actually reading? While many might find these minutiae tedious, I guarantee that people in his associated fields would find some serious value in it.
I’ve tried hundreds of other apps and tools over the years, but more often than not, they only cover a small fraction of the necessary moving pieces within the much larger apparatus that a working researcher and writer requires. This means that one often ends up using dozens of specialized tools across which there’s a huge duplication of data and effort. It also presumes these tools will be around for more than a few years and allow easy import/export of one’s hard-won data and the time invested in using them.
If you’re aware of something interesting in this space that might be useful, I’m happy to take a look at it. Even if I might not use the service itself, perhaps it’s got a piece of functionality that I can recreate into my own site and workflow somehow?
If you’d like help in building and fleshing out a system similar to the one I’ve outlined above, I’m happy to help do that too.