Rolling out to Medium users over the coming week will be a new, more satisfying way for readers to give feedback to writers. We call it “Claps.” It’s no longer simply whether you like, or don’t like, something. Now you can give variable levels of applause to a story. Maybe clap once, or maybe 10 or 20 times. You’re in control and can clap to your heart’s desire.
Yet another way to “like” a post….
This reminds me a lot of Path’s pivot to stickers. We all know how relevant it has made them since.
And all this just after Netflix, the company that has probably done more research on ranking than any other, has gone from a multi-star intent to a thumbs up/thumbs down in the past month.
Most of the measurements social media and other companies are really trying to make are signal to noise ratios as well as creating some semblance of dynamic range. A simple thumbs up creates almost no dynamic range compared to thumbs up/nothing/thumbs-down. Major platforms drive enough traffic that the SNR all comes out in the wash. Without the negative intent (dis-like, thumbs down, etc.) we’re missing out on some important data. It’s almost reminiscent to the science community only publishing their positive results and not the negative results. As a result scientific research is losing a tremendous amount of value.
We need to be more careful what we’re doing and why…
Many academics are using academic related social platforms (silos) like Mendeley, Academia.edu, Research Gate and many others to collaborate, share data, and publish their work. (And should they really be trusting that data to those outside corporations?)
A few particular examples: I follow physicist John Carlos Baez and mathematician Terry Tao who both have one or more academic blogs for various topics which they POSSE work to several social silos including Google+ and Twitter. While they get some high quality response to posts natively, some of their conversations are forked/fragmented to those other silos. It would be far more useful if they were using webementions (and Brid.gy) so that all of that conversation was being aggregated to their original posts. If they supported webmentions directly, I suspect that some of their collaborators would post their responses on their own sites and send them after publication as comments. (This also helps to protect primacy and the integrity of the original responses as the receiving site could moderate them out of existence, delete them outright, or even modify them!)
While it’s pretty common for researchers to self-publish (sometimes known as academic samizdat) their work on their own site and then cross-publish to a pre-print server (like arXiv.org), prior to publishing in a (preferrably) major journal. There’s really no reason they shouldn’t just use their own personal websites, or online research journals like yours, to publish their work and then use that to collect direct comments, responses, and replies to it. Except possibly where research requires hosting uber-massive data sets which may be bandwidth limiting (or highly expensive) at the moment, there’s no reason why researchers shouldn’t self-host (and thereby own) all of their work.
Instead of publishing to major journals, which are all generally moving to an online subscription/readership model anyway, they might publish to topic specific hubs (akin to pre-print servers or major publishers’ websites). This could be done in much the same way many Indieweb users publish articles/links to IndieWeb News: they publish the piece on their own site and then syndicate it to the hub by webmention using the hub’s endpoint. The hub becomes a central repository of the link to the original as well as making it easier to subscribe to updates via email, RSS, or other means for hundreds or even thousands of researchers in the given area. Additional functionality could be built into these to support popularity measures as well to help filter some of the content on a weekly or monthly basis, which is essentially what many publishers are doing now.
In the end, citation metrics could be measured directly on the author’s original page by the number of incoming webmetions they’ve received on it as others referencing them would be linking to them and therefore sending webmentions. (PLOS|One does something kind of like this by showing related tweets which mention particular papers now: here’s an example.)
Naturally there is some fragility in some of this and protective archive measures should be taken to preserve sites beyond the authors lives, but much of this could be done by institutional repositories like University libraries which do much of this type of work already.
I’ve been meaning to write up a much longer post about how to use some of these types of technologies to completely revamp academic publishing, perhaps I should finish doing that soon? Hopefully the above will give you a little bit of an idea of what could be done.
Sci Hub has been in the news quite a bit over the past half a year and the bookmarked article here gives some interesting statistics. I’ll preface some of the following editorial critique with the fact that I love John Bohannon’s work; I’m glad he’s spent the time to do the research he has. Most of the rest of the critique is aimed at the publishing industry itself.
From a journalistic standpoint, I find it disingenuous that the article didn’t actually hyperlink to Sci Hub. Neither did it link out (or provide a full quote) to Alicia Wise’s Twitter post(s) nor link to her rebuttal list of 20 ways to access their content freely or inexpensively. Of course both of these are editorial related, and perhaps the rebuttal was so flimsy as to be unworthy of a link from such an esteemed publication anyway.
Sadly, Elsevier’s list of 20 ways of free/inexpensive access doesn’t really provide any simple coverage for graduate students or researchers in poorer countries which are the likeliest group of people using Sci Hub, unless they’re going to fraudulently claim they’re part of a class which they’re not, and is this morally any better than the original theft method? It’s almost assuredly never used by patients, which seem to be covered under one of the options, as the option to do so is painfully undiscoverable past their typical $30/paper firewalls. Their patchwork hodgepodge of free access is so difficult to not only discern, but one must keep in mind that this is just one of dozens of publishers a researcher must navigate to find the one thing they’re looking for right now (not to mention the thousands of times they need to do this throughout a year, much less a career).
Consider this experiment, which could be a good follow up to the article: is it easier to find and download a paper by title/author/DOI via Sci Hub (a minute) versus through any of the other publishers’ platforms with a university subscription (several minutes) or without a subscription (an hour or more to days)? Just consider the time it would take to dig up every one of 30 references in an average journal article: maybe just a half an hour via Sci Hub versus the days and/or weeks it would take to jump through the multiple hoops to first discover, read about, and then gain access and then download them from the over 14 providers (and this presumes the others provide some type of “access” like Elsevier).
Those who lived through the Napster revolution in music will realize that the dead simplicity of their system is primarily what helped kill the music business compared to the ecosystem that exists now with easy access through the multiple streaming sites (Spotify, Pandora, etc.) or inexpensive paid options like (iTunes). If the publishing business doesn’t want to get completely killed, they’re going to need to create the iTunes of academia. I suspect they’ll have internal bean-counters watching the percentage of the total (now apparently 5%) and will probably only do something before it passes a much larger threshold, though I imagine that they’re really hoping that the number stays stable which signals that they’re not really concerned. They’re far more likely to continue to maintain their status quo practices.
Some of this ease-of-access argument is truly borne out by the statistics of open access papers which are downloaded by Sci Hub–it’s simply easier to both find and download them that way compared to traditional methods; there’s one simple pathway for both discovery and download. Surely the publishers, without colluding, could come up with a standardized method or protocol for finding and accessing their material cheaply and easily?
“Hart-Davidson obtained more than 100 years of biology papers the hard way—legally with the help of the publishers. ‘It took an entire year just to get permission,’ says Thomas Padilla, the MSU librarian who did the negotiating.” John Bohannon in Who’s downloading pirated papers? Everyone
Personally, I use use relatively advanced tools like LibX, which happens to be offered by my institution and which I feel isn’t very well known, and it still takes me longer to find and download a paper than it would via Sci Hub. God forbid if some enterprising hacker were to create a LibX community version for Sci Hub. Come to think of it, why haven’t any of the dozens of publishers built and supported simple tools like LibX which make their content easy to access? If we consider the analogy of academic papers to the introduction of machine guns in World War I, why should modern researchers still be using single-load rifles against an enemy that has access to nuclear weaponry?
My last thought here comes on the heels of the two tweets from Alicia Wise mentioned, but not shown in the article:
She mentions that the New York Times charges more than Elsevier does for a full subscription. This is tremendously disingenuous as Elsevier is but one of dozens of publishers for which one would have to subscribe to have access to the full panoply of material researchers are typically looking for. Further, Elsevier nor their competitors are making their material as easy to find and access as the New York Times does. Neither do they discount access to the point that they attempt to find the subscription point that their users find financially acceptable. Case in point: while I often read the New York Times, I rarely go over their monthly limit of articles to need any type of paid subscription. Solely because they made me an interesting offer to subscribe for 8 weeks for 99 cents, I took them up on it and renewed that deal for another subsequent 8 weeks. Not finding it worth the full $35/month price point I attempted to cancel. I had to cancel the subscription via phone, but why? The NYT customer rep made me no less than 5 different offers at ever decreasing price points–including the 99 cents for 8 weeks which I had been getting!!–to try to keep my subscription. Elsevier, nor any of their competitors has ever tried (much less so hard) to earn my business. (I’ll further posit that it’s because it’s easier to fleece at the institutional level with bulk negotiation, a model not too dissimilar to the textbook business pressuring professors on textbook adoption rather than trying to sell directly the end consumer–the student, which I’ve written about before.)
(Trigger alert: Apophasis to come) And none of this is to mention the quality control that is (or isn’t) put into the journals or papers themselves. Fortunately one need’t even go further than Bohannon’s other writings like Who’s Afraid of Peer Review? Then there are the hordes of articles on poor research design and misuse of statistical analysis and inability to repeat experiments. Not to give them any ideas, but lately it seems like Elsevier buying the Enquirer and charging $30 per article might not be a bad business decision. Maybe they just don’t want to play second-banana to TMZ?
Interestingly there’s a survey at the end of the article which indicates some additional sources of academic copyright infringement. I do have to wonder how the data for the survey will be used? There’s always the possibility that logged in users will be indicating they’re circumventing copyright and opening themselves up to litigation.
I also found the concept of using the massive data store as a means of applied corpus linguistics for science an entertaining proposition. This type of research could mean great things for science communication in general. I have heard of people attempting to do such meta-analysis to guide the purchase of potential intellectual property for patent trolling as well.
Finally, for those who haven’t done it (ever or recently), I’ll recommend that it’s certainly well worth their time and energy to attend one or more of the many 30-60 minute sessions most academic libraries offer at the beginning of their academic terms to train library users on research tools and methods. You’ll save yourself a huge amount of time.
At the end of April, I read an article entitled “In the Margins” in the Johns Hopkins University Arts & Sciences magazine. I was particularly struck by the comments of eminent scholar Jacques Neefs on page thirteen (or paragraph 20) about computers making marginalia a thing of the past:
I actually think that he may be completely wrong and that current technology actually allows us to keep far more marginalia! (Has anyone heard of digital exhaust?) The bigger issue may be that many writers just don’t know how to keep a better running log of their work to maintain all the relevant marginalia they’re actually producing. (Of course there’s also the subsequent broader librarian’s “digital dilemma” of maintaining formats for the future. As an example, thing about how easy or hard it might be for you to read that ubiquitous 3.5 inch floppy disk you used in 1995.)
A a technologist who has spent many years in the entertainment industry, I feel compelled to point everyone towards the concept of revision control (or version control) within the realm of computer science. Though it’s primarily used in tracking changes in computer programs and is often a tool used by large teams of programmers, it can very easily be used for tracking changes in almost any type of writing from novels, short stories, screenplays, legal contracts, or any type of textual documentation of nearly any sort.
Example Use Cases for Revision Control
As a direct example, I’m using what is known as a Git repository to track every change I make in a textbook I’m currently writing. I can literally go back and view every change I’ve made since beginning the project, so though I’m directly revising one (or more) text files, all of my “marginalia” and revisions are saved and available. Currently I’m only doing it for my own reference and for additional backup not supposing that anyone other than myself or an editor possibly may want to ever peruse it. If I was working in conjunction with otheres, there are ways for me to track the changes, edits, or notes that others (perhaps an editor or collaborator) might make.
In addition to the general back-up of the project (in case of catastrophic computer failure), I also have the ability to go back and find that paragraph (or multiple pages) I deleted last week in haste, but realize that I desperately want them back now instead of having to recreate them de n0vo.
Because it’s all digital, future scholars also won’t have problems parsing my handwriting issues as has occasionally come up in differentiating Mary Shelley’s writing from that of her husband in digital projects like the Shelley Godwin Archive. The fact that all changes are tracked and placed in a tree-like structure will indicate who wrote what and when and will indicate which changes were ultimately accepted and merged into the final version.
Screenplays in Hollywood
One particular use case I can easily see for such technology is tracking changes in screenplays over time. I’m honestly shocked that every production company or even more likely studios don’t use such technology to follow changes in drafts over time. In the end, doing such tracking will certainly make Writers Guild of America (WGA) arbitrations much easier as literally every contribution to a script can be tracked to give screenwriters appropriate credit. The end results with the easy ability to time-machine one’s way back into older drafts is truly lovely, and the outputs give so much more information about changes in the script compared to the traditional and all-too-simple (*) which screenwriters use to indicate that something/anything changed on a specific line or the different colored pages which are used on scripts during production.
I can also picture future screenwriters using services like GitHub as platforms for storing and distributing their screenplays to potential agents, managers, and producers.
Redlining Legal Documents
Having seen thousands of legal agreements go back and forth over the years, revision control is a natural tool for tracking the redlining and changes of legal documents as they change over time before they are finally (or even never) executed. I have to imagine that being able to abstract out the appropriate metadata in the long run may actually help attorneys, agents, etc. to become better negotiators, but something like this is a project for another day.
In addition to direct research for projects being undertaken by academics like Neefs, academics should look into using revision control in their own daily work and writings. While writing a book, paper, journal article, essay, monograph, etc. (or graduate students writing theses) one could use their own Git repository to not only save but to back up all of their own work not only for themselves primarily, but also future scholars who come later who would not otherwise have access to the “marginalia” one creates while manufacturing their written thoughts in digital form.
I can easily picture Git as a very simple “next step” in furthering the concept of the digital humanities as well as in helping to bridge the gap between C.P. Snow’s “two cultures.” (I’d also suggest that revision control is a relatively simple step one could take before learning a particular programming language, which I think should be a mandatory tool in everyone’s daily toolbox regardless of their field(s) of interest.)
Start Using Revision Control
“But how do I get started?” you ask.
Know going in that it may take parts of a day to get things set up and running, but once you’ve started with the basics, things are actually pretty easy and you can continue to learn the more advanced subtleties as you progress. Once things are working smoothly, the additional overhead you’ll be expending won’t be too much more than the old method of hitting Alt-S to save one of your old Word documents in the time before auto-save became ubiquitous.
First one should start by choosing one of the myriad revision control systems that exist. For the sake of brevity in this short introductory post, I’ll simply suggest that users take a very close look at Git because of its ubiquity and popularity in the computer science world and the fact that it includes a tremendously large amount of free information and support from a variety of sites on the internet. Git also has the benefit of having versions for all major operating systems (Windows, MacOS, and Linux). Git also has the benefit of a relatively long and robust life within the computer science community meaning that it’s very stable and has many more resources for the uninitiated to draw upon.
Once one has Git installed on their computer and has begun using it, I’d then recommending linking one’s local copy of the repository to a cloud storage solution like either GitHub or BitBucket. While GitHub is certainly one of the most popular Git-related services out there (because it acts, in part, as the hub for a large portion of the open internet and thus promotes sharing), I often recommend using BitBucket as it allows free unlimited private but still share-able repositories while GitHub requires a small subscription fee for keeping one’s work private. Having a repository in the cloud will help tremendously in that your work will be available and downloadable from almost anywhere and because it also serves as a de-facto back-up solution for your work.
I’ve recently been playing around with version control to help streamline the writing/editing process for a book I’ve been writing. Though Git and it’s variants probably seem more daunting than they should to the everyday user, they really represent a very powerful tool. I’ve spent less than two days learning the basics of both Git and hosted repositories (GitHub and Bitbucket), and it has been more than well worth the minor effort.
There is a huge wealth of information on revision control in general and on installing and using Git available on the internet, including full textbooks. For the complete beginners, I’d recommend starting with The Chronicle’s “A Gentle Introduction to Version Control.” Keep in mind that though some of these resources look highly technical, it’s because many are trying to enumerate every function one could potentially desire, when even just the basic core functionality is more than enough to begin with. (I could analogize it to learning to drive a car versus actually reading the full manual so that you know how to take the engine apart and put it back together from scratch. To start with revision control, you only need to learn to “drive.”) Professors might also avail themselves of the use of their local institutional libraries which may host small sessions on learning such tools, or they might avail themselves of the help of their colleagues or students in the computer science department. For others, I’d recommend taking a look at Git’s primary website. BitBucket has an excellent step-by-step tutorial (and troubleshooting) for setting up the requisite software and using it.
What do you use for revision control?
I’ll welcome any thoughts, experiences, or additional resources one might want to share with others in the comments.