Lies, damned lies and download counts


Shirley Wu posted on FriendFeed earlier about some of the things she’d overheard people saying about PLoS ONE papers. PLoS ONE Managing Ed Peter Binfield weighed in early to point out that the best way of combating misconceptions about the journal is to push out positive info, and mentioned the journal’s article-level metrics program.

Near the end of the (long) thread was this exchange:

“You could try asking them exactly how many downloads their last paper in a ‘high impact’ journal got… – Peter Binfield

Fair enough, but you know, I really don’t think they think about that. They think “what will be in my CV?” and they think any journal that is somewhat competitive [includes other PLoS journals, BMC journals, etc] looks better than one that accepts anything that’s methodologically sound. Again, not my view, but perhaps one that is held by many. Do people list # of downloads on their CV for publications? – Shirley Wu

They dont, because they dont have the data. However, people do list if their paper was rated by F1000; or if BMC designated it a ‘highly accessed’ article. So I think they will start to say “this paper was downloaded 5000 times in the first 3 months which put it in the top x% of all PLoS ONE articles, the top y% of all PLoS articles, and the top z% of ALL articles” (when the rest of the world starts quoting this data) – Peter Binfield”

Do people here think that article downloads stats should be put on academic CVs? (serious question)

It feels wrong to me. IMHO encouraging anybody to take download statistics seriously as a measure of success / quality would be a mistake. Taken on their own they’re meaningless, surely – nice to know for the author, but meaningless. For them to be at all useful you’d have to supply a lot of context – as Peter suggests – though I don’t think the journal level “top 10% of papers in first three months” context he outlined would be enough either.

(just to be clear, I don’t think Peter was necessarily saying that people should put only the download count on their CV – I’m using his comment above simply as a jumping-off point for discussion)

A download counter can’t tell if the person visiting your paper is a grad student looking for a journal club paper, a researcher interested in your field or… somebody who typed in an obscure porn-related search that turned up unconnected words in the abstract. A search bot. Somebody on Google Images looking for free clipart. Got a blog? Check your traffic stats. Journals get those crazy queries too, lots of them. Mainstream search engines are a major source of traffic for journals, but not always for the reasons publishers might want.

As a publisher do you account for this and only record ‘good’ traffic? What if your competition don’t?

Institutions and ISPs transparently cache pages. If my lab mate and I both download your paper then, depending on the publisher’s stats package, it might register as only one hit (from the university proxy server). Do you compensate for that somehow?
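To make the proxy problem concrete, here's a rough sketch. Everything in it is made up: the log format, the IP addresses and the "one hit per IP address per paper" rule are assumptions for illustration, not a description of how any real publisher's stats package works.

```python
from collections import Counter

# Hypothetical access log of (client_ip, paper_id) pairs. Both lab mates sit
# behind the same university proxy, so the publisher sees one IP for the two.
log = [
    ("192.0.2.10", "paper-A"),    # me, via the university proxy
    ("192.0.2.10", "paper-A"),    # my lab mate, same proxy address
    ("198.51.100.7", "paper-A"),  # a reader at home
]

def raw_hits(entries):
    """Count every request as a download."""
    return Counter(paper for _, paper in entries)

def unique_ip_hits(entries):
    """Count at most one download per (IP, paper) pair (an assumed rule)."""
    return Counter(paper for _, paper in set(entries))

print(raw_hits(log)["paper-A"])        # 3
print(unique_ip_hits(log)["paper-A"])  # 2
```

Three people read the paper, but the deduplicating counter reports two. And if the proxy serves my lab mate a cached copy, that second request may never reach the publisher's logs at all.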

Am I going to be penalized if I host my papers on my homepage? In my institutional repository? Should I add all those counts up for my CV? Do I need to cite my sources?

Should I tell my mum to set my paper as her homepage (and to be sure to delete her cookies each morning)?

If Science spends $50m on SEO next year and hits on its article pages double, will the articles in 2010 be twice as good as those in 2009?

As an author should I be repeating keywords in my title to get more Google traffic? Should I try to include a figure of Britney Spears?

If we stick to giving ‘top x percentage’ context, then do we make concessions for smaller disciplines publishing in multidisciplinary journals? More people work and publish in genetics than in quantum physics. Even if every important person in your field downloads your paper, they might be outnumbered by grad students from the three dozen groups working on Rab4A effectors who download the genetics paper next to yours in the TOC.
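Here's a toy illustration of how much the choice of comparison group matters. The download counts are invented and `percentile_rank` is just one simple definition I've picked for the sketch (share of papers with a count at or below yours); nobody's real metric is being reproduced here.

```python
def percentile_rank(value, population):
    """Percentage of the population with a count at or below value."""
    return 100.0 * sum(1 for x in population if x <= value) / len(population)

# Made-up download counts for one issue of a multidisciplinary journal.
downloads = {
    "genetics": [900, 1200, 1500, 2000, 2500, 3000],
    "quantum physics": [150, 200, 250, 400],
}

my_paper = 400  # the most-downloaded physics paper in this invented issue

whole_journal = [n for counts in downloads.values() for n in counts]
journal_pct = percentile_rank(my_paper, whole_journal)
field_pct = percentile_rank(my_paper, downloads["quantum physics"])

print(f"vs. whole journal: {journal_pct:.0f}th percentile")  # 40th
print(f"vs. own field:     {field_pct:.0f}th percentile")    # 100th
```

Same paper, same number of downloads: bottom half of the journal, top of its field. Which of those goes on the CV?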

I’m not saying that download stats aren’t useful in aggregate, or that authors don’t have a right to know how many hits their papers received, but they’re so potentially misleading (and so open to misinterpretation) that they don’t seem to me to be the type of metric we want to be bandying about as an impact factor replacement.

