A detailed update on readership for the first 200 posts
March 23, 2015
It is time — this is the 201st article on TheEGG — to get an update on readership since our 151st post and lament on why academics should blog. I apologize for this navel-gazing post, and it is probably of no interest to you unless you are really excited about blog statistics. I am writing this post largely for future reference and to celebrate this arbitrary milestone.
The statistics in this article are largely superficial proxies (what does a view even mean?) and are notable only because of how easy they are to track. These proxies should never be used to seriously judge academics, but I do think they can serve as a useful self-tracking tool. Making your blog’s statistics publicly available can give other bloggers a useful point of comparison for what sort of readership and posting habits are typical. In keeping with this rough and lighthearted comparison, according to Jeromy Anglim’s order-of-magnitude rules of thumb, in the year since the last update the blog has been popular in terms of RSS subscribers and relatively popular in terms of annual page views.
As before, I’ll start with the public self-metrics of the viewership graph for the last 6 and a half months:
If you’d like to know more, dear reader, then keep reading. Otherwise, I will see you on the next post!
Unfortunately, the above graph does not go all the way back to the previous update of March 18th, 2014. This is partly because I forgot to take screenshots of the stats page, and partly because there were only 7 posts in the period of March 25th – May 4th, followed by a nearly 4-month silence until my apology on August 29th.
Given that TheEGG’s archive is growing sizable, I created an about & highlights page that breaks down the articles into 7 rough themes: (i) Algorithmic theory of biology; (ii) Bounded rationality in economics and finance; (iii) Cognitive science and philosophy of mind; (iv) Evolutionary game theory; (v) Mathematical oncology and theoretical biology; (vi) Metamodeling and philosophy of science; and (vii) Theoretical computer science and machine learning. For each section I selected 3 articles, based mostly on viewership, to give a new visitor about a 10% sample of posts to start with. However, in keeping with the tradition of my previous updates, below is the list of the top 10% of the posts on the blog by total viewership; 7 are from the last 50, and 13 return from the top 15 of the previous update.
- Defining empathy, sympathy, and compassion (19,471)
- Critical thinking and philosophy (17,821)
- Machine learning and prediction without understanding (10,676)
- Three types of mathematical models (8,409)
- Hunger Games themed semi-iterated prisoner’s dilemma tournament (8,328)
- Models and metaphors we live by (6,606)
- Toward an algorithmic theory of biology (6,573)
- Should we be astonished by the Principle of “Least” Action? by Abel Molina (5,823)
- Software through the lens of evolutionary biology (5,658)
- Evolution is a special kind of (machine) learning (5,484)
- Transcendental idealism and Post’s variant of the Church-Turing thesis (5,327)
- Micro-vs-macro evolution is a purely methodological distinction (5,127)
- Bounded rationality: systematic mistakes and conflicting agents of mind (4,812)
- Monoids, weighted automata and algorithmic philosophy of science (4,613)
- Is Chaitin proving Darwin with metabiology? (4,430)
- Weapons of math destruction and the ethics of Big Data (3,972)
- Are all models wrong? (3,859)
- Four color problem, odd Goldbach conjecture, and the curse of computing (3,448)
- Programming language for biochemistry (3,445)
- Personification and pseudoscience (3,397)
Since this is the fourth update on readership, we can try to track some trends on the total views and top 10% from the previous updates on February 5, 2013; August 4, 2013; and March 18, 2014. The total views went from 16,751 to 79,722 to 159,455 to 316,565; the views threshold for being in the top 10% went from 559 (1.67x the approximate mean) to 1,633 (2.05x) to 2,152 (2.02x) to 3,397 (2.15x); the views for the top post went from 2,863 (8.55x) to 7,200 (9.03x) to 8,086 (7.61x) to 19,471 (12.3x). The fraction of all post views (excluding the front page) to the top 10% of the posts went from less than ~47% (I did not exclude the front page in the first stats update, so can only estimate) to 51% to 46% to 48%. I don’t know what any of these numbers mean, but hey: numbers!
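The multiples in parentheses can be reproduced with a little arithmetic. The sketch below assumes that the four updates covered 50, 100, 150, and 200 posts; only the 200 is stated in this post, and the other three counts are an assumption inferred from the quoted ratios. The approximate mean is taken to be total views divided by number of posts, front page included.

```python
# Reproduce the "x the approximate mean" multiples quoted above.
# ASSUMPTION: the updates covered 50, 100, 150, and 200 posts; only the
# 200 is stated in the post, the rest are inferred from the ratios.
updates = [
    # (posts, total views, top-10% threshold, top post views)
    (50, 16_751, 559, 2_863),
    (100, 79_722, 1_633, 7_200),
    (150, 159_455, 2_152, 8_086),
    (200, 316_565, 3_397, 19_471),
]


def multiples(posts, total, threshold, top):
    """Return (threshold / mean, top / mean) for mean = total / posts."""
    mean = total / posts  # approximate: the total includes front-page views
    return threshold / mean, top / mean


ratios = [multiples(*row) for row in updates]
for (posts, *_), (thr, top) in zip(updates, ratios):
    print(f"{posts:>3} posts: threshold {thr:.2f}x, top post {top:.2f}x")
```

Under these assumed post counts, the rounded ratios match the figures quoted in the paragraph above.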
Guests and contributors
What I am most excited about is seeing Abel’s post in the top 10% of viewership. TheEGG was originally intended as a group blog (hence the last G standing for group), and I have worked hard to get authors other than myself to contribute writing. As you can see on the sidebar, TheEGG has content from 15 authors and they’ve contributed from all over the globe:
In the last year we’ve had posts from the following fine folks:
- Jill Gallaher wrote about her experience at the annual IMO workshop as leader of team microbiome.
- Philip Gerlee and Phillip Altrock argued that evolutionary game theory, or frequency-dependent selection more broadly, is not the right framework for thinking about mathematical oncology. The post generated a very lively discussion in the comments, and I have since expanded on Philip & Phillip’s discussion of space and stochasticity in EGT and started fumbling toward a response from both the experimental and philosophical perspectives; I will write a coherent rebuttal eventually.
- Abel Molina suggested that we should not be amazed by the least action principle and discussed some potential motivations for pursuing theoretical computer science.
- Marcel Montrey continued his consistent contributions to TheEGG with an article on Rogers’ paradox.
- Dan Nichol shared his recent results (and the personal story behind them) on the non-commutativity of fitness landscapes, which resulted in an exciting discussion with Milo Johnson — a future TheEGG contributor — in the comments.
- Alexander Yartsev kicked off a series on ethics by examining primate sociality, and neuroscience and development of humans.
Since the start of 2012, however, there has been a relatively steady increase in the fraction of the blog written by me. It seems to have now stabilized at around 82%. Hopefully in the next 50 posts, we can push this down to 75%. If you think that your writing would be a good fit for TheEGG then contact me and let me know, and we can see if it fits the blog’s vision and standards.
Post properties
Above we have a lot of statistics about the blog overall, but not much on individual posts. From having written most of the posts, I have a vague feeling of what the ‘typical’ post looks like, but I decided to also calculate some statistics. At the right is the graph of two cumulative distributions: views in red and words in blue. The graph is truncated at 2500 views/words, because plotting the outliers would obscure the focus on the typical post.
The two vertical blue lines correspond to the median and mean of the distribution of words. The mean is slightly higher than the median; from reading the horizontal blue intercept, we can see that the mean is at the 52nd percentile and around 1393 words; the median-length post is “False memories and journalism” at 1358 words. This close correspondence between mean and median is not surprising because I explicitly aim to have posts in the range of 1k to 1.5k words. I try to split most excessively long posts into self-contained parts.
Viewership, however, is much more skewed. The two vertical red lines correspond to the median and mean of the distribution of views. The mean is far higher than the median. From reading the horizontal red intercept, we can see that the mean is just past the 75th percentile: 75% of the posts have fewer views than average. In terms of views, the median post is the transcript of my old TEDxMcGill Talk on evolving cooperation, with 681 views. A mean post, however, has around 1409 views, similar to the post “Game theoretic analysis of motility in cancer metastasis”.
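To see how a few popular posts can push the mean so far above the median, here is a minimal sketch. The view counts are made up for illustration (they are not TheEGG’s data), and `percentile_of` is a hypothetical helper rather than a library function.

```python
from statistics import mean, median

# HYPOTHETICAL view counts, not TheEGG's data: most posts get modest
# traffic and a couple of outliers get a lot.
views = [120, 250, 400, 560, 680, 700, 900, 1_100, 1_300, 1_600, 5_000, 19_000]


def percentile_of(value, data):
    """Percentage of entries in data that are strictly below value."""
    return 100 * sum(x < value for x in data) / len(data)


avg = mean(views)
mid = median(views)
print(f"median {mid:.0f} views, mean {avg:.0f} views")
print(f"{percentile_of(avg, views):.0f}% of posts fall below the mean")
```

With word counts, which cluster around a target length, the same calculation would put the mean close to the 50th percentile instead.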
This skew of a small fraction of posts bringing in most views is not too surprising given how most traffic arrives on TheEGG. Of all the referrals to the blog, Reddit is by far the most prominent; it brings in around 97k views, with the most common source subreddits (other than the front page) being /r/programming (5.5k), /r/math (5.2k), /r/compsci (3.8k), and /r/philosophy (3.7k). I am a little surprised not to see my favorite subreddit /r/PhilosophyofScience in that list. The second most common source of traffic is searches, with around 37k views. If you are curious, the most common search terms are variants on ‘types of mathematical models’, ‘spatial structure’, and ‘metabiology’. Social media follows in third with G+ at 4.7k, Twitter at 4.3k, and Facebook at 4.1k.
Unsurprisingly, more recent posts on the blog have higher total readership than older ones. I like to imagine that this is due to a gradual increase in the quality of content, but it might be just time and better promotion. The left panel below shows the total views of each post to date versus the time it was published. Note that the views scale is logarithmic with base 10. The fit is an exponential, which has a doubling time of around 1.5 years.
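For readers wondering how a doubling time falls out of such a fit, here is a minimal sketch. The post does not say how the fit was done; this assumes ordinary least squares on log-views, which is what a straight line on a log-scale plot corresponds to. The data and the function name `doubling_time` are hypothetical.

```python
from math import log2


def doubling_time(times, views):
    """Least-squares line through (t, log2(views)); returns 1 / slope.

    times are in years, so the result is a doubling time in years.
    """
    logs = [log2(v) for v in views]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(logs) / n
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(times, logs))
             / sum((t - t_bar) ** 2 for t in times))
    return 1 / slope


# HYPOTHETICAL data: posts published 0 to 4 years after launch whose views
# follow 200 * 2**(t / 1.5) exactly, so the fit should recover 1.5 years.
times = [0.0, 0.5, 1.0, 2.0, 3.0, 4.0]
views = [200 * 2 ** (t / 1.5) for t in times]
print(f"doubling time: {doubling_time(times, views):.2f} years")
```

Taking the logarithm in base 2 means the slope is in doublings per year, so its reciprocal is the doubling time directly; the base 10 used on the plot’s axis only rescales the slope.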
The panel at right has views versus words. This is mostly to entertain David Basanta’s question of whether longer posts garner a smaller readership. It seems that up to around 1k words, there is an increase in readership; this is probably because it is hard to express an interesting idea in less than 1k words. After about 1k, length does not seem to affect the viewership much, although maybe I should test this more carefully. It is, however, possible to do some fitting to find the ‘sweet-spot’ for word length and, depending on which method I use, it is somewhere between 1860 and 2070 words. For comparison, this post (which, for obvious reasons, is not included in these stats) is 3021 words long and will probably be viewed less than 302 times in the next year.
Finally, on the regularity of the blog. Although there are some notable long gaps between posts, it seems that the typical (i.e. median) gap is about half a week. In particular, over 53% of the posts come 3 days or less after their predecessor. My personal goal is to post at least once every 7 days; this resolution is violated less than 14% of the time. For example, so far this year there has only been one gap of over a week: a two-week gap while I prepared the last post on pairing tools and problems. At the right, you can see a cumulative distribution of the post delay times. The distribution is truncated and does not display the 5 times TheEGG had a silence of greater than a month.
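The regularity figures above are simple to compute from a list of publication dates. The following is a sketch with hypothetical dates (not TheEGG’s actual posting history) showing the median gap, the share of posts arriving within 3 days of their predecessor, and the share of gaps longer than a week.

```python
from datetime import date
from statistics import median

# HYPOTHETICAL publication dates, not TheEGG's actual posting history.
posted = [
    date(2015, 1, 2), date(2015, 1, 4), date(2015, 1, 7), date(2015, 1, 9),
    date(2015, 1, 20), date(2015, 1, 23), date(2015, 1, 29), date(2015, 2, 1),
]

# Days between each post and its predecessor.
gaps = [(b - a).days for a, b in zip(posted, posted[1:])]

median_gap = median(gaps)
within_3_days = sum(g <= 3 for g in gaps) / len(gaps)
over_a_week = sum(g > 7 for g in gaps) / len(gaps)

print(f"gaps in days: {gaps}")
print(f"median gap: {median_gap} days")
print(f"{within_3_days:.0%} within 3 days, {over_a_week:.0%} over a week")
```

Sorting `gaps` and plotting rank against value would give the kind of cumulative distribution described above.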
Of course, the most rewarding and important part of the blog is not the easy-to-track properties like view and word counts, but the community of readers. It is tempting to take the 2,417 followers that TheEGG has on WordPress and via email as a measure of the size of the community, but I think that would be an unreasonable overestimate. A lot of the WordPress accounts that follow this blog, for instance, seem to be small businesses offering various products/services that have nothing to do with the content of TheEGG. My pessimism leads me to suspect there is some search engine optimization trick being exploited here, or maybe a convention that I am unaware of (and thus don’t adhere to) of following back the people that follow you? I’ve tried to keep track manually of the people that engage with TheEGG on social media and through commenting, by creating a G+ circle of engagers. A lot of the engagement that I see comes from Twitter, and a lot of tweeps don’t have G+ accounts, so this circle is an underestimate of the community. If you are a regular reader or occasional sharer or commenter and I forgot to include you in the circle then let me know and I will add you to it; that way other readers can find you!
A better metric for community might be the number of comments on the blog, since that reveals the readers that are passionate enough to write responses and suggestions. Here, the statistics don’t look too impressive. There are 701 comments (225 of them by me) and 989 pingbacks (most of them internal to the blog). On the right is the cumulative distribution for the comments per post in blue and pingbacks per post in red. On 33.5% of the posts there are no comments, and on 57.5% of the posts there are 2 or fewer comments; often this is a single reader’s comment with my response.
The most commented-on post is “Kooky history of the quantum mind” with 27 comments. The discussion on that post was incredibly useful to me, in that it led me to learn about Hoffman’s interface theory of perception, a find that has been directly useful to my research. That thread also got me into a bit of trouble in a later email discussion with Hoffman. I was very critical when I first saw Hoffman’s talk, mistaking parts of it for new-age pseudoscience, and expressed that sentiment in the comments. (Coincidentally, the most commented-on post in the year since the last stats update was my post on pseudoscience, with 20 comments.) My comment was then picked up by a (popular?) skeptic forum and eventually reached Hoffman. So yes, dear reader, there are some dangers to blogging, but they are greatly outweighed by the benefits. Still, I fear that my harsh commenting might be driving off some potential interlocutors, so I am trying to work on my tone in discussion.
The post with the most pingbacks at 29 (and also with the third-highest number of comments at 18) is on the three types of mathematical models. This is expected, since I often link to that post for definitions of heuristics and insilications. Maybe I should create a glossary of terms.
The most memorable experiences with the community of readers, however, are qualitative ones. For example, the Twitter and email reaction to my post on Bernstein polynomials and the public good led to the guest post by Philip Gerlee and Phillip Altrock. Another is when a fellow researcher that I just met mentions that they read the blog, or pulls up a copy of a post from their EndNote.
I was also extremely honored last year when TheEGG was listed as one of the top 30 computer science and programming blogs alongside wonderful blogs that I often frequent. I was particularly excited about their summary of the content here:
This blog weaves together computer science, the theory of evolution, and game theory into a masterpiece of interdisciplinary research.
Hopefully I can maintain this site as a space worth visiting.
As I’ve mentioned earlier in this article, one of my goals is to introduce more contributors to the blog and feature more writing from other researchers. I also feel like I have fallen behind in my reading and commenting on others’ blogs, and so will aim to engage more with the blog-o-sphere to widen the community beyond this site. Obviously, this has to be balanced against my other commitments. Although this blog is a part of my research workflow, it is only a small part and one that is not particularly rewarded by the traditional academy.
How would you like to see TheEGG develop, dear reader?