Double Blind Peer Review: Some Thoughts for the First Timers

Double blind peer review is supposed to be the gold standard of academic research.

According to this model, authors submit manuscripts to journal or book editors who in turn send the papers to experts in the field.  These experts are supposed to evaluate the manuscripts anonymously; neither the reviewer nor the author is supposed to know the identity of the other until after publication, when at least the author's identity is revealed.

Given how important peer review is to academic success, it’s astounding that we are rarely trained in how to actually do it!

The results of this lack of training, quite frankly, can be very frustrating and sometimes insulting for authors, who frequently, though not always, have to find a way to satisfy a reviewer who:

• didn’t carefully read the paper but instead briefly skimmed it and the list of references (for their name!);
• provided little to zero comments on how to improve the paper;
• is fundamentally opposed to your theoretical or methodological choices; and/or
• is just plain rude and insulting of your intellectual abilities and writing capabilities (apparently, I write like a first year undergraduate, which may be true if you talk to some of my co-authors!).

Of course, there are also many reviewers out there who provide very helpful comments and thoughtful reviews/rejections. But occasionally, one of the reviewers will commit one or more of the above sins, plus be four months overdue in submitting their review!

Having received and done many peer reviews over the last decade, I’ve started to develop a number of guidelines in writing my referee reports.  I try to review these guidelines before and after I complete a review.  For those who are just starting the peer review game as a referee, here are some tips or helpful advice to consider.

1. Accountability and transparency are important!  If you know who the author is, or have a pretty good idea, let the editor know immediately before doing the review.  Discuss your ability to write a fair and relatively unbiased referee report and then leave it to the editor to decide whether you should complete the review.
2. Read the manuscript at least twice! The first time through should be to simply understand and make sense of the argument, rather than to evaluate it.  Try to figure out exactly what the author is saying and how s/he says it.  Reserve judgment on the author’s theoretical, methodological, and analytical choices until the second read through.  During your second read through, carefully analyze the appropriateness of these choices, including the logic behind them and the integration of these choices given the research question.  Don’t “dump and run” or “snipe from the bushes” as one of my old UofT profs used to say.
3. During this second reading, check your theoretical and methodological biases at the door!  If you hate political economy, don’t immediately reject a paper for using this framework (indeed, my paper on territorial devolution and my book on treaties both had to deal with reviewers who were extremely hostile mainly on the basis of my chosen theoretical framework, rather than how it was applied or whether alternatives were more appropriate).  Instead, consider how far the author’s framework or methodology takes them in terms of answering the question.  Consider whether there are plausible theoretical alternatives, given the evidence presented. Consider the nature of the evidence presented, given what currently exists out there in literature or elsewhere.  But don’t reject out of hand because you hate constructivism or whatever. Evaluate from within or take a pass on reviewing the paper.  Or, state your biases upfront to the editor and to the author (see Tip #1 above!)
4. Provide a thorough list of suggestions, both major and minor.  Rejections should be accompanied by thoughtful and helpful comments about how to improve the paper for resubmission elsewhere. Accepts should say why the paper should be published.  Frequently editors have to deal with split decisions (e.g. one review says accept; the other says reject) and so giving a strong set of reasons for why the paper should be published could push the editor towards acceptance.  Sometimes, during revise and resubmits, I will actually comment on some of the other reviews if I think the author should not take some suggestions very seriously, which again can help editors make more informed decisions.
5. Provide caveats to your review! I try to preface different sets of comments by saying which ones are really crucial and which ones the authors should consider but do not have to address.  I also try to tell authors that I don’t think they have to address all of my comments, but I think they should address some and tell me why the others do not need to be addressed. As reviewers, we sometimes forget that these aren’t our papers and so we end up trying to co-author them. Instead, I think our role is to provide advice, recognize that authors will disagree with us, and provide space for that give and take, as long as a certain scholarly bar is met.
6. Provide even more caveats to your review! Sometimes I’ll be asked to review something that isn’t quite in my wheelhouse.  Given the frequency with which journal editors complain about reviewer fatigue, I almost always accept reviewer invitations even on papers that I really don’t have any real expertise in. In those situations, I always inform the editors and authors about the nature and extent of my expertise (sometimes none!) and that my comments should be read in that light.  Again, accountability and transparency are important!
7. Be Nice! I remember once writing a really nasty review of a paper that was terrible on all fronts, and really shouldn’t have been sent out for review.  I’m talking grammatical errors, typos, spelling mistakes, referencing errors, and bad scholarship.  The paper got me in a really bad mood and the tone of the review reflected that fact.  The minute after I hit “submit”, I immediately regretted the tone of the review.  Having been on the receiving end of those reviews from time to time, I’ve come to appreciate how important it is to be, well, nice!  There’s nothing wrong with being critical; it’s part of the job.  However, the delivery is just as important as the content.  Indeed, authors are more likely to incorporate constructive criticism and reject nasty slagging.
8. Finally, my biggest pet peeve is how long the review process takes.  There’s no one else to blame but ourselves! I’ve waited anywhere between 6 and 18 months at times for referee reports, which is outrageous.  I think four weeks is a reasonable expectation to find 5 or 6 hours to properly review a journal article.  Six weeks is also reasonable for book manuscripts.  Try to prioritize writing reviews please!  Authors appreciate quick turnaround times, especially because actual publication can take a long time, but so can finding a home for the paper.  So let’s help each other out and let’s all get to that “review pile” today!

Any other tips/observations? Provide them in the comment section!

A Political Theorist Teaching Statistics: Estimation and Inference

Still more about my experiences this term as a political theorist teaching methods.

How to introduce the core ideas of regression analysis: via concrete visual examples of bivariate relationships, culminating in the Gauss Markov theorem and the classical regression model? via a more abstract but philosophically satisfying story about inference and uncertainty, models and distributions? Some combination of each?

I took my lead here from my first teacher of statistics, and I want to describe and praise that approach, which still impresses me as quite beautiful in its way.

I remember with some fondness stumbling through Gary King’s course on the likelihood theory of inference just over twenty years ago. That course, in turn, drew heavily on King’s Unifying Political Methodology, first published in 1989.

I’m too far removed from the methods community to have a sense of how this book is now received. I remember at the time, when I took King’s course, thinking that the discussion of Bayesian inference was philosophically … well, a bit dismissive, whereas nowadays Bayes seems just fine. Revisiting the relevant sections of UPM (especially pp. 28-30) I now think my earlier assessment was unfair.

Still, UPM is easily recognizable as the approach that led Chris Achen to say the following in surveying the state of political methods little more than a decade after King’s book first appeared …

… Even at the most quantitative end of the profession, much contemporary empirical work has little long-term scientific value. “Theoretical models” are too often long lists of independent variables from social psychology, sociology, or just casual empiricism, tossed helter-skelter into canned linear regression packages. Among better empiricists, these “garbage-can regressions” have become a little less common, but they have too frequently been replaced by garbage-can maximum-likelihood estimates (MLEs). …

Given this, it wouldn’t have surprised me if, upon querying methods colleagues, I’d found that UPM remains widely liked, its historical importance for political science acknowledged, but its position in cutting-edge methods syllabi quietly shuffled to the “suggested readings” list.

Is this the case? I doubt it, but even if all that were true, UPM is the book I learned from, and it’s the book I keep taking off the shelf, year after year, to see how certain basic ideas in distribution and estimation theory play out specifically for political questions.

Of course I say that as a theorist: whenever I’ve pondered high (statistical) theory, nothing much has ever been at stake for me personally, as a scholar and teacher. Now, with some pressure to actually do something constructive with my dilettante’s interest in statistics, I wanted to teach with this familiar book ready at hand.

I haven’t been disappointed, and I want to share an illustration of why I think this book should stand the test of time: King’s treatment of the classical regression framework and the Gauss-Markov theorem.

Try googling “the Classical Regression Model” and you’ll get a seemingly endless stream of (typically excellent) lecture notes from all over the world, no small number of which probably owe significant credit to the discussion in William Greene’s ubiquitous econometrics text. High up on the list will almost certainly be Wikipedia’s (actually rather decent) explanation of linear regression. The intuition behind the model is most powerfully conveyed in the bivariate case: here is the relationship, in a single year, between a measure of human capital performance for a sample of countries against their per capita GDP …

Now, let’s look at that again but with logged GDP per capita for each country in the sample (this is taken, by the way, from the most recent Penn World Table) …

The straight line is, of course, universally understood as “the line of best fit,” but that interpretation requires some restrictions, which define the conditions under which calculating that line using a particular algorithm, ordinary least squares (OLS, or simply LS), results in the best linear unbiased predictor, or estimator, of y (thus the acronym BLUE, so common in introductory treatments of the CLRM). OLS minimizes the sum of squared errors, measured vertically, along values of x (rather than, say, perpendicular to the line). Together, those conditions are the Gauss-Markov assumptions, named thus thanks to the Gauss-Markov theorem, which, given those conditions (very roughly: uncorrelated errors with mean zero and constant variance, and those errors uncorrelated with x, or with the columns in the multivariate matrix X; normality of the errors is not required), establishes OLS as the best linear unbiased estimator of coefficients in the equation that describes that ubiquitous illustrative line,

$y_{i} = \beta_{0} + \beta_{1} x_{i} + \epsilon_{i}$

or, in matrix notation for multiple x variables,

$y = X\beta + \epsilon$

… and that’s how generations of statistics and econometrics students first encountered regression analysis: via this powerful visual intuition.
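For readers who would rather see the line-fitting calculation than take it on faith, here is a minimal Python sketch of the bivariate least-squares formulas. The data are invented purely for illustration (they are not the Penn World Table figures), and the function name `ols_line` is hypothetical.

```python
def ols_line(x, y):
    """Return (intercept, slope) minimizing the sum of squared vertical errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data, made up for this example
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_line(x, y)

# Vertical errors (residuals) around the fitted line
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1, sum(residuals))
```

With an intercept in the model, the residuals sum to zero (up to floating-point rounding) and are uncorrelated with x, which is one way to check the arithmetic.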

But as King notes in UPM, the intuition was never entirely satisfying upon more careful reflection. Why the sum of squared errors, rather than, say, the sum of the absolute value of errors? And why measure the respective errors vertically, rather than, again, perpendicular to the line we want to fit?
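The choice between squared and absolute errors is not cosmetic. As a small sketch (with invented data and a crude grid search, both assumptions made for illustration), fitting a single constant to some numbers gives the mean under squared errors and the median under absolute errors:

```python
import statistics

# Hypothetical data with one large outlier
y = [1.0, 2.0, 3.0, 4.0, 20.0]

def sum_squared_errors(c):
    return sum((yi - c) ** 2 for yi in y)

def sum_absolute_errors(c):
    return sum(abs(yi - c) for yi in y)

# Crude grid search over candidate constants 0.00, 0.01, ..., 20.00
grid = [i / 100 for i in range(0, 2001)]
best_squared = min(grid, key=sum_squared_errors)
best_absolute = min(grid, key=sum_absolute_errors)

print(best_squared, statistics.mean(y))     # squared errors pick the mean
print(best_absolute, statistics.median(y))  # absolute errors pick the median
```

This does not answer King's question of why we should prefer one criterion; it only shows that the two criteria can give quite different answers for the same data.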

UPM is, so far as I know, unique (or at the very least, extraordinarily rare) in beginning not with these visual intuitions, but instead with a story about inference: how do we infer things about the world given uncertainty? How can we be clear about uncertainty itself? This is, after all, the point of an account of probability: to be precise about uncertainty, and the whole point of UPM was (is) to introduce statistical methods most useful for political science via a particular approach to inference.

So, instead of beginning with the usual story about convenient bivariate relationships and lines of best fit, UPM starts with the fundamental problem of statistical inference: we have evidence generated by mechanisms and processes in the world. We want to know how confident we should be in our model of those mechanisms and processes, given the evidence we have.

More precisely, we want to estimate some parameter $\theta$, taking much of the world as given. That is, we’d like to know how confident we can be in our model of that parameter $\theta$, given the evidence we have. So what we want to know is $p( \theta | y)$, but what we actually have is knowledge of the world given some parameter $\theta$, that is, $p( y | \theta )$.

Bayes’s Theorem famously gives us the relationship between a conditional probability and its inverse:

$p(\theta | y) = \dfrac{p(\theta) p(y | \theta)}{p(y)}$

We could contrive to render $p(y)$ as a function of $p(\theta)$ and $p(y | \theta)$ by integrating $p(\theta,y)$ over the whole parameter space $\Theta$, $p(y) = \int_\Theta p(\theta) p(y| \theta) d\theta$, but this still leaves us with the question of how to interpret $p(\theta)$.
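A toy version of this calculation may help. The sketch below assumes a parameter space with only three candidate values, so the integral over $\Theta$ becomes a sum; the coin-flip data, the flat prior, and all names are hypothetical choices made for illustration.

```python
from math import comb

# Hypothetical discrete parameter space: three candidate coin biases
thetas = [0.2, 0.5, 0.8]
prior = {theta: 1.0 / 3.0 for theta in thetas}  # assumed flat prior p(theta)

heads, flips = 7, 10  # invented evidence y

def p_y_given_theta(theta):
    """Binomial probability of the evidence given theta: p(y | theta)."""
    return comb(flips, heads) * theta ** heads * (1.0 - theta) ** (flips - heads)

# p(y): sum of p(theta) * p(y | theta) over the whole (discrete) parameter space
p_y = sum(prior[theta] * p_y_given_theta(theta) for theta in thetas)

# Bayes's Theorem: p(theta | y) = p(theta) * p(y | theta) / p(y)
posterior = {theta: prior[theta] * p_y_given_theta(theta) / p_y for theta in thetas}

print(posterior)
```

Changing the `prior` dictionary changes the posterior, which is exactly the interpretive question about $p(\theta)$ raised above.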

These days that interpretive task hardly seems much of a philosophical or practical hurdle, but Fisher’s famous approach to likelihood is still appealing. Instead of arguing about (variously informative) priors, we could proceed instead from an intuitive implication of Bayes’s result: that $p(\theta |y)$ might be represented as some function of our evidence and our background understanding (such as a theoretically plausible model) of the parameter of interest. What if we took much of that background understanding as an unknown function of the evidence that is constant across rival models of the parameter $\theta$?

Following King’s convention in UPM, let’s call these varied hypothetical models $\tilde{\theta}$, and then define a likelihood function as follows:

$L(\tilde{\theta}|y) = g(y) p(y|\tilde{\theta})$

This gives us an appealing way to think about relative likelihoods associated with rival models of the parameter we’re interested in, given the same data …

$\dfrac{L(\tilde{\theta_{i}}|y)}{L(\tilde{\theta_{j}}|y)} = \dfrac{g(y) p(y|\tilde{\theta_{i}})}{g(y) p(y|\tilde{\theta_{j}})}$

$g(y)$ cancels out here, but that is more than a mere computational convenience: our estimate of the parameter $\theta$ is relative to the data in question, where many features of the world are taken as ceteris paribus for our purposes. These features are represented by that constant function (g) of the data (y). We can drop $g(y)$ when considering the ratio

$\dfrac{p(y|\tilde{\theta_{i}})}{p(y|\tilde{\theta_{j}})}$

because our use of that ratio, to evaluate our parameter estimates, is always relative to the data at hand.
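To see the cancellation numerically, here is a small sketch. It assumes a simple yes/no outcome model purely for illustration (the argument above does not depend on that choice), and the data, the rival parameter values, and the values standing in for $g(y)$ are all invented.

```python
import math

def p_y_given_theta(y, theta):
    """Probability of a sequence of 0/1 outcomes when each 1 has probability theta."""
    return math.prod(theta if yi == 1 else 1.0 - theta for yi in y)

def likelihood(theta, y, g_y):
    """L(theta | y) = g(y) * p(y | theta), with g(y) an arbitrary positive constant."""
    return g_y * p_y_given_theta(y, theta)

y = [1, 1, 0, 1, 0, 1, 1]    # hypothetical data
theta_i, theta_j = 0.7, 0.4  # two rival hypothetical parameter values

# The same ratio, computed with two very different stand-ins for g(y)
ratio_small_g = likelihood(theta_i, y, g_y=0.01) / likelihood(theta_j, y, g_y=0.01)
ratio_large_g = likelihood(theta_i, y, g_y=250.0) / likelihood(theta_j, y, g_y=250.0)

# ... and with g(y) dropped altogether
ratio_no_g = p_y_given_theta(y, theta_i) / p_y_given_theta(y, theta_j)

print(ratio_small_g, ratio_large_g, ratio_no_g)
```

All three ratios agree (up to rounding), which is why the unknown $g(y)$ never needs to be pinned down when comparing rival values against the same data.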

With this in mind, think about a variable like height or temperature. Or, say, the diameter of a steel ring. More relevant to the kinds of questions many social researchers grapple with: imagine a survey question on reported happiness using a thermometer scale (“If 0 is very unhappy and 10 is very happy indeed, how happy are you right now?”). We can appeal to the Central Limit Theorem to justify a working assumption that

$y_{i} \sim f_{stn} (y_{i} | \mu_{i}) = \dfrac{e^{-\frac{1}{2}(y_{i}-\mu_{i})^{2}}}{\sqrt{2\pi}}$

which is just to say that our variable is distributed as a special case of the Gaussian normal distribution, but with $\sigma^{2}=1$.

By now you may already be seeing where King is going with this illustration. The use of a normally distributed random variable to illustrate the concept of likelihood is just that: an illustrative simplification. We could have developed the concept with any of a number of possible distributions.

Now for a further illustrative simplification: suppose (implausibly) that the central tendency associated with our random variable is constant. Suppose, for instance, that everyone in our data actually felt the same level of subjective happiness on the thermometer scale we gave them, but there was some variation in the specific number they assigned to the same subjective mental state. So, the reported numbers cluster within a range.

I say this is an implausible assumption for the example at hand, and it is, but think about this in light of the exercise I mentioned above (and posted about earlier): there really is a (relatively) fixed diameter for a steel ring we’re tasked to measure, but we should expect measurement error, and that error will likely differ depending on the method we use to do the measuring.

We can formalize this idea as follows: we are assuming $E(Y_{i})=\mu_{i}$ for each observation i. Further suppose that $Y_{i}, Y_{j}$ are independent for all $i \not= j$. So, let’s take the constant mean to be the parameter we want to estimate, and we’ll use some familiar notation for this, replacing $\theta$ with $\beta$, so that $\mu_{i} = \beta$ for every observation i.

Given what we’ve assumed so far (constant mean $\mu = \beta$, independent observations), what would the probability distribution look like? Since $p(e_{i}e_{j}) = p(e_{i})p(e_{j})$ for independent events $e_{i}, e_{j}$, the full distribution over all of those events is given by

$\prod_{i}^{n} \dfrac{e^{-\frac{1}{2}(y_{i}-\beta)^{2}}}{\sqrt{2\pi}}$

Let’s use this expression to define a likelihood function for $\beta$:

$L(\tilde{\beta}|y) = g(y) \prod_{i}^{n} f_{stn}(y_{i}|\tilde{\beta})$

Now, the idea here is to estimate $\beta$ and we’re doing that by supposing that a lot of background information cannot be known, but can be taken as roughly constant with respect to the part of the world we are examining to estimate that parameter. Thus we’ll ignore $g(y)$, which represents that unknown background that is constant across rival hypothetical values of $\beta$. Then we’ll define the likelihood of $\beta$ given our data, y, with the expression $\prod_{i}^{n} f_{stn}(y_{i}|\tilde{\beta})$ and substitute in the full specification of the standardized normal distribution for $\mu_{i} = \beta$,

$L(\tilde{\beta}|y) = \prod_{i}^{n} \dfrac{e^{-\frac{1}{2}(y_{i}-\tilde{\beta})^{2}}}{\sqrt{2\pi}}$

Remember that we’re less interested here in the specific functional form of L(.) than in relative likelihoods, so any transformation of the likelihood function that preserves the property of interest to us, the ordering of relative likelihoods across parameter estimates $\tilde{\beta}$, does no harm to our use of L(.). Suppose, then, that we took the natural logarithm of $L(\tilde{\beta}|y)$? We know that $ln(ab) = ln(a) + ln(b)$ and, for some constant $\alpha$, $ln(\alpha ab) = ln(\alpha) + ln(a) + ln(b)$. So, because we’re taking $g(y)$ as constant, the natural logarithm of our likelihood function is

$ln L(\tilde{\beta}|y) = ln\, g(y) + \sum_{i}^{n} ln(\dfrac{e^{-\frac{1}{2}(y_{i}-\tilde{\beta})^{2}}}{\sqrt{2\pi}})$

$= ln\, g(y) + \sum_{i}^{n} ln(\dfrac{1}{\sqrt{2\pi}}) - \dfrac{1}{2}\sum_{i}^{n}(y_{i}-\tilde{\beta})^{2}$

$= ln\, g(y) - \dfrac{n}{2}ln(2\pi) - \dfrac{1}{2}\sum_{i}^{n}(y_{i}-\tilde{\beta})^{2}$

Notice that $ln\, g(y) - \frac{n}{2}ln(2\pi)$ doesn’t include $\tilde{\beta}$. Think of this whole expression, then, as a constant term that may shift the relative position of the likelihood function, but that doesn’t affect its shape, which is what we really care about. That shape of the log-likelihood function is given by

$ln L(\tilde{\beta}|y) = -\dfrac{1}{2} \sum_{i}^{n} (y_{i} - \tilde{\beta})^{2}$
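As a rough numerical check on this derivation, the sketch below evaluates both the full log of the product of standardized normal densities and the shorter expression above, using invented happiness scores and a simple grid of candidate values (all assumptions made for illustration). Under the constant-mean setup, the maximizing value should be the sample mean.

```python
import math

# Hypothetical thermometer-scale happiness reports
y = [6.0, 7.0, 5.0, 8.0, 6.5]

def f_stn(y_i, mu):
    """Standardized normal density with mean mu and variance 1."""
    return math.exp(-0.5 * (y_i - mu) ** 2) / math.sqrt(2.0 * math.pi)

def full_log_lik(beta):
    """Log of the product of densities, ignoring g(y)."""
    return sum(math.log(f_stn(y_i, beta)) for y_i in y)

def kernel_log_lik(beta):
    """The part of the log-likelihood that actually depends on beta."""
    return -0.5 * sum((y_i - beta) ** 2 for y_i in y)

# Candidate values 0.00, 0.01, ..., 10.00
grid = [i / 100 for i in range(0, 1001)]
beta_hat = max(grid, key=kernel_log_lik)

sample_mean = sum(y) / len(y)
constant = -len(y) / 2.0 * math.log(2.0 * math.pi)

print(beta_hat, sample_mean)
print(full_log_lik(beta_hat) - kernel_log_lik(beta_hat), constant)
```

The two functions differ by the same constant, $-\frac{n}{2}ln(2\pi)$, at every candidate value, so they share a maximizer: the constant shifts the curve up or down without changing its shape.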

Now, there are still several steps left to get to the classical regression model (most obviously, weakening the assumption of constant mean and instead setting $\mu_{i}=x_{i}\beta$) but this probably suffices to make the general point: using analytic or numeric techniques (or both), we can estimate parameters of interest in our statistical model by maximizing the likelihood function (thus MLE: maximum likelihood estimation), and that function itself can be defined in ways that reflect the distributional properties of our variables.

This is the sense in which likelihood is a theory of inference: it lets us infer not only the most plausible values of parameters in our model given evidence about the world, but also measures of uncertainty associated with those estimates.

While vitally important, however, this is not really the point of my post.

Look at the tail end of the right-hand side of this last equation above. The expression there ought to be familiar: it looks suspiciously like the sum of squared residuals from the classical regression model!

So, rather than simply appealing to the pleasing visual intuitions of line-fitting; or alternatively, appealing to the Gauss-Markov theorem as the justification for least squares (LS), by virtue of yielding the best linear unbiased estimator of parameters $\beta$ (but why insist on linearity? or unbiasedness for that matter?), the likelihood approach provides a deeper justification, showing the conditions under which LS is the maximum likelihood estimator of our model parameters.
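The connection can be checked numerically in the bivariate case. This is a sketch under stated assumptions: normally distributed errors with unit variance, a mean of the form $\mu_{i} = \beta_{0} + \beta_{1}x_{i}$, invented data, and hypothetical function names. It computes the least-squares coefficients from the usual formulas and then confirms that the log-likelihood is flat at that point (zero derivatives) and lower at nearby values.

```python
# Hypothetical data, made up for this example
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def log_lik(b0, b1):
    """Log-likelihood kernel under unit-variance normal errors."""
    return -0.5 * sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

def score(b0, b1):
    """Partial derivatives of log_lik with respect to b0 and b1."""
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return sum(resid), sum(r * xi for r, xi in zip(resid, x))

def ols(x, y):
    """Closed-form least-squares intercept and slope."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

b0_hat, b1_hat = ols(x, y)
score_at_ols = score(b0_hat, b1_hat)

# Compare the log-likelihood at the LS estimates with eight nearby points
steps = (-0.1, 0.0, 0.1)
nearby = [(b0_hat + d0, b1_hat + d1)
          for d0 in steps for d1 in steps if (d0, d1) != (0.0, 0.0)]
ols_is_best_nearby = all(log_lik(b0_hat, b1_hat) > log_lik(b0, b1)
                         for b0, b1 in nearby)

print(score_at_ols, ols_is_best_nearby)
```

If the errors were assumed to follow some other distribution, `log_lik` would change and its maximizer would generally no longer coincide with least squares.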

This strikes me as a quite beautiful point, and it frames King’s entire pedagogical enterprise in UPM.

Again, there’s more to the demonstration in UPM, but in our seminar at Laurier this sufficed (I hope), not to convince my (math-cautious-to-outright-phobic) students that they need to derive their own estimators if they want to do this stuff. What I hope they took away is a sense of how the tools we use in the social sciences have deep, even elegant, justifications beyond pretty pictures and venerable theorems.

Furthermore, and perhaps most importantly, understanding at least the broad brush-strokes of those justifications helps us understand the assumptions we have to satisfy if we want those tools to do what we ask of them.

A Political Theorist Teaching Statistics: Measurement

Another post about my experiences this term as a political theorist teaching methods.

That gloss invites a question, I suppose. I guess I’m a political theorist, whatever that means. A lot of my work has been on problems of justice and legitimacy, often with an eye to how those concerns play out in and around cities, but also at grander spatial orders.

Still, I’ve always been fascinated with mathematics (even if I’m not especially good at it) and so I’ve kept my nose pressed against the glass whenever I can, watching developments in mathematical approaches to the social and behavioural sciences, especially the relationships between formal models and empirical tests.

I was lucky enough in graduate school to spend a month hanging out with some very cool people working on agent-based modeling (although I’ve never really done much of that myself). This year, I was given a chance to put these interests into practice and teach our MA seminar in applied statistical methods.

I began the seminar with a simple exercise from my distant past. My first undergraduate physics lab at the University of Toronto had asked us to measure the diameter of a steel ring. That was it: measure a ring. There wasn’t much by way of explanation in the lab manual, and I was far from a model student. I think I went to the pub instead.

I didn’t stay in physics, and eventually I wound up studying philosophy and politics. It was only a few years ago that I finally saw the simple beauty of that lab assignment as a lesson in measurement. In that spirit, I gave my students a length of string, a measuring tape, and three steel hoops. Their task: detail three methods for finding the diameter of each hoop, and demonstrate that the methods converge on the same answer for each hoop.

I had visions of elegant tables of measurements, and averages taken over them. Strictly speaking, that vision didn’t materialize, but I was impressed that everyone quickly understood the intuitions at play here, and they did arrive at the three approaches I had in mind:

1. First, use the string and take the rough circumference several times, find the average, then divide that figure by $\pi$.
2. Second, use a pivot point to suspend both the hoop and a weighted length of string, then mark the opposing points and measure.
3. Third, simply take a bunch of measurements around what is roughly the diameter.
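The arithmetic behind the first and third approaches can be sketched in a few lines of Python. The measurements below are invented for illustration (they are not the students' actual numbers), and the variable names are hypothetical.

```python
import math
import statistics

# Hypothetical repeated measurements of one hoop, in centimetres
string_circumferences_cm = [31.3, 31.6, 31.4, 31.5, 31.2]  # method 1: string
direct_diameters_cm = [10.1, 9.9, 10.0, 10.0, 9.9]         # method 3: tape across

# Method 1: average the circumference, then divide by pi
diameter_from_string = statistics.mean(string_circumferences_cm) / math.pi

# Method 3: average the direct measurements
diameter_direct = statistics.mean(direct_diameters_cm)

# Spread of each set of diameter estimates, as a rough guide to measurement error
spread_string = statistics.stdev(c / math.pi for c in string_circumferences_cm)
spread_direct = statistics.stdev(direct_diameters_cm)

print(diameter_from_string, spread_string)
print(diameter_direct, spread_direct)
```

Reporting the spread alongside each average is one way to make the method, and not just the final number, part of the record.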

The lesson that took a while to impart here was that I didn’t really care about the exact diameters, and was far more concerned that they attend to the details of the methods used for measurement, and that they explicitly report these details.

In the laboratory sciences measurement protocol is so vitally important. We perhaps don’t emphasize the simple point enough in the social sciences, but we should: it matters how you measure things, and what you use to make the measurements!

A Political Theorist Teaching Statistics: Stata? R?

What is a political theorist doing teaching a seminar in social science statistics? A reasonable question to ask my colleagues, but they gave me the wheel, so I drove off!

Later I’ll post some reflections on my experiences this term. For now, I want to weigh in briefly with some very preliminary thoughts on software and programming for statistics instruction at the graduate level, but in an MA programme that doesn’t expect a lot by way of mathematical background from our students.

In stats-heavy graduate departments R seems to be all the rage. In undergraduate methods sequences elsewhere (including here at Laurier) SPSS is still hanging on. I opted for Stata this term, mostly out of familiarity and lingering brand loyalty. If they ever let me at this seminar again, I may well go the R route.

This semester has reassured me that Stata remains a very solid statistical analysis package: it isn’t outrageously expensive, it has good quality control, and they encourage a stable and diverse community of users, all of which are vital to keeping a piece of software alive. Furthermore, the programmers have managed to balance ease of use (for casual and beginning users) with flexibility and power (for more experienced users with more complicated tasks).

All that said, I was deeply disappointed with the “student” version of Stata, which really is far more limited than I’d hoped. Not that they trick you: you can read right up front what those limits are, but reading them online is a whole lot different than running up against them full steam in the middle of a class demonstration, when you’re chugging along fine until you realize your students cannot even load the data set (that you thought you’d pared down sufficiently to fit in that modest version of Stata!).

R, in contrast, is not a software package, but a programming environment. At the heart of that environment is an interpreted language (which means you can enter instructions off a command line and get a result, rather than compiling a program and then running the resulting binary file).

R was meant to be a dialect of the programming language S and an open source alternative to S+, a commercial implementation of S. R is not built in quite the same way as S+, however. R’s designers started with a language called Scheme, which is a dialect of the venerable (and beautiful) language LISP.

My sense is that more than a few people truly despise programming in R. They insist that the language is hopelessly clumsy and desperately flawed, but they often keep working in the R environment because enough of their colleagues (or clients, or coworkers) use it. Often these critics will grudgingly concede that, in addition to the demands of their profession or client base, R is still worth the trouble, in spite of the language.

These critics certainly make a good case. That said, I suspect these people cut their programming teeth on languages like C++ and that, ultimately, while their complaints are presented as practical failings of R, they are in fact deeper philosophical and aesthetic differences. (… but LISP is elegant!)

I remain largely agnostic on these aesthetic questions. A language simply is what it is, and if it — and as importantly, the community of users — doesn’t let you do what you want, the way you want, then you find another language.

If you’ve ever programmed before, then R doesn’t seem so daunting, and increasingly there are good graphical user interfaces to make the process of working with R more intuitive for non-programmers. Still, fundamentally the philosophy of R is “build it yourself” … or, more often, “hack together a script to do something based on code someone else has built themselves.”

This latter tendency is true of Stata also, of course, but when you use someone else’s package in Stata, you can be reasonably confident that it’s been checked and re-checked before being released as part of the official Stata environment. That is less-often the case with R (although things are steadily improving).

Indeed, there have been, not too long ago, some significant quality-control issues with R packages, and it always leaves the lingering worry in the back of your mind as to whether the code you’ve invoked with a command (“lm” say, for “linear model”) is actually doing what it claims to do.

Advocates of R rejoin that this is not a bug, but a feature: that lingering worry ought to inspire you to learn enough to check the code yourself!

They have a point.

Peer Review and Social Psychology: Or Why Introductions are so Important!

Inspired by my colleagues Loren King and Anna Esselment, both of whom regularly make time in their busy schedules to read (I know! A crazy concept!), I’ve started to read a new book that Chris Cochrane recommended: Jonathan Haidt’s The Righteous Mind: Why Good People Are Divided By Politics and Religion.

I’m only in the first third of the book, but one of the main arguments so far is that when humans make moral (and presumably other) judgements, we tend to use our intuitions first, and our reasoning second. That is to say, frequently we have gut feelings about all sorts of things and rather than reasoning out whether our feelings are correct, we instead search for logic, examples, or arguments to support those gut feelings. Haidt effectively illustrates this argument by drawing upon a broad set of published research and experiments he has done over the years.

At the end of chapter 2, he writes:

“I have tried to use intuitionism while writing this book. My goal is to change the way a diverse group of readers … think about morality, politics, religion, and each other …. I couldn’t just lay out the theory in chapter 1 and then ask readers to reserve judgement until I had presented all of the supporting evidence. Rather, I decided to weave together the history of moral psychology and my own personal story to create a sense of movement from rationalism to intuitionism. I threw in historical anecdotes, quotations from the ancients, and praise of a few visionaries. I set up metaphors (such as the rider and the elephant) that will recur throughout the book. I did these things in order to “tune up” your intuitions about moral psychology. If I have failed and you have a visceral dislike of intuitionism or of me, then no amount of evidence I could present will convince you that intuitionism is correct. But if you now feel an intuitive sense that intuitionism might be true, then let’s keep going.”

I found these first few chapters, and this paragraph in particular, to be extremely powerful and relevant to academic publishing (and other things!). If humans tend to behave in this manner (i.e. we frequently rely on gut feelings to make moral judgements and then try to find reasons to support those feelings), then the introduction of a journal article is CRUCIAL, both for peer review and afterwards. On the issue of peer review, I can’t tell you how many times I’ve received a referee report that was extremely negative, yet failed to: a) clearly show that the reviewer understood my argument; and b) demonstrate logically why my argument was wrong. I always blamed myself for not being clear enough, which is probably half true! But the real story is that sometimes my introductions were probably ineffective at connecting with people’s intuitions, and so these reviewers found reasons to reject the paper.

The lesson here, I think, is that introductions matter! You can’t ask or expect readers to withhold judgement while you present the theory and evidence first. Instead, you have to find a way to tap immediately into their intuitions to make them open to considering the merits of your argument.

Other Experiences Using the Flipped Classroom

I continue to be a fan of the flipped classroom. Others have started to blog about their own experiences. Check them out:

1) Phil Arena, an IR scholar, is using it in his classes. Check out some of his reflections here.

2) Here’s economist John Cochrane’s take:

“A lot of mooc is, in fact, a modern textbook — because the twitter generation does not read. Forcing my campus students to watch the lecture videos and answer some simple quiz questions, covering the basic expository material, before coming to class — all checked and graded electronically — worked wonders to produce well prepared students and a brilliant level of discussion. Several students commented that the video lectures were better than the real thing, because they could stop and rewind as necessary. The “flipped classroom” model works.

The “flipped classroom” component will, I think, will be a very important use for online tools in business education especially. Our classes are infrequent — once a week for three hours at best. Our international program meets twice for 5 days in a row, three hours per day. We also offer intensive 1-3 day custom programs. An experience consisting of a strong online component, in which students work a little bit each day over a long period, mixed with online connection through forums, videos, chats, etc., all as background for classroom interaction — which also cements relationships formed online — can be a big improvement on what we do now….

But a warning to faculty: Teaching the flipped classroom is a lot harder! The old model, we pretend to teach, you pretend to learn, filling the board with equations or droning on for an hour and a half, is really easy compared to guiding a good discussion or working on some problems together.”

I think Cochrane is wrong that the old model is as bad as he paints it here. I’ve blogged previously about how some instructors are simply fantastic lecturers and can regularly find that magic sweet spot for learning. But for people like me, I need my flipped classroom crutch to help me be an effective teacher!

More on Knowledge Mobilization: What About Individual Citizens?

There’s been some recent discussion on this blog about the importance of knowledge mobilization. Dr. Erin Tolley, for instance, provided some excellent advice several days ago based on her own experiences in government and academia. But recently I’ve been wondering: what can we academics do to better share our research findings with regular citizens?

My usual strategy has been to write op eds in regional or national newspapers. I have no idea whether this is an effective strategy. I have a hunch that op eds rarely persuade but instead simply reinforce people’s existing opinions on the issue. (One day I’d like to run an experimental study to test this proposition. I just need to convince my colleague, Jason Roy, to do it!) Sometimes, I receive emails from interested citizens or former politicians. In one op ed, published in the Toronto Star, I briefly mentioned the Kelowna Accord, dismissing it as a failure. The day after the op ed was published, former Prime Minister Paul Martin called me up in my office to tell me why my analysis of the Accord was wrong. That was quite the experience!

But on the issue of communicating research results to interested citizens, I wonder if there is more that I can do. At least once every six months, I receive an email from a random First Nation citizen asking for advice. Usually, the questions they send me focus on the rights of individual band members against the actions of the band council. One email I received, for instance, asked about potential legal avenues that were available to members for holding band council members accountable, because somehow they saw my paper in Canadian Public Administration on accountability and transparency regimes. Just last week, I received an email from a band member who was fielding questions from fellow band members about the rights of CP holders (i.e. holders of certificates of possession) against a band council that wanted to expropriate their lands for economic development. Apparently, this individual tried to look up my work online but all of the articles I’ve written on this topic are gated (with the exception of one).

So, what to do?

Well, one easy and obvious solution is to purchase open access rights for these articles, which is something SSHRC is moving towards anyway. That way, anyone can download and read the articles.

But what else can we do? Taking a page from Dr. Tolley’s post, maybe I need to start writing one page summaries of my findings in plain language and post them on my website?

Another thing I want to try is to put together some short animated videos that explain my findings. This is what I hope to do with my SSHRC project on First Nation-municipal relations, if Jen and I can ever get this project finished!

Any other ideas? Suggestions welcome!

Should We Change the Grant Adjudication Process? Part 2!

Previously, I blogged about the need to reconsider how we adjudicate research grant competitions.

Others agree:

Researchers propose alternative way to allocate science funding

HEIDELBERG, 8 January 2014 – Researchers in the United States have suggested an alternative way to allocate science funding. The method, which is described in EMBO reports, depends on a collective distribution of funding by the scientific community, requires only a fraction of the costs associated with the traditional peer review of grant proposals and, according to the authors, may yield comparable or even better results.

“Peer review of scientific proposals and grants has served science very well for decades. However, there is a strong sense in the scientific community that things could be improved,” said Johan Bollen, professor and lead author of the study from the School of Informatics and Computing at Indiana University. “Our most productive researchers invest an increasing amount of time, energy, and effort into writing and reviewing research proposals, most of which do not get funded. That time could be spent performing the proposed research in the first place.” He added: “Our proposal does not just save time and money but also encourages innovation.”

The new approach is possible due to recent advances in mathematics and computer technologies. The system involves giving all scientists an annual, unconditional fixed amount of funding to conduct their research. All funded scientists are, however, obliged to donate a fixed percentage of all of the funding that they previously received to other researchers. As a result, the funding circulates through the community, converging on researchers that are expected to make the best use of it. “Our alternative funding system is inspired by the mathematical models used to search the internet for relevant information,” said Bollen. “The decentralized funding model uses the wisdom of the entire scientific community to determine a fair distribution of funding.”

The authors believe that this system can lead to sophisticated behavior at a global level. It would certainly liberate researchers from the time-consuming process of submitting and reviewing project proposals, but could also reduce the uncertainty associated with funding cycles, give researchers much greater flexibility, and allow the community to fund risky but high-reward projects that existing funding systems may overlook.

“You could think of it as a Google-inspired crowd-funding system that encourages all researchers to make autonomous, individual funding decisions towards people, not projects or proposals,” said Bollen. “All you need is a centralized web site where researchers could log-in, enter the names of the scientists they chose to donate to, and specify how much they each should receive.”

The authors emphasize that the system would require oversight to prevent misuse, such as conflicts of interests and collusion. Funding agencies may need to confidentially monitor the flow of funding and may even play a role in directing it. For example they can provide incentives to donate to specific large-scale research challenges that are deemed priorities but which the scientific community can overlook.

“The savings of financial and human resources could be used to identify new targets of funding, to support the translation of scientific results into products and jobs, and to help communicate advances in science and technology,” added Bollen. “This funding system may even have the side-effect of changing publication practices for the better: researchers will want to clearly communicate their vision and research goals to as wide an audience as possible.”

Awards from the National Science Foundation, the Andrew W. Mellon Foundation and the National Institutes of Health supported the work.

From funding agencies to scientific agency: Collective allocation of science funding as an alternative to peer review

Johan Bollen, David Crandall, Damion Junk, Ying Ding, and Katy Börner

doi: 10.1002/embr.201338068

Some might argue that the system would be hijacked by logrolling and network effects. But if the system had strong accountability and transparency measures, which involved researchers disclosing expenditures, outcomes, and which researchers they financially supported, I think some of the possible negative effects would disappear.

It’s a neat idea and someone needs to try it. It could be SSHRC creating a special research fund that worked in this way: everyone who applied would receive money (equally divided among the applicants) and would have to donate a fixed portion of their money to others. Or maybe a university could try it with some internal funding.
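To get a feel for how the money would circulate, here is a minimal Python sketch of a stripped-down version of the scheme. It is my own toy illustration, not the authors’ model: the function name `simulate_funding`, the `preferences` table, and the dollar figures are all assumptions, and it ignores oversight, conflicts of interest, and changing preferences over time.

```python
def simulate_funding(base, donate_fraction, preferences, years):
    """Toy simulation of a collective funding scheme (illustrative only).

    base: unconditional amount every researcher receives each year
    donate_fraction: share of last year's funding that must be passed on
    preferences: preferences[i][j] is the share of researcher i's donation
                 that goes to researcher j (each row sums to 1, and nobody
                 donates to themselves)
    Returns the total funding each researcher receives in the final year.
    """
    n = len(preferences)
    received = [base] * n  # starting point: everyone has only the base amount
    for _ in range(years):
        donations_in = [0.0] * n
        for i in range(n):
            pot = donate_fraction * received[i]  # what i must give away
            for j in range(n):
                donations_in[j] += pot * preferences[i][j]
        # Next year's funding: the base amount plus whatever colleagues donated.
        received = [base + donations_in[j] for j in range(n)]
    return received


if __name__ == "__main__":
    # Hypothetical three-person community: A and C both back B,
    # while B splits donations evenly between A and C.
    prefs = [
        [0.0, 1.0, 0.0],  # A
        [0.5, 0.0, 0.5],  # B
        [0.0, 1.0, 0.0],  # C
    ]
    for year in (1, 2, 10):
        print(year, simulate_funding(100_000.0, 0.5, prefs, year))
```

In this made-up example, after one round B receives $200,000 and A and C receive $125,000 each, and the gap settles down rather than growing forever. That is the sense in which funding "converges" on the researchers their peers favour.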

Thoughts?

Hat tip to Marginal Revolution for this story.

Should we Apply Gender and Race Analysis to Everything?

According to some, the answer is yes.

At first, I thought it was written “tongue-in-cheek”, but after reading it a second time, I’m not so sure!

There’s no question that gender and race analyses have an important role to play in political science and we should be taking these perspectives more seriously than is usually the case.  But aren’t some domains more important, pervasive, and influential than others?  Angry Birds? Meh. Movember? Maybe.

Do University Ethics Review Boards Work?

“Objective

To test for any long-term effects on the death rates of domestic assault suspects due to arresting them versus warning them at the scene.

Methods

The Milwaukee Domestic Violence Experiment (MilDVE) employed a randomized experimental design with over 98 % treatment as assigned. In 1987–88, 1,200 cases with 1,128 suspects were randomly assigned to arrest or a warning in a 2:1 ratio. Arrested suspects were generally handcuffed and taken to a police station for about 3 to 12 h. Warned suspects were left at liberty at the scene after police read aloud a scripted statement. Death records were obtained in 2012–13 from the Wisconsin Office of Vital Statistics and the Social Security Death Index, with the support of the Milwaukee Police Department.” [emphasis added]

Can you say lawsuit?!

Hat tip to Kids Prefer Cheese.

Can Electronic Voting Solve Declining Voter Turnout in Canada?

The answer is no, according to a new report from Elections BC and its Chief Electoral Officer, Keith Archer.

I have to admit that I haven’t really kept up with the literature on this topic but I’m not surprised by this finding.  The last thing I read on this topic was Henry Milner’s IRPP paper, which placed the blame for declining turnout on “political dropouts”, which he defined as young Canadians (30 years of age and younger) who lacked knowledge about politics and therefore chose not to participate in it.  Knowledge matters for these individuals because without it, they cannot link their policy preferences to a particular political party, even when appropriate political parties and platforms exist.

If knowledge is crucial to turnout, then it’s no surprise that e-voting would have no effect.

Yet whenever I poll my students in my second year intro to Canadian politics class, they always choose electronic voting as the solution to declining turnout among youth.  This is before I present Milner’s study.

There is a whole whack of new literature on the topic (see here and here) but I haven’t gotten around to reading it yet.  Hopefully soon!

Flipped Classroom: A Lesson Plan for 2nd Year Intro to Canadian Politics

So this year, I had planned to flip my course, PO 263.  Unfortunately, by the time September rolled around, I couldn’t make it happen so instead I went with incorporating the “learning catalytics” software into each class. So far, it’s been really great!

Still, I wanted to try out the flipped model in the large class format and so I’ve decided to flip my Nov. 25 class on the Canadian judiciary.  Here’s what I plan to do:

1) Pre-Class Activities
Students will:

a) Read a chapter from Malcolmson and Myers that describes the nuts and bolts of the judiciary;

b) Watch three short videos I’ve recorded using the whiteboard app, Explain Everything, on theories of judicial decision making, judicial activism, and dialogue theory and coordinate interpretation;

c) Complete an online quiz that asks them to apply their learning from the reading and videos to a variety of situations (to test for comprehension).

d) Sign up for a time slot. My classes run on Mondays from 7-10pm so half the class will attend the first time slot (7pm to 8:20pm) and the other half will attend the second time slot (8:30pm to 9:50pm).

2) In-Class Activities

Each time slot will begin, if necessary, with a very short lecture (5-10 minutes maximum) clarifying anything that the students seemed to have trouble with on the quiz.

Then, the students will divide into small groups (of approximately 6-8) to discuss three hypothetical court cases. They will assess which theory of judicial decision making applies, how this case relates to judicial activism, and how dialogue and coordinate interpretation (e.g. textual retort and minority retort) might occur. We will take up these answers as a group and discuss each group’s reasoning.

In the second half of each time slot, the groups will discuss two debate questions about the nature of judicial activism and whether Parliament should respond (and if so, in what way). After the small and large group discussions, each group will post their answer on the MYLS discussion board and students will vote on which answer is the best (e.g. students from time slot 1 will vote on the answers given in time slot 2).

That’s the plan! We shall see how it goes in six weeks but I’m optimistic that it should work out well.

Should We Change the Grant Adjudication Process?

John Sides, commenting at the Monkey Cage on recent criticisms that the U.S. government (through the NSF) funds far too many “silly projects,” writes:

“it’s very hard to determine the value of any research ahead of time.  It’s hard because any one research project is narrow.  It’s hard because you can’t anticipate how one project might inform later ones.  It’s hard because some funding goes to create public goods—like large datasets—that many others will use, and those myriad projects also cannot be anticipated.  It’s hard because some research won’t work, and we can’t know that ahead of time.”

I think John is completely right. Committees are faced with extreme uncertainty about the future value of the various proposed research projects and so the grant adjudication process is a bit of a crapshoot, all else being equal. (It also explains why one of my SSHRC grants got funded on the third try, even though I made extremely minor revisions!)

Because it is difficult to figure out which proposed research will actually have value in the future, most committees and/or competition criteria put great emphasis on research record. And so, if you have a really strong record, you are likely to get funded.

But I’m not sure that’s the best model for adjudicating grant applications. If we take seriously the notion that predicting the future value of research is very hard, then there are at least two reasonable options for changing how we adjudicate grant applications:

a) fund all applications that come in (which is problematic since resources are limited; one solution would be to simply divide up the money among all of the applications); or better yet,

b) create a small pool of applications that meet certain scholarly criteria and then randomly choose which applications to fund from that pool.  This process would be different from the current one where applications are ranked and funded beginning with #1 downwards.
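For the curious, the difference between the current ranked approach and model (b) can be sketched in a few lines of Python. This is only a rough illustration under my own assumptions: the function names, the committee scores, and the quality threshold are all hypothetical, and real competitions involve budgets and criteria that this ignores.

```python
import random


def ranked_fund(applications, n_awards):
    """Current model: fund from the top-ranked application downwards."""
    ranked = sorted(applications, key=applications.get, reverse=True)
    return ranked[:n_awards]


def lottery_fund(applications, threshold, n_awards, seed=None):
    """Model (b): screen on a quality threshold, then draw winners at random."""
    rng = random.Random(seed)  # a seed makes the draw reproducible for auditing
    pool = [name for name, score in applications.items() if score >= threshold]
    winners = rng.sample(pool, min(n_awards, len(pool)))
    return sorted(winners)


if __name__ == "__main__":
    # Hypothetical committee scores out of 10.
    scores = {"A": 9.1, "B": 8.7, "C": 8.5, "D": 6.0, "E": 8.9}
    print("Ranked:", ranked_fund(scores, 2))
    print("Lottery:", lottery_fund(scores, threshold=8.0, n_awards=2, seed=2014))
```

Under the ranked model, A and E always win. Under the lottery, any two of A, B, C, and E can win, so small (and possibly idiosyncratic) differences in scores among the qualifying applications no longer decide who gets funded.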

My own view is perhaps we should move to model (b).  At least that way, some of the idiosyncrasies involved with personal preferences and networks are somewhat mitigated.

Or maybe not!

The Flipped Classroom and Intro to Canadian Government and Politics: The First Day

Today was the first day of my 2nd year intro to Canadian government and politics class.  You can find the syllabus here.

As I blogged about previously, this year I will be using software to obtain instant feedback during class on whether students are learning the concepts and the materials that I am presenting to them.  I’m also using the software to facilitate discussion and teaching among the students themselves during lectures.

Today, I finally got a chance to roll out the software and try it live with my students.  After working closely with IT over the summer to make sure the classroom could handle 125 computers connecting to the wifi network, I realized that there were very few power outlets in the room!

Anyway, we made do and the results were very encouraging! I used the software to see immediately whether students understood and could apply the materials I had just taught them. In one instance, 53% of the students got the question wrong so I went over the material again and scores improved.  I also noticed improvement whenever students were given a chance to discuss their answers with their neighbours. Correct responses to the questions usually increased (although sometimes they didn’t).

In terms of participation, the majority of students answered the many questions I sent out. Approximately 100 students out of 110, I think, participated, of which 7 or 8 gave handwritten answers as opposed to using an electronic device (I’ll provide some more accurate numbers once I review the data).  Not everyone engaged in discussion but most did.  I had the TAs float around the room to encourage discussion, especially among those students who seemed to not be participating.

Overall, it was a really good and positive experience.  The full flipped classroom begins this week with the first round of online quizzes due next Sunday.  I am looking forward to using the data from the quizzes to modify my lectures as needed.  I’m also looking forward to using the learning catalytics software again to assess student learning instantly, but also this time to engage in some substantive debate and discussion about a number of issues relating to voting and electoral reform in Canada.

Stay tuned!

Asking for a Raise? Avoid Round Numbers

That’s the headline of a recent Wall Street Journal article.

“In one experiment, Ms. Mason and her team had 130 sets of people negotiate the price of a used car. When buyers suggested a round anchor, they ended up paying an average of $2,963 more than their initial offer. But buyers who suggested a precise number for a first offer paid only $2,256 more, on average, than that number in the end.

When it comes to negotiating salary, Ms. Mason’s research indicates that a job candidate asking for $63,500 might receive a counteroffer of $62,000, while the request for $65,000 is more likely to yield a counteroffer of, say, $60,000, as the hiring manager assumes the candidate has thrown out a broad ballpark estimate.

“We often think a higher anchor is the way to go,” said Ms. Mason. “But you risk upsetting people if you’re too extreme. We found that you could be less extreme if you were precise and still do better in the end.” The best strategy, she added, is to start with a high (but not extreme) number that is also precise.”

It sounds like the precise bidders do better than the rounders and so, as usual, asymmetrical information benefits the knowledge holder.  But I wonder what would happen if both parties offer and counteroffer with precise numbers? Maybe the advantage only accrues to the first bidder on the first bid? In any event, now I know why I did so poorly in my salary negotiations when I got hired at Laurier!