The author tries to answer two questions in this book:

  1. What are the core elements of fulfilling work?
  2. How should we change our current path so that our work is in line with our being?

Give meaning to work

The author talks about five aspects relevant to work :

  1. earning money
  2. achieving status
  3. making a difference
  4. following our passions
  5. using our talents

The author states categorically that earning money and achieving status might not give meaning to our work. More often than not, one gets onto a "hedonic treadmill" that is difficult to get off. Pursuing the "making a difference" path is a better option. It might lead an individual to try out a variety of jobs, learn a variety of skills, and become a portfolio worker. It is one of the ways to explore the "many selves" that lie within us. The chapter also explores the challenges that one might face in following any of the above paths. Personally, I find the "using your talents" path quite appealing. I have taken a path of "developing talents and then doing your bit in making your talents useful to others". If you set aside the monetary and status aspects of work and immerse yourself in developing talents, I guess you will find that learning a specific skill and putting it to use can be a wonderful experience. There are obvious challenges that one encounters on a day-to-day basis. However, I guess the "love" element towards your work will give you the strength to face them. The author’s message from this chapter is that there are three core elements of fulfilling work: meaning, flow and freedom. Each of these elements is discussed concisely through examples and anecdotes. It is light reading, but there is no fluff here.

If you are going to get good at something you need a tunnel vision.

– Wayne Davies

Act First, Reflect Later

One often hears stories of somebody abandoning a career and heading out to do something radical. These stories are inspiring, but such moves need a lot of courage. Not many would be in a state of mind to take such an extreme step. Instead, the author suggests three alternatives:

  • radical sabbatical
  • branching projects : start doing things related to your project on weekends or mini-breaks
  • conversational research : talk to people working in the field that you want to enter and get to know the details of the work

The above three suggestions involve doing first and reflecting later. While working on alternatives, one must seek out "flow" experiences. These tend to be activities in which we lose our sense of time. However, one must remember that in some activities the struggle itself can be rewarding. Imagine trying to understand a certain branch of math. Getting to a "flow" state will take a LOT of time. However, the day-to-day struggles and the incremental understanding of the subject can provide the kick needed to continue working. To get more clarity about "flow" experiences, it is better to keep a monthly log of such experiences and reflect on them from time to time.

The Longing for Freedom

Based on many interviews with independent consultants, the author says that the "bespoke consultant job" is both wonderful and awful. However, having experienced the freedom of freelancing, most of them say that they would never trade it for a 9-to-5 job. On the other hand, one can explore even while holding a routine job. There is a story about Wallace Stevens which goes like this:

By day he worked in an insurance company, eventually becoming vice-president of an established firm in Connecticut. But he was no workaholic: he returned home each evening to write verse, and was considered one of the great modernist poets of the early twentieth century. Stevens kept these two lives separate: he always felt something of an imposter in his day job; it was "like playing a part", he wrote. He regarded poetry as his "real work"- even if he wasn’t paid for it- and never wanted to commercialize his art by becoming a "professional" poet. After winning the Pulitzer Prize in 1955 he was offered a faculty position at Harvard that would have allowed him to write poetry for a living, but he turned it down to stay in his insurance job.


How to grow a vocation

The book ends by saying that there is no right vocation that one can "find". One has to "grow" into it. Once you start cultivating your talents, you see that those talents give you the freedom to do certain things, and by taking those actions, you grow into a vocation. This opinion is similar to the one offered by Cal Newport in his book "So Good They Can’t Ignore You". The story of Marie Curie illustrates the point that there is no magical calling that will become apparent to you one fine day. A real vocation is one you grow into, rather than one you find.

The book is very compact and can be read in a few hours. What I like about the book is the various stories that the author manages to put in, so that the main takeaways of the book are implicit and do not sound preachy.

In this post, I will attempt to briefly summarize the main points of the book.

An optimistic skeptic

The chapter starts off by saying that there are indeed people in the world who become extremely popular, make tons of money and get all the press coverage by providing a perfect explanation after the fact. The author gives one example of a public figure who rose to fame explaining events after the fact: Tom Friedman. At the same time, there are many people who are less known in the public space but have an extraordinary skill at forecasting. The purpose of the book is to explain how these ordinary people come up with reliable forecasts and beat the experts hands down.

Philip Tetlock, one of the authors of the book, is responsible for a landmark study spanning 20 years (1984-2004) that compared experts’ predictions with random guesses. The conclusion was that the average expert had done little better than guessing on many of the political and economic questions. Even though the right comparison would have been a coin toss, the popular financial media used the analogy of a "dart-throwing chimpanzee pitted against experts". In a sense, the analogy was more "sticky" than the mundane word "random". The point was well taken by all: experts are not all that good at predicting outcomes. However, the author feels disappointed that his study has been used to dish out extreme opinions about experts and forecasting, such as "all experts are useless". Tetlock believes that it is possible to see into the future, at least in some situations and to some extent, and that any intelligent, open-minded, and hardworking person can cultivate the requisite skills. Hence one needs to have an "optimistic" mindset about predictions. It is foolhardy to hold that all predictions are useless.

The word "skeptic" in the chapter’s title reflects the mindset one must possess in this increasingly nonlinear world. The chapter mentions the example of a Tunisian man whose suicide led to a massive revolution in the Arab world. Could anyone have predicted such catastrophic ripple effects from a seemingly common event? It is easy to look backward and sketch a narrative arc, but difficult to actually peer into the future and forecast. To make effective predictions, the mindset should be that of an "optimistic skeptic".

So is reality clock-like or cloud-like? Is the future predictable or not? These are false dichotomies, the first of many we will encounter. We live in a world of clocks and clouds and a vast jumble of other metaphors. Unpredictability and predictability coexist uneasily in the intricately interlocking systems that make up our bodies, our societies, and the cosmos. How predictable something is depends on what we are trying to predict, how far into the future, and under what circumstances.

In fields where forecasts have been reliable and good, the people who make them follow a "forecast, measure, revise, repeat" procedure. It’s a never-ending process of incremental improvement that explains why weather forecasts are good and slowly getting better. Why is this process non-existent in many stock market and macroeconomic predictions? The author says that it is a demand-side problem. The consumers of forecasting don’t demand evidence of accuracy, and hence there is no measurement.

Most readers and the general public might be aware of the research by Tetlock that produced the dart-throwing chimpanzee article. However, this chapter talks about another research study that Tetlock and his research partner and wife started in 2011, the Good Judgment Project (GJP). The couple invited volunteers to sign up and answer well-designed questions about the future. In total, 20,000 people volunteered to predict event outcomes. The author collated the predictions from this entire crew and entered a tournament conducted by IARPA, an intelligence research agency. The tournament involved predicting events from a month to a year into the future. It was held among 5 teams, one of which was GJP. Each team would effectively be its own research project, free to improvise whatever methods it thought would work, but required to submit forecasts at 9 a.m. eastern standard time every day from September 2011 to June 2015. By requiring teams to forecast the same questions at the same time, the tournament created a level playing field, and a rich trove of data about what works, how well, and when. Over four years, IARPA posed nearly five hundred questions about world affairs. In all, this produced one million individual judgments about the future. In every year, the motley crowd of forecasters of the Good Judgment Project beat the experts hands down. The author says that there are two major takeaways from the performance of the GJP team:

  1. Foresight is real : They aren’t gurus or oracles with the power to peer decades into the future, but they do have a real, measurable skill at judging how high-stakes events are likely to unfold three months, six months, a year, or a year and a half in advance.
  2. Forecasting is not some mysterious gift : It is the product of particular ways of thinking, of gathering information, of updating beliefs. These habits of thought can be learned and cultivated by any intelligent, thoughtful, determined person.

The final section of the first chapter contains the author’s forecast for the field of forecasting itself. With machines doing most of the cognitive work, there is a threat that forecasting done by humans will be no match for that done by supercomputers. However, the author feels that humans are underrated (a book-length treatment has been given by Geoff Colvin). In the times to come, the best forecasts will come from human-machine teams rather than from humans alone or machines alone.

The book is mainly about a specific type of people, approximately 2% of the volunteer forecasters, who did phenomenally well. The author calls them "superforecasters". Have they done well because of luck or skill? If it is skill, what can one learn from them? These are some of the teasers posed in the first chapter of the book.

Illusions of Knowledge

Given the number of books and articles that have been churned out about our cognitive biases, there is nothing really new in this chapter. The author reiterates the System-1 and System-2 thinking from Daniel Kahneman’s book. He also talks about the perils of being overconfident in our own abilities, and about various medical practices that were prevalent before the advent of clinical trials. Many scientists advocated treatments based on their "tip-of-your-nose" perspective without vetting their intuitions.

The tip-of-your-nose perspective can work wonders but it can also go terribly awry, so if you have the time to think before making a big decision, do so - and be prepared to accept that what seems obviously true now may turn out to be false later.

The takeaway from this chapter is obvious from its title. One needs to weigh both System-1 and System-2 thinking in most decisions. The example of Magnus Carlsen illustrates this kind of mixed thinking. In an interview, the grandmaster disclosed that his intuition tells him the possible moves almost immediately (within 10 seconds), and he spends most of his time double-checking that intuition. Only then does he make his next move. It’s an excellent practice to mix System-1 and System-2 thinking, but it requires conscious effort.

Keeping Score

The chapter starts with the infamous statement of Steve Ballmer, who predicted that the iPhone was not going to gain a significant market share. To evaluate Ballmer’s forecast scientifically, the author looks at the entire content of Ballmer’s statement and finds so many vague terms in it that it is difficult to give a verdict on the forecast. Another example is the "open letter" that many economists sent to Bernanke, asking him to stop QE to restore stability. QE did not stop, and the US has not seen any of the dire consequences that the economists had predicted. So, was the forecast wrong? Again, the economists’ forecast was not worded precisely, in numerical terms, so one cannot evaluate it. The basic message that the author tries to put across is that "judging forecasts is difficult".

The author’s basic motivation to conduct a study on forecasting came while sitting on a panel of experts who were asked to predict the future of Russia. Many of the forecasts were a complete disaster. However, that did not make the experts humble. No matter what had happened, the experts would have been just as adept at downplaying their predictive failures and sketching an arc of history that made it appear they saw it coming all along. In such a scenario, how does one go about testing forecasts? Some forecasts have no timelines. Some are worded in vague terms. Some are not expressed in numbers. Even when there are numbers, the event cannot be repeated, so how does one decide whether a forecast was right through luck or skill? We cannot rerun history, so we cannot judge a single probabilistic forecast, but everything changes when we have many probabilistic forecasts. Having many forecasts helps one pin down two essential features of any forecast analysis: calibration and resolution. Calibration tests whether forecasts and outcomes are in sync: events forecast at 70% should happen about 70% of the time. Resolution measures whether the forecasts are decisive probability estimates rather than hedges somewhere around 40%-60%. The author took all of the above into consideration and started a 20-year project, running from 1984 to 2004, that went like this:

  • assemble experts in various fields
  • ask a large number of questions with precise time frames and unambiguous language
  • require that forecasts be expressed using numerical probability scales
  • measure the calibration of the forecasters
  • measure the resolution of the forecasters
  • use the Brier score to evaluate the distance between the forecast and the actual outcome
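
The scoring steps in the list above can be sketched in a few lines. This is a minimal illustration, not the study's actual code: it uses the common single-outcome form of the Brier score for yes/no events (0 is perfect, 0.25 is what a constant 50% forecast earns), whereas Brier's original formulation sums over both outcome categories and ranges from 0 to 2. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


def calibration_table(forecasts, outcomes):
    """Map each stated probability (rounded to 0.1) to the observed frequency."""
    buckets = defaultdict(list)
    for p, o in zip(forecasts, outcomes):
        buckets[round(p, 1)].append(o)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}


# Hypothetical forecasts for six yes/no questions and what actually happened.
forecasts = [0.9, 0.9, 0.1, 0.7, 0.7, 0.3]
outcomes = [1, 1, 0, 1, 0, 0]

score = brier_score(forecasts, outcomes)
table = calibration_table(forecasts, outcomes)
print(score)
print(table)
```

A well-calibrated forecaster has observed frequencies close to the stated probabilities in the table; a forecaster with good resolution has many forecasts far from 0.5. With only six questions the table is of course far too noisy to judge anyone, which is the reason the study asked a large number of questions.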

The author patiently conducts the study for 20 years to see the results of all the forecasts. The following are the findings/insights from the project :

  • By way of analogy, the author calls big-idea thinkers "hedgehogs" and many-idea thinkers "foxes"
  • Foxes were better forecasters than hedgehogs
  • Foxes don’t fare well with the media. The media prefers authoritative statements to probabilistic ones.
  • Aggregating a diverse set of opinions beats hedgehogs. That’s why averaging several polls gives a better result than a single poll. This doesn’t mean the "wisdom of any sort of crowd" works. It means the "wisdom of a certain type of crowd" works.
  • The best metaphor for developing multiple perspectives is the dragonfly eye. Dragonflies have two eyes, but theirs are constructed very differently. Each eye is an enormous, bulging sphere, the surface of which is covered with tiny lenses. Depending on the species, there may be as many as thirty thousand of these lenses on a single eye, each one occupying a physical space slightly different from those of the adjacent lenses, giving it a unique perspective. Information from these thousands of unique perspectives flows into the dragonfly’s brain where it is synthesized into vision so superb that the dragonfly can see in almost every direction simultaneously, with the clarity and precision it needs to pick off flying insects at high speed. A fox with the bulging eyes of a dragonfly is an ugly mixed metaphor but it captures a key reason why the foresight of foxes was superior to that of hedgehogs with their green-tinted glasses. Foxes aggregate perspectives.
  • Simple models such as AR(1) (first-order autoregression) and EWMA (exponentially weighted moving average) performed better than both hedgehogs and foxes
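
The aggregation finding in the list above has a simple arithmetic core, sketched below with made-up numbers. The five probabilities are hypothetical and the unweighted mean is only the simplest possible aggregation rule, not necessarily the one used in the study. Because the squared-error score is convex, the averaged forecast can never score worse than the forecasters' average score, although it can still lose to the best individual.

```python
def brier(p, outcome):
    """Squared error of one probability forecast against a 0/1 outcome."""
    return (p - outcome) ** 2


def average_forecast(forecasts):
    """Unweighted mean of several probability forecasts for the same event."""
    return sum(forecasts) / len(forecasts)


# Hypothetical probabilities from five forecasters for an event that happened.
individual = [0.9, 0.6, 0.4, 0.8, 0.3]
outcome = 1

crowd = average_forecast(individual)
crowd_score = brier(crowd, outcome)
mean_individual_score = sum(brier(p, outcome) for p in individual) / len(individual)
best_individual_score = min(brier(p, outcome) for p in individual)

print(crowd_score, mean_individual_score, best_individual_score)
```

Here the crowd forecast of 0.6 scores about 0.16, better than the typical individual (about 0.21) but worse than the single best forecaster (0.01). That gap is one way to see why the type of crowd, and how its members are weighted, matters.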

Superforecasters

The chapter starts off by recounting the massive forecasting failure of the National Security Agency, the Defense Intelligence Agency, and thirteen other agencies that constitute the intelligence community of the US government. These agencies had a consensus view that Iraq had weapons of mass destruction. This view led to broad support for Bush’s policy of waging the Iraq war. After the invasion in 2003, no WMDs were found. How could agencies that employ close to twenty thousand intelligence analysts be so wrong? Robert Jervis, who has critically analyzed the performance of these agencies over several decades, says that the judgment was a reasonable one, but wrong. This statement does require some explanation, and the author provides the necessary details. The takeaway from the story is that the agencies made some errors, and that avoiding them would have scaled back the probability attached to the consensus view. Who knows, that might have changed the course of Iraq’s history.

After this failure, IARPA (Intelligence Advanced Research Projects Activity) was created in 2006. Its mission was to fund cutting-edge research with the potential to make the intelligence community smarter and more effective. It approached the author with a specific type of game in mind. IARPA’s plan was to create tournament-style incentives for top researchers to generate accurate probability estimates for Goldilocks-zone questions. The research teams would compete against one another and an independent control group. Teams had to beat the combined forecast (the "wisdom of the crowd") of the control group. In the first year, IARPA wanted teams to beat that standard by 20%, and it wanted that margin of victory to grow to 50% by the fourth year. But that was only part of IARPA’s plan. Within each team, researchers could run experiments to assess what really works against internal control groups. Tetlock’s team beat the control group hands down. Was it luck? Was it that the team had a slower reversion to the mean? Read the chapter to judge for yourself. Among the many volunteers involved in GJP, the author finds that certain forecasters were extremely good. The next five chapters are all about the way superforecasters seem to go about forecasting. The author argues that there are two things to note from GJP’s superior performance:

  1. We should not treat the superstars of any given year as infallible. Luck plays a role and it is only to be expected that the superstars will occasionally have a bad year and produce ordinary results
  2. Superforecasters were not just lucky. Mostly, their results reflected skill.

Supersmart?

The people whom the author calls superforecasters are not a random sample. So, the team’s outcome is not the same thing as collating predictions from a large set of random people. These people are different, the author says. But IQ and education are not the boxes into which they can be readily sorted. The author reports that the volunteers in general had higher IQs than the population at large, but there was no marked difference between forecasters and superforecasters. So it seems intelligence and knowledge help, but they add little beyond a certain threshold; superforecasting does not require a Harvard PhD and the ability to speak five languages.

The author finds that superforecasters follow a certain way of thinking that seems to mark out better forecasters:

  • Good back of the envelope calculations
  • Starting with outside view that reduces anchoring bias
  • Subsequent to outside view, get a grip on the inside view
  • Look out for various perspectives about the problem
  • Think thrice/four times, think deeply to root out confirmation bias
  • It’s not the raw crunching power you have that matters most. It’s what you do with it.

Most of the above findings are not groundbreaking. But what they emphasize is that good forecasting skills do not belong to some specific kind of people. They can be learnt and consciously cultivated.

For superforecasters, beliefs are hypotheses to be tested, not treasures to be guarded. It would be facile to reduce superforecasting to a bumper-sticker slogan, but if I had to, that would be it.

Superquants?

Almost all the superforecasters were numerate, but that is not what makes their forecasts better. The author gives a few examples that illustrate the mindset most of us carry. It is the mindset of Yes, No and Maybe, where Yes means almost certain, No means almost impossible and Maybe means a 50% chance. This kind of probabilistic thinking with only three dial settings does not help us become good forecasters. Based on the GJP analysis, the author says that superforecasters have a more fine-grained sense of probability than the rest of the forecasters. These fine-grained probability estimates are not the result of some complex math model, but of careful thought and nuanced judgment.

Supernewsjunkies?

The chapter starts with the author giving a broad description of the way a superforecaster works:

Unpack the question into components. Distinguish as sharply as you can between the known and unknown and leave no assumptions unscrutinized. Adopt the outside view and put the problem into a comparative perspective that downplays its uniqueness and treats it as a special case of a wider class of phenomena. Then adopt the inside view that plays up the uniqueness of the problem. Also explore the similarities and differences between your views and those of others-and pay special attention to prediction markets and other methods of extracting wisdom from crowds. Synthesize all these different views into a single vision as acute as that of a dragonfly. Finally, express your judgment as precisely as you can, using a finely grained scale of probability.

One of the things that the author notices about superforecasters is their tendency to revise their forecasts frequently. As the facts change around them, they revise their forecasts. This raises the question, "Does the initial forecast matter?" What if one starts with a vague prior and keeps updating it as the world changes? The GJP analysis shows that superforecasters’ initial estimates were 50% more accurate than those of regular forecasters. The real takeaway is that "updating matters": frequent updating is demanding and challenging, and it is a huge mistake to belittle belief updating. Both underreaction and overreaction to events can diminish accuracy. Both can also, in extreme cases, destroy a perfectly good forecast. Superforecasters have little ego invested in their initial or subsequent judgments. This makes them update their forecasts far more quickly than other forecasters. Superforecasters update frequently and in small increments. Thus they tread the middle path between overreacting and underreacting. The author mentions one superforecaster who uses Bayes’ theorem to revise his estimates. Does that mean Bayes is the answer to getting forecasts right? No, says the author. He found that even though all the superforecasters were numerate enough to apply Bayes, they rarely crunched the numbers that explicitly. The message is that superforecasters appreciate the Bayesian spirit, even though they seldom use the formula explicitly to update their forecasts. But small updates do not always work. The key idea that the author wants to put across is that there is no "magic" way to go about forecasting. Instead, there are many broad principles with lots of caveats.
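
To make the "Bayesian spirit" concrete, here is a minimal sketch of Bayes’ theorem in its odds form, where posterior odds equal prior odds times a likelihood ratio. The function name, the prior and the likelihood ratios are all hypothetical; the point is only to show how a stream of weak evidence produces small, frequent revisions rather than lurches.

```python
def bayes_update(prior, likelihood_ratio):
    """Return the posterior probability after one piece of evidence.

    likelihood_ratio = P(evidence | event happens) / P(evidence | it does not).
    """
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio <= 0:
        raise ValueError("likelihood_ratio must be positive")
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical: start at 20% and see three weak pieces of news in turn.
estimate = 0.20
history = [estimate]
for ratio in [1.2, 0.9, 1.5]:  # mildly for, mildly against, moderately for
    estimate = bayes_update(estimate, ratio)
    history.append(estimate)

print([round(p, 3) for p in history])
```

A likelihood ratio close to 1 moves the estimate only slightly, while a ratio far from 1 moves it a lot, so the size of each revision is tied to the strength of the evidence instead of to how dramatic the news feels.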

Perpetual Beta

The author starts off by talking about Carol Dweck’s "growth mindset" principle and says that this is one of the important traits of a superforecaster.

We learn new skills by doing. We improve those skills by doing more. These fundamental facts are true of even the most demanding skills. Modern fighter jets are enormously complex flying computers but classroom instruction isn’t enough to produce a qualified pilot. Not even time in advanced flight simulators will do. Pilots need hours in the air, the more the better. The same is true of  surgeons, bankers, and business executives.

It goes without saying that practice is key to becoming good. However, it is actually "informed practice" that matters. Unless there is clear and timely feedback about how you are doing, the quantity of practice can be a misleading indicator of your progress. This idea has been repeated in many books in the past few years. An officer’s ability to spot a liar is generally poor because feedback on his judgment takes a long time to reach him. On the other hand, people like meteorologists and seasoned bridge players learn from failure very quickly and improve their estimates. I think this is the mindset of a day trader in financial markets. He makes a trade, gets quick feedback on it and learns from his mistakes. If you compare a typical mutual fund manager with a day trader, the cumulative feedback that the day trader receives is far greater than what the fund manager receives. Read any book on indexing and you will find arguments about whether Mr. XYZ was a good fund manager or not. You can fill in any name for XYZ. Some say luck, some say skill. It is hard to tease out which is which when the data points are coarse-grained. However, if you come across a day trader who has consistently made money for a decent number of years, it is hard to attribute his performance to luck, for the simple reason that he has made far more trades cumulatively than a mutual fund manager. The basic takeaway, at least for a forecaster, is that he/she must know when a forecast fails. This is easier said/written than done. Forecasts could be worded in ambiguous language, and the feedback might lag by years, by which time our flawed memories can no longer recall the original estimate. The author gives a nice analogy for forecasters who do not have timely feedback. He compares them with basketball players doing free throws in the dark.

They are like basketball players doing free throws in the dark. The only feedback they get are sounds-the clang of the ball hitting metal, the thunk of the ball hitting the backboard, the swish of the ball brushing against the net. A veteran who has taken thousands of free throws with the lights on can learn to connect sounds to baskets or misses. But not the novice. A "swish!" may mean a nothing-but-net basket or a badly underthrown ball. A loud "clunk!" means the throw hit the rim but did the ball roll in or out? They can’t be sure. Of course they may convince themselves they know how they are doing, but they don’t really, and if they throw balls for weeks they may become more confident-I’ve practiced so much I must be excellent!-but they won’t get better at taking free throws. Only if the lights are turned on can they get clear feedback. Only then can they learn and get better.

Towards the end of this chapter, the author manages to give a rough composite portrait of a superforecaster :

Philosophical outlook
  • Cautious : Nothing is certain
  • Humble : Reality is infinitely complex
  • Nondeterministic : What happens is not meant to be and does not have to happen

Abilities and thinking styles
  • Actively open-minded : Beliefs are hypotheses to be tested, not treasures to be protected
  • Intelligent and knowledgeable, with a need for cognition : Intellectually curious, enjoy puzzles and mental challenges
  • Reflective : Introspective and self-critical
  • Numerate : Comfortable with numbers

Methods of forecasting
  • Pragmatic : Not wedded to any idea or agenda
  • Analytical : Capable of stepping back from the tip-of-your-nose perspective and considering other views
  • Dragonfly-eyed : Value diverse views and synthesize them into their own
  • Probabilistic : Judge using many grades of maybe
  • Thoughtful updaters : When facts change, they change their minds
  • Good intuitive psychologists : Aware of the value of checking thinking for cognitive and emotional biases

Work ethic
  • Growth mindset : Believe it’s possible to get better
  • Grit : Determined to keep at it however long it takes

The author says that the strongest single predictor of rising to the ranks of superforecasters is being in a state of "perpetual beta".

Superteams

The author uses GJP as a fertile ground to ask many interesting questions such as :

  • When does "wisdom of crowd" thinking help ?
  • Given a set of individuals, does weighting team forecasts work better than weighting individual forecasts?
  • Given that there are two groups, forecasters and superforecasters, does acknowledging superforecasters after their year 1 performance improve or degrade their performance in subsequent years?
  • How do forecasters perform against prediction markets ?
  • How do superforecasters perform against prediction markets ?
  • How do you counter "groupthink" amongst a team of superforecasters?
  • Does face-to-face interaction amongst superforecasters help/worsen the forecast performance ?
  • If aggregation of different perspectives gives better performance, should the aggregation be based on ability or diversity ?

These and many other related questions are taken up in this chapter. I found this chapter very interesting as the arguments made by the author are based on data rather than some vague statements and opinions.

Leader’s dilemma

I found it difficult to keep my attention while reading this chapter. It tries to address some of the issues that typical management books talk about. I have read enough management BS books that my mind has become extremely averse to any sort of general management gyan. Maybe there is some valuable content in this chapter. Maybe there are certain types of readers who will find its content appealing.

Are they really Super?

The chapter looks critically at the team of superforecasters and analyzes the viewpoints of various people who don’t believe that superforecasters have done something significant. The first skeptic is Daniel Kahneman, who seems to be of the opinion that there is a scope bias in forecasting. Like a true scientist, the author puts his superforecasting team through a controlled experiment, which gives some empirical evidence that superforecasters are less prone to scope bias. The second skeptic that the author tries to answer is Nassim Taleb. It is not so much an answer to Taleb as an acknowledgement that superforecasters are different. Taleb is dismissive of many forecasters as he believes that history jumps, and these jumps are black swans (highly improbable events with a lot of impact). The author defends his position by saying:

If forecasters make hundreds of forecasts that look out only a few months, we will soon have enough data to judge how well calibrated they are. But by definition, "highly improbable" events almost never happen. If we take "highly improbable" to mean a 1% or 0.1% or 0.0001% chance of an event, it may take decades or centuries or millennia to pile up enough data. And if these events have to be not only highly improbable but also impactful, the difficulty multiplies. So the first-generation IARPA tournament tells us nothing about how good superforecasters are at spotting gray or black swans. They may be as clueless as anyone else - or astonishingly adept. We don’t know, and shouldn’t fool ourselves that we do.

Now if you believe that only black swans matter in the long run, the Good Judgment Project should only interest short-term thinkers. But history is not just about black swans. Look at the inch-worm advance in life expectancy. Or consider that an average of 1% annual global economic growth in the nineteenth century and 2% in the twentieth turned the squalor of the eighteenth century and all the centuries that preceded it into the unprecedented wealth of the twenty-first. History does sometimes jump. But it also crawls, and slow, incremental change can be profoundly important.

So, there are people who trade or invest based on black swan thinking. Vinod Khosla invests in many startups so that one of them can be the next Google. Taleb himself played with out-of-the-money (OTM) options till one day he cracked it big time. However, this is not the only philosophy that one can adopt. A very different way is to beat competitors by forecasting more accurately, for example by correctly deciding that there is a 68% chance of something happening when others foresee only a 60% chance. This is the approach of the best poker players. It pays off more often, but the returns are more modest, and fortunes are amassed slowly. It is neither superior nor inferior to black swan investing. It is different.
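To put a number on the 68% versus 60% edge, here is a minimal sketch using one common form of the Brier score (the squared error between a probability forecast and the 0/1 outcome). The function name is hypothetical, and the assumption that the event truly occurs 68% of the time is purely illustrative; in practice the true probability is unknown.

```python
def expected_brier(forecast, true_prob):
    """Expected squared error of a probability forecast for a binary event.

    Assumes the event occurs with probability `true_prob` (an illustrative
    assumption; in practice the true probability is unknown).
    """
    hit = (1.0 - forecast) ** 2   # squared error when the event happens
    miss = forecast ** 2          # squared error when it does not
    return true_prob * hit + (1.0 - true_prob) * miss


sharp = expected_brier(0.68, true_prob=0.68)
crowd = expected_brier(0.60, true_prob=0.68)

# Lower is better; the gap equals (0.68 - 0.60) ** 2.
print(f"68% forecast: {sharp:.4f}")
print(f"60% forecast: {crowd:.4f}")
print(f"edge per question: {crowd - sharp:.4f}")
```

The edge on any single question is tiny (0.0064 under these assumptions), which fits the point that this style of forecasting pays off modestly and only over many questions.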

What Next ?

The chapter starts off by giving a few results of the opinion polls that were conducted before the Scottish referendum on independence from the UK. The numbers gave no clear sign of which way the referendum would go. In the end, the vote was NO. It was hard to predict the outcome. There was one expert/pundit, Daniel Drezner, who came out in the open and admitted that it is extremely easy to give an explanation after the fact, but making a forecast before the fact is a different ball game. Drezner also noted that he himself had stuck to NO for some time before switching to YES. He made an error while correcting his prior opinion. As a learning, he says that in the future he would give a confidence interval for the forecast, rather than a binary forecast. The author wishes that many more experts/forecasters adopt the confidence interval mindset. This shift from point estimate to interval estimate might do a world of good, says the author. What will this 500 page book do to the general reader/society? The author says that there could be two scenarios.

  • Scenario 1: Forecasting is mainly used to advance a tribe’s interests. In all such situations, the accuracy of the forecast will be brushed aside, whoever makes the forecast that suits the popular tribe will be advertised, and sadly actions will be taken based on these possibly inaccurate forecasts. This book will be just another book on forecasting that is good to read, but nothing actionable will come out of it.
  • Scenario 2: Evidence based forecasting takes off. Many people will demand accuracy and calibration results from experts.

Being an optimistic skeptic, the author feels that evidence based forecasting will be adopted in the times to come. Some quantification is always better than no quantification (which is what we see currently). The method or system used in the forecasting tournament to come out ahead is a work-in-progress, admits the author. However, that doesn’t mean it is not going to improve our forecasting performance.

Towards the end of the book, the author does seem to acknowledge the importance of the Tom Friedmans of the world, though not because of their forecasting ability. It is their vague forecasts that are actually superquestions for the forecasters. Whenever pundits give their forecasts in an imprecise manner, that serves as the fodder for all the forecasters to actually get to work. The assumption the author makes is that superforecasters are not superquestioners. Superquestioners are typically hedgehogs who have one big idea, think deeply and see the world based on that one big idea. Superforecasters, i.e. foxes, are not that good at churning out big questions, the author opines. In conclusion, he says an ideal person would be a combination of superforecaster and superquestioner.

 

Takeaway:

This is not a "ONE BIG IDEA" book. Clearly the author is on the side of foxes and not hedgehogs. The book is mainly about analyzing the performance of a specific set of people from a forecasting team that participated in an IARPA sponsored tournament. The book looks at these superforecasters and spells out a number of small but powerful ideas/principles that can be cultivated by anyone who aspires to become a better forecaster.

reproducible-research

The book starts by explaining an example project that one can download from the author’s GitHub account. The project files serve as an introduction to reproducible research. I guess it might make sense to download this project, try to follow the instructions and create the relevant files. By compiling the example project, one gets a sense of what one can accomplish by reading through the book.

Introducing Reproducible Research
The highlight of an RR document is that data, analysis and results are all in one document. There is no separation between announcing the results and doing number crunching. The author gives a list of benefits that accrue to any researcher generating RR documents. They are

  • better work habits
  • better team work
  • changes are easier
  • high research impact

The author uses knitr / rmarkdown in the book to discuss reproducibility. The primary difference between the two is that the former demands that the document be written using the markup language associated with the desired output. The latter is more straightforward in the sense that one markup can be used to produce a variety of outputs.

Getting Started with Reproducible Research
The thing to keep in mind is that reproducibility is not an afterthought; it is something you build into the project from the beginning. Some general aspects of RR are discussed by the author. If you do not believe in the benefits of RR, you might want to read this chapter carefully, as it also gives some RR tips to a newbie. This chapter also gives the reader a road map of what to expect from the book. In any research project, there is a data gathering stage, a data analysis stage and a presentation stage. The book contains a set of chapters addressing each stage of the project. More importantly, the book contains ways to tie the stages together so as to produce a single compendium for your entire project.

Getting started with R, RStudio and knitr/rmarkdown
This chapter gives a basic introduction to R and subsequently dives into knitr and rmarkdown commands. It shows how one can create a .Rnw or .Rtex document and convert it into a PDF either through RStudio or the command line. rmarkdown documents, on the other hand, are more convenient for reproducing simple projects where there are not many interdependencies between various tasks. Obviously the content in this chapter gives only a general idea. One has to dig through the documentation to make things work. One learning for me from this chapter is the option of creating .Rtex documents, in which the syntax can be less baroque.

Getting started with File Management
This chapter gives the basic directory structure that one can follow for organizing the project files. One can use the structure as a guideline for one’s own projects. The example project uses a GNU makefile for data munging. The chapter also gives a crash course in bash.

Storing, Collaborating, Accessing Files, and Versioning
The four activities mentioned in the chapter title can be done in many ways. The chapter focuses on Dropbox and GitHub. It is fairly easy to learn to use the limited functionality one gets from Dropbox. On the other hand, GitHub demands some learning from a newbie. One needs to get to know the basic terminology of git. The author does a commendable job of highlighting the main aspects of git version control and its tight integration with RStudio.

Gathering Data with R

This chapter talks about the way in which one can use the GNU make utility to create a systematic way of gathering data. The use of a makefile makes it easy for others to reproduce the data preparation stage of a project. If you have written a makefile for a C++ project or in some other context, it is pretty easy to follow the basic steps mentioned in the chapter. Otherwise it might involve some learning curve. My guess is that once you start writing makefiles for specific tasks, you will realize their tremendous value in any data analysis project. A nice starting point for learning makefiles is robjhyndman’s site.

Preparing Data for Analysis
This gives a whirlwind tour of data munging operations and data analysis in R.

Statistical Modeling and knitr
The chapter gives a brief description of chunk options that are frequently used in an RR document. Of all the options, cache.extra and dependson are the ones that I had never used in the past, and they are a learning for me. One of the reasons I like knitr is its ability to cache objects. In the Sweave era, I had to load separate packages and do all sorts of things to run a time intensive RR document. It was very painful, to say the least. Thanks to knitr it is extremely easy now. Even though the cache option is described at the end, I think it is one of the most useful features of the package. Another good thing is that you can combine various languages in an RR document. Currently knitr supports the following language engines:

  • Awk
  • Bash shell
  • CoffeeScript
  • Gawk
  • Haskell
  • Highlight
  • Python
  • R (default)
  • Ruby
  • SAS
  • Bourne shell


Showing results with tables
In whatever analysis you do using R, there are always situations where your output is in the form of a data.frame or matrix or some sort of list structure that is formatted to display on the console as a table. One can use kable to show data.frame and matrix structures. It is simple and effective but limited in scope. The xtable package, on the other hand, is extremely powerful. One can pass various fitted statistical model objects to the xtable function to obtain the results encoded in a table and tabular environment. The chapter also mentions texreg, which is far more powerful than the previously mentioned packages. With texreg, you can show the output of more than one statistical model as a table in your RR document. There are times when the output classes are not supported by xtable. In such cases, one has to manually hunt down the relevant table, create a data frame or matrix of the relevant results and then use the xtable function.

Showing results with figures
It is often better to know basic LaTeX syntax for embedding graphics before using knitr. One problem I have always faced with knitr embedded graphics is that all the chunk options have to be given on one single line. You cannot spread chunk options over two lines. I learnt a nice hack from this chapter, where some of the environment level code can be used as markup rather than as chunk options. This chapter touches upon the main chunk options relating to graphics and does it well, without overwhelming the reader.

Presentation with knitr/LaTeX
The author says that much of the LaTeX in the book has been written using the Sublime Text editor. I think this is the case with most people who intend to create an RR document. Even though RStudio has a good environment for creating a LaTeX file, I usually go back to my old editor to write LaTeX markup. How to cite a bibliography and how to cite R packages in your document are questions that every researcher has to think about when producing RR documents. The author does a good job of highlighting the main aspects of this thought process. The chapter ends with a brief discussion on Beamer. It gives a 10,000 ft. view of Beamer. I stumbled on to a nice link in this chapter that gives the reason for using fragile in Beamer.

Large knitr/LaTeX Documents: Theses, Books, and Batch Reports
This chapter is extremely useful for creating long RR documents. In fact, even if your RR document is not large, it makes sense to logically subdivide it into separate child documents. For knitr, there are chunk options to specify parent and child relationships. These options are useful in knitting child documents independently of the other documents embedded in the parent document. You do not have to specify the preamble code again in each of the child documents, as they inherit it from the parent document. The author also shows a way to use Pandoc to convert an rmarkdown document to tex, which can then be included in the RR document.

The penultimate chapter is on rmarkdown. The concluding chapter of the book discusses some general issues of reproducible research.

Takeaway:

This book gives a nice overview of the main tools that one can use in creating an RR document. Even though the title of the book has the term “RStudio” in it, the tools and the hacks mentioned are IDE agnostic. One can read a book length treatment of each of the tools mentioned in the book and might easily get lost in the details. Books such as these give a nice overview of all the tools and hence motivate the reader to dive into specifics as and when there is a requirement.

book_cover

This book can be used as a companion to a more pedagogical text on survival analysis. For someone looking for an appropriate R command to fit a certain kind of survival model, this book is apt. It gives neither the intuition nor the math behind the various models. It reads like an elaborate help manual for all the packages in R related to event history analysis.

I guess one of the reasons for the author writing this book is to highlight his package eha on CRAN. The author’s package is basically a layer on the survival package that has some advanced techniques which I guess only a serious researcher in this field can appreciate. The book takes the reader through the entire gamut of models using a pretty dry format, i.e. it gives the basic form of a model, the R commands to fit the model, and some commentary on how to interpret the output. The difficulty level does not rise steadily from start to end. I found some very intricate material interspersed among some very elementary estimators. An abrupt discussion of Poisson regression breaks the flow in understanding the Cox model and its extensions. The chapter on Cox regression contains a detailed and unnecessary discussion of some elementary aspects of any regression framework. Keeping these cribs aside, the book is useful as a quick reference to functions from the survival, coxme, cmprsk and eha packages.

book_cover_sa_self

 

As the title suggests, this book is truly a self-learning text. There is minimal math in the book, even though the subject essentially is about estimating functions (survival, hazard, cumulative hazard). I think the highlight of the book is its unique layout. Each page is divided into two parts: the left hand side of the page runs like a pitch, whereas the right hand side runs like a commentary on the pitch. Every aspect of estimation and inference is explained in plain simple English. Obviously one cannot expect to learn the math behind the subject. In any case, I guess the target audience for this book comprises those who would like to understand survival analysis, run the model using some software packages and interpret the output. So, in that sense, the book is spot on. The book is 700 pages long, so all said and done, this is not a book that can be read in one or two sittings. Even though the content is easily understood, I think it takes a while to get used to the various terms and assumptions for the whole gamut of models one comes across in survival analysis. Needless to say, this is a beginner’s book. If one wants to understand the actual math behind the estimation and inference of various functions, then this book will equip a curious reader with a 10,000 ft. view of the subject, which in turn can be very helpful in motivating oneself to slog through the math.

Here is a document that gives a brief summary of the main chapters of the book.

cover

This book is vastly different from the books that try to warn us against incorrect statistical arguments present in media and other mundane places. Instead of targeting newspaper articles, politicians and journalists who make errors in their reasoning, the author investigates research papers, where one assumes that scientists and researchers make flawless arguments, at least from a stats point of view. The author points out a few statistical errors even in the pop science book “How to Lie with Statistics”. This book takes the reader through the kind of statistics that one comes across in research papers and shows various types of flawed arguments. The flaws can arise for several reasons, such as eagerness to publish a new finding without thoroughly vetting it, inadequate sample size, inadequate statistical power in the test, inference from multiple comparisons etc. The tone of the author isn’t deprecatory. Instead he explains the errors in simple words. There is minimal math in the book and the writing makes the concepts abundantly clear even to a statistics novice. That in itself should serve as a good motivation for a wider audience to go over this 130 page book.

In the first chapter, the author introduces the basic concept of statistical significance. Frequentist hypothesis testing hinges on the p value, which measures Probability(data at least as extreme as observed | null hypothesis). In a way, the p value measures the amount of surprise in the data, given that you have a specific null hypothesis in mind. If the p value turns out to be too small, you start doubting the null and reject it. The procedure looks perfectly logical at the outset. However, one needs to keep in mind what the p value does not tell you:

  • It does not per se measure the size of the effect.
  • Two experiments with identical data can give different p values. This is disturbing, as it implies that the p value somehow depends on the intentions of the person doing the experiment.
  • It does not say anything about the false positive rate.
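The second point can be made concrete with the classic coin-flipping illustration (not necessarily the example used in the book): the same 9 heads and 3 tails yield different p values depending on whether the experimenter planned to flip 12 times or planned to stop at the third tail. The function names below are hypothetical; this is a sketch under those two assumed designs.

```python
from math import comb


def p_fixed_flips(heads, flips):
    """One-sided p value when the number of flips was fixed in advance.

    P(at least `heads` heads in `flips` flips of a fair coin).
    """
    return sum(comb(flips, k) for k in range(heads, flips + 1)) / 2 ** flips


def p_stop_at_tails(heads, tails):
    """One-sided p value when flipping stops at the `tails`-th tail.

    Seeing at least `heads` heads before the final tail means the first
    heads + tails - 1 flips contained at most tails - 1 tails.
    """
    first = heads + tails - 1
    return sum(comb(first, k) for k in range(tails)) / 2 ** first


# Same data in both designs: 9 heads and 3 tails.
p_binomial = p_fixed_flips(heads=9, flips=12)
p_negative_binomial = p_stop_at_tails(heads=9, tails=3)
print(f"fixed 12 flips:     p = {p_binomial:.4f}")
print(f"stop at third tail: p = {p_negative_binomial:.4f}")
```

The first design gives p of about 0.073 and the second about 0.033, so the same data is "not significant" under one plan and "significant" under the other at the usual 5% level.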

By the end of the first chapter, the author convincingly rips apart p value and makes a case for using confidence intervals. He also says that many people do not report confidence intervals because they are often embarrassingly wide and might make their effort a fruitless exercise.

The second chapter talks about statistical power, a concept that many introductory stats courses do not delve into appropriately. The statistical power of a study is the probability that it will distinguish an effect of a certain size from pure luck. The power depends on three factors

  • size of the bias you are looking for
  • sample size
  • measurement error

If an experiment is trying to detect a subtle bias, far more data is needed to detect it. The conventionally accepted power for an experiment is 80%, meaning that if the bias is real, the probability of detecting it is 80%. In many studies that report negative results, i.e. the alternate is rejected, it is likely that the test was underpowered. Why do researchers fail to take care of power in their calculations? The author guesses that it could be because the researcher’s intuitive feeling about sample sizes is quite different from the results of power calculations. The author also ascribes it to the not so straightforward math required to compute the power of a study.
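As a rough illustration of how the three factors interact, here is a minimal sketch of a power calculation for a two-sided one-sample z-test. The choice of test is my assumption, not the book's; `effect` plays the role of the size of the bias, `n` the sample size and `sigma` the measurement error.

```python
from statistics import NormalDist


def z_test_power(effect, n, sigma=1.0, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    `effect` is the true shift in the mean, `sigma` the known standard
    deviation of one measurement, `n` the sample size.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


for n in (10, 32, 100):
    print(n, round(z_test_power(effect=0.5, n=n), 3))
print("subtle effect, n=32:", round(z_test_power(effect=0.1, n=32), 3))
```

With an effect of half a standard deviation, about 32 observations are enough to reach the conventional 80% power, while an effect five times smaller leaves the same study with very little chance of detecting anything.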

The problems with power also plague the other side of experimental results. Instead of detecting the true bias, underpowered studies that do reach significance often inflate the size of the effect; these are called type M errors, where M stands for magnitude. One of the suggestions given by the author is that instead of computing the power of a study for a certain bias and a certain statistical significance, researchers should choose a sample size that gives sufficiently narrow confidence intervals. Since there is no readily available term to describe this statistic, the author calls it assurance, which determines how often the confidence intervals must beat a specific target width. The takeaway from this chapter is that whenever you see a report of a significant effect, your reaction should not be "Wow, they found something remarkable" but "Is the test underpowered?". Also, just because the alternate was rejected doesn’t mean that the alternate is absolute crap.
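The inflation of significant results can be seen in a small simulation. This is a sketch under assumptions of my own (a true effect of 0.2, unit-variance noise, 25 measurements per study, a z-test at the 5% level), with hypothetical names; it is not taken from the book.

```python
import random
from statistics import NormalDist, mean


def significant_estimates(true_effect, n, studies, seed=0):
    """Simulate many small studies and keep only the 'significant' estimates.

    Each study averages `n` noisy measurements (standard deviation 1) and is
    declared significant by a two-sided z-test at the 5% level.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)
    std_error = 1 / n ** 0.5
    kept = []
    for _ in range(studies):
        estimate = sum(rng.gauss(true_effect, 1.0) for _ in range(n)) / n
        if abs(estimate) / std_error > z_crit:
            kept.append(estimate)
    return kept


true_effect = 0.2
kept = significant_estimates(true_effect, n=25, studies=5000)
print(f"share of studies reaching significance: {len(kept) / 5000:.2f}")
print(f"true effect: {true_effect}, mean significant estimate: {mean(kept):.2f}")
```

Only a minority of these underpowered studies reach significance, and the ones that do report an effect well above the true value, because an estimate has to be unusually large to clear the significance bar at all.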

The third chapter talks about pseudoreplication, a practice where the researcher uses the same set of patients/animals/whatever to create repeated measurements. Instead of recruiting more subjects, the researcher inflates the sample size by taking repeated measurements. Naturally the data points are not independent, as the original experimental design might warrant. Knowing that the data is pseudoreplicated, one must be careful while drawing inferences. The author gives some broad suggestions to address this issue.

The fourth chapter is about the famous base rate fallacy, where one mistakes the p value for the probability of the alternate being true. Frequentist procedures that give p values merely talk about the surprise element. In no way do they talk about the probability that the treatment works in a treatment control experiment. The best way to get a good estimate of the probability that a result is a false positive is by considering prior estimates, i.e. the base rate. The author also talks about the Benjamini-Hochberg procedure, a simple yet effective procedure to control the false discovery rate. I remember reading about this procedure in an article by Brad Efron titled “The future of indirect evidence”, in which Efron highlights some of the issues related to hypothesis testing in high dimensional data.
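For readers who have not seen it, the Benjamini-Hochberg procedure is short enough to sketch in a few lines. The p values below are made up for illustration and the function name is hypothetical; `q` is the false discovery rate one is willing to tolerate.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: sort the p values, find the largest rank k with
    p_(k) <= (k / m) * q, and reject the k smallest p values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


p_values = [0.035, 0.001, 0.30, 0.028, 0.025]  # hypothetical results of 5 tests
print(benjamini_hochberg(p_values))
```

Here four of the five hypotheses are rejected. Note that 0.025 is rejected even though it misses its own threshold (2/5 × 0.05 = 0.02), because a larger p value further down the sorted list meets its threshold; a Bonferroni correction at 0.05/5 = 0.01 would have kept only the smallest p value.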

The fifth chapter talks about the often found procedure of testing two drugs against a placebo and using the results to compare the efficacy of the two drugs. Various statistical errors can creep in. These are thoroughly discussed. The sixth chapter talks about double dipping, i.e. using the same data to do exploratory analysis and hypothesis testing. It is the classic case of using in-sample statistics to extrapolate out-of-sample statistics. The author talks about arbitrary stopping rules that many researchers employ to cut short an elaborate experiment when they find statistically significant results at the initial stage. Instead of having a mindset which says, "I might have been lucky in the initial stage", the researcher over enthusiastically stops the experiment and reports a truth inflated result. The seventh chapter talks about the dangers of dichotomizing continuous data. In many research papers, there is a tendency to divide the data into two groups and run significance tests or ANOVA based tests, thus reducing the information available from the dataset. The author gives a few examples where dichotomization can lead to grave statistical errors.

The eighth chapter talks about basic errors that one makes in regression analysis. The errors highlighted are

  • over reliance on stepwise regression methods like forward selection or backward elimination methods
  • confusing correlation and causation
  • confounding variables and Simpson’s paradox

The last few chapters give general guidelines to improve research efforts, one of them being “reproducible research”.

Takeaway:

Even though this book is a compilation of various statistical errors committed by researchers in various scientific fields, it can be read by anyone whose day job is data analysis and model building. In our age of data explosion, where there are far more people employed in analyzing data and who need not necessarily publish papers, this book would be useful to a wider audience. If one wants to go beyond the simple conceptual errors present in the book, one might have to seriously think about all the errors mentioned in the book and understand the math behind them.

book_cover

The book serves as a nice intro to Bayes theory for an absolute newbie. There is minimal math in the book. Whatever little math is mentioned is accompanied by figures and text so that a newbie to this subject “gets” the basic philosophy of Bayesian inference. The book is a short one spanning 150 odd pages that can be read in a couple of hours. The introductory chapter of the book comprises a few examples that repeat the key idea of Bayes. The author says that he has deliberately chosen this approach so that a reader does not miss the core idea of Bayesian inference, which is,

Bayesian inference is not guaranteed to provide the correct answer. Instead, it provides the probability that each of a number of alternative answers is true, and these can then be used to find the answer that is most probably true. In other words, it provides an informed guess.

In all the examples cited in the first chapter, there are two competing models. The likelihood of observing the data given each model is almost identical. So, how does one choose one of the two models? Well, even without applying Bayes, it is abundantly obvious which of the two competing models one should go with. Bayes helps in formalizing the intuition and thus creates a framework that can be applied to situations where human intuition is misleading or vague. If you are coming from a frequentist world where “likelihood based inference” is the mantra, then Bayes appears to be merely a tweak where weighted likelihoods instead of plain vanilla likelihoods are used for inference.

The second chapter of the book gives a geometric intuition to a discrete joint distribution table. Ideally a discrete joint distribution table between observed data and different models is the perfect place to begin understanding the importance of Bayes. So, in that sense, the author provides the reader with some pictorial introduction before going ahead with numbers. 

The third chapter starts off with a joint distribution table of 200 patients tabulated according to number of symptoms and type of disease. This table is then used to introduce the likelihood function, marginal probability distribution, prior probability distribution, posterior probability distribution and maximum a posteriori estimate. All these terms are explained in plain English, which makes the chapter a perfect intro for a beginner. The other thing this chapter makes clear is that it is easy to obtain the probability of data given a model. The inverse problem, i.e. the probability of a model given data, is a difficult one, and it is the ability to do inference in that direction that makes Bayesian inference powerful.
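The terms above fit together in a few lines of code. The sketch below uses made-up numbers (not the book's table of 200 patients) and assumes only two candidate diseases: the likelihoods are almost identical, so it is the prior that settles the matter, which is the "weighted likelihood" idea in action.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Bayes' rule over a finite set of candidate models.

    `prior[m]` is P(model m); `likelihood[m]` is P(observed data | model m).
    Returns P(model m | observed data) for every m.
    """
    joint = {m: prior[m] * likelihood[m] for m in prior}
    evidence = sum(joint.values())  # marginal probability of the data
    return {m: joint[m] / evidence for m in joint}


# Hypothetical numbers: a rare disease A and a common disease B.
prior = {"A": Fraction(1, 100), "B": Fraction(99, 100)}
likelihood = {"A": Fraction(9, 10), "B": Fraction(8, 10)}  # P(symptoms | disease)

post = posterior(prior, likelihood)
map_estimate = max(post, key=post.get)
for disease, prob in post.items():
    print(disease, prob, float(prob))
print("maximum a posteriori estimate:", map_estimate)
```

Although disease A explains the symptoms slightly better (0.9 versus 0.8), its posterior probability is only 1/89, so the maximum a posteriori estimate is B.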

The fourth chapter moves on to continuous distributions. The didactic method is similar to the previous chapter. A simple coin toss example is used to introduce concepts such as the continuous likelihood function, maximum likelihood estimate, sequential inference, uniform priors, reference priors, bootstrapping and various loss functions.
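For the coin toss setting, sequential inference with a uniform prior can be sketched using the standard fact that a Beta density updated with a coin toss is again a Beta density. Whether the book uses this closed form is my assumption, and the toss sequence below is made up.

```python
from fractions import Fraction


def update(a, b, toss):
    """One step of sequential inference for a coin's bias.

    The belief is a Beta(a, b) density; 'H' adds one to a, 'T' adds one to b.
    """
    return (a + 1, b) if toss == "H" else (a, b + 1)


tosses = "HHTHHHTHTH"           # hypothetical data: 7 heads, 3 tails
a, b = 1, 1                     # Beta(1, 1) is the uniform prior
for toss in tosses:
    a, b = update(a, b, toss)

heads = tosses.count("H")
mle = Fraction(heads, len(tosses))           # maximum likelihood estimate
posterior_mean = Fraction(a, a + b)          # mean of Beta(a, b)
posterior_mode = Fraction(a - 1, a + b - 2)  # peak of Beta(a, b)

print("posterior: Beta(%d, %d)" % (a, b))
print("MLE:", mle, "posterior mode:", posterior_mode, "posterior mean:", posterior_mean)
```

With a uniform prior the posterior mode coincides with the maximum likelihood estimate (7/10), while the posterior mean (2/3) is pulled slightly towards 1/2. Which of the two one reports depends on the loss function: the mean minimizes expected squared error, whereas the mode corresponds to an all-or-nothing loss.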

The fifth chapter illustrates inference in a Gaussian setting and establishes the connection with the well known regression framework. The sixth chapter talks about joint distributions in a continuous setting. Somehow I felt this chapter could have been removed from the book, but I guess in keeping with the author’s belief that “spaced repetition is good”, the content can be justified. The last chapter talks about the Frequentist vs. Bayesian wars, i.e. there are statisticians who believe that only one of them is THE right approach. Which side one takes depends on how one views “probability”: is it a property of the physical world, or is it a measure of how much information an observer has about that world? Bayesians, and increasingly many practitioners in a wide variety of fields, have found the latter belief to be a useful guide in doing statistical inference. More so, with the availability of software and computing power to do Bayesian inference, statisticians are latching on to Bayes like never before.

The author deserves praise for bringing out some of the main principles of Bayesian inference using just visuals and plain English. Certainly a nice intro book that can be read by any newbie to Bayes.

image

Takeaway:

“Write your code as though you are releasing it as a package”: this kind of thinking forces one to standardize the directory structure, abandon ad hoc scripts in favour of well thought out functions, and finally leverage the devtools functionality to write efficient, extensible and shareable code.

book-cover

 

 

Takeaway:

This book is a beautiful book that describes the math behind queueing systems. One learns a ton of math tools from this book, that can be used to analyze any system that has a queueing structure within it. The author presents the material in a highly enthusiastic tone with superb clarity. Thoroughly enjoyed going through the book.

bookcover

Takeaway:

The book is definitely overpriced for what it delivers. Given that 50% of this book explains R basics, the title is not at all appropriate. Even the quant stuff that is covered in the remaining 50% of the book is laughably inadequate.