### February 2013

The author urges the reader to develop a “probabilistic perspective” toward many aspects of life. Rational thought about randomness is far better than irrational emotional responses. Whenever we see things being reported or coincidences being cited, it is better to start by asking, “How unlikely is the event?”. To answer that, one needs to figure out the “Out of how many?” question. For example, the famous six degrees of separation that is often quoted sounds pretty reasonable once you do a back-of-the-envelope calculation.

Another example is the birthday problem that is cited as a good party trick. The reason it is termed a trick is that most of us do not get the “Out of how many?” question right. If 23 people are selected at random at a party, there is just over a 50% probability that at least two of them share a birthday. If one thinks the answer should be 23/365 (6.3%), one must understand that 23/365 is roughly the answer to a different question: “What is the chance that one of 23 people shares a specific day as their birthday?”. In the case of the birthday problem, one needs to look at “Out of how many?”. Since there are 253 pairs that can be formed in a set of 23 people, the upper limit should be 253/365 (69%), and taking into account the double counting that happens, the probability works out to be 50.7%. Simple notions such as “Out of how many?” and “understanding clumping of outcomes” are enough to develop a probabilistic perspective on many things that we come across.
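The 50.7% figure is easy to verify exactly. A minimal sketch (my own, not from the book) that multiplies out the probability that all birthdays are distinct:

```python
# Exact probability that at least two of n people share a birthday,
# assuming 365 equally likely birthdays and ignoring leap years.
def p_shared_birthday(n: int) -> float:
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (365 - k) / 365
    return 1.0 - p_all_distinct

print(round(p_shared_birthday(23), 3))  # ≈ 0.507 — just over 50%
```

The complement trick does the heavy lifting here: counting “no clumping at all” is far easier than counting every way a clump can occur.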

Of the 16 essays on probability in the book, I will summarize a few that I found interesting:

Randomness to the Rescue: When Uncertainty Is Your Friend

This section talks about situations where randomness is used deliberately to create fair play. While discussing various examples, the author mentions the famous example of a monkey sitting at a typewriter and eventually producing Shakespeare’s work, given infinite time. It is said that the monkey will take a trillion, trillion, trillion, trillion years to type the sentence “it was the best of times, it was the worst of times”. One can take the statement at face value, as the probability that such a sentence comes out of random typing seems very low. But what if one were asked to calculate the average time, assuming we all live forever? Even though the book does not go into the math behind it, just pause for a minute and think about it. It is not a trivial question: how long does a monkey take to type the above phrase? It can be solved by formulating a discrete martingale process. One solution is as follows:

Consider a situation where there are infinitely many gamblers willing to bet on the monkey’s output. Let time be measured in seconds. Each second, the monkey types a character at random (assume for simplicity that the monkey can type only lowercase letters (26) and uses only 2 of the 14 punctuation marks in the English language, space and comma). The sequence of characters forms a series of IID observations. So, each second the monkey will type one of 28 characters (26 letters + 2 punctuation symbols).

Now here is the setup: every second, a gambler walks in and bets \$1 on the event that the monkey will type “i” as the first character. If he wins, he bets that “t” will be the second character the monkey types; if he wins again, he bets on a space as the third character, and so on. The gambler is betting that the monkey will type “it was the best of times, it was the worst of times” in 51 seconds, one second for each character. The first second he gambles and gets it right, he wins \$28. As soon as he wins \$28, he bets all of it on the subsequent character. If he gets the second character right, he wins \$28^2. He bets the entire amount on the third character, i.e., the space character, and so on. The gambler drops out of the game and wins nothing the moment one of his bets fails, i.e., when the monkey does not type the phrase in the 51 seconds starting from his entry. The game stops when some gambler wins \$28^51 ≈ 63 trillion, trillion, trillion, trillion, trillion, trillion dollars. Given this setup, one can construct a discrete martingale process for each gambler and sum up the payoffs of all gamblers until the game ends. One can show that the most significant term in the expected stopping time is 28^51 seconds. So, taking just the most significant term, the average time for the game to stop, i.e., for the monkey to type the phrase “it was the best of times, it was the worst of times”, is about 2 million, trillion, trillion, trillion, trillion, trillion years. Human authors should not drop their pens just yet.
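The expected waiting time implied by this betting argument can be checked numerically for short patterns. A minimal sketch (my own, not from the book): the expected number of keystrokes until a pattern first appears equals the sum of A^k over every length k at which the pattern’s prefix equals its suffix (the full length included), where A is the alphabet size — self-overlapping patterns take longer on average.

```python
import random

# Expected keystrokes until `pattern` first appears, via the gamblers'
# (martingale) argument: sum A^k over all k where the length-k prefix
# of the pattern equals its length-k suffix (k = full length included).
def expected_wait(pattern: str, alphabet_size: int) -> int:
    n = len(pattern)
    return sum(alphabet_size ** k
               for k in range(1, n + 1)
               if pattern[:k] == pattern[-k:])

# Sanity check by simulation on a tiny alphabet {'a', 'b'}.
def simulate(pattern, letters, trials, rng):
    total = 0
    for _ in range(trials):
        window, t = "", 0
        while not window.endswith(pattern):
            window += rng.choice(letters)
            t += 1
        total += t
    return total / trials

rng = random.Random(0)
print(expected_wait("aa", 2))   # 6: "aa" overlaps itself, so 2^2 + 2^1
print(expected_wait("ab", 2))   # 4: no self-overlap, so 2^2
print(simulate("aa", "ab", 20000, rng))  # close to 6
```

For the 51-character phrase over a 28-character alphabet, the dominant term is 28^51 keystrokes, which converted to years matches the book’s astronomical estimate.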

The section mentions examples like the game Rock Paper Scissors, transmission of packets on the internet, cryptography and other areas where randomness is used widely. There are two examples that I particularly liked. One is from the field of sports. There was a big debate after the 1996 Olympics, where a runner was disqualified for anticipating the starter gun. The issue was finally resolved by a random number generator: when the runners line up and are ready to go, the starter gun is fired after a random delay drawn from an exponential distribution. Because the exponential distribution is memoryless, runners cannot anticipate the gun, while extraordinarily fast reaction times are not penalized.

The other example I liked is the way author describes Markov chain Monte Carlo (MCMC).

Suppose you wanted to measure the average pollution level in a large wilderness lake system. You might proceed by setting out in a boat, and paddling this way and that way, from inlet to inlet and lake to lake, throughout the park, without any particular destination. Every five minutes you take a water sample and measure the pollutants. Each new sample would be taken just a short distance from the previous sample, thus continuing from where the previous sample left off. Still if you average the pollution levels in many different samples over many days of paddling, eventually you get an accurate picture of the lake system.

The above setup is not to be confused with sampling the level at independently chosen random places. It means that your random move from one spot to the next depends only on where you currently are, not on the history of the path. Nothing in the path already traveled should influence that decision. In more technical terms, you are specifying a random walk in which each step is completely independent of the steps before the current position: the proposed next step has no dependence on where the walk has been before, and the decision to accept or reject the proposed step has no dependence on where the walk has been before.
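A toy stand-in for the lake analogy can make this concrete. This is my own sketch, not the book’s: a random-walk Metropolis sampler that wanders locally (each proposal depends only on the current position) yet, averaged over many samples, recovers properties of the target distribution — here a standard normal.

```python
import math
import random

# Random-walk Metropolis sampler targeting a standard normal density.
# Like the lake: wander locally, sample often, average everything.
def metropolis(n_samples: int, step: float, rng: random.Random):
    x, samples = 0.0, []
    def log_density(z):           # log of N(0,1), up to an additive constant
        return -0.5 * z * z
    for _ in range(n_samples):
        proposal = x + rng.uniform(-step, step)   # depends only on current x
        if math.log(rng.random()) < log_density(proposal) - log_density(x):
            x = proposal                          # accept the move
        samples.append(x)                         # a rejection keeps current x
    return samples

rng = random.Random(42)
samples = metropolis(100_000, step=1.0, rng=rng)
mean = sum(samples) / len(samples)
print(round(mean, 2))   # close to 0, the true mean of N(0,1)
```

The accept/reject rule is the one extra ingredient beyond blind paddling: it biases time spent toward high-density regions, so the long-run average converges to the right answer.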

Evolution, Genes, and Viruses: Randomness in Biology

This section talks about various examples relating to “branching processes”. Any undergrad course on probability introduces the student to branching processes, typically by asking for the probability of eventual extinction of a species, given a certain regeneration behavior. Various examples are given that help the reader build intuition for branching processes. One of the nice examples is the chain mail spam that we often get. The message always tends to be “Send this message to 5/10/X friends”. This is a branching process where the mail replicates itself through however many people you forward it to. The message is typically NEVER “Send this message to 1 or 2 friends”; the content always asks you to send it to multiple people, typically more than 5. One can actually prove mathematically that the chain will die out very quickly if the message told you to forward it to just 1 or 2 friends.
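The extinction claim can be illustrated numerically. A minimal sketch (my own model, not from the book): suppose each recipient forwards the mail to a Poisson-distributed number of new people with mean m. The extinction probability is the smallest fixed point of the offspring probability generating function, found by iterating q ← f(q) from 0.

```python
import math

# Extinction probability of a branching process with Poisson(m) offspring.
# The pgf of Poisson(m) is f(q) = exp(m * (q - 1)); iterating q <- f(q)
# from q = 0 converges to the smallest fixed point, i.e. P(extinction).
def extinction_probability(m: float, iterations: int = 10_000) -> float:
    q = 0.0
    for _ in range(iterations):
        q = math.exp(m * (q - 1.0))
    return q

print(round(extinction_probability(0.8), 3))  # 1.0: mean <= 1, chain surely dies
print(round(extinction_probability(5.0), 3))  # ~0.007: "forward to 5" chains survive
```

This is exactly why the spam never says “forward to 1 or 2 friends”: with mean offspring at or below 1, extinction is certain, while “forward to 5” makes the chain survive with probability above 99%.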

That Wily Monty Hall: Finding Probabilities from Clues

This section talks about conditional probability and examples where, given some evidence, we adjust probabilities either too much or too little. When we talk about the probability of an event given that we have seen some data, the sample space of the experiment (if we have that mental model) shrinks. This means that we need to reassess the probabilities in light of the data; instead of updating the priors too much or too little, we need to adjust just the right amount. The section ends with a discussion of Frequentists and Bayesians, the philosophical difference being that the latter view all probabilities as conditional probabilities, while the former are comfortable talking in absolute probability terms. I think it is a nice way of verbalizing the need for understanding conditional probability: “do not update the priors too much or too little”.
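The Monty Hall problem of the title is the classic case where intuition under-updates on the evidence. A quick Monte Carlo sketch (mine, not the book’s) shows switching wins two thirds of the time:

```python
import random

# Monte Carlo check of the Monty Hall problem: switching wins 2/3 of the time.
def monty_hall(trials: int, switch: bool, rng: random.Random) -> float:
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        pick = rng.randrange(3)
        # Host opens a door that is neither the pick nor the car.
        # (When pick == car the host's choice is arbitrary; the lowest such
        # door is opened here, which does not affect the win rates below.)
        opened = next(d for d in range(3) if d != pick and d != car)
        if switch:
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == car)
    return wins / trials

rng = random.Random(1)
print(round(monty_hall(100_000, switch=True, rng=rng), 2))   # ~0.67
print(round(monty_hall(100_000, switch=False, rng=rng), 2))  # ~0.33
```

The host’s door opening is the data; conditioning on it correctly moves the probability on the remaining door from 1/3 to 2/3, which is precisely the “just the right amount” of updating the section argues for.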

Spam, Spam, Probability, and Spam: Blocking Unwanted E-mail

I learnt something unrelated to probability in this section: the origin of the word “Spam”.

Spam was originally a canned-meat product developed by the Hormel corporation in 1937. During the fresh-meat shortage of World War II, Spam was distributed widely and consumed by soldiers and civilians worldwide. The 1970s comedy group Monty Python mocked the widespread availability of Spam in their famous skit about a restaurant that offers breakfast delicacies such as Spam sausage, Spam Spam bacon, Spam tomato, etc. This skit made the word “Spam” synonymous with any item that is overly abundant.

This section, as is obvious from the title, talks about using probability theory to solve a classification problem, i.e., assigning each incoming message a score of being spam or ham. Based on this score, the system learns to keep your inbox spam-free. By the way, filters of the Gmail variety, where you are also given the power to mark messages as spam, fall under the category of filters called “bogofilters”.
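A hedged sketch of the scoring idea (the word lists and per-word frequencies below are invented for illustration, not taken from the book): a naive Bayes filter combines, word by word, how much more likely each word is in spam than in ham.

```python
import math

# Toy naive Bayes spam score. The per-word spam/ham frequencies are
# made-up illustrative numbers, not real training data.
spam_freq = {"free": 0.30, "winner": 0.20, "meeting": 0.01, "viagra": 0.25}
ham_freq  = {"free": 0.02, "winner": 0.01, "meeting": 0.15, "viagra": 0.001}

def spam_score(message: str, prior_spam: float = 0.5) -> float:
    # Work in log space to avoid underflow; ignore unknown words.
    log_odds = math.log(prior_spam / (1 - prior_spam))
    for word in message.lower().split():
        if word in spam_freq:
            log_odds += math.log(spam_freq[word] / ham_freq[word])
    return 1 / (1 + math.exp(-log_odds))   # posterior P(spam | words)

print(spam_score("free winner viagra") > 0.99)   # True: very spammy
print(spam_score("meeting tomorrow") < 0.1)      # True: looks like ham
```

Marking messages as spam, as in Gmail, amounts to updating these per-word frequencies, so the filter’s priors keep adapting to the user’s own mail.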

Ignorance, Chaos, and Quantum Mechanics: Causes of Randomness

The last essay uses quantum mechanics to make the point that nature is random at its core. While the scientific community had to go through years of debate and research to establish it, we should be glad to be living in today’s world, where the notions of probability and randomness are developed to a large extent. More importantly, having a probabilistic mindset is becoming a necessary skill for separating the signal from the overwhelming noise that we come across daily.

The book is about 800 pages long. It is a classic reference for solving ODEs and thus contains almost all the tricks of the trade, be they analytical or numerical. The good thing about the book is that the various methods are presented in a “lesson-exercise-solution” format. Each lesson has enough examples to give a clear idea of the technique used to solve the ODE. Another interesting feature is that the book is interlaced with chapters describing various setups that generate ODEs. So, in one sense, the reader can see everything in one place, i.e., the various kinds of settings that generate ODEs and the methods to solve them. I happened to go over this book mainly because I was stuck with an ODE that I did not want to solve numerically. I went through a section of this book to figure out an analytical solution, liked the style of that section so much that I ended up reading almost the entire book. Well, it is a reference and is meant to be used whenever the need arises, but I went through it just to refresh all the ODE methods that were lying dormant in my memory. After going through this magnum opus on ODEs, I am kind of exhausted and plan to stay away from differential equations at least for some time.

This book is one of the few books on stochastic processes that teach concepts via problems. Teaching math concepts via problems, without delving too much into the theory, has its own advantages. The fact that the book asks you to prove/compute/calculate/verify something at regular intervals means that you are not a passive reader from the word “go”. In fact, one cannot be a passive reader while going over any math book, but this approach of using problems to teach various concepts is extremely appealing for people looking for self-study texts. There is a whole section on Markov chains where a few definitions and theorems are interlaced between what is largely a set of interesting exercises that guide the reader to understand discrete and continuous Markov processes. The interesting thing about this book is that, even though the title seems to communicate that it is a “learn by doing” book, the math for all the proofs, lemmas and propositions is quite rigorous. There are a few places where the author refers the reader to other books, but by and large it is a self-contained text.

The book starts with a basic recap of probability theory and covers sigma algebras, measures, probability spaces, Borel measurable functions, the sigma algebra generated by a random variable, etc. Well, the chapter is more of a formality; if you are not comfortable with measure theory, this shouldn’t be your first book anyway. The second chapter is on conditional expectation, one of the most important concepts needed to understand stochastic processes. The key thing to keep in mind is that conditional expectation has no explicit formula. Years ago, when I first came across this, I was surprised and at the same time clueless: if there is no explicit way to compute something, how does one compute it? Slowly, after getting my fundas right, I realized that the conditional expectation has to be guessed subject to two constraints: the guessed variable must be measurable with respect to the conditioning sigma algebra, and its average over any event in that sigma algebra must match the average of the original variable (the partial-averaging property). The chapter presents a nice set of examples and exercises that make the ways of guessing this random variable abundantly clear.

The third chapter and fourth chapters talk about Martingales. These are mathematical objects that are essential to understand a whole lot of stuff in math finance and stochastic integration. For example, if you want to integrate a function with respect to Brownian motion, one needs to use Ito’s integral which is nothing but a martingale process. Somehow, I think it is better to read about Martingales from some other book. Its hard to understand the importance of Martingale inequalities just by reading a few pages. The book covers the fundas behind martingales at a blazingly fast speed.

The real fun starts from chapter 5 onwards where Markov chains are introduced. As mentioned earlier, the highlight of this book is “learn by doing”. In that spirit, a basic definition of Markov chain and properties are given and the reader is expected to work out the important properties, lemmas relating to Markov chains. One thing I liked is the application of Fatou’s lemma in proving some lemmas for countable Markov chains. Finally I found some application where I could easily understand the utility of Fatou’s lemma. Chapter 6 is about continuous stochastic processes. Poisson process and Weiner process are covered in this chapter. This is also the chapter where one sees the application of Martingale inequalities.

The last chapter covers Ito’s stochastic calculus. Obviously 30 pages cannot do full justice to Ito’s calculus. But the chapter imparts enough rigor and intuition so that reader gets a good idea of all the important concepts of Ito’s framework. At the very beginning of the chapter, it shows why Ito’s integral is so different from Riemann integral. Subsequently Ito’s integral is constructed for a random step process. These random step processes form the basis for a generalized random process. Also the Ito’s integral for the sequence of random step processes is used to compute the Ito’s integral for a generalized random process with respect to Brownian function. The basic properties of Ito’s integral are stated and derived. Also the sufficient condition for the existence of Ito’s integral is stated and proved. The thing I liked about this chapter is that one gets enough practice in proving two things, one is to formally check whether the hypothesized random step function is a good approximation to a particular integrand function, and second is to verify whether Ito’s integral of the random step process does indeed converge to an Ito process. The chapter ends with a brief discussion of Ito’s lemma and its application in solving stochastic differential equations. Frankly if you are reading this chapter with absolutely no knowledge about Riemann-Stieltjes integrals, it might be a little difficult to put the pieces together.

The book is a good introduction to discrete as well as continuous time stochastic processes. I think the book has two appealing aspects. One is that it has a thorough explanation of conditional expectation and the second is the use of problems as guideposts for the reader to figure out various properties of stochastic processes.

The title of the book gives away the main idea behind the book, i.e., ways to separate skill and luck in any outcome. Take any field, be it sports, investing, business , etc. all we get to see is the outcome. In games like chess, the outcome is clearly attributed to player’s “Skill” and in games like slot-machines / casino games, the outcome is clearly attributed to “Luck”. However there are a lot of activities that fall somewhere in between this Luck-Skill continuum. The book tries to answer the following questions :

• How does one place an activity in the Luck-Skill continuum?
• How does on quantify the components of Luck and Skill in any activity?
• How do you account for changing / varying skills?
• How does one improve the observed outcome for various activities on the Luck-Skill continuum?
• What metrics does on use to capture the components of the outcome?
• How does one verify whether the metrics are reliable or not?
• What are the basic fallacies behind “mean reversion” ?
• What is a better metric in the case where there is very little data to estimate?
• What are the limitations of using quantitative methods for analysis of outcomes?
• Why does great success always entail great luck?
• How does the effect of sample size play in to the whole business of disentangling skill and luck ?
• What are the kinds of models that one can think of, while analyzing “Luck”?
• How does one get better in the art of good guesswork?
• How to interpret feedback in luck dominated activity / skill dominated activity?

I will try to give a gist of the argument that the author makes throughout the book using a ton of examples , academic studies, and other books.

Think of the observed outcome as a random variable and assume the random variable is a linear combination of luck and skill. Simple model but it will do the job of getting the intuition right.

Think of 2 jars, the first one is a Luck jar and the second one is a Skill jar. Both jars contain marbles with different numbers on it. The proportion of a marbles with a specific number differs in both the jars. The simplest model one can think of is that the marbles in Luck jar and Skill jar are normally distributed with varying mean and standard deviation

We get to see only the observables. So, in one sense inferring the distribution of skill and luck appears to be an intractable problem. The two distributions of luck and skill can take infinite forms and could result in the distribution that we get to see. So, how does one go about separating the skill and luck? To simplify thing further, one can assume the skill has less variance. Now things become a little easy to analyze. If there was no luck factor in the outcome, the following things would be evident in the observed data

• Easily assign a cause to the effect / outcome.
• Correlation between the outcomes is high.
• There is slow rate of mean reversion.

The activities where such effects are visible are the ones where deliberate practice works, years of toiling at something makes you good at something. However if one sees that there is no correlation between outcomes, i.e. there is a faster mean reversion, then one can infer that Luck plays a great role in the outcome.

The sample size needed for the analysis of the outcomes depends on the type of activity. If it is a skill based activity, small sample size is ok. But in activities that are more toward the luck side, a big enough sample is a must. The basic problem with small sample sizes is that humans are extremely good at attaching a narrative to the data. A big sample size means that there is a possibility to see the luck component better.

Using the above two jar example the book suggests James Stein estimator for estimating true skill. The basic equation behind the estimator is

In the activities where there is huge amount of luck, the shrinkage factor c is close to 0 and in the activities that are more skill based one can use the shrinkage factor as 1. For the rest of the situations, one can estimate this shrinkage coefficient.

So, that’s the crux of the book. The author uses the above framework to various contexts such as baseball, investing, business strategies, etc. to show how one could analyze and apply the argument to various datasets. The book as such is well written and easy on eyes. Through various interesting examples, the author shows that one needs to always guesstimate the relative importance of skill and luck in various activities to avoid falling prey to various cognitive biases.

Here a list of random points in the book that I found interesting

• The title of the author’s previous book “Think Twice” was crowd sourced. Mechanical Turk from Amazon was used to rate the various titles and “Think twice” came out as favorite
• Feedback is largely misleading in activities where luck plays a significant part
• “Can you lose on purpose” – is a simple test to check whether an activity is skill based or not ?
• Analyzing “hot hands” involves checking whether a 2 state Markov model is statistically significant or not. I think the same analysis can be done on mutual funds to see whether there is a 2 or 3 state Markov chain model for the fund performance?
• Polya’s urn framework to explain preferential treatment
• Music Lab – The experiment run by Stanford prof to show that hits are actually difficult to predict. I found this experiment a powerful validation of how difficult it is to predict hits in the entertainment industry. Having said that, there is a mention of a bot in the the book “Automate this” that has predicted fair number of music hits. Which one to believe?
• Correlation and rate of mean reversion capture the same effect
• “Active share “ might be a good metric to analyze the performance of various funds.
• When success is probabilistic, focus on the process
• Fallacies of interpreting mean reversion – Illusion of cause and effect, Illusion of reduction in variance, Illusion of feedback.

Haim Bodek, the quant trader who was featured in the book “The Dark Pools” has written a book on what he thinks is the main problem with HFT. This content is more like a collection of articles and blog posts packaged as a book. Despite such a structure, the author focuses on ONE issue through out,i.e., “Special order types”,  that he claims as the single most important demon behind HFT dominance.

Let me summarize the main points of various essays.

I – The Problem of HFT

The first essay talks about a fundamental problem with HFT. It was a game created by the exchanges and HFT players at the expense of institutional and retail investors. It was like everyone was using checker pieces to play checkers, while the game has been changed to chess and the HFTs were using the queens, rooks, bishops. Traditional investors in order to make alpha were actually playing checkers at a faster speed. But when the game itself had changed, it was a pointless effort.

The HFT business strategy was to work with the exchanges to align the features of the exchange with the features of the algorithmic strategies themselves. The exchanges facing margin pressure and intense competition from other ECNs found that pleasing HFT players and providing “guaranteed economics” was one way to bring trading volumes. Why were exchanges facing crisis? Firstly, the growing practice of internalizing retail order flow off exchange lead to shrinking volumes on the lit markets and secondly the rise of dark pools. Subsequent to these “exchange mis-innovations”, HFT strategies became much more prevalent in the market in 2005-2006 and the top HFTs were able to achieve results comparable to mid-tier market makers. However there was something else that made them rake massive money in the last 5-7 years. What were they? Special order types and corresponding order matching engine features (through artificial and anti-competitive means). Post REG NMS, there were a lot of abusive features introduced in the exchange structure, under the pretext of complying with the regulation. So, instead of creating a fair place, exchanges and HFT players manipulated rules in REG NMS environment. Slowly HFTs became a dominant form of trading and soon enough it became the only game worth playing.

The problem with HFT is not that its basic strategies are illegal or even unethical. The problem with HFT is that these strategies shouldn’t work at their current scale and volume, and have only come to dominate the market through the carefully crafted advantages provided by the electronic exchanges for these specific HFT strategies – advantages such as special order types and preferred order matching engine practices. The problem with HFT is that it amounts to little more than opportunistic skimming that is only possible because HFTs have been accommodated with unfair and discriminatory advantages that assist them in getting an artificial and anti-competitive edge over public customers.

The author suggests an easy solution to this prevailing discrimination amongst various players. If the features that unjustly enrich HFT profitability are eliminated from electronic exchanges, either by regulators or by industry pressure, the adverse impact of HFT activity in the market will rapidly dissipate. HFT strategies will still exist, but their role will once again be limited by their natural scale and volume.

The essay also makes an interesting point that the media, politicians, HFT critics etc. are focusing on wrong issue, “speed”. The actual problem lies in “special order types” and “order handling treatment” that are keeping HFT players in the game. Unless something is done about this, the HFT problem will persist. Currently the bulk of modern HFT volumes are executed with HFT-oriented special order types, accounting for a significant proportion of the total US market volume. Actually the rise of dark pools is a natural consequence of a prolonged neglect of this issue. When the institutional investors lost confidence in the lit markets, they gradually moved to dark pools which are currently about 40 in number.

The crux of the essay is this:

The alpha of HFTs is in the “order type”, and overwhelms the alpha in many quantitative signals and strategies. The sad part is that alpha is not even talked about much in the media in the last 5-7 years. Everyone is focusing on the wrong issue, “speed”. Speed doesn’t help if your order will kicked to the back of the queue because you are not using the abusive special order types.

II – HFT Scalping Strategies

In this essay, the author talks about HFT scalping strategy that is the basis for many strategies in the market place. To understand this strategy, the essay talks about three aspects, i.e. the intentions, the key properties and observable effects. I am adding a fourth aspect in the context of this discussion, i.e. what are the factors that are driving the adoption of this strategy?

Intentions

• Its core intent is, on every round trip trade, to step ahead of supply-and-demand imbalances evident in market depth, and to capture a micro-spread by closing on the other side for a tick or to scratch out by closing on the same side, both of which are favorably subsidized by rebate in the maker-taker market model that is currently prevalent in US equities.
• Manage sweep risk / a large informed swing trade.
• Capture rebates and minimize losses from scratch trades

Key Properties

• High Frequency Turnover – passive scalping of a micro-spread
• Queue Position – a dependence on order rank and order book depth
• Low Latency – precise and timely reaction to market microstructure events. It is default for any electronic strategy in today’s world.
• Exchange Microstructure – usage of special order types and order matching engine features
• Rebate Capture – subsidized costs through “post only” orders and tiered rebates
• Low Risk Tolerance – avoidance of risk and usage of market book depth to reduce risk
• Superior cancellation latency

What’s in Favor of this strategy?

• unfair order handling practices that permit HFTs to step ahead of investor orders in violation of price-time priority
• unfair rebooking and repositioning of investor orders that permit HFTs to flip out of toxic trades
• unfair conversion of investor orders eligible for maker rebates into unfavorable executions incurring taker fees
• unfair insertion of HFT intermediaries in between legitimate customer-to-customer matching
• unfair and discriminatory order handling of investor orders during sudden price movements

Observable effects

• Lot of IOC orders, High order cancellation rates
• Mere algo trading strategies don’t work as HFT players employ hybrid strategies, i.e. use algo strategies and special order types
• Sudden and dramatic withdrawal of liquidity
• Price fluctuations – Quote pulling by major HFT players can create a cascade as other players also pull out of the market
• Market disruptions way beyond what the initial sweep trade might have created
• Popular techniques to limit market impact, such as order slicing and various weighted averaging strategies, can backfire when they interact with HFT scalping strategies employing special order types and market microstructure features.

III – Why HFTs have an advantage?

The author gives the following reasons

• “Spam and Cancel” orders: Rule 610 of REG NMS banned locked markets. This means a bid of \$X in one exchange cannot be displayed if there is an ask quote at \$X at another exchange. The implementation of REG NMS changed the mechanism for achieving queue position in a price-time priority market. The ban on locked markets created a massive number of strategies that were “spam and cancel” type. What exactly are they? If an order entered an exchange that could potentially lock the market, the exchange had a mechanism to slide the order. Immediately HFT guys sniff this situation, cancel their order and retry with a different order so as to get to the top of the order book. The result obviously is that large institutional orders get pushed back and the “spam and cancel” jockey for the top-of-the-book status.
• “Hide and Light” orders: Since exchanges were inundated with “spam and cancel” kind of orders, they created special order types like “Hide and Light” orders which clearly was “guaranteed economics” to the HFT player. He could be hidden at the NBBO and as soon market is unlocked, he gets to be at the top of the queue rather than the guy whose order was price-slided.
• ISO (Intermark Sweep Orders ) like IOC orders
• The DAY ISO

The chapter goes in to pretty elaborate description of these order types and the reasons they provide HFT traders with a massive advantage against the institutional and other traders.

HFT – A systemic issue

This essay is more positive about the recent developments that have happened in the second half of 2012 and hopes that the HFT volumes on the exchanges would go back to something like 20% of the total volume. The developments listed are

• Some exchanges, not all, have been quietly cleaning up since October 2011.
• Some of the more egregious HFT-oriented features appear to have been neutralized through order matching engine modifications.
• Certain HFTs and exchanges are admitting that they no longer have the cozy relationships they once had – this was certainly the case by second quarter 2012 when regulatory scrutiny was heating up.
• Certain electronic exchanges are on the path to becoming more open, transparent, better documented, and less exclusive. NASDAQ’s recent move to provide order type documentation certainly attests to this fact.
• The concept of SRO status of for-profit exchanges appears to be under scrutiny. At the bare minimum, exchanges are certain to be subjected to a higher level of regulatory oversight going forward.
• There appears to be some degree of industry consensus, and perhaps regulatory consensus, that HFT-oriented electronic exchanges went too far and full market reform is likely untenable.

The author makes a few predictions for various market entities in the times to come:

• Regulators – We will see regulators strengthen the regulatory oversight of self-regulating organizations.
• Exchanges – We will see exchanges eliminate discriminatory practices and enhance their level of disclosure.
• HFTs – We will see HFTs lose advantages and a new emphasis by such firms on market making status and order flow relationships.
• Sell Side – We will see electronic trading desks utilize sophisticated features of exchanges appropriately.
• Buy Side – We will see institutional investors take more responsibility for their execution performance.

The book also contains a paper that talks about Electronic Liquidity Strategy that aims to level the playing field between institutional investors that need to source liquidity and short-term liquidity providers that operate in an opportunistic and discriminatory manner.

REG NMS gave the HFT player a complex jungle of rules to exploit. In one of the essays, the author suggests ways to improve this jungle, i.e. the National Market System, to level the playing field. The book ends with excerpts from Haim Bodek’s interviews with various media outlets.

Takeaway:

The book is about 100 pages long and describes the core problem of HFT, i.e. the “special order types and order handling treatment” combination. This has created a complex and fragmented market structure where the only game worth playing is the HFT game. If nothing is done to address this, the author predicts that dark pools will become even more important and investors will lose confidence in the lit markets.

This book introduces stochastic calculus in a very intuitive way, without burdening the reader with heavy math from measure theory and functional analysis. In the preface, the author says that a reader is deemed to have passed an examination based on the book if he/she can understand the derivation of Black-Scholes in the last chapter. If you pick up any standard textbook that proves Black-Scholes, you need to go through a ton of math before understanding Ito’s lemma, one of the major tools in deriving the price of a call option. So, the author’s ambitious objective (as stated in the preface) is that the book should teach Ito’s lemma to anyone with a basic understanding of probability and calculus. Does he meet the objective? Well, you would have to poll a large group of readers to get a fair answer to that question. My opinion is that the author has done a fantastic job of presenting the concepts and tools one needs to get a hang of before understanding the Black-Scholes PDE. How does the author manage it?

Chapter 1 : Preliminaries

Well, the book starts off with a basic introduction to probability and then quickly moves on to the definition of stochastic processes, explaining some of the ways to categorize them. Subsequently the book goes straight into the definition of Brownian motion and its properties. Through various visuals, the reader is made aware that a Brownian sample path is a nowhere differentiable function and has unbounded variation on any interval, which makes the standard Riemann integral useless for working with Brownian motion. Along the way, the chapter explains the principle behind simulating Brownian motion and some processes derived from it.
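
The simulation principle the chapter describes boils down to cumulating independent Gaussian increments. A minimal sketch in Python/NumPy (the parameter values are my own illustration, not the book’s):

```python
import numpy as np

rng = np.random.default_rng(0)

def brownian_path(T=1.0, n=1000):
    """Simulate standard Brownian motion on [0, T]: B_0 = 0,
    with independent N(0, dt) increments, cumulated."""
    dt = T / n
    increments = rng.normal(0.0, np.sqrt(dt), size=n)
    return np.concatenate(([0.0], np.cumsum(increments)))

t = np.linspace(0.0, 1.0, 1001)
B = brownian_path()

# A process derived from BM: geometric Brownian motion
# S_t = S_0 * exp((mu - sigma^2 / 2) * t + sigma * B_t)
mu, sigma, S0 = 0.05, 0.2, 100.0
S = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * B)
```

The same cumulative-sum trick underlies every simulated path in the book’s visuals; derived processes are just deterministic transformations of the simulated path.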

One can’t understand anything in stochastic processes without a proper grasp of conditional expectation. Presenting conditional expectation intuitively while touching upon all its properties and rules is a very tough act. The author starts off with a discrete conditional expectation variable and then takes a big jump by extending similar properties to the continuous case. Something peculiar about the conditional expectation random variable is that it has to be guessed! You need to guess the right form based on some constraints. Since knowing its particular form is not always feasible, one needs rules to work with it. The chapter introduces about seven of the most commonly used rules of conditional expectation.
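
The discrete case the author starts from can be made concrete with a toy experiment (the die/parity example is my own, not the book’s): E[X | Y] is a random variable that averages X within each level of Y, and one of the key rules, the tower property E[E[X | Y]] = E[X], holds by construction.

```python
import numpy as np

rng = np.random.default_rng(1)

# X = a fair die roll, Y = its parity. The random variable E[X | Y]
# takes one value per parity class: the average of X over that class.
rolls = rng.integers(1, 7, size=100_000).astype(float)
parity = rolls % 2

cond_exp = np.where(parity == 0,
                    rolls[parity == 0].mean(),   # ~ 4 (average of 2, 4, 6)
                    rolls[parity == 1].mean())   # ~ 3 (average of 1, 3, 5)

# Tower property: E[E[X | Y]] = E[X]
print(cond_exp.mean(), rolls.mean())
```

Note that cond_exp is itself a random variable (a function of Y), which is exactly the point the author labors over before moving to the continuous case.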

The final section of the chapter is on Martingales. Well, Martingales are at the heart of math finance; they constitute an important class of stochastic processes. The author introduces the concept of filtration and gives definitions for both continuous-time and discrete-time martingales. The key notion of the Martingale transform of a process by a previsible process is defined and explained via an analogy to a fair gambling game.
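
The fair-game analogy can be checked numerically. Below is a sketch with a toy betting rule of my own devising: the stake for each toss is previsible, i.e. decided from earlier tosses only, and the transformed process still has zero expected gain.

```python
import numpy as np

rng = np.random.default_rng(2)

n_paths, n_steps = 100_000, 20
steps = rng.choice([-1, 1], size=(n_paths, n_steps))  # fair-coin increments of a martingale M

# A previsible strategy: the stake for toss k may use only tosses < k.
# Toy rule: bet 2 after a win, 1 otherwise; the first bet is always 1.
stake = np.where(np.roll(steps, 1, axis=1) > 0, 2.0, 1.0)
stake[:, 0] = 1.0

# The martingale transform (H . M)_n = sum_k H_k * (M_k - M_{k-1})
gains = (stake * steps).sum(axis=1)
print(gains.mean())  # close to 0: no previsible strategy beats a fair game
```

Any other previsible rule gives the same conclusion; that stability under previsible betting is exactly what the Martingale transform formalizes.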

An interesting historical snippet is mentioned in this chapter. It’s about Karl Weierstrass, who was the first mathematician to come up with a nowhere differentiable function. At the time it was considered a mathematical curiosity. But think of Brownian motion, whose sample paths are nowhere differentiable: it has moved from the realm of pure math to applied math and is now used in a whole host of scientific disciplines.

Chapter 2 : The Ito Integral

The author starts from scratch by defining the Riemann integral and the Riemann-Stieltjes integral, the latter being used to integrate a function with respect to another function. Subsequently he poses a question about the possibility of using the Riemann-Stieltjes integral to integrate a function with respect to Brownian motion, and the chapter shows through various established theorems that it cannot be used. All the while, there are no formal proofs for most of the statements, as that is not the purpose of the book; it is supposed to be non-rigorous, and it is. The chapter then introduces Ito’s integral. For those who always associate an integral with a single number, it might come as a surprise that Ito’s integral is a probabilistic average: since one cannot integrate with respect to every Brownian path, a probabilistic average is all one gets out of Ito’s integral.

Now how do you define Ito’s integral? Mathematicians love building stuff from the ground up. So, Ito’s integral is first defined for a simple step-function process using the Riemann-Stieltjes framework, which yields a probabilistic average for the integral. Then Ito’s integral of a general integrand is defined as a limit of Ito’s integrals of simple processes. This is a pattern you see everywhere in math: if it is difficult to define a complicated object, you define it for simple objects first, then use those simple objects to construct the general one. For someone familiar with measure theory, this is standard stuff: the Lebesgue integral of a non-negative function is defined via Lebesgue integrals of simple measurable functions.
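
The limit construction can be illustrated numerically: freeze the integrand at the left endpoints of a fine partition (that is the simple step process) and sum. For the classic example ∫₀ᵀ B dB the known closed form is (B_T² − T)/2, and the left-endpoint sums converge to it in mean square. A sketch with illustrative step and path counts:

```python
import numpy as np

rng = np.random.default_rng(3)

T, n, n_paths = 1.0, 2000, 2000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])  # integrand frozen at left endpoints

# Ito's integral of B w.r.t. B as a sum over the simple step process
ito = (B_left * dB).sum(axis=1)

# Known closed form: int_0^T B dB = (B_T^2 - T) / 2
exact = 0.5 * (B[:, -1]**2 - T)

mse = np.mean((ito - exact)**2)
print(mse)  # small: the convergence is in mean square
```

Refining the partition (larger n) drives the mean-square error further toward zero, which is precisely the sense in which the limit is taken.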

Since Ito’s integral is itself a random process, one can compute the expectation function and covariance function of the process. The chapter shows the beautiful connection between Ito’s integral and Martingale theory: by using the Riemann-Stieltjes sums in a specific way, i.e. evaluating the integrand at the left end point of each time interval, the resulting Ito’s integral becomes a Martingale, enabling one to use the rich theory of martingales. One downside is that the classic chain rule and product rule of calculus no longer apply to Ito’s integral.
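
Both moments can be spot-checked by Monte Carlo: for I_T = ∫₀ᵀ B dB, the martingale property gives E[I_T] = 0, and the Ito isometry gives Var(I_T) = ∫₀ᵀ E[B_u²] du = T²/2. A sketch (sample sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

T, n, n_paths = 1.0, 500, 20_000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B_left = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dB, axis=1)[:, :-1]])

I = (B_left * dB).sum(axis=1)  # samples of I_T = int_0^T B dB

# Martingale property: E[I_T] = 0
# Ito isometry:        Var(I_T) = int_0^T E[B_u^2] du = int_0^T u du = T^2 / 2
print(I.mean(), I.var())
```

The sample mean hovers near 0 and the sample variance near 0.5, matching the two formulas above.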

One needs tools to solve ODEs and PDEs; the same is the case with Ito’s SDEs. Ito’s lemma, the most useful tool for solving SDEs, is introduced in this chapter. Various forms of Ito’s lemma are presented, depending on whether the function is driven by Brownian motion, an Ito process, or a combination of Ito processes. The chapter ends with the introduction of the Stratonovich integral, which evaluates the integrand in a different way. Despite not being a Martingale, the Stratonovich integral retains the classic chain rule and product rule, and hence its utility.
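
The difference between the two evaluation rules is easy to see numerically: midpoint (Stratonovich) sums obey the classic chain rule, ∫ B ∘ dB = B_T²/2, while left-endpoint (Ito) sums pick up the −T/2 correction. A sketch with illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(5)

T, n, n_paths = 1.0, 2000, 2000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
B = np.cumsum(dB, axis=1)
B_left = np.hstack([np.zeros((n_paths, 1)), B[:, :-1]])

ito = (B_left * dB).sum(axis=1)                # left-endpoint evaluation
strat = (0.5 * (B_left + B) * dB).sum(axis=1)  # midpoint (Stratonovich) evaluation

# Classic chain rule survives for Stratonovich: int B o dB = B_T^2 / 2,
# while the Ito version carries the extra -T/2 correction term.
err_strat = np.mean((strat - 0.5 * B[:, -1]**2)**2)
err_ito = np.mean((ito - 0.5 * (B[:, -1]**2 - T))**2)
print(err_strat, err_ito)  # both small
```

The only change between the two integrals is where the integrand is evaluated inside each partition interval, yet the limits differ by T/2.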

Thus this chapter connects the Riemann-Stieltjes integral, Ito’s integral, the Stratonovich integral, Martingales and the Martingale transform of a process in such a way that the reader gets a fair idea of the Ito process and its properties. In all cases, one needs to get used to the fact that the integral values are stated in the sense of mean-square convergence. “Integrals are probabilistic averages” is one of the resounding takeaways from this chapter.

Chapter 3 : Ito’s Stochastic Differential Equations

The chapter starts off by defining a random differential equation and a stochastic differential equation, and explains the difference between the two using visuals. The first thing to notice is that the stochastic differential equation is actually in integral form. Since differentiation of a Brownian path is meaningless, the SDE is stated in integral form, though one can write it in differential form as symbolic shorthand. The author systematically builds up examples and visuals to explain ways to crack an Ito SDE.

Obviously there is no single way to crack it; it depends on the type of SDE. A straightforward application of Ito’s lemma to a hypothesized solution, followed by solving PDEs, might work in some cases; transformations might work in others. In this context, the chapter also has an interesting section on using the Stratonovich integral. By working through this chapter, a reader is fairly well equipped with techniques to solve elementary to moderately difficult SDEs. The chapter ends with a discussion of the numerical analysis of solutions to SDEs. Most of the time one wants a handle on the distribution of the solution, and in this context the author shows two popular numerical schemes, the Euler approximation and the Milstein approximation.
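
The two schemes can be sketched for geometric Brownian motion, where the exact solution is known and serves as a benchmark (coefficients below are illustrative, not the book’s):

```python
import numpy as np

rng = np.random.default_rng(6)

# GBM: dX = mu * X dt + sigma * X dB, with exact solution
# X_t = X_0 * exp((mu - sigma^2 / 2) * t + sigma * B_t)
mu, sigma, X0, T, n = 0.05, 0.4, 1.0, 1.0, 1000
dt = T / n
dB = rng.normal(0.0, np.sqrt(dt), size=n)

euler = np.empty(n + 1)
milstein = np.empty(n + 1)
euler[0] = milstein[0] = X0
for k in range(n):
    euler[k + 1] = euler[k] * (1 + mu * dt + sigma * dB[k])
    # Milstein adds 0.5 * sigma * (d/dx of the diffusion term) * (dB^2 - dt)
    milstein[k + 1] = milstein[k] * (1 + mu * dt + sigma * dB[k]
                                     + 0.5 * sigma**2 * (dB[k]**2 - dt))

exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dB.sum())
print(abs(euler[-1] - exact), abs(milstein[-1] - exact))
```

Euler converges with strong order 1/2 while Milstein’s correction term lifts the strong order to 1, which is why the Milstein path typically tracks the exact solution more closely for the same step size.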

Chapter 4: Application of Stochastic Calculus in Finance

If a reader has worked through the first three chapters, i.e. around 170 pages of the book, the author assumes that the section on the Black-Scholes derivation should be easy on the eyes. My guess is that if a reader puts in some decent effort going through those 170-odd pages, he/she is more or less certain to understand the derivation of the Black-Scholes PDE.
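
For concreteness, the end product of that derivation, the Black-Scholes call price that solves the PDE, fits in a few lines (the parameter values in the usage line are my own illustration):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call(S, K, r, sigma, T):
    """Black-Scholes price of a European call: the solution of the B-S PDE."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    return S * N(d1) - K * exp(-r * T) * N(d2)

print(round(bs_call(100, 100, 0.05, 0.2, 1.0), 4))  # about 10.45
```

Everything in the first three chapters, Brownian motion, Ito’s integral and Ito’s lemma, exists to justify the PDE whose solution this formula is.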

Well, even though the author seems to focus on “understanding the Black-Scholes SDE” as the single test of a reader’s understanding, I think this book will open up many paths for a curious reader. Maybe you will like the way Martingales are introduced and decide to spend some time understanding Martingales. Maybe you will appreciate that solving an SDE is more like trying out various techniques, and you will venture out to understand and master those techniques. Maybe you will want to think about how to solve an SDE numerically, etc. Well, the list can go on. So, I guess this book will be a pleasure to read for anyone who is interested in the world of stochastic calculus and wants a fairly non-rigorous introduction. Again, non-rigorous is a relative term: something that is non-rigorous according to the author might be fairly challenging for some readers. It goes without saying that this is a must-read for any math-fin student, as it provides superb intuition for formulating and solving SDEs.