November 2011


This book helps one “get up to speed” with git. With GitHub being the go-to repository host for most of the R projects out there, it has become imperative for any R programmer to have a decent knowledge of git and of the ways to interact with GitHub. As of today GitHub has 1.1 million programmers hosting close to 3.2 million repositories, having surpassed SourceForge long ago. So, an R programmer cannot be ignorant of this distributed version control system. Personally I found it a stretch to have a version control system. But once I started using it for a few projects along with the ProjectTemplate directory structure, I found that it helped me carry out analysis in a better way. So, how do you go about using it? Well, there are many plug-ins to get going on git. Before you start using a plugin, though, you need to get the vocabulary of git.

There are a lot of terms that might be overwhelming for a newbie to begin with. Here is a sample:
tag, stage, tracked, untracked, branch, working tree, master, head, treeish, SHA-1, caret parent, tilde parent, patch, stash, diff, clone, push, rebase

Even if you use a plug-in like EGit (the one I use), you have to understand the basics of git and the terminology that goes with it. Git is supremely elegant to use but at the same time, understanding its internals takes some time. My objective was mainly to use git effectively and hence this book was kind of ideal to begin with. The book starts off by giving some basic principles of distributed version control systems (git is an example of a DVCS). One of the main uses of git for me is that I am tired of maintaining v_1, v_2 files, and my working directory had become huge. Git forces me to think in versions and thus relieves me of maintaining the versions manually. Git works with snapshots, and since it is fully distributed, you have the entire repository on your local machine. You no longer check files in and out of a central server. This is a crucial difference between git and other systems. Here’s some history on git. Git was developed by Linus Torvalds after the relationship between the Linux community and BitKeeper, a proprietary DVCS, broke down. The purpose was to create an open source DVCS with the following goals:

  • Simple design
  • Strong support for non-linear development (thousands of parallel branches)
  • Fully distributed
  • Able to handle large projects like the Linux kernel efficiently (speed and data size)

The first version was released in 2005 and in the years since, its popularity has grown far and wide.

So, what are the advantages of using git?

  • Git is fast ( you can do everything from a simple command line)
  • Easy to learn( I don’t know about this as it depends on what you want to learn about git!)
  • Git offers a staging area (an immensely useful feature if you have gone through other versioning systems where a file is simply checked in or checked out by a user)
  • GitHub is available for sharing – I guess this is one of main reasons for its growing popularity.

The book then gives a quick tour of commands for DOS and Linux. It then dives into the bare minimum stuff that you need to know about git. They are the following:

  • git init
  • git add
  • git commit
  • git status
  • git diff : shows unstaged changes, i.e. the difference between the working directory and the staging area
  • git log
  • git diff SHA-1-a SHA-1-b
  • git describe
  • git tag
  • gitk
  • git branch
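Commands like these are easiest to absorb by trying them on a throwaway repository. A minimal first session might look like the following sketch (the file name and commit message are made up for illustration; the config lines are only needed if you have not yet set your identity):

```shell
mkdir git-demo && cd git-demo
git init                               # create an empty repository in .git/
git config user.email "you@example.com"
git config user.name "Your Name"
echo 'x <- 1' > analysis.R
git status                             # analysis.R is listed as untracked
git add analysis.R                     # stage the file (it is now tracked)
git commit -m "first commit"           # record a snapshot of the staged files
git log --oneline                      # one line per commit, with its SHA-1
git tag v0.1                           # label the current commit
```

After this, `gitk` (if installed) shows the same history graphically.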

The book then takes you to some advanced-level git commands that will be helpful in big multi-user projects. The following illustration (via Scott Chacon’s Pro Git) gives an idea of the life cycle of a file in git.


Each file in your working directory can be in one of two states: tracked or untracked. Tracked files are files that were in the last snapshot; they can be unmodified, modified, or staged. Untracked files are everything else – any files in your working directory that were not in your last snapshot and are not in your staging area. When you first clone a repository, all of your files will be tracked and unmodified because you just checked them out and haven’t edited anything.

The book shows the various states a file can be in, and the appropriate git commands that need to be used at each stage. One way to explore the above life cycle is to add files in interactive mode (git add -i). In interactive mode, git shows options like status, update, revert, add untracked, patch, diff, quit and help. Each of these options is useful in a specific scenario depending on the file status.

Some of the commands explained in this section are

  • git diff : shows unstaged changes (working directory vs. staging area)
  • git diff --cached : shows staged changes (staging area vs. the last commit)
  • git commit -a -m : the -a flag lets you skip the staging area and commit every file that is already tracked
  • git reset
  • git stash
  • git push
  • git rebase
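A sketch of how a few of these commands fit together, again on a throwaway repository (all names here are made up; this assumes git is installed):

```shell
mkdir git-demo2 && cd git-demo2
git init && git config user.email "you@example.com" && git config user.name "You"
echo 'x <- 1' > analysis.R
git add analysis.R && git commit -m "initial"
echo 'y <- 2' >> analysis.R
git diff                  # unstaged change: working directory vs staging area
git add analysis.R
git diff --cached         # staged change: staging area vs last commit
git reset analysis.R      # unstage; the edit stays in the working directory
git stash                 # shelve the uncommitted change, restoring a clean tree
git stash pop             # bring the shelved change back
```

`git push` and `git rebase` need a remote and a second branch respectively, so they are left out of this single-repository sketch.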

The book ends with a nice tutorial for a) setting up a repository on GitHub, b) cloning the repository on your local machine and c) using various git commands to sync your working directory with the remote repository.

Takeaway:

With screencasts and a large font size, this book is a quick read. Perfect for a newbie who wants to understand just enough to start experimenting with git.


Dan Roam, the author of the popular books “The Back of the Napkin” and “Unfolding the Napkin”, has written a new book with the message that one must fuse our linear, word-based thought process with pictures/images/visuals to comprehend things in a better way. Well, this message is not something new. Presentations with visuals catch our attention rather than boring bullet-point slides. To some extent my book summaries might also be boring as they are merely words, although I try to put in some visuals to make them interesting.

This book has three parts. The first part talks about the fact that we are drowning in blah-blah-blah and in the process losing our clarity of thought. The second part talks about vivid thinking, a jazzy term for using the visual and verbal thought processes together to get to the message/ideas that we ought to understand and to filter out the stuff that is noise.


The author uses an analogy of the Fox and the Hummingbird to distinguish between the verbal and visual skills necessary to think. The Fox is linear, analytic, patient, clever and smug (representing our verbal mind). The Hummingbird is spatial, spontaneous, a synthesizer, and easily distracted (representing our visual mind). This book argues for a balance between Fox and Hummingbird traits.


Why should we care about developing these skills? Well, there is a lot of blah-blah that we get to hear in our lives, be it in corporate presentations, conference calls, customer meetings, etc., and most of the time the idea is either complicated, missing or outright misleading. The intent of blah-blah-blah could also be to obfuscate or deviate from the original intent. Hence it is imperative to extract the right message from all the blah-blah-blah, and that is where these Fox and Hummingbird skills are useful.

We have all been drilled in Fox/verbal skills right from our childhood. However, Hummingbird skills have not been consciously taught to most of us and hence we develop a phobia about using visual skills/images/drawings to understand things. It is merely a phobia, says the author, who offers a framework, i.e. a visual grammar, to get going on developing these skills. The framework has six elemental pictures:


  • Portraits are visual representations of nouns and pronouns (WHO and WHAT)
  • Charts are visual representations of quantity (HOW MANY)
  • Maps are visual representations of prepositions and conjunctions (WHERE)
  • Timelines are visual representations of tense (WHEN)
  • Flowcharts are visual representations of complex verbs (HOW)
  • Multivariable plots are visual representations of complex objects – a synthesis of all the above elements answering WHY

The last part of the book shows the reader a way to see the forest for the trees by distilling the attributes of a vivid idea. It uses the acronym FOREST to capture the essential attributes of an idea: F = Form, O = Only the Essentials, R = Recognizable, E = Evolving, S = Span Differences, T = Targeted.

Whether you believe in this structured framework or not, this book will make you realize that visual imagery is imperative for better understanding of the ideas you get to read/write/talk in the verbal/wordy world.


Yesterday was a rather unpleasant day for me. Whenever I am in a bad mood or become restless, I pick up this book by Anne D. LeClaire, titled “Listening Below the Noise”. For some reason this book has been my best companion over the past few years. I reread parts of it from time to time. This time around, I thought I should blog about it and write a few words about the book.

The author starts off by saying, “Like too many of us, I mistook a busy life for a rich one”, and cites her mid-life emptiness as one of the reasons for exploring “silence”. While strolling on a Cape Cod beach on a warm afternoon in January 1992, she suddenly realizes that “silence” is the missing element in her life and decides to consciously inculcate it into her daily life. She then begins, on a whim, a unique ritual: maintaining a day of complete silence every other Monday.

Entry:

On the first day of silence, she realizes that her energy and productivity levels are far better than on a normal day. She begins to wonder whether turning away from the outside world is the first step to nourishing the inner world that is so very vibrant in every one of us. The first thing she realizes is that silence and mindfulness reduce the habit of multi-tasking in life. When we are mindfully doing something, we cannot multi-task. Silence makes one realize that single-tasking is the way to do any work and is a far better and more peaceful alternative to multi-tasking. Is silence necessary for creativity? The author’s first brush with silence makes her feel that silence/solitude is probably a necessary condition for doing creative work. In silence, our minds are more tuned to the creative energy that lies within us.

Staying in complete silence is not an easy practice, as everyone would agree. The author reflects on one such day when she has an immense urge to break her “silence” tradition. She reveals one of her tricks, i.e. striking a bargain with oneself that it would be only a couple of hours of silence instead of one full day. This little trick sometimes becomes useful in centering her. However, these bargains are only helpful for a short time. The quieter we become, the more we hear. What we hear might be unpleasant at times and hence there is a tendency to fill up the silence with noise, i.e. songs/TV/soap operas/movies/empty chatter. On some level, we are afraid of what we will hear.

When we are in silence, there is one thing most of us definitely become aware of: a muddy, messy space, our inner garbage heap, where we toss the scraps too painful to consider or confront; the loss, pain, grief and disappointment that are all too real. When we weed out extraneous stimulation and let go of the reins of control, these things claim our attention. And it is silence that allows us the space and stillness in which to think about our motives, to examine our behavior, to see where we’ve fallen short. It is only when we drag our smallest, shabbiest parts into the light that we can move toward becoming whole. We are massively afraid of dealing with this compost, and that is why we fill our lives with distractions, so that we never have to meet it. However, the wise thing to do is to actually confront this compost, to turn it over. For within the same mounds lies the fertile matter out of which new life arises and is nourished.

Cultivation:

It is also sometimes necessary to create boundaries in life for silence to walk into our lives. Maybe cutting down on TV, avoiding idle chatter, consciously weeding out distractions, etc. are the bare minimum steps one must take for silence to make its presence felt. There is no guarantee that these restrictions on external stimulation will help one remain silent, as there is sometimes that internal chatter that drives away the silence ruthlessly.

Silence is Janus-faced. Like the Roman god Janus, silence holds two faces. To be silenced is not at all the same as choosing not to speak. A chasm lies between the two, as wide as that between fasting for a purpose and starvation. To be silenced is crippling, belittling, constricting, disempowering. Chosen stillness can be healing, expansive, instructive.

Let’s say you decide to have one day of complete silence. What if others in your family pick a fight for some trivial reason? The fact that you have chosen not to respond might fill you up with more resentment, as you cannot vent your feelings. So, is silence painful in such situations? Not necessarily, says the author, narrating one incident when silence helped in not messing up a situation and was invaluable in providing the time and space to defuse it.

Silence enhances listening ability, i.e. listening to the things that are left unsaid. Bird-watchers and naturalists typically develop a keen sense of listening. One colleague of mine is very excited whenever he sees tigers/butterflies/birds on a jungle safari. To observe/spot/identify a bird or butterfly demands a sense of stillness from a person. Unless you are fully present in the NOW and develop a keen ear, going on a jungle safari is practically useless. The ability to remain silent is a great skill to have in such activities. Similarly, in our daily lives, unless we are silent we cannot truly listen to others. We are merely talking and waiting to talk. We engage in chatter, not conversation, and our chatter reveals our egos’ needs: love me, admire me, envy me, fear me, help me, see me. There is little space for truly hearing others. Silence in that sense enables true conversations to take place.

Fertilization:

Silence is the first essential for most creative endeavors. Even a creative act like choreography might need the choreographer to visualize the steps in silence and then make his/her troupe follow them. Personally I find silence very stimulating, be it for programming something, writing a blog post, or practicing music. The complete and undistracted concentration we can give to work enables us to bring out the best within all of us. Like the solitary spider who busily weaves her web in perfect silence, we need to be alone and quiet for our subconscious to spin its creations. Stillness focuses the brain. And like the tensile strength of the spider’s strands, it buttresses and strengthens creativity. Another beneficial aspect of silence is that it produces a marked improvement in physical and psychological health. At what age do we develop a low tolerance for quiet time? When do we begin to call it boredom? When do we begin the excessive yearning for entertainment and diversions? The prospect of loneliness can make us fear solitude and silence. The paradox is that those are the very measures that can heal us. The author concludes this section by pondering her experience of remaining silent for an entire week.

Harvest:

After nine years of practicing this ritual, the author develops a sudden resistance to it. She has an intense urge to put an end to the ritual but finally bargains another two months with herself. During these two months, she reflects on the benefits of silence through the years. She also remembers her yoga teacher’s words:

Sometimes we go through a time that looks like a setback but in reality that time is a place of preparation. A resting space. A gathering of energy. Like an archer pulling back the bowstring so the arrow can shoot forward.

She also realizes another benefit that silence bestows on people who practice it: the inner strength it provides. Trees growing in a forest are fundamentally weaker and less able to weather wind and storms than ones that stand alone, because the solitary trees, without the shelter provided by the others, develop stronger, deeper roots. After the two-month bargain period, the author realizes that her silent-day experiences have helped her develop a spiritual way of living and are no longer a mere ritual. So she decides to continue the practice (to date).

Sowing:

In the final section of the book, the author provides a few ways to incorporate silence into our daily lives.

I have read this book at least half a dozen times till now and every time it has made me appreciative of the importance of silence and solitude in one’s life. Sometimes when I am restless, I just pick up this book and read a random paragraph. The writing is so beautiful that it brings my restless mind to calmness and allows me to focus on my work.



The author of “The R Inferno”, Patrick Burns, starts off by saying, “If you are using R and you think you’re in hell, this is a map for you”. Well, this fantastic book needs to be read by any R programmer, irrespective of whether he thinks he is in hell or not. The metaphor used in this book is that of a journey through concentric circles, each circle representing programmers who are suffering in pain for “violating the proper programming conduct”. Using this metaphor, the author makes an amazing list of items that one needs to keep in mind while programming in R, with a good discussion of each item. My intent in this post is merely to list down the main points of the book.

Circle 1: Falling into the Floating Point Trap

  • Be careful with the floating point representation of numbers. There will always be numerical errors, which are very different from logical errors
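A quick illustration of the trap in R, using the classic example values:

```r
0.1 + 0.2 == 0.3                    # FALSE: neither side is stored exactly
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: all.equal compares with a tolerance
```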

Circle 2: Growing Objects

  • Preallocate objects as much as possible
  • Try to get an upper bound of the vector you will need and allocate the vector before you run any loop. Limit the number of times rbind is used in a loop
  • If you do not know how many elements will get added in each loop, populate the data for each iteration in to a list and then collapse the list in to a data frame
  • R does all the computation in RAM. It means quicker computation but it means that if you are not careful it will eat up all your RAM
  • Error: cannot allocate vector of size 79.8 Mb. This should not be interpreted as “Well, I have X GB of memory, so why can’t R allocate 80 MB?”. It means R has reached a point where it cannot find a block of memory that large to allocate
  • To check the memory that is being used up, generously scatter lines like the following through your code:
  • cat('point 1 mem', memory.size(), memory.size(max=TRUE), '\n')
  • memory.size() and memory.limit() give an account of the memory used up and the memory still available
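The preallocation advice above can be sketched in a few lines of R (the length n is arbitrary):

```r
n <- 10000

# Bad: x is copied and re-allocated on every iteration
x <- numeric(0)
for (i in 1:n) x <- c(x, i^2)

# Better: allocate the full vector once, then fill it in
y <- numeric(n)
for (i in 1:n) y[i] <- i^2

# If the final length is unknown: collect pieces in a list, collapse at the end
pieces <- vector("list", n)
for (i in 1:n) pieces[[i]] <- i^2
z <- unlist(pieces)

identical(x, y)   # TRUE: same result, very different cost
```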

Circle 3: Failing to Vectorize

  • Write functions / code which inherently handles vectorized input
  • Vectorization does not mean treating collection of arguments as a vector.
  • min, max, range, sum, and prod treat their collection of arguments as a vector. mean does not adhere to this form: mean(1,2,3,4) gives 1, whereas mean(c(1,2,3,4)) gives the right answer, 2.5
  • Vectorize to have clarity in the code construction.
  • Subscripting can be used as a vectorization tool
  • Use ifelse instead of if to help vectorize your code; vector is not a welcome input in if condition.
  • Use apply/tapply/lapply/sapply/mapply/rapply etc., or inherently vectorized functions, instead of writing loops
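For example, replacing an if-in-a-loop with ifelse, as the bullets above suggest:

```r
x <- c(-2, -1, 0, 1, 2)

# Loop version: if cannot handle a vector condition
out <- numeric(length(x))
for (i in seq_along(x)) {
  if (x[i] < 0) out[i] <- 0 else out[i] <- x[i]
}

# Vectorized version: one expression, no loop
out2 <- ifelse(x < 0, 0, x)
identical(out, out2)   # TRUE
```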

Circle 4: Over-Vectorizing

  • The apply function has a for loop inside, and lapply merely hides its loop. Hence mindless application of these functions is flirting with danger
  • If you really want to change NAs to 0, you should rethink what you are doing – you are introducing fictional data

Circle 5: Not Writing Functions

  • The body of a function needs to be a single expression. Curly brackets convert a bunch of expressions into a single expression
  • Functions can be passed as arguments to other functions
  • allows you to provide the arguments as an actual list
  • Don’t use a list when atomic vector will do
  • Don’t use a data frame when matrix will do
  • Don’t try to use an atomic vector when list is needed
  • Don’t use a matrix when data frame is needed
  • Put spaces between operators and indent the code
  • Avoid superfluous semicolons carried over from other programming languages
  • Rprof can be used to explore which functions are taking most of the time
  • Write a help file for each of your persistent functions.
  • Writing a help file is an excellent way of debugging the function.
  • Add examples while writing a help file and try to use data from the inbuilt datasets package

Circle 6: Doing Global Assignments

  • Avoid global assignments (<<-). Code is extremely inflexible when global assignments are used
  • R always passes arguments by value. It never passes by reference
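A small demonstration of pass-by-value (the names f and x are made up for illustration):

```r
f <- function(v) {
  v[1] <- 99   # modifies only the function's local copy
  v
}
x <- c(1, 2, 3)
f(x)    # returns c(99, 2, 3)
x[1]    # still 1: the caller's x is untouched
```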

Circle 7: Tripping over Object Orientation

  • S3 methods rely on the “class” attribute. Only if an object has a “class” attribute do S3 methods really come into effect
  • When a generic function is called on an S3 object, R searches for a function whose name combines the generic’s name and the object’s class, and executes it
  • getS3method(“median”,”default”)
  • Inheritance should be based on similarity of the structure of the objects , not based on similarity of concepts. Dataframe and matrix might look similar conceptually, but they are completely different as far as code reuse is concerned. Hence inheritance is useless between matrix and dataframe
  • There is multiple dispatch in S4 objects
  • UseMethod creates an S3 generic function
  • standardGeneric creates an S4 generic function. There are stricter guidelines for S4 classes and objects
  • In S3 the decision of what method to use is made in real-time when the function is called. In S4 the decision is made when the code is loaded into the R session. There is a table that charts the relationships of all the classes.
  • Namespaces : If you have two functions with the same name in two different packages, namespace allows you to pick the right function.
  • A namespace exports one or more objects so that they are visible, but may have some objects that are private.
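A minimal sketch of S3 dispatch, tying the points above together (the class name `circle` and the generic `area` are made up for illustration):

```r
circle <- function(r) structure(list(r = r), class = "circle")

area <- function(shape) UseMethod("area")        # the generic
area.circle <- function(shape) pi * shape$r^2    # found by name: generic.class
area.default <- function(shape) stop("no area method for this class")

area(circle(1))                 # dispatches to area.circle, giving pi
getS3method("area", "circle")   # inspect a method, as with getS3method("median", "default")
```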

Circle 8: Believing It Does as Intended


In this circle there are ghosts, chimeras and devils that inflict the maximum pain

Ghosts

  • browser(), recover(), trace() and debug() are THE most important functions in R debugging
  • Always use prebuilt null-check functions such as is.null
  • Objects have one of the following atomic storage modes: logical, integer, numeric, complex, character
  • The == operator and the %in% operator – their importance and relevance
  • sum(numeric(0)) is 0 and prod(numeric(0)) is 1
  • There is no median method that can be applied to data frame.
  • match only matches first occurrence
  • cat prints the contents of a vector. While using cat you must always add a newline, as by default it doesn’t add one
  • cat interprets the string whereas print doesn’t
  • All coercion functions strip the attributes from the object
  • Subscripting almost always strips almost all attributes
  • It is extremely good practice to use TRUE and FALSE rather than T and F
  • sort.list does not work for lists
  • attach and load put R objects on to the search list. Attach creates a new item in the search list while load puts its content in the global environment, the first place in the search list. source is meant to create objects rather than loading actual objects
  • If you have a character string that contains the name of an object and you want the object, then one uses get function
  • If you want the name of the argument of the function, you can use deparse(substitute(arg_name))
  • If a subscript operation selects a single row or column of a matrix, the result becomes a vector, not a matrix. If you use drop=FALSE, the dimensions are preserved
  • Failing to use drop=FALSE inside functions is a major source of bugs.
  • The drop function has no effect on a data frame. Always use drop=FALSE in the subscripting function
  • rownames of a data frame are lost through dropping. Coercing to a matrix first will retain the row names.
  • If you use apply with a function that returns a vector, that becomes the first dimension of the result. I came across this umpteen number of times in my code and I just used to have the result transposed.
  • sweep function is a very useful function that is not emphasized much in the general r literature floating around
  • guidelines for list subscripting
    • single brackets always give you back the same type of object
    • double brackets need not give you the same type of object
    • double brackets always give you one item
    • single brackets can give you any number of items
  • c function works with lists also
  • for(i in 1:10) i does not print anything, because auto-printing does not happen inside a loop. You must use print(i) instead
  • use of seq_along or seq(along=x) is always better
  • The loop index is sacrosanct. I never knew this earlier: if you have a for loop with index i and you change the value of i inside the loop, it does not affect the loop’s counter
  • R uses lexical scoping rather than dynamic scoping
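The drop=FALSE point above bites often enough to deserve two lines of illustration:

```r
m <- matrix(1:6, nrow = 2)
dim(m[1, ])                 # NULL: a single row silently became a plain vector
dim(m[1, , drop = FALSE])   # 1 3: still a matrix
```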

Chimeras

  • factor : factors are an implementation of the idea of categorical data. The class attribute is “factor”, and the “levels” attribute holds a character vector giving the identity of each category
  • Factors are not numbers. as.numeric() on a factor gives the internal level codes, which typically have nothing to do with the values the factor represents
  • subscripting does not change the levels of the factor . Use drop=TRUE to drop the levels that are not present in the data.
  • Do not subscript with factors
  • There is no c for factors
  • Missing values make sense in factors, and there can even be an NA level in a factor
  • If you want to convert data frame to character, it is better to convert to a matrix and then convert to a character
  • X[condition,] <- 999 Vs X[which(condition),] <- 999. What’s the difference ? The latter treats NA as false while the former doesn’t
  • There is a difference between && and &, and similarly between || and |. The single forms are vectorized while the double forms work on single elements: use & and | in an ifelse condition and && and || in an if condition
  • An integer vector tests TRUE with is.numeric. However as.numeric() changes the storage mode to double
  • Be careful to know the difference between max and pmax
  • all.equal and identical are two different functions altogether
  • = is not a synonym of <-
  • sample has a helpful feature that is not always helpful: its first argument can be either the population of items to sample from, or the number of items in the population
  • The apply function coerces a data frame into a matrix before the application. It is better to use lapply instead of apply to keep the attributes of a data frame intact
  • If you think something is a data.frame or a matrix, it is better to use x[,”columnname”]
  • names of a dataframe are the column names while names of a matrix are the names of the individual elements
  • cbind with two vectors gives a matrix , meaning, cbind favors matrices over data.frames
  • data.frame is implemented as a list. But not just any list will do – each component must represent the same number of rows.
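The first two factor traps above, in code:

```r
f <- factor(c("10", "20", "20"))
as.numeric(f)                  # 1 2 2: the internal level codes, not the values
as.numeric(as.character(f))    # 10 20 20: the usual way to recover the numbers

g <- f[f == "20"]
levels(g)                      # "10" "20": subscripting keeps unused levels
levels(f[f == "20", drop = TRUE])   # "20": drop = TRUE discards them
```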


  • read.table creates a data.frame
  • colClasses to control the type of input columns that are imported
  • use strip.white to remove extraneous spaces while importing files
  • scan and readLines function to read files with irregular data format
  • Instead of storing data in a text file and reading it back into R, it is better to save the object and attach/load it as and when required
  • The function given to outer must be vectorized
  • can be used to access … in the argument of the function
  • R uses lazy evaluation. Arguments to functions are not evaluated until they are required
  • The default value of an argument to a function is evaluated inside the function, not in the environment calling the function
  • tapply returns one dimensional array which is neither a vector nor a matrix
  • by is a pretty version of tapply
  • When R coerces from a floating point number to an integer, it truncates rather than rounds
  • Reserved words in R are if, else, repeat, while, function, for, in, next, break, TRUE, FALSE, NULL, Inf, NaN, NA, NA_integer_, NA_real_, NA_complex_, NA_character_
  • return is a function and not a reserved word
  • Before running a batch job, it is better to run parse on the code and check for any errors.

Circle 9: Unhelpfully Seeking Help

This circle gives some guidelines in the context of posting queries in various R help forums.



This is my favorite book on R. For any R programmer, at whatever level of expertise, a journey through these circles will certainly make them a better programmer, and their present/future pain of debugging R code less traumatic.


I tend to read books from the Fabozzi factory not to get mathematical rigor in a subject but to get an intuitive understanding of the stuff. Over time, this approach has helped me manage my expectations of Fabozzi books. Having never worked on the Black-Litterman model to date, I wanted to get some intuition behind it, and with that mindset I went through the book. The book is divided into four parts. Part I covers classical portfolio theory and its modern extensions. Part II introduces traditional and modern frameworks for robust estimation of returns. Part III provides readers with the necessary background for handling the optimization part of portfolio management. Part IV focuses on applications of the robust estimation and optimization methods described in the previous parts, and outlines recent trends and new directions in robust portfolio management and in the investment management industry in general. The structure of the book is pretty logical: starting off with the basic principles of modern portfolio management, one moves on to understanding the estimation process, then the optimization process, and finally explores new directions.


PART I – Portfolio Allocation : Classic Theory and Extensions

Mean Variance Analysis and Modern Portfolio Theory

Investment process as seen from Modern Portfolio management


The mean-variance framework refers to a specific problem formulation wherein one tries to minimize risk for a specific level of return, OR maximize return for a specific level of risk. The objective function is usually set up as a quadratic programming problem where the constraints are formulated on the asset weights. These constraints typically specify the target return and the bounds on asset weights. If no target return constraint is specified, the resulting portfolio is called the global minimum variance portfolio. Obviously there is a limit to the diversification benefit an investor can reap: just by adding a limitless number of assets, the portfolio risk does not vanish. Instead it approaches a constant.
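In symbols, the formulation described above is a quadratic program (using the standard notation: $w$ for the weight vector, $\Sigma$ for the covariance matrix of returns, $\mu$ for expected returns, $\mu_0$ for the target return):

```latex
\min_{w} \; w^{\top} \Sigma \, w
\quad \text{subject to} \quad
\mu^{\top} w \ge \mu_{0}, \qquad
\mathbf{1}^{\top} w = 1
```

Dropping the target-return constraint $\mu^{\top} w \ge \mu_0$ yields the global minimum variance portfolio mentioned above.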

Minimizing variance for a target mean is just one type of formulating fund manager’s problem. There can be other variants, depending on the fund manager’s context and fund objective. Some of the alternative formulations are explored in this chapter such as expected return maximization formulation, Risk aversion formulation, etc.

The efficient set of portfolios available to investors who employ mean-variance analysis in the absence of a risk-free asset is inferior to that available when there is a risk-free asset. When one includes a risk-free asset, the efficient set reduces to a line, called the “Capital Market Line”. The portfolios on this line are combinations of the risk-free asset and a single risky portfolio. The point at which the line touches the frontier is called the “tangency portfolio”, also called the market portfolio. Hence all the portfolios on the CML represent borrowing or lending at the risk-free rate combined with investing in the market portfolio. This property is called “separation”.
Utility functions are then introduced to measure an investor’s preferences for a portfolio with certain risk and return characteristics.

If we choose a quadratic utility function, then utility maximization coincides with the mean-variance framework.
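The risk-aversion formulation mentioned earlier has a particularly clean unconstrained solution, w* = (1/λ)Σ⁻¹μ, which a short sketch can verify numerically via the first-order condition. All numbers here are hypothetical.

```python
import numpy as np

# Hypothetical inputs; lam is the investor's risk-aversion coefficient.
mu = np.array([0.08, 0.10, 0.12])
cov = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
lam = 3.0  # higher lam -> more risk-averse -> smaller risky positions

# Unconstrained maximizer of  mu'w - (lam/2) w' cov w
w_star = np.linalg.solve(cov, mu) / lam

# First-order condition: mu - lam * cov @ w_star should be (numerically) zero
grad = mu - lam * (cov @ w_star)
```

The objective is concave in w (Σ is positive definite), so the zero-gradient point is the global maximum.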

Advances in the theory of Portfolio Risk measures

If the investor’s decision-making process depends on more than the first two moments, one needs to modify the mean-variance framework or work with an alternative. Two types of risk measures are explored in this chapter: dispersion measures and downside measures. Dispersion measures are measures of uncertainty; in contrast to downside measures, they consider both positive and negative deviations from the mean and treat those deviations as equally risky. Some common portfolio dispersion approaches are mean-standard deviation, mean-variance, mean-absolute deviation, and mean-absolute moment. Some common portfolio downside measures are Roy’s safety-first, semivariance, lower partial moment, Value-at-Risk, and Conditional Value-at-Risk. In all these methods, some variant of the utility-maximization procedure is employed.
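Two of the downside measures listed above, semivariance and historical CVaR, are easy to sketch from a return series. The data here is simulated and the estimator choices are illustrative, not the book's.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.01, size=2500)  # simulated daily returns

def semivariance(r, target=0.0):
    """Average squared shortfall below a target return (a downside measure)."""
    downside = np.minimum(r - target, 0.0)
    return float(np.mean(downside ** 2))

def historical_cvar(r, alpha=0.95):
    """Mean loss in the worst (1 - alpha) tail, reported as a positive number."""
    var_threshold = np.quantile(r, 1 - alpha)  # historical VaR (as a return)
    return float(-np.mean(r[r <= var_threshold]))

sv = semivariance(returns)
cvar = historical_cvar(returns)
```

Unlike variance, both measures are blind to upside deviations, which is exactly the point of the downside family.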

Portfolio Selection in Practice

There are usually constraints imposed on the mean-variance framework. This chapter gives a basic classification of the constraints:

  • Linear and Quadratic constraints
  • Long-Only constraints
  • Turnover constraints
  • Holding constraints
  • Risk factor constraints
  • Benchmark Exposure and Tracking error constraints
  • General Linear and Quadratic constraints
  • Combinatorial and Integer constraints
  • Minimum-Holding and Transaction size constraints
  • Cardinality constraints
  • Round lot constraints

Portfolio optimization problems with minimum-holding constraints, cardinality constraints, or round-lot constraints are NP-complete. For the practical solution of these problems, heuristics and approximation techniques are used.

In practice, few portfolio managers go to the extent of including constraints of this type in their optimization framework. Instead, a standard mean-variance optimization problem is solved and then, in a “post-optimization” step, the generated portfolio weights or trades are pruned to satisfy the constraints. This simplification leads to small, often negligible, differences compared to a full optimization using the threshold constraints. The chapter then gives a flavor of incorporating transaction costs into asset-allocation models. A simple and straightforward approach is to assume a separable transaction cost function that depends only on the weights of the assets in the portfolio. An alternative approach is to assume a piece-wise linear dependence between cost and trade size. The models might appear complex, but this area is well explored in the finance literature, and piece-wise linear approximations of transaction costs are solver friendly. There is an enormous breadth of material covered in this book, and it is sometimes overwhelming; the case of multi-account optimization, for example, is hard to digest on a first reading.
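A separable piece-wise linear cost schedule of the kind described above can be sketched as follows. The breakpoints and rates are entirely hypothetical; the point is that the marginal cost is non-decreasing in trade size, which keeps the function convex and hence solver friendly.

```python
def pw_linear_cost(trade, breakpoints=(0.01, 0.05), rates=(0.0005, 0.0010, 0.0020)):
    """Piece-wise linear transaction cost for one asset (hypothetical schedule).

    Costs are charged on the absolute trade size |x|: the first slice up to
    breakpoints[0] at rates[0], the next slice up to breakpoints[1] at
    rates[1], and anything beyond the last breakpoint at rates[-1].
    """
    x = abs(trade)
    cost, lower = 0.0, 0.0
    for bp, rate in zip(breakpoints, rates):
        slice_ = min(x, bp) - lower
        if slice_ <= 0:
            break
        cost += rate * slice_
        lower = bp
    if x > breakpoints[-1]:
        # remainder above the last breakpoint at the top rate
        cost += rates[-1] * (x - breakpoints[-1])
    return cost
```

A total portfolio cost under separability is just the sum of this function over assets, which is what makes the formulation tractable in an optimizer.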

PART II – Robust Parameter Estimation

Classical Asset Pricing

This chapter gives a crash course on simple random walks, arithmetic random walks, geometric random walks, trend stationary processes, covariance stationary processes, etc. that one sees in the finance literature.

Forecasting Expected Risk and Return

The basic problem with the mean-variance framework is that one uses historical estimates of mean returns and covariances as inputs to asset-allocation decisions about the future. A forecast is thus implicitly built into the plain-vanilla mean-variance framework. Ideally an estimate should have the following characteristics:

  • It provides a forward-looking forecast with some predictive power, not just a backward-looking historical summary of past performance
  • The estimate can be produced at a reasonable cost
  • The technique does not amplify errors already present in the inputs used in the estimation process
  • The forecast should be intuitive, i.e., the portfolio manager or the analyst should be able to explain and justify it in a comprehensible manner

It is typically observed that the sample mean and sample covariance have low forecasting power. This is not surprising: asset returns are realizations of a stochastic process, and could in fact be a complete random walk, as some believe and vehemently argue.

The chapter goes on to introduce basic dividend discount models, which I think are basically pointless and the dumbest way to formulate estimates. Using the sample mean and covariance is also fraught with danger. The sample mean is a good predictor only for distributions that are not heavy-tailed. However, most financial time series are heavy-tailed and non-stationary, and hence the resulting estimator has a large estimation error. In my own little exercise using some assets in the Indian markets, I found that

  • Equally weighted portfolios perform comparably to allocations based on the sample mean and sample covariance
  • Uncertainty in the mean completely plays spoilsport in efficient asset allocation. It is sometimes okay to have estimation error in the covariance, but estimation error in the mean kills the performance of asset allocation
  • Mean-variance portfolios are not properly diversified. In fact, one can calculate the Zephyr drift score for the portfolios and easily see that there is considerable variance in the portfolio composition

What are the possible remedies that one can use?

For expected returns, one can use a factor model, the Black-Litterman model, or shrinkage estimators. Remedies are easily talked about, but implementation is difficult. Black-Litterman needs views on assets, along with error estimates for those views. You take these subjective views and then build a model based on Bayesian statistics. In the real world, for a small boutique shop offering asset-allocation solutions, who provides these views? How do you put weights on these opinions? I really wonder about the reason behind the immense popularity of the Black-Litterman model. Can it really outperform a naïve portfolio? My current frame of mind suggests it might not beat a naïve 1/N portfolio.

For the covariance matrix, there are remedies too, but their effectiveness is context specific and sometimes completely questionable. The sample covariance matrix is essentially a non-parametric estimator. One can instead impose a structure on the covariance matrix and then estimate it. Jagannathan and Ma suggest using portfolios of covariance matrix estimators. Shrinkage estimation is another way to work on it. The crucial thing to ponder is, “If the estimator is a better estimate, does it actually give a higher out-of-sample Sharpe ratio?” The chapter then talks about heteroskedasticity- and autocorrelation-consistent covariance matrix estimation, the treatment of the covariance matrix under varying data lengths, increasing the sampling frequency, etc. Overall this gives the reader a broad sense of the estimators that can be used for the covariance matrix. Chow’s method is mentioned, where the covariance matrix is built from two covariance matrices, one from a quiet period and another from a noisy period; combining these matrices can be as simple as relying on intuition or as complex as using a Markov chain. Another suggested remedy is to use a different volatility metric, i.e., one of the downside measures such as semivariance.

Random matrices are introduced to show the importance of dimensionality reduction. It can be shown, in cases where the asset universe is large, that a few eigenvalues dominate the rest. Hence random matrix theory makes a case for using a factor-based model like APT. Factor models can themselves be of multiple varieties: statistical factor models, macroeconomic factor models, fundamental factor models, etc. Selecting the “optimal number of factors” is a dicey thing. The selection criterion should balance estimation error and bias: too few factors decrease estimation error but increase bias, while too many factors increase estimation error but decrease bias. So, I guess the ideal number and type of factors to choose is more an art than a science. One of the strong reasons for using a factor model is the obvious dimensionality reduction for a large asset-universe allocation problem.
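The random-matrix argument can be illustrated with pure-noise data: for an N-asset, T-observation sample with no true factor structure, essentially all eigenvalues of the sample correlation matrix fall below the Marchenko-Pastur upper edge (1 + √(N/T))². In real return data, the few eigenvalues far above that edge are the ones carrying the factor structure worth keeping. A small sketch, with sizes chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(42)
T, N = 500, 100                      # observations x assets (arbitrary sizes)
X = rng.standard_normal((T, N))      # pure noise: true correlation = identity
corr = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)

q = N / T
lambda_max = (1 + np.sqrt(q)) ** 2   # Marchenko-Pastur upper edge for noise

# For pure noise, (almost) all eigenvalues lie below the edge; in real return
# data the few eigenvalues above it carry the signal.
frac_below = float(np.mean(eigvals < lambda_max))
```

The sum of the eigenvalues always equals N (the trace of a correlation matrix), so a few dominant eigenvalues necessarily starve the rest.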

The chapter also mentions various methods for estimating volatility, such as modeling it based on the implied vols of the relevant options, using clustering techniques, using GARCH methods, or using stochastic volatility methods. One must remember, though, that most of these methods are relevant in the Q world (sell side) and not the P world (buy side).

Robust Estimation

Robust estimation is not just a fancy way of removing outliers and estimating parameters; it is a fundamentally different way of estimation. If one is serious about understanding it thoroughly, it is better to skip this chapter, as the content merely pays lip service to the various concepts.

Robust Framework for Estimation: Shrinkage, Bayesian Approaches and Black-Litterman Model

Estimation of expected returns and the covariance matrix is subject to estimation error in practice. The portfolio weights change constantly, resulting in considerable portfolio turnover and sub-optimal realized portfolio returns. What can be done to remedy such a situation?

There are two areas where one can improve things. On the estimation side, one can employ estimators that are less sensitive to outliers, such as shrinkage estimators and Bayesian estimators. On the modeling side, one can constrain portfolio weights, use portfolio resampling, or apply robust or stochastic optimization techniques that specify scenarios or ranges of values for the parameters estimated from data, thus incorporating uncertainty into the optimization process itself.

The chapter starts off by talking about the practical problems encountered in Mean-Variance optimization. Some of them mentioned are

  • Sensitivity to estimation error
  • Effects of uncertainty in the inputs in the optimization process
  • Large data requirements necessary for accurately estimating the inputs for the portfolio optimization framework

There is a reference to a wonderful paper, “Computing Efficient Frontiers using Estimated Parameters” by Mark Broadie. The paper makes so many points so clearly that I feel like listing them here. Some are intuitively obvious, some I understood over a period of time working with data, and some are nice lessons for any MPT model builder.

  • There is a trade-off between stationarity of parameter values and the length of data used to get good estimates. If you take too long a dataset, you run the risk of non-stationarity of the returns. If you take too short a dataset, you run the risk of estimation error
  • One nice way of thinking about these things is to use 3 mental models
    • True efficient frontier – based on true unobserved return and covariance matrix
    • Estimated Efficient frontier – based on estimated return and covariance matrix
    • Actual frontier – based on the portfolios on estimated efficient frontier and using the actual returns of the assets
  • One can draw these frontiers to get an idea of one basic problem with Markowitz theory, the error-maximization property: assets with large positive errors in returns, large negative errors in standard deviation, and large negative errors in correlation tend to get higher weights than they truly deserve
  • If you assume a true return and covariance matrix for a set of assets and draw the three frontiers, it is easy to observe that minimum-variance portfolios can be estimated more accurately than maximum-return portfolios
  • Distinguishing between two assets with distributions (m1, sd1) and (m2, sd2) requires a LOT of data, and hence identifying the maximum-return portfolio is difficult. However, distinguishing securities with different standard deviations is easier than distinguishing securities with different mean returns
  • Estimates of mean returns have a far greater effect on frontier performance than estimates of covariance. If possible, effort should first be focused on improving the historical estimates of mean returns
  • To get better estimates of actual portfolio performance, the portfolio means and standard deviations reported by the estimated frontier should be adjusted to account for the bias caused by the error-maximization effect
  • The greatest reduction of errors in mean-variance analysis can be obtained by improving historical estimates of mean returns of securities. The errors in estimates of mean returns using historical data are so large that these parameter estimates should always be used with caution.
  • One recommendation for practitioners is to use historical data to estimate standard deviations and correlations but use a model to estimate mean returns
  • A point on the estimated frontier will, on average have a larger mean and a smaller standard deviation than the corresponding point on the actual frontier.
  • Due to error maximization property of mean-variance analysis, points on the estimated frontier are optimistically biased predictors of actual portfolio performance.
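Broadie's error-maximization point can be checked with a tiny Monte Carlo sketch: assume a true mean vector and covariance matrix, estimate both from simulated histories, form the plug-in mean-variance weights, and compare the estimated portfolio mean with the mean actually delivered under the true parameters. All parameter values below are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, lam = 10, 60, 3.0              # assets, observations, risk aversion (made up)
mu_true = rng.uniform(0.005, 0.015, N)
sigma = 0.05
# True covariance: constant pairwise correlation of 0.3
cov_true = sigma**2 * (0.3 * np.ones((N, N)) + 0.7 * np.eye(N))

optimism = []
for _ in range(200):
    R = rng.multivariate_normal(mu_true, cov_true, size=T)   # one sample history
    mu_hat, cov_hat = R.mean(axis=0), np.cov(R, rowvar=False)
    w = np.linalg.solve(cov_hat, mu_hat) / lam               # plug-in MV weights
    # In-sample (estimated) portfolio mean minus the mean these weights
    # actually deliver under the true parameters
    optimism.append(w @ mu_hat - w @ mu_true)

mean_optimism = float(np.mean(optimism))  # > 0: the estimated frontier flatters
```

The average gap is positive because the optimizer systematically overweights assets whose mean estimates happened to err upward, which is exactly the error-maximization property described in the bullets above.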

This paper made me realize that I should strongly emphasize analyzing the actual frontier vs. the estimated frontier, instead of merely relying on quasi-stability of frontiers across time. I was always under the impression that all that matters is frontier stability. Also, from an investor’s standpoint, I think this is what one should ask any asset-allocation fund manager: “Show me the estimated frontier vs. the actual frontier for the scheme across the years.” This is far more illuminating than any numbers that are usually tossed around.

[Figures: estimated vs. realized efficient frontiers for two separate years, one in a bull market and one in a bear market]

Using a few assets, I looked at the estimated and realized frontiers in two separate years, one in a bull market and one in a bear market. In the first illustration the realized frontier is at least okay, but in the second case it shows pathetic results. It is not at all surprising that the frontier works well for minimum-variance portfolios but falls flat for the maximum-return portfolio. This means the return forecast is key to getting good performance for high-risk-profile investors.

Also, one can empirically check that errors in expected returns are about ten times as important as errors in the covariance matrix, and errors in variances are about twice as important as errors in covariances.

Shrinkage Estimators

One way to reduce the estimation error in the mean and covariance inputs is to shrink the mean (Jorion’s method) or shrink the covariance matrix (Ledoit and Wolf’s method). I have tried the latter but till date have not experimented with shrinking both the mean and covariance at once. Can one do it? I don’t know. Shrinking the covariance towards a constant-correlation matrix is what I have tried before. Three components are essential for any shrinkage estimator: an estimator with no structure, an estimator with a lot of structure, and a shrinkage intensity. Ledoit and Wolf shrink the covariance matrix and compare its performance with the sample covariance matrix, a statistical factor model based on principal components, and a factor model. They find that the shrinkage estimator outperforms the other estimators for a global minimum variance portfolio.
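A sketch of shrinking the sample covariance toward a constant-correlation target follows. This is the structure of the Ledoit-Wolf estimator, but with the shrinkage intensity fixed by hand rather than estimated from the data, which is the hard part of their method.

```python
import numpy as np

def shrink_to_constant_correlation(sample_cov, delta=0.5):
    """Shrink a sample covariance matrix toward a constant-correlation target.

    `delta` is the shrinkage intensity, fixed by hand here; Ledoit and Wolf
    derive a data-driven choice of delta that minimizes expected error.
    """
    sd = np.sqrt(np.diag(sample_cov))
    corr = sample_cov / np.outer(sd, sd)
    n = corr.shape[0]
    r_bar = (corr.sum() - n) / (n * (n - 1))   # average off-diagonal correlation
    target = r_bar * np.outer(sd, sd)          # constant-correlation structure
    np.fill_diagonal(target, sd ** 2)          # keep the sample variances
    return delta * target + (1 - delta) * sample_cov

# Usage on simulated return data (numbers are arbitrary)
rng = np.random.default_rng(7)
R = 0.02 * rng.standard_normal((120, 5))
S = np.cov(R, rowvar=False)
S_shrunk = shrink_to_constant_correlation(S, delta=0.5)
```

The shrunk matrix keeps the sample variances on the diagonal and pulls only the off-diagonal covariances toward the average-correlation structure.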

Bayesian Approach via Black Litterman Model

Since expected returns matter so much more for realized frontier performance, it might be better to use some model instead of the sample mean as input to the mean-variance framework. The Black-Litterman model is one such popular model; it combines views with a prior distribution and gives the portfolio allocation for various risk profiles. If one has to explain Black-Litterman in words, it can be done this way: you want to strike a middle ground between the market portfolio and your views. You have an equation for the market portfolio and an equation for the views; you combine these equations into a generalized linear model and estimate the asset returns. These returns turn out to be a linear combination of the market-implied returns and the expected returns implied by the investor’s views. One key thing to understand is that you don’t have to have views on all the assets in the considered universe: even views on a few assets will change the expected returns of all the assets.
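A minimal numerical sketch of the Black-Litterman machinery, using the standard posterior-mean formula μ_BL = [(τΣ)⁻¹ + P′Ω⁻¹P]⁻¹ [(τΣ)⁻¹π + P′Ω⁻¹q]. The asset universe, market weights, and the single view are all invented for illustration.

```python
import numpy as np

# Hypothetical 3-asset universe; every number below is invented.
sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.012],
    [0.010, 0.012, 0.160],
])
w_mkt = np.array([0.5, 0.3, 0.2])    # market-cap weights
lam, tau = 2.5, 0.05                 # risk aversion, prior-uncertainty scaling

# Step 1: equilibrium returns implied by the market portfolio
pi = lam * sigma @ w_mkt

# Step 2: one view -- "asset 2 outperforms asset 3 by 2%" -- and its uncertainty
P = np.array([[0.0, 1.0, -1.0]])
q = np.array([0.02])
omega = np.array([[0.001]])

# Step 3: posterior mean, mu_BL = A^{-1} b
ts_inv = np.linalg.inv(tau * sigma)
A = ts_inv + P.T @ np.linalg.inv(omega) @ P
b = ts_inv @ pi + P.T @ np.linalg.inv(omega) @ q
mu_bl = np.linalg.solve(A, b)
```

Note that mu_bl differs from pi for all three assets, even though the view involves only assets 2 and 3: correlations propagate the view across the whole universe, which is the point made in the paragraph above.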

However, I am kind of skeptical about this method as I write. Who in the world can give me views about the assets in the coming year? If I poll a dozen people, there will be more than a dozen views, each with its own confidence interval. I have never developed a model with views till date, but as things change around me, it looks like I might have to incorporate views into asset-allocation strategies.

Part III of the book is a survey of optimization techniques and reads more like a math book than a finance book. For an optimization newbie, it is not the right place to start; for an optimization expert, it will be too easy to skim. It is targeted at a person who already knows quite a bit about optimization procedures and wants different viewpoints. Like most Fabozzi books that attempt mathematical rigor and fail majestically, this part of the book throws up no surprise. Part IV of the book talks about applications of the robust estimation and optimization methods.

Takeaway:

Like many Fabozzi series books, the good thing about this book is that it has an extensive list of references to papers/ thoughts/opinions expressed by academia & practitioners about portfolio management. If the intention is to know what others have said/done in the area of portfolio management, then the book is worth going over.