August 2013


image

If you interview some of the brightest minds in the world and get to know about their productivity hacks (sometimes their role model’s hacks), and compile all of their ideas in to a format that is easily digestible, you get this book.

Nothing in this book is really something that one would not have come across. But  it is easy to forget hacks that make our lives productive. Sometimes reading other’s work habits can create awareness of the way we go about doing our work.

I think such awareness from time to time is essential and books such as these provide that opportunity, to pause and reflect. The book is definitely worth a one time read or may be a one time listen if you are audio book types person.

image

I have not been tracking many developments in the Python world for various reasons. Recently I stumbled on to this book and learnt that a ton of things have happened since the last version of my IPython installation. In the last one year or so, it has found a very strong community of pythonistas and is being used by professors in their classrooms. ipynb is turning out to be a format for submitting  programming assignments. With the nbviewer, all one has to do is to create a gist on github and anybody can view the notebook over the web. Here’s a Michigan university Professor narrating his experiences of using IPython in his class. Here’s someone writing about his experience of using it in a class at Columbia, All these are great developments show the general acceptance of Python as a language.  Recently I came to know that CBSE (India) has replaced C++ with Python for class XI students for 2013-2014 curriculum.

I guess it’s a matter of time before IPython will be the default environment for many python programmers, be it a beginner or an expert. Pandas creator Wes Mckinney goes gaga over IPython and says it is his primary mode of code development.  IPython is also being increasingly used by speakers at various conferences to share the code relevant to their talks.

Given the above context, this book is a great welcome. Instead of scouring tidbits of information/hacks from support forums, blog posts, user manuals, you get to know all the stuff about IPython at one place. The book is about 130 pages long and highlights everything that is possible using IPython.  The book starts off with a list of packages that are useful for data exploration. I learnt about Anaconda,  a product from Continuum Analytics, a firm founded by NumPy creator. It is such a big relief I must say. All the important packages are bundled up in to one executable and you are done with one installation. The packages used for various examples in the book are NumPy, SciPy, Matplotlib, NetworkX, Pandas, PIL, PySide, PyQt, Cython, Basemap, Numba. After the chapter on installation, the book goes on to explain ten essential aspects of IPython

  1. Running the IPython console
  2. Using it as a System shell
  3. Using the history _ , __, ___contain the last three output objects,_i, _ii,_iii contain the last three input objects respectively,_n and _in return the nth output and input history
  4. Tab completion
  5. Executing a script with the %run command
  6. %timeit  for benchmarking running times
  7. Quick debugging with `%debug` command
  8. %pylab fires up scientific computing capabilities.
  9. Using IPython notebook
  10. Customizing IPython

A sample dataset is taken up for analysis and the following features of IPython are explained :

  • Navigating through the file system
  • Exploring the history of all the commands entered
  • Import code from python script to IPython
  • Export code from IPython to notebook
  • Tab completion
  • Profiler
  • Literate Programming using Markdown
  • Graph Plotting
  • LaTeX support – I love this one

There is a brief chapter on NumPy, another one on Matplotlib. On a side note, I was so much away from tracking stuff in the Python community that I did not even know that Matplotlib creator, John Hunter, died last year battling with cancer. He will always be remembered for making available an awesome graphing library.

There is also a chapter on Parallel processing and another one on customizing IPython environment. So, despite being just 130 pages long, there is a lot of stuff that is packed in to the book.

image

I firmly believe that when you are trying to learn something, it is always “easy come, easy go”.  It is also applicable to other aspects like love, friendship etc. A friend who seems to come in to your life effortlessly also fades out of your life quickly. So, is the case with love, I guess. 

This book is total crap. The author gives a sermon on how to learn things in the first twenty hours. Most of what he writes is taken from books that I have already come across. Nothing is original. Amazing really..people can’t even bring originality in crap!. As though he has to justify his crap, he mentions about using his methods to learn Yoga, Touch typing, Go, Ruby programming, instrument called Ukulele and windsurfing.

Well, I can’t comment on things like windsurfing, touch typing, Go as I have no clue how easy or how difficult they are.  However I have had experience with Ruby and I play an instrument. So, I can at least relate to those activities. Reading author’s experiences about those two activities, I felt that the author has done a great disservice by presenting them as something that can learnt quickly. Trivializing such things is ridiculous. Why do I call it trivializing? Because in either of the cases, be it learning an instrument or programming, you basically skim the surface and that’s about it. Spending 20 hours on an instrument, you will probably learn the most popular song and hit a few notes here and there. That’s about it. It will help you pose as though you have learnt an instrument . Far from it, you will know from inside that you are a fraud and you fear that people will soon figure it out that all you know are a few songs and nothing else. In the process your experience with the instrument is shallow.

In the case of programming it is even worse. Knowing syntax of a language, typing a few commands and accomplishing a few tasks is something that you can’t even pose for a long time. Programming languages like Ruby have a false appeal that they are easy to learn. Yes, but sometimes such languages are like models. They have a limited shelf life or should I say ramp life. Try to put in 20 hours in C++ and you will be swimming in an ocean of concepts. Your head will start spinning and you either give up or become tenacious and spend days and days on it to learn it deeply. I notice that the author had the audacity to put “Anything Fast ” in the subtitle. One thing is for sure, his clientele, i.e. readers are going to forget him FAST.

imageimage

This book takes a rather difficult topic, “algorithmic complexity”, and explains it in a way that any reader with a bit of curiosity towards algorithmic world can understand most of its contents. This is actually not a book in the traditional sense of it. ETH Zurich offered a public lecture series called, “ The Open Class – Seven Wonders of Informatics” in the fall of 2005 and this book has been written based on those lecture series. 

To begin with, the title needs a bit of explanation. The phrase “From Knowledge to Magic” is meant to represent the transition from a deterministic algorithmic world to a randomized algorithmic world, a transition from a world of bits and bytes to a world of molecules and DNA, a transition from something that we know that is currently possible in to something that appears magical. What are randomized algorithms and deterministic algorithms ? In fact what are algorithms ? How does one compute using molecules ? What is an NP-hard problem ? These and many more such questions answered in the book. Do you know that a particular version of Traveling Salesman problem that is considered as NP-hard has been solved by a DNA computer ? I didn’t know this fact until I read this book. The book starts off with the basic definition of “algorithm” and takes a reader all the way to visualizing a few scenarios that could represent the world of computing 20-30 years from now, may be even earlier.

A Short Story about the development of Computer Science

OR

Why Computer Science Is Not a Computer Driving Licence

The chapter starts off with a question, “Why does public opinion equate the facility to use specific software packages to computer science , while most of us clearly distinguish basic research in physics vs. the technical application in electrical engineering?”.  As an answer to “What is Compute Science?”, the author writes

The main problem with answering this question is that computer science itself does not provide a clear picture to people outside the discipline. One cannot classify computer science unambiguously as a metascience such as mathematics, or as a natural science or engineering discipline. The situation would be similar to merging physics and electrical and mechanical engineering into one scientific discipline. From the point of view of software development, computer science is a form of engineering, with all features of the technical sciences, covering the design and development of computer systems and products. In contrast, the fundamentals of computer science are closely related to mathematics and the natural sciences. Computer science fundamentals play a similar role in software engineering as theoretical physics plays in electrical and mechanical engineering. It is exactly this misunderstanding of computer science fundamentals among the general public that is responsible for the bad image of computer science.

Any scientific discipline is built on certain notions. These notions are like self-evident truths and are called axioms. If you take probability theory, it took more than 2 centuries of dabbling until it was described in a consistent mathematical language. Is Probability theory water tight ? Yes, in a way because everything flows from the basic axioms, No because we take axioms for granted. For example, the quantum physics world is vastly different and maybe there is a need to relook at the entire theory. It might seem that giant scientific disciplines stand on these axioms and can fall off any time. The scientific researchers in fact are always looking to take a crack at axioms and other notions of a scientific discipline because it might result in a better understanding of the entire subject.

It was David Hilbert who pushed the notion of cause and effect in Mathematics and strove for the ideal world of water tight and perfect mathematical framework. In 1931, Kurt Godel definitively destroyed all dreams of building such a perfect mathematics. Basically Godel’s work says that building mathematics as a formal language of science is an infinite process. The author writes,

The result of Godel were responsible for the founding of computer science. Before Godel nobody saw any reason to try and give an exact definition of the notion of a method. Such a definition was not needed, because people only presented methods for solving particular problems. The intuitive understanding of a method as an easily comprehensible description of a way of solving a problem was sufficient for this purpose. But when one wanted to prove the nonexistence of an algorithm (of a method) for solving a given problem, then one needed to know exactly (in the sense of a rigorous mathematical definition) what an algorithm is and what it is not. Proving the nonexistence of an object is impossible if the object has not been exactly specified.

Hence there was a need for formal definitions in Computer Science. The first formal definition of an algorithm was given by Alan Turing in 1936 and later further definitions followed. This definition made a demarcation between the problems that can and cannot be solved using an algorithms/ computers.  Thanks to the exact definition of what an algorithm is, one was able to investigate the borderline between the automatically solvable and unsolvable.

Algorithmics

OR

What Have Programming and Baking in Common?

This chapter explains the meaning of algorithm via a simple example – “baking a cake”. Since the book is based on public lectures , it is no surprise that the content goes in great detail to explain in layman terms as to what happens exactly with in the internals of the computer when a program is executed.

Infinity is not equal to Infinity Infinity Is Not Equal to Infinity,

OR

Why Infinity Is Infinitely Important in Computer Science

The chapter gives a background to the wonderful concept of infinity and gives a good explanation for the following questions :

  • What does it mean for a set to be infinite ?
  • How does one compare two infinite sets ?
  • Are there different sizes of infinities ?
  • How does on prove that set of real numbers is a greater set (higher cardinality) than the set of rational numbers ?

The point of this chapter is to equip the reader with the “not so intuitive” nature of infinity so that he/she can grasp the concepts around algorithmically solvable problems and unsolvable problems.

Limits of Computability,

OR

Why Do There Exist Tasks That Cannot Be Solved Automatically by Computers

How many programs can ever be written? The author shows that this question can be answered by matching every program that can ever be written to a natural number. Hence the first basic principle to be grasped is that the cardinality of the set containing all the programs is same as the cardinality of N. Algorithms are nothing but programs that run in finite time. So, they are a proper subset of programs set.

The chapter introduces the first problem that no algorithm can solve. The task is called Problem(c) that takes a natural number n and outputs a number c up to n decimal digits after the decimal points. Since the cardinality of R is greater than N and there are |N| programs, and there exists a c for which there is no algorithm that can be written. So, the same Cantor’s diagonalization is used to show that there is a problem for which no algo exists.

Obviously who is interested in a task such as Problem (c). So, the author takes on a practical problem, a decision problem, that cannot be solved by any algo. Simply stated, a decision problem is one that takes a natural number n as input and it outputs YES if n belong to a certain set M and NO if n does not belong to a set M. On the face of it, it seems a straightforward task and that a simple algo can do the job. For example if the set M is the set of even numbers, given a number n, it is easy to decide whether it falls in to the set of even numbers. Similarly if M is a set of primes, a few lines of code is enough to check whether n falls in the set of primes. However the fact is that this simple problem is devious and no algo can solve this problem. The proof of this statement lies in doing a nifty trick on the Diagonalization argument. You have got to read through the proof to understand its magic. Thus the author shows the first practical problem, a decision problem that cannot be algorithmically solved. This is denoted by (N, M(DIAG)). Reduction method is used to propagate this hard problem in to a number of problems that cannot be solved algorithmically. By the end of the chapter, the reader gets a fair idea that all non trivial semantic questions like the following are not algorithmically solvable

  • What does a given program compute? Which problem does the program solve
  • Does the program developed solve the given problem?
  • Does a given problem halt on a given input or does a program always halt ?
  • Does a given program accept a given input ?

Complexity Theory,

OR

What to Do When the Energy of the Universe Doesn’t Suffice for Performing a Computation?

As the research on the problems that were algorithmically solvable or not, reached its maturity, the next issue that was dealt by research scientists was, “How does one classify the problem as practically solvable or unsolvable?” First one must get some clarity on what does “practically solvable” mean. The chapter defines time complexity and space complexity of an algorithm and uses these concepts to explain the type of algorithms that are not practical. An extreme case would be to think in terms of age of universe that is less than 1018 seconds. Let’s say you have been running an algo on your laptop, 1 Gig machine since the start of universe. Basically you have let the computer do 1018 times 109 instructions , i.e. 1027 instructions . If the algo is of time complexity n! or 2n, then the algo can at most work on a few hundred numbers only. So, any algo whose time complexity is exponential or factorial can be safely assumed to be practically unsolvable. So, what are the kinds of problems that can be deemed practically solvable ? There is no formal proof for this, but scientist have agreed that if the time complexity of an algorithm to solve a problem is of polynomial order, then the problem can be practically solved.

There are two reasons for acceptance of this definition. First is a practical one. It has always been the case that if a polynomial running time algo was devised, people have always figured out a better algo with lesser degree polynomial. The second reason is a theoretical reason : you cannot fix a specific degree for the polynomial and declare that as the boundary line. What if, the same algo runs a bit faster on a different environment, different programming language etc?

The main task of complexity theory is to classify concrete computing problems with respect to their degree of hardness measured by computational complexity. So, how does one go about classifying a problem as a hard problem or not ? Since one is unable to design lower bounds on complexity of concrete problems, a indirect  method is used to classify whether a problem is NP-hard or not. There are about 4000 problems that are considered as hard problems and cannot be solved by an algo efficiently.

Let’s say you have thought about an algo A and you want to know whether it is NP-hard. The way to do it is , check whether the assumption of polynomial running time for your algo would mean a polynomial running time for any of the 4000 NP-hard problems. If that is the case your assumption is wrong and hence your algo is NP-hard. It’s a convoluted argument but a plausible one and that has stood the test of time.

Now it is perfectly possible that the now classified 4000 problems themselves are not NP-hard. For example Traveling Salesman problem is NP-hard but with DNA computing it is possible in the future that all variants of TSP can be solved in polynomial time. So, may be the subset of 4000 problems might be reduced and the ones you thought were NP-hard are no longer NP-hard. But till then the method to check whether a problem is NP-hard or not is via the indirect method. This concept is not easier to grasp for a first timer and hence the author provides enough examples and visuals to drive home this point.

At the end of this chapter, a reader will get a fair idea of the kind of tasks that are deemed to be practically unsolvable. For example, I can now understand and make sense of the statement that I came across a few days ago in graph theory – “ Finding a Hamiltonian Cycle in Graph is NP-hard problem”. This means that there is no algo till date that has been developed that has polynomial time complexity.

 

Randomness in Nature and as a Source of Efficiency in Algorithmics

The reader will feel that the word “magic” in the title is apt, after reading this chapter. Randomized algorithms are introduced in this chapter. What are they ? There are many tasks for which deterministic algos take more than the age of universe and are deemed practically impossible. In all such cases, uncertainty comes to the rescue. By giving up on absolute reliability of the output and allowing for a small error probability, algorithms can be devised to have have randomization as one of the steps. This randomization could range from merely selecting a number at random from a set to simulating something at random and using it for algo processing. This randomization allows for the task to be done in a far less time than any deterministic algo. The chapter uses an example of communication protocol for explaining the power of randomized algorithm. This example makes the reader abundantly clear that any kind of insane reliability can be obtained by repeating several randomized computations on the same problem instance. For many applications on the internet, randomized algorithms are the solutions to many problems that take insane time or space by a deterministic algo.

Cryptography,

OR

How to transform drawbacks in to advantages?

Cryptography is also a topic that lives up to the subtitle of the book, “from knowledge to magic”. Thinking of secure communication between multiple parties is not possible without randomized algorithms. The chapter gives the logic behind symmetric cryptosystems and then explains its limitations. It then goes on to describing the math behind public key-private key cryptography system that is the dominant mode of secure communication on the internet. At the heart of its success lies the fact there are one way functions that cannot be inverted easily. No wonder the best number theorists in the world contributed a ton of stuff to developments in Cryptography.

The last few chapters are like reading stuff from a science fiction book, but with a difference. The stories mentioned here are happening in various labs around the world. DNA computing for instance is the use of molecules for computation. The following visual is a projection of how computing will be done years from now.

imageclip_image001image

 

Massive parallel computing can be done via DNA computing. This might make all the 4000 or so NP-hard problems known till date crumble down in to doable tasks. In fact Adelman, a scientist has solved an instance of Traveling Salesman problem in his lab. Though there are many aspects of DNA computing that are yet to worked on, it will not be surprising ( given the way PC and internet revolution happened so quickly relatively speaking) to see DNA computing device in every scientist’s lab in the near future.

 

imageTakeaway :

The book shows the beauty, depth and usefulness of the key ideas in computer science and makes a reader understand and appreciate the “science” part of “computer science”.

imageimage

The cover page of the book gives the solution to the popular puzzle,

Is it possible for a knight to tour the chessboard, visiting every square once and only once, and return to its initial square?

The solution to the puzzle lies in thinking about a graph containing vertices as squares of the chess board and the adjacency of two vertices based on the validity of a knight move. If you want to solve such a problem for a general n*n square, you need a graph processing algorithm.

Graphical models are considered as of the most important classes of models in applied mathematics. The first time I came across graphs in the context of statistics was in Larry Wasserman book, ”All of Statistics”. According to Wasserman,

The typical mathematical statistics course spends too much time on tedious and uninspiring topics (counting methods, two dimensional integrals, etc.) at the expense of covering modern concepts (bootstrapping, curve estimation, graphical models, etc.).

Wasserman devotes one full chapter on Directed Acyclic Graphs and its connections to statistical modeling. At the time I was reading Wasserman’s book , I was clueless and had deferred that chapter until I understood some basic principles of graph theory. It has been almost three years since I read Wasserman’s book and yet I could not find time to go over Graph theory. Sometimes I feel that there are so many things that I am ignorant of, that it will take my entire life to get a good understanding of modern statistics and build some meaningful models. In any case, I can only do one thing at a time and so I finally decided to understand this topic by going over a book that is targeted towards undergrads.

The first chapter of the book is on Mathematical models and gives a basic flavor of the modeling process. The chapter then goes on to define various terms relevant to graph theory such as

  • graph
  • edge set
  • edges
  • vertex set
  • vertices
  • order of graph
  • size of graph
  • adjacent vertices
  • incident to an edge or incident with an edge
  • directed graph / digraph
  • directed edge
  • symmetric digraphs
  • network
  • undirected network
  • directed network
  • signed graph
  • positive edge
  • negative edge
  • multigraph
  • multi edges
  • loop graph
  • loop digraphs
  • loop-networks
  • loop multigraphs

By going over the definitions and explanations carefully, a reader will have an understanding that Graph is a mathematical object that is characterized by edge set, vertex set and relation set on vertex set. Order and size of the graph capture the number of nodes and edges in a graph. Depending on the type of edges, the graphs can be directed or undirected.

The second chapter introduces the term “isomorphism” and gives examples of identifying isomorphic graphs. In fact “ is isomorphic” is an equivalence relation and can be used to partition the universe of graphs. The chapter then introduces the following terms with rich visuals :

  • degree of the vertex
  • graph of degree r / r regular graph
  • complete graph
  • isomorphic
  • isomorphism
  • subgraph
  • u-v walk
  • u-v trail
  • u-v path
  • connected graph
  • disconnected graph
  • circuit
  • cycle
  • cut vertex
  • bridge

The book then takes a novel approach for explaining various principles of graph theory. Instead of defining the terms, giving theorems and proofs, it gives a set of applications where graphs can be used as effective tools for problem solving. In the process, it explains all the relevant theorems and principles.

The book uses Konisberg 7 bridges problem and Traveling Salesman problem to explain the following terms

  • eulerian trail
  • eulerian circuit
  • eulerian graph
  • eulerian multigraph
  • traversable graph
  • hamiltonian graph
  • hamiltonian cycle

A reader will never confuse the meaning of the above terms once they are explained in terms of a problem. Eulerian / Traversable graph comes up in 7 bridges problem where it deals with traversing each edge once . A Hamiltonian graph is relevant to solving Traveling salesman problem where each vertex needs to be touched once. The chapter also states the conditions (sometimes necessary, sometimes sufficient) for a graph to be Eulerian / Hamiltonian. There is also some historical trivia mentioned  , for example

Hamiltonian graph got its name from the famous Irish mathematician Sir William Rowan Hamilton(1805-1865) who invented a game that involved a regular solid dodecahedron. Hamilton labeled each vertex with a name of a well known city and object of the game was to travel “All around the world” with traversing each city exactly once.

Using an example of building a rail network between a set of cities, the book introduces the terms like trees, forest, spanning trees, spanning forests and minimum spanning trees. Digraphs are introduced via PERT. There are a lot of algorithms in the CS literature that can be used to find the minimum spanning trees in a graph. I think this book gives enough motivation for an interested reader to look up various graph processing algorithms.

Using party puzzles the book introduces bipartite graphs and Ramsey numbers. In the process it also shows that graph theory development is far from over as there a quite a number of problems yet to be cracked.

The book also contains a chapter that is full of puzzles and the way they can be solved by looking for a graph structure. Towards the end of the book, the author talks about representing the graph as a incidence matrix. By looking at the graph from a matrix stand point, relations between edges, nodes etc. can be culled out from the four fundamental subspaces of the matrix. After reading through this book, I think I am little bit equipped to slog through application of DAGs in statistical modeling.