March 2012



Programming in Matlab or R exposes one to a vectorized way of thinking. One doesn’t usually write loops often; one tends to think in terms of vectors, arrays, matrices, etc. R, for instance, is designed to facilitate vectorized input and output. Almost all the functions in base R support vectorization, and most of the functions in the packages on CRAN are equipped to take vectorized input. In fact, the language design itself makes vectorizing easy. The recycling rule in R, for instance, sometimes makes your function handle vectorized input automatically, even though you never meant it to. For an R newbie, the fact that his/her function does much more than expected is a happy side effect. However, once you have logged a decent number of hours in R, you realize that it is your responsibility to make sure that whatever code you write handles vectorized input. Hence these days, one of the first unit tests that I write checks whether my code breaks down for vectorized input.

If you stick to only the Python Standard Library, you will not get to see the power of thinking in vectors. At least that’s my impression. If you have programmed in R like me and are then exploring Python, you will in fact be eagerly, thirstily, looking for stuff that facilitates vectorization. Thanks to Travis Oliphant, we now have the Numerical Python (NumPy) library that gives a lot of R-equivalent functions in Python.

Historically, Numarray and Numeric were Python libraries for matrix computations. Numeric was first released in 1995. In 2001, a number of people inspired by Numeric created SciPy, an open source Python scientific computing library that provides functionality similar to Matlab, Maple and Mathematica. Around this time, people were growing increasingly unhappy with Numeric. Numarray was created as an alternative to Numeric that was better in some areas. Soon there were a lot of developments around Numarray, and SciPy, which depended on Numeric, could not take advantage of them.


In 2005, Travis Oliphant, a professor and an early contributor to SciPy, decided to do something about the situation. He tried to integrate some of the Numarray features into Numeric. A complete rewrite took place that culminated in the release of NumPy 1.0 in 2006. Originally NumPy was a part of SciPy, but today it exists as an independent library. This book is meant to teach some basic skills to work with NumPy, and it does so with a good balance of theory and practical examples.

I feel NumPy does not have a steep learning curve if you are already exposed to Matlab or R. The reason is obvious: once you are familiar with vectorization, you can easily spot the functions and understand them. I have thoroughly enjoyed reading this book as it has equipped me with skills to translate R code into Python and at the same time get all the powerful features of NumPy. As such, NumPy is a package that is usually a part of the basic toolkit for any researcher using Python. The fact that it has been around for quite some time means that it has matured as a library. Its first release was NumPy 1.0 in 2006. As of today, you can get NumPy 1.6, i.e. you have the advantage of using a very stable library that has gone through a life of almost 6 long years. NumPy is used by scientists, statisticians, researchers, academicians and quants who typically need to deal with huge data sets and are looking for quicker computations. NumPy’s USP is that it is the closest you can get to running your code at C speed without actually writing C code.

The author lists some key points of NumPy that make it appealing in the context of Big Data:

  • Much cleaner than straight Python code
  • Fewer loops required because operations work on arrays
  • Underlying algos have stood the test of time
  • NumPy’s arrays are stored more efficiently than an equivalent data structure in base Python
  • Array I/O is significantly faster
  • Performance improvement scales with the number of elements. Really pays off to use NumPy for large datasets.
  • Large portions of NumPy are written in C. That makes NumPy faster than pure Python.

The book is organized into 10 chapters in such a manner that it takes a first-time NumPy reader systematically over all the major features of NumPy.

Chapter 1 : NumPy Quick Start

The first chapter starts off with a set of screenshots relevant to installing NumPy on Windows, Mac and other platforms. The highlight of the chapter is a simple example of summing two vectors and comparing the speed of NumPy code with that of plain vanilla Python code. I ran a single run of “generate two separate million-dimensional vectors, do some operations, add them” and found that NumPy was 11 times faster than plain Python code and 1.5 times faster than using list comprehensions. Instead of relying on this single run, I ran similar code 1000 times to get some summary statistics of the speed, comparing NumPy and list comprehensions against a plain vanilla Python loop. I found that NumPy was on average 14 times faster than plain vanilla Python code, and list comprehensions were on average 2 times faster. So all in all, NumPy is a clear winner over both simple Python loops and list comprehensions.
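A minimal sketch of the kind of comparison described above (my own toy operation, a ** 2 + b ** 3, and a smaller vector length than the book’s million-element example):

```python
import timeit

import numpy as np

N = 100_000  # the book's example uses a million elements; smaller here

def plain_loop():
    # element-wise i**2 + i**3 with an explicit Python loop
    c = []
    for i in range(N):
        c.append(i ** 2 + i ** 3)
    return c

def list_comp():
    # the same computation as a list comprehension
    return [i ** 2 + i ** 3 for i in range(N)]

def numpy_way():
    # the same computation vectorized over a whole ndarray
    a = np.arange(N, dtype=float)
    return a ** 2 + a ** 3

# one run each, just to compare orders of magnitude
for f in (plain_loop, list_comp, numpy_way):
    print(f.__name__, timeit.timeit(f, number=1))
```

Repeating the `timeit` call many times and summarizing, as described above, gives more trustworthy ratios than a single run.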

This chapter also introduced me to IPython. There was some learning curve for me, but once I figured out working with the magic commands, I realized that IPython was fantastic for interactive analysis. There was some inertia in going over IPython, but soon I realized that magic commands were indeed magical. I found a few videos on IPython that were helpful in understanding it. After a brief encounter with IPython, I stumbled on to a chapter in a book titled `Python for Unix and Linux System Administration` that goes to length in describing various IPython commands. Now when I look at R programming, I realize that R has a GUI but probably needs something similar to IPython. Maybe it will be just a matter of time before some open source enthusiast develops an IPython equivalent in R. Some of the features of IPython that I find very useful are

  • Searching history and executing them
  • Repeating specific lines from history and executing them again
  • Tabbing feature
  • psearch
  • bookmarking paths
  • Ability to call external scripts.

Obviously there are many more magic commands available in IPython that I haven’t explored. I think it is hard to leave IPython once you start using it.

Chapter 2 : Beginning with NumPy fundamentals

At the heart of NumPy is the object ndarray, which comprises two parts: the actual data and metadata. An ndarray contains homogeneous data described by the metadata object, dtype. The first thing one needs to learn is to create an ndarray for various numerical types. The data in an ndarray can be obtained from an existing Python object like a list, using the arange function, from an existing array, using I/O, or using linspace. The dtype object is used to manipulate the default numerical type of the ndarray data. The chapter starts off with a list of constructors for populating data and metadata. It then talks about ways to slice and dice the array. All the R-equivalent functions of cbind and rbind are mentioned. There is an important point that is not highlighted well enough in this chapter: slicing and dicing an ndarray does not give you a new ndarray but a view into the same memory. This means that if you extract a subset of data and update it, the original ndarray gets updated too. To avoid this inconvenience, the result of slicing can be put through the copy function, which creates a new ndarray from the existing one. Very much like the unlist function in R, there are two functions, flatten and ravel, that can be applied to an ndarray to retrieve the data elements in the form of a 1-D array. The difference between the two is that flatten allocates new memory whereas ravel doesn’t. So, depending on your requirement, you might have to use one or the other. By the end of this chapter, a reader gets a good idea of creating an ndarray and stacking/resizing/reshaping/splitting ndarrays.
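The view-versus-copy behaviour described above can be seen in a few lines (the array values are my own toy example):

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

# Slicing returns a view: modifying the slice modifies the original
s = a[0, :]
s[0] = 99
print(a[0, 0])        # 99 -- the original array changed

# .copy() gives an independent array
c = a[0, :].copy()
c[0] = -1
print(a[0, 0])        # still 99

# flatten() always allocates new memory; ravel() returns a view when it can
f = a.flatten()
f[1] = 500
print(a[0, 1])        # unchanged: flatten copied the data

r = a.ravel()
r[1] = 500
print(a[0, 1])        # 500 -- ravel gave a view into a
```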

Chapter 3 : Get in to Terms with Commonly Used Functions

The first thing that any data analyst needs to do is to get the data into the working environment. Much like the read.csv and read.delim functions in R, you have the loadtxt and genfromtxt functions in Python. The syntax is more or less similar. genfromtxt has arguments where you can define your own converters while extracting data. My guess is genfromtxt is far more useful than loadtxt, as real-world data is always messy and needs to be treated in some way or the other before getting it into an analytic environment. A sample list of ndarray functions such as sum, cumsum, cumprod, mean, var, std, argmax and argmin is mentioned. The author illustrates these functions using some finance-related examples like calculating stock returns and plotting SMAs, Bollinger bands, trend lines, etc. By the end of this chapter, a reader gets a good idea of doing I/O operations in NumPy and using various built-in functions on an ndarray object.
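A small sketch of the I/O and summary functions mentioned above, reading from an in-memory string instead of a real file so it stays self-contained:

```python
import io

import numpy as np

# loadtxt for clean numeric data
data = np.loadtxt(io.StringIO("1,2,3\n4,5,6"), delimiter=",")

# genfromtxt copes with messy data: empty fields become nan,
# and you can supply converters to clean fields on the way in
messy = np.genfromtxt(io.StringIO("1,,3\n4,5,"), delimiter=",")

print(data.sum())               # 21.0
print(data.cumsum(axis=0))      # running column totals
print(int(np.isnan(messy).sum()))  # 2 missing fields
```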

Chapter 4 : Convenience Functions for your convenience

This chapter gives a taste of basic statistical functions in NumPy such as covariance, correlation, polynomial fitting and smoothing functions. The best thing that I came across in this chapter is the vectorize function. Write whatever function you want, pass it to the vectorize function in NumPy, and your function is all set to take vectorized input. Coming from an R environment, where it is the coder’s responsibility to ensure vectorized input handling, this function is like a boon. You write a normal function, apply numpy.vectorize, and your function is ready. Wow! It’s so neat.
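A sketch of numpy.vectorize in action, using a made-up scalar function (a call-option payoff, in the spirit of the book’s finance examples):

```python
import numpy as np

def payoff(spot, strike):
    # an ordinary scalar function: call-option payoff
    return max(spot - strike, 0.0)

# vectorize wraps the scalar function so it accepts arrays
vpayoff = np.vectorize(payoff)

spots = np.array([90.0, 100.0, 110.0])
print(vpayoff(spots, 100.0))   # [ 0.  0. 10.]
```

Note that vectorize is a convenience for broadcasting a scalar function, not a performance tool: the loop still runs in Python.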

Chapter 5 : Working with Matrices and ufuncs

This chapter covers matrices, a subclass of the ndarray object. All basic matrix operations are covered: transpose, inverse, solving linear equations, etc. The highlight of this chapter is ufuncs, an abbreviation for universal functions. Universal functions are objects representing operations such as add, and they come with methods of their own, namely reduce, accumulate, reduceat and outer. Now here is where things get interesting: reduce along an axis is the NumPy equivalent of rowSums and colSums in R, while accumulate gives running (cumulative) results. All four of these methods are very useful in data munging.
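The four ufunc methods named above can be sketched on a small array (the values are my own example):

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)

# reduce collapses an axis: the analogue of colSums / rowSums in R
print(np.add.reduce(a, axis=0))       # column sums: [12 15 18]
print(np.add.reduce(a, axis=1))       # row sums:    [ 6 15 24]

# accumulate keeps the running totals along an axis
print(np.add.accumulate(a, axis=0))   # cumulative column sums

# reduceat reduces over index ranges: sums of [0:4] and [4:8]
print(np.add.reduceat(np.arange(8), [0, 4]))   # [ 6 22]

# outer applies the operation to every pair of elements
print(np.multiply.outer([1, 2, 3], [10, 100]))
```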

Chapter 6 : Move Further with NumPy modules

This is the chapter in the book that made me feel that I was reading code in R. Except for a few cosmetic changes in function signatures, all the object and class signatures of NumPy’s linalg module and random module look similar to R code. The chapter starts off by describing the linalg module’s functionality. All the usual decompositions are supported, i.e. Cholesky, SVD, pseudo-inverse, eigenvalue decomposition, etc. Subsequently the chapter talks about various statistical distributions and the random number generators relating to those distributions. Overall a very easy read for a reader exposed to R.
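A sketch of the linalg and random functionality described above (the system of equations is my own toy example):

```python
import numpy as np

# Solve Ax = b, the NumPy analogue of R's solve(A, b)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)
print(x)                                  # [2. 3.]
print(np.allclose(np.dot(A, x), b))       # True

# Decompositions live in the same module; A is positive definite,
# so the Cholesky factor L satisfies L L^T = A
L = np.linalg.cholesky(A)
print(np.allclose(np.dot(L, L.T), A))     # True

# Random number generation, the counterpart of rnorm/runif in R
np.random.seed(42)
sample = np.random.normal(loc=0.0, scale=1.0, size=5)
print(sample.shape)                       # (5,)
```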

Chapter 7 talks about special functions like sorting, searching, financial utilities, etc. For testing ndarrays, NumPy has a testing package, and Chapter 8 describes its main functions, which can be used to test the equivalence of two ndarrays/matrices. I have skipped Chapter 9, which gives a 10,000 ft view of matplotlib, the graphics package in Python; I thought I would go over it carefully and learn it properly rather than getting some cookbook kind of learning. The last chapter of this book talks about SciPy, where there are many more modules for a researcher looking beyond NumPy. It has made me curious about scikits.statsmodels. I am certain that the models covered in scikits.statsmodels are far fewer than the ocean of models available in R, but I am curious to know what sort of models are available there. Will go over it someday.

 

Takeaway:

This book gives a good working knowledge of the Numerical Python library. By explaining the nuts and bolts of the major ones among the 400-odd functions in the NumPy package, it gives the reader good enough ammunition to crunch data.


This book is cited as a classic reference for Python programmers. Instead of diving into Python as the title suggests, I did some groundwork before going through it. I went over Learning Python the Hard Way, Think Python and the Python Visual QuickStart Guide and understood Python 101. Those three books gave me some confidence to go over this book, which is supposedly for experienced programmers.

The book is organized in a very interesting way. The author starts off every chapter with a rather challenging Python program’s source code. A reader is expected to go over the code before reading through the chapter. Most of the time I was clueless about what was coded, but I kept moving and found that the author explains almost every line of code written, the rationale behind choosing a specific Python object, the specific programming style, and many other things in the code. The author manages to bring in some humor too, an element that is often not seen in programming books.

I have realized the power of reproducible research, thanks to R. So, the first thing to work on was literate programming in Python. I stumbled on to Pweave, a great module for literate programming. As the name suggests, it is the Python version of Sweave documents. There are ways to convert a Pnw document to whatever format you want your output in. The intermediate step typically is to convert the Pnw to a reST document; subsequently the reST document can be converted to html/pdf/doc, etc. This summary in html format is also the output of using a Pnw document (Pnw => reST => html).

I have learnt a ton of stuff from this book, such as generators, development approaches like TDD, the power of regex, and many Python hacks to make programs compact and elegant, like optimizing lookups.

In this post, I will try to summarize the main points of the book.

Chapter 2 – Your First Python Program

  • Python has several implementations such as IronPython, Jython, PyPy and Stackless Python. The default interpreter from python.org is the CPython implementation
  • Every Python function returns something, either a value or None
  • Variables are never explicitly typed in Python. This sort of thing is called dynamic typing
  • Came across a very interesting comparison between Python datatypes and other language data types
    • Statically typed language – A language in which types are fixed at compile time. Most statically typed languages enforce this by requiring you to declare all variables with their datatypes before using them. Java and C are statically typed languages.
    • Dynamically typed language – A language in which types are discovered at execution time; the opposite of statically typed. VBScript and Python are dynamically typed, because they figure out what type a variable is when you first assign it a value.
    • Strongly typed language – A language in which types are always enforced. Java and Python are strongly typed. If you have an integer, you can’t treat it like a string without explicitly converting it.
    • Weakly typed language – A language in which types may be ignored; the opposite of strongly typed. VBScript is weakly typed. In VBScript, you can concatenate the string ’12’ and the integer 3 to get the string ‘123’, then treat that as the integer 123, all without any explicit conversion.
  • Python is both a dynamically typed and a strongly typed language. Once a variable has a datatype, it actually matters
  • The sys module is written in C, as are all the built-in modules
  • Everything in Python is an object, and almost everything has attributes and methods
  • sys module is an object that has path as the attribute
  • Definition of a class in Python is rather loose. Everything is an object in the sense that it can be assigned to a variable or passed as an argument to a function. Some objects have neither attributes nor methods. Not all objects are subclassable
  • I thought that 4 spaces of code indentation is a MUST. This chapter says that it is not necessary; the spacing only needs to be consistent
  • Indentation is a requirement and not a matter of style. Hence all programs look similar, and it is easier to read and understand other people’s code
  • The if __name__ trick: modules are objects, and all modules have a built-in attribute __name__. A module’s __name__ depends on how you’re using the module. If you import the module, then __name__ is the module’s filename, without a directory path or file extension. If you run the module as a standalone program, __name__ will be the special default value __main__

Chapter 3 – Native Datatypes

  • Dictionary keys are case sensitive
  • Dictionaries support mixed keys. Dictionary values can be strings, integers, lists, dictionaries, etc. However, keys have some restrictions: they can be strings, integers and a few other data types
  • Dictionaries are an efficient means of storing sparse data
  • Sorting a dictionary using three different ways
  • Lists have two methods, extend and append, that look like they do the same thing but are in fact completely different. extend takes a single argument, which is always a list, and adds each element of that list to the original list. On the other hand, append takes one argument, which can be any data type.
  • Python evaluates anything in a Boolean context according to the following rules
    • 0 is false
    • An empty string is false
    • An empty list is false
    • An empty tuple is false
    • An empty dictionary is false
  • remove only removes the first occurrence in the list
  • pop is an interesting beast as it removes the last element in the list as well as returns the deleted element
  • extend is faster than concatenating the list as the latter creates a new list whereas the former merely extends the list
  • tuples have no methods. They are immutable objects
  • tuples are faster than lists
  • It makes your code safer if you write-protect data, and tuples can come in handy for that
  • Dictionary keys should be immutable and hence tuples can be dictionary keys
  • Tuples can be converted to lists and vice-versa
  • use tuples to assign multiple values at once
  • An easy way to assign values to day of the week
  • Tuples are used in formatting. I did not observe this thing even though I worked through a ton of examples in LPTHW. I need to be alert about the kind of code that I work on
  • Tuples are used in string concatenation as using a plus operator between string and integer raises an exception
  • One of the most powerful features of Python is the list comprehension, which provides a compact way of mapping a list into another list by applying a function to each of its elements
  • Everything is an object. "," is also an object, as one can invoke its join method
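Several of the bullets above (extend vs append, pop, tuples as dictionary keys, tuples in formatting) can be sketched together; the values are my own toy examples:

```python
# extend vs append: extend splices in elements, append adds one object
li = [1, 2]
li.extend([3, 4])
print(li)          # [1, 2, 3, 4]
li.append([5, 6])
print(li)          # [1, 2, 3, 4, [5, 6]] -- the list itself became an element

# pop removes AND returns the last element
last = li.pop()
print(last)        # [5, 6]

# tuples are immutable, so they can serve as dictionary keys
grid = {(0, 0): "origin", (1, 2): "point"}
print(grid[(1, 2)])

# tuples in string formatting: '+' between str and int would raise TypeError
print("x = %d, y = %d" % (1, 2))

# "," is an object too: invoke its join method
print(",".join(["a", "b", "c"]))
```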

Chapter 4 – The Power of Introspection

This chapter starts off with a rather complex looking function and explains various components of the function

  • str can be used to convert anything into a string
  • dir lists the attributes and methods of any object
  • Callable objects include functions, class methods, even classes themselves
  • One can use getattr to invoke a function that is not known until run time
  • getattr can be used as a dispatcher. Let’s say you want to do something different based on the type of input; you can use the various input types in function names, code the corresponding functions, and use getattr to dispatch to them
  • You can add a default function , in the getattr method
  • Python has powerful capabilities for mapping lists into other lists, via list comprehensions.
  • The list filtering syntax [mapping-expression for element in source-list if filter-expression]
  • Booleans are handled in a peculiar way in Python. 0, '' (the empty string), [], (), {} and None are false in a Boolean context; everything else is true.
  • In the case of OR statements, the operands are evaluated from left to right. If all of them are false, then OR returns the last value
  • You can define one-line mini functions on the fly. These are called lambda functions. There is no return statement, the function has no name, and they cannot contain statements
  • If you want to encapsulate specific non-reusable code without littering code, use lambda functions
  • Assigning functions to variables and calling a function by referencing the variable is important to understand properly. This mode of thought is vital to advancing one’s understanding of Python
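The getattr-as-dispatcher idea and the list-filtering syntax above can be sketched as follows; the class, the `output_` prefix and the method names are all made up for illustration, not from the book:

```python
class Output:
    # hypothetical handlers; the "output_" naming convention is my own
    def output_html(self, data):
        return "<p>%s</p>" % data

    def output_text(self, data):
        return data

    def output(self, data, fmt="text"):
        # getattr picks the handler at run time;
        # the third argument is the default when no match exists
        handler = getattr(self, "output_%s" % fmt, self.output_text)
        return handler(data)

out = Output()
print(out.output("hi", "html"))    # <p>hi</p>
print(out.output("hi", "nosuch"))  # hi  (fell back to the default)

# a lambda mini-function combined with the filtering syntax
# [mapping-expression for element in source-list if filter-expression]
evens_squared = [(lambda x: x * x)(n) for n in range(10) if n % 2 == 0]
print(evens_squared)               # [0, 4, 16, 36, 64]
```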

Chapter 5 – Objects and Object Orientation

Like other chapters in the book, this chapter starts off with a page-long code listing that captures all the important aspects of OOP.

  • Learnt about os.path.splitext(f), a function that splits a filename into two parts, one before the dot and one after it
  • Another function useful in normalizing a path: os.path.normcase(f)
  • To decide between from x import y and import x, it depends on how frequently one is using the function y in the code. If there is a possibility of namespace clashes, it’s better to stick with import x and use qualified names
  • Avoid doing a wild import
  • __init__ is like a constructor method but it is not. The object has already been constructed by the time init function is called
  • Subclassing is done easily by merely listing the parent classes in the parentheses
  • Python supports multiple inheritance
  • using self in the class methods is only a convention, but a very strong convention
  • class acts like a dictionary
  • __init__ methods are optional, but when you define one, you must remember to explicitly call the ancestor’s __init__ method
  • Every class has built-in attributes such as __name__ and __bases__, and every class instance has __class__
  • In Python, you simply call a class as if it were a function to create a new instance. There is no explicit new operator like in other languages
  • Memory leaks are rare in Python as it implements reference counting. As soon as something goes out of reference, it is removed immediately
  • In Python, you can forget about memory management and concentrate on other things.
  • There is no function overloading in Python
  • UserString, UserList and UserDict are wrapper classes that mimic built-in string, list and dict classes
  • You can write special methods like __getitem__ and retrieve from the class instance using a dict syntax.
  • There are a ton of special class methods that you can write, for comparison, length, etc.
  • The convention for defining special class methods is to prepend and append two underscores to the function name
  • Class attributes are different from data attributes. One can think of class attributes as static attributes that are associated with the class. They are present even before instantiating the class. Class attributes are defined soon after the class definition statement
  • Data variables are defined in __init__ method
  • In Python, a class method or attribute is either private (by naming convention) or public. There is no protected scope like in C++
  • __class__ is a built-in attribute of every class instance. It is a reference to the class that self is an instance of
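Several of the bullets above (class vs data attributes, special methods like __getitem__ and __len__, __class__) can be sketched in one toy class of my own devising:

```python
class Tracker:
    # class attribute: exists before any instance and is shared by all
    instances = 0

    def __init__(self, items=None):
        # data attributes live on the instance and are set in __init__
        self.items = dict(items or {})
        Tracker.instances += 1

    # special methods use the prepend/append-two-underscores convention
    def __getitem__(self, key):
        # allows dict-style lookup on the instance
        return self.items[key]

    def __len__(self):
        return len(self.items)

t = Tracker({"a": 1})
print(t["a"])                 # 1  -- via __getitem__
print(len(t))                 # 1  -- via __len__
print(Tracker.instances)      # class attribute, not tied to t
print(t.__class__.__name__)   # Tracker
```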

Chapter 6 – Exceptions and File Handling

  • try …except has the same function as the try catch block in other languages
  • try .. except.. else :  If no exception is raised in the try block, the else clause is executed afterwards
  • try… finally : the code in the finally block will always be executed, even if something in the try block raises an exception
  • Most other languages don’t have a powerful list datatype like Python’s, so in Python you don’t need to use for loops that often
  • os.environ is a dictionary of the environment variables defined on your system.
  • sys module contains system-level information such as the version of Python you are using etc.
  • sys.modules is a dictionary containing every module that has been imported
  • Given the name of any previously imported module, you can get a reference to it through the sys.modules dictionary
  • The os.path.split function splits a full pathname and returns a tuple containing the path and filename
  • The splitext function splits a filename and returns a tuple containing the base name and the file extension
  • isfile and isdir are useful to check whether the object is a file or a directory
  • The glob module helps in filtering files in a folder using wildcards
  • fileinfo.py has taught me a lot about Python syntax and OOP concepts. I think it will take a looooong time before I manage to write a program that is as elegant and succinct as fileinfo.py
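The exception-handling forms and module facilities in the bullets above can be sketched together (the safe_div function is my own toy example):

```python
import os
import sys

def safe_div(a, b):
    try:
        result = a / b
    except ZeroDivisionError:
        # runs only when the try block raised this exception
        result = float("inf")
    else:
        # runs only when the try block raised nothing
        result = round(result, 3)
    finally:
        # always runs, exception or not
        pass
    return result

print(safe_div(1, 3))   # 0.333
print(safe_div(1, 0))   # inf

# os.environ behaves like a dictionary of environment variables
print(isinstance(dict(os.environ), dict))   # True

# any previously imported module is reachable through sys.modules
print(sys.modules["os"] is os)              # True
```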

Chapter 7 – Regular Expressions

This chapter introduces regular expressions in a superb manner using three case studies. The first involves parsing street addresses, the second parsing roman numerals, and the third parsing telephone numbers. All the Regex 101 aspects are discussed, such as

  • ^ matches the beginning of the string
  • $ matches the end of the string
  • \b matches a word boundary
  • \d matches any numeric digit
  • \D matches any non-numeric character
  • x? matches an optional x character
  • x* matches x zero or more times
  • x+ matches x one or more times
  • x{n,m} matches an x character at least n times, but not more than m times
  • (a|b|c) matches either a or b or c
  • (x) in general is a remembered group. You can get the value of what was matched by using the groups() method
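A sketch of the patterns above applied to a phone number; this simplified pattern is my own and is stricter than the book's more forgiving version:

```python
import re

# (...) groups remember what matched; \(? and [-. ]? are optional pieces
phone = re.compile(r"^\(?(\d{3})\)?[-. ]?(\d{3})[-. ]?(\d{4})$")

m = phone.match("(800) 555-1212")
print(m.groups())     # ('800', '555', '1212')

# ^, $, ?, + in action: optional sign, then at least one digit
num = re.compile(r"^[-+]?\d+$")
print(bool(num.match("-42")))   # True
print(bool(num.match("4 2")))   # False
```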

Chapter 8 – HTML Processing

The chapter starts off by showing a program that looked overwhelming to me. It’s a program that parses an external HTML page, translates the text into various languages and renders it into another, translated HTML page. So, at the outset, reading through the program I did not understand most of the things in it. Basically that’s the style maintained throughout the book: introduce a pretty involved program and explain each of the steps involved in it. The chapter starts off by talking about SGMLParser, which takes in an HTML document and consumes it. Well, that’s all it does. So, what’s the use of such a class? One has to subclass it and provide methods so that one can do interesting things. For example, one can define a start_a method and list all the URLs in a page. This means instead of manually going through the data to find all the a hrefs, you can extend this method and get all the links in the page. If a method has not been defined for a specific tag, the unknown_starttag method is invoked. The chapter then talks about locals and globals, functions that are useful in string formatting. So, the basic structure of this chapter is: start with SGMLParser, subclass it to create BaseHTMLProcessor, subclass that to create Dialectizer, and then subclass it to create various language-specific Dialectizers. One gets to understand the way to make a program extensible by reading dialect.py carefully. This chapter makes one realize the power of sgmllib.py to manipulate HTML by turning its structure into an object model. This module can be used in many different ways, some of them being

  • parsing the HTML looking for something specific
  • aggregating the results, like the URL lister
  • altering the structure along the way
  • transforming the HTML into something else by manipulating the text while leaving the tags alone

After going through this chapter, I learnt to write a basic link crawler.
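A sketch of such a URL lister; sgmllib is Python 2 only, so this uses html.parser as a modern stand-in for the same subclass-the-parser idea (the page string is my own example):

```python
from html.parser import HTMLParser

class URLLister(HTMLParser):
    """Collect the href of every <a> tag, in the spirit of the
    book's SGMLParser-based URL lister."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # called for every opening tag; keep only <a href="...">
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.urls.append(value)

page = '<html><body><a href="http://a.example/">A</a> <a href="/b">B</a></body></html>'
lister = URLLister()
lister.feed(page)
print(lister.urls)   # ['http://a.example/', '/b']
```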

Chapter 9 – XML Processing

The chapter starts with a 250-line program that was overwhelming for me to go through. However, the author promises that he will take the reader carefully over all aspects of the code. After this mega code listing, the chapter starts talking about packages and the need for organizing Python programs into packages. The XML package uses Unicode to store all parsed XML data, and hence the chapter then dwells on the history of Unicode. Python uses the ASCII encoding scheme whenever it needs to auto-coerce a Unicode string into a regular string. The last two sections of the chapter talk about searching for elements and accessing element attributes in an XML document. Overall, this chapter shows that accessing and reading an XML document in Python is made easy by the xml module.

Chapter 10 – Scripts and Streams

  • One of the powerful uses of dynamic binding is the file-like object
  • A file-like object is any object with a read method that takes an optional size parameter
  • File-like objects are useful in the sense that the source could be anything: a local file, a remote XML document, a string
  • Standard output and error are pipes that are built into every UNIX system. When you print something, it goes to the stdout pipe; when your program crashes and prints out debugging information, it goes to the stderr pipe
  • stdout and stderr are both file-like objects. They are both write-only
  • In a Windows-based IDE, stdout and stderr default to the interactive window
  • To read command line arguments, either you can import sys and use the iterator sys.argv or use getopt module

I have skipped Chapters 11 and 12, which are based on web services and SOAP. Will refer to them to get some general idea at a later date.

Chapter 13 – Unit Testing

This chapter’s basic message is “write unit tests first, code later”. This is the programming approach popularly known as Test Driven Development. Some of the points I learnt from this chapter are

  • You have to subclass unittest.TestCase so that you can use all the useful features of the module in your own tests
  • Each individual test that you write takes no arguments and returns no value whatsoever. If the method exits normally without raising any exception, the test is considered passed
  • The TestCase class provides a method called assertEqual to check whether two values are equal.
  • There is also a method called assertRaises to check whether the code fails for bad input. Instead of manually calling the function, passing in the argument and then checking whether it raises a specific exception, assertRaises is a good way to check all this in one single function call
  • Each test case should handle only one question
  • Each test case must be able to work independently of the other test cases
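The bullets above can be sketched with a tiny TestCase; the to_roman function is my own minimal stand-in for the book's conversion routine, not its actual code:

```python
import unittest

def to_roman(n):
    # a small stand-in for the book's converter, valid for 1..3999
    if not 0 < n < 4000:
        raise ValueError("number out of range")
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = ""
    for value, numeral in pairs:
        while n >= value:
            out += numeral
            n -= value
    return out

class KnownValues(unittest.TestCase):
    def test_known_value(self):
        # takes no arguments, returns nothing; exiting normally = pass
        self.assertEqual(to_roman(1987), "MCMLXXXVII")

    def test_too_large(self):
        # assertRaises checks bad input in one call
        self.assertRaises(ValueError, to_roman, 4000)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(KnownValues)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())   # True
```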

Chapter 14 – Test First Programming

The purpose of this chapter is to show code development via testing. In the previous chapter, a set of unit tests was written for testing the conversion from numbers to roman numerals and back. Through a series of Python programs, the chapter manages to come up with code that passes all the tests. I particularly like this idea of writing tests before you even start coding; it forces one to think of all the possible cases before developing a nice piece of code. I tried doing this without looking at the author’s workings and found the solution in the book a million times more elegant than my code. It never occurred to me that you can use regular expressions for validating the input to the conversion code. Overall one of the best chapters in the book.

Chapter 15 – Refactoring

By adding some more functionality to the roman numerals problem, the author shows a way to refactor the code.

Chapter 16 – Functional Programming

  • The map and filter functions have been in Python forever; list comprehensions were introduced in Python 2.0. All three are very useful when you want to vectorize stuff. If you have already coded in MATLAB or R, vectorizing is the way you think and code, and list comprehensions enable you to think in a vector-centric way. Out of the three, I think I like list comprehensions the most.
  • The author strongly recommends using map, filter and list comprehensions for writing better code
  • The book shows a nice way to import modules dynamically
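A sketch of the equivalences above; note the list() wrappers are needed on Python 3, where map and filter return iterators rather than lists:

```python
nums = [1, 2, 3, 4, 5]

# map vs list comprehension: same result, two spellings
doubled_map = list(map(lambda x: x * 2, nums))
doubled_lc = [x * 2 for x in nums]
print(doubled_map == doubled_lc)   # True

# filter vs the comprehension's if-clause
odds_filter = list(filter(lambda x: x % 2, nums))
odds_lc = [x for x in nums if x % 2]
print(odds_filter == odds_lc)      # True

# dynamic import: pull a module in by name at run time
mod = __import__("math")
print(mod.sqrt(16))                # 4.0
```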

Chapter 17 – Dynamic Functions

This is my favorite chapter of the book. I was amazed at how the author takes a simple example of pluralizing a noun and introduces lambda functions and generators to make the code look extremely beautiful. From a raw if-then code listing, the author takes the reader in a systematic manner, over 6 iterations, to code that is simply beautiful. I am really thrilled to see so much infrastructure built into this language. I don’t think I will ever move away from Python and R to do anything, at least for now. The one thing that I have to practice and implement in my own code soon is generators. For me, this chapter had so many ‘aha’ moments that I will revisit it at a later date. The last chapter offers a list of performance hacks that an experienced programmer would probably appreciate. Overall a fantastic book. Loved every moment of it.

Takeaway:

Reading others’ code is an activity that a programmer must make a part of his daily life to become a better programmer. This book is organized in such a way that you are forced to read code to start with, and then you are given the rationale behind the code. I think this particular structure is the highlight of the book, and it is no wonder that this is one of the most popular books on Python.


This book gives a quick intro to Python, and one does not need any prior programming knowledge to go over the contents. I read the book quickly, in an hour’s time, and it served as a syntax F5er.

  • Some of the GUI libraries mentioned that I need to explore are PyQT, PyGTK, wxPython, TkInter
  • Need to check out Charming Python link mentioned in the book
  • There is no char datatype in Python. Never realized this actually.
  • Statements which go together must have the same indentation.
  • A singleton tuple can be created by specifying it this way: (a,). Why does one require them, though?
  • os.system(command) will execute the command. Good for creating your own backup routines
  • All class members are public and all the methods are virtual in Python.
  • Python does not automatically call the constructor of the base class; you have to explicitly call it.
  • A variable starting with double underscore is effectively a private variable
  • Python provides a standard module called pickle using which you can store any Python object in a file and then get it back later intact. One can dump and retrieve objects into a file (the book uses a .data extension) using the pickle and cPickle modules. This is similar to the save and load commands in R
  • You can pass multiple arguments to a function using tuples or lists.
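The pickle round trip mentioned above can be sketched in a few lines; the object and the file name are my own examples (cPickle is the Python 2 accelerated version; plain pickle works on both):

```python
import os
import pickle
import tempfile

config = {"entries": [1, 2, 3], "active": True}

# dump the object to a file and load it back intact,
# much like save()/load() in R
path = os.path.join(tempfile.gettempdir(), "demo.data")
with open(path, "wb") as f:
    pickle.dump(config, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == config)   # True
print(restored is config)   # False -- a new, equal object
```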