Originally Published: Wednesday, 30 August 2000 Author: Jason Tackaberry
Published to: develop_articles_tutorials/Development Tutorials

Programming with Python - Part 1: Baby Steps

This tutorial is the first in a series that will introduce you to Python, a dynamic, object-oriented language that is rapidly gaining popularity. Since I had built up a modest Perl background before learning Python, I will target this tutorial at the Perl programmer. However, anyone with a little programming experience should find this text useful.

Programming with Python series
1. Baby Steps
2. The Real World
3. Extending Python

In the interest of full disclosure, Python is my language of choice. I have a reasonably solid background in Perl, C, and C++, but when I feel Python will do the job, I tend to favor it. So, despite my care to the contrary, some of the comments in this series may be subjective. If you have any strong objections, please mail me. We might be able to spark some interesting discussions.

In the Beginning...

Some time in the 80s, Guido van Rossum, Python's lead developer, co-authored a programming language called ABC, geared toward teaching programming concepts. ABC, while never really becoming very popular, was overhauled in its entirety, and became the proud father of Python. One of Python's core principles was and still is simplicity and elegance. More than 10 years later, Python has retained its elegance, and has only grown in features and popularity.

...and, yes, Python was indeed named after Monty Python.

So what is Python, anyway? Python is every one of these things: interpreted, dynamic (loosely typed), object-oriented, portable, clean, elegant, easy to learn, powerful, embeddable, extendable, freely available, actively developed, and widely used. Because of its simplicity and clean syntax, Python makes an excellent first language. And because of its incredibly diverse library of modules, it is an excellent language for experienced developers. Python is suitable for small projects, and scales beautifully for very large projects.

Python is commonly used to create software prototypes: versions that are used to model and prove a design, and then ported to lower-level languages or discarded. You may be surprised that one would simply discard a prototype; throw-away prototyping is a common practice in software engineering, and Python's simplicity makes this practice feasible. This case study documents Python's role in a commercial environment, where it was used to prototype and ultimately build a modest-sized system.

Getting Down to Business

Okay, enough chitchat; let's get our hands dirty. Like Perl, Python is interpreted, and so runs using an interpreter. Also like Perl, Python code is compiled to byte-code before execution. For imported modules, this byte-code is saved to disk so that subsequent executions need not recompile the code, unless the source code has been modified, of course.

So, our first Python program can be typed directly into the interpreter, run through the interpreter by passing the source file as an argument, or run on its own by prefixing the source file with a #! directive pointing to the Python interpreter:

  #!/usr/bin/env python

Once this file is made executable, it can be run directly. The Python version of Hello World is rather uninteresting, but in the spirit of tradition, here it is, in all its glory:

  print "Hello World!"

A programming language is pretty useless without any form of flow control or iteration. However, before I introduce these constructs, I should gently ease you into one of Python's most controversial features. Whereas in C and Perl, groups of statements (blocks) are sandwiched in braces, in Python blocks are denoted solely by their indentation. Consider the following code in Perl:

  if ($foo == 1) {
    $foo = 4;
    print "Changed foo to 4";
  }

The same code would look like this in Python:

  if foo == 1:
    foo = 4
    print "Changed foo to 4"

In the Perl snippet, the whitespace preceding the lines in the code block is arbitrary. I could use any number of spaces or tabs; the choice is purely stylistic. In Python, the leading whitespace is mandatory. A certain degree of style is permitted: you can use any number of spaces or tabs. The only requirement is that you be consistent in your use. So, if the first line in the block is prefixed with a tab, you had better make sure the next line is as well, or else the interpreter will generate an error.

If you're accustomed to Perl or C, at this point you must be crying, "that's just absurd!" I know I did. But once you start using Python you quickly become at ease with this. And once you start weeding through thousands of lines of code, you begin to appreciate it. It may seem absurd now, but unless you have years of experience sorting through nests of braces, it makes the code much more readable. 3 or 4 nested blocks in Perl will look to the beginner like Lisp looks to us mere mortals.

The Basics

Now we know how to start a Python program, and hopefully by now have gotten over the shock of the indentation requirement. There are a few things we need to have under our belts before we can dive in.

First, in Python, almost everything is an object. Strings are objects, integers are objects, functions are objects, lists are objects, and so on. Some objects may be acted on directly; that is, these objects have methods that may be invoked. For example, list objects have an append method that lets you append an object to the list. Other objects, such as strings and integers, must be operated on indirectly. So, you won't call a string's split method, but instead will use the split function offered by the string library (module). This idiosyncrasy has been addressed somewhat in Python 1.6 (at least, string objects now have methods), but in general knowing what's what is just a matter of memory work.

Python objects come in two flavors: mutable and immutable. Immutable objects are those objects which cannot be directly changed, such as strings or integers. If you want to concatenate some text onto a string, you don't modify a string object. Instead, you concatenate two string objects together and produce a new string object. Mutable objects can be modified directly. Lists, for example, are mutable. Adding an item to a list does not produce a new list object; it just modifies the list you're working with.
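
The difference is easiest to see when two names refer to one object. The following is only a sketch, written in present-day Python 3 syntax (where print is a function rather than a statement); the variable names are made up for illustration.

```python
# Strings are immutable: concatenation builds a new object.
greeting = "Hello"
original_greeting = greeting        # a second name for the same string object
greeting = greeting + ", world"     # rebinds greeting to a brand-new string

# Lists are mutable: append changes the existing object in place.
names = ["Dick", "Jane"]
same_names = names                  # a second name for the same list object
names.append("John")

print(original_greeting)            # the old string object is untouched
print(same_names)                   # the alias sees the appended item
print(same_names is names)          # still one and the same list object
```

The string that original_greeting refers to never changes, while the list is modified underneath both of its names.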

Like Perl, Python has all the high level data types we've come to know and love. Tuples are much like lists in an array context in Perl, except that they are immutable. For example, (1, 2, 3, "foo", "bar", "baz") is a tuple construct. Python lists are more like Perl lists because they are mutable. What's the point of the tuple data type, you ask? Immutable types are easier to build internally, they have a simpler interface, and they are much more efficient. Why incur the overhead of a list when only a tuple is needed, such as with return values for example? Finally, Python offers dictionaries, associative arrays which Perl coders will know as hash tables. Let's have a look at some code that uses all three of these data structures:

  import types

  stuff = (1, 2, 3, "foo", "bar", "baz") # this is a tuple with 6 elements
  list = [] # initialize an empty list
  ages = { "Dick": 36, "Jane": 25, "John": 20 } # a dictionary

  # Go through the tuple and append only the strings onto the list
  for element in stuff:
    if type(element) == types.StringType:
      list.append(element)

  # Now list all the people and their ages from the ages dictionary
  for person in ages.keys():
    print person, "is", ages[person], "years old"

This example shows us not only tuples, lists, and dictionaries, but introduces some iteration and selection constructs. The for statement iterates through any sequence object (either a list, tuple, or string) and executes the code block that follows it. In the first instance, the for loop will iterate across each of the items in the stuff tuple. The first time through, element becomes 1, and then 2, and so on, until it finally finishes with baz. The second for loop iterates over all the keys in the ages dictionary. The keys are those things that you want to look up, or index, in the dictionary. So, the person variable takes on the values of Dick, Jane, and John (in no particular order).
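
To underline that for works on any sequence, here is a rough sketch in current Python 3 syntax that loops over a tuple, a string, and the keys of a dictionary. The isinstance check and the sorted call are my own substitutions (they are not used in the example above); sorted is there only to make the otherwise arbitrary key order predictable.

```python
stuff = (1, 2, 3, "foo", "bar", "baz")
ages = {"Dick": 36, "Jane": 25, "John": 20}

# Iterating over a tuple: keep only the string elements.
strings = []
for element in stuff:
    if isinstance(element, str):
        strings.append(element)

# A string is a sequence too: the loop visits one character at a time.
letters = []
for letter in "baz":
    letters.append(letter)

# Iterating over dictionary keys, sorted so the order is predictable.
report = []
for person in sorted(ages.keys()):
    report.append("%s is %d years old" % (person, ages[person]))

print(strings)
print(letters)
print(report)
```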

The block under the first for loop shows an if construct. The syntax of this line should be fairly intuitive, especially if you have a Perl or C background. Equality comparisons are done using the == operator; inequality is tested using !=; other unary, binary, bit-wise, and shifting operators work as you would expect. The only operators Python lacks that you may miss are the increment and decrement operators (++ and --, respectively), and the shorthand assignment operators, like +=, -=, and so on. I cringe every time I am forced to use a = a + 1 instead of a++; supposedly these operators were left out to improve readability, but the jury's still out on that one. However, the augmented assignment operators (+=, -=, and friends) will at long last be included in Python 2.0. Comparison operators may act on any object, and will behave differently depending on the context. Comparisons between integers will do arithmetic comparisons; comparisons between strings will do lexical comparisons; comparisons between tuples will perform element-by-element comparisons; and so on. There's nothing particularly magical about an if statement. The code block that follows it is executed if the expression evaluates to true.
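
A few concrete comparisons may help here. This is a small sketch in current Python 3 syntax; the variable names are purely illustrative, and the last two lines use the augmented assignment that arrived with Python 2.0.

```python
# Integers compare arithmetically.
int_result = 3 < 10                    # True

# Strings compare lexically, character by character.
str_result = "3" < "10"                # False, because "3" sorts after "1"

# Tuples compare element by element, left to right.
tuple_result = (1, 2, 3) < (1, 2, 4)   # True, decided by the last element
prefix_result = (1, 2) < (1, 2, 0)     # True, a shorter prefix sorts first

# Augmented assignment, available from Python 2.0 onward.
counter = 10
counter -= 1

print(int_result, str_result, tuple_result, prefix_result, counter)
```

Note how the same two digits give opposite answers depending on whether they are compared as integers or as strings.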

Another iteration method you're probably familiar with is the while statement:

  i = 10
  while i > 0:
    i = i - 1

And this does precisely what you'd expect. The while statement evaluates the expression (in this case i > 0), and executes the block of code that follows for as long as it evaluates to true.

The syntax for defining functions is equally simple. For example:

  def say_hello(who, what):
    print "Hello, " + who + "! " + what

Calling say_hello("Fred", "How are you?") will output Hello, Fred! How are you?

Just in case you haven't noticed yet (I'm sure you have), variables in Python aren't explicitly assigned types, as is the case with any loosely typed language. The type, be it string, list, integer, or whatever, is bound to the variable on the fly. If you're a Perl coder, this may seem a little backwards to you. In Perl, the type of a variable is determined by the lvalue in an expression (the part on the left side of the assignment, in this case). For example:

  my %result = get_some_value();

Here the type of the variable result is determined by the lvalue of this expression, in this case a hash table. If the return value of get_some_value() is not a hash, then Perl will try to coerce it to one. Also, in Perl, $result is distinct and different from %result and @result. If you're not a Perl coder and don't know what any of this means, don't worry. Just understand that in Python, the type of a variable is determined by the rvalue of an expression (the part on the right side of the assignment). So if get_some_value() returns a tuple, the type of result is bound to a tuple. If it returns a string, result becomes a string type, and so on.
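
As a quick sketch of this in current Python 3 syntax: the get_some_value below is a made-up stand-in (including its kind parameter, which exists only so the example can return two different types), and the same name result simply takes on whatever type comes back.

```python
def get_some_value(kind):
    # Hypothetical stand-in for the get_some_value() mentioned in the text.
    if kind == "tuple":
        return (1, 2, 3)
    return "just a string"

result = get_some_value("tuple")
first_type = type(result).__name__     # the name of the type result is bound to now

result = get_some_value("string")      # same variable, now bound to a string
second_type = type(result).__name__

print(first_type, second_type)
```

No declaration or coercion is involved; the variable is just a name for whatever object the right-hand side produced.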

If you're coming from strictly a C++ background, you're about to discover the way polymorphism was meant to be. I'm going to conveniently sidestep a heated debate about whether static typing makes for better software engineering. Personally, I find loosely-typed languages a pleasure to work with.

Doing Something Useful

Up until now we've mostly been looking at Python's syntax, and learning about the basic building blocks. There's still plenty more to learn about, but at this point we're ready to look at some code that actually does something. Let's look at some code that reads /etc/passwd and prints out a list of users who are in groups whose group id is greater than 100.

  import string

  pwdfile = open("/etc/passwd")
  lines = pwdfile.readlines()
  for entry in lines:
    fields = string.split(entry, ":")
    if int(fields[3]) > 100:
      print fields[0]

Only a few select functions are built into Python's core. In order to do something useful, you'll need to use one or more modules. Python is distributed with a standard library containing a vast number of modules. In the first line, we import the string module, which allows us to perform common string operations. In our example, we're interested in the string module's split function.

After importing the string module, we then open the /etc/passwd file. The open function is a built-in function that returns a file object. File objects are one of the few built-in types. A second optional argument passed to open specifies whether the file should be opened for reading, writing, or appending. In the absence of this argument, read-only is assumed. Next, we call the readlines() method of the file object, which returns a list whose elements correspond to the lines in the file. Then, in the for loop, we iterate over each of the lines in the lines list, each of which we know corresponds to an entry in the passwd file.

The first line in the for code block calls the string module's split function. Perl coders will know right away what's happening here; this function separates the given string by a separation string, in this case ":", and returns a list of all the strings between (but not including) the separation string. Now we have a list called fields that holds the individual fields in the passwd entry. The group id is held in the fourth field, which is at index 3 (indices start at 0, like in Perl or C). First we must convert that field to an integer (because it's a string right now), and then do the comparison. If it's greater than 100, we print the user name, which is the first field. "But wait!" the Perl coder exclaims. Why do we have to explicitly convert the string to an integer? In Perl, this is done for you behind the scenes. In Python, you need to do this yourself.

A little bit of shorthand can be used in the above example. In particular, I would write the three lines after the import statement as:

  for entry in open("/etc/passwd").readlines():
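
For readers who want to try the same filter without touching a real system file, here is a sketch in current Python 3 syntax. The SAMPLE_PASSWD text, the function name users_in_high_groups, and its min_gid parameter are all invented for this illustration, and the split call is the string method rather than the string module function.

```python
# Made-up sample data in passwd format: name:password:uid:gid:comment:home:shell
SAMPLE_PASSWD = """root:x:0:0:root:/root:/bin/sh
alice:x:1000:1000:Alice:/home/alice:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/bin/false
bob:x:1001:500:Bob:/home/bob:/bin/sh
"""

def users_in_high_groups(passwd_text, min_gid=100):
    """Return the user names whose group id is greater than min_gid."""
    users = []
    for entry in passwd_text.splitlines():
        fields = entry.split(":")
        # The group id is the fourth field; it is a string until converted.
        if int(fields[3]) > min_gid:
            users.append(fields[0])
    return users

print(users_in_high_groups(SAMPLE_PASSWD))
```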

One feature of Python is that users needn't worry about freeing memory. Internally, Python objects use reference counting as a means of garbage collection. When an object is created, its reference count is initialized to 1. Each time a reference to the object is deleted or goes out of scope, the count is decremented. When the reference count reaches zero, the object is destroyed. The third tutorial in this series will go into more detail on reference counting. It's more of an implementation detail that you don't need to worry too much about, except that you should be aware that it exists. In the compressed code snippet above, the open call returns a file object that isn't being assigned anywhere. So once the readlines call has finished with it, the reference count of this object is decremented. Since nothing else holds a reference to it, the file object is destroyed and the file is closed.
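
If you are curious, the counting can be observed from Python itself. The sketch below uses current Python 3 syntax and assumes the standard C implementation of Python; the Resource class is a made-up placeholder. The absolute numbers reported by sys.getrefcount vary (the call itself may hold a temporary reference), so only the differences are compared.

```python
import sys
import weakref

class Resource:
    """Hypothetical placeholder object, used only for this demonstration."""

resource = Resource()
baseline = sys.getrefcount(resource)    # absolute value is not meaningful here

alias = resource                        # one more reference to the same object
with_alias = sys.getrefcount(resource)

del alias                               # dropping the reference lowers the count
after_del = sys.getrefcount(resource)

watcher = weakref.ref(resource)         # a weak reference does not keep it alive
del resource                            # the last real reference goes away
destroyed = watcher() is None           # the object has been destroyed

print(with_alias - baseline, after_del - baseline, destroyed)
```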

A Classy Example

An object-oriented language like Python must provide a way to create classes of objects. In essence, a class is a description of what methods (or member functions) an instance of that class provides, and the semantics of those methods.

First things first: we need to clear up some definitions. Python objects are not just instances of classes. Instances are objects, but not all objects are instances. Many of Python's built-in types are objects, but not classes. Earlier I talked about string objects. There is no string class from which these string objects are created. And to make matters even more confusing, classes are objects too. So, for you C++ programmers, if I talk about an instance in Python, I'm talking about what you'd call an object. But when I talk about an object, this is not necessarily an instance, unless of course it's an instance object. Clear as mud, right? This paragraph reads like Abbott & Costello's Who's on First, I realize. Reading it once more couldn't hurt.

The simplest class definition looks like this:

  class coordinate:
    pass

The pass keyword is one we haven't seen before. This keyword is a no-op. Where the Python syntax requires something and you don't actually want to do anything, pass is what you want to use. You may be wondering what good this coordinate class is. There are no methods for this class, but Python does not require variable declarations. So, we can use this class to hold any variable, or attribute, we want. Consider:

  plot = coordinate() # Create a new coordinate instance
  plot.x = 10
  plot.y = 5
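
Here is the same idea as a sketch in current Python 3 syntax. The second instance, other, and the use of the built-in vars function are additions of mine, included to show that attributes assigned this way live on the individual instance, not on the class.

```python
class coordinate:
    pass

plot = coordinate()      # create a new coordinate instance
plot.x = 10
plot.y = 5
plot.z = 2               # a third attribute, added without touching the class

other = coordinate()     # a fresh instance starts out with no attributes

print(vars(plot))            # the attributes stored on this instance
print(hasattr(other, "x"))   # other never had x assigned
```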

If you're thinking this seems to be functionally similar to structs in C, you'd be right. Empty classes provide a place to assign attributes. If we want a 3D coordinate, we can just assign a value to plot.z without any additional work to the class definition. Let's see what a more fleshed out version of the coordinate class might look like:

  class coordinate:
    def __init__(self, x, y, z = 0):
      self.x, self.y, self.z = x, y, z

    def translate(self, xoff, yoff, zoff = 0):
      self.x = self.x + xoff
      self.y = self.y + yoff
      self.z = self.z + zoff

    # rotation about the origin (in the x-y plane)
    def rotate(self, angle):
      from math import pi, sin, cos # import only the names we need
      rad = angle * pi / 180 # convert to radians
      # compute both new values from the old x and y before assigning
      self.x, self.y = (self.x * cos(rad) - self.y * sin(rad),
                        self.x * sin(rad) + self.y * cos(rad))

This example introduces some new syntax to us. First, we see right away that the coordinate class defines three methods: __init__, translate, and rotate. The __init__ method, as you may have guessed, is special. This is the constructor for the class -- the method that is called when a new instance of a class is created. The z = 0 that appears in __init__'s parameter list will be familiar to C++ programmers. It denotes that this argument is optional, and if it is not specified, it will default to 0. Also, the first line in the body of the constructor is a tuple assignment. It would more explicitly be written as (self.x, self.y, self.z) = (x, y, z), which will be friendlier to Perl programmers. This does an element-by-element assignment; so self.x becomes x, self.y becomes y, and self.z becomes z.
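
To see default arguments and tuple assignment at work, here is a sketch of the class in current Python 3 syntax, followed by a couple of calls. It is my own condensed variant, not the listing above: it uses math.radians for the degree conversion, and rotate computes both new values from the old x and y in a single tuple assignment.

```python
from math import cos, sin, radians

class coordinate:
    def __init__(self, x, y, z=0):
        # Tuple assignment: each name on the left gets the matching value.
        self.x, self.y, self.z = x, y, z

    def translate(self, xoff, yoff, zoff=0):
        self.x, self.y, self.z = self.x + xoff, self.y + yoff, self.z + zoff

    def rotate(self, angle):
        # Rotation about the origin in the x-y plane, angle in degrees.
        rad = radians(angle)
        self.x, self.y = (self.x * cos(rad) - self.y * sin(rad),
                          self.x * sin(rad) + self.y * cos(rad))

flat = coordinate(1, 0)          # z is omitted, so it defaults to 0
solid = coordinate(1, 0, 5)      # z given explicitly

flat.translate(1, 1)             # zoff is omitted, so z stays where it is
solid.rotate(90)                 # a quarter turn moves (1, 0) to about (0, 1)

print(flat.x, flat.y, flat.z)
print(round(solid.x, 6), round(solid.y, 6), solid.z)
```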

Turn your attention to the first parameter of each of the three methods. The first parameter of any method is an instance object -- the instance from which this method was invoked. The parameter name self is arbitrary. You could call it this if you prefer. However, the name self is a Python convention, and I strongly recommend against using anything else.

Scoping rules in Python will be discussed in detail in part 2 of this tutorial. For now, it will have to suffice to say that in order to reference an attribute of an instance, you must reference it through the instance object, or self. So while in a C++ member function this->attr and attr are the same (assuming attr has not been redeclared in the member function's scope), in Python self.attr and attr are not. Now your first instinct might be to say this is a silly way of doing it. Why not just make a this keyword, like in C++? By passing the instance as a parameter, we more clearly and explicitly couple a method to an instance. Methods are also objects, and can be either bound to instances, or not bound to anything (that is, unbound). Let's see what I mean:

  plot = coordinate(2, 2)
  print coordinate.rotate
  print plot.rotate

Whose output is:

  <unbound method coordinate.rotate>

  <method coordinate.rotate of coordinate instance at 80cabb0>

Here we demonstrate a neat feature of Python objects: they can represent themselves as strings. This is quite useful for debugging purposes. The built-in function repr returns a string containing such information; the print statement uses the closely related str function, which for most objects falls back to the same representation.

So the first line of output tells us we have an unbound method. That is, it isn't associated with any instance of the coordinate class. The second line tells us it's a method of an instance, and so it is bound. The numbers at the end of the string represent the address at which this instance is located. Obviously it's going to be different for you.
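
The practical difference shows up when you call the two. The following sketch uses current Python 3 syntax and a cut-down coordinate class of my own. One caveat I should state plainly: Python 3 no longer has a separate unbound method type, so looking a method up on the class there gives a plain function, but the calling convention is the same.

```python
class coordinate:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def translate(self, xoff, yoff):
        self.x, self.y = self.x + xoff, self.y + yoff

plot = coordinate(2, 2)

# Bound: the method object remembers plot and supplies it as self.
move = plot.translate
move(1, 1)

# Via the class: nothing is bound, so the instance is passed explicitly.
coordinate.translate(plot, 1, 1)

print(plot.x, plot.y)
print(move.__self__ is plot)    # the instance the bound method is tied to
```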

In order to make my previous statement that "classes are objects" more obvious, let's look at this snippet:

  print repr(coordinate)
  fooclass = coordinate
  print repr(fooclass)
  plot = fooclass(2, 2)
  print plot

Which will output:

  <class __main__.coordinate at 80cbda8>
  <class __main__.coordinate at 80cbda8>
  <__main__.coordinate instance at 80bf228>

The first two print statements output the same thing. And this really does make sense, because the variable fooclass is bound to the class object coordinate. When we call fooclass(), we're really invoking coordinate's constructor, and the result is an instance of coordinate.
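
Because a class is just another object, it can also be handed to a function like any other value. A small sketch in current Python 3 syntax follows; the make helper is hypothetical and exists only to illustrate the point.

```python
class coordinate:
    def __init__(self, x, y):
        self.x, self.y = x, y

fooclass = coordinate            # a second name for the same class object
plot = fooclass(2, 2)            # calling it still constructs a coordinate

def make(cls, *args):
    # Hypothetical helper: builds an instance of whatever class it is given.
    return cls(*args)

other = make(coordinate, 3, 4)

print(fooclass is coordinate)
print(isinstance(plot, coordinate))
print(type(other).__name__)
```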

The last piece of the puzzle is the occurrence of __main__ in the output. We'll be getting into scopes and namespaces in part 2 and we'll address this in more detail then. For now, it's easiest to think of __main__ as the top-most, or global scope. It's where all objects that aren't part of another module or class end up.

An Exception to the Rule

Error handling in Python is done using exceptions. When a statement or expression is executed, it may need to signal some sort of error message. It does this by raising an exception. For example, suppose we try to open a file that doesn't exist:

  file = open("/etc/password")

Executing this code will output:

  Traceback (innermost last):
    File "example.py", line 1, in ?
      file = open("/etc/password")
  IOError: [Errno 2] No such file or directory: '/etc/password'

Obviously it's not acceptable to have your program bail out every time it encounters an error as simple as a non-existent file. In Python, you first try the code, and can specify a block of code to be executed upon one, many, or all exceptions:

  try:
    file = open("/etc/password")
  except:
    print "Open failed!"

Now, if the open fails, it will print "Open failed!" and move on. It's entirely possible (and likely) that an expression can generate more than one exception. So in our example, we may just want to handle IOError exceptions. We can do this by specifying the exception type after the except keyword:

  try:
    file = open("/etc/password")
  except IOError:
    print "Open failed!"

But there are different kinds of IOErrors; the file may not exist, or perhaps it is not readable by the user trying to open it. Note that in the traceback shown earlier, the error number for No such file or directory was 2. Exceptions may have arguments associated with them, and we can fetch these arguments like so:

  try:
    file = open("/etc/password")
  except IOError, (errno, message):
    if errno == 2:
      print "File does not exist, cannot open"
    else:
      print "Unhandled error", errno, message
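
For comparison, here is roughly how the same check reads in current Python 3 syntax, where the exception is captured with the as keyword and the error number is an attribute of the exception object. This is only a sketch: the missing_path location is invented so the example fails predictably, and errno.ENOENT is the symbolic name for error number 2.

```python
import errno
import os
import tempfile

# A made-up path inside a directory that is assumed not to exist.
missing_path = os.path.join(tempfile.gettempdir(),
                            "no-such-dir-for-this-demo", "password")

try:
    open(missing_path).close()
    outcome = "Opened"
except IOError as error:
    if error.errno == errno.ENOENT:
        outcome = "File does not exist, cannot open"
    else:
        outcome = "Unhandled error %s %s" % (error.errno, error.strerror)

print(outcome)
```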

The IOError is one of many built-in exceptions. You can create your own custom exceptions, as well. In fact, the exception may be identified by either strings or instance objects. (You can identify an exception by a class object, but Python will just use that class object to create an instance before raising the exception.) A simple example of raising an exception identified by a string is:

  def bailout():
    raise "MyError", (1, 2, "x y z")

  try:
    bailout()
  except "MyError", info:
    print "Handled MyError exception with", info
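
String exceptions like the one above no longer work in current Python 3, which only accepts exceptions that are instances of classes derived from BaseException. A class-based sketch of the same example, with MyError defined as a hypothetical custom exception, might look like this:

```python
class MyError(Exception):
    """Hypothetical custom exception, standing in for the "MyError" string."""

def bailout():
    # The arguments travel with the exception instance.
    raise MyError(1, 2, "x y z")

try:
    bailout()
except MyError as info:
    handled_args = info.args     # the tuple of arguments given to raise
    print("Handled MyError exception with", handled_args)
```

Deriving from Exception also means a plain except Exception clause elsewhere in a program would catch MyError as well.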

More to Come...

Now that wasn't so bad, was it? By now you hopefully have a reasonably good feel for Python's syntax and some of the basics. If you're craving more, Python's home page is the best starting point. If you're an experienced Perl programmer, you're probably by now thinking, "Okay, so what?" Part 1 is intended just as a gentle primer, so you may be left wondering what Python can really do.

In the next part of this series, we'll explore some of the finer nuances of Python. We'll take a look at packages, scope rules, string handling, and class inheritance. We'll also crack open a few of the modules provided in the Python library and work through some examples, including sockets, the XML parser, Perl-compatible regular expressions (rejoice, Perl programmers!), and more. After Part 2 you should have enough under your belt to tackle almost any project in Python.

In Part 3, we're going to see how we can extend Python using the Python/C API. A lot of Python's internals will be covered, including reference counting, creating new Python types, and creating new class objects from C. We'll discover that the XML parser we toyed with in Part 2 doesn't meet our performance requirements, and create a module that uses gnome-xml to parse XML and make some functions available to Python space for accessing data.

Who knows what's in store for Parts 4 and beyond. I may cover creating GNOME applications in Python using an incredibly cool library called libglade. I may also talk about CORBA, and examine a way to create CORBA objects in Python using the lightning fast CORBA ORB ORBit. Please email me your comments and suggestions; these tutorials are meant for you, a member of the Linux development community!

Jason Tackaberry (tack@linux.com) works in Ontario, Canada as a Unix/Network administrator. He is the author of ORBit-Python, Python bindings for ORBit, and several soon to be released projects. Having over 12 years of development experience in C and C++, and hacking with Perl for 4 years, he has turned to Python as his new favorite language.