Originally Published: Monday, 30 July 2001
Author: Matt Michie

Introduction to Programming on Linux

Linux.com editor Matt Michie takes a look at the first choices you need to make when contemplating learning to program on your Linux system.

Thus spake the master programmer:
"After three days without programming, life becomes meaningless."
-- Geoffrey James, "The Tao of Programming"

Introduction

After using Linux for some time, many users find it irresistible to become more active participants in the open source world. A plethora of free software already takes care of a Linux user's daily chores. It is also not uncommon for a "regular" user to download a tarball of C code, modify the makefile, and compile the code into a usable binary.

This familiarity with the tools of development encourages thoughts of doing some development yourself. This article gives a brief overview of how to get started down the path of a Linux hacker.

Once one has made the decision to become a developer, it is easy to be put off by the number of choices to make. Which editor to use? Which language to learn? Object oriented, procedural, or functional?

Paradigms

This brings us to the first fork in the road. Which style of programming should a beginner take up first? Some of the choices are functional, procedural, and object oriented. To become a good programmer, a beginner should eventually get at least some exposure to one language from each paradigm.

The comp.lang.functional FAQ defines functional programming as "a style of programming that emphasizes the evaluation of expressions, rather than execution of commands. The expressions in these language[s] are formed by using functions to combine basic values. A functional language is a language that supports and encourages programming in a functional style."

Some common functional languages available on Linux are: ML, Haskell, and Scheme.
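To make the FAQ's definition a little more concrete, here is a minimal sketch of the functional style. It is written in Python only because Python is compact and easy to run, not because it is one of the functional languages listed above. The names compose, double, and increment and the sample numbers are illustrative assumptions. The point is that the result is built by combining functions into expressions rather than by issuing commands that update variables.

```python
from functools import reduce


def compose(*functions):
    """Combine single-argument functions into one; the rightmost is applied first."""
    return reduce(lambda f, g: lambda x: f(g(x)), functions)


def double(x):
    return x * 2


def increment(x):
    return x + 1


# A new function formed purely by combining existing ones.
double_then_increment = compose(increment, double)

# Expressions evaluated over basic values; no variable is ever reassigned.
results = list(map(double_then_increment, [1, 2, 3]))
total = reduce(lambda a, b: a + b, results, 0)
```

Here results holds 3, 5, and 7, and total is 15. Nothing in the sketch changes state after it has been created.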

Procedural programming is a style that breaks tasks up into "procedures". Think of a flowchart and you'll get a good feel for how a typical procedural program is put together.

Common procedural languages are Pascal and C.
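As a rough illustration of the flowchart idea, the sketch below splits one small task into three procedures that run one after another. It is written in Python for brevity rather than in Pascal or C, and the function names and sample lines are made up for the example.

```python
def parse_salaries(lines):
    """Step 1: pull the number after 'SALARY:' out of each matching line."""
    salaries = []
    for line in lines:
        if "SALARY:" in line:
            salaries.append(float(line.split(":")[1]))
    return salaries


def keep_in_range(salaries, low, high):
    """Step 2: keep only salaries strictly between low and high."""
    kept = []
    for salary in salaries:
        if low < salary < high:
            kept.append(salary)
    return kept


def average(values):
    """Step 3: compute the mean, or None when there is nothing to average."""
    if not values:
        return None
    return sum(values) / len(values)


def main():
    # Hypothetical input; a real program would read this from a file or page.
    lines = ["TITLE: Lab aide", "SALARY: 6.50", "SALARY: 8.00", "SALARY: 45.00"]
    salaries = parse_salaries(lines)
    counted = keep_in_range(salaries, 0, 20)
    print("Average hourly pay: %.2f" % average(counted))


if __name__ == "__main__":
    main()
```

Each box in the imaginary flowchart becomes one procedure, and main simply walks through the boxes in order.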

Object oriented languages are built around the idea that a program should be organized as objects, which bundle data together with the code that works on it. For instance, a programmer wishing to write a highway simulation would start out by defining a vehicle class, which would contain code essential to all vehicles. Later, the programmer can define a truck class that "inherits" code from the vehicle class.

Common object oriented languages on Linux are: C++, Java, Python, Smalltalk and Eiffel.
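The highway example might be sketched as follows in Python. The attribute and method names (speed_kmh, advance, unload) are assumptions made up for the illustration; a real simulation would need far more than this.

```python
class Vehicle:
    """Code essential to all vehicles in the simulation."""

    def __init__(self, name, speed_kmh):
        self.name = name
        self.speed_kmh = speed_kmh
        self.position_km = 0.0

    def advance(self, hours):
        """Move the vehicle along the highway for the given time."""
        self.position_km += self.speed_kmh * hours
        return self.position_km


class Truck(Vehicle):
    """A truck inherits everything from Vehicle and adds cargo handling."""

    def __init__(self, name, speed_kmh, cargo_tonnes):
        super().__init__(name, speed_kmh)
        self.cargo_tonnes = cargo_tonnes

    def unload(self):
        """Remove and return the cargo; only trucks can do this."""
        unloaded = self.cargo_tonnes
        self.cargo_tonnes = 0
        return unloaded


truck = Truck("hauler", 80, 12)
truck.advance(1.5)  # advance() was written once, in Vehicle
```

The Truck class never defines advance itself. It gets that behavior from Vehicle and only adds what is specific to trucks.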

There are also hybrid languages, including PHP and Perl, in which you can program using multiple paradigms.

Even though you want to expose yourself to each paradigm eventually, this can seem like an overwhelming chore. First define a project that you want to complete. You probably have encountered something that you wish you could automate, or make simpler under Linux. Once you have a clear goal, the choice of languages becomes simple. The problem domain that you choose should show you which language makes solving it easier. Learn this language first.

For instance, when I wanted to learn more about databases and web programming, I decided to make a books database containing the books I owned and those I wished to purchase. The tools that were well suited for solving this problem happened to be the PHP language and the MySQL database. After a bit of experimentation I was able to complete the database and put it onto the web: http://daimyo.org/books.

My own personal experience has led me to focus on at least four languages: C, PHP, SQL, and Perl. C is almost a prerequisite for any other programming language. It has become so pervasive that any language designed after C is likely to incorporate some of C's syntax. Therefore, if you learn C early on, you will have an easier time learning other languages.

The downside to learning C is that it is barely considered a "high-level" language. This makes it good for writing kernels and drivers, but not always the easiest for writing more advanced programs. However, there are so many good libraries and tools for developing in C on Linux that you can't go wrong learning it.

PHP, or PHP: Hypertext Preprocessor, is an interpreted language that typically runs on web servers, giving HTML a scripting component. Once you know C, you can pick up the basics of PHP in about a day. PHP gives you the flexibility to do web programming well.

SQL, or Structured Query Language, is the de facto database language. This goes hand in hand with learning PHP, since many of the applications one uses PHP for connect database components to the web. SQL was designed to be used by business people and therefore has a lot of syntactic sugar and English-style grammar. If you can pick up PHP and C, SQL is a snap.
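To give a feel for that English-style grammar, here is a small sketch of a books table. It uses Python's built-in sqlite3 module instead of PHP and MySQL, purely so that it runs without a database server. The table layout and the second book are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE books (title TEXT, author TEXT, owned INTEGER)")
conn.executemany(
    "INSERT INTO books (title, author, owned) VALUES (?, ?, ?)",
    [
        ("The Tao of Programming", "Geoffrey James", 1),
        ("A Book I Want", "Some Author", 0),  # placeholder row
    ],
)

# The query reads almost like an English sentence.
wishlist = conn.execute(
    "SELECT title FROM books WHERE owned = 0 ORDER BY title"
).fetchall()
conn.close()
```

The SELECT statement is the part that carries over to other databases with little change. The connection code around it differs from one language and database to the next.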

Perl rounds out my language toolbox. Originally designed by Larry Wall, Perl has become a "glue" language, borrowing much of its syntax from C and the Unix shell tools. Perl makes it easy to manipulate text and to glue disparate programs together into a whole. While Perl's regular expressions will seem complicated at first, Perl's regular expression syntax is fast becoming standard in other programs that have RE libraries, so you'll be able to carry the knowledge you gain here over to other languages down the road. Perl also has plenty of add-on modules available in the CPAN archive, which make coding easier.
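As one illustration of how Perl-style regular expressions carry over, Python's re module accepts largely the same syntax. The sketch below reuses the tag-stripping pattern from the Perl script later in this article. The salary pattern, the function name, and the sample inputs are assumptions made up for the example.

```python
import re

TAG = re.compile(r"<[^>]*>")  # same pattern as Perl's s/<[^>]*>//g
SALARY = re.compile(r"SALARY:\s*\$?(\d+(?:\.\d+)?)")


def extract_salary(html_line):
    """Strip HTML tags, then return the salary as a float, or None if absent."""
    text = TAG.sub("", html_line)
    match = SALARY.search(text)
    return float(match.group(1)) if match else None
```

Calling extract_salary("<b>SALARY:</b> $7.50/hr") returns 7.5, while a line with no salary returns None.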

The first Perl script I wrote checked my university's job web page for any jobs that paid more than the one I was currently working. Since I was too lazy to keep checking the page by hand, I coded up a quick and dirty script to do it for me. All I had to do after that was put it into cron and await its responses.

#!/usr/bin/perl 
# NMSU Job grabber 
# Matt Michie (mmichie@linux.com)
#----------------------------------------------------------------------
#Copyright (c) 2000, Matt Michie (mmichie@linux.com) (All rights reserved.)
#
#Redistribution and use in source and binary forms, with or without 
#modification, are permitted provided that the following conditions are met:
#
#Redistributions of source code must retain the above copyright notice, 
#this list of conditions and the following disclaimer.
#
#Redistributions in binary form must reproduce the above copyright notice,
#this list of conditions and the following disclaimer in the documentation
#and/or other materials provided with the distribution.
#
#The names of this programs contributors may not be used to endorse or 
#promote products derived from this software without specific prior 
#written permission. 
#
#THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
#``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 
#LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR 
#A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR 
#CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, 
#EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, 
#PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; 
#OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
#WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR 
#OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF 
#ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
#----------------------------------------------------------------------
#Instructions for use:
#
#You must have LWP installed to fetch the jobs.  If you wish to use the
#e-mail notification you also must have sendmail installed.
#
#The script has the following command line flags:
#
#-v :  Print the version
#
#-t (number) :  Max threshold to include the salary in statistics
#               calculations (defaults to 20)
#
#-b (number) :  Boundary of salary after which program will shoot off an 
#               e-mail
#
#-m (email address) : Tell the program to notify you with email.  The argument
#                     is a valid e-mail address. 
#
#-q : Quiet mode, don't print statistics, automatically used in email mode.
#
#Example use:
#
#fetch.pl -b 7.50 -m mmichie@linux.com
#
#This tells the program to fetch the jobs list and only send e-mail 
#notification if there are any jobs with higher pay than $7.50.
#----------------------------------------------------------------------

use LWP::Simple;
use Getopt::Std;

$version = 'Job Grabber 0.01';

getopts('vqt:m:b:') || die "Check your command line args!\n";

if ($opt_t != 0) {
    $max = $opt_t;
}
else {
    $max = 20; # Max threshold to include salary in count
}

$min = 0;      # Min threshold to include salary in count

$highest = 0;  # Highest salary
$total = 0;    # Total jobs counted
$count = 0;    # Total jobs which fall inside min/max thresholds

$oncampus = "http://www.nmsu.edu/~pment/oncampu.htm";
#$offcampus = "http://www.nmsu.edu/~pment/offcampu.htm";

$URL = $oncampus;

if ($opt_v) {
    print "$version\n";
    exit(0);
}
if ($opt_m && !$opt_q) {
    $opt_q = 1;  # e-mail mode implies quiet mode
}

&fetch_page;
&stats;

if ($opt_m && ($highest > $opt_b)) {
    &email;
}
elsif (!$opt_b && $opt_m) {
    $opt_q = 1;
}

sub fetch_page {
    unless (defined ($page = get($URL))) {
	die "There was an error getting URL: $URL\n";
    }

    @page = split(/\n/, $page);

    foreach $line (@page) {
	$line =~ s/<[^>]*>//g;  # strip HTML tags
	if (!$opt_q && $line =~ /On campus job postings as of:/) {
	    print "$line\n";
	}
	elsif ($line =~ /SALARY:/) {
	    push @pay, (split (/:/, $line))[1];
	}
    }
}    

sub stats {
    foreach $elm (@pay) {
	$total++;
	next if ($elm <= $min || $elm >= $max); 
	
	if ($elm > $highest) {
	    $highest = $elm;
	}
	
	$count++; 
	$accum += $elm;
    } 

    if ($count == 0) { 
	die "Eiiiiiiiiiieeeeeeeeeeeeeeeeeeee divide by zero :(\n";
    }
    else {
	$avg = $accum / $count;
    }  

    if (!$opt_q) {
	print "Total jobs listed: $total\n";
	print "Number of jobs counted for pay: $count\n";
	print "Highest hourly pay: \$$highest\n";
	printf "Average hourly pay: \$%.2f\n", $avg;
    }
}

sub email {
    open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq")
	or die "Can't fork for sendmail: $!\n";

print SENDMAIL <<"EOF";
From: Job Grabber <$opt_m>
To: $opt_m <$opt_m>
Subject: Jobby

Total jobs listed: $total
Number of jobs counted for pay: $count
Highest hourly pay: \$$highest
Average hourly pay: \$$avg
EOF

    close(SENDMAIL) or warn "sendmail didn't close nicely";
}

As you can see from the code, I wrote it procedurally, using as much C-style syntax as I could. This increased my development speed, since all I had to do was look in the documentation to see where Perl's basic syntax differed from C's. Notice that in one place I actually used Perl's printf, since I was familiar with C's printf syntax. I was also able to use a couple of the excellent Perl modules, LWP::Simple and Getopt::Std, to grab the web page off the Internet and to process command line arguments.

You'll also notice that instead of writing e-mail code, I simply had the Perl script pipe its data into sendmail. This is a good example of how Perl glues programs together.
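The same gluing idea is not unique to Perl. As a rough sketch, here is how a Python program might hand text to another program's standard input using the standard subprocess module. The helper name pipe_through is an assumption for illustration, and the sketch captures the other program's output instead of sending mail.

```python
import subprocess


def pipe_through(command, text):
    """Feed text to another program's standard input and return its output."""
    completed = subprocess.run(
        command,
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the program exits non-zero
    )
    return completed.stdout
```

For example, pipe_through(["sort"], "pear\napple\n") hands two lines to the standard sort utility and gets them back in sorted order. A mail program could be driven the same way by passing its command line and the message text.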

Eventually, to round out my toolbox, I will probably pick up Python or perhaps Ruby. Both of these languages have received wide acclaim for their clean syntax and object-oriented styles. Python is considered by many to be an ideal choice for a first language, because of its interpreted nature (which makes trying out new things quick) and the way it "forces" a clean style of writing code. If I were starting from scratch, I would probably pick either Python or C as my first language.

Tools

In Linux development, the first argument is usually over one of two tools: vi or Emacs? Both of these text editors will allow you to input code, and much flamage has been written on both sides about which one is superior. My advice is to try each one and see which suits your style of development. I actually use both: for quick scripts I usually use vim, which has syntax highlighting and quick load times, while for longer C projects I tend to favor the integrated kitchen-sink environment of Emacs.

GCC, the GNU Compiler Collection, is one of the best features of GNU/Linux. The compiler is stable, well documented, and used to compile the majority of free software. When you code C, you'll be using gcc. The GNU debugger, gdb, has some nice features, but can be complex for a beginner to learn. Try ddd, a graphical front end to gdb, for your C debugging needs.

Other languages often have their own IDEs, but I recommend you stick to vi or Emacs when beginning so you can gain proficiency in an editor that you can use for multiple tasks and languages.

Conclusion

Linux is a developer's haven. The developers control the environment, and create tools to make their lives easier. Almost every programming language ends up with a compiler or interpreter that runs on Linux. There is extensive documentation and code available for a budding programmer to study. Sometimes the best way to learn is to read other people's code and see how they do business. Linux and open source lead the way in providing this. You can't go wrong with learning to code on Linux.

Resources

http://perl.com
http://python.org
http://www.ruby-lang.org/en/
http://www.cpan.org/
http://www.php.net

Matt Michie exists in the New Mexican desert. Please visit his web site at http://daimyo.org.