Originally Published: Wednesday, 12 September 2001 Author: Konstantin Klyagin

Getting Started with Programming for Linux using GNU Tools

Linux is a rich programming environment, all the more so due to its adoption of the entire GNU tool set. In this article Konstantin Klyagin takes a detailed but quick look at the complete GNU environment for beginning or intermediate programmers on Linux. It's a great way to get up to speed on what everything does in your system.

Introduction

Some people say there is nothing but the GNU way to develop software. While a typical Linux/UNIX developer uses a wide set of tools in his everyday work, the GNU development tools form a complete framework. They are absolutely free and powerful, their source code is there for anyone to read, and they are no worse than the commercial "killer apps" widely used on other platforms. Newcomers to Linux programming may be scared of an unfamiliar development process, but the GNU tools are your friends.

Modelling

I use a modelling tool every time I cannot imagine the complete architecture of a system or a separate module after a quick look at the task description. I run it, sit back and start finding the best way to implement the task ahead. I drag and drop use-case units, draw diagrams, database tables, relations, sequences of actions, package structures and other stuff. Though I prefer Rational Rose myself, there is a good free program named Dia, distributed under the terms of the GNU General Public License, that does much the same thing. It enables you to unleash all of your brain power and express everything you think about the future system's internals using the Unified Modelling Language. It's a simple yet incredibly powerful tool. Having all the diagrams, you can easily recall what the system does, when and why. You can even hand your task to a person who doesn't speak your language, but is familiar with the magic solution - UML.

Writing source code

After the modelling step is done I can get to coding. Here in the GNU world we can find a wide variety of programs for text editing. There are text editors that differ not only in their feature sets but even in their user interface concept. You can use visual editors such as mcedit and joe, command-oriented editors like vi and vim, and even script-driven ones like Emacs, which is extensible with scripts in Lisp. People who have only recently gotten involved in Linux and UNIX programming tend to prefer visual editors to command-based ones. The main thing to notice is that you can choose any kind of editor for your programming and documentation writing. You can even write poems with any of these great pieces of software, in any way you like.

Version control

When a project is pretty big and you have been developing it for a while, releasing new versions, bugfixes and so on, you probably need to track all the changes you make to the source. The GNU development framework has a tool named CVS that can help you.

CVS is a version control system. Every time you want to record modifications it asks you for a comment. Then you can check out an old version of your project from the repository and view the log of changes. It's extremely useful if, say, one of your recent modifications caused a bug.
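
A typical session with a repository looks something like this (a minimal sketch; the module and file names are just placeholders):

cvs checkout hello
cd hello
vi hello.c
cvs commit -m "fixed the greeting message" hello.c
cvs log hello.c
cvs diff -r 1.1 -r 1.2 hello.c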

Also you can develop several branches of a project separately. Let them be, say, "devel" and "stable". Using the CVS branching facility you can create a branch, add a feature or make a major change, and then merge it back into the main line of the source, leaving you free to experiment without the danger of damaging your existing work.
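
With the standard CVS commands, creating a "devel" branch and later merging it back might look like this (a rough sketch; the branch name is just an example):

cvs tag -b devel
cvs update -r devel
# edit and commit on the branch as usual
cvs update -A
cvs update -j devel
cvs commit -m "merged the devel branch"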

The modifications are held in a repository, which can be accessed locally or from a remote workstation over TCP/IP. You can even use SSL for secure communication, though this requires some extra setup: the CVS server software must be run via an SSL wrapper program, and the client must have a patch applied. I wrote the patch for our corporate needs; it was later released to the public and can be downloaded at http://konst.org.ua/cvss/

Let's focus on CVS's ability to work over TCP/IP, which makes the software especially useful. For example, it enables developers to collaborate while being physically located anywhere in the world. Every developer has their own local copy of the source; when a part of the work is ready, he checks in the modifications and optionally comments on them. It's also common practice to make a repository remotely available to the public in read-only mode, so testers can check out the most recent versions.
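
For example, checking out a module anonymously from a remote repository over the pserver protocol looks like this (the host name and repository path below are made up):

cvs -d :pserver:anonymous@cvs.example.org:/home/cvsroot login
cvs -d :pserver:anonymous@cvs.example.org:/home/cvsroot checkout hello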

Compiling

The GNU compilers are things we cannot imagine the world without. You need the GNU C compiler to compile your Linux kernel and various programs, and it is even used to build itself. It's also hard to imagine good cross-platform portability without the GNU compilers. They are very good and easy to use. If, say, your C project contains no more than one file, hello.c, you can build the executable by simply running

gcc hello.c -o hello
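
For reference, hello.c could be as simple as:

#include <stdio.h>

int main(void)
{
        printf("hello, world!\n");
        return 0;
}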

There are also C++, Fortran and even Java compilers from GNU that you can use for development.

Dependencies Tracking

The situation with a project that has only one source file seems quite clear: you just execute the compiler every time you change the code, and you have a new executable. If it were always that simple, programming wouldn't be such an interesting affair and you might not be reading this article. So, say you have a set of source files and headers. Compiling all the source files after every minor modification can be rather annoying. Here the GNU make program comes to help you. The example Makefile below demonstrates how to define dependencies and rules to build this project.

all: hello

hello: hello.o sayhi.o misc.o ui.o
	gcc hello.o sayhi.o misc.o ui.o -o hello

This Makefile uses implicit rules to make *.o from *.c. But what about headers? They don't seem to be in the dependency tracking. You can include them into the Makefile this way:

all: hello

hello: hello.o sayhi.o misc.o ui.o
	gcc hello.o sayhi.o misc.o ui.o -o hello

hello.o: hello.c hello.h
sayhi.o: sayhi.c sayhi.h
misc.o: misc.c misc.h
ui.o: ui.c ui.h

Now it's better, but you'll have to update the dependencies every time you add "#include" statements to your sources. Here a useful option of the GNU C compiler comes to help you.

gcc -MM hello.c

Options such as -I, which specifies additional directories to search for headers, can be used as well. Upon execution it prints the dependency line for hello.o, ready to be included into your Makefile.
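
For the project above, the generated line for hello.o would look something like:

hello.o: hello.c hello.h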

Now the final step. How can we combine common make rules with the dependencies auto-generated by the GNU C compiler?

all: hello

hello: dep hello.o sayhi.o misc.o ui.o
	gcc hello.o sayhi.o misc.o ui.o -o hello

clean:
	rm -f *.o hello .deps

dep:
	echo >.deps
	for i in hello.c sayhi.c misc.c ui.c; do gcc -MM $$i >>.deps; done

-include .deps

The dep target creates a file named .deps that contains all the dependencies between source and header files. I have also added the standard clean target; it's common practice to include one in a Makefile so that all the generated files can be cleaned up easily.

autoconf/automake

Aside from the previously described methods, there is a way to pay less attention to Makefiles and build rules and concentrate on the code you write. This is possible with the magic suite named autoconf/automake. It's really an amazing beast that does all the dirty work for you. Apart from taking care of Makefiles with all the dependencies and other stuff for your projects, it has a mechanism to detect system-specific parameters before the compilation and build steps are performed. Start by writing a Makefile.am to define what exactly you want to build.

bin_PROGRAMS = hello
hello_SOURCES = hello.c sayhi.c misc.c ui.c
AUTOMAKE_OPTIONS = foreign

The last line tells automake it's not a GNU package, i.e. it does not contain the standard files named NEWS, README, AUTHORS and ChangeLog that are required if you want your package to be GNU compliant.

Once you have Makefile.am in the project directory, there is another input file for the suite, named configure.in. It's responsible for the system-specific parameter checks I mentioned before. A minimal configure.in is shown below.

AC_INIT(hello.c)
# Initializes the configure script. On start it will first check
# for the main source file specified here.

AM_INIT_AUTOMAKE(hello, 1.0)
# Tells automake we have a project named "hello", version 1.0.

AC_PROG_CC
# Adds a check for a C compiler.

AC_OUTPUT(Makefile)
# The output file is Makefile. All the build rules will be put there.

Now you are done with the autoconf/automake input specification. Run the following programs in the order given.

aclocal
autoconf
automake -a -c

This will create the configure script and Makefile.in, and add some default documentation files to your project. Now it's finally ready to compile, debug and even distribute.

Everyone who wants to build your program on his or her Linux computer needs to run:

./configure
make

We run ./configure so that Makefile is created from Makefile.in. This is how autoconf/automake is organized: automake generates Makefile.in from your Makefile.am, and then Makefile.in is processed by the ./configure script so that all the system-specific settings are taken into account and included in the final Makefile. The final Makefile will also have default install and uninstall targets, which are extremely useful for your program's users.
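
So a typical end user session finishes with (as root, if the program is installed system-wide):

make install

Later the program can be removed with "make uninstall".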

Debugging

Not a difficult thing to do in Linux. Let me introduce the GNU Debugger, or gdb. It allows you to look inside your programs and see what's going on there. You can watch the code execute, inspect variables and the stack, set breakpoints, etc. gdb supports the C, C++, Modula-2, Chill, Pascal and Fortran languages.

Programmers like me who came to Linux from desktop OSes such as MS-DOS, OS/2 or Windows may find the gdb command line interface less than friendly. They often prefer visual debugging, but the gdb approach to user interface does have its advantages. For example, you can keep a log of a debugging session, or debug a program remotely in a telnet session over a slow link. In any case, there is a whole bunch of text-mode and graphical front-ends to gdb, so everybody can use whatever he or she likes. For free.

But let's have a quick look at what gdb can do by itself, starting with running our program. A small hint: in order to debug a program, it should be compiled with debug info and without optimization. Usually the latter isn't paid much attention, but it can interfere with debugging a lot: it sometimes happens that you cannot see some variables, or even whole lines of code, in the debugger because of optimization. So optimization is your enemy if you want to debug comfortably.

To turn off optimization and tell the compiler to put debug info into the binary, add the following line to your Makefile.am.

CFLAGS = -g3 -O0

Those parameters, passed to the compiler, enable maximum debug info and turn off optimization respectively. (For a C++ project, set CXXFLAGS instead.)

To run a program in debugger issue the following command.

gdb ./hello

To watch its execution line by line, use the "next" command, which abbreviates to "n". Note that first you have to set a breakpoint at the first line of the main() function and run the program.

(gdb) break main
Breakpoint 1 at 0x8048462: file hello.c, line 9.
(gdb) run
Starting program: /home/konst/cuj/./hello

Breakpoint 1, main () at hello.c:9
9               printf("hello, world!\n");
(gdb)

This was probably the simplest debugging session you will ever see. Nevertheless, gdb has a lot of facilities, such as attaching to an already running program, watching variables, conditional breakpoints, examining the stack, tracepoints and much more. It can also inspect "core" files, which programs leave behind when they crash, to find out what caused the crash.
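
For instance, a few of these facilities look like this in action (the variable, file and process names here are made up for illustration):

(gdb) print counter
(gdb) break sayhi.c:12 if counter == 10
(gdb) backtrace
(gdb) attach 1234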

Profiling

Linux programmers usually use the GNU profiler (gprof) to find out where a program spends its time. It lets you learn which functions were called and how long each of them takes to execute, so you can track down routines that work slower than you expected. Whereas gdb lets you watch the program execution as it goes, gprof provides you with overall execution statistics.

GNU profiling is easy. In order to profile a program you compile and link it with the -pg parameter. Then you simply execute it, and you get a gmon.out file containing all the collected statistics. That file can be analyzed with the gprof program, which prints a complete call graph with appropriate explanations.
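
For our hello project the whole procedure boils down to three steps:

gcc -pg hello.c sayhi.c misc.c ui.c -o hello
./hello
gprof ./hello gmon.out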

Distributing

Say, you have a complete program you are going to distribute - ship it to your customers or release it under the GPL and spread the word. There are various formats Linux programs are distributed in.

The most widely used one is a source archive, usually a gzipped tar file (package-version.tar.gz, for example). If you use the autoconf/automake suite, generating such an archive is not a problem at all: the suite creates a "dist" Makefile target which generates a package ready for distribution. The resulting tar.gz includes the configure script and all the autoconf/automake stuff the end user needs to build and install the program on their system.
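
For the hello project from the autoconf/automake section above, a single command produces a ready-to-ship hello-1.0.tar.gz:

make dist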

The tar.gz format is distribution independent. Since it usually contains the program source, the program has to be compiled every time you want to install it on a new computer. But that's not always necessary. There are two formats I want to tell you about, aimed at popular Linux distributions, that allow you to distribute just binaries.

My favorite distribution, RedHat Linux, as well as some others, uses the RPM format for packaging. There is also a program named rpm (the RedHat package manager) that enables you to generate such packages for your own programs. All you need in order to make an RPM distribution is to write a so-called specification file and feed rpm with it. The spec file includes a brief description, a dependency list, an application group, a changelog, and a list of files to be installed. After you run the manager you have two packages: a source RPM and a binary RPM. Using rpm, a user can query its database for the list of files that belong to a package, remove the whole package, see what packages depend on it, etc.
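
A minimal spec file for our hello program might look roughly like the sketch below. The section names are standard, but all the values are of course project-specific, and the file list assumes the default /usr/local prefix:

Summary: a program that greets the world
Name: hello
Version: 1.0
Release: 1
Copyright: GPL
Group: Applications/Misc
Source: hello-1.0.tar.gz

%description
Prints a friendly greeting.

%prep
%setup -q

%build
./configure
make

%install
make install

%files
/usr/local/bin/hello

Feeding this file to the package manager with "rpm -ba hello.spec" produces both the source and the binary packages mentioned above.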

The Debian package format is almost the same as RPM and almost as popular. To make a package you populate the debian/ directory in your project root with several specification files and run the package generation tool called dpkg-buildpackage.

Both the Debian and the RedHat package formats allow the maintainer to add scripts that are executed before and after installation, and before and after removal of the package files. Since both can generate source packages, binary packages can easily be rebuilt from them with either the RPM or the Debian tools.

Integrated Development Environments

What if you prefer complete environments to building your development process from separate brick-like tools? Linux has such software too. Usually an integrated development environment has an internal source editor module and front-ends for GNU tools such as gdb, autoconf/automake and, optionally, cvs, rpm and others. There are both GUI and console text-mode IDEs.

The most famous of the X-based IDEs for Linux is KDevelop. Being a part of the KDE project, it's widely used for developing KDE applications. It has everything a KDE programmer dreams about, including a reference for the Qt library, which KDE uses for its GUI.

People who prefer the console may remember RHIDE, a text-mode programming IDE available for Linux. It had been ported to Linux from DOS, where it was a part of the DJGPP tools. I liked the idea of such an IDE, but it looked too much like a DOS program: while it had an internal editor and a gdb front-end, it lacked vital support for cvs, automake and the like. I tried to use it, but wasn't satisfied. Thus, I decided to write motor. Now it's a complete IDE that includes a powerful editor with configurable syntax highlighting. Every kind of project is organized with templates, so there is potentially no problem adding support for almost any programming language. It includes support for gdb, cvs, automake/autoconf and RPM, and templates for C, C++ and Java programs. I wrote motor for my own development needs, so I tried hard to make it meet the needs of a typical Linux programmer. I'm sure you are going to find it useful, and it will be a great pleasure for me to get feedback.

Obtaining further info

Unfortunately the short format of this article doesn't allow me to tell you about every tool in detail. If you want to learn more about cvs, gcc, automake, autoconf, make, gdb or gprof, you can access their documentation with the "info" command. The info pages, being a part of the GNU development framework, provide you with lots of useful information about the tools you can use for fun and effective software development.
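
For example, to read the full manuals for the compiler, make and the debugger:

info gcc
info make
info gdb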

Konstantin "konst" Klyagin, lives in Kharkov, Ukraine. He is a 4th year student of the Kharkov State Polytechnical University, going to get his BS in System Analysis this year. Works as a programmer for the "Creative Data Desicions" company. Has got about 10 years overall programming expirience. Personal interests relate to computers, networking, programming, IT, Linux, digital innovations and also art, painting, history, politics, heavy music, girls, and having fun. Can be contacted at konst@konst.org.ua.

Konst's personal site URL is http://konst.org.ua/.