Originally Published: Monday, 13 August 2001 Author: Nile Geisinger and the BlueBox Team
Published to: develop_articles/Development Articles Page: 2/3

An Introduction to Word Oriented Programming with BlueBox

Exciting new open source projects are growing every day. Here at Linux.com we're delighted to bring them to your attention, especially projects like BlueBox, a program that introduces the concept of a linkable programming unit. These structures, known as words, can be published on the Internet and linked together to create richer software. Words have the same potential for growth as the Web since anyone can extend software published on the Internet simply by creating new words that link to it. The BlueBox team wrote this week's feature article just for Linux.com readers, so read on and get involved!

Chemistry Program  << Page 2 of 3  >>

Writing a Chemistry Program

With this basic introduction to words in hand, let's write a simple chemistry library in words that checks equations of the form "2H2 + O2 -> 2H2O" and reports whether they are balanced. In chemistry, a balanced equation has the same number of atoms of each element on both sides. The finished program can then be used as a sanity check for chemical processes that involve a large number of reactions.
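As a concrete illustration of the balancing rule, the following minimal Python sketch tallies the atoms of the example equation by hand. The data layout (a coefficient paired with a dict of subscripts) and the name atom_totals are assumptions made for this illustration and are not part of BlueBox.

```python
from collections import Counter

def atom_totals(side):
    """Sum coefficient * subscript for every atom on one side of an equation."""
    totals = Counter()
    for coefficient, atoms in side:
        for atom, subscript in atoms.items():
            totals[atom] += coefficient * subscript
    return totals

# 2H2 + O2 -> 2H2O, written out by hand (hypothetical layout).
left = [(2, {"H": 2}), (1, {"O": 2})]
right = [(2, {"H": 2, "O": 1})]

# Both sides hold four hydrogen atoms and two oxygen atoms.
print(atom_totals(left) == atom_totals(right))
```

The equation is balanced exactly when the two tallies are equal for every element.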

All of the examples in this tutorial are available under the GPL and can be found here.

The first step in writing a word-oriented program is to identify all of the elements in the problem domain and assign symbols to them. The next is to use the matching structure in words to create relationships between words that define how those symbols can be grouped together and what those relationships mean.

This is similar to how we solve problems in everyday life. On encountering a new set of data, we invent symbols to represent entities and create relationships between those symbols to reflect the relationships in the world. In the chemical equation "2H2 + O2 -> 2H2O," for example, the fact that two hydrogen molecules are being added to an oxygen molecule to form two water molecules is explicitly represented in the syntax. This explicit representation of relationships makes it easier to understand and maintain code.

There are seven basic entities in a chemical reaction: numbers, subscripts, atoms, molecules, the production symbol, addition, and the reaction itself. Let's define symbols for each of these words:

  • Numbers: In chemical reactions, the number preceding a molecule specifies how many of that molecule take part in the equation. 2H2O, for example, specifies that there are two H2O molecules. From our earlier introduction to regular expressions, we know that we can match numbers with the symbol "[0-9]*".
  • Subscripts: Subscripts specify the number of atoms of a certain type that are in a molecule. In H2O, for example, the subscript "2" after H specifies that there are two hydrogen atoms in water. Since we are restricting ourselves to ASCII in this example, the subscript symbol is identical to the Number symbol.
  • Atoms: Atoms are the basic units of matter that make up molecules; their best-known representation is the periodic table. Atoms are written as either a single capital letter (e.g., "H" for hydrogen) or a capital letter followed by a lowercase letter (e.g., "Cl" for chlorine). The regular expression to match one capital letter optionally followed by one lowercase letter is "[A-Z][a-z]{0,1}".
  • Molecules: Molecules are made out of sets of atoms. H2O, NaCl, and CH4 are all molecules. In reaction equations, molecules come in groups like 4CH4 and 2H2O. Our symbol for molecules must express one or more Atom words grouped together, each with its subscript, preceded by a number. The abstract regular expression for this is "{#Number}({#Atom}{#Subscript})+".
  • Production: The production symbol in a chemical equation separates the reactants from the products. Normally, it takes the form of an arrow. We'll use a dash followed by a greater-than sign and match the simple symbol "->".
  • Addition: The addition symbol is used to represent the process of adding two chemicals together. We'll use the traditional "+" symbol to represent addition. Since the "+" symbol is a reserved character in regular expressions, it needs to be escaped with a backslash. The resulting symbol for addition is "\+".
  • Reaction: The chemical reaction itself expresses the transformation of one group of molecules into another group. With our knowledge of abstract regular expressions, we can represent this as "\s*{#Molecule}(\s*{#Addition}{#Molecule})*\s*{#Production} \s*{#Molecule}(\s*{#Addition}{#Molecule})*"
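To see how these symbols compose, here is a rough sketch using Python's standard re module. It assumes that a reference such as {#Molecule} behaves like textual substitution of the referenced pattern, which is only an approximation of how BlueBox resolves words. The constant names and the helper is_reaction are hypothetical, "?" is used as the equivalent of "{0,1}", and an extra "\s*" is added after the addition symbol so that plain re accepts the space in "+ O2".

```python
import re

# Plain-regex stand-ins for the seven symbols defined above.
NUMBER = r"[0-9]*"
SUBSCRIPT = r"[0-9]*"
ATOM = r"[A-Z][a-z]?"
MOLECULE = rf"{NUMBER}(?:{ATOM}{SUBSCRIPT})+"
ADDITION = r"\+"
PRODUCTION = r"->"

# One side of the equation: a molecule followed by any number of "+ molecule".
SIDE = rf"\s*{MOLECULE}(?:\s*{ADDITION}\s*{MOLECULE})*"
REACTION = re.compile(rf"{SIDE}\s*{PRODUCTION}{SIDE}\s*")

def is_reaction(text):
    """Return True if text has the shape of a reaction (balanced or not)."""
    return REACTION.fullmatch(text) is not None

print(is_reaction("2H2 + O2 -> 2H2O"))
print(is_reaction("2H2 + -> 2H2O"))
```

This only checks the syntax of an equation. Deciding whether it is balanced is the job of the methods attached to the Reaction word.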

With these definitions in hand, we are now ready to start writing the program. Here is the number word.

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Number">
   <word:match matcher="absregex" expression="[0-9]*"></word:match>
   </word:symbol>
   <word:includes>

        <word:include>string</word:include>
   </word:includes>
   <word:definition>

        <word:method name="finishedParsing" >
       <word:access></word:access>
       <word:code href="Python" >
       self.count = 1
       if (len(self.fMatch) != 0):
         self.count = string.atoi(self.fMatch)
       </word:code>
       <word:return>
           <word:variable></word:variable>
       </word:return>
       </word:method>

      <word:method name="getNumber" >
      <word:access></word:access>
      <word:code href="Python" > </word:code>
      <word:return>
      <word:variable name="self.count"></word:variable>
      </word:return>
      </word:method>
   </word:definition>
   </word:word>

Note that it matches any numeric string and then converts the string to a number; an empty match leaves the count at its default of 1. The Subscript word is similar. Although they match the same symbol in ASCII, Number and Subscript do not clash because they are called in different contexts, as can be seen in the Molecule word below.
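The conversion performed in finishedParsing can be sketched in standalone Python. The function name parse_count is hypothetical, and int() stands in for string.atoi, which Python 3 no longer provides. Because the symbol "[0-9]*" can also match an empty string, the sketch falls back to 1 in that case.

```python
def parse_count(match_text):
    """Return the integer value of a matched digit string, or 1 if it is empty."""
    return int(match_text) if match_text else 1

print(parse_count("2"))   # coefficient of 2H2O
print(parse_count(""))    # no coefficient written, so one molecule
```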

The atom word matches an element symbol such as "S" or "Cl". It has properties that represent how many electrons, neutrons, and protons an atom has.

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Atom">
       <word:match matcher="absregex"
   expression="[A-Z][a-z]{0,1}"></word:match>
   </word:symbol>
   <word:definition>
     <word:method name="start" >
      <word:access></word:access>
      <word:code href="Python" >
      self.numberOfElectrons = 0
      self.numberOfProtons = 0
      self.numberOfNeutrons = 0
      </word:code>
      <word:return>
      <word:variable></word:variable>
      </word:return>
      </word:method>
   </word:definition>
   </word:word>

Here's our definition for molecule. It matches any number of atoms that are combined together like H2O or CH4.

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Molecule">
       <word:match matcher="absregex"
   expression="{#Number}({#Atom}{#Subscript})+"/>
   </word:symbol>
   <word:definition>
   </word:definition>
   </word:word>

The Addition and Production words are very simple. They match "\+" and "->" respectively.

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Addition">

       <word:match matcher="absregex" expression="\+"/>
   </word:symbol>
   <word:definition>
   </word:definition>
   </word:word>

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Production">

       <word:match matcher="absregex" expression="->"/>
   </word:symbol>
   <word:definition>
   </word:definition>
   </word:word>

Finally, here's the chemical reaction word, which matches input of the form 2H2 + O2 -> 2H2O. The methods in the reaction check whether the reaction is balanced.

   <?xml version="1.0"?>
   <word:word xmlns:word="http://www.dloo.com">
   <!-- Specify the Match to match against-->
   <word:symbol name="Reaction">
       <word:match matcher="absregex"
   expression="\s*{#Molecule}(\s*{#Addition}{#Molecule})*\s*{#Production} \s*{#Molecule}(\s*{#Addition}{#Molecule})*"/>
   </word:symbol>
   <word:definition>
       <word:method name="start" >
           <word:access></word:access>
           <word:code href="Python" >
           self.leftSide = self.getFirstWord()
           self.rightSide = None
           self.balanced = 0
           </word:code>
           <word:return>
               <word:variable name="self.balanced"></word:variable>
           </word:return>
        </word:method>

       <word:method name="parsing" >
           <word:access></word:access>
           <word:code href="Python" >
           currentWord = self.leftSide
           while (currentWord is not None):
               if (currentWord.getName() == "Production"):
                   # Everything after the production symbol is the right side.
                   self.productionWord = currentWord
                   self.rightSide = currentWord.getNextWord()
               currentWord = currentWord.getNextWord()
           self.checkEquation()

           </word:code>
           <word:return>
               <word:variable name="self.balanced"></word:variable>
           </word:return>
        </word:method>

       <word:method name="checkEquation" >
           <word:access></word:access>
           <word:code href="Python" >
           currentWord = self.leftSide
           alreadyParsed = []
           self.balanced = 1
           while (currentWord != self.productionWord):
               # For each molecule on the left side, iterate through its atoms.
               if (currentWord.getName() == "Molecule"):
                   currentPart = currentWord.getFirstWord()
                   while (currentPart is not None):
                       if (currentPart.getName() == "Atom"):
                           # Get its match and, if this element has not been
                           # checked yet, compare its count on both sides.
                           atomString = currentPart.getMatch()
                           if atomString not in alreadyParsed:
                               alreadyParsed.append(atomString)
                               leftAtomCount = self.countAtoms(
                                   atomString, self.leftSide, self.productionWord)
                               rightAtomCount = self.countAtoms(
                                   atomString, self.rightSide, None)
                               if (leftAtomCount != rightAtomCount):
                                   self.balanced = 0
                       # Advance inside the loop so that it terminates.
                       currentPart = currentPart.getNextWord()
               currentWord = currentWord.getNextWord()
           </word:code>
           <word:return>
               <word:variable name="self.balanced"></word:variable>
           </word:return>
       </word:method>
       <word:method name="countAtoms" >
           <word:variables>
               <word:variable name="atomString" type="string"/>
               <word:variable name="currentWord" type="word"/>
               <word:variable name="termination" type="word"/>
           </word:variables>
           <word:access></word:access>
           <word:code href="Python" >
           atomCount = 0
           while (currentWord != termination):
               # Only molecules contribute atoms; other words are skipped.
               if (currentWord.getName() == "Molecule"):
                   moleculeNumber = 1
                   currentPart = currentWord.getFirstWord()
                   while (currentPart is not None):
                       if (currentPart.getName() == "Number"):
                           # Leading coefficient, e.g. the 2 in 2H2O.
                           moleculeNumber = currentPart.getNumber()
                       if (currentPart.getName() == "Atom"):
                           if (currentPart.getMatch() == atomString):
                               # Use this atom's own subscript (default 1),
                               # not the subscript of a neighboring atom.
                               subscriptNumber = 1
                               subscriptWord = currentPart.getNextWord()
                               if (subscriptWord is not None):
                                   if (subscriptWord.getName() == "Subscript"):
                                       subscriptNumber = subscriptWord.getNumber()
                               atomCount += subscriptNumber * moleculeNumber
                       currentPart = currentPart.getNextWord()
               currentWord = currentWord.getNextWord()

           </word:code>
           <word:return>
               <word:variable name="atomCount"></word:variable>
           </word:return>
       </word:method>
       <word:method name="isBalanced" >
           <word:access></word:access>
           <word:code href="Python" >

           </word:code>
           <word:return>
               <word:variable name="self.balanced"></word:variable>
           </word:return>
       </word:method>
   </word:definition>
   </word:word>

The reaction word in the document is instantiated as the chemical reaction is matched. If "2H2 + O2 -> 2H2O" is entered, the program will run the document, matching all the words and checking whether the equation is balanced. If it is, the balanced attribute of the reaction word is set to 1.
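For readers who want to experiment with the balancing logic outside BlueBox, the following is a rough standalone approximation in Python. It does not use the word API at all: the regular expressions, the names count_atoms and is_balanced, and the error handling are assumptions made for this sketch.

```python
import re
from collections import Counter

# A molecule: optional coefficient followed by one or more subscripted atoms.
MOLECULE = re.compile(r"\s*([0-9]*)((?:[A-Z][a-z]?[0-9]*)+)\s*")
# One atom with its optional subscript.
ATOM = re.compile(r"([A-Z][a-z]?)([0-9]*)")

def count_atoms(side_text):
    """Tally every atom on one side of an equation such as '2H2 + O2'."""
    totals = Counter()
    for term in side_text.split("+"):
        m = MOLECULE.fullmatch(term)
        if m is None:
            raise ValueError(f"not a molecule: {term!r}")
        coefficient = int(m.group(1)) if m.group(1) else 1
        for atom, subscript in ATOM.findall(m.group(2)):
            totals[atom] += coefficient * (int(subscript) if subscript else 1)
    return totals

def is_balanced(equation):
    """Return True if both sides of 'left -> right' hold the same atoms."""
    left, arrow, right = equation.partition("->")
    if not arrow:
        raise ValueError("missing '->'")
    return count_atoms(left) == count_atoms(right)

print(is_balanced("2H2 + O2 -> 2H2O"))
print(is_balanced("H2 + O2 -> H2O"))
```

Unlike the Reaction word, which counts one element at a time, this sketch tallies all elements in a single pass and compares the two tallies.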




