oai4courts layer two: elements versus attributes

Elements vs. attributes

How do we choose between 

<element myattrib="foo"/>

and 

<element>foo</element> 

There are arguments to be made both ways -- a really good summary of them is here.  A vastly simpler approach, though, is to use attributes only for values drawn from a controlled vocabulary, that is, a fixed and known list of permitted values.  This is a particularly useful rule when you don't know a lot about the data being marked up, and need to (at least initially) take a bottom-up, data-driven approach to the design.
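
When a controlled vocabulary really does exist, it can be written down in the schema itself, which is part of what makes an attribute a natural fit.  Here is a minimal sketch using a DTD internal subset.  It reuses the element and myattrib names from the snippets above; the second permitted value, bar, is invented purely for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE element [
  <!-- "element" carries no content, only the attribute -->
  <!ELEMENT element EMPTY>
  <!-- myattrib may take only the listed values; "bar" is a made-up second value -->
  <!ATTLIST element myattrib (foo | bar) #REQUIRED>
]>
<element myattrib="foo"/>
```

A validating parser would reject myattrib="baz" here.  That is only helpful if the list of values is actually known ahead of time.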

Take, for example, the problem of representing the roles assumed by different parties in a case (e.g., "third-party defendant-appellant").  It would stand to reason that there is a limited universe of possible roles.  And that may well be true for a particular type of litigation in a particular court, leading one initially to the idea of using an attribute:

<party role="appellant">Mary Smith</party>, Appellant  

But the world is a very big place, with many courts and many types of litigation and lots of possible, unguessable names for the roles that the parties play in it.  So a better approach might be

<party><name>Mary Smith</name>,<role>Appellant</role></party> 

Over time, it may turn out that certain roles carry different names in different courts or in different types of litigation, yet are equivalent.  That might well lead to the idea of using an attribute to represent the "role class": the names stay open-ended, but the classes they fall into form a controlled vocabulary.
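
As a rough sketch of where that could end up, the court's own wording stays in the element content while an attribute carries the normalized class.  The attribute name class and the value appellant are hypothetical, chosen only to illustrate the idea; they are not part of any existing schema.

```xml
<party>
  <name>Mary Smith</name>,
  <!-- element content: the role exactly as this court names it -->
  <!-- class attribute (hypothetical): the equivalent role from a small, fixed list -->
  <role class="appellant">Third-Party Defendant-Appellant</role>
</party>
```

The free-text role can then vary from court to court without breaking anything, and software that needs to group equivalent roles can look at the attribute alone.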