Some 10 years ago I was working on a software model derived from the GENTECH Genealogical Data Model. I set up the GDMUML (Genealogical Data Models in the Unified Modeling Language) website to publish that work.
I have recently spruced up GDMUML's existing web pages and have begun adding new content. The first addition is GEDCOM-UML, a UML model for the contents of a GEDCOM file. It includes elements from both versions 5.5 and 5.5.1, so that the differences can be seen more clearly. The model attempts to reflect the specifications accurately, rather than current usage by genealogy applications.
The model is arbitrarily pegged at version 0.5 because there is more work to be done. For example, all class members were initially typed as String; I'm now adding the explicit field lengths by changing the type to char array. Each member is also getting a "tag context" entry, which refers back to the GEDCOM tag sequence that contributes the member's value. Documentation of the classes and class members remains to be done.
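As a rough illustration of what a length-limited member with a tag context could look like in code, here is a minimal Python sketch. The class name `Member`, the dotted `INDI.NAME` notation for the tag sequence, and the 120-character limit are assumptions made for this example; they are not taken from the published model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical stand-in for one class member in the UML model."""
    name: str          # member name in the model (illustrative)
    max_length: int    # explicit field length, replacing a plain String
    tag_context: str   # GEDCOM tag sequence that supplies the value

    def validate(self, value: str) -> bool:
        # A value conforms if it is non-empty and fits the declared length.
        return 1 <= len(value) <= self.max_length


# Illustrative values only: a personal name reached via INDI > NAME,
# assumed here to be limited to 120 characters.
name_personal = Member(name="namePersonal", max_length=120, tag_context="INDI.NAME")

print(name_personal.validate("John /Smith/"))
print(name_personal.validate("x" * 121))
```

Keeping the length and the tag context next to the member means a conformance check can report both which limit was exceeded and which tag sequence in the file produced the offending value.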
There are several motivations for doing this. Since GEDCOM is the de facto standard for genealogical data interchange, we are probably going to have to live with it for some time to come. A clear blueprint of its current design should help with the design of future extensions. For existing applications that support GEDCOM import/export, a reference design should help in validating conformance. And last, but not least, a normalized UML model can be exported as a database schema: GEDCOM-DDL.