Tuesday, January 28, 2014

Announcing: GEDCOM Plugin for Notepad++

[Screenshot: GEDCOM file in Notepad++ with GedcomLexer plugin]
I have been looking at raw GEDCOM files quite a lot lately. I usually reach for the Notepad++ text editor for this because it is good at recognizing file encodings. Notepad++ is popular because it is open-source, general-purpose, robust, and lightweight. Many plugins have been developed to extend it to more specialized uses (especially for programmers). That prompted me to cook up a GEDCOM plugin that provides syntax highlighting and folding.
Technically, this is a lexer plugin (GedcomLexer.dll) for GEDCOM files. A lexer is a program that performs lexical analysis, that is, it splits text into typed tokens. This lexer follows the data representation grammar of the GEDCOM specification, version 5.5.1. It recognizes the possible tokens in a line: level, xref_id, tag, user tag, pointer, value, and escape. Each of these tokens has a default style supplied by the plugin, which can be customized through Notepad++'s Style Configurator. When the lexer detects an invalid character in a token, it enters the Invalid state and outputs the remainder of the line in the Invalid style (red by default).
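To make the token types concrete, here is a minimal Python sketch of how one GEDCOM line might be split into the tokens listed above. It is not the plugin's code, and the names tokenize_line, LINE_RE, POINTER_RE, and ESCAPE_RE are hypothetical. The patterns are a simplified reading of the 5.5.1 line grammar, and unlike the plugin, which styles only the remainder of a line as Invalid, this sketch marks a whole non-matching line as invalid.

```python
import re

# Simplified pattern for one line: level [xref_id] tag [rest-of-line].
# Assumption: fields are separated by a single space.
LINE_RE = re.compile(
    r"^(?P<level>0|[1-9][0-9]?)"              # 0-99, no leading zero
    r"(?: (?P<xref_id>@[A-Za-z0-9][^@]*@))?"  # optional @XREF@ identifier
    r" (?P<tag>[A-Za-z0-9_]+)"                # tag; user tags start with "_"
    r"(?: (?P<rest>.*))?$"                    # optional line value
)
POINTER_RE = re.compile(r"^@[A-Za-z0-9][^@]*@$")  # whole value is @XREF@
ESCAPE_RE = re.compile(r"^@#[^@]*@")              # leading @#...@ escape


def tokenize_line(line):
    """Return a list of (token_type, text) pairs for one GEDCOM line."""
    text = line.rstrip("\r\n")
    m = LINE_RE.match(text)
    if m is None:
        # Simplification: the whole line is reported as invalid.
        return [("invalid", text)]

    tokens = [("level", m.group("level"))]
    if m.group("xref_id"):
        tokens.append(("xref_id", m.group("xref_id")))

    tag = m.group("tag")
    tokens.append(("user_tag" if tag.startswith("_") else "tag", tag))

    rest = m.group("rest")
    if rest:
        if POINTER_RE.match(rest):
            tokens.append(("pointer", rest))
        else:
            esc = ESCAPE_RE.match(rest)
            if esc:
                tokens.append(("escape", esc.group(0)))
                rest = rest[esc.end():].lstrip(" ")
            if rest:
                tokens.append(("value", rest))
    return tokens
```

For example, tokenize_line("1 FAMS @F1@") yields a level, a tag, and a pointer token, while a line such as "X NAME" is reported as invalid.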
In the current release (0.1.0), folding (hiding detail text) is based on the line level. In GEDCOM files, logical records begin at line level 0. Subordinate lines with levels 1 or higher belong to the logical record started by the preceding level 0 line. Folding therefore lets a user see only the level 0 lines (the starts of logical records), or the level 0 lines plus selected additional levels, giving some control over the amount of detail displayed.
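The level-based folding described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's implementation. The function names fold_ranges and visible_lines are hypothetical, and the code assumes every line is well formed and starts with a numeric level followed by a space.

```python
def line_level(line):
    """Return the numeric level at the start of a well-formed GEDCOM line."""
    return int(line.split(" ", 1)[0])


def fold_ranges(lines):
    """Map each foldable line index to the index of the last line it hides.

    A line is a fold point when it is followed by at least one line with a
    higher level; the fold ends just before the next line whose level is
    less than or equal to its own.
    """
    levels = [line_level(line) for line in lines]
    ranges = {}
    for i, level in enumerate(levels):
        end = i
        for j in range(i + 1, len(levels)):
            if levels[j] <= level:
                break
            end = j
        if end > i:
            ranges[i] = end
    return ranges


def visible_lines(lines, max_level):
    """Return the lines still shown when everything deeper than max_level is folded."""
    return [line for line in lines if line_level(line) <= max_level]


sample = [
    "0 @I1@ INDI",
    "1 NAME John /Smith/",
    "2 GIVN John",
    "1 SEX M",
    "0 @F1@ FAM",
    "1 HUSB @I1@",
    "0 TRLR",
]
```

With this sample, visible_lines(sample, 0) keeps only the three level 0 lines, and fold_ranges(sample) reports that the INDI record at index 0 folds down to index 3.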
The plugin has been tested with a variety of GEDCOM files (*.ged) in several encodings, including UTF-8, UTF-16, ANSI, and ASCII. Release 0.1.0 does not support the ANSEL character set.
This is released as an open-source project on SourceForge: GEDCOM Lexer Plugin for Notepad++. A ZIP file of the plugin and install instructions can be downloaded here.

2 comments:

  1. Thanks very much for this. I use Notepad++ regularly to edit my GEDCOM files, as I have an online CSV-to-GEDCOM converter that I am tweaking and modifying for different people's use. It makes the analysis of the GEDCOM files much easier.

    1. Glad you find it helpful for your work. There is always room for improvement, so let me know if you have suggestions.
