Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
[0.0.0-beta.9]
Added
CLI
- Added glx path command - Find the shortest relationship path between two people using breadth-first search. Traverses all relationship types (parent-child, marriage, sibling, godparent, etc.). Supports --max-hops to limit search depth and --json for machine-readable output
- Added --birthplace filter to glx query persons - Filter persons by birthplace using place ID or name substring (case-insensitive). Matches against both the born_at value and the resolved place name. Fixes #141
- Analyze flags uncited claims in notes - glx analyze evidence checks now detect assertion notes that reference sources (e.g., "per county history," "census shows") without a corresponding citation. Fixes #162
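As a rough illustration of the breadth-first search behind the path command, here is a minimal Python sketch. The graph shape, the function name, and the person IDs are hypothetical and are not taken from the glx source; the hop limit only mirrors the idea of --max-hops.

```python
from collections import deque


def shortest_path(graph, start, goal, max_hops=None):
    """Return the shortest list of person IDs from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # doubles as the visited set
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if max_hops is not None and hops >= max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = person
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append((neighbor, hops + 1))
    return None


# Hypothetical relationship graph: every relationship type is just an edge.
graph = {
    "person-a": ["person-b"],
    "person-b": ["person-a", "person-c"],
    "person-c": ["person-b", "person-d"],
    "person-d": ["person-c"],
}
```

Because breadth-first search explores people in order of distance, the first time the goal is reached is guaranteed to be via a shortest path.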
Fixed
- Stats lists duplicate entity IDs - glx stats now lists the specific duplicate IDs in its warning, consistent with glx analyze. Fixes #177
- Validate and archive loading skip non-.glx files - glx validate and archive loading now only process files with the .glx extension. Previously, .yaml and .yml files in the archive directory were also parsed, causing spurious validation errors on non-GLX files like .wikitree.yml. Fixes #178
[0.0.0-beta.8] - 2026-03-15
Added
CLI
- Added glx analyze command - Automated research gap analysis engine that cross-references all entities in a GLX archive to surface evidence gaps (missing dates, no parents, no events), evidence quality issues (unsupported assertions, single-source persons, orphaned citations/sources), chronological inconsistencies (death before birth, parent younger than child, implausible lifespan), and research suggestions (census years to search, vital records to locate). Supports --check to run a single category, --format json for machine-readable output, and person filtering by ID or name
- Added glx coverage command - Show source coverage matrix for a person, listing expected records (US census, vital, probate, land, military, church) and which are present vs missing. Flags high-priority gaps like the 1880 census. Supports --json output
- Added glx census add command - Bulk census import helper that generates GLX entities from a structured YAML template. Reads census year, location, household members, and citation details to produce person records, a census event with participants, source/citation entities, and evidence-based assertions. Supports matching members to existing archive persons by ID or name, --dry-run preview, and FAN notes
Library
- Exported ExtractFirstYear and ExtractPropertyYear - Year-extraction utilities are now public API for use by CLI commands and external consumers
Validation
- Moved temporal consistency checks to glx analyze - Death before birth, parent younger than child, and marriage before birth checks are now part of the analyze command's consistency category instead of the validator, keeping glx validate focused on structural and referential integrity
Standard Vocabularies
- Added vocabulary_type to property definitions - Properties can now reference a controlled vocabulary (e.g., vocabulary_type: gender_types) instead of a free-form value_type. Validation warns on out-of-vocabulary values. Mutually exclusive with value_type and reference_type
- Added gender_types vocabulary - First vocabulary-constrained property type. Standard entries: male, female, unknown, other, with GEDCOM SEX mappings. GEDCOM export now looks up gender→SEX via the vocabulary before falling back to hardcoded mappings
- Added marriage_type event property - Classification of marriage (civil, religious, common-law). Was used in GEDCOM import/export but missing from standard vocabulary
- Added primary_name person property - Simple display name fallback when structured name property is not available. Was used in event titles and data generation but missing from standard vocabulary
- Added blob_size media property - Size in bytes of inline binary data from GEDCOM 5.5.1 BLOB records. Was used in GEDCOM media import but missing from standard vocabulary
Changed
- GEDCOM encoding conversion now streams for charmap encodings - CP1252/ISO-8859-1 decoding uses transform.NewReader instead of reading the entire file into memory. Only ANSEL (which requires combining-mark reordering) buffers the full file. UTF-8 files pass through with near-zero overhead
- ANSEL converter handles multiple combining diacriticals - Consecutive combining marks preceding a base letter are now all buffered and emitted after the base letter in Unicode order, instead of only handling a single combining mark
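The combining-mark reordering can be pictured with a small Python sketch. ANSEL places diacritics before the base letter, while Unicode places them after. The table below is only a three-entry subset of the ANSEL combining range, the function name is hypothetical, and keeping the buffered marks in their original relative order is an assumption of this sketch rather than a statement about the converter.

```python
# Illustrative subset of ANSEL combining bytes -> Unicode combining marks.
ANSEL_COMBINING = {
    0xE1: "\u0300",  # grave accent
    0xE2: "\u0301",  # acute accent
    0xE8: "\u0308",  # diaeresis (umlaut)
}


def ansel_to_unicode(data: bytes) -> str:
    """Decode ASCII plus the combining subset above, moving marks after the base."""
    out = []
    pending = []  # combining marks seen since the last base letter
    for byte in data:
        if byte in ANSEL_COMBINING:
            pending.append(ANSEL_COMBINING[byte])
        elif byte < 0x80:
            out.append(chr(byte))   # base letter first
            out.extend(pending)     # then every buffered mark
            pending.clear()
        else:
            raise ValueError(f"byte 0x{byte:02X} is outside this sketch's table")
    out.extend(pending)  # trailing marks with no base letter are kept as-is
    return "".join(out)
```

Buffering a list of pending marks, rather than a single one, is what lets a base letter carry several diacritics.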
Changed
- Life History narrative mentions children - glx summary now includes children in the biographical narrative, listed by given name in birth order (e.g., "She had three children: Harriett, Elijah, and Mary."). Fixes #153
Fixed
- Analyze flags missing marriage events per spouse - glx analyze now checks each spouse relationship independently instead of checking for any marriage event. Persons with multiple spouses where one has an event and another doesn't are now correctly flagged with the specific spouse name. Fixes #166
- Places command detects person property references - glx places no longer reports places as "Unreferenced" when they are used in person properties (born_at, died_at, buried_at, residence). Also checks assertion values for place-reference properties. Handles string, structured map, and temporal list property shapes. Fixes #145
- Analyze checks citations for census coverage - glx analyze now checks assertions' citations and sources (not just census event entities) when determining whether a census year is covered. Previously, census records documented only via citations were still suggested as missing, contradicting glx coverage output. Fixes #140
- BEF date prefix respected in census suggestions - glx analyze and glx coverage now treat BEF <year> death dates as exclusive upper bounds. A person with died_on: "BEF 1870" no longer gets 1870 census suggestions. Fixes #165
- Summary shows marriages in chronological order - glx summary now sorts spouses by full marriage date (earliest first, using the same date sort key as glx timeline) instead of relationship ID order. Correctly orders marriages within the same year. Undated marriages sort after dated ones. The Life History narrative also reflects the correct order. Fixes #136
- Life History narrative formats ISO dates as readable text - Dates like 1863-06-18 now render as "on June 18, 1863" instead of "in 1863-06-18". Handles full dates, year-month, and prefixed dates (ABT, BEF, AFT)
- Census suggestions capped at plausible lifespan - glx analyze and glx coverage no longer suggest census years beyond birth_year + 100 when no death date is known. Previously, a person born ~1832 would get suggestions for 1940 and 1950 censuses. Fixes #130
- Burial events infer death for census suggestions - When died_on is not set but a burial event exists, the burial date is used as the death upper bound for census suggestions. Prevents suggesting post-death censuses for persons with burial records but no explicit death date. Fixes #134
- 1890 census annotated as mostly destroyed - glx coverage and glx analyze now note that the 1890 US Census was mostly destroyed in a 1921 fire, so researchers don't waste time searching for non-existent records. Fixes #131
- Timeline includes person's own birth and death - glx timeline now synthesizes birth/death entries from born_on/died_on person properties when no corresponding event entity exists. Previously these were omitted, making the person's own vital events the only events missing from their timeline. Fixes #142
- GEDCOM import now converts non-UTF-8 encodings - Files with CHAR ANSI (Windows-1252), CHAR cp1252, CHAR ANSEL, or CHAR ISO-8859-1 are now automatically converted to UTF-8 during import. Previously, non-ASCII characters (German umlauts, accented letters, copyright symbols) were stored as raw bytes, producing !!binary YAML tags, garbled event titles, and {"type":"Buffer"} place names in the web UI
- GEDCOM date import mangled when day-of-month matches level number - Dates like 2 AUG 1944 (day 2) were imported as 2 DATE 2 AUG 1944 because the parser's value extraction matched the level number instead of the actual value. Fixed by walking past tokens positionally instead of using string search
- Date year extraction now handles 1–3 digit years - Year extraction previously hardcoded a 4-digit assumption (\d{4}), silently ignoring dates like 800, 476, or ABT 476. All four extraction sites (query filtering, timeline sorting, temporal validation, event titles) now support 1–4 digit years. Day-of-month values (e.g., 15 in 15 MAR 1850) are correctly disambiguated. Timeline sort keys are zero-padded to 4 digits for proper chronological ordering. Fixes #108
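The year-extraction fix can be sketched in a few lines of Python. This is an illustration of the idea only, not the library's ExtractFirstYear: the function names and the "number directly before a month name is a day" rule are assumptions made for this sketch.

```python
MONTHS = {"JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"}


def extract_first_year(text):
    """Return the first 1-4 digit year in a date string, or None."""
    # Treat ISO separators as spaces so 1863-06-18 yields 1863 first.
    tokens = text.upper().replace("-", " ").split()
    for i, token in enumerate(tokens):
        if not (token.isascii() and token.isdigit() and len(token) <= 4):
            continue  # skips prefixes such as ABT, BEF, AFT and month names
        next_token = tokens[i + 1] if i + 1 < len(tokens) else ""
        if next_token in MONTHS:
            continue  # a number directly before a month name is a day
        return int(token)
    return None


def year_sort_key(text):
    """Zero-pad the year so string comparison matches chronological order."""
    year = extract_first_year(text)
    return None if year is None else f"{year:04d}"
```

Without the padding, a plain string comparison would place "800" after "1850", which is why the sort key is fixed at four digits.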
[0.0.0-beta.7] - 2026-03-10
Added
CLI
- Added glx export command - Export GLX archives to GEDCOM 5.5.1 or 7.0 format. Supports both single-file and multi-file archives as input. Reconstructs GEDCOM FAM records from GLX relationships, converts dates/places/names back to GEDCOM format, and preserves sources, repositories, media, citations, and notes. Use --format 70 for GEDCOM 7.0 output
- Added glx timeline command - Display chronological events for a person, including direct events and family events (spouse/child births, parent deaths) via relationship traversal. Supports --no-family flag to exclude family events; undated events shown in a separate section
- Added glx summary command - Comprehensive person profile showing identity, vital events, life events, family (spouses, parents, siblings), other relationships, and an auto-generated life history narrative
- Added glx ancestors and glx descendants commands - Display ancestor/descendant trees using box-drawing characters. Traverses parent-child relationships with --generations flag to limit depth. Handles biological, adoptive, foster, and step-parent types with cycle detection
- Added glx vitals command - Display vital records (name, sex, birth, christening, death, burial) for a person by ID or name search, plus any other life events they participated in
- Added glx cite command - Generate formatted citation text from structured fields (source title, type, repository, URL, accessed date, locator), eliminating repetitive manual citation_text writing
- Added --source and --citation filters to glx query assertions - Filter assertions by source or citation ID to find all claims derived from a specific source
- Improved glx query persons --name to search all name variants - Now matches across birth names, married names, maiden names, and as-recorded variants (temporal name lists), not just the primary name. Results show alternate names with "aka:" suffix
- Added glx diff command - Compare two GLX archive states with genealogy-aware diffing. Shows added, modified, and removed entities with field-level detail, confidence upgrade/downgrade tracking, and new evidence metrics. Supports summary, verbose, short, and JSON output modes. Use --person to filter changes for a specific person
- Added glx cluster command - FAN (Friends, Associates, Neighbors) club analysis for brickwall research. Cross-references census households, shared events, and place overlap to identify associates of a target person. Ranks associates by connection strength with compound scoring. Supports --place, --before, --after filters and --json output
- Added glx duplicates command - Detect potential duplicate person records using a weighted scoring model (name similarity with Levenshtein distance and nickname matching, birth/death year proximity, place match, shared relationships and events). Supports person-specific filtering and JSON output. Automatically skips persons already linked by relationships
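To make the weighted scoring idea behind duplicate detection concrete, here is a minimal Python sketch that combines Levenshtein-based name similarity with birth-year proximity. The weights (0.75 and 0.25), the 5-year tolerance, the record shape, and all function names are hypothetical; the actual model also considers nicknames, places, and shared relationships, which are omitted here.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def name_similarity(a, b):
    """Scale edit distance to 0..1, where 1 means identical names."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def year_proximity(y1, y2, tolerance=5):
    """1.0 for the same year, falling linearly to 0.0 at the tolerance."""
    if y1 is None or y2 is None:
        return 0.0
    return max(0.0, 1.0 - abs(y1 - y2) / tolerance)


def duplicate_score(p, q):
    """Weighted sum of the two signals; the weights are illustrative only."""
    return (0.75 * name_similarity(p["name"], q["name"])
            + 0.25 * year_proximity(p.get("born"), q.get("born")))
```

A caller would typically compare every candidate pair and report those whose score exceeds some threshold.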
Event Entity
- Added optional title field - Human-readable label for events (e.g., "1860 Census — Webb Household"). Auto-generated on GEDCOM import (e.g., "Birth of Robert Webb (1815)", "Marriage of John Smith and Jane Doe (1850)")
GEDCOM Import
- Non-standard date preservation - BCE dates, Julian/Hebrew/French Republican calendar dates, and dual-year dates are preserved as raw strings instead of being dropped
- TITL with DATE/PLAC sub-records - Title properties with dates and places are stored as temporal list items and roundtrip correctly
- Empty OCCU with PLAC fallback - OCCU records with empty values but PLAC sub-records now extract the place text as the occupation value
- HEAD-level NOTE preservation - Notes on GEDCOM HEAD records are now imported and exported
- Family-level RESI import - RESI records under FAM are now distributed to both spouses as residence properties
- Family-level NOTE import/export - NOTE records on FAM are now stored on the relationship's Notes field and roundtrip correctly
GEDCOM Export
- Inline SOUR citations on individual events - Birth, death, burial, and other individual events now preserve SOUR citations during export
- Single-spouse family marriages - FAM records with only HUSB or WIFE now export marriage relationships and events instead of being silently dropped
- Multiple MARR events per family - Families with multiple MARR records now preserve all marriage events
- Marriage TYPE export - The marriage_type property is now exported as a TYPE sub-record on MARR
- Family event TYPE/properties export - Family events (EVEN, ENGA, etc.) now export event_subtype and other event properties (TYPE, CAUS, AGE) that were previously lost
- HEAD metadata roundtrip - LANG, FILE, COPR sub-records from the original GEDCOM HEAD are now preserved through import/export
- Single-value RESI export - RESI stored as scalar (not list) now exports correctly instead of being silently dropped
- Multi-family children placed in all matching families - Children belonging to multiple FAM records (e.g., birth family + step-family) are now placed in all matching families instead of only the first match
Validation
- Added temporal consistency checks - Validator now warns on: death year before birth year, parent born after child, marriage event before participant's birth. Reported as warnings since dates are often estimates
Documentation
- Added Westeros example archive - Large-scale example featuring 790+ characters from A Song of Ice and Fire with full evidence chains, 200+ custom vocabulary types, and temporal properties. Hosted at github.com/genealogix/glx-archive-westeros
- Added Hands-On CLI Guide - Step-by-step walkthrough of every glx command using the Westeros demo archive, with real output examples
Fixed
- SOUR citation duplication on multi-value properties - Assertion-based SOUR references now filter by matching value, preventing N×N duplication when a person has multiple values for TITL, OCCU, etc.
[0.0.0-beta.6] - 2026-03-08
Added
CLI
- Added glx places command - Analyze places for ambiguity and completeness: flags duplicate names, missing coordinates, missing types, hierarchy gaps, and unreferenced places with canonical hierarchy paths
- Added glx query command - Filter and list entities from a GLX archive with type-specific flags: --name, --born-before, --born-after for persons; --type, --before, --after for events; --confidence, --status for assertions
- Added glx stats command - Summary dashboard showing entity counts, assertion confidence distribution, and entity coverage for quick feedback on archive health
Build & Release
- Added make release-snapshot target - Build cross-platform binaries locally without publishing, using GoReleaser snapshot mode
- Updated release workflow to latest action versions - actions/checkout@v4 (with fetch-depth: 0 for proper changelog), actions/setup-go@v5, goreleaser/goreleaser-action@v6
Person Entity
- Added name variation tracking - Expanded the name.fields.type classification field with standard values for alternate spellings, abbreviations, and as-recorded forms (aka, maiden, anglicized, professional, as_recorded). Added documentation and examples for representing name variations like "R. Webb" vs. "Robert Webb"
Standard Vocabularies
- Added original_place_name citation property - Records the verbatim place name from a source before normalization to a place entity (e.g., "The Town Of Oakdale" vs the normalized place reference)
- Added relationship types - neighbor, coworker, housemate for census/social records; apprenticeship, employment, enslavement, relative for occupational and generic kinship relationships
- Added event types - legal_separation, taxation, voter_registration for legal/administrative events; military_service, stillborn, affiliation for service periods, stillbirths, and memberships
- Added source types population_register, tax_record, notarial_record - Common European and colonial record types
- Expanded military source type description - Now includes draft registrations and muster rolls
Participant Object
- Added properties to participants - Participants across events, relationships, and assertions can now carry per-participant properties like age_at_event, enabling shared events (census, passenger lists) to record individual data without creating separate events per person
- Participant properties validated against parent entity vocabulary - Event participant properties validated against event_properties, relationship participant properties against relationship_properties, assertion participant properties against event_properties
Assertion Entity
- Added existential assertions - Assertions no longer require property or participant; an assertion with only subject and evidence asserts the entity's existence, optionally at a specific date (#26)
GEDCOM Import
- Import HEAD metadata - GEDCOM HEAD record fields (export date, source file, copyright, language, source system/version/corporation, GEDCOM version, character set, notes) are now stored in a metadata section on the GLX archive instead of being discarded after logging
- Import SUBM metadata - GEDCOM SUBM submitter information (name, address, phone, email, website) is now stored in metadata.submitter on the GLX archive
Data Model
- Added Metadata type - New top-level metadata field on GLX archives for storing import provenance information
- Added Submitter type - Nested within metadata to hold submitter contact details
Changed
Specification
- Removed hard-coded vocabulary counts - Replaced "N standardized type codes" with descriptive text to prevent stale counts as vocabularies grow
- Improved custom type example - Custom event type example now shows defining custom participant roles (apprentice, master) alongside the custom event type
- Clarified subject participant role - Documented as preferred over principal
Fixed
Specification
- Fixed confidence levels example format - Core concepts example now uses the correct label/description structure instead of simple key-value strings
- Fixed citation GEDCOM mapping - Corrected invalid SOUR.CITN.EXID tag to SOUR.EXID
- Fixed core-concepts.md formatting - Property Vocabularies heading was merging with preceding table
- Fixed glossary Secondary Evidence example - Replaced "census records" (primary evidence) with "published indexes, compiled genealogies"
[0.0.0-beta.5] - 2026-03-06
Added
Standard Vocabularies
- Added url and accessed properties for digital sources - Sources can now record a url property, and citations can record an accessed date for when an online source was last verified (#21)
- Added race person property - Temporal string property for recording racial classifications as they appear in historical documents such as census records (#24)
- Added url and external_ids citation properties - Citations can now record a direct URL to cited material and external identifiers (e.g., FamilySearch ARK) for record-level specificity (#23)
- Added type field to external_ids property - All external_ids properties (person, source, citation, repository) now support a structured fields.type to record the issuing authority (e.g., FamilySearch URI from GEDCOM EXID.TYPE) (#32)
- Added type field to name property - Name property now supports a fields.type to classify name usage (e.g., birth, married, alias) (#25)
Assertion Entity
- Added status field to assertion entity - Assertions can now record a research status (e.g., proven, disproven, speculative) independently of confidence, allowing researchers to distinguish between certainty and verification state (#27)
GEDCOM Import
- Import NAME.TYPE subfield - GEDCOM NAME.TYPE values (BIRTH, MARRIED, AKA, etc.) are now lowercased and stored in the name property's type field (#25)
- Import EXID on citations - GEDCOM 7.0 EXID tags on source citations are now imported as external_ids citation properties (#32)
- Structured EXID import - GEDCOM EXID.TYPE is now stored in fields.type instead of being concatenated into the ID string; applies to all entity types (#32)
Fixed
GEDCOM Import
- Multiple GEDCOM NAME records no longer silently dropped (#29) - When a person has multiple NAME records (birth name, married name, etc.), all names are now stored as a temporal list instead of only keeping the last one
- FAM event processing no longer depends on HUSB/WIFE tag order (#15) - Family events (CENS, ENGA, MARB, etc.) are now collected in a first pass and processed after spouse IDs are extracted, so GEDCOM tag order no longer matters
- Census NOTE no longer discarded when SOUR exists (#30) - NOTE text on CENS records is now appended to existing citation notes when SOUR sub-records are present, instead of being silently lost
- Marriage/divorce events use start_event/end_event instead of properties - GEDCOM MARR and DIV events are now correctly linked to relationships via the top-level start_event and end_event fields, eliminating non-vocabulary marriage_event/divorce_event property warnings
- Append residence on PLAC-without-DATE instead of overwriting - When residence came from a GEDCOM RESI tag or census-derived CENS data with a PLAC but no DATE, the residence property was overwritten; it is now appended (#22)
[0.0.0-beta.4] - 2026-03-04
Added
Standard Vocabularies
- Added township place type - Township is a common administrative division in U.S. census and land records, distinct from town (a geographic settlement vs. a civil subdivision of a county) (#16)
Fixed
Validation
- Suggest correct vocabulary key on hyphen/underscore mismatch - When a reference fails validation due to a hyphen/underscore swap (e.g., birth_date vs birth-date), the error message now suggests the correct key (#19)
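One way to picture the hyphen/underscore suggestion is to compare keys after normalizing both separators to the same character. The Python sketch below is an assumption about the general approach, with a hypothetical function name and vocabulary; it is not the validator's actual code.

```python
def suggest_key(unknown, known_keys):
    """Return a known key that differs only by hyphen/underscore, or None."""
    def canonical(key):
        return key.replace("-", "_")

    target = canonical(unknown)
    for key in known_keys:
        if key != unknown and canonical(key) == target:
            return key
    return None


# Hypothetical vocabulary keys, for illustration only.
vocabulary = ["birth-date", "death-date"]
suggestion = suggest_key("birth_date", vocabulary)
if suggestion is not None:
    message = f"unknown key 'birth_date'; did you mean '{suggestion}'?"
```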
CLI
- Show directory contents in glx init non-empty error - When glx init fails because the target directory is not empty, the error message now lists up to 5 files found (e.g., .DS_Store, .git), helping users diagnose unexpected blockers like hidden files or sync artifacts (#18)
- Remove self-referencing replace directive that blocks go install - The go.mod contained a no-op self-referencing replace directive that prevented go install github.com/genealogix/glx/glx@latest from working (#17)
GEDCOM Import
- Deduplicate evidence references - When a GEDCOM record references the same source multiple times, extractEvidence() and extractEventDetails() now skip IDs already seen, preventing duplicate entries that violate unique constraints in downstream consumers (#13)
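Skipping IDs that were already seen amounts to an order-preserving deduplication. A minimal Python sketch of that pattern follows; the function name and the sample IDs are hypothetical.

```python
def dedupe_ids(ids):
    """Drop repeated IDs while keeping the first occurrence of each, in order."""
    seen = set()
    unique = []
    for entity_id in ids:
        if entity_id in seen:
            continue  # this source was already referenced by the record
        seen.add(entity_id)
        unique.append(entity_id)
    return unique
```

Keeping the original order matters because the first reference usually reflects the order of the source record.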
Documentation & Website
- Fix dead links and website issues - Rewrote 83 dead links across the site to point to GitHub URLs and VitePress paths, added solid background to navbar on home page, and fixed module path resolution (#10)
- Fix Go Report Card link - Corrected badge link in CLI README to point to the repository root (#11)
[0.0.0-beta.3] - 2026-02-10
Added
Census Event Type
- Added census event type to standard vocabulary - Census enumeration events (CENS GEDCOM tag) now included in event-types.glx
Schema Embeds
- CitationPropertiesSchema and SourcePropertiesSchema embed variables - Completes the pattern established by all other vocabulary schema embeds in embed.go
GEDCOM Import: Eliminate Meaningless Citations
- Bare source references no longer create empty citation entities - When a GEDCOM SOUR tag references a source without any citation-level detail (no PAGE, DATA, TEXT, QUAY, NOTE, or OBJE subrecords), the assertion or event now references the source directly via the sources field instead of creating a citation that only contains a source reference
- Added PropertySources constant for event/relationship properties
Changed
Assertion Entity Improvements
Renamed claim to property
- Renamed claim field to property - The field name now matches the vocabulary terminology (property vocabularies)
- Updated JSON schema, Go types (Assertion.Claim → Assertion.Property), all specification examples, example archives, test data, and terminology throughout docs
- Renamed test directories: assertion-unknown-claim → assertion-unknown-property, assertion-participant-and-claim → assertion-participant-and-property, invalid-assertion-claims → invalid-assertion-properties
Typed Subject Reference
- Changed subject from string to typed reference object - Prevents entity ID collisions in large archives
- Must specify exactly one of: person, event, relationship, or place
- Before: subject: person-john-smith → After: subject: { person: person-john-smith }
- Added EntityRef Go type with Type() and ID() helper methods
- Updated validation to ensure exactly one field is set and referenced entity exists
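The "exactly one of" rule for the typed subject can be sketched as follows. This Python example is only a model of the validation described above: the function name, the error wording, and the archive shape (a mapping from entity kind to a set of IDs) are assumptions, not the Go implementation.

```python
SUBJECT_KINDS = ("person", "event", "relationship", "place")


def validate_subject(subject, archive):
    """Return a list of error strings; an empty list means the subject is valid."""
    if not isinstance(subject, dict):
        return ["subject must be a typed reference object, not a plain string"]
    errors = []
    unknown = sorted(set(subject) - set(SUBJECT_KINDS))
    if unknown:
        errors.append(f"unknown subject fields: {', '.join(unknown)}")
    chosen = [kind for kind in SUBJECT_KINDS if subject.get(kind)]
    if len(chosen) != 1:
        errors.append("subject must set exactly one of: " + ", ".join(SUBJECT_KINDS))
        return errors
    kind = chosen[0]
    if subject[kind] not in archive.get(kind, set()):
        errors.append(f"{kind} '{subject[kind]}' does not exist in the archive")
    return errors


# Hypothetical archive index: entity kind -> known IDs.
archive = {"person": {"person-john-smith"}, "place": {"place-oakdale"}}
```

Because the kind is part of the reference, a person and a place that happen to share an ID can no longer be confused.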
Media as Assertion Evidence
- Added media as a third evidence option for assertions - Assertions can now reference media entities directly as evidence, alongside citations and sources
- Useful for direct visual evidence like gravestone photos, handwritten documents, or family photographs
- JSON schema anyOf evidence constraint updated to include media
Temporal date Field
- Added date field to assertions - Assertions can now specify a date or date range indicating when the asserted property value applies, enabling precise temporal targeting for properties like occupation, residence, and religion that change over time
- Added Date field to Assertion Go struct and date property to assertion JSON schema
- Assertion value field is now required when property is present
Vocabulary Consolidation
Adoption Modeling
- Removed redundant adoption relationship type - Use adoptive-parent-child relationship type instead
- Clarified adoption semantics: adoption event type records the legal proceeding; adoptive-parent-child relationship type models the ongoing bond
- Removed RelationshipTypeAdoption constant from Go code
Godparent Modeling
- Clarified godparent dual usage - Participant role godparent for event participation (baptism sponsor); relationship type godparent for the ongoing bond
- Added godchild participant role for use in godparent relationships
Type System
Unified Participant Type
- Unified participant types - Consolidated EventParticipant, RelationshipParticipant, and AssertionParticipant into single Participant struct
  - All three had identical structure: person, role, notes fields
  - Event.Participants, Relationship.Participants, and Assertion.Participant now all use the unified type
Property Vocabularies
Media Properties
- New media-properties.glx vocabulary - Standard properties for media entities:
  - subjects - People depicted or referenced in the media (multi-value)
  - width, height - Dimensions in pixels for images/video
  - duration - Duration in seconds for audio/video
  - file_size - File size in bytes
  - crop - Crop coordinates as integers (top, left, width, height)
  - medium - Physical medium type (photograph, document, film)
  - original_filename - Original filename before import
  - photographer - Person who created the media
  - location - Place where the media was created
- Added Properties field to Media struct and MediaProperties to GLXFile
Repository Properties
- New repository-properties.glx vocabulary - Standard properties for repository entities:
  - phones - Phone numbers for the repository (multi-value)
  - emails - Email addresses for the repository (multi-value)
  - fax - Fax number
  - access_hours - Hours of operation or access availability
  - access_restrictions - Any restrictions on access (appointment required, subscription, etc.)
  - holding_types - Types of materials held as YAML arrays (multi-value)
  - external_ids - External identifiers from other systems like FamilySearch, WikiTree (multi-value)
- Added RepositoryProperties to GLXFile
- Moved contact fields (phone, email) from direct entity fields to properties
Citation Properties
- New citation-properties.glx vocabulary - Standard properties for citation entities:
  - locator - Location within source (consolidates former page and locator direct fields; GEDCOM PAGE)
  - text_from_source - Transcription or excerpt of relevant text (moved from direct entity field)
  - source_date - Date when the source recorded the information (from GEDCOM DATA.DATE)
- Added Properties field to Citation struct, CitationProperties to GLXFile, and vocabulary specification section
Source Properties
- New source-properties.glx vocabulary - Standard properties for source entities:
  - abbreviation - Short reference name (from GEDCOM ABBR)
  - call_number - Repository catalog number (from GEDCOM CALN)
  - events_recorded - Types of events documented by this source (multi-value, from GEDCOM EVEN)
  - agency - Responsible agency (from GEDCOM AGNC)
  - coverage - Geographic/temporal scope of source content
  - external_ids - External system identifiers (multi-value)
- Added Properties field to Source struct, SourceProperties to GLXFile, and source-properties.schema.json
Multi-Value Property Support
- Added multi_value field to PropertyDefinition - Properties can now be marked as supporting multiple values
- Validation correctly handles array values for multi-value properties
GEDCOM Import
Media/OBJE Import
- Implemented inline OBJE handling for all record types - Media references and embedded OBJE records on individuals, events, sources, families, submitters, census records, and person property tags are now imported (previously only marriage events and top-level OBJE were handled)
- Added handleOBJE shared helper for XRef references, GEDCOM 7.0 @VOID@ pointers, and embedded OBJE
- Added BLOB data handling, URL-type multimedia import, and OBJE processing in extractEventDetails
- Torture test media import improved from 2 to 32 entities (100% coverage)
Media File Import
- Media files are now copied into the archive during GEDCOM import - Relative FILE paths copied to media/files/; BLOB data decoded and written to files
- Media URIs rewritten to archive-relative paths; URL and absolute path references left as-is
- Filename deduplication with counter suffixes; missing source files produce warnings, not errors
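Filename deduplication with counter suffixes can be illustrated with a short Python sketch. The suffix format used here (photo-1.jpg, photo-2.jpg) and the function name are assumptions for the example; the changelog does not state the exact naming scheme used by the importer.

```python
import os


def unique_filename(name, taken):
    """Return a name not yet in `taken`, adding a counter suffix when needed."""
    candidate = name
    stem, ext = os.path.splitext(name)
    counter = 1
    while candidate in taken:
        candidate = f"{stem}-{counter}{ext}"  # hypothetical suffix format
        counter += 1
    taken.add(candidate)  # reserve the name for later calls
    return candidate
```

Calling it repeatedly with the same set of taken names keeps earlier files intact while later imports receive the next free suffix.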
Census (CENS) Support
- Implemented CENS tag handling for individual and family records - Census records treated as evidence sources, not events
- Each CENS creates a Source (type: census) and Citation; extracts PLAC for temporal residence property
- Family-level CENS applies census data to both husband and wife
- Added createPropertyAssertionWithCitations() helper
Vocabulary-Driven Tag Resolution
- Added gedcom field to PropertyDefinition struct - Property vocabulary entries can now declare their corresponding GEDCOM tag
- Added GEDCOM tag mappings to all 6 property vocabularies (person, event, citation, source, repository, media)
- Added external_ids to person-properties.glx and event detail properties (age_at_event, cause, event_subtype) to event-properties.glx
- Added GEDCOMIndex reverse lookup infrastructure; replaced hardcoded mappings with vocabulary-driven lookups
- Added gedcom field and fields/FieldDefinition to all 8 property vocabulary JSON schemas
- Updated vocabulary specification documentation with gedcom field and GEDCOM column
Evidence and Citation Handling
- Assertions require citations - Assertions are now only created when SOUR tags are present
- Embedded citation support - SOURCE_CITATION without pointer creates synthetic Source entity
- Properties-based storage - Source, media, and citation tags now stored in vocabulary-defined properties instead of notes
- Citation linkage on media - SOUR on OBJE now properly links via citation.Media
Validation
- Place hierarchy cycle detection - Validates that place parent references don't form cycles (e.g., A -> B -> C -> A). Reports exactly one error per cycle with the full cycle path in the error message.
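Since each place has at most one parent, a cycle can be found by walking parent links and remembering which places belong to the current walk. The Python sketch below shows one way to report each cycle exactly once with its full path; the function name and the input shape (place ID mapped to parent ID or None) are assumptions, not the validator's code.

```python
def find_place_cycles(parents):
    """Return one path per cycle, e.g. ['a', 'b', 'c', 'a'] for a -> b -> c -> a."""
    state = {}   # place ID -> "active" (on the current walk) or "done"
    cycles = []
    for start in parents:
        if start in state:
            continue
        path = []
        node = start
        while node is not None and node not in state:
            state[node] = "active"
            path.append(node)
            node = parents.get(node)
        if node is not None and state[node] == "active":
            # The walk returned to a place on its own path: that is a cycle.
            cycles.append(path[path.index(node):] + [node])
        for visited in path:
            state[visited] = "done"  # never report or re-walk these again
    return cycles


def format_cycle(cycle):
    """Render a cycle path for an error message."""
    return " -> ".join(cycle)
```

Marking every visited place as done is what keeps a place that merely points into a cycle from producing a second report.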
Place Entity
- Moved jurisdiction, place_format, and alternative_names to properties - Now stored as vocabulary-defined properties instead of dedicated entity fields. alternative_names simplified from AlternativeName/DateRange types to a temporal, multi-value string property.
Relationship Entity
- Consolidated description into properties.description - Removed as a top-level field
Source Entity
- Consolidated creator field into authors - Removed creator from spec, schema, and Go types
Library Package Restructuring
- Moved core library from glx/lib/ to go-glx/ - The library is now at the repository root for clean external imports
- Renamed package from lib to glx - External consumers import as glxlib "github.com/genealogix/glx/go-glx" and use glxlib.GLXFile, glxlib.NewSerializer(), etc.
- Updated all CLI files to use new import path and glxlib. qualifier
CLI
- Changed glx import default format - Now defaults to multi-file (-f multi) instead of single-file
JSON Schema URLs
- Standardized schema $id URLs - All JSON schemas now use consistent GitHub raw content URLs; removed references to schema.genealogix.io and genealogix.org domains
Documentation
- Rewrote Migration from GEDCOM guide - Expanded from a skeleton to a comprehensive guide covering all supported GEDCOM tags, CLI flags, field mapping tables, common challenges, troubleshooting, and GEDCOM 5.5.1 vs 7.0 differences
- Clarified vocabulary file location is flexible - Spec, quickstart, and vocabulary docs now emphasize that vocabulary files can live anywhere in the archive, not only in vocabularies/
- Streamlined Introduction - Simplified 1-introduction.md from 120 to 63 lines
- Restructured Core Concepts - Reorganized 2-core-concepts.md to emphasize flexibility; new section order: Archive-Owned Vocabularies → Entity Relationships → Data Types → Properties → Assertions → Evidence Chain → Collaboration
- Merged Data Types into Core Concepts - Integrated 6-data-types.md as section 3; deleted standalone file
- Added Glossary to specification - Moved from docs/guides/glossary.md to specification/6-glossary.md with "Property" and "Temporal Property" definitions
- Updated table of contents and fixed broken links after restructuring
- Removed .md extensions from ~40 internal links for VitePress compatibility
- Standardized GEDCOM mapping table headers across all 8 entity type files
- Added Properties sections to place.md and relationship.md
- Standardized entity file structure across all entity type docs
- Added Schema Reference sections to event, relationship, place, citation, and repository entity docs
- Added naming convention note (hyphens for file/entry names, underscores for YAML section keys) to core concepts
- Moved "Change Tracking with Git" section before "Next Steps" in core-concepts
- Removed 59 file path comments from YAML code blocks
- Standardized validation rules to reference vocabularies with links
- Added participants to all event examples that were missing the required field
- Enhanced VitePress sidebar - Core Concepts promoted to its own collapsible sidebar section with 8 direct anchor links
- Updated quickstart.md - Examples updated to reflect schema changes
- Updated best-practices.md - Assertion examples updated to use typed subject reference and property field
Fixed
Specification
- Fixed Place hierarchy example that used duplicate YAML top-level keys
- Fixed examples using incorrect field names throughout specification (description → notes, value → notes, file: → uri:, death_year → died_on, married_on → born_on, residence_dates → residence, registration_district → district)
- Fixed assertion example using invalid date format (circa 1825 → ABT 1825)
- Removed undocumented birth_surname from person name example
- Fixed broken anchor link in repository.md (#repository-properties → #repository-properties-vocabulary)
- Standardized all event examples to use subject role consistently (replaced remaining principal usages)
- Fixed Event date field type from string/object to string (object form was never documented)
- Fixed Event See Also to say Person "participates in events" instead of "contains event references"
- Fixed broken relative links in 1-introduction.md and specification/README.md
- Fixed residence reference type example in 2-core-concepts.md to use temporal format
- Added minimum participant count (at least 2) to relationship fields table
- Removed stale Created At and Created By glossary entries
- Fixed glossary Event and Event Type definitions that incorrectly included occupation and residence
- Fixed labels: "Event/Fact" → "Event", "living status" → "birth/death dates"
- Replaced living: true boolean example with non-misleading property names
- Replaced "occupation" with "immigration" as event type example in 3 locations
- Fixed Event key properties ("description" → "notes") and Media key properties ("file path" → "URI") in entity-types README
- Fixed place types count from 14 to 15; added missing locality to place-types.glx standard vocabulary
- Fixed vocabulary directory structure example in core-concepts
GEDCOM Import
- Repository deduplication - Repositories with the same name and location are now deduplicated during import
- Dependency-ordered record processing - Records now grouped by type and processed in dependency order
- Repository-to-source linking - Sources now correctly link to their repository even when REPO records appear after SOUR records in the file
- NOTE reference resolution - Shared NOTE records now resolved to actual text content during import
- CONT/CONC text continuation - Long text fields spanning multiple lines now properly combined
- CR line ending support - GEDCOM files using CR-only line endings (old Mac Classic format) now import correctly
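For readers curious how CR-only files can be read alongside LF and CRLF ones, here is a minimal sketch using a custom bufio.SplitFunc. It illustrates the general technique and is not the importer's actual implementation; scanAnyLine and splitLines are hypothetical names.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// scanAnyLine is a bufio.SplitFunc that ends a line at LF, CRLF, or a lone CR.
func scanAnyLine(data []byte, atEOF bool) (advance int, token []byte, err error) {
	if atEOF && len(data) == 0 {
		return 0, nil, nil
	}
	if i := bytes.IndexAny(data, "\r\n"); i >= 0 {
		if data[i] == '\n' {
			return i + 1, data[:i], nil
		}
		// data[i] is CR: look one byte ahead to tell CR from CRLF.
		if i+1 < len(data) {
			if data[i+1] == '\n' {
				return i + 2, data[:i], nil
			}
			return i + 1, data[:i], nil
		}
		if atEOF {
			return i + 1, data[:i], nil
		}
		// CR is the last buffered byte; ask the scanner for more data.
		return 0, nil, nil
	}
	if atEOF {
		// Final line without a terminator.
		return len(data), data, nil
	}
	return 0, nil, nil
}

// splitLines applies scanAnyLine to an in-memory string.
func splitLines(s string) []string {
	sc := bufio.NewScanner(strings.NewReader(s))
	sc.Split(scanAnyLine)
	var out []string
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out
}

func main() {
	// Mixed terminators: lone CR, CRLF, then LF.
	for _, line := range splitLines("0 HEAD\r1 CHAR UTF-8\r\n0 TRLR\n") {
		fmt.Printf("%q\n", line)
	}
}
```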
Code Quality & Robustness
- unmarshalVocab now returns error on missing YAML key - Previously silently returned nil when the expected top-level key was absent, causing downstream validation to think no vocabulary entries exist
- appendMediaID safe type assertion - Now handles []any (from YAML deserialization) instead of panicking on a bare type assertion to []string
- extensionFromMimeType deterministic output - MIME types with multiple extensions (.jpg/.jpeg, .tif/.tiff) now return a consistent preferred extension instead of random map iteration order
- Directory emptiness check error handling - isDirectoryEmpty now only treats io.EOF as "empty", not all errors (permissions, I/O failures now properly reported)
- Media file copy error handling - copyMediaFile now checks os.IsNotExist before fallback to URL-decoded paths, preserving original errors for permissions/disk issues
- BLOB character validation - decodeGEDCOMBlob now validates characters are in valid GEDCOM BLOB range ('.' to 'm') before decoding, preventing silent corruption
- EXID ID validation - GEDCOM external ID extraction now validates id field exists before use, skipping entries without usable IDs
- Event Properties initialization - extractEventDetails now ensures event.Properties map is initialized before writing, preventing panics
- Archive validation wiring - LoadArchiveWithOptions now correctly passes schemaValidate flag to serializer for referential integrity validation
- Property vocabulary documentation - Fixed value_type and reference_type field requirements (marked "No*" instead of "Yes*" to match "exactly one required" constraint)
- Test assertion completeness - TestRunValidate_MediaFileMissing now captures stdout and verifies warning is actually produced
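The directory emptiness fix above follows a common Go pattern: only io.EOF from reading the first directory entry means "empty", and any other error is passed on to the caller. The sketch below reuses the name isDirectoryEmpty from the entry, but the body is an assumption about its shape rather than a copy of the project's code.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// isDirectoryEmpty reports whether path contains no entries. Errors other
// than io.EOF (missing directory, permissions, I/O failures) are returned
// instead of being interpreted as "empty".
func isDirectoryEmpty(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	// Readdirnames(1) returns io.EOF only when there are no entries left.
	_, err = f.Readdirnames(1)
	if errors.Is(err, io.EOF) {
		return true, nil
	}
	if err != nil {
		return false, err
	}
	return false, nil
}

func main() {
	dir, err := os.MkdirTemp("", "glx-empty-check")
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer os.RemoveAll(dir)

	empty, err := isDirectoryEmpty(dir)
	fmt.Println("fresh directory:", empty, err)

	if err := os.WriteFile(filepath.Join(dir, "person.glx"), nil, 0o644); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	empty, err = isDirectoryEmpty(dir)
	fmt.Println("after adding a file:", empty, err)

	_, err = isDirectoryEmpty(filepath.Join(dir, "missing"))
	fmt.Println("missing directory reports an error:", err != nil)
}
```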
CLI
- glx validate single file behavior - Validating a single file now only validates that file's structure instead of loading the entire current directory. Cross-reference validation is skipped for single files with a warning message. Directory validation still performs full cross-reference checks.
Removed
- Removed glx check-schemas CLI command - Moved to make check-schemas Makefile target; this is a repo-internal dev tool, not a user-facing command
Citation Entity
- Removed data_date, page, locator, and text_from_source direct fields - consolidated into properties
Source Entity
- Removed citation, coverage, and creator direct fields (creator consolidated into authors)
Event Entity
- Removed description field (use properties.description) and tags field
[0.0.0-beta.2] - 2025-11-25
Added
GEDCOM Import (lib)
- GEDCOM 5.5.1 support - Import standard GEDCOM 5.5.1 files
- GEDCOM 7.0 support - Import GEDCOM 7.0 with new features
- GEDCOM 5.5.5 support - Import GEDCOM 5.5.5 specification samples
- Two-pass conversion - Entities first, then families for proper relationship handling
- Evidence chain mapping - GEDCOM SOUR tags → GLX Citations → GLX Assertions
- Place hierarchy building - Parse place strings into hierarchical Place entities
- Geographic coordinates - Extract MAP/LATI/LONG coordinates from GEDCOM
- Shared notes - Support for both GEDCOM 7.0 SNOTE and GEDCOM 5.5.1 NOTE records
- External IDs - Import GEDCOM 7.0 EXID tags (wikitree, familysearch, etc.)
- Comprehensive test coverage - 33 GEDCOM test files (5.5.1, 5.5.5, 7.0) successfully imported
- Large file support - Tested with files containing thousands of persons and events
- Edge case handling - Empty families, self-marriages, same-sex marriages, unknown genders
- Character encoding support - ASCII, UTF-8, Windows CP1252 (CRLF and LF)
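To make the place hierarchy building item above concrete, the following sketch splits comma-separated place strings and links each level to its parent, reusing entries that share the same path. The Place struct, the place-N IDs, and buildPlaceHierarchy are all hypothetical and chosen for the example; the real importer's types and ID scheme may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Place is a simplified stand-in for a hierarchical place entity.
type Place struct {
	ID       string
	Name     string
	ParentID string
}

// buildPlaceHierarchy parses strings such as "City, County, State, Country"
// from the broadest level to the narrowest and creates one Place per
// distinct path, so shared ancestors are created only once.
func buildPlaceHierarchy(placeStrings []string) []Place {
	index := map[string]string{} // lower-cased path key -> place ID
	var places []Place
	for _, s := range placeStrings {
		parts := strings.Split(s, ",")
		parentID := ""
		key := ""
		for i := len(parts) - 1; i >= 0; i-- {
			name := strings.TrimSpace(parts[i])
			if name == "" {
				continue // skip empty levels such as "City, , State"
			}
			key = strings.ToLower(name) + "|" + key
			id, ok := index[key]
			if !ok {
				id = fmt.Sprintf("place-%d", len(places)+1)
				index[key] = id
				places = append(places, Place{ID: id, Name: name, ParentID: parentID})
			}
			parentID = id
		}
	}
	return places
}

func main() {
	places := buildPlaceHierarchy([]string{
		"Springfield, Sangamon, Illinois, USA",
		"Chicago, Cook, Illinois, USA",
	})
	for _, p := range places {
		fmt.Printf("%s %s (parent: %q)\n", p.ID, p.Name, p.ParentID)
	}
}
```

Keying on the full path rather than the bare name keeps two different places called "Springfield" apart when they sit under different parents.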
GLX Serializer (lib)
- Single-file serialization - Convert GLX archives to single YAML files
- Multi-file serialization - Entity-per-file structure with random IDs
- Archive loading - Load both single-file and multi-file GLX archives
- Vocabulary embedding - Embed standard vocabularies using go:embed
- Vocabulary loading from directory - Load vocabularies from multi-file archives
- ID generation - Random 8-character hex IDs for entity filenames
- EntityWithID wrapper - Preserve entity IDs in multi-file format using _id field
- Collision detection - Retry logic for filename generation
- Configurable validation - Optional validation before serialization
- 12 standard vocabularies embedded in binary
- Round-trip preservation - Single→Multi→Single conversions preserve all data
CLI Commands (glx)
- glx import - Import GEDCOM files to GLX format
  - Single-file and multi-file output formats
  - Optional vocabulary inclusion (default: true)
  - Optional validation (default: true)
  - Verbose mode with import statistics
  - Supports both GEDCOM 5.5.1 and 7.0
- glx split - Convert single-file GLX to multi-file format
  - Splits archive into entity-per-file structure
  - Includes standard vocabularies
  - Preserves entity IDs
- glx join - Convert multi-file GLX to single-file format
  - Combines multi-file archive into single YAML
  - Restores entity IDs from _id fields
Schema Enhancements
- Properties field added to 5 entity types for extensibility:
- Source - Store GEDCOM ABBR, EXID, custom tags
- Citation - Store event type cited, role, entry date
- Repository - Store FAX, additional contacts, EXID
- Media - Store crop coordinates, alternative titles, EXID
- Assertion - Store assertion metadata
- Backward compatible - Properties fields are optional with omitempty
Project Organization
- .claude/plans/ directory for all planning documents
- CLAUDE.md project context guide for AI assistants
- Plans README documenting all planning files and current status
- Moved all planning docs from docs/ to .claude/plans/
Vocabularies & Standards
- Developer documentation - GEDCOM import docs in glx/lib/doc.go
- User documentation - Updated Migration from GEDCOM Guide
- Automated import instructions
- Testing and validation procedures
- Import result expectations
Fixed
GEDCOM Import
- Malformed line recovery - Parser now handles MyHeritage export bug
- Recovers from NOTE fields with missing CONT/CONC prefixes
- Gracefully imports files with HTML-formatted notes
- Test case: queen.ged (4,683 persons, line 15903 missing CONT prefix)
- Family event handling - Added missing ANUL, DIVF, EVEN to case statement
- Place type references - Fixed gedcom_place.go to use "state" instead of "state_province"
Vocabularies
- Event types vocabulary - Fixed probate description ("Probate of estate" not "of will")
- Place types vocabulary - Removed duplicate state_province alias (use "state" instead)
- Schema categories - Updated allowed categories in vocabulary schemas
- Event types: Added "legal", "migration"; changed "custom" → "other"
- Place types: Added "institution"; changed "custom" → "other"
- Source types vocabulary - Added to embedded vocabularies (was missing)
Code Quality
- Clean architecture - Removed file I/O from library layer
- Moved importGEDCOMFromFile to test helpers (gedcom_test_helpers.go)
- CLI handles file operations, lib works with io.Reader
- Better separation of concerns
- File organization - Renamed gedcom_7_0.go → gedcom_shared.go (more accurate)
Testing & CI
- Multi-file vocabulary loading - Fixed LoadMultiFile to properly load vocabularies from directory
- Vocabulary preservation - Vocabularies now correctly preserved in round-trip conversions
- CI test coverage - Updated GitHub Actions to explicitly run all tests
- Large file tests (habsburg.ged: 34,020 persons)
- Added 15-minute timeout for comprehensive test runs
- No tests skipped in CI (no -short flag)
- Test documentation - Fixed queen.ged README with correct software attribution
- GEDCOM TITL handling - Now uses proper PersonPropertyTitle constant instead of hardcoded string
- GEDCOM name fields - Only populate name.fields from explicit GEDCOM substructure tags (GIVN, SURN, etc.), not inferred from parsing the name string
- Test data consistency - All testdata files updated to use unified name format
Removed
Attribute Event Types
- Removed attribute-type events from schema - Events are now strictly discrete occurrences with participants
  - Removed from event.schema.json enum: residence, occupation, title, nationality, religion, education
  - Removed census from event-types.glx vocabulary
  - These attributes are now represented as temporal properties on Person entities
- Removed CENS (Census) event handling - Census records are skipped during GEDCOM import (TODO: re-implement as citations supporting property assertions)
- Converted RESI (Residence) to temporal property - GEDCOM RESI tags now create temporal residence properties on Person entities instead of events
Quality Ratings Support
- Removed quality_ratings vocabulary - The GEDCOM 0-3 Quality Assessment scale was removed from the GLX specification
  - Deleted quality-ratings.glx vocabulary file
  - Deleted quality-ratings.schema.json schema file
  - Removed quality field from Citation entity
  - Removed QualityRating type from Go code
- Removed auto-generated assertion confidence - GEDCOM imports no longer auto-populate assertion confidence levels
  - Confidence levels should reflect researcher judgment, not be inferred from QUAY values
  - GEDCOM QUAY tags are now preserved in citation notes (e.g., GEDCOM QUAY: 2)
Assertion Entity Fields
- Removed evidence_type field - Evidence quality classification belongs on citations, not assertions
- Removed type field - Redundant with claim field and tags for categorization
- Removed research_notes field - Consolidated into single notes field
Provenance Fields (All Entities)
- Removed modified_at, modified_by, created_at, created_by fields - Redundant with git history; use git log and git blame instead
Changed
Person Properties Schema
- Unified name property - Replaced fragmented name properties with single unified property
  - Old: Separate given_name, family_name properties
  - New: Single name property with value and optional fields breakdown
  - Format: name: { value: "John Smith", fields: { given: "John", surname: "Smith" } }
  - Supports temporal lists for name changes over time
  - Fields include: prefix, given, nickname, surname_prefix, surname, suffix
- Added title property - Nobility or honorific titles (temporal, like occupation)
  - Properly handles GEDCOM TITL tag imports
  - Added PersonPropertyTitle constant
Vocabulary Updates
- person_properties vocabulary - Updated to reflect unified name structure
  - name property now includes fields sub-schema for structured breakdown
  - Added title property definition
Other
- Documentation structure - Separated user docs (docs/) from planning docs (.claude/plans/)
Technical Details
GEDCOM Import Coverage:
- 100% critical features implemented
- 94% high-priority features implemented
- PRODUCTION-READY status
- Comprehensive gap analysis completed
Serializer Features:
- Uses crypto/rand for ID generation
- 32 bits of randomness per ID (about 4.3 billion possible values)
- Collision probability for each newly generated ID: ~1 in 430,000 when 10,000 IDs already exist
- EntityWithID wrapper pattern for multi-file format
- All 12 standard vocabularies embedded with go:embed
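A minimal sketch of the ID scheme described above, assuming 4 random bytes from crypto/rand encoded as 8 hex characters and a retry when an ID is already in use. The names newID, newUniqueID, and oneInChance are hypothetical. The arithmetic shows where the collision figure comes from: with 10,000 IDs already taken, a fresh 32-bit draw hits one of them with probability 10,000 / 2^32, roughly 1 in 430,000.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newID returns 8 hex characters backed by 32 bits from crypto/rand.
func newID() (string, error) {
	b := make([]byte, 4) // 4 bytes = 32 bits = 8 hex characters
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// newUniqueID draws IDs until one is not present in used, giving up after
// maxAttempts draws. The accepted ID is recorded in used.
func newUniqueID(used map[string]bool, maxAttempts int) (string, error) {
	for i := 0; i < maxAttempts; i++ {
		id, err := newID()
		if err != nil {
			return "", err
		}
		if !used[id] {
			used[id] = true
			return id, nil
		}
	}
	return "", fmt.Errorf("no unused ID after %d attempts", maxAttempts)
}

// oneInChance returns N such that a single new 32-bit ID collides with one
// of `existing` IDs with probability 1 in N.
func oneInChance(existing int) float64 {
	return float64(1<<32) / float64(existing)
}

func main() {
	used := map[string]bool{}
	id, err := newUniqueID(used, 5)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("generated ID:", id)
	fmt.Printf("per-ID collision chance with 10,000 existing: 1 in %.0f\n", oneInChance(10000))
}
```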
Testing:
- All existing tests passing
- 48 new test cases for serializer
- 33 GEDCOM files tested for import (100% coverage of test files)
- Full round-trip serialization/deserialization tests
- Vocabulary preservation tests for both single-file and multi-file formats
- Comprehensive unit and integration tests
- Large file stress tests (3000+ persons, 4000+ events)
[0.0.0-beta.1] - 2025-11-18
Fixed
- Fixed GitHub release workflow to build on beta tags (v*.*.*-beta* pattern)
- Fixed VitePress build by adding shiki dependency to website/package.json
Changed
- Removed roadmap section from README (no longer maintaining public roadmap)
Removed
- Removed archive folder containing old planning documents
[0.0.0-beta.0] - 2025-11-14
Added
Specification & Standards
- Complete GENEALOGIX specification defining modern, evidence-first genealogy data standard
- 9 core entity types with full JSON Schema definitions:
- Person (individuals with biographical properties)
- Relationship (family connections with types and dates)
- Event (life events with sources and locations)
- Assertion (evidence-backed claims with quality assessment)
- Citation (evidence references with source quotations)
- Source (primary/secondary evidence documentation)
- Repository (physical storage information)
- Place (geographic locations with coordinate data)
- Participant (individuals involved in events)
- Repository-owned controlled vocabularies for extensibility
- Git-native architecture for version control and collaboration
- YAML-based human-readable format with schema validation
CLI Tool (glx)
- glx init: Initialize new GLX repositories with optional single-file mode
- glx validate: Comprehensive validation with:
  - Schema compliance checking against JSON Schemas
  - Cross-reference integrity verification across all files
  - Vocabulary constraint validation
  - Detailed error reporting with file/line locations
- glx check-schemas: Utility for verifying schema metadata and structure
- Support for both directory-based and single-file archives
- Cross-file entity resolution and validation
Documentation & Examples
- Comprehensive specification documentation (6 core documents)
- Complete examples demonstrating various use cases:
- Minimal single-file archive
- Basic family structure with multiple generations
- Complete family with all entity types
- Participant assertions workflow
- Temporal properties and date ranges
- Development guides covering:
- Architecture and design decisions
- Schema development practices
- Testing framework and test suite structure
- Local development environment setup
- User guides including:
- Quick-start guide for new users
- Best practices and recommendations
- Common pitfalls and troubleshooting
- Manual migration guide for converting from GEDCOM format
- Glossary of key terms and concepts
Testing & Quality Assurance
- Comprehensive test suite with:
- Valid example fixtures demonstrating correct usage
- Invalid example fixtures testing error handling
- Cross-reference validation tests
- Vocabulary constraint tests
- Schema compliance validation tests
- Automated CI/CD pipeline using GitHub Actions
- Full code coverage reporting
Project Infrastructure
- Apache 2.0 open-source license
- Community guidelines and code of conduct
- Contributing guidelines for developers
- GitHub issue and discussion templates
- Development container configuration for consistent environments
- Pre-configured VitePress documentation site