Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[0.0.0-beta.2] - 2025-11-25

Added

GEDCOM Import (lib)

GEDCOM 5.5.1 support - Import standard GEDCOM 5.5.1 files
GEDCOM 7.0 support - Import GEDCOM 7.0 with new features
GEDCOM 5.5.5 support - Import GEDCOM 5.5.5 specification samples
Two-pass conversion - Entities first, then families for proper relationship handling
Evidence chain mapping - GEDCOM SOUR tags → GLX Citations → GLX Assertions
Place hierarchy building - Parse place strings into hierarchical Place entities
Geographic coordinates - Extract MAP/LATI/LONG coordinates from GEDCOM
Shared notes - Support for both GEDCOM 7.0 SNOTE and GEDCOM 5.5.1 NOTE records
External IDs - Import GEDCOM 7.0 EXID tags (wikitree, familysearch, etc.)
Comprehensive test coverage - 33 GEDCOM test files (5.5.1, 5.5.5, 7.0) successfully imported
Large file support - Tested with files containing thousands of persons and events
Edge case handling - Empty families, self-marriages, same-sex marriages, unknown genders
Character encoding support - ASCII, UTF-8, Windows CP1252 (CRLF and LF)

GLX Serializer (lib)

Single-file serialization - Convert GLX archives to single YAML files
Multi-file serialization - Entity-per-file structure with random IDs
Archive loading - Load both single-file and multi-file GLX archives
Vocabulary embedding - Embed standard vocabularies using go:embed
Vocabulary loading from directory - Load vocabularies from multi-file archives
ID generation - Random 8-character hex IDs for entity filenames
EntityWithID wrapper - Preserve entity IDs in multi-file format using _id field
Collision detection - Retry logic for filename generation
Configurable validation - Optional validation before serialization
12 standard vocabularies embedded in binary
Round-trip preservation - Single→Multi→Single conversions preserve all data

CLI Commands (glx)

glx import - Import GEDCOM files to GLX format
- Single-file and multi-file output formats
- Optional vocabulary inclusion (default: true)
- Optional validation (default: true)
- Verbose mode with import statistics
- Supports both GEDCOM 5.5.1 and 7.0
glx split - Convert single-file GLX to multi-file format
- Splits archive into entity-per-file structure
- Includes standard vocabularies
- Preserves entity IDs
glx join - Convert multi-file GLX to single-file format
- Combines multi-file archive into single YAML
- Restores entity IDs from _id fields

Schema Enhancements

Properties field added to 5 entity types for extensibility:
- Source - Store GEDCOM ABBR, EXID, custom tags
- Citation - Store event type cited, role, entry date
- Repository - Store FAX, additional contacts, EXID
- Media - Store crop coordinates, alternative titles, EXID
- Assertion - Store assertion metadata
Backward compatible - Properties fields are optional with omitempty

Project Organization

.claude/plans/ directory for all planning documents
CLAUDE.md project context guide for AI assistants
Plans README documenting all planning files and current status
Moved all planning docs from docs/ to .claude/plans/

Vocabularies & Standards

Developer documentation - Comprehensive GEDCOM Import Developer Guide
- Architecture and conversion flow
- Entity mapping details
- ID generation (current incremental + future random)
- GEDCOM 5.5.1 vs 7.0 differences
- Malformed line recovery strategies
- Testing and debugging guides
User documentation - Updated Migration from GEDCOM Guide
- Automated import instructions
- Testing and validation procedures
- Import result expectations

Fixed

GEDCOM Import

Malformed line recovery - Parser now handles MyHeritage export bug
- Recovers from NOTE fields with missing CONT/CONC prefixes
- Gracefully imports files with HTML-formatted notes
- Test case: queen.ged (4,683 persons, line 15903 missing CONT prefix)
Family event handling - Added missing ANUL, DIVF, EVEN to case statement
Place type references - Fixed gedcom_place.go to use "state" instead of "state_province"

Vocabularies

Event types vocabulary - Fixed probate description ("Probate of estate" not "of will")
Place types vocabulary - Removed duplicate state_province alias (use "state" instead)
Schema categories - Updated allowed categories in vocabulary schemas
- Event types: Added "legal", "migration"; changed "custom" → "other"
- Place types: Added "institution"; changed "custom" → "other"
Source types vocabulary - Added to embedded vocabularies (was missing)

Code Quality

Clean architecture - Removed file I/O from library layer
- Moved importGEDCOMFromFile to test helpers (gedcom_test_helpers.go)
- CLI handles file operations, lib works with io.Reader
- Better separation of concerns
File organization - Renamed gedcom_7_0.go → gedcom_shared.go (more accurate)

Testing & CI

Multi-file vocabulary loading - Fixed LoadMultiFile to properly load vocabularies from directory
Vocabulary preservation - Vocabularies now correctly preserved in round-trip conversions
CI test coverage - Updated GitHub Actions to explicitly run all tests
- Large file tests (habsburg.ged: 34,020 persons)
- Added 15-minute timeout for comprehensive test runs
- No tests skipped in CI (no -short flag)
Test documentation - Fixed queen.ged README with correct software attribution
GEDCOM TITL handling - Now uses proper PersonPropertyTitle constant instead of hardcoded string
GEDCOM name fields - Only populate name.fields from explicit GEDCOM substructure tags (GIVN, SURN, etc.), not inferred from parsing the name string
Test data consistency - All testdata files updated to use unified name format

Removed

Attribute Event Types

Removed attribute-type events from schema - Events are now strictly discrete occurrences with participants
- Removed from event.schema.json enum: residence, occupation, title, nationality, religion, education
- Removed census from event-types.glx vocabulary
- These attributes are now represented as temporal properties on Person entities
Removed CENS (Census) event handling - Census records are skipped during GEDCOM import (TODO: re-implement as citations supporting property assertions)
Converted RESI (Residence) to temporal property - GEDCOM RESI tags now create temporal residence properties on Person entities instead of events

Quality Ratings Support

Removed quality_ratings vocabulary - The GEDCOM 0-3 Quality Assessment scale was removed from the GLX specification
- Deleted quality-ratings.glx vocabulary file
- Deleted quality-ratings.schema.json schema file
- Removed quality field from Citation entity
- Removed QualityRating type from Go code
Removed auto-generated assertion confidence - GEDCOM imports no longer auto-populate assertion confidence levels
- Confidence levels should reflect researcher judgment, not be inferred from QUAY values
- GEDCOM QUAY tags are now preserved in citation notes (e.g., GEDCOM QUAY: 2)

Assertion Entity Fields

Removed evidence_type field - Evidence quality classification belongs on citations, not assertions
Removed type field - Redundant with claim field and tags for categorization
Removed research_notes field - Consolidated into single notes field

Provenance Fields (All Entities)

Removed modified_at, modified_by, created_at, created_by fields - Redundant with git history; use git log and git blame instead

Changed

Person Properties Schema

Unified name property - Replaced fragmented name properties with single unified property
- Old: Separate given_name, family_name properties
- New: Single name property with value and optional fields breakdown
- Format: name: { value: "John Smith", fields: { given: "John", surname: "Smith" } }
- Supports temporal lists for name changes over time
- Fields include: prefix, given, nickname, surname_prefix, surname, suffix
Added title property - Nobility or honorific titles (temporal, like occupation)
- Properly handles GEDCOM TITL tag imports
- Added PersonPropertyTitle constant

Vocabulary Updates

person_properties vocabulary - Updated to reflect unified name structure
- name property now includes fields sub-schema for structured breakdown
- Added title property definition

Other

Documentation structure - Separated user docs (docs/) from planning docs (.claude/plans/)

Technical Details

GEDCOM Import Coverage:

100% critical features implemented
94% high-priority features implemented
PRODUCTION-READY status
Comprehensive gap analysis completed

Serializer Features:

Uses crypto/rand for ID generation
32 bits of randomness per ID (4.3 billion possible values)
Collision probability: ~1 in 400,000 with 10,000 entities
EntityWithID wrapper pattern for multi-file format
All 12 standard vocabularies embedded with go:embed

Testing:

All existing tests passing
48 new test cases for serializer
33 GEDCOM files tested for import (100% coverage of test files)
Full round-trip serialization/deserialization tests
Vocabulary preservation tests for both single-file and multi-file formats
Comprehensive unit and integration tests
Large file stress tests (3000+ persons, 4000+ events)

[0.0.0-beta.1] - 2025-11-18

Fixed

Fixed GitHub release workflow to build on beta tags (v*.*.*-beta* pattern)
Fixed VitePress build by adding shiki dependency to website/package.json

Changed

Removed roadmap section from README (no longer maintaining public roadmap)

Removed

Removed archive folder containing old planning documents

[0.0.0-beta.0] - 2025-11-14

Added

Specification & Standards

Complete GENEALOGIX specification defining modern, evidence-first genealogy data standard
9 core entity types with full JSON Schema definitions:
- Person (individuals with biographical properties)
- Relationship (family connections with types and dates)
- Event (life events with sources and locations)
- Assertion (evidence-backed claims with quality assessment)
- Citation (evidence references with source quotations)
- Source (primary/secondary evidence documentation)
- Repository (physical storage information)
- Place (geographic locations with coordinate data)
- Participant (individuals involved in events)
Repository-owned controlled vocabularies for extensibility
Git-native architecture for version control and collaboration
YAML-based human-readable format with schema validation

CLI Tool (`glx`)

glx init: Initialize new GLX repositories with optional single-file mode
glx validate: Comprehensive validation with:
- Schema compliance checking against JSON Schemas
- Cross-reference integrity verification across all files
- Vocabulary constraint validation
- Detailed error reporting with file/line locations
glx check-schemas: Utility for verifying schema metadata and structure
Support for both directory-based and single-file archives
Cross-file entity resolution and validation

Documentation & Examples

Comprehensive specification documentation (6 core documents)
Complete examples demonstrating various use cases:
- Minimal single-file archive
- Basic family structure with multiple generations
- Complete family with all entity types
- Participant assertions workflow
- Temporal properties and date ranges
Development guides covering:
- Architecture and design decisions
- Schema development practices
- Testing framework and test suite structure
- Local development environment setup
User guides including:
- Quick-start guide for new users
- Best practices and recommendations
- Common pitfalls and troubleshooting
- Manual migration guide for converting from GEDCOM format
- Glossary of key terms and concepts

Testing & Quality Assurance

Comprehensive test suite with:
- Valid example fixtures demonstrating correct usage
- Invalid example fixtures testing error handling
- Cross-reference validation tests
- Vocabulary constraint tests
- Schema compliance validation tests
Automated CI/CD pipeline using GitHub Actions
Full code coverage reporting

Project Infrastructure

Apache 2.0 open-source license
Community guidelines and code of conduct
Contributing guidelines for developers
GitHub issue and discussion templates
Development container configuration for consistent environments
Pre-configured VitePress documentation site

Changelog ​

[0.0.0-beta.2] - 2025-11-25 ​

Added ​

GEDCOM Import (lib) ​

GLX Serializer (lib) ​

CLI Commands (glx) ​

Schema Enhancements ​

Project Organization ​

Vocabularies & Standards ​

Fixed ​

GEDCOM Import ​

Vocabularies ​

Code Quality ​

Testing & CI ​

Removed ​

Attribute Event Types ​

Quality Ratings Support ​

Assertion Entity Fields ​

Provenance Fields (All Entities) ​

Changed ​

Person Properties Schema ​

Vocabulary Updates ​

Other ​

Technical Details ​

[0.0.0-beta.1] - 2025-11-18 ​

Fixed ​

Changed ​

Removed ​

[0.0.0-beta.0] - 2025-11-14 ​

Added ​

Specification & Standards ​

CLI Tool (glx) ​

Documentation & Examples ​

Testing & Quality Assurance ​

Project Infrastructure ​

Changelog

[0.0.0-beta.2] - 2025-11-25

Added

GEDCOM Import (lib)

GLX Serializer (lib)

CLI Commands (glx)

Schema Enhancements

Project Organization

Vocabularies & Standards

Fixed

GEDCOM Import

Vocabularies

Code Quality

Testing & CI

Removed

Attribute Event Types

Quality Ratings Support

Assertion Entity Fields

Provenance Fields (All Entities)

Changed

Person Properties Schema

Vocabulary Updates

Other

Technical Details

[0.0.0-beta.1] - 2025-11-18

Fixed

Changed

Removed

[0.0.0-beta.0] - 2025-11-14

Added

Specification & Standards

CLI Tool (`glx`)

Documentation & Examples

Testing & Quality Assurance

Project Infrastructure