Architecture Guide
This guide explains the technical architecture of the GENEALOGIX CLI implementation.
System Overview
The GENEALOGIX CLI is a Go-based tool that validates and manages GLX archives. It is built from four parts:
- Specification: Formal data model (see specification/)
- JSON Schemas: Embedded validation rules
- Go Structs: Type-safe parsing and validation
- CLI Tool: Command-line interface
CLI Tool Architecture
Command Structure
glx init [directory] # Initialize new archive
glx validate [paths...] # Validate .glx files
glx check-schemas # Validate schema files
See CLI README for command details.
Validation Pipeline
The validator performs multi-stage validation:
JSON Schema Validation
- Parse YAML files
- Validate against embedded JSON schemas
- Check required fields and types
Struct-Based Parsing
- Parse files into Go structs (lib.GLXFile)
- Merge all files into a single GLXFile
- Detect duplicate IDs (fatal error)
Reference Validation
- Use reflection with refType struct tags
- Validate entity references (e.g., person, place)
- Validate vocabulary references (e.g., event_types)
- Report all reference errors at once
See validator.go for implementation.
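The merge and duplicate-ID check from the struct-based parsing stage can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in validator.go: the archive type, its two maps, and the mergeInto and mergeMap helpers are hypothetical stand-ins for lib.GLXFile and its Merge method.

```go
package main

import (
	"fmt"
	"sort"
)

// archive is a hypothetical, trimmed-down stand-in for lib.GLXFile.
// Each map is keyed by entity ID.
type archive struct {
	Persons map[string]string
	Places  map[string]string
}

// mergeInto copies every entity from src into dst and returns the IDs
// that were already present in dst. Existing entries are not overwritten.
func mergeInto(dst, src *archive) []string {
	var dups []string
	dups = append(dups, mergeMap(&dst.Persons, src.Persons, "persons")...)
	dups = append(dups, mergeMap(&dst.Places, src.Places, "places")...)
	sort.Strings(dups) // map iteration order is random; sort for stable output
	return dups
}

// mergeMap merges one entity map and reports duplicates as "kind/id".
func mergeMap(dst *map[string]string, src map[string]string, kind string) []string {
	if *dst == nil {
		*dst = make(map[string]string)
	}
	var dups []string
	for id, v := range src {
		if _, exists := (*dst)[id]; exists {
			dups = append(dups, kind+"/"+id)
			continue
		}
		(*dst)[id] = v
	}
	return dups
}

func main() {
	master := &archive{}
	a := &archive{Persons: map[string]string{"person-1": "Ada"}}
	b := &archive{Persons: map[string]string{"person-1": "Ada (again)", "person-2": "Bob"}}

	fmt.Println(mergeInto(master, a)) // first file: nothing to collide with
	fmt.Println(mergeInto(master, b)) // second file: person-1 is reported
}
```

A caller would treat a non-empty result as the fatal error described above and stop before reference validation.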
Type System
Go Structs
Located in lib/types.go:
- Structs for all entity and vocabulary types
- YAML tags for serialization
- refType tags for reference validation
- Merge method for combining files
Naming Conventions:
- Go fields: Use ID suffix for readability (e.g., SourceID, PersonID)
- YAML tags: Singular entity names (e.g., yaml:"source", yaml:"person")
- refType tags: Match GLXFile map keys (e.g., refType:"sources", refType:"persons")
Example:
type Citation struct {
	SourceID     string   `yaml:"source" refType:"sources"`
	RepositoryID string   `yaml:"repository,omitempty" refType:"repositories"`
	Media        []string `yaml:"media,omitempty" refType:"media"`
}
Reference Validation
Uses reflection to validate references:
- refType struct tags define reference targets
- Comma-delimited for multiple valid types (e.g., refType:"persons,events,relationships,places")
- Validates both entity and vocabulary references uniformly
- Reports all errors at once
Implemented in the validateEntityReferences() function.
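To make the tag-driven approach concrete, here is a minimal sketch of reflection over refType tags. It is not the body of validateEntityReferences(): the checkRefs and existsIn helpers and the shape of the known lookup map are assumptions made for illustration. It handles only string and []string fields on a single struct, treats empty values as "not set", and collects every error instead of stopping at the first.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Citation mirrors the example struct from the Type System section.
type Citation struct {
	SourceID     string   `yaml:"source" refType:"sources"`
	RepositoryID string   `yaml:"repository,omitempty" refType:"repositories"`
	Media        []string `yaml:"media,omitempty" refType:"media"`
}

// checkRefs (hypothetical) inspects every field of v that carries a refType
// tag and reports each referenced ID missing from all of its target types.
// known maps a type name (e.g. "sources") to the set of IDs defined for it.
func checkRefs(v interface{}, known map[string]map[string]bool) []string {
	rv := reflect.ValueOf(v)
	if rv.Kind() == reflect.Ptr {
		rv = rv.Elem()
	}
	if rv.Kind() != reflect.Struct {
		return nil
	}
	var errs []string
	rt := rv.Type()
	for i := 0; i < rt.NumField(); i++ {
		tag, ok := rt.Field(i).Tag.Lookup("refType")
		if !ok {
			continue
		}
		targets := strings.Split(tag, ",") // comma-delimited target types
		var ids []string
		switch f := rv.Field(i); f.Kind() {
		case reflect.String:
			ids = []string{f.String()}
		case reflect.Slice:
			if f.Type().Elem().Kind() == reflect.String {
				for j := 0; j < f.Len(); j++ {
					ids = append(ids, f.Index(j).String())
				}
			}
		}
		for _, id := range ids {
			if id == "" {
				continue // empty means the optional reference is not set
			}
			if !existsIn(id, targets, known) {
				errs = append(errs, fmt.Sprintf("%s.%s: %q not found in %s",
					rt.Name(), rt.Field(i).Name, id, tag))
			}
		}
	}
	return errs
}

// existsIn reports whether id is defined under any of the target types.
func existsIn(id string, targets []string, known map[string]map[string]bool) bool {
	for _, t := range targets {
		if known[strings.TrimSpace(t)][id] {
			return true
		}
	}
	return false
}

func main() {
	known := map[string]map[string]bool{
		"sources": {"source-1": true},
		"media":   {"media-1": true},
	}
	c := Citation{SourceID: "source-9", Media: []string{"media-1", "media-7"}}
	for _, e := range checkRefs(c, known) {
		fmt.Println(e)
	}
}
```

Because the target types come from the tag rather than from per-type code, entity references and vocabulary references go through the same lookup.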
Schema Embedding
Schemas are embedded in the Go binary using go:embed:
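// specification/schema/v1/embed.go
import _ "embed" // blank import is required for //go:embed on []byte variables
//go:embed person.schema.json
var PersonSchema []byte
var EntitySchemas = map[string][]byte{
	"person":       PersonSchema,
	"relationship": RelationshipSchema, // declared the same way as PersonSchema
	// ...
}
Benefits: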
- No file I/O at runtime
- Single binary distribution
- Guaranteed schema availability
After modifying schemas, rebuild the CLI:
cd glx
go build
Vocabulary Loading
Vocabularies can be defined in any .glx file in the archive.
The LoadArchiveVocabularies() function:
- Walks all .glx files
- Parses each into lib.GLXFile
- Merges vocabulary definitions
- Returns unified vocabulary set
This allows flexible vocabulary organization without hardcoded paths.
Validation Flow
File-by-File Validation
- Parse YAML
- Validate against JSON schema
- Parse into Go struct
- Merge into master GLXFile
- Check for duplicate IDs
Cross-File Validation
After all files are merged:
- Build lookup maps for all entities and vocabularies
- Walk all structs using reflection
- Check each refType tagged field
- Validate references exist
- Report all errors
See ValidateReferencesWithStructs() in validator.go.
Performance
Optimization Strategies
- Parallel file parsing
- Embedded schemas (no file I/O)
- Single-pass reference validation
- Efficient struct-based validation
Tested Scale
- 1000+ files
- Multi-megabyte archives
- Complex reference graphs
Testing
Tests are in the glx/ directory:
glx/
├── main_test.go       # CLI tests
├── validate_test.go   # Validation tests
└── testdata/
    ├── valid/         # Valid test files
    └── invalid/       # Invalid test files
Tests run in CI on every commit.
See Testing Guide for details.
See Also
- Setup Guide - Development environment
- Schema Development - Schema maintenance
- Testing Guide - Testing framework
- Specification - Formal specification