UCD.js Docs


Data processing pipelines for UCD files

Pipelines

The @ucdjs/pipelines-* packages provide a graph-based system for defining and running data processing pipelines over Unicode Character Database (UCD) files.

Pipelines transform raw UCD files (such as UnicodeData.txt or Blocks.txt) into structured data, artifacts, or TypeScript code.
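
To make "raw file to structured data" concrete, here is a minimal standalone sketch of that kind of transformation for Blocks.txt, whose data lines have the form "0000..007F; Basic Latin". It does not use the @ucdjs/pipelines-* API: the Block type and the parseBlocks function are hypothetical names chosen for illustration only.

```typescript
// Hypothetical shape for one parsed entry; not a type exported by @ucdjs.
interface Block {
  start: number; // first code point of the block
  end: number;   // last code point of the block (inclusive)
  name: string;
}

// Parse the text of Blocks.txt. Data lines look like
// "0000..007F; Basic Latin"; anything after "#" is a comment.
function parseBlocks(text: string): Block[] {
  const blocks: Block[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    // Drop comments and surrounding whitespace.
    const line = rawLine.replace(/#.*$/, "").trim();
    if (line === "") continue;

    const match = /^([0-9A-F]{4,6})\.\.([0-9A-F]{4,6});\s*(.+)$/i.exec(line);
    if (match === null) continue; // skip lines that do not fit the expected format

    const [, startHex = "", endHex = "", name = ""] = match;
    blocks.push({
      start: parseInt(startHex, 16),
      end: parseInt(endHex, 16),
      name: name.trim(),
    });
  }
  return blocks;
}

// Small inline sample instead of a real file, so the sketch is self-contained.
const sample = [
  "# sample comment line",
  "0000..007F; Basic Latin",
  "0080..00FF; Latin-1 Supplement",
].join("\n");

const blocks = parseBlocks(sample);
console.log(blocks);
```

In a real pipeline, a step like this would presumably be one route among several, with its output typed and validated rather than returned as a bare array.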

Architecture

The pipeline system is split into several modular packages, each handling a specific part of the data processing workflow:

  • @ucdjs/pipelines-core: The heart of the pipeline system. It contains the core definitions, DAG (Directed Acyclic Graph) execution logic, event system, and fundamental types used across the pipeline ecosystem.
  • @ucdjs/pipelines-artifacts: Schema definitions and artifact management for pipeline outputs. It defines how data produced by a route is typed and validated.
  • @ucdjs/pipelines-graph: Utilities for building and analyzing pipeline execution graphs, ensuring dependencies are resolved in the correct order.
  • @ucdjs/pipelines-loader: Utilities for loading pipeline definitions from configuration files or remote sources dynamically.
  • @ucdjs/pipelines-presets: A collection of pre-built pipeline routes and transformations for common UCD files, ready to use out of the box.
  • @ucdjs/pipelines-ui: A visual, web-based UI for exploring pipeline execution, viewing DAGs, inspecting artifacts, and debugging pipeline runs.

Because these concerns live in separate packages, you can use only the components your UCD processing actually needs.
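
The idea of resolving dependencies in the correct order on a DAG can be illustrated with a topological sort. The sketch below uses Kahn's algorithm; it is an assumption about the general technique, not the implementation in @ucdjs/pipelines-graph or @ucdjs/pipelines-core. RouteNode, topologicalOrder, and the route ids are hypothetical names.

```typescript
// Hypothetical route description: an id plus the ids it depends on.
interface RouteNode {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: returns ids ordered so that every route
// appears after all of the routes it depends on.
function topologicalOrder(nodes: RouteNode[]): string[] {
  const remaining = new Map<string, number>();    // unmet dependency count per route
  const dependents = new Map<string, string[]>(); // routes waiting on each route

  for (const node of nodes) {
    remaining.set(node.id, node.dependsOn.length);
    dependents.set(node.id, []);
  }
  for (const node of nodes) {
    for (const dep of node.dependsOn) {
      const list = dependents.get(dep);
      if (list === undefined) {
        throw new Error(`Unknown dependency "${dep}" of "${node.id}"`);
      }
      list.push(node.id);
    }
  }

  // Start with routes that have no dependencies at all.
  const ready = nodes.filter((n) => n.dependsOn.length === 0).map((n) => n.id);
  const order: string[] = [];

  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const count = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, count);
      if (count === 0) ready.push(next); // all dependencies are now satisfied
    }
  }

  // Any route never reached sits on a cycle, so the graph is not a DAG.
  if (order.length !== nodes.length) {
    throw new Error("Cycle detected: the graph is not a DAG");
  }
  return order;
}

const order = topologicalOrder([
  { id: "emit-typescript", dependsOn: ["parse-blocks", "parse-unicode-data"] },
  { id: "parse-blocks", dependsOn: [] },
  { id: "parse-unicode-data", dependsOn: [] },
]);
console.log(order);
```

Rejecting cycles up front is what makes the "acyclic" part of the DAG enforceable: a route that transitively depends on itself could never be scheduled.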
