.gl session model

A .gl script is a sequence of gapline commands that share a single in-memory feed. Instead of re-reading the ZIP for every step, the runner loads it once, applies each command to the working copy, and only touches disk when a save directive is reached. This is what makes multi-step cleanup pipelines fast and atomic.

A script executes in three phases:

  1. Load. The first feed <path> directive loads the archive into memory. Subsequent commands skip the -f flag — they operate on the loaded feed.
  2. Mutate. validate, read, create, update, delete run against the in-memory feed, in order. Each command sees the state left by the previous one.
  3. Persist. A save [path] directive writes the modified files back to disk atomically. Without save, the script exits without touching the original feed.

If save is passed without a path, gapline writes back to the path passed to feed. Passing a different path is the scripted equivalent of --output on a single command: the source feed stays intact.
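
The three phases and the save-path rule can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions, not gapline's implementation: the Session class, its method names, and the name-to-text dictionary used as the working copy are all hypothetical, and the write here is a plain rewrite with no atomicity.

```python
import zipfile
from pathlib import Path


class Session:
    """Hypothetical model of a .gl session: load once, mutate in memory, save on demand."""

    def __init__(self):
        self.source = None  # path given to `feed`
        self.files = {}     # file name -> text content (the in-memory working copy)

    def feed(self, path):
        # Phase 1 (Load): read the archive exactly once.
        self.source = Path(path)
        with zipfile.ZipFile(self.source) as zf:
            self.files = {name: zf.read(name).decode("utf-8") for name in zf.namelist()}

    def apply(self, mutate):
        # Phase 2 (Mutate): each command sees the state left by the previous one.
        mutate(self.files)

    def save(self, path=None):
        # Phase 3 (Persist): no path means "write back to the path passed to feed".
        target = Path(path) if path is not None else self.source
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
            for name, text in self.files.items():
                zf.writestr(name, text)
        return target
```

In this model, calling save with a different path leaves the archive at the feed path unchanged, which mirrors the --output behaviour described above.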

Execution is strictly sequential and stop-on-error. As soon as any command fails — a parse error, a failed validation guard, a foreign-key refusal — the runner halts and the process exits non-zero.

When the script stops mid-way:

  • Every command before the failure has already been applied in memory.
  • Nothing is persisted unless a save directive had already succeeded earlier in the script.
  • The source feed on disk is untouched, unless an earlier save had already written back to it.

This is a deliberate trade-off: .gl scripts are safe to re-run after fixing an upstream issue, without the risk of a partially rewritten ZIP.
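
The stop-on-error rule can be illustrated with a small runner. This is a sketch only: run_script, CommandError, and the idea of representing each command as a Python callable are assumptions made for illustration, and the exit code 1 stands in for whatever non-zero status gapline actually returns.

```python
import sys


class CommandError(Exception):
    """Hypothetical error type standing in for any command failure."""


def run_script(commands, feed):
    """Apply commands in order and halt at the first failure.

    `commands` is a list of callables that take the in-memory feed.
    Returns a process-style exit code: 0 on success, 1 on failure.
    """
    for index, command in enumerate(commands, start=1):
        try:
            command(feed)
        except CommandError as exc:
            # Earlier commands have already changed `feed` in memory,
            # but later steps (including any later save) never run.
            print(f"step {index} failed: {exc}", file=sys.stderr)
            return 1
    return 0
```

Because the loop returns as soon as one step raises, a save placed at the end of the list is never reached, so a failed run leaves nothing new on disk.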

Running validate → update → validate → save without a session would re-parse the ZIP three times (once per command). That is wasteful: parsing a multi-million-row stop_times.txt dominates the cost of each step.

With a session:

  • Only the first feed directive pays the parse cost.
  • Mutations stay in RAM, with the same reverse-index structures used for referential integrity.
  • The final save serialises the modified files back in a single pass.

Typical numbers on a mid-sized transit feed: five CRUD operations plus two validations complete in under a second; the same pipeline as seven independent gapline … invocations takes several seconds.
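
The difference in parse count can be made concrete with a toy model. CountingLoader, run_standalone, and run_session are hypothetical names, and the parse method is a stand-in for the real cost of reading the archive. The sketch counts parses only and measures no timings.

```python
class CountingLoader:
    """Hypothetical loader that records how often the archive is parsed."""

    def __init__(self):
        self.parse_count = 0

    def parse(self, path):
        self.parse_count += 1
        # Stand-in for the expensive step: reading and indexing every file in the ZIP.
        return {"path": path, "rows": []}


def run_standalone(loader, path, commands):
    # One invocation per command: every step re-parses the feed from disk.
    for command in commands:
        command(loader.parse(path))


def run_session(loader, path, commands):
    # One shared feed: parse once, then reuse the in-memory copy.
    feed = loader.parse(path)
    for command in commands:
        command(feed)


steps = [lambda feed: None] * 3  # validate, update, validate

standalone = CountingLoader()
run_standalone(standalone, "gtfs.zip", steps)

session = CountingLoader()
run_session(session, "gtfs.zip", steps)

print(standalone.parse_count, session.parse_count)  # 3 1
```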

Supported:

  • Every top-level sub-command (validate, read, create, update, delete) without the -f flag.
  • Comments starting with #, either on their own line or at the end of a directive line.

Not supported, by design:

  • Variables, substitution, interpolation.
  • Conditionals, loops, branching.
  • Nested run (a .gl script cannot call another .gl script).

.gl is a declarative batch format, not a scripting language. For anything beyond a linear pipeline, fall back to a shell script that orchestrates multiple gapline calls.
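
A line-level reader for this grammar might look like the following. It is a sketch under assumptions: the directive set is taken from the lists above, parse_line is a hypothetical helper, arguments are split on whitespace, and a # is always treated as the start of a comment. How gapline itself tokenises arguments or handles quoting is not stated here.

```python
DIRECTIVES = {"feed", "save", "validate", "read", "create", "update", "delete"}


def parse_line(line):
    """Return (directive, args), or None for blank and comment-only lines."""
    # Naive comment stripping: assumes '#' never appears inside an argument.
    code = line.split("#", 1)[0].strip()
    if not code:
        return None
    directive, *args = code.split()
    if directive not in DIRECTIVES:
        # Covers anything outside the linear grammar, such as a nested `run`.
        raise ValueError(f"unsupported directive: {directive}")
    return directive, args
```

With no variables or control flow to evaluate, each line maps to exactly one directive, which is why a flat line-by-line reader like this is enough.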

minimal.gl
feed ./data/gtfs.zip
validate
save

This loads the feed, validates it, and rewrites it to ./data/gtfs.zip. The rewrite is atomic, so a concurrent reader never sees a half-written archive.
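
One common way to get this kind of atomic rewrite is to write a temporary file next to the target and rename it into place. The sketch below shows that general technique. It is not a description of gapline's internals, and atomic_write_zip with its name-to-text argument is a hypothetical helper.

```python
import os
import tempfile
import zipfile


def atomic_write_zip(target, files):
    """Write `files` (name -> text) to `target` so readers see the old or new archive, never a partial one."""
    directory = os.path.dirname(os.path.abspath(target))
    # The temp file lives in the target's directory so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            with zipfile.ZipFile(handle, "w", zipfile.ZIP_DEFLATED) as zf:
                for name, text in files.items():
                    zf.writestr(name, text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, target)   # swap the finished file into place in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies before os.replace runs, the original archive is still complete and only a stray temporary file is left behind.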