Mobsql

builds.sr.ht status

Mobsql is a Go library and commandline application which facilitates loading one or multiple GTFS feeds into a SQLite database. Either user-provided GTFS ZIP feed archives or Mobility Database feeds (as specified by Mobility Database feed IDs) may be loaded. Mobsql's internal SQLite schema mirrors GTFS's spec but adds a feed_id field to each table (thus allowing multiple feeds to be loaded to the database simulatenously).

While primarily developed to be used by Mobroute, the general purpose GTFS router and related project, Mobsql itself is a fully independent tool and can be used as a standalone general-purpose GTFS-to-SQLite ETL & import utility (either via its CLI or as a Go library).

Documentation

Project Overview / Featureset

  • Imports GTFS ZIP archives into a local SQLite database with an additional 'feed_id' field in all GTFS tables; thus allowing multiple feeds to be stored without conflict.
  • Supports loading local filesystem user-specified GTFS ZIP archives, remote HTTP user-specified GTFS ZIP archives, and Mobility Database sourced ZIP archives.
  • Supports bulk import (e.g. 1-insert-for-multiple rows) functionality to decrease load time.
  • Utilizes SQLite's CSV Virtual table extension for optimized and fast imports of raw CSV GTFS data to SQLite DB.
  • Supports a caching system, storing a checksum of the imported GTFS table(s) such that successive imports on the same data nops effectively if there would be no net change.
  • In addition to downloading & GTFS loading functionality, can be used to compute contrived data (similar to materialized views), purge, and query status information about feeds.
  • Allows both searching and import of GTFS feeds by MDBID (from the Mobility Database) to import rather then making user source their own custom GTFS ZIP archives.
  • Simply models the database schema including: GTFS specification import rules, Mobility Database catalog (CSV) imported as table, internal tracking tables, and SQLite views in a single file - see schema datastructure.
  • Implements conversion logic building atop the GTFS schema for fields which can be stored more efficiently (such as stop_times's departure_time/arrival_time as integers) rather then as colon time strings.
  • Allows the creation of SQL views (currently used for internal tracking).
  • Implements automatic index creation based on schema specification.
  • Implements automatic creation of computed tables based on view logic (e.g. similar to the concept of materialized views).
  • Implements cleanup logic to gracefully handle partial / interrupted loads (e.g. killing process during GTFS zip extraction & SQL import / operations).