Checking Relational Databases
As with Registering Data Assets, Gable relies on access to a local “proxy” database to check for contract violations. It is important that the contract check runs against a database instance that includes the proposed changes, so that breaking changes are caught before they are merged and deployed to production. A proxy database is an instance whose schema represents what the database would look like if the proposed changes under evaluation were merged, and which is accessible locally or from your CI/CD environment. Using a proxy also removes the need to grant access to your production database and eliminates any possibility of affecting its performance.
A proxy database can be a local Docker container, a Docker container that is spun up in your CI/CD workflow, or a database instance in a test/staging environment. If you already start a database Docker container in your CI/CD workflows for integration testing, Gable can be configured to use that same container at the end of the test run.
When using a proxy database, you specify the host, port, and schema of both the production database and the proxy. The production information is required to compute the unique data asset resource name for each discovered table, which is how Gable finds any contracts associated with the database’s tables.
Example:
In this example, a local Docker Postgres instance is created and database migrations are applied with proposed changes. Gable connects to the local Postgres instance and validates the changes.
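The idea of a proxy database can be illustrated with a small, self-contained sketch. It is not Gable's check logic or contract format: an in-memory SQLite database stands in for the local Postgres container, the migration statements are made up, and the EXPECTED_COLUMNS dictionary is a hypothetical stand-in for a data contract. The point is only that applying the proposed migration to a throwaway instance lets a breaking change be detected before it reaches production.

```python
import sqlite3
from typing import Dict, List

# Hypothetical baseline schema, as it exists in production today.
BASELINE = ["CREATE TABLE orders (order_id TEXT, total_cents INTEGER, status TEXT)"]

# Hypothetical proposed migration: rebuilds the table without the status column.
PROPOSED_MIGRATION = [
    "CREATE TABLE orders_new (order_id TEXT, total_cents INTEGER)",
    "DROP TABLE orders",
    "ALTER TABLE orders_new RENAME TO orders",
]

# Hypothetical stand-in for a contract: column name -> declared type.
EXPECTED_COLUMNS: Dict[str, str] = {
    "order_id": "TEXT",
    "total_cents": "INTEGER",
    "status": "TEXT",
}


def build_proxy() -> sqlite3.Connection:
    """Create a throwaway database and apply baseline plus proposed changes."""
    conn = sqlite3.connect(":memory:")
    for statement in BASELINE + PROPOSED_MIGRATION:
        conn.execute(statement)
    return conn


def find_violations(conn: sqlite3.Connection, table: str, expected: Dict[str, str]) -> List[str]:
    """Compare the proxy's actual columns against the expected ones."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    actual = {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}
    violations = []
    for column, column_type in expected.items():
        if column not in actual:
            violations.append(f"{table}.{column}: column missing")
        elif actual[column] != column_type:
            violations.append(f"{table}.{column}: expected {column_type}, found {actual[column]}")
    return violations
```

Because the proxy is disposable, the same pattern works in a CI/CD job: build the instance, apply the migrations from the branch under review, run the check, and discard the instance.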
Checking Protobuf/Avro/JSON Schema Files
Checking a service’s Protobuf, Avro, or JSON schema files for contract violations is straightforward, as the only requirement is having the service’s git repository checked out locally. The CLI supports checking multiple files, specified either as a space-delimited list (file1.proto file2.proto) or as a glob pattern. The check command must be run within the repository’s directory, because it uses the repo’s git information to construct the unique resource name for each data asset it discovers, which is how it finds any contracts associated with the file(s).
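For intuition, a glob pattern and an explicit space-delimited list can select the same set of files. The sketch below uses Python's fnmatch module to show that equivalence. The file names are hypothetical, and how the Gable CLI itself expands patterns is not specified here, so this only illustrates the general idea of glob matching.

```python
from fnmatch import fnmatch
from typing import List

# Hypothetical contents of a service's schema directory.
FILES = ["file1.proto", "file2.proto", "README.md", "schema.avsc"]


def select(pattern: str, names: List[str]) -> List[str]:
    """Return the names matching a shell-style glob pattern, in input order."""
    return [name for name in names if fnmatch(name, pattern)]


def parse_list(argument: str) -> List[str]:
    """Split a space-delimited list of file names."""
    return argument.split()
```

Here select("*.proto", FILES) and parse_list("file1.proto file2.proto") yield the same two file names.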
Example:
In this example, Gable inspects the protobuf files of serviceone for contract violations.
Static Code Analysis
Using static code analysis, Gable can check data-generating code across your codebase for compliance with existing data contracts. Following the examples below, you can use the Gable CLI to bundle your code and transmit it to Gable, where it is analyzed for native type detection and checked against data contracts. Bundling and transmitting your code is necessary for Gable's static analysis, but your code is not persisted on Gable servers. Future releases will add the ability to run the static code analysis entirely within your CI/CD pipeline.
Python
Example: Checking Python Emitter Data Assets
Check Python Options
--source-type: The source type, Python in this case
--project-root: The project’s entry point, used for proper bundling
--emitter-file-path: The location of the emitter function
--emitter-function: The emitter function
--emitter-payload-parameter: The payload parameter within the emitter function
--event-name-key: The property of the event used to distinguish event types
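To make the emitter options concrete, here is a minimal sketch of the kind of Python code they describe. Everything in it is hypothetical: the function name track_event, the parameter name payload, and the key event_name are illustrative stand-ins rather than names Gable requires, and the in-memory list stands in for a real event pipeline. With code shaped like this, --emitter-function would presumably name track_event, --emitter-payload-parameter would name payload, and --event-name-key would name event_name.

```python
from typing import Any, Dict, List

# Hypothetical in-memory sink standing in for a real event pipeline.
SENT_EVENTS: List[Dict[str, Any]] = []


def track_event(payload: Dict[str, Any]) -> None:
    """Hypothetical emitter function: every event leaves the service through here."""
    if "event_name" not in payload:
        raise ValueError("payload must include an 'event_name' property")
    SENT_EVENTS.append(payload)


def order_placed(order_id: str, total_cents: int) -> None:
    # "event_name" is the property that distinguishes one event type from another.
    track_event({"event_name": "order_placed", "order_id": order_id, "total_cents": total_cents})


def order_cancelled(order_id: str, reason: str) -> None:
    track_event({"event_name": "order_cancelled", "order_id": order_id, "reason": reason})


def event_types() -> List[str]:
    """Return the distinct event names emitted so far, sorted alphabetically."""
    return sorted({event["event_name"] for event in SENT_EVENTS})
```

Routing every event through a single function is what makes this pattern amenable to static analysis: each call site passes a payload whose shape can be inspected without running the code.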
PySpark
Example: Checking PySpark Projects
Check PySpark Options
--source-type - Set to pyspark for PySpark projects
--project-root - The directory containing the PySpark job to be analyzed
--spark-job-entrypoint - The command to execute the Spark job, including any arguments
--connection-string - Connection string to the Hive metastore
--csv-schema-file - Path to a CSV file containing the schema of upstream tables, formatted with the columns table_name, column_name, and column_type
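The --csv-schema-file option expects a CSV with the columns table_name, column_name, and column_type. A minimal sketch of producing such a file's contents with Python's csv module follows. The table names, column names, and type strings are made up for illustration. Whether a header row is required and which type names are accepted are not stated here, so treat both as assumptions.

```python
import csv
import io
from typing import Iterable, Tuple

# Column order taken from the option description above.
HEADER = ("table_name", "column_name", "column_type")

# Hypothetical upstream tables read by the Spark job.
EXAMPLE_ROWS = [
    ("orders", "order_id", "string"),
    ("orders", "total_cents", "bigint"),
    ("customers", "customer_id", "string"),
]


def render_schema_csv(rows: Iterable[Tuple[str, str, str]]) -> str:
    """Render (table, column, type) rows as CSV text, header row first (assumed)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()
```

Writing the returned text to a file gives a path that can be passed to --csv-schema-file.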
Typescript
Example: Checking Typescript Projects (Supported Library)
Example: Checking Typescript Projects (UDF: Event Name Parameter)
In this example, a parameter of the UDF (eventName) is used to set the event name when publishing.
Example Event Publishing UDF
Check command using --emitter-name-parameter
Example: Checking Typescript Projects (UDF: Event Name Key)
In this example, the event name is a property of the event payload.
Example Event Publishing UDF
Check command using --event-name-key
Check Typescript Options
Required
--source-type - Set to typescript to check events in Typescript
--project-root - The directory containing the Typescript project to be analyzed
--library - The natively supported library used to publish data, usually events
--emitter-file-path src/lib/events.ts - The path to the file containing the UDF
--emitter-function trackEvent - The name of the UDF
--emitter-payload-parameter eventProperties - The name of the function parameter representing the event payload
--emitter-name-parameter eventName - [Optional] The name of the function parameter representing the event name. Use either this option or --event-name-key __event_name. See the examples above.
--event-name-key __event_name - [Optional] The name of the event property representing the event name. Use either this option or --emitter-name-parameter eventName. See the examples above.