Skip to main content

1.14.0 - 2024-05-09

Added

  • Common/dbt: add DREMIO to supported dbt profile types #2674 @surisimran
    *Adds support for dbt-dremio, resolving #2668.
  • Flink: support Protobuf format for sources and sinks #2482 @pawel-big-lebowski
    Adds schema extraction from Protobuf classes. Includes support for nested object types, array type, map type, oneOf and any.
  • Java: add facet conversion test #2663 @julienledem
    Adds a simple test that shows how to deserialize a facet in the server model.
  • Spark: job type facet to distinguish RDD jobs from Spark SQL jobs #2652 @pawel-big-lebowski
    Sets the jobType property of JobTypeJobFacet to either SQL_JOB or RDD_JOB.
  • Spark: add Glue symlink if reading from Glue catalog table #2646 @mobuchowski
    The dataset symlink now points to the Glue catalog table name if the Glue catalog table is used.
  • Spark: add spark_jobDetails facet #2662 @dolfinus
    Adds a SparkJobDetailsFacet, capturing information about Spark application jobs -- e.g. jobId, jobDescription, jobGroup, jobCallSite. This allows for tracking an OpenLineage RunEvent with a specific Spark job in SparkUI.

Removed

  • Airflow: drop old ParentRunFacet key #2660 @dolfinus
    Changes the integration to use the parent key for ParentFacet, dropping the outdated parentRun.
  • Spark: drop SparkVersionFacet #2659 @dolfinus
    Drops the SparkVersion facet, deprecated since 1.2.0 and planned for removal since 1.4.0.
  • Python: allow relative paths in URI formats for Python facets #2679 @JDarDagran
    Removes a URI validator that checked if scheme and netloc were present, allowing relative paths in URI formats for Python facets.

Changed

  • GreatExpectations: rename ParentRunFacet key #2661 @dolfinus
    The OpenLineage spec defined the ParentRunFacet with the property name parent but the Great Expectations integration created a lineage event with parentRun. This renames ParentRunFacet key from parentRun to parent. For backwards compatibility, keep the old name.

Fixed

  • dbt: support a less ambiguous logic to generate job names #2658 @blacklight
    Includes profile and models in the dbt job name to make it more unique.
  • Spark: update to use org.apache.commons.lang3 instead of org.apache.commons.lang #2676 @harels
    Updates Apache Commons Lang to the latest version. We were mixing two versions, and the old one was not present in many places.