Profiler

Never have a data profile go stale again! OwlDQ automatically profiles every dataset without manual intervention. The profiler identifies weekend trends, holidays, and major data segments, as well as column correlations and outliers. If your data changes, the OwlDQ profiler automatically detects the change and highlights the issue.
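
To make the idea of detecting a changed profile concrete, here is a minimal sketch that profiles a column on two runs and reports which metrics moved. The function names `profile_column` and `profile_drift`, the chosen metrics, and the 10% tolerance are all assumptions made for illustration; they are not OwlDQ's actual profiler or API.

```python
def profile_column(values):
    """Summarize a column: row count, null rate, and distinct non-null values."""
    total = len(values)
    non_null = [v for v in values if v is not None]
    return {
        "rows": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "distinct": len(set(non_null)),
    }


def profile_drift(baseline, current, tolerance=0.1):
    """List the metrics whose relative change from the baseline exceeds the tolerance."""
    drifted = []
    for metric, old in baseline.items():
        new = current[metric]
        # Relative change; fall back to the absolute value when the baseline is zero.
        change = abs(new - old) / old if old else abs(new)
        if change > tolerance:
            drifted.append(metric)
    return drifted


# Hypothetical sample data: the same column on two consecutive runs.
yesterday = profile_column(["NY", "CA", "NY", "TX", None, "CA", "NY", "TX", "CA", "NY"])
today = profile_column(["NY", None, None, "TX", None, "CA", None, "TX", "CA", "NY"])
drifted = profile_drift(yesterday, today)
print(drifted)
```

In this sample only the null rate moves (from 10% to 40%), so it is the only metric reported.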

Adaptive Rules

Turn AI into human-readable SQL rules. OwlDQ helps you pass stringent audit requirements or simply export a list of rules distilled from the internal AI. Derive your rules by learning from the data itself. Never maintain a static rule list again.
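
As a rough illustration of deriving a rule from the data rather than writing it by hand, the sketch below learns a numeric range from sample values and renders it as a SQL check. The function name `derive_range_rule`, the table and column names, and the simple min/max range are assumptions chosen for illustration; this is not how OwlDQ's internal AI derives its rules.

```python
def derive_range_rule(table, column, values):
    """Render a SQL query that selects rows falling outside the observed range."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("need at least one non-null value to learn a range")
    low, high = min(observed), max(observed)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} IS NOT NULL AND ({column} < {low} OR {column} > {high})"
    )


# Hypothetical table, column, and sample values.
rule = derive_range_rule("trades", "price", [10.5, 12.0, None, 11.25])
print(rule)
```

The generated statement can be reviewed, edited, or exported like any hand-written rule, which is the point of keeping the output as plain SQL.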

List View

The Data Steward's favorite page! In a modern data lake, many different types of data quality issues occur all the time. An agile Data Steward needs to look across all datasets and see quality issues classified into tangible buckets. The list view lets you triage all datasets by issue type over time and by impact ranking. These features turn a data quality program into an operational data quality program.

Pattern Analyzer

Define relationships, not rules! By linking data relationships, OwlDQ simulates the equivalent of thousands of nuanced rules and subgroups. Don't know your data relationships? Use Owl's AutoML to find them for you, then select the most meaningful curated relationships.
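
One way to picture a relationship standing in for many rules: instead of writing one rule per zip code, link `zip` to `state` and flag rows that break the dominant pairing. The sketch below does this with plain Python; the function name `find_relationship_breaks`, the column names, and the 80% dominance threshold are illustrative assumptions, not OwlDQ's pattern-mining algorithm.

```python
from collections import Counter, defaultdict


def find_relationship_breaks(rows, key, dependent, min_share=0.8):
    """Return rows whose dependent value differs from the dominant value for their key."""
    by_key = defaultdict(Counter)
    for row in rows:
        by_key[row[key]][row[dependent]] += 1

    breaks = []
    for row in rows:
        counts = by_key[row[key]]
        dominant, hits = counts.most_common(1)[0]
        share = hits / sum(counts.values())
        # Only flag when the key has a clear dominant partner value.
        if share >= min_share and row[dependent] != dominant:
            breaks.append(row)
    return breaks


# Hypothetical sample data: one row pairs zip 10001 with the wrong state.
rows = (
    [{"zip": "10001", "state": "NY"}] * 9
    + [{"zip": "10001", "state": "NJ"}]
    + [{"zip": "94105", "state": "CA"}] * 5
)
breaks = find_relationship_breaks(rows, "zip", "state")
print(breaks)
```

A single linked pair of columns covers every key value at once, so new zip codes are checked without anyone adding a rule for them.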

Duplicates

Duplicate or redundant data is a notorious quality issue. Data admins are always looking for a way to find and remove redundant data so they can reclaim valuable disk space. Sometimes it’s not so simple, because data might be similar but not exactly matching. When Owl identifies fuzzy and exactly matching data, it quantifies each match as a likelihood score. So whether you are looking to clean your data for a single client view or simply looking to reclaim a TB of wasted space, Owl’s duplicate feature has you covered.
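
The idea of scoring exact and fuzzy matches on one scale can be sketched with Python's standard `difflib`. This is an illustration only: the function name `duplicate_candidates`, the normalization step, and the 0.85 threshold are assumptions, and `SequenceMatcher` stands in for whatever matching technique Owl actually uses.

```python
from difflib import SequenceMatcher
from itertools import combinations


def duplicate_candidates(records, threshold=0.85):
    """Return (score, a, b) for record pairs whose similarity meets the threshold."""
    def normalize(text):
        # Lowercase and collapse whitespace so trivial differences score as exact.
        return " ".join(text.lower().split())

    matches = []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            matches.append((round(score, 2), a, b))
    return sorted(matches, reverse=True)


# Hypothetical sample records.
records = [
    "Acme Corp, New York",
    "ACME Corp,  New York",
    "Acme Corp., New York",
    "Globex Inc, Boston",
]
matches = duplicate_candidates(records)
for score, a, b in matches:
    print(score, "|", a, "|", b)
```

Pairs that differ only in case or spacing score 1.0, near matches score just below that, and unrelated records fall under the threshold. Comparing every pair is quadratic in the number of records, so a sketch like this only suits small samples.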

Explorer

Browse any database or file system in one place using tree navigation. Track the progress of your DQ program over time as you add more datasets under DQ management. Generate coverage reports at each drill-in level: per file system, database, or schema.

Catalog

The OwlDQ catalog self-hydrates to keep things simple and smart. As data connections are added in Owl, the catalog naturally fills in using the best-practice naming standard of “schema.table”. As more DQ tests are run, the name increments but always stays linked to the original source. A user can filter all tables by PII or MNPI to see a list of tables and columns that hold sensitive information. The catalog unifies Kafka topics, file systems, and databases under a single natural naming convention.

Pulse View

Get a pulse on the health of every dataset, broken down by business unit. This view lets data stewards see all DQ jobs and cherry-pick any missing runs or quality failures in a single heatmap. Know what went wrong before your downstream subscribers reach out.

Feature List

For a list of OwlDQ's upcoming features, see our roadmap.

| Capabilities | OwlDQ Description |
| --- | --- |
| Security | |
| Active Directory | Users, Groups, Roles. Use your existing user logins with OwlWeb UI integration |
| LDAP | Mechanism supported |
| OAuth | Supported |
| SAML | Supported |
| SSL | Secure certificate transport supported |
| Kerberos | MIT Kerb, kinit, tickets for Spark expiration, users and fids all supported |
| ACLs | Supported |
| Encryption | All data can be encrypted at rest with 1 admin setting |
| Masking | Any data column, but commonly used for PII fields, can be masked with 1 click |
| DQ Features | |
| Pattern Mining | Most powerful DQ feature on the market, requires some user interaction |
| Behavioral Detection | Automatic. No user input required |
| Relationship Analysis | Automatic. No user input required |
| Outlier Detection | Automatic and tunable, both numerical and categorical |
| Data Shape Findings | Automatic. No user input required |
| Rules | User input driven |
| Complex Rules | User input driven |
| Auto Rules | Email, Zip, SSN, EIN, Credit Cards, Numbers, Dates, States, many more |
| Duplicates | Exact and Fuzzy Matching |
| Missing Records | Automatic |
| Schema Changes | Automatic |
| Profiling | Automatic |
| Enterprise | |
| Multi-tenancy | Ability to scale out user groups, LOBs without installing. Does not mingle user data. |
| Containerized | Runs in Docker containers and Kubernetes. Can be bootstrapped to EMR |
| Notebooks | Runs Spark/Scala notebooks and in Databricks |
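
The Auto Rules entry in the feature list names formats such as Email, Zip, and SSN. A minimal sketch of how a format rule might be suggested for a column is shown below, assuming simplified regular expressions and a hypothetical `suggest_auto_rule` helper; the patterns, the 90% match threshold, and the function itself are illustrative and do not reflect OwlDQ's actual detection logic.

```python
import re

# Simplified, illustrative patterns; production-grade validation is stricter.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "zip": re.compile(r"\d{5}(-\d{4})?"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def suggest_auto_rule(values, min_match=0.9):
    """Return the first pattern name matching at least min_match of non-empty values, else None."""
    sample = [v.strip() for v in values if v and v.strip()]
    if not sample:
        return None
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.fullmatch(v))
        if hits / len(sample) >= min_match:
            return name
    return None


# Hypothetical sample columns.
print(suggest_auto_rule(["a@b.com", "c@d.org"]))
print(suggest_auto_rule(["10001", "94105-1234"]))
print(suggest_auto_rule(["hello", "world"]))
```

Requiring most, rather than all, sampled values to match lets a rule be suggested even when the column already contains a few bad values, which are exactly the rows the rule would later flag.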