Never have a data profile go stale again! OwlDQ automatically profiles every dataset without manual intervention. The profiler identifies weekend trends, holidays, and major data segments, as well as column correlations and outliers. If your data changes, the OwlDQ profiler will automatically detect it and highlight the issue.
Turn AI into human-readable SQL rules. OwlDQ helps you pass stringent audit requirements, or simply export a list of rules distilled from the internal AI. Derive your rules by learning from the data itself. Never maintain a static rule list again.
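As a rough illustration of what "learning a rule from the data" can look like, here is a minimal sketch that derives a range check from observed values and emits it as a SQL query. This is not OwlDQ's implementation: the function name `derive_range_rule`, the table `trades`, and the column `price` are all hypothetical, and a real rule learner would consider far more than min and max.

```python
def derive_range_rule(table, column, values):
    """Build a SQL query that flags rows outside the observed value range.

    Hypothetical helper for illustration only. The table and column names
    are interpolated directly, so pass trusted identifiers only.
    """
    # Ignore nulls when learning the range.
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no non-null values to learn from")
    low, high = min(observed), max(observed)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {column} < {low} OR {column} > {high}"
    )


# Learn a rule from a small sample of historical values.
rule = derive_range_rule("trades", "price", [10.5, 12.0, None, 11.25])
print(rule)
```

The resulting string is plain SQL, so it can be reviewed by an auditor or exported alongside other rules.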
The Data Steward's favorite page! In a modern data lake, many different types of data quality issues are happening all the time. An agile Data Steward needs to look across all datasets and see quality issues classified into tangible buckets. The list view lets you triage all datasets by issue type, over time, and by impact ranking. These features turn a data quality program into an operational data quality program.
Define relationships, not rules! By linking data relationships, OwlDQ simulates the equivalent of thousands of nuanced rules and subgroups. Don't know your data relationships? Use Owl's AutoML to find them for you, then select the most meaningful curated relationships.
Duplicate or redundant data is a notorious quality issue. All data admins are looking for a way to find and remove redundant data so they can reclaim valuable disk space. Sometimes it’s not so simple, as data might be similar but not exactly matching. When Owl identifies fuzzy and exactly matching data, it quantifies each match as a likelihood score. So whether you are looking to clean your data for a single client view or simply looking to reclaim a TB of wasted space, Owl’s duplicate feature has you covered.
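To make the idea of a likelihood score concrete, here is a minimal sketch of fuzzy duplicate scoring using Python's standard-library `difflib`. It is an assumption-laden illustration, not Owl's algorithm: the helper names `similarity` and `find_duplicates`, the 0.85 threshold, and the sample records are all made up for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio after normalizing case and whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def find_duplicates(records, threshold=0.85):
    """Return (i, j, score) for every record pair scoring at or above threshold.

    A score of 1.0 is an exact match after normalization; lower scores
    are fuzzy matches. The threshold is an illustrative assumption.
    """
    matches = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        score = similarity(a, b)
        if score >= threshold:
            matches.append((i, j, round(score, 2)))
    return matches


# Hypothetical sample records.
records = [
    "John Smith, 12 Main St",
    "john smith, 12 Main St",
    "Jon Smith, 12 Main Street",
    "Alice Jones, 9 Elm Ave",
]

for i, j, score in find_duplicates(records):
    print(f"rows {i} and {j} look like duplicates (score {score})")
```

This pairwise comparison is quadratic in the number of records, so it only suits small samples; at data-lake scale some form of blocking or hashing would be needed first.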
Browse any database or file system in one place using tree navigation. Track the progress of your DQ program over time as you add more datasets under DQ management. Generate coverage reports at each drill-in level: per file system, database, or schema.
The OwlDQ catalog self-hydrates to keep things simple and smart. As data connections are added in Owl, the catalog naturally fills in using the best-practice naming standard of “schema.table”. As more DQ tests are run, the name increments but is always linked to the original source. A user can filter all tables by PII or MNPI to see a list of tables and columns that hold sensitive information. The catalog unifies Kafka topics, file systems, and databases in a single natural naming convention.
Get a pulse on the health of every dataset, broken down by business unit. This view gives data stewards the ability to see all DQ jobs and cherry-pick any missing runs or quality failures in a single heatmap. Know what went wrong before your downstream subscribers reach out.
|Active Directory||Users, Groups, Roles. Use your existing user logins with OwlWeb UI integration|
|SSL||Secure certificate transport supported|
|Kerberos||MIT Kerberos, kinit, ticket expiration for Spark, users, and FIDs all supported|
|Encryption||All data can be encrypted at rest with 1 admin setting|
|Masking||Any data column, most commonly PII fields, can be masked with 1 click|
|Pattern Mining||Most powerful DQ feature on the market, requires some user interaction|
|Behavioral Detection||Automatic. No user input required|
|Relationship Analysis||Automatic. No user input required|
|Outlier Detection||Automatic and tunable, for both numerical and categorical data|
|Data Shape Findings||Automatic. No user input required|
|Rules||User input driven|
|Complex Rules||User input driven|
|Auto Rules||Email, Zip, SSN, EIN, Credit Cards, Numbers, Dates, States, many more|
|Duplicates||Exact and Fuzzy Matching|
|Multi-tenancy||Ability to scale out user groups and LOBs without a new installation. Does not mingle user data.|
|Containerized||Runs in Docker containers and Kubernetes. Can be bootstrapped to EMR|
|Notebooks||Runs Spark/Scala notebooks and in Databricks|
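
The Auto Rules row above lists formats such as Email, Zip, and SSN. As a hedged sketch of how a column's format might be recognized automatically, the snippet below tests values against a few regular expressions and picks the first pattern that most values match. The patterns, the `detect_type` helper, and the 0.9 match ratio are assumptions for illustration and are deliberately simplified; they are not OwlDQ's actual rules.

```python
import re

# Simplified, illustrative patterns (US-style zip and SSN formats).
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "zip": re.compile(r"\d{5}(-\d{4})?"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def detect_type(values, min_ratio=0.9):
    """Return the first pattern name matched by at least min_ratio of values.

    Returns None when no pattern reaches the ratio or the column is empty.
    """
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return None
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in cleaned if pattern.fullmatch(v))
        if hits / len(cleaned) >= min_ratio:
            return name
    return None


# Hypothetical sample columns.
print(detect_type(["12345", "90210-1234", "10001"]))
print(detect_type(["a@example.com", "b@example.org"]))
print(detect_type(["hello", "world"]))
```

Once a column's type is detected, values that fail the same pattern can be flagged as rule violations.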