The Modern Data & Analytics Snowflake Stack with Datameer + Datacoves

  • Ndz Anthony
  • December 16, 2022

Datameer, in collaboration with Datacoves, hosted a webinar on the modern data and analytics Snowflake stack.

In this webinar, the presenters walk through a day in the life of a business analyst as new requirements arise.

We show how to leverage the modern cloud analytics Snowflake stack to reduce time to new insight from weeks to hours.

Presenter: Steve Eagen, Director of Global Solutions at Datameer

Co-presenter: Noel Gomez, Co-founder of Datacoves

Introducing Datacoves

Datacoves is a turnkey analytics workbench that helps companies implement the components of a modern data stack.

It bundles the different pieces of that stack (extract, load, transform, analyze, and orchestrate) from off-the-shelf open-source tools, following an opinionated approach.

Datacoves' current offerings include:

  • Airbyte – for extracting and loading
  • dbt (SQL) – for transformation
  • Apache Superset – for reporting and BI
  • Airflow – for orchestration
  • Kubernetes – as the platform
  • Git and dbt – for the DataOps strategy; functionality includes data auditing, lineage, cataloging, and governance.

Noel explains that Datacoves offers a flexible approach whereby customers can replace some of the default offerings on the stack with their preferred alternative.

Introducing Datameer

Steve highlights the following as core capabilities of Datameer:

  • Data Preparation and Analysis
  • Multi-persona Transformation Properties
  • Documentation and explore-to-insight tools
  • Write-back (productionalization) features

The Modern Data & Analytics Snowflake Stack

[Figure: The Modern Data & Analytics Snowflake Stack with Datameer and Datacoves]

At this juncture, the presenters explain their solutions’ roles in the illustration above.

The low-code alternative for business users – Datameer

Steve starts with the Datameer side of things.

He explains that Datameer sits directly on top of your Snowflake environment and fills the gap left by the lack of a no-code solution for non-technical business users.

Once a customer’s data is in Snowflake, Datameer can help with self-service analytics, prototyping and ad-hoc discovery, and business validation.

Full-Stack Solution For The Modern Data & Analytics Snowflake Stack – Datacoves 

Noel talks about the Datacoves portion of the stack.

Excerpt

“We use the Airbyte tool to take any number of sources and get the data into Snowflake.

Once data is in Snowflake, we use a tool called dbt, which allows us to transform the data using standard SQL.

dbt fits in very well with software development best practices.

dbt allows you to do more than capture the data: we can capture documentation, and you can add testing and lineage.

Once you have your transformations done and everything tested and deployed, we use a tool called Airflow to orchestrate the process.

Airflow allows us to load these sources at a specific time, and once those are done loading, we transform the data. You can also push data back into your CRM or feed your result sets into your ML model, etc.

Finally, we host Superset to create dashboards and reports.”
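The ordering Noel describes (load the sources, transform once loading finishes, then feed downstream consumers) amounts to a dependency graph. Here is a minimal sketch of that idea in plain Python. It uses the standard-library graphlib module rather than Airflow's own API, and the task names are hypothetical stand-ins, not names used in the webinar.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks it depends on.
PIPELINE = {
    "load_sources": set(),                      # e.g. the Airbyte sync
    "transform_models": {"load_sources"},       # e.g. the dbt run
    "refresh_dashboards": {"transform_models"}, # e.g. Superset reports
    "push_to_crm": {"transform_models"},        # e.g. write-back to a CRM
}


def run_order(pipeline):
    """Return one valid execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(pipeline).static_order())


order = run_order(PIPELINE)
print(order)
```

In a real deployment, Airflow would own scheduling and retries. The sketch only shows why the transform step cannot start before loading has finished.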

The Datameer + Datacoves Demo

After explaining the data and analytics Snowflake stack, the presenters proceed with a practical demo of how Datameer + Datacoves can help organizations go from data to insights in minutes.

Datacoves plays the role of IT, handling data loading and orchestration. Datameer, on the other hand, plays the part of the less technical end user, who takes an idea (a hard requirement) to insights within minutes. The generated SQL is then passed to IT to implement.

Ready? Let’s explore the use case.

💭 Use-case/ Scenario

An analyst presents her just-completed dashboard to management. However, management (as always) is not satisfied.

The board wants to see a distribution of high-risk loans across the country. 

She notices that the location data required for this additional analysis is not in her current Snowflake environment but in AWS.

She reaches out to IT.

But not for the reason you might think.

She asks that the required data be loaded into Snowflake so she can create a prototype of her data model.

IT, which runs a Datacoves infrastructure, receives her request and does just that.

Subsequently, she leverages Datameer to create a prototype of the data without burdening IT with unclear requirements.

With Datameer, she can leverage no-code and low-code features to build a demo model that can be passed along to IT for productionalization.

We’ll see how this was implemented in the following section.

✅ The Solution:  Leveraging Datameer + Datacoves for Modern Snowflake Analytics

DATACOVES – The Extract & Load Process

In this section, Noel, our co-presenter, plays the role of IT.

  • Airbyte – Loading

He navigates to the Datacoves loading section to perform the load.

He transfers the data from S3 to Snowflake using Airbyte.

In this example, the data lands in a Snowflake column of the VARIANT data type.

  • dbt – Transformation

Once the data is in Snowflake, Noel uses dbt to flatten the JSON column into tabular form and store the result in a Snowflake table.
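To make the flattening step concrete, here is a minimal Python sketch of the idea. In the webinar this is done with dbt against Snowflake, so the code below is only an illustration of what "flattening" means. The sample records and field names (loan_id, location, state, zip) are hypothetical.

```python
import json

# Hypothetical raw rows standing in for a table with a single JSON column.
raw_rows = [
    '{"loan_id": 1, "location": {"state": "CA", "zip": "94105"}}',
    '{"loan_id": 2, "location": {"state": "TX", "zip": "73301"}}',
]


def flatten(record, prefix=""):
    """Turn nested keys into flat column names, e.g. location.state -> location_state."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent key as a prefix.
            flat.update(flatten(value, prefix=f"{name}_"))
        else:
            flat[name] = value
    return flat


# One flat dict per row: each dict key becomes a column of the resulting table.
table = [flatten(json.loads(row)) for row in raw_rows]
for row in table:
    print(row)
```

Each nested key becomes its own column, which is the shape a business user needs before joining or profiling the data.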

DATAMEER – The ‘T’ in ELT

Steve, playing the role of the business user, logs on to Datameer at the 21:29 mark of the webinar to begin prototyping.

He shows the reach of Datameer’s self-service features by using its interactive SQL workspace to manipulate data visually.

He applies visual joins, performs ad-hoc analysis and data profiling, uses Snowflake CASE expressions, and more.

Finally, Steve, as the business user, arrives at a data model that fits his use case.
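The kind of SQL such a prototyping session might arrive at (a join to the location data plus a CASE expression that flags risky loans) can be sketched as follows. This is an illustration under stated assumptions: SQLite stands in for Snowflake so the example runs anywhere, and the table names, column names, and the 15% interest-rate threshold for "high risk" are all hypothetical, not taken from the demo.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE loans (loan_id INTEGER, zip TEXT, interest_rate REAL);
    CREATE TABLE locations (zip TEXT, state TEXT);
    INSERT INTO loans VALUES (1, '94105', 18.5), (2, '73301', 7.2), (3, '94105', 21.0);
    INSERT INTO locations VALUES ('94105', 'CA'), ('73301', 'TX');
    """
)

# Join loans to locations, then count high-risk loans per state.
# The ">= 15" rule is a made-up definition of "high risk".
QUERY = """
SELECT
    loc.state,
    SUM(CASE WHEN ln.interest_rate >= 15 THEN 1 ELSE 0 END) AS high_risk_loans,
    COUNT(*) AS total_loans
FROM loans AS ln
JOIN locations AS loc ON loc.zip = ln.zip
GROUP BY loc.state
ORDER BY loc.state
"""

results = conn.execute(QUERY).fetchall()
for state, high_risk, total in results:
    print(state, high_risk, total)
```

A query of this shape answers the board's question about the distribution of high-risk loans across the country, and it is the sort of artifact that can be handed to IT as a concrete requirement.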

Then, using Datameer’s forward engineering capabilities, he deploys the code to Snowflake tables in his dev environment.

In the following sections, we see IT using these crystal-clear requirements as a model for productionalization.

  • Visualizing Data In Datacoves – Apache Superset

Before passing the code to IT, the business user visualizes the metrics in his model to make sure they are in line with management’s needs.

For these BI visuals, he uses Apache Superset, a fast, lightweight, open-source tool within the Datacoves stack.

  • Productionalize with Datacoves – dbt

IT gets these requirements and is now ready for implementation.

In this case, however, IT notices that the SQL generated by the end user does not follow the company-wide coding convention, so it needs to be standardized. This is achieved using accelerators in dbt.

Lastly, using dbt’s cataloging and lineage features, IT implements dependency analysis and orchestration of the new model.
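Dependency analysis in dbt rests on models referring to each other through the ref() function, from which dbt derives the lineage graph. The sketch below is not dbt's implementation. It is a simplified illustration that extracts ref() calls with a regular expression, and the model names and SQL are hypothetical.

```python
import re

# Hypothetical dbt-style models: model name -> SQL text.
MODELS = {
    "stg_loans": "select * from {{ source('raw', 'loans') }}",
    "stg_locations": "select * from {{ source('raw', 'locations') }}",
    "high_risk_by_state": (
        "select * from {{ ref('stg_loans') }} "
        "join {{ ref('stg_locations') }} using (zip)"
    ),
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def upstream_models(sql):
    """Return the sorted, de-duplicated model names a SQL string refers to via ref()."""
    return sorted(set(REF_PATTERN.findall(sql)))


# Lineage: each model mapped to the models it depends on.
lineage = {name: upstream_models(sql) for name, sql in MODELS.items()}
for name, parents in lineage.items():
    print(name, "<-", parents)
```

Because the dependencies are declared in the SQL itself, the new model's position in the run order follows from the code rather than from a hand-maintained schedule.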

And that’s it…the complete stack solution with Datameer and Datacoves!

 

Quick Q&A Lookup


Q1

“You mentioned sampling and Datameer. How does that work with the sampling versus the full data, and when is each used?”

Q2

“So with the Datacoves stack, what if I have Tableau, you know, already in place, or Fivetran? How would that fit with the stack?”

Q3

“How are join suggestions calculated?”
