NashTech Insights

ETL


A Quick Demo: ArangoDB to Spark to BigQuery

Hi folks! In this blog, we will learn how to integrate Spark with ArangoDB and BigQuery to build a simple ETL pipeline. ArangoDB: ArangoDB is a multi-model database system. It supports three data models (graphs, JSON documents, key/value) with one database core and a unified query language, AQL (ArangoDB Query Language). Apache Spark: Apache Spark is an open-source, distributed processing engine used for big data …


ETL TESTING – It is important for a project to collect data from multiple sources

When we build a system that needs more data, for example for decision-making, market analytics, or risk management, we need a large amount of data from different sources: multiple systems, databases, files, … In testing, we must verify how the system extracts data from different sources, transforms it to meet the target schema or …


An idea of Zero ETL with Open-source stacks

Key takeaways: Zero ETL is being introduced by AWS, Google, Databricks, and others as a future trend in data engineering. However, current solutions are available only to projects with large budgets and rely on proprietary software. This article shares an idea for implementing Zero ETL using the open-source ClickHouse. I. Introduction …
