ETL system for cryptoasset transaction data

The goal of this project was to build an ETL system that collects and aggregates data about various cryptoassets from multiple data sources. Block and transaction data for each cryptoasset was fetched via multiple REST APIs and then normalized, and some elements were aggregated and computed. Finally, the normalized data was loaded into a common data model in a relational database (Exasol DB).
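The normalization and aggregation steps described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the two provider payload shapes, the field names, the `Transaction` record, and the per-block volume aggregate are hypothetical and do not reflect the project's actual data model or the APIs it used. Loading into the database is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Transaction:
    """Hypothetical common record that every provider payload is mapped to."""
    asset: str
    tx_hash: str
    block_height: int
    timestamp: datetime  # always timezone-aware UTC
    amount: Decimal      # always in whole units of the asset


def normalize_provider_a(raw: dict) -> Transaction:
    # Assumed shape: epoch seconds and amounts in satoshis (1e-8 BTC).
    return Transaction(
        asset="BTC",
        tx_hash=raw["hash"],
        block_height=int(raw["height"]),
        timestamp=datetime.fromtimestamp(raw["time"], tz=timezone.utc),
        amount=Decimal(raw["value_sat"]) / Decimal(100_000_000),
    )


def normalize_provider_b(raw: dict) -> Transaction:
    # Assumed shape: ISO 8601 timestamps and decimal amounts sent as strings.
    return Transaction(
        asset=raw["asset"].upper(),
        tx_hash=raw["txid"],
        block_height=int(raw["block"]["number"]),
        timestamp=datetime.fromisoformat(raw["timestamp"].replace("Z", "+00:00")),
        amount=Decimal(raw["value"]),
    )


def volume_per_block(txs: Iterable[Transaction]) -> Dict[Tuple[str, int], Decimal]:
    """Example aggregate: total transferred amount per (asset, block)."""
    totals: Dict[Tuple[str, int], Decimal] = defaultdict(Decimal)
    for tx in txs:
        totals[(tx.asset, tx.block_height)] += tx.amount
    return dict(totals)
```

Using `Decimal` instead of `float` avoids rounding drift when many amounts are summed, which matters for financial aggregates.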

The system was designed to run for a long time with minimal supervision. It detects errors through checksum and count verification. When it detects an error in the data, it fetches and computes the affected parts again. It handles network issues gracefully, attempting to reconnect after a delay using an exponential backoff strategy. The whole system was also carefully designed to avoid overloading the REST API servers (data providers) and to respect their fair use policies.
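As a rough sketch of the retry and verification behaviour described above, the following shows exponential backoff around a fetch call and a count-plus-checksum check on a fetched batch. The function names, the default delays, the use of SHA-256 over canonical JSON, and the idea that an expected count and checksum are available for comparison are all assumptions for illustration; the original system may have worked differently.

```python
import hashlib
import json
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def backoff_delays(base: float, factor: float, retries: int, cap: float) -> List[float]:
    """Delays of base, base*factor, base*factor^2, ... each capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(retries)]


def call_with_backoff(
    fetch: Callable[[], T],
    retries: int = 5,
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying on ConnectionError with growing delays."""
    delays = backoff_delays(base, factor, retries, cap)
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionError:
            if attempt == retries:
                raise  # retries exhausted, let the caller decide
            sleep(delays[attempt])
    raise RuntimeError("unreachable")


def batch_checksum(records: Sequence[dict]) -> str:
    """SHA-256 over a canonical JSON encoding (an assumed checksum scheme)."""
    payload = json.dumps(records, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def batch_is_valid(records: Sequence[dict], expected_count: int, expected_checksum: str) -> bool:
    """A batch passes only if both the count and the checksum match."""
    return len(records) == expected_count and batch_checksum(records) == expected_checksum
```

A batch that fails `batch_is_valid` would be fetched and computed again. The delays between attempts also spread retries out over time, which helps keep the load on the data providers low.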

The final data was used by financial analysts at an advisory company focused on cryptoasset investments to create investment recommendations for their clients.

Marcin Wylot, PhD
Data Scientist & Machine Learning Engineer

20 years of experience in data processing from A to Z