Sentiment Data Harvesting
The goal of this project was to build an ETL system that collects, normalizes, and aggregates the popularity index (search volume) of various keywords from Google Trends. The system had two subsystems: one for pulling historical data in bulk, the other for fetching current data.
After the raw data was fetched, it was normalized to align short-term and long-term values. The data was then aggregated at different time granularities and finally loaded into a relational database (Exasol DB).
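The project description does not say how the normalization was done. As a minimal sketch of one common approach, assuming each fetched window arrives on its own relative scale, the short-term series can be rescaled onto the long-term scale using the periods both series share. The function name, the dictionary layout, and the mean-ratio scaling factor are all illustrative assumptions, not the project's actual code.

```python
from statistics import mean


def align_short_to_long(short_term, long_term):
    """Rescale short_term values onto the scale of long_term.

    Both arguments map a period label (for example an ISO date string)
    to an index value. The scaling factor is the ratio of the two
    series' means over the periods they have in common.
    This is a hypothetical helper, not the project's implementation.
    """
    overlap = set(short_term) & set(long_term)
    if not overlap:
        raise ValueError("series share no periods; cannot align")

    short_mean = mean(short_term[p] for p in overlap)
    long_mean = mean(long_term[p] for p in overlap)
    if short_mean == 0:
        raise ValueError("short-term overlap is all zeros; cannot align")

    factor = long_mean / short_mean
    # Apply the factor to every short-term value, including the
    # periods that lie outside the overlap.
    return {period: value * factor for period, value in short_term.items()}
```

Using the mean over the whole overlap, rather than a single shared period, makes the factor less sensitive to one noisy value; a median or a regression fit would be reasonable alternatives.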
The system architecture combined native Apache NiFi processors with several custom processors written in Python.
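To illustrate the kind of logic a custom Python step might contain, here is a hedged sketch of the aggregation stage: it groups daily records per keyword into weekly or monthly buckets and averages them. The record fields (`keyword`, `date`, `value`), the function names, and the choice of the mean as the aggregate are assumptions made for this example; how such a step was wired into NiFi is not described in the source and is not shown here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def bucket_key(day, granularity):
    """Map an ISO date string to a week or month label."""
    d = date.fromisoformat(day)
    if granularity == "week":
        iso_year, iso_week = d.isocalendar()[0], d.isocalendar()[1]
        return f"{iso_year}-W{iso_week:02d}"
    if granularity == "month":
        return f"{d.year}-{d.month:02d}"
    raise ValueError(f"unsupported granularity: {granularity}")


def aggregate(records, granularity):
    """Average values per (keyword, period) bucket.

    records is an iterable of dicts with the hypothetical keys
    'keyword', 'date' (ISO format) and 'value'.
    """
    buckets = defaultdict(list)
    for rec in records:
        key = (rec["keyword"], bucket_key(rec["date"], granularity))
        buckets[key].append(rec["value"])

    # Sort so the output order is stable across runs.
    return [
        {"keyword": keyword, "period": period, "value": mean(values)}
        for (keyword, period), values in sorted(buckets.items())
    ]
```

Calling `aggregate(records, "week")` and `aggregate(records, "month")` on the same input yields one row set per granularity, each of which could then be loaded into its own table.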
The final data was used for sentiment analysis by an advisory company focused on crypto-asset investments, which used it to create investment recommendations for its clients.