Saturday, 9 April 2022

Free Data Warehouse - Data Source

So I have decided to try building my own free data platform. If nothing else, this could be a little project for anyone looking to get into data engineering and warehousing to have a go at.

Honestly, I have not yet decided on the subject area I am going to use, but I am looking into the tooling and processes. To give myself a basic starting point for testing the tooling, I am going to use a really simple Google Form and spreadsheet as my data source. I am an avid runner but do not do enough strength training, so I set up a form to track what I am doing and help motivate me.

The form can be seen in the screenshot below, though there are more fields for other types of exercise.



This form feeds straight into a spreadsheet, which then acts as the data source for the ETL. This data source could just as easily be a transactional database, a file in an S3 bucket or whatever source you want. For ease of use I have set all the column headers to be in capitals; this proved beneficial downstream when working with Snowflake, which didn't play nicely with mixed-case column names.
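If you would rather not rename the headers by hand, the same tidy-up can be done in code once the sheet has been exported as CSV. The snippet below is only a rough sketch of the idea, not what I actually ran: the function names and the sample columns (Timestamp, Exercise type, Reps) are made up for illustration, and swapping spaces for underscores is my own assumption on top of the capitalisation. The reason it helps is that Snowflake stores unquoted identifiers in upper case, so upper-case headers can be queried without wrapping them in double quotes.

```python
import csv
import io
import re


def normalise_header(name: str) -> str:
    """Upper-case a header and replace runs of non-alphanumerics with underscores."""
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").upper()


def normalise_csv_headers(csv_text: str) -> str:
    """Return the CSV text with only its first (header) row normalised."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text  # nothing to do for an empty export
    rows[0] = [normalise_header(h) for h in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical sample standing in for a CSV export of the form responses.
sample = "Timestamp,Exercise type,Reps\n2022-04-09 08:00:00,Squats,20\n"
print(normalise_csv_headers(sample))
```

Only the header row is touched; the data rows are written back unchanged.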
