Categories
Computing Data Science

Global Warming and Data Science, Episode 2: Data Engineering and Loading into Snowflake

In the previous episode, I discussed the goals of my current work side-project: loading a fairly sizable weather data set from NOAA and analyzing it using data science techniques and machine learning.

This post will get into the nitty-gritty of how I massaged (‘wrangled’) the data from text/CSV files into a form that Snowflake finds palatable to digest and load into tables. Subsequent posts will go into the specifics of the analytics. This one is all about the dark art of data engineering. I am not a pro data engineer, and many who are have the luxury of their own tools, so read this as the work of a dangerous neophyte.


Global Warming: self-learning journey to build the story with data science

When I was working at MathWorks, I had the opportunity to create a demo in MATLAB that provides a simple walkthrough of how to perform a data science workflow on Domino Data Lab’s MLOps platform. The demo showed how to use climate data from NOAA to build a simple prediction tool based on machine learning (ML) regression.

With the goal of telling you whether you should consider buying an air conditioner, the model predicts how many hot days upcoming years hold in store for us. A hot day is defined as one with a temperature over 29° Celsius (or 84° Fahrenheit). You name a location, and the model predicts the number of hot days.
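To make the idea concrete, here is a minimal sketch in Python of the two steps described above: counting hot days per year, then extrapolating the yearly counts with a regression. It is not the original MATLAB demo. The function names (`count_hot_days`, `fit_line`, `predict_hot_days`) and the input shape are hypothetical, and a plain least-squares line stands in for whatever regression the demo actually used.

```python
from collections import Counter

# Threshold from the post: a hot day has a temperature over 29 °C.
HOT_DAY_THRESHOLD_C = 29.0


def count_hot_days(readings):
    """Count hot days per year.

    readings: iterable of (year, daily_max_temp_celsius) pairs
    (a hypothetical input shape, not the NOAA file format).
    """
    counts = Counter()
    for year, temp_c in readings:
        # A bool adds as 0 or 1, so years with no hot days still get an entry.
        counts[year] += temp_c > HOT_DAY_THRESHOLD_C
    return dict(counts)


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_hot_days(counts_by_year, target_year):
    """Extrapolate the yearly hot-day count to target_year (never below zero)."""
    years = sorted(counts_by_year)
    slope, intercept = fit_line(years, [counts_by_year[y] for y in years])
    return max(0.0, slope * target_year + intercept)
```

With daily maximum temperatures for one location in hand, `predict_hot_days(count_hot_days(readings), 2030)` would give a rough estimate for 2030. A straight line is the simplest possible choice here, and it needs at least two distinct years of data to fit.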

More than two years later, I am now a member of the Domino team. I am also embarking on a new project to demonstrate our deep integration with a major data store partner, and this time I am creating the demo in Python. This post and subsequent ones will act as my travelogue for the project.
