Querying the open data source for meteorological data in my vicinity is done using a python script. The data is saved in an SQLite database. Interfacing with the database as well as migrations were done with SQLAlchemy.
The data is available at https://opendata.swiss/de/dataset/messwerte-lufttemperatur-2-m-10-min-mittel/resource/09066670-000d-4de5-aafc-99d040b25089 https://opendata.swiss/de/dataset/messwerte-relative-luftfeuchtigkeit-10-min-mittel/
New data points are available every 10 minutes. Data points are only available for 10 minutes hence the script for persisting them.
I store data for all stations. The measurements are
- Temperature
- Relative humidity
- Wind speed
- Precipitation
The scripts is run every 30 min using a cronjob. The response CSVs are cleaned, parsed and entered into the database.
Cleaning
The files are truncated as they contain a footer that does not adhere to the specification of CSV.
Parsing
The values are read into a pandas dataframe, the relevant columns are extracted into a SQLAlchemy object and saved.
Problems!
A Column was renamed (example). This caused problems in analyzing and loss of data. Namely, a typo was removed. This caused the extraction of relevant data to fail. To prevent further headaches, I added a monitoring system to alert me when more than 4 hours of data were missing for the nearest weather station.