Food Prices across Europe: An exploratory analysis tool

Our project lets users explore price developments and price predictions for a variety of food categories across all European countries from 1996–2021.

TechLabs Ruhr
Oct 8, 2023

This project was carried out as part of the TechLabs “Digital Shaper Program” in Dortmund (summer term 2023). “FoodPrice Forecast” was named the best project of the summer term 2023.

In a nutshell:

Food prices concern all of us. Yet for most people, their overall development is rather opaque. How did certain food categories change in price? What are possible determinants of food prices? Can food prices be predicted successfully? By leveraging the open-access Eurostat database, we offer a possible answer to these questions. Using Python libraries, we developed analysis tools that allow users to freely choose the data they want to explore.

Introduction:

We were assigned to this project after submitting our preferences for several potential project ideas. Most members of our group said they chose this project because it centred exclusively on Data Science and Deep Learning. Since the person who initially proposed the idea did not select this project, we enjoyed a considerable degree of freedom in shaping our work.

Overall, we wanted to help counter the misconceptions and the opaque, confusing information surrounding rising food prices in Europe at a time of rising inflation. This left us with two challenges: how to present relevant data on food-price developments in an understandable way, and how to generate precise forecasts for food prices in Europe and Germany within the given time frame.

The Data Science team gathered and refined the data needed to generate precise predictions, while the Deep Learning team researched potential models for forecasting food prices for Germany and Europe. Afterwards, the Deep Learning team fed the data supplied by the Data Science team into their models to generate the predictions.

Methodology:

Each of us had a certain vision for the project. Our first challenge was to decide on the minimum viable product that we wanted to deliver. For this, we had to discuss and define the limits of the project, the methodology, the technical depth and the functionalities of the final tool.

Since our knowledge of food-price analysis was rather limited, our next step was to dive into the literature revolving around food prices and prediction methods. We had to find out what might drive European food prices and how we could predict them. We also had to develop a feeling for an intuitive and effective presentation of the relevant data. For us it was crucial to find a balance between statistical depth and an easy-to-understand style of presentation.

After the research phase, we tackled the data part and quickly came to our first realisation: there is no one-size-fits-all food price. We had to weigh the advantages and disadvantages of different data sources against criteria such as reliability, availability, completeness and granularity. After deciding on the Eurostat database for food-price data, we gathered further data for our feature variables, the determinants used for food-price predictions. At this step, we mastered the challenge of loading, merging, cleaning and transforming data from different sources into one consistent and practical format that our models could work with later.
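The loading, merging and cleaning step described above can be sketched as follows. This is a minimal illustration with pandas, not the project's actual pipeline: the column names (`month`, `food_index`, `electricity_price`), the values and the helper `to_monthly` are all hypothetical, and interpolation and forward-filling are just one plausible way to handle gaps.

```python
import pandas as pd

# Hypothetical monthly food price index with one missing observation.
prices = pd.DataFrame({
    "month": ["2020-01", "2020-02", "2020-03", "2020-04"],
    "food_index": [104.1, 104.6, None, 106.0],
})

# Hypothetical feature that is available only for some months.
electricity = pd.DataFrame({
    "month": ["2020-01", "2020-03"],
    "electricity_price": [0.15, 0.16],
})


def to_monthly(df: pd.DataFrame) -> pd.DataFrame:
    """Parse the month column and use it as a sorted DatetimeIndex."""
    out = df.copy()
    out["month"] = pd.to_datetime(out["month"], format="%Y-%m")
    return out.set_index("month").sort_index()


# Left join keeps every month of the target series.
merged = to_monthly(prices).join(to_monthly(electricity), how="left")

# Fill the gap in the target by linear interpolation and carry the
# sparser feature forward until its next available value.
merged["food_index"] = merged["food_index"].interpolate()
merged["electricity_price"] = merged["electricity_price"].ffill()

print(merged)
```

Joining on a shared date index keeps the sources aligned month by month, which is what a model expects when each row represents one point in time.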

As our minimum viable product (MVP), we decided to create a graphical representation and a prediction of the harmonised consumer price index for food for Germany from 1996–2021. Once this was achieved, we wanted to expand the scope of the product, which we defined with the help of our mentors. The expansion was to take place along three aspects of the existing MVP:

  1. The amount of observable data should increase. Our goal was to include not only Germany but all European countries.
  2. The granularity of our data should increase. The final product should allow users to explore different categories of food instead of focusing on an aggregated index.
  3. The number of explanatory variables and analytic tools should increase. By expanding the set of possible determinants and methods used to explain food prices, we planned to improve our prediction performance.

The most important tool for solving the problem was time-series analysis. It was not explicitly part of the “Deep-Learning-Track”, but the track helped us understand the general logic of deep learning and of machine learning more broadly. This enabled us to use different machine learning models to analyse the data and predict future values. Data pre- and postprocessing is a crucial part of working with machine learning models, and the general approach to handling dataframes that we learned in the “Data-Science-Track” helped us process the data adequately.

Results:

The final product consists of one central Jupyter notebook and three supplementary files containing further analysis tools. The functionality of each file is explained either in the repository's README or in the files themselves. In summary, the main file runs a data pipeline that scrapes data from different sources and transforms it dynamically based on user input. The data sources are cleaned and merged to prepare them for further analysis. This analysis starts with a linear correlation matrix of all included variables, which makes it possible to check for linear relationships between them. Next, the main file plots the food-price data before running a univariate XGB Regressor model, which forecasts the food price from previous food-price observations.
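To make the two analysis steps more concrete, here is a minimal sketch on synthetic data. It is an assumption-laden illustration rather than the notebook's code: the series are randomly generated, the helper `make_lags` and the choice of three lags are hypothetical, and an ordinary least-squares fit stands in for the XGB Regressor so that the example needs only NumPy and pandas. Any regressor with a fit and predict interface could be trained on the same lagged features.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data, for illustration only (not Eurostat values).
rng = np.random.default_rng(0)
n = 120
oil = 60 + np.cumsum(rng.normal(0, 1, n))
food = 100 + 0.3 * (oil - 60) + 0.1 * np.arange(n) + rng.normal(0, 0.2, n)
df = pd.DataFrame({"food_index": food, "oil_price": oil})

# Linear (Pearson) correlation matrix of all included variables.
corr = df.corr()
print(corr.round(2))


def make_lags(series: np.ndarray, n_lags: int):
    """Turn a series into a supervised data set.

    Each row of X holds n_lags consecutive past values (oldest first),
    and y holds the value that immediately follows them.
    """
    X = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


# Univariate setup: the target is predicted from its own past values only.
X, y = make_lags(df["food_index"].to_numpy(), n_lags=3)

# Chronological split: the last 12 months are held out, never shuffled.
X_train, X_test = X[:-12], X[-12:]
y_train, y_test = y[:-12], y[-12:]

# Stand-in model: ordinary least squares with an intercept term.
A_train = np.column_stack([np.ones(len(X_train)), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# One-step-ahead predictions using the observed lags of the test period.
A_test = np.column_stack([np.ones(len(X_test)), X_test])
pred = A_test @ coef
mae = float(np.mean(np.abs(pred - y_test)))
print(f"Hold-out MAE: {mae:.3f}")
```

The split is chronological because shuffling a time series would let the model see future observations during training.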

The three supplementary files complement this work with further correlation analysis and different predictive models. One file implements a Granger causality test and impulse response functions as examples of further approaches to analysing the relationship between independent and dependent variables. The other two files contain two further prediction models (Naiveforecaster and an LSTM neural network) that aim to diversify and increase predictive power.
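The idea behind a naive forecast can be illustrated in a few lines. This is a hand-written sketch and does not reproduce the project's Naiveforecaster file: the function names, the season length of 12 months and the index values are assumptions chosen for illustration. Such baselines are useful as a yardstick against which more complex models can be compared.

```python
from statistics import mean


def naive_last_value(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def naive_seasonal(history, horizon, season_length=12):
    """Repeat the values observed one season earlier."""
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up monthly index values, for illustration only.
history = [100, 101, 103, 102, 104, 105, 107, 106, 108, 109, 111, 110]
actual_next = [112, 113, 115]

last_value_forecast = naive_last_value(history, horizon=3)
seasonal_forecast = naive_seasonal(history, horizon=3)

mae_last_value = mean_absolute_error(actual_next, last_value_forecast)
mae_seasonal = mean_absolute_error(actual_next, seasonal_forecast)

print("last value:", last_value_forecast, "MAE:", round(mae_last_value, 2))
print("seasonal:  ", seasonal_forecast, "MAE:", round(mae_seasonal, 2))
```

In this made-up example the series trends upwards, so repeating the most recent value gives a smaller error than repeating last year's values.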

One central result is that even fairly advanced tools such as machine learning can hardly predict food prices when they are hit by ground-breaking events, such as a war, that strongly affect prices and are very hard to foresee. Beyond that, our main result is a tool that can be used to analyse and predict prices for several food categories in several countries.

The most natural next step for our project would be to integrate the three supplementary files into the central main file. This would provide an all-in-one solution connecting all of our work and remove the need to switch between files for analysis. While price data is filtered and processed dynamically by country and food category, not all price-determining variables are subject to this filtering yet. Extending it to them could be a future step that increases the accuracy of the displayed data. Another future improvement could be to expand the prediction possibilities. At present, predictions are only possible within the 1996–2021 period. Predicting future prices would allow a real-time assessment of the models' prediction quality. Further, implementing a dynamic time-frame adjustment would make our product even more customisable.
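One possible shape for such a filter, including an optional time window, is sketched below. It is purely illustrative: the long-format table, its column names (`country`, `category`, `month`, `index_value`), the values and the function `select_series` are hypothetical and not taken from the repository.

```python
import pandas as pd

# Hypothetical long-format table; column names and values are illustrative.
data = pd.DataFrame({
    "country": ["DE", "DE", "DE", "FR", "FR", "FR"],
    "category": ["Bread", "Bread", "Milk", "Bread", "Bread", "Milk"],
    "month": pd.to_datetime([
        "2019-12-01", "2020-01-01", "2020-01-01",
        "2019-12-01", "2020-01-01", "2020-01-01",
    ]),
    "index_value": [101.0, 101.4, 99.8, 100.5, 100.9, 100.2],
})


def select_series(df, country, category, start=None, end=None):
    """Return one price series, optionally restricted to a time window."""
    mask = (df["country"] == country) & (df["category"] == category)
    if start is not None:
        mask &= df["month"] >= pd.Timestamp(start)
    if end is not None:
        mask &= df["month"] <= pd.Timestamp(end)
    return df.loc[mask].set_index("month")["index_value"].sort_index()


# Bread prices in Germany from January 2020 onwards.
bread_de = select_series(data, "DE", "Bread", start="2020-01-01")
print(bread_de)
```

Applying the same function to every price-determining variable would keep the target and its features restricted to the same country, category and period.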

Our long-term goal would be to translate our existing tool into a web-based interactive solution, allowing users to explore data in their browser.

GitHub repository:

https://github.com/TechLabs-Dortmund/Food-price-forecast.git

Team members:

Data Science:

  • Jakob Schlicher
  • Lydia Hoffmeister
  • Peter Hinrichs
  • Tom Schwaiger

Deep Learning:

  • Niklas Diesendorf
  • Richard Lodenkämper
  • Steffen Daniel

Team mentors:

  • Franca Bluhm — Project Manager
  • Philipp Wall — Data Science & AI
  • Miguel Krause — Data Science & AI

Data Sources:

  1. Food Price monitoring tool (Eurostat), https://ec.europa.eu/eurostat/databrowser/view/prc_fsc_idx/default/table?lang=en
  2. Electricity Prices for non-household consumers (Eurostat), https://ec.europa.eu/eurostat/databrowser/view/NRG_PC_205/default/table?lang=en&category=nrg.nrg_price.nrg_pc
  3. Euro/ECU exchange rates (Eurostat), https://ec.europa.eu/eurostat/databrowser/view/ert_bil_eur_m/default/table?lang=en
  4. Producer prices in industry, domestic market (Eurostat), https://de.investing.com/commodities/brent-oil-historical-data
  5. Brent Oil Price (Investing.com), https://de.investing.com/commodities/brent-oil-historical-data
  6. Precipitation (Deutscher Wetterdienst), https://opendata.dwd.de/climate_environment/CDC/regional_averages_DE/monthly/precipitation/
  7. Temperature (Deutscher Wetterdienst), https://opendata.dwd.de/climate_environment/CDC/regional_averages_DE/monthly/air_temperature_mean/
