Wind Energy Forecasting
Iberdrola, a global leader in wind energy, needs to predict accurately how much energy its wind farms across Europe and the United States will generate. In a highly competitive sector, anticipating production is essential to succeed in energy auctions, secure attractive profit margins, and optimize the operation and maintenance of each wind farm. The challenge is considerable given the volume of data involved: complex three-dimensional weather meshes from numerical weather prediction sources such as GFS, ECMWF, and WRF, totaling several terabytes and requiring a state-of-the-art on-premises infrastructure for distributed computing.
WhiteBox led the development of the solution, starting with large-scale processing of meteorological data using Spark with Scala, Spark's native language, for speed and efficiency. After a careful feature selection process, a set of Gradient Boosting models was implemented, customized for each wind farm, using open-source libraries such as scikit-learn and LightGBM, with the algorithms distributed on Spark and orchestrated with Apache Airflow. In addition, we experimented with models based on convolutional neural networks that treat the three-dimensional meshes as images, with promising results.
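The idea of one Gradient Boosting model per wind farm can be illustrated with a minimal sketch. It is not the production pipeline: the farm names, capacities, feature set, and the toy power curve are all hypothetical, the data is synthetic, and scikit-learn's `GradientBoostingRegressor` is used on a single machine rather than distributed with Spark.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)


def make_synthetic_farm(n_samples, capacity_mw):
    """Hypothetical stand-in for features extracted from weather meshes."""
    wind_speed = rng.uniform(0.0, 25.0, n_samples)    # m/s
    wind_dir = rng.uniform(0.0, 360.0, n_samples)     # degrees
    temperature = rng.uniform(-5.0, 35.0, n_samples)  # degrees Celsius
    X = np.column_stack([wind_speed, wind_dir, temperature])
    # Toy power curve (assumption): cubic ramp that saturates at rated capacity.
    y = capacity_mw * np.clip((wind_speed / 12.0) ** 3, 0.0, 1.0)
    y = y + rng.normal(0.0, 0.02 * capacity_mw, n_samples)
    return X, y


# Hypothetical farms and rated capacities in MW.
farms = {"farm_a": 50.0, "farm_b": 120.0}

models = {}
scores = {}
for name, capacity in farms.items():
    X, y = make_synthetic_farm(2000, capacity)
    split = 1500  # train on the first 1500 rows, evaluate on the rest
    model = GradientBoostingRegressor(
        n_estimators=200, max_depth=3, learning_rate=0.05, random_state=0
    )
    model.fit(X[:split], y[:split])
    models[name] = model
    # R^2 on the held-out rows of this farm only.
    scores[name] = model.score(X[split:], y[split:])
```

Keeping a separate model per farm lets each one learn the local relationship between weather and output, at the cost of training and maintaining as many models as there are farms, which is where distributing the work across a cluster becomes useful.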
This project marked a turning point for Iberdrola, allowing it to move from traditional predictive methods to an advanced Machine Learning system that uses terabytes of meteorological data and historical records to build accurate prediction models. As a result, Iberdrola can participate in the energy market with greater confidence, anticipating its wind power production and improving the reliability and profitability of its operations.
- Wind Farms: 13 different locations.
- Processed Data: More than 50 TB of weather information.
- Processing Capacity: Training and inference on a 25-node Hadoop cluster.
- Managed Capacity: More than 1 GW in total.