Production forecasting & why 2023 is a challenging year
Production forecasting and yield modeling are often misunderstood, and the methodology is changing as access to technology and data improves. Before delving into how yield modeling and forecasting work, it is vital to understand that there are two distinct approaches, and that one is far more complex and technologically advanced than the other.
An important distinction is whether companies forecast what the USDA will say in an upcoming report or predict final yields and production. The former attempts to mimic the USDA’s approach to objective analysis, survey data, and modeling. The latter can be quite advanced, utilizing large data sets, machine learning, and computing power.
Companies and farmers should want to know both because the market will always first trade the USDA figures. A fair question is, “Even if someone knew what the final yields would be in June, could they make money trading them?”
The answer is not always clear. The cynic then asks, “Why model production if you can’t make money?”
Modeling provides an objective way to understand what is the same and different this year. Clean data with sufficient history is the foundation for building robust models. Is it the heat, the growing degree days, or the precipitation deviating the furthest from normal? How many times has this happened? A good model will weigh these changing inputs accordingly.
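To make the "which input is furthest from normal, and how often has that happened" question concrete, here is a minimal sketch that scores each input as a standardized anomaly (z-score) against its own history and counts how many past years were at least as extreme. The function name, the input names, and every number below are illustrative placeholders, not real observations and not any particular forecaster's method.

```python
from statistics import mean, stdev

def anomaly_report(history, current):
    """For each input, return (z_score, number of past years at least as extreme)."""
    report = {}
    for name, past in history.items():
        mu, sigma = mean(past), stdev(past)
        z = (current[name] - mu) / sigma
        past_z = [(value - mu) / sigma for value in past]
        n_as_extreme = sum(1 for pz in past_z if abs(pz) >= abs(z))
        report[name] = (z, n_as_extreme)
    return report

# Synthetic placeholder values for illustration only (assumption, not real data).
history = {
    "heat_days": [10, 12, 8, 15, 11, 9, 13, 10],
    "gdd": [1400, 1450, 1380, 1500, 1420, 1390, 1460, 1410],
    "precip_in": [12.0, 10.5, 13.2, 9.8, 11.4, 12.6, 10.9, 11.8],
}
current = {"heat_days": 11, "gdd": 1430, "precip_in": 6.5}

report = anomaly_report(history, current)
# The input whose current value sits furthest from its own normal.
furthest = max(report, key=lambda name: abs(report[name][0]))

for name, (z, count) in report.items():
    print(f"{name}: z={z:+.2f}, past years at least as extreme: {count}")
print("Furthest from normal:", furthest)
```

A real model would go further and weight each input by how much it has mattered for yield historically, but even this simple ranking shows what is unusual about the season before any opinion gets involved.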
Modeling tempers our inherent bias when it comes to markets. Many biases make people less objective about their businesses and investments: recency bias, overconfidence bias, endowment bias, and confirmation bias are just a few. Confirmation bias is the most common because we seek out information that confirms our existing views. For farmers and traders, this creates blind spots and can lead to financial losses.
One can think of yield modeling as a real-time snapshot, replicable multiple times daily with each weather run. Here’s what can be learned by farmers and traders with a firm understanding of production forecasting.
The trend. Is the crop improving or getting worse? Is the data showing solid correlations and relationships to other data sets (yield trend versus crop conditions), or is this year unique for some reason (2023 is a great example)?
Key periods. With large enough data sets, the models weigh inputs and recognize which are more or less important. This year’s models weighted the lack of precipitation before pollination as less important than the heat from mid-July to mid-August. The result was that yield models stayed much closer to the trend than analysts expected based on crop conditions and drought monitors.
Analogs. The data provides years and periods that were similar to the current environment. This year’s first unique feature was a dry spring, causing late planting. Usually, late planting and widespread prevent plant result from excessive spring moisture. Crop conditions fell off quickly, diverging from yield models. The late August ridge is one of the most intense in recent decades.
Objective data. Social media is in your face. Forecasters put out a lot of forecasts. Some will be right, and some will be wrong. Without transparent methodology and data, it is a guess. There is no consistency year to year, and nothing is learned from the process.
Warnings. Overfitting, a machine learning term, is when a model succeeds on past data but cannot successfully predict or forecast future results. There are a few reasons for this: some stem from the way the model is built, and sometimes it is as simple as the data sets being too small.
Many forecasters will intentionally or unintentionally overfit their models, so be aware this exists with anyone touting their models’ accuracy.
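One way to see overfitting is to compare a model's error on the data it was fit to against its error on years it has never seen. The sketch below is only an illustration under stated assumptions: the "memorizer" (which simply returns the outcome of the most similar past year) and the straight-line fit are stand-ins chosen for clarity, the helper names are hypothetical, and the numbers are synthetic placeholders rather than real yield data.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares straight line through the data."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: my + slope * (x - mx)

def fit_memorizer(xs, ys):
    """Return the outcome of the closest past observation (memorizes history)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def in_sample_mae(fit, xs, ys):
    """Mean absolute error on the same data the model was fit to."""
    model = fit(xs, ys)
    return mean([abs(model(x) - y) for x, y in zip(xs, ys)])

def leave_one_out_mae(fit, xs, ys):
    """Mean absolute error when each year is predicted without seeing it."""
    errors = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(model(xs[i]) - ys[i]))
    return mean(errors)

# Synthetic placeholders: a heat-stress index and a yield deviation from trend.
heat_index = [1, 2, 3, 4, 5, 6, 7, 8]
yield_dev = [-0.2, -2.9, -2.4, -4.8, -4.3, -6.9, -6.2, -8.7]

for name, fit in [("memorizer", fit_memorizer), ("line", fit_line)]:
    print(
        name,
        "in-sample:", round(in_sample_mae(fit, heat_index, yield_dev), 2),
        "held-out:", round(leave_one_out_mae(fit, heat_index, yield_dev), 2),
    )
```

The memorizer looks perfect on history and falls apart on held-out years, while the simpler line does worse on paper and better in practice. When someone touts accuracy, the useful question is which of those two numbers they are quoting.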
2023 is an incredibly challenging year for the models because of a series of “rare events.” The transition from La Niña to El Niño would typically indicate a wetter and cooler summer, but the early-season blocking pattern led to an extremely dry May, June, and July over the heart of the Corn Belt.
The GFS had been trying to bring extreme heat into the forecast since mid-June, but it did not materialize until late August. Most models rely on a stable forecast, and this year's forecast was volatile. Within a single 24-hour period (4 model runs), model outputs shifted enough to imply that the national yield could change by 3-5%. The second week of July saw one of the most extreme forecasts in recent history. It did not happen.
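The run-to-run swing described above can be measured with simple arithmetic. This is a rough sketch, assuming you already have one implied national yield per model run in chronological order; the function name, the choice to divide the range by the window average, and the yield values are all illustrative assumptions, not actual model output.

```python
def rolling_swing_pct(implied_yields, runs_per_day=4):
    """Percent swing (max minus min, relative to the window mean) per rolling day of runs."""
    swings = []
    for start in range(len(implied_yields) - runs_per_day + 1):
        window = implied_yields[start:start + runs_per_day]
        window_mean = sum(window) / len(window)
        swings.append((max(window) - min(window)) / window_mean * 100)
    return swings

# Hypothetical implied national yields (bu/acre), one per consecutive model run.
implied_yields = [175.0, 172.0, 178.0, 170.5, 176.0, 174.0]

swings = rolling_swing_pct(implied_yields)
# Flag any 24-hour window where the implied yield moved by 3% or more.
unstable = [round(s, 2) for s in swings if s >= 3.0]

print("Swing per rolling 24 hours (%):", [round(s, 2) for s in swings])
print("Windows at or above 3%:", len(unstable))
```

Tracking this number through the season gives a quick read on how much confidence any single run deserves.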
Most years with early-season declines in crop conditions and late planting are wet years when flooding is the problem. There have been very few years since the 1980s when high prevent plant coincided with below-normal precipitation. Models struggle when rare patterns develop. Many farmers in Texas who claimed prevent plant from flooding have not done so in the modern era.
The other factor for corn is that the model weights heat during the critical growing weeks as the dominant input. The model does not understand what pollination is. Instead, the data shows that heat stress drives yield losses, and it never got scorching during those weeks. The late heat will be much more impactful for soybeans at this point.
The human bias will be to assume the worst because of all the headlines. This year is an outlier for many reasons (Brazil Safrinha and Ukraine War), and this only adds to the challenges a farmer faces when making critical decisions. However, having an objective model to check your assumptions is essential when optimizing your business's profits.