Accurately forecasting resource demand within the supply chain has never been easy, particularly given how constantly the underlying data changes over time.
Measurements of demand or logistics that hold true one minute might look entirely different an hour, a day or a week later, which can throw off a short-term load forecast (STLF) and lead to costly over- or under-estimations, and in turn to too much or too little supply.
To improve such forecasts, multiple efforts are underway to create new models that can more accurately predict load needs. While these have led to better forecasts, classical models like autoregressive and exponential smoothing approaches have been tripped up by the issue of time – the changes that occur to the data over time and the varying lengths of such periods. Now it looks like recurrent neural networks (RNNs) may hold some promise in the areas of demand prediction and supply chain logistics.
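To make the classical baseline concrete, here is a minimal sketch of simple exponential smoothing, one of the models the article names; the function name, the alpha value and the toy data are illustrative, not taken from the study.

```python
# Minimal sketch of simple exponential smoothing: each forecast is a
# weighted blend of the latest observation and the previous forecast.
# Function name, alpha, and data are illustrative.
def exp_smooth(series, alpha=0.5):
    """Return one-step-ahead forecasts for each point in `series`."""
    forecast = series[0]          # initialize with the first observation
    forecasts = [forecast]
    for x in series[1:]:
        forecast = alpha * x + (1 - alpha) * forecast
        forecasts.append(forecast)
    return forecasts

print(exp_smooth([10.0, 12.0, 11.0, 13.0], alpha=0.5))
# → [10.0, 11.0, 11.0, 12.0]
```

The single smoothing weight `alpha` is exactly the kind of fixed assumption about the data that, as the researchers note, can limit these models when the dynamics shift over time.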
According to a group of European researchers, RNNs and their ability to address changes to data over time – referred to as temporal dependence – are now being investigated as forecast models that can be applied to a range of industries, including telecommunications, transportation and energy. In addition, there are different variants of RNNs that can be used for modeling forecasts, each with varying degrees of success depending on the tasks they're given. Applications such as speech recognition, language modeling and image generation are seen as particularly well suited to RNNs.
“A Recurrent Neural Network (RNN) is a more flexible model, since it encodes the temporal context in its feedback connections, which are capable of capturing the time varying dynamics of the underlying system,” the researchers explain. “RNNs are a special class of Neural networks characterized by internal self-connections. … RNNs and their variants have been used in many contexts where the temporal dependency in the data is an important implicit feature in the model design.”
Autoregressive and exponential smoothing models, which the researchers said "represented for many years the baseline among systems for time series prediction," require a certain degree of expertise and skill among users, and autoregressive models can be limited by the assumptions they make about the system being modeled. For its part, "as an RNN processes sequential information, it performs the same operations on every element of the input sequence," the authors wrote. "Its output, at each time step, depends on previous inputs and past computations. This allows the network to develop a memory of previous events, which is implicitly encoded in its hidden state variables."
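The mechanism the authors describe can be sketched in a few lines: the same weights are applied at every time step, and a hidden state carries the memory of past inputs forward. The dimensions and random weight values below are illustrative, not from the paper.

```python
import numpy as np

# Bare-bones recurrent forward pass: identical weights at every step,
# with the hidden state h encoding the memory of previous inputs.
rng = np.random.default_rng(0)
n_in, n_hid = 1, 4
W_in  = rng.normal(scale=0.5, size=(n_hid, n_in))   # input -> hidden
W_rec = rng.normal(scale=0.5, size=(n_hid, n_hid))  # hidden -> hidden (feedback)
W_out = rng.normal(scale=0.5, size=(1, n_hid))      # hidden -> output

def forward(sequence):
    h = np.zeros(n_hid)  # hidden state: the network's memory
    outputs = []
    for x in sequence:
        h = np.tanh(W_in @ np.atleast_1d(x) + W_rec @ h)
        outputs.append((W_out @ h).item())
    return outputs

ys = forward([0.1, 0.2, 0.3])  # each output depends on the history so far
```

Because `h` feeds back into itself, feeding the same values in a different order produces different outputs, which is precisely the temporal dependence the article highlights.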
RNNs are not required for all forecasting jobs, but there are many scenarios where their capabilities are needed, according to the researchers, who hail from Norway and Italy. In addition, various RNN models can be used for different STLF applications.
In theory, “RNNs can remember arbitrarily long sequences. However, their memory is in practice limited by their finite size and, more critically, by the suboptimal training of their parameters. To overcome memory limitations, recent research efforts have led to the design of novel RNN architectures, which are equipped with an external, permanent memory capable of storing information for indefinitely long amount of time. Contrarily to other linear models adopted for prediction, RNNs can learn functions of arbitrary complexity and they can deal with time series data possessing properties such as saturation or exponential effects and nonlinear interactions between latent variables.”
With the challenges of STLF, multiple RNN models are being applied, and some are finding their way into industries, according to the researchers. They take a look at the basic Elman RNN (ERNN) and two of its variants, the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, whose use for real-valued time series, they said, "has been limited so far." In addition, the authors address two other RNN architectures – the Nonlinear Autoregressive with eXogenous inputs (NARX) neural network and the Echo State Network (ESN) – which differ from the others in their training procedures and have been used successfully for time series prediction. They're easier to use and offer fast training procedures.
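The reason the ESN trains so quickly can be shown in a short sketch: the recurrent "reservoir" weights are random and left fixed, and only a linear readout is fit, here with ridge regression on a toy sine-wave task. The reservoir size, scaling and regularization values are illustrative assumptions, not the study's settings.

```python
import numpy as np

# Echo State Network sketch: random fixed reservoir, trained linear readout.
rng = np.random.default_rng(1)
n_res = 50
W_in  = rng.uniform(-0.5, 0.5, size=(n_res, 1))
W_res = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
W_res *= 0.9 / max(abs(np.linalg.eigvals(W_res)))  # keep spectral radius < 1

def reservoir_states(u):
    h = np.zeros(n_res)
    states = []
    for x in u:
        h = np.tanh(W_in @ [x] + W_res @ h)  # reservoir is never trained
        states.append(h)
    return np.array(states)

# Toy one-step-ahead task: predict the next point of a sine wave.
u = np.sin(0.1 * np.arange(300))
H = reservoir_states(u[:-1])   # reservoir states for each input
y = u[1:]                      # targets: the next value
ridge = 1e-6                   # ridge regression fits only the readout
W_out = np.linalg.solve(H.T @ H + ridge * np.eye(n_res), H.T @ y)

pred = H @ W_out
```

Training reduces to one linear solve, which is why the researchers single out the ESN's simplicity and fast training compared with gradient-based RNNs.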
In their study, the authors ran three synthetically generated time series through the various RNNs to give them experiments that could be easily controlled and results that were easy to replicate. However, the researchers also used three different datasets from the real world, showing how deep neural networks are making their way into industries. The datasets involved measurements of electricity and telephonic activity load, with two of the datasets containing exogenous variables, which they said "are used to provide additional context information to support the prediction task." The datasets came from European telecommunications company Orange (phone call loads over a mobile network), ACEA (Azienda Comunale Energia e Ambiente), the electricity provider to Rome and surrounding regions (electricity consumption), and the Global Energy Forecasting Competition of 2012 (electricity load collected from an energy provider in the United States).
In each case, the researchers applied the RNNs to the related STLF problems. For the Orange dataset – which was a collection of call records from Orange mobile phone users in the Ivory Coast from Dec. 1, 2011, to April 28, 2012 – the work involved predicting the volume in minutes of the incoming calls the following day. With the ACEA data, they used measurements of supplied electricity to Rome that had been collected every 10 minutes for 954 days from 2009 to 2011 and, among other things, trained the RNNs to predict electricity load 24 hours ahead. From the energy forecasting competition, the data involved four years of hourly electricity load collected by the U.S. supplier, as well as time series of temperatures in the areas where the electricity consumption was measured, throwing a seasonal component into the challenge. Again, the goal was to forecast the aggregated electricity consumption 24 hours ahead.
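Framing a 24-hour-ahead forecast on data like ACEA's as a supervised learning task can be sketched as follows: with one sample every 10 minutes, 24 hours corresponds to a horizon of 144 steps. The window length and function name are illustrative choices, not the study's configuration.

```python
# Illustrative windowing for a 24-hour-ahead forecast on a 10-minute
# series: 24 hours = 144 samples. Window and horizon are assumptions.
def make_pairs(series, window=144, horizon=144):
    """Pair each input window with the value `horizon` steps past its end."""
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        x = series[i : i + window]            # past 24 hours of load
        y = series[i + window + horizon - 1]  # load 24 hours ahead
        pairs.append((x, y))
    return pairs

pairs = make_pairs(list(range(400)), window=144, horizon=144)
# each pair maps a day of history to the load one day later
```

Each RNN can then be trained on these (window, target) pairs, with exogenous variables such as temperature appended to the input windows where available.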
Details of the experiments and results can be found in the study, but the researchers said that in the case of the Orange dataset, all the RNNs studied showed similar results. There was more variability in the ACEA dataset experiments, with the ESN performing better than the other RNNs. In the Global Energy Forecasting Competition, it was the GRU that performed the best, with the NARX showing many more errors in its predictions.
Overall, the researchers said they found that different RNNs worked best with different STLF problems. There was no magic bullet that could address them all.
“There is not a specific RNN model that outperforms the others in every prediction problem,” they wrote. “The choice of the most suitable architecture depends on the specific task at hand and it is important to consider more training strategies and configurations for each RNN.”
They noted that while the training of the gradient-based networks – ERNN, LSTM and GRU – is slower and more complex, these networks can offer good results with minimal fine-tuning, even when default parameters are selected. "This implies that a strong expertise on the data domain is not always necessary," they wrote.
Other findings included that the ESN was strong in most jobs and that "the simplicity of its implementation makes it an appealing instrument for time series prediction." In addition, the gated RNNs – LSTM and GRU – were not necessarily better than the ERNN, with its simple architecture and training. The more complex gating mechanisms weren't always needed for many time series prediction jobs.
Overall, “we conclude by arguing that ERNN and ESN may represent the most convenient choice in time series prediction problems, both in terms of performance and simplicity of their implementation and training,” they wrote.