Abstract:
Objectives: Algorithms based on the Convolutional Neural Network (CNN), a branch of deep learning, have made remarkable progress in video and image analysis in recent years; the success and acceptance of the new models in this field have led to their widespread use in other areas, including text analysis and time series data. Deep learning is a part of machine learning in which multiple layers of information processing, particularly nonlinear processing, are used to extract from the raw input the features best suited to analysis, pattern recognition, or prediction.
Method: In the present study, the ability of different CNN architectures to predict stock prices was examined.
Results: The algorithm was run 54 times with different architectures and parameters, using two main categories of input data (daily stock price information and ten selected technical indicators) for the stock of Esfahan Steel Company. The results indicate that a CNN with a max-pooling layer (batch size of 64, 256 filters, and the ReLU activation function) yields errors of MAPE = 1.79% and NRMSE = 2.71%, which indicates better performance than the other architectures and the RNN algorithm.
Algorithms based on the Convolutional Neural Network (CNN), a branch of Deep Learning (DL), have made significant progress in image and video analysis in recent years. The success of these models has led to their widespread use in other fields, including text mining and time series analysis. DL is part of a broader family of machine learning methods that attempt to model high-level concepts by learning at multiple levels and layers, extracting higher-level features from the raw input. This study investigated the ability of different CNN architectures to predict stock prices. After the model was run with various architectures and parameters on the stock price of Esfahan Steel Company, the results showed that a CNN with max-pooling layers (batch size = 64, filters = 256, and the ReLU activation function), with a Mean Absolute Percentage Error (MAPE) of 1.79% and a Normalized Root Mean Square Error (NRMSE) of 2.71%, had higher prediction accuracy than the other CNN architectures and the Recurrent Neural Network (RNN).

Introduction
Among the various deep learning techniques applied across the sciences, specific algorithms such as Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN) have been used by researchers because of the characteristics of financial time series (Sezer, Gudelek, & Ozbayoglu, 2020). CNN is a feed-forward Artificial Neural Network (ANN) that takes its inputs as 2-D matrices. Unlike in a fully connected neural network such as the Multi-Layer Perceptron (MLP), the locality of data within its input vector (or matrix) is important (Sezer & Ozbayoglu, 2018).
CNN has different architectures. Each study in this field usually focuses on one specific architecture. In this study, however, the architectures used in various studies were surveyed at the first level, and each selected architecture was optimized with different parameters at the second level.
Finally, the best performances of the architectures with various parameters were compared to choose the optimized model. The studies that informed model development are shown in Table 1.

Table (1) Effective studies in model development
| Article | Method | Dataset | Feature Set | Horizon |
|---|---|---|---|---|
| Livieris, E. Pintelas, & P. Pintelas (2020) | Two convolutional layers with different filters | Gold | Price data | 1 day |
| Gao, Zhang, & Yang (2020) | Simple CNN; CNN with a dropout layer | S&P 500, CSI 300, Nikkei 225 | Price data, volume, technical indicators | 1 day |
| Gudelek, Boluk, & Ozbayoglu (2017) | CNN with dropout and max-pooling layers | ETF | Price data, technical indicators | 1 day |
| Ji, Zou, He, & Zhu (2019) | CNN with a max-pooling layer | Future carbon price | Price data | 7 days |
| Li & Dai (2020) | CNN with a max-pooling layer | Bitcoin | Price data | 1 day |

Method and Data
Based on previous studies of CNN applications, three different CNN architectures were investigated, as shown in Figure 1.

Figure (1) The process of choosing an optimal CNN algorithm

To select the best CNN architecture, all three models were evaluated with various parameters. The parameters that affect a CNN include the number of filters in the convolutional layer, the batch size, and the activation function. In this study, data for Esfahan Steel Company over the period 2018-2021 were used. The input data consisted of two categories: price data (Open, High, Low, Close, and Volume) and technical indicators based on the studies of Kara et al. (2011) and Patel et al. (2015). Python 3.8 with the Keras library was used to execute the model. The dataset was divided into a training set and a testing set, which covered approximately the first 80% and the last 20% of the raw dataset, respectively.

Findings
Comparison of the three defined architectures with various parameters led to the optimized model. The selected model was the result of 54 runs with different layers and parameters.
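The division into training and testing sets described under Method and Data keeps the observations in time order: the earliest 80% are used for training and the latest 20% for testing. The following is a minimal sketch of such a split, not the authors' code. The function name `chronological_split` and the sample closing prices are hypothetical and serve only as illustration.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split a time-ordered sequence into (train, test) without shuffling.

    The first `train_fraction` of the rows becomes the training set and
    the remainder becomes the testing set, so no future observation
    leaks into training.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Hypothetical daily closing prices, oldest first (not data from the study).
closes = [float(100 + day) for day in range(10)]
train, test = chronological_split(closes)
```

Because the split is a single cut in time, every test observation is later than every training observation, which mirrors how a forecast would be used in practice.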
In this study, two performance measures, Mean Absolute Percentage Error (MAPE) and Normalized Root Mean Square Error (NRMSE), were selected to evaluate the predictive power of the proposed models. Table 2 compares the errors of the best performance of each of the three architectures and of the RNN model (another DL model) to choose the optimized model. Based on the results, the best performance of the second CNN architecture was more accurate than the others.

Table (2) Comparison of the errors of the selected models
| Method | MAPE | NRMSE |
|---|---|---|
| RNN | 2.46% | 2.79% |
| Best performance of the first CNN architecture | 2.13% | 3.09% |
| Best performance of the second CNN architecture | 1.79% | 2.71% |
| Best performance of the third CNN architecture | 2.18% | 3.26% |

Conclusion and discussion
In this paper, the predictive power of various CNN architectures was investigated. The results demonstrated that the best performance of the second CNN architecture, which uses a max-pooling layer with a batch size of 64, 256 filters, and the ReLU activation function, and has MAPE and NRMSE errors of 1.79% and 2.71%, respectively, provided higher prediction accuracy than the other CNN architectures and the RNN. This outcome is consistent with the research of Ji et al. (2019) on carbon futures price forecasting and that of Li & Dai (2020) on Bitcoin price forecasting, both of which used a CNN model with convolutional and max-pooling layers. However, Gao et al. (2020) proposed a convolutional layer with a dropout layer, and Gudelek et al. (2017) used a convolutional layer with dropout and max-pooling layers for predicting ETF prices. Their results were not confirmed by this paper, since a convolutional layer with a max-pooling layer performed better than the other architectures.
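The two error measures reported in Table 2 can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the source does not say how NRMSE is normalized, so division by the range of the observed values is assumed here, and the sample prices are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def nrmse(actual, predicted):
    """Root Mean Square Error divided by the range of the observed values,
    expressed in percent. The choice of normalizer is an assumption."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("observed values must not all be equal")
    return 100.0 * math.sqrt(mse) / value_range


# Hypothetical observed and predicted closing prices (not data from the study).
actual = [100.0, 110.0, 120.0, 130.0]
predicted = [98.0, 112.0, 119.0, 133.0]
print(f"MAPE = {mape(actual, predicted):.2f}%, NRMSE = {nrmse(actual, predicted):.2f}%")
```

Other normalizers, such as the mean of the observed values, are also in common use and would give different NRMSE figures for the same predictions.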