**Research**

**Working Papers**

**Abstract:** This paper provides a comprehensive comparison of the predictive performance of factor estimation methods within a single, coherent forecasting framework, using a large data set of major U.S. macroeconomic and financial variables. I forecast 148 target variables using 7 factor estimation methods and 11 decision rules that determine the number of estimated factors used for forecasting. First, I find that the number of estimated factors used in forecasting matters for predictive power: incorporating more factors does not always improve forecasting performance. Second, using a consistently estimated number of factors does not necessarily improve predictive performance. The decision rules for the number of factors include, but are not limited to, consistent estimators of the total number of true factors in the data. Forecasts obtained with these decision rules perform well, except in the case of Partial Least Squares (PLS): the first PLS factor alone very often forecasts better than the PLS factors whose number is set by a consistent estimate of the total number of factors. Third, the best forecasting performance of each of the 7 factor estimation methods, chosen across the decision rules, tends to be very similar. However, forecasting performance differs widely across decision rules, even when the same factor estimation method is used. Therefore, the choice of factor estimation method, as well as the decision rule for the number of factors, is crucial in forecasting practice. Finally, the first PLS factor tends to yield forecasting performance very close to the best result among all combinations of the 7 factor estimation methods and 11 decision rules. The strong predictive power of PLS comes from its factor estimation strategy: PLS estimates factors using not only the predictors but also the target variable, which can explain its significant forecasting improvement.

**NBER-NSF Time Series Conference at Rice University, 2021**

with Seung C. (Min) Ahn

**Abstract:** We consider Partial Least Squares (PLS) estimation of a time-series forecasting model in which the data contain a large number (T) of time-series observations on each of a large number (N) of predictor variables. In the model, a subset or the whole set of the latent common factors in the predictors determines a single target variable to be forecasted. The factors relevant for forecasting the target variable, which we refer to as PLS factors, can be generated sequentially by the “Nonlinear Iterative Partial Least Squares” (NIPLS) algorithm. Two main findings emerge from our asymptotic analysis. First, the optimal number of PLS factors for forecasting can be much smaller than the number of common factors in the original predictor variables that are relevant for the target variable. Second, when more than the optimal number of PLS factors is used, the out-of-sample explanatory power of the factors for the target variable can decrease even though their in-sample power may increase. Our Monte Carlo simulations confirm these asymptotic results. In addition, the simulations indicate that, unless very large samples are used, the out-of-sample forecasting power of the PLS factors is often higher when fewer than the asymptotically optimal number of factors are used. We find that the out-of-sample forecasting power of the PLS factors often decreases as the second, third, and further factors are added, even if the asymptotically optimal number of factors is greater than one.
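The sequential extraction of PLS factors for a single target can be illustrated with a minimal sketch. This is the textbook single-target (PLS1) recursion with deflation, written under stated assumptions and not the authors' implementation; the function name `pls1_factors` and the simulated data are hypothetical and serve only as an illustration.

```python
import numpy as np


def pls1_factors(X, y, n_factors):
    """Extract PLS factors one at a time for a single target (PLS1 sketch).

    X: (T, N) array of predictors; y: (T,) target; returns a (T, n_factors) array.
    Illustrative only: assumes y is not perfectly explained before the last step.
    """
    X = X - X.mean(axis=0)      # demean predictors (creates a copy)
    y = y - y.mean()            # demean target (creates a copy)
    T = X.shape[0]
    factors = np.empty((T, n_factors))
    for k in range(n_factors):
        w = X.T @ y             # weights: covariance of each predictor with the target
        w = w / np.linalg.norm(w)
        t = X @ w               # k-th PLS factor
        tt = t @ t
        p = X.T @ t / tt        # loadings of the predictors on the factor
        X = X - np.outer(t, p)  # deflate predictors
        y = y - t * (t @ y) / tt  # deflate target
        factors[:, k] = t
    return factors


# Simulated example (hypothetical data): 3 latent factors, only one drives y.
rng = np.random.default_rng(0)
T_obs, N_pred = 200, 50
F = rng.normal(size=(T_obs, 3))
X_sim = F @ rng.normal(size=(3, N_pred)) + rng.normal(size=(T_obs, N_pred))
y_sim = F[:, 0] + 0.5 * rng.normal(size=T_obs)
pls = pls1_factors(X_sim, y_sim, n_factors=3)
```

Because each weight vector is built from the covariance between the (deflated) predictors and the target, the target enters the factor construction directly, which is what distinguishes PLS factors from principal components of the predictors alone.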

**Work in Progress**

**Single Modified Partial Least Squares (SMPLS) for Forecasting with Many Predictors**

with Seung C. (Min) Ahn and Seth Pruitt

**Abstract:** In this paper, we develop a novel factor estimation method, Single Modified Partial Least Squares (SMPLS), for forecasting a single time-series variable. We modify the original Partial Least Squares (PLS) so that a single SMPLS factor can estimate all the factors needed for forecasting the target variable. PLS-augmented forecasting with the first PLS factor performs well in small data sets, even though using only the first PLS factor is not theoretically optimal. In contrast, forecasting with the theoretically optimal number of PLS factors performs worse in small data sets. Our SMPLS estimation forecasts as accurately as the first PLS factor does in relatively small data sets, and as accurately as the theoretically optimal number of PLS factors does in very large data sets. SMPLS has the same theoretical forecasting power as the theoretically optimal number of original PLS factors. Simulation evidence shows that SMPLS outperforms the alternatives and delivers robust forecasting performance. An empirical application to forecasting major macroeconomic and financial variables in a large data set also confirms the strong predictive power of SMPLS.

**Measuring Macroeconomic Uncertainty with Various Factor-Augmented Forecasting**

**Abstract:** This paper investigates the measurement of macroeconomic uncertainty using factor-augmented forecasting. Uncertainty is measured by the conditional volatility of the component of various macroeconomic series that factor-augmented forecasting does not predict. Various factor-augmented forecasting methods are used to construct the uncertainty measures, in order to remove as much predictable variation as possible from the economic series. Failing to do so overestimates economic uncertainty, because forecastable variation is wrongly counted as part of uncertainty. Target-specific factor estimation methods, which incorporate information about the target variable when the factors are estimated, generate smaller forecasting errors and hence produce more accurate uncertainty measures. However, all the uncertainty measures constructed by factor-augmented forecasting display similar properties: compared with other uncertainty proxies, they show uncertainty periods that are more persistent and more correlated with real activity.
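One common way to formalize uncertainty as the conditional volatility of the unforecastable component is sketched below. The notation is an assumption introduced for illustration and may differ from the paper's exact definition: $y_{j,t+h}$ is series $j$ at horizon $h$, $I_t$ is the information available at time $t$ (here, the estimated factors and lags), and $\mathcal{U}_{j,t}(h)$ is the resulting uncertainty measure.

```latex
\mathcal{U}_{j,t}(h)
  = \sqrt{\,\mathbb{E}\!\left[\left(y_{j,t+h} - \mathbb{E}\!\left[y_{j,t+h} \mid I_t\right]\right)^{2} \,\middle|\, I_t\right]}
```

Under this formulation, a forecasting method that leaves predictable variation in the error term $y_{j,t+h} - \mathbb{E}[y_{j,t+h} \mid I_t]$ inflates the measure, which is the overestimation the abstract warns about.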