Wednesday, July 1, 2009

News Sentiment as a Quant Factor

In a previous blog posting, I presented some of the key findings of an event study conducted by Macquarie Research. This posting focuses on the same report (the May edition of Factorial!, under the title “Breaking News: How to use news sentiment to pick stocks”), but presents the key results of its quant factor analysis. To summarize, Macquarie shows that quant factors based on News Sentiment add significant value to multi-factor alpha models. More importantly, they find that investing based on News Sentiment has worked at a time when many traditional quant factors have failed. The study was conducted on the Russell 3000 index, covering both large and small cap companies, for the period January 2005 through March 2009. Sentiment data was supplied by RavenPack International.

To include news sentiment in a multifactor quant model, it is necessary to perform some kind of frequency transformation: news arrives at irregular times, so sentiment has to be translated into a quant factor that is sampled at the same periodic intervals as the relevant market data. In their report, Macquarie tests a number of such factors, including the following examples:
  • Simple average: take the average of one of the RavenPack sentiment classifiers per company over the past week
  • Relevance-weighted average: scale each sentiment classifier score for the relevant company by the relevance score for that article and then take the average over the last week. This puts more emphasis on articles where the company was prominently mentioned.
  • Simple count: count the number of positive or negative scores per classifier for the relevant company in the past week.
  • Linear time-weighted average: use a linear decay profile to weight sentiment classifier scores for the past week. This puts more weight on more recent news articles.
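To make the four transformations above concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not Macquarie's or RavenPack's actual implementation: the `Article` record, the zero-centered score scale, the relevance range of 0 to 1, and the exact shape of the linear decay are all hypothetical. The relevance-weighted version follows the literal description (scale each score, then take a plain average); normalizing by the sum of relevance scores would be a reasonable alternative.

```python
from dataclasses import dataclass

WINDOW_DAYS = 7  # "the past week" in the descriptions above


@dataclass
class Article:
    # Hypothetical record for one news item about one company.
    age_days: float   # days before the rebalance date (0 = most recent)
    score: float      # assumed sentiment score centered at 0 (>0 positive, <0 negative)
    relevance: float  # assumed relevance of the article to the company, in [0, 1]


def in_window(articles):
    """Keep only the articles published within the past week."""
    return [a for a in articles if 0 <= a.age_days < WINDOW_DAYS]


def simple_average(articles):
    recent = in_window(articles)
    return sum(a.score for a in recent) / len(recent) if recent else None


def relevance_weighted_average(articles):
    # Scale each score by its relevance, then take the plain average.
    recent = in_window(articles)
    return sum(a.score * a.relevance for a in recent) / len(recent) if recent else None


def simple_count(articles, positive=True):
    # Count positive (or negative) scores in the window.
    recent = in_window(articles)
    return sum(1 for a in recent if (a.score > 0 if positive else a.score < 0))


def linear_time_weighted_average(articles):
    # Assumed decay: weight falls linearly from 1 (today) towards 0 (a week ago).
    recent = in_window(articles)
    weights = [(WINDOW_DAYS - a.age_days) / WINDOW_DAYS for a in recent]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * a.score for w, a in zip(weights, recent)) / total


# Made-up articles for one company; the last one is older than a week and is ignored.
articles = [
    Article(age_days=0, score=0.5, relevance=1.0),
    Article(age_days=3, score=-0.2, relevance=0.5),
    Article(age_days=6, score=0.3, relevance=0.2),
    Article(age_days=10, score=0.9, relevance=1.0),
]

factors = {
    "simple_average": simple_average(articles),
    "relevance_weighted_average": relevance_weighted_average(articles),
    "positive_count": simple_count(articles, positive=True),
    "negative_count": simple_count(articles, positive=False),
    "linear_time_weighted_average": linear_time_weighted_average(articles),
}
```

Each function returns one number per company per rebalance date, which is what allows the result to sit alongside conventional factors in a cross-sectional model.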
Summarizing the key points of the Macquarie Research quant factor study:
  • Composite factors tend to work better than any of the factors based on individual scores. This is partly because of diversification benefits: the five RavenPack sentiment scores have high, but not perfect, correlations, and each score has reasonable predictive power in its own right, so combining them improves the overall signal.
  • Relevance-weighted factors tend to have better performance than unweighted factors. Articles with a high relevance score are better at capturing news sentiment specific to a particular company. So weighting each sentiment score by the relevance score should enhance the predictive power of the factors.
  • Recent performance of news sentiment was excellent through the credit crisis – a time when many familiar quant factors have struggled (Fig. 33, below). More specifically, news sentiment-based quant factors account for 31 of the top 50 quant factors by average rank Information Coefficient (IC). Macquarie suggests this is because, through the severe market dislocation, investors lost faith in the company fundamentals upon which many traditional quant factors are based. For example, investors stopped trusting not only factors based on sell-side analyst forecasts (eg, earnings revisions, forward PER), but also factors that could be clouded by uncertain asset values (eg, price/book, ROA).
  • Just being in the news is a positive sign. This is seen in the fact that a simple count of the number of news articles (irrespective of whether they are positive or negative) actually has positive predictive power. This is more pronounced in the monthly backtests. Macquarie argues that further research is needed on this issue because the academic literature has been inconclusive on whether news coverage is positive or negative for stock returns. Another issue here is a potential size bias. Buying stocks with lots of news stories is similar to buying large cap names, because of the correlation between size and the number of stories.
  • Rebalancing frequency is important. The weekly and monthly results are similar, but there are also some significant differences. In general, count-based factors tend to outperform average-based factors in the monthly backtests, whereas the weekly results show the opposite. This raises interesting questions about the information decay rate of different components of news sentiment. For example, does the ‘attention effect’ of just being in the news persist longer than the directional aspect (ie, positive or negative sentiment) of the article? Again, Macquarie suggests that further research is needed on this issue.
  • Is data mining a potential problem? According to Macquarie, the short answer is yes. Whenever testing a large number of factors, some will almost certainly give good performance purely due to random chance. How can one tell if this is the case here?
Macquarie argues that there are a few aspects to the results that give them some comfort. First, the majority of results have the signs that they would expect. For example, the count of negative stories for four of the five sentiment scores has a negative IC in the weekly backtests, which makes sense because one would expect stocks with a large number of negative stories to underperform. Second, there is a consistency of performance across factors derived from the five news sentiment scores, despite the fact that the factors actually have a reasonably low correlation. Again, this provides some confidence that the results are not purely the result of spurious data variation.
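For readers less familiar with the rank IC used in the results above: it is commonly computed as the Spearman rank correlation, across stocks, between factor values at a rebalance date and the returns over the following period, and then averaged across periods. The sketch below assumes that definition (the report's exact methodology may differ); the function names and the numbers in the example are my own and purely illustrative.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined if either side is constant
    return cov / (var_x * var_y) ** 0.5


def rank_ic(factor_values, forward_returns):
    """Spearman rank correlation between a factor and next-period returns."""
    return pearson(average_ranks(factor_values), average_ranks(forward_returns))


def mean_rank_ic(periods):
    """Average the rank IC over (factor_values, forward_returns) pairs."""
    ics = [rank_ic(f, r) for f, r in periods]
    ics = [ic for ic in ics if ic is not None]
    return sum(ics) / len(ics) if ics else None


# Made-up cross-section of four stocks over two periods.
period_1 = ([0.2, -0.1, 0.4, 0.0], [0.01, -0.02, 0.03, 0.00])  # factor orders returns perfectly
period_2 = ([0.2, -0.1, 0.4, 0.0], [0.00, 0.03, -0.02, 0.01])  # ordering exactly reversed

ic_1 = rank_ic(*period_1)
ic_2 = rank_ic(*period_2)
average_ic = mean_rank_ic([period_1, period_2])
```

With this convention, a factor such as the count of negative stories is expected to show a negative IC: stocks ranked high on the factor tend to rank low on subsequent returns.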


In my next blog posting, I will make reference to some interesting findings by Macquarie looking into the correlations between news sentiment-based quant factors and the more traditional quant factors. This is obviously important since quants are not only interested in positive ICs, but also in how a new factor correlates with the factors they already use. If the correlation is high, there is little benefit from adding it to a multifactor model.
