To include news sentiment in a multifactor quant model, some kind of frequency transformation is needed: sentiment has to be translated into a quant factor that works on the same periodic intervals as the relevant market data. In their report, Macquarie tests a number of such factors, including the following examples:
- Simple average: take the average of one of the RavenPack sentiment classifiers per company over the past week
- Relevance-weighted average: scale each sentiment classifier score for the relevant company by the relevance score for that article and then take the average over the last week. This puts more emphasis on articles where the company was prominently mentioned.
- Simple count: count the number of positive or negative scores per classifier for the relevant company in the past week.
- Linear time-weighted average: use a linear decay profile to weight sentiment classifier scores for the past week. This puts more weight on more recent news articles.
- Composite factors tend to work better than any of the factors based on individual scores, partly because of diversification. The five RavenPack sentiment scores tend to have high, but not perfect, correlations with one another, and each score has reasonable predictive power in its own right. As a result, a composite factor is able to pick up some diversification benefits.
- Relevance-weighted factors tend to have better performance than unweighted factors. Articles with a high relevance score are better at capturing news sentiment specific to a particular company. So weighting each sentiment score by the relevance score should enhance the predictive power of the factors.
- News sentiment performed very well through the credit crisis, a time when many familiar quant factors struggled (Fig. 33, below). More specifically, news sentiment-based quant factors account for 31 of the top 50 quant factors by average rank Information Coefficient (IC). Macquarie suggests this is because, through the severe market dislocation, investors lost faith in the company fundamentals upon which many traditional quant factors are based. For example, investors stopped trusting not only factors based on sell-side analyst forecasts (eg, earnings revisions, forward PER), but also factors that could be clouded by uncertain asset values (eg, price/book, ROA).
- Just being in the news is a positive sign. A simple count of the number of news articles (irrespective of whether they are positive or negative) actually has positive predictive power, and the effect is more pronounced in the monthly backtests. Macquarie argues that further research is needed on this issue, because the academic literature has been inconclusive on whether news coverage is positive or negative for stock returns. Another issue here is a potential size bias: buying stocks with lots of news stories is similar to buying large cap names, because of the correlation between size and the number of stories.
- Rebalancing frequency is important. The weekly and monthly results are broadly similar, but there are also some significant differences. In general, count-based factors tend to outperform average-based factors in the monthly backtests, whereas in the weekly backtests the opposite holds. This raises interesting questions about the information decay rate of different components of news sentiment. For example, does the ‘attention effect’ of just being in the news persist longer than the directional aspect (ie, positive or negative sentiment) of the article? Again, Macquarie suggests that further research is needed on this issue.
- Is data mining a potential problem? According to Macquarie, the short answer is yes. Whenever testing a large number of factors, some will almost certainly give good performance purely due to random chance. How can one tell if this is the case here?
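
To make the factor definitions at the top of this post more concrete, here is a minimal sketch in Python of how the four weekly factors, and the rank IC used to compare them, might be computed. It is not Macquarie's or RavenPack's code. The article-level table layout, the column names (`company`, `timestamp`, `score`, `relevance`), the function names, the seven-day window and the exact shape of the linear decay are all my own assumptions for illustration. The rank IC is taken here in its usual sense, the Spearman rank correlation between factor values and subsequent returns across stocks.

```python
import pandas as pd


def weekly_sentiment_factors(news: pd.DataFrame, as_of: pd.Timestamp,
                             window_days: int = 7) -> pd.DataFrame:
    """Per-company factors from article-level scores (hypothetical layout).

    `news` is assumed to have the columns: company, timestamp,
    score (signed sentiment) and relevance (0 to 1).
    """
    start = as_of - pd.Timedelta(days=window_days)
    w = news[(news["timestamp"] > start) & (news["timestamp"] <= as_of)].copy()

    # Linear decay (assumed profile): weight 1 at `as_of`, falling
    # towards 0 at the start of the window.
    age_days = (as_of - w["timestamp"]) / pd.Timedelta(days=1)
    w["time_w"] = 1.0 - age_days / window_days

    w["rel_score"] = w["score"] * w["relevance"]  # score scaled by relevance
    w["tw_score"] = w["score"] * w["time_w"]      # score scaled by recency
    w["pos"] = (w["score"] > 0).astype(int)
    w["neg"] = (w["score"] < 0).astype(int)

    g = w.groupby("company")
    return pd.DataFrame({
        "simple_avg": g["score"].mean(),
        "relevance_avg": g["rel_score"].mean(),
        "pos_count": g["pos"].sum(),
        "neg_count": g["neg"].sum(),
        "time_weighted_avg": g["tw_score"].sum() / g["time_w"].sum(),
    })


def rank_ic(factor: pd.Series, forward_returns: pd.Series) -> float:
    """Spearman rank correlation between a factor and subsequent returns,
    both indexed by company, for a single rebalancing date."""
    aligned = pd.concat([factor, forward_returns], axis=1, join="inner").dropna()
    # Pearson correlation of the ranks equals the Spearman correlation.
    return aligned.iloc[:, 0].rank().corr(aligned.iloc[:, 1].rank())
```

Calling `weekly_sentiment_factors` once per rebalancing date and passing each resulting column to `rank_ic`, together with the returns over the following week or month, gives one IC per factor per date. Averaging those ICs over time is the kind of ranking statistic quoted above. Bear in mind that the more factor variants are compared this way, the more likely it is that some will look good by chance.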
In my next blog posting, I will look at some interesting findings by Macquarie on the correlations between news sentiment-based quant factors and the more traditional quant factors. This is obviously important, since quants are interested not only in whether a factor has a positive IC, but also in how it correlates with other factors. If a new factor is highly correlated with factors already in the model, there is little benefit from adding it to a multifactor model.
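
As a rough illustration of what such a correlation check involves, the sketch below computes the cross-sectional rank correlation between every pair of factors on a single date. This is my own minimal example, not Macquarie's methodology. The function name and the column names in the usage note (`news_sentiment`, `price_to_book`, `earnings_revisions`) are hypothetical.

```python
import pandas as pd


def factor_rank_correlations(factors: pd.DataFrame) -> pd.DataFrame:
    """Pairwise rank correlation between factor columns.

    Rows are stocks and columns are factors, all measured on the same
    date. Stocks with any missing factor value are dropped first, so
    every pair of factors is compared on the same set of stocks.
    """
    complete = factors.dropna()
    # Pearson correlation of the ranks equals the Spearman correlation.
    return complete.rank().corr()
```

Given a table with one row per stock and columns such as `news_sentiment`, `price_to_book` and `earnings_revisions`, the row for `news_sentiment` in the result shows how closely the sentiment factor tracks each traditional factor. Values near zero suggest the factor carries information the others do not.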
