Quant Finance Notes

1. Alternative Data

  • Alternative data and analyst earnings forecasts

    • Dessaint, O., T. Foucault, and L. Fresard (2022). Does alternative data improve financial forecasting? The horizon effect. Working paper.
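    • Assume that, when forecasting earnings, analysts must optimally allocate their effort across forecasts of different horizons so as to minimize the sum of forecast error and the cost of acquiring information at each horizon. The emergence of alternative data lowers the cost of obtaining short-horizon information and at the same time raises its accuracy. It therefore induces analysts to devote more effort to acquiring and analyzing short-horizon information, which improves the accuracy of their short-term forecasts. However, because analysts' effort is limited, this comes at the expense of their long-term forecasts, whose accuracy declines.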
  • Wolfe Research | Global Stock Selection with Proprietary Global Trademark Data

    • USPTO, foreign applications, Madrid filing, international registration
    • Trademarks are not a proxy for advertising spending. Firms with very high advertising spending may over-spend and suffer from agency problems.
    • Signal transformations: growth rate, vintage ratio (long term number / short term number), long/short lookback window; residualizing size (log revenue) and sector.
    • Use life cycle of trademark to construct signals: applications, success rate of application, renewal ratio, trademark age, age dispersion, dispersion in trademark category, secretive foreign priority application (to keep trademark information under protection, and re-file in US using international registration)
    • Trademark category to construct linkage.
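
The signal transformations listed above (vintage ratio, residualizing on size and sector) can be sketched as follows. This is a minimal illustration, not Wolfe Research's implementation: the function names, the toy numbers, and the choice of plain OLS with one dummy per sector are all assumptions.

```python
import numpy as np

def vintage_ratio(long_window_count, short_window_count):
    """Long-lookback count divided by short-lookback count (NaN if the latter is 0)."""
    if short_window_count == 0:
        return float("nan")
    return long_window_count / short_window_count

def residualize(signal, log_revenue, sectors):
    """OLS residual of the signal on log revenue plus one dummy per sector."""
    y = np.asarray(signal, dtype=float)
    size = np.asarray(log_revenue, dtype=float)
    labels = sorted(set(sectors))
    # The sector dummies jointly span the intercept, so no constant is added.
    dummies = np.array([[1.0 if s == lab else 0.0 for lab in labels] for s in sectors])
    X = np.column_stack([size, dummies])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Toy cross-section (made-up numbers): trademark growth for six firms.
growth = [0.30, 0.50, 0.20, 0.80, 0.70, 0.10]
log_rev = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5]
sector = ["tech", "tech", "tech", "retail", "retail", "retail"]
resid = residualize(growth, log_rev, sector)
```

By construction the residuals sum to zero within each sector and are uncorrelated with log revenue, so the remaining signal is neutral to both.
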
  • Wolfe Research | The Intangible Asset Premia

    • KC (Knowledge Capital): accumulate past R&D spending with discount (perpetual inventory method)

    • OC (Organizational Capital): accumulate 30% of past SG&A expenses.

    • IAI (Intangible Asset Intensity), KCI, OCI:

    • OCI+KCI combined has been consistently more relevant than the growth factor (as a risk factor).

    • Firms with higher IAI scores have higher momentum scores.
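
The perpetual inventory accumulation behind KC and OC can be sketched as below. The 30% SG&A share comes from the notes; the depreciation rates (15% and 20%) and the zero initial stock are placeholder assumptions, and the function name is hypothetical.

```python
def perpetual_inventory(flows, depreciation, share=1.0, initial=0.0):
    """Accumulate a capital stock: stock_t = (1 - depreciation) * stock_{t-1} + share * flow_t."""
    stock = initial
    path = []
    for flow in flows:
        stock = (1.0 - depreciation) * stock + share * flow
        path.append(stock)
    return path

# Toy annual flows (made-up numbers).
rd_spending = [100.0, 100.0, 100.0]
sga_expense = [200.0, 200.0, 200.0]

# Knowledge capital: all R&D is capitalized; 15% depreciation is an assumed rate.
kc = perpetual_inventory(rd_spending, depreciation=0.15)
# Organizational capital: 30% of SG&A is capitalized; 20% depreciation is an assumed rate.
oc = perpetual_inventory(sga_expense, depreciation=0.20, share=0.30)
```
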

  • Wolfe Research | Patent, Innovation, and Alpha

    • Innovation industry: all industries where at least 50% of firms have patent grants. High R&D spending (log R&D residualized by log market cap) is negatively correlated with future returns overall, but positively correlated with future returns within the innovation industry group.
    • Signals:
      • number of unique patent class / number of patent
      • Inward citation number, unique inward citation assignee; Outward citation number, unique outward citation assignee
      • Citation as linkage; patent class as linkage.
      • Patent maintenance.
  • Wolfe Research | Innovation Relevance

    • Total citations for all patents that firm $i$ has cited form firm $i$'s knowledge base (citations can be normalized within each patent class): $KB_{i,t} = \sum_{p} \mathbb{1}_{i,p}\, C_{p,t}$, where the indicator $\mathbb{1}_{i,p}$ denotes whether patent $p$ has been cited by firm $i$, and $C_{p,t}$ is the total number of citations of patent $p$ at time $t$. Technological obsolescence is then defined as the rate of change of $KB_{i,t}$ over a window $w$.

    • Investors disagree most on high innovation relevance firms, so we can assign a higher risk budget to these firms.
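
A small sketch of the knowledge-base citation count and its change over a window follows. The function names and toy data are hypothetical, and the simple percentage change used here is an assumption: the exact sign and scaling convention of the obsolescence measure is not recoverable from these notes.

```python
def knowledge_base_citations(cited_patents, citation_counts):
    """Sum the citation counts of every patent the firm has cited."""
    return sum(citation_counts.get(patent, 0) for patent in cited_patents)

def knowledge_base_change(cited_patents, counts_now, counts_before):
    """Percentage change in knowledge-base citations between two dates."""
    now = knowledge_base_citations(cited_patents, counts_now)
    before = knowledge_base_citations(cited_patents, counts_before)
    if before == 0:
        return float("nan")
    return (now - before) / before

# Toy data: the firm has cited patents p1 and p2; p3 is outside its knowledge base.
firm_cites = {"p1", "p2"}
counts_before = {"p1": 10, "p2": 10, "p3": 99}
counts_now = {"p1": 6, "p2": 9, "p3": 120}
change = knowledge_base_change(firm_cites, counts_now, counts_before)
```

A falling citation count on the firm's knowledge base (here a 25% drop) is the pattern the notes associate with technological obsolescence.
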

  • Lerner, Josh, and Amit Seru. "The use and misuse of patent data: Issues for finance and beyond." The Review of Financial Studies 35.6 (2022): 2667-2704.

    • Patent Data Challenges: Patent data is valuable but can be problematic due to truncation (not all patents are granted by the end of the study period) and changes in inventor composition over time. (Both patent grants and patent citations).
    • Biases in Aggregation: When patent data is aggregated at the firm level, biases can persist even after common adjustment methods are applied.
    • Correlation with Firm Characteristics: Patent and citation biases are correlated with firm characteristics such as size, market-to-book ratio, and R&D intensity.
  • Shu, Tao, Xuan Tian, and Xintong Zhan. "Patent quality, firm value, and investor underreaction: Evidence from patent examiner busyness." Journal of Financial Economics 143.3 (2022): 1043-1069.

    • This paper studies the causal effect of examiner busyness on patent quality and firm value. Using a broad set of patent quality measures, the authors find strong evidence that patents allowed by busy examiners exhibit significantly lower quality.
  • Bekkerman, Ron, Eliezer M. Fich, and Natalya V. Khimich. "The effect of innovation similarity on asset prices: Evidence from patents’ big data." The Review of Asset Pricing Studies 13.1 (2023): 99-145.

    • Technological linkage (II)
    • Methodology:

      • Patent text analysis: use external sources such as Wikipedia and professional dictionaries to establish the professional terminology for every patent in the sample. This process makes it possible to define two patents as similar if they share the same professional terminology.
      • Remove "boilerplates" (i.e., long lists of terminology used in patent texts to illustrate the invention generality).
      • Patent similarity: Two patents' distance is represented as a vector of their common terms weighted by a TFIDF (term frequency-inverse document frequency) variant.
      • Firm similarity: log of the sum of similar patent pairs discounted by the age of the newer patent in each pair and normalize it by the log of the product of the total number of patents for each firm in the pair.
    • Fundamental explanation: for R&D-to-total-assets and ROA, peer firms show both contemporaneous correlation and predictive power.

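    • The reason information diffuses slowly is limited investor attention, not that investors are entirely unaware of the linkage. Limited attention implies that investors will recognize the linkage in the future, so information continues to diffuse and linkage momentum arises. By contrast, complete unawareness would mean that investors never see the linkage at all, in which case there would be no linkage momentum effect.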
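
The term-weighted patent similarity in the methodology can be illustrated with plain TF-IDF and cosine similarity. This is a simplified stand-in: the paper uses a TF-IDF variant and a dictionary-based terminology extraction that are not reproduced here, and the toy term lists are made up.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each tokenized document to a {term: tf * idf} dictionary."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy patents, already reduced to professional terms.
patents = [
    ["battery", "anode", "lithium"],
    ["battery", "cathode", "lithium"],
    ["antibody", "protein", "assay"],
]
vecs = tfidf_vectors(patents)
sim_battery = cosine(vecs[0], vecs[1])
sim_unrelated = cosine(vecs[0], vecs[2])
```

The two battery patents share weighted terms and get a positive similarity, while the biotech patent shares none and scores zero.
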
  • Narrative Momentum

    • Data

      • The collected news articles are classified into media reservoirs: General, Corporate, FX, and Country Equity.
      • Articles are classified into 347 narratives: including 53 pre-specified Journal of Economic Literature (JEL) narratives and around 300 additional narratives. The 347 narratives are classified into 14 narrative tags including Geopolitics, Macro, Micro, etc. The narrative series are provided by MKT MediaStats, LLC.
    • Construction

      • Narrative Intensities: Negative (positive) intensity is the fraction of negative (positive) sentiment articles pertaining to a narrative out of the overall discussion, with a value in [0,1].
      • Narrative market beta: whether narratives can explain excess market returns, estimated with univariate regressions of one-month market excess returns on contemporaneous one-month intensity changes.
      • Stock-level narrative betas: univariate regressions of stock returns on intensity changes.
    • Conclusion:

      • Financial analysts also tend to underreact to narrative-sensitive stocks
      • Narrative momentum is different from price momentum.
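
The narrative beta construction amounts to a univariate OLS slope of returns on intensity changes. A minimal sketch with made-up data (the function name and the toy series are assumptions):

```python
def univariate_beta(returns, intensity_changes):
    """OLS slope from regressing returns on intensity changes (with intercept)."""
    n = len(returns)
    mean_x = sum(intensity_changes) / n
    mean_y = sum(returns) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(intensity_changes, returns))
    var = sum((x - mean_x) ** 2 for x in intensity_changes)
    return cov / var

# Toy monthly negative-intensity series for one narrative, values in [0, 1].
intensity = [0.10, 0.12, 0.11, 0.15, 0.13]
changes = [later - earlier for earlier, later in zip(intensity, intensity[1:])]
# Toy returns built to load on the changes with a slope of -0.5.
stock_returns = [0.01 - 0.5 * change for change in changes]
beta = univariate_beta(stock_returns, changes)
```
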
  • Goldman, Eitan, Jordan Martel, and Jan Schneemeier. "A theory of financial media." Journal of Financial Economics 145.1 (2022): 239-258.

    • Firms are more likely to manipulate their announcements when media coverage is more extensive.
    • Negative news is more likely to be reported than positive news.
    • The presence of financial journalists can lead to more efficient pricing.
  • Froot, Kenneth, et al. "Predicting Performance Using Consumer Big Data." Journal of Portfolio Management 48.3 (2022).

    • Proxies for Corporate Sales: The authors construct three proxies for real-time corporate sales using distinct information sources: in-store foot traffic (IN-STORE), web traffic to companies' websites (WEB), and consumers' interest level in corporate brands and products (BRAND).
    • Predict SUE, SUR, Analyst forecast error.
    • Check analyst coverage, media exposure, and market attention to see whether the market consensus really matters, i.e., whether to trade the surprise relative to consensus or just the quarterly change.
  • Wolfe Research | Global shipping and supply chain alpha

    • S&P Global Panjiva Supply Chain Intelligence (US, Brazil, India, Mexico); FactSet (US only)
    • BOL: bill of lading form.
    • Features:
      • shipping volume (predict sales), (level/growth)
      • supplier, product, country of origin (diversity)
      • supply-chain network (company's position in supply chain)
      • shipping network momentum
  • Wolfe Research | Alpha insights from global job postings data

    • RavenPack dataset (most likely based on LinkUp)
    • Term construction: job postings level/growth (we can use SOC median salary as importance weight); technical skill intensity, level/growth, uniqueness in skills, adoption of new technical skills, skill importance (TF-IDF).
  • DB Research | Macro and Micro JobEconomics

    • LinkUp job posting dataset: scraped from company websites. Data since 2007, description data since 2014. Includes SOC job classification, geolocation data, and technical skills data. Most coverage is in the USA.
    • Term construction similar for micro.
    • Macro: use to predict employment, PMI index, CPI, retail, consumer sentiment.

2. Analyst

  • Analyst Forecast Bundling Intensity and Earnings Surprise

    • Barth, Mary E., et al. "Analyst Forecast Bundling Intensity and Earnings Surprise." Available at SSRN 4839739 (2024).

    • The authors explore how financial analysts convey information about a company's earnings without necessarily making full revisions to their earnings forecasts. They achieve this by increasing what they term 'bundling intensity,' which refers to the extent to which an analyst's report that includes an earnings forecast revision also includes revisions to price targets and/or recommendations that have the same direction as the earnings forecast revision.

    • The researchers have developed a measure called BF_Score at the firm level to quantify bundling intensity. Their findings suggest that BF_Score is a significant predictor of earnings surprises based on analyst forecasts. These surprises often result from biases in consensus earnings forecasts, which are influenced by the information analysts communicate through bundling intensity. In the definition of BF_Score, TP denotes the target price.

    • The use of bundling and the predictive power of BF_Score increase during times of higher macroeconomic uncertainty, when analysts have greater incentives to avoid bold revisions to their earnings forecasts.

  • He, Jie Jack, and Xuan Tian. "The dark side of analyst coverage: The case of innovation." Journal of financial economics 109.3 (2013): 856-878.

    • Firms covered by a larger number of analysts generate fewer patents and patents with lower impact.
    • The evidence is consistent with the hypothesis that analysts exert too much pressure on managers to meet short-term goals, impeding firms' investment in long-term innovative projects.

3. Anomalies

4. Asset Pricing

5. Behavioral Finance

6. Crypto

7. Event

8. Linkage

  • Chen, Xin, et al. "Attention spillover in asset pricing." The Journal of Finance 78.6 (2023): 3515-3559.

    • The paper leverages a unique feature of stock display on trading platforms in China, where the order of stock display is determined by the stock's listing code. This feature creates an attention spillover effect, where investors are more likely to notice and trade stocks with listing codes adjacent to those of stocks they currently hold.
    • The authors propose that overconfident investors, following positive investment experiences, are likely to increase their trading activities and are more likely to direct their attention to neighboring stocks on the display.
  • Jin, Zuben. "Business aspects in focus, investor underreaction and return predictability." Journal of Corporate Finance 84 (2024): 102525.

    • Conference call transcripts -> topic model -> firm similarity -> linkage signals
  • Zhang, Zhiyu, et al. "Uncovering interfirm links through textual topic similarity: A comomentum analysis in financial markets." The British Accounting Review (2024): 101446.

    • cross-firm similarity measure based on the various topics extracted from Management Discussion and Analysis texts
  • Feng, Jian, et al. "Economic Links from Bonds and Cross-Stock Return Predictability." Available at SSRN 4047776 (2022).

    • Main idea: linkage from bond market credit-rating comovements.
    • This study identifies a "market segmentation" effect between the equity and bond markets, showing that information from bond markets is often not incorporated promptly by equity market investors.
    • Firms are connected through "credit-rating comovements," defined as instances when two firms' bond ratings are updated in the same direction within a ±10-day window.
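
The credit-rating comovement link can be sketched as a pairwise check on rating actions. The tuple layout, the +1/-1 direction coding, and the use of calendar days for the ±10-day window are assumptions made for illustration.

```python
from datetime import date
from itertools import combinations

def rating_comovement_links(actions, window_days=10):
    """Link two firms whose ratings moved in the same direction within the window.

    `actions` is a list of (firm, action_date, direction) tuples, where
    direction is +1 for an upgrade and -1 for a downgrade.
    """
    links = set()
    for (firm_a, day_a, dir_a), (firm_b, day_b, dir_b) in combinations(actions, 2):
        same_direction = dir_a == dir_b
        close_in_time = abs((day_a - day_b).days) <= window_days
        if firm_a != firm_b and same_direction and close_in_time:
            links.add(frozenset((firm_a, firm_b)))
    return links

# Toy rating actions (made-up firms and dates).
actions = [
    ("A", date(2022, 1, 5), +1),
    ("B", date(2022, 1, 12), +1),
    ("C", date(2022, 1, 10), -1),
    ("D", date(2022, 3, 1), +1),
]
links = rating_comovement_links(actions)
```

Only A and B are linked: C moved in the opposite direction and D was upgraded outside the window.
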
  • Chen, Xin, and Huaixin Wang. "News Links and Predictable Returns." Available at SSRN 4458612 (2023).

    • Main idea: news-implied linkages in China where firms are connected based on shared media coverage.
    • News-based links were established by identifying instances where two firms were mentioned in the same article within a 12-month window.
    • The authors perform robustness checks to validate these results, including a placebo test using shared media platforms, demonstrating that only specific news stories—not general media coverage—predict future returns.
    • They explore linkage complexity, showing stronger predictability when linkages are more complex (e.g., higher numbers of shared stories or connections).
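
News-implied links and a linked-firm return signal can be sketched as follows. The articles are assumed to be already filtered to the 12-month window, and weighting peers by the number of shared stories is an illustrative choice rather than the paper's stated scheme.

```python
from collections import Counter
from itertools import combinations

def co_mention_counts(articles):
    """Count, for each firm pair, the articles that mention both firms."""
    counts = Counter()
    for firms in articles:
        for pair in combinations(sorted(set(firms)), 2):
            counts[pair] += 1
    return counts

def linked_return(firm, counts, returns):
    """Average return of linked firms, weighted by shared-story counts."""
    weighted, total = 0.0, 0
    for (a, b), n_stories in counts.items():
        if firm in (a, b):
            peer = b if a == firm else a
            if peer in returns:
                weighted += n_stories * returns[peer]
                total += n_stories
    return weighted / total if total else float("nan")

# Toy articles, each reduced to the list of firms it mentions.
articles = [["A", "B"], ["A", "B", "C"], ["C"], ["B", "A"]]
counts = co_mention_counts(articles)
signal_a = linked_return("A", counts, {"B": 0.04, "C": -0.02})
```
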
  • Wang, Huaixin. "The Day and Night Tale of Momentum Spillover Effects." Available at SSRN 4179413 (2022).

    • Main idea: high overnight returns for peer stocks predict elevated opening prices for focal stocks, followed by intraday reversals, while peer intraday returns consistently predict positive future intraday returns for focal stocks.
    • Retail investors trade primarily on overnight information due to news salience, while professional investors trade intraday and correct the resulting mispricing.
    • Predictable patterns arise not only from underreaction but also from a systematic interplay between the different investor types: retail-driven overnight price distortions are followed by intraday reversals driven by professionals.
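
The overnight/intraday split underlying these results is a simple decomposition of the close-to-close return. A minimal sketch with made-up prices (function names are hypothetical, and the equal-weighted peer average is an assumption):

```python
def decompose(prev_close, open_price, close):
    """Split a close-to-close return into overnight and intraday legs."""
    overnight = open_price / prev_close - 1.0
    intraday = close / open_price - 1.0
    return overnight, intraday

def peer_average(values):
    """Equal-weighted average across peer stocks."""
    return sum(values) / len(values)

# Toy peers: (previous close, open, close).
peers = [(100.0, 102.0, 101.0), (50.0, 50.5, 50.8)]
legs = [decompose(*prices) for prices in peers]
peer_overnight = peer_average([overnight for overnight, _ in legs])
peer_intraday = peer_average([intraday for _, intraday in legs])
overnight_0, intraday_0 = legs[0]
```

The two legs compound back to the close-to-close return, so `(1 + overnight) * (1 + intraday)` equals `close / prev_close`.
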
  • Data-driven graph learning

    • Pu et al. (2023) Network Momentum across Asset Classes

    • features: 8 in total, MOM and MACD.

    • From Kalofolias (AISTATS 2016), How to learn a graph from smooth signals, we can define a convex optimization problem:

      $$\min_{A_t}\ \operatorname{tr}\left(V_t^\top (D_t - A_t) V_t\right) - \alpha \mathbf{1}^\top \log(A_t \mathbf{1}) + \beta \lVert A_t \rVert_F^2 \quad \text{s.t. } A_{t,ij} = A_{t,ji},\ A_{t,ij} \ge 0\ \ \forall i \ne j,$$

      where $V_t$ is the feature matrix with a $\delta$-day lookback window, $D_t$ is a diagonal matrix with $D_{t,ii} = \sum_j A_{t,ij}$, and $\alpha, \beta > 0$ are regularization weights. The graph adjacency matrix $A_t$ we want to estimate represents the network at day $t$ for constructing network momentum, with the $(i,j)$-th entry measuring the strength of similarity of individual momentum between asset $i$ and asset $j$. In the objective function, the first trace term measures the spectral variations of $V_t$ on the learned graph adjacency matrix $A_t$, encouraging connections between nodes with similar features. It is derived from Laplacian smoothness under the mild assumption that each column of $V_t$ is a low-pass graph signal.

    • The above is derived from: Consider a matrix $X \in \mathbb{R}^{m \times n}$, where each row $x_i \in \mathbb{R}^n$ resides on one of $m$ nodes of an undirected graph $G$. In this way, each of the $n$ columns of $X$ can be seen as a signal on the same graph. A simple assumption about data residing on graphs, and also the most widely used one, is that it changes smoothly between connected nodes. An easy way to quantify how smooth a set of vectors $x_1, \dots, x_m$ is on a given weighted undirected graph is through the function

      $$\frac{1}{2} \sum_{i,j} W_{ij} \lVert x_i - x_j \rVert^2 = \operatorname{tr}(X^\top L X),$$

      where $W_{ij}$ denotes the weight of the edge between nodes $i$ and $j$ and $L = D - W$ is the graph Laplacian, $D$ being the diagonal weighted degree matrix with $D_{ii} = \sum_j W_{ij}$. In words, if two vectors $x_i$ and $x_j$ from a smooth set reside on two well connected nodes (i.e. $W_{ij}$ is large), they are expected to have a small distance $\lVert x_i - x_j \rVert$ so that $\operatorname{tr}(X^\top L X)$ is small.

    • In the empirical analysis, distinct graphs $A_t^{(k)}$, $k = 1, \dots, 5$, learned from five different lookback windows $\delta_1, \dots, \delta_5$ (in trading days) are combined as follows: (graph ensemble) $\bar{A}_t = \frac{1}{5} \sum_{k=1}^{5} A_t^{(k)}$.

      To mitigate the effects of scale differences in constructing network momentum, which may arise due to the difference in the number of connections certain assets have - with some connected to numerous other assets and others only to a few - we also apply a graph normalisation as follows:

      (graph normalisation) $\tilde{A}_t = \bar{D}_t^{-1/2} \bar{A}_t \bar{D}_t^{-1/2}$, where $\bar{D}_t$ is a diagonal matrix with $\bar{D}_{t,ii} = \sum_j \bar{A}_{t,ij}$.

    • Another way to solve the optimization problem above is described in Pu et al. (2023) Learning to Learn Financial Networks for Optimising Momentum Strategies.

      The L2G algorithm reformulates optimisation-based graph learning into an unrolling neural network. By leveraging the inherent modularity of neural networks, where different layers can be easily stacked for forward propagation, the authors propose to incorporate an additional layer into L2G for directly constructing network momentum.

      An upgraded version of the L2G algorithm is described in Pu et al. (NIPS 2021) Learning to learn graph topologies.
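
The Laplacian smoothness identity and the ensemble-plus-normalisation step described in this block can be checked numerically. The sketch below does not learn a graph; it takes toy adjacency matrices as given. The symmetric degree normalisation and the final "adjacency times momentum" propagation step are assumptions about how the pieces fit together, and all names are hypothetical.

```python
import numpy as np

def smoothness_trace(X, W):
    """tr(X^T L X) with graph Laplacian L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(X.T @ L @ X))

def smoothness_pairwise(X, W):
    """0.5 * sum_ij W_ij * ||x_i - x_j||^2, the edge-weighted distance form."""
    diffs = X[:, None, :] - X[None, :, :]
    return float(0.5 * np.sum(W * np.sum(diffs ** 2, axis=2)))

def ensemble_and_normalise(graphs):
    """Average several adjacency matrices, then scale by D^{-1/2} on both sides."""
    A_bar = sum(np.asarray(g, dtype=float) for g in graphs) / len(graphs)
    degree = A_bar.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0          # isolated nodes keep a zero row and column
    inv_sqrt[connected] = degree[connected] ** -0.5
    return inv_sqrt[:, None] * A_bar * inv_sqrt[None, :]

# Toy path graph on three assets and toy node features (rows = assets).
W = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
X = np.array([[1.0, 0.5], [1.2, 0.4], [3.0, -1.0]])
trace_form = smoothness_trace(X, W)
pairwise_form = smoothness_pairwise(X, W)

A_tilde = ensemble_and_normalise([W, W])
momentum = np.array([0.02, -0.01, 0.03])
network_momentum = A_tilde @ momentum   # each asset aggregates its neighbours
```

Both smoothness forms agree, which is what makes the trace term usable as a penalty on graphs that connect assets with dissimilar features.
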

  • de Bodt, Eric, B. Espen Eckbo, and Richard Roll. "Competition shocks, rival reactions, and stock return comovement." Journal of Financial and Quantitative Analysis, forthcoming, Tuck School of Business Working Paper 3218544 (2024).

    • Methodology: The authors use a novel test statistic to discriminate between two hypotheses about how firms react: increased product differentiation, leading to decreased return comovement (H1), or increased standardization and cost-cutting, leading to increased return comovement (H2). They exploit changes in stock return comovement following tariff cuts to infer strategic reactions.
    • Empirical Findings: The study finds that tariff cuts lead to a significant increase in return comovement, particularly among "followers" within an industry, suggesting a move towards greater standardization and cost-cutting strategies (H2) rather than increased product differentiation (H1).
    • Leader definition: sales-based market shares, financial ratios and R&D.
  • Eisdorfer, Assaf, et al. "Competition links and stock returns." The Review of Financial Studies 35.9 (2022): 4300-4340.

    • Consider a firm’s competitiveness based on the manner by which other firms mention it on their 10-K filings.
    • C-Rank: 10-K cross-mention graph + PageRank algorithm -> preferred measure of firm-level competition rank
    • A firm’s effective competition status stems mostly from competing with companies outside of its sector.
    • C-Rank might identify an element of a firm’s risk profile. If the firm is “targeted” by strong competitors, it can increase the uncertainty about the firm’s future performance and value, then the outperformance of high C-Rank firms might manifest compensation for risk.
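
The PageRank step behind a cross-mention ranking can be sketched with power iteration. This illustrates the generic algorithm only: the edge direction (firm i mentions firm j), the damping factor of 0.85, and the toy graph are assumptions, not the paper's exact C-Rank specification.

```python
import numpy as np

def pagerank(adjacency, damping=0.85, tol=1e-12, max_iter=1000):
    """Power-iteration PageRank; adjacency[i, j] = 1 if firm i mentions firm j."""
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=1)
    # Row-stochastic transition matrix; firms that mention nobody spread evenly.
    transition = np.full((n, n), 1.0 / n)
    has_out = out_degree > 0
    transition[has_out] = adjacency[has_out] / out_degree[has_out, None]
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        updated = (1.0 - damping) / n + damping * (transition.T @ rank)
        if np.abs(updated - rank).sum() < tol:
            return updated
        rank = updated
    return rank

# Toy 10-K mentions: firms 0 and 1 mention firm 2, firm 2 mentions firm 0.
mentions = [[0, 0, 1],
            [0, 0, 1],
            [1, 0, 0]]
rank = pagerank(mentions)
```

Firm 2, mentioned by both other firms, ranks highest; firm 1, mentioned by nobody, receives only the baseline share.
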
  • Yamamoto, Rei, Naoya Kawadai, and Hiroki Miyahara. "Momentum information propagation through global supply chain networks." Journal of Portfolio Management 47.8 (2021): 197-211.

    • Uses FactSet supply chain data

    • Customer Momentum: Here we assume that company $i$ has $N$ customers, and let $w_{ij}$ be the sales ratio of customer $j$; thus, customer momentum is defined as the sales-weighted sum of customer returns: $\text{CMOM}_i = \sum_{j=1}^{N} w_{ij} r_j$, where $r_j$ is the return (momentum) of customer $j$.

    • Weighting Method Based on Network Centrality: almost all of the sales ratios are unavailable. Thus, we use network centrality in network theory as the weight of customer momentum. Let $b_{ij}$ be the edge betweenness centrality between supplier $i$ and customer $j$.

    • Multilayer Customer Information
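
Customer momentum as a weighted average of customer returns can be sketched as below. The weights may be sales ratios or, when those are missing, centrality scores computed elsewhere; normalising the weights to sum to one is an assumption, and the edge betweenness values are taken as given rather than computed.

```python
def customer_momentum(customer_returns, weights):
    """Weighted average of customer returns; weights are rescaled to sum to one."""
    usable = {c: w for c, w in weights.items() if c in customer_returns and w > 0}
    total = sum(usable.values())
    if total == 0:
        return float("nan")
    return sum(w * customer_returns[c] for c, w in usable.items()) / total

# Toy supplier with two customers (made-up numbers).
returns = {"X": 0.10, "Y": -0.02}
sales_ratio = {"X": 0.3, "Y": 0.1}   # share of the supplier's sales, where disclosed
centrality = {"X": 3.0, "Y": 1.0}    # hypothetical edge betweenness scores

cmom_sales = customer_momentum(returns, sales_ratio)
cmom_centrality = customer_momentum(returns, centrality)
```

In this toy case the two weighting schemes have the same relative weights (3:1) and therefore produce the same signal.
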

  • Earnings Propagation Effects through the Global Supply Chain Network

    • In this paper, we separate the regression model that examines propagation from customers and the regression model that examines propagation from suppliers and then estimate the coefficients of the following two regression models:

  • Deep GNN methods

  • Zhang, Chao, et al. "Graph-based methods for forecasting realized covariances." Journal of Financial Econometrics (2024): nbae026.

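    • Previous HAR-DRD method: decomposing the realized covariance matrix $H_t$ into the diagonal matrix of realized volatilities and the correlation matrix:

      $$H_t = D_t R_t D_t,$$

      where $D_t$ is the diagonal matrix with the square roots of the realized variances $v_t$ on the main diagonal, that is, $D_t = \operatorname{diag}(\sqrt{v_{1,t}}, \dots, \sqrt{v_{N,t}})$, and $v_t = \operatorname{diag}(H_t)$. $R_t$ is the correlation matrix. We can estimate the above using

      $$v_t = \alpha + \beta_d v_{t-1} + \beta_w v_{t-1}^{(w)} + \beta_m v_{t-1}^{(m)} + u_t,$$

      $$x_t = a + b_d x_{t-1} + b_w x_{t-1}^{(w)} + b_m x_{t-1}^{(m)} + e_t,$$

      where $x_t$ is the $N(N-1)/2$-dimensional vectorized version of the lower triangular part of $R_t$ and $x_{t-1}^{(w)}$ (resp. $x_{t-1}^{(m)}$) is computed as $\frac{1}{5}\sum_{k=1}^{5} x_{t-k}$ (resp. $\frac{1}{22}\sum_{k=1}^{22} x_{t-k}$); $v_{t-1}^{(w)}$ and $v_{t-1}^{(m)}$ are defined analogously.

    • Using graph information, we can estimate variance and correlation as follows,

      $$v_t = \alpha + \beta_d v_{t-1} + \beta_w v_{t-1}^{(w)} + \beta_m v_{t-1}^{(m)} + \gamma_d W v_{t-1} + \gamma_w W v_{t-1}^{(w)} + \gamma_m W v_{t-1}^{(m)} + u_t,$$

      where $W = O^{-1/2} A O^{-1/2}$ is the normalized adjacency matrix. Specifically, $A$ is an $N \times N$ adjacency matrix indicating the connections between assets with diagonal elements as 0, and $O = \operatorname{diag}(n_1, \dots, n_N)$, where $n_i = \sum_j A_{ij}$. Therefore $W v_{t-1}$, $W v_{t-1}^{(w)}$, $W v_{t-1}^{(m)}$ represent the neighborhood aggregation over daily, weekly, and monthly horizons. $\gamma_d, \gamma_w, \gamma_m$ represent the effects from connected neighbors over different horizons. Moreover, we apply the idea of the graph effect to modeling correlations according to the model

      $$x_t = a + b_d x_{t-1} + b_w x_{t-1}^{(w)} + b_m x_{t-1}^{(m)} + c_d W_L x_{t-1} + c_w W_L x_{t-1}^{(w)} + c_m W_L x_{t-1}^{(m)} + e_t,$$

      where $W_L = O_L^{-1/2} A_L O_L^{-1/2}$ is the normalized adjacency matrix. Specifically, $A_L$ is an $N^* \times N^*$ adjacency matrix, with $N^* = N(N-1)/2$, indicating the connections between pairwise correlations with diagonal elements as 0, and $O_L = \operatorname{diag}(n_1, \dots, n_{N^*})$, where $n_i = \sum_j A_{L,ij}$.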

    • Choices of graphs

      • Variance: Complete, Sector, Graph-Lasso
      • Correlation: Complete, Line graph

      Given a graph $G$, its line graph $L(G)$ is a graph such that

      • each node of $L(G)$ represents an edge of $G$;
      • two nodes of $L(G)$ are adjacent if and only if their corresponding edges share a common endpoint in $G$.
    • Extension: Graph neural networks for forecasting multivariate realized volatility with spillover effects
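
The line-graph construction and the neighbourhood aggregation term can be sketched for the complete graph on a handful of assets. The symmetric degree normalisation and all names are assumptions, and the correlation values are made up.

```python
import numpy as np
from itertools import combinations

def normalized_adjacency(A):
    """Scale an adjacency matrix by degree^{-1/2} on both sides."""
    A = np.asarray(A, dtype=float)
    degree = A.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    connected = degree > 0
    inv_sqrt[connected] = degree[connected] ** -0.5
    return inv_sqrt[:, None] * A * inv_sqrt[None, :]

def line_graph_of_complete_graph(n_assets):
    """Nodes are asset pairs (i, j), i < j; two pairs are adjacent if they share an asset."""
    pairs = list(combinations(range(n_assets), 2))
    m = len(pairs)
    A = np.zeros((m, m))
    for a, b in combinations(range(m), 2):
        if set(pairs[a]) & set(pairs[b]):
            A[a, b] = A[b, a] = 1.0
    return pairs, A

pairs, A_line = line_graph_of_complete_graph(4)
W_line = normalized_adjacency(A_line)

# Toy pairwise correlations, ordered like `pairs`.
corr = np.array([0.5, 0.2, 0.1, 0.4, 0.3, 0.6])
neighbour_term = W_line @ corr   # aggregation over correlations sharing an asset
```

With four assets every pair shares an asset with exactly four other pairs, so the normalized aggregation reduces to the average correlation of those four neighbours.
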

  • He, Wei, et al. "Similar stocks." Available at SSRN 3815595 (2021).

    • Similarity between two stocks is measured by the distance between their characteristics such as price, size, book-to-market, operating profitability, and investment-to-assets.
    • Retail investor behavior, including attention spillover and categorical trading, plays a significant role. Retail order imbalance increases for high similar-stock return portfolios, reflecting stronger demand from individual investors.
    • The similarity effect is stronger among firms with low institutional ownership, suggesting retail investors are primary drivers.
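
The similar-stock return can be sketched as the average return of a stock's nearest neighbours in characteristic space. Z-scoring each characteristic, using Euclidean distance, and picking the k nearest stocks are illustrative assumptions, and the numbers are made up.

```python
import numpy as np

def similar_stock_return(characteristics, returns, k=1):
    """Average return of each stock's k nearest neighbours in characteristic space."""
    X = np.asarray(characteristics, dtype=float)
    r = np.asarray(returns, dtype=float)
    # Standardize so that no single characteristic dominates the distance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    dist = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)          # a stock is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    return r[nearest].mean(axis=1)

# Toy cross-section: two characteristics (e.g. size and book-to-market) per stock.
chars = [[1.0, 1.0], [1.1, 1.0], [5.0, 5.0], [5.1, 5.0]]
rets = [0.01, 0.03, -0.02, -0.04]
similar = similar_stock_return(chars, rets, k=1)
```
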
  • Guo, Li, et al. "Joint news, attention spillover, and market returns." arXiv preprint arXiv:1703.02715 (2017).

    • Joint news has a higher degree of attention spillover than self-mentioned news, as measured by the increase in Google search activity and EDGAR filings for connected firms.
    • Define the degree of investor attention spillover to a given firm i, from firms connected to i through joint coverage, as the centrality-weighted (node centrality) sum of abnormal joint news coverage across the connected firms.
    • Both JointNews and Market-JointNews (its aggregation to the market level) strongly and negatively predict the one-month-ahead (market) return. Joint news increases overall attention to the firms, resulting in high valuations and low future stock returns.
  • Huang, Shiyang, Tse-Chun Lin, and Hong Xiang. "Psychological barrier and cross-firm return predictability." Journal of Financial Economics 142.1 (2021): 338-356.

    • When a firm’s economically linked firms have good (bad) news, and its stock price is near (far from) the 52-week high, it has an underreaction to the good (bad) news about economically linked firms.
    • The nearness to the 52-week high significantly moderates the predictability of supplier returns based on customer returns. Long-term returns for firms close to their 52-week high are higher when customers exhibit strong performance.
  • Menzly, Lior, and Oguzhan Ozbas. "Market segmentation and cross‐predictability of returns." The Journal of Finance 65.4 (2010): 1555-1580.

    • The extent of cross-predictability is negatively related to the level of information in the market, measured by the level of analyst coverage or by the level of institutional ownership.
    • Analyst coverage measure: an analyst is considered actively engaged in a stock for a 12-month period after making an EPS forecast on that stock.
    • Institutional ownership measure: sum the holdings of institutional investors in the stock at a given quarter-end report date and then divide by the number of outstanding shares. An alternative proxy is the number of different institutional investors in the stock.

9. Machine Learning

10. Macro

11. Microstructure

12. Miscellaneous

13. Momentum and Factor Timing

14. NLP

  • Wolfe Research | Text mining unstructured corporate filing data
    • EDGAR 10-K and 10-Q filings (by section comparison)
    • Features:
      • Sentiment and tone analysis
      • changes in sentiment
      • distance measures (YoY embedding/BoW)
  • Learning Fundamentals from Text
    • Use an attention mechanism to weigh the importance of different paragraphs in a document, focusing on those that are most relevant to market reactions. The document-level aggregated vector is then used to predict the target variable, which is the direction of stock returns around the filing date.

15. Portfolio Construction

16. General Knowledge
