How can I know if my model is overfitting?
In the paper “Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance,” Marcos López de Prado et al. examine the problem of overfitting in financial data analysis and its implications for investors and financial practitioners.
The authors begin by emphasizing the importance of data analysis in finance, given that investment decisions are often based on analysis of historical data. However, they point out that such analyses can easily be distorted by overfitting: fitting a model so closely to the historical data that it becomes too specific and loses its predictive power on new data.
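One common way to check for this is to compare a model's error on the data it was fitted to with its error on data it has never seen. The sketch below is only an illustration and is not taken from the paper: the data-generating process (y = 2x plus noise), the function names, and the "memorizer" model are all assumptions chosen to make the gap visible.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng):
    # Hypothetical data-generating process: y = 2x + Gaussian noise.
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def compare_models(n=300, seed=0):
    rng = random.Random(seed)
    train_x, train_y = make_data(n, rng)
    test_x, test_y = make_data(n, rng)

    # Simple model: a straight line fitted to the training data.
    intercept, slope = fit_line(train_x, train_y)
    simple = lambda x: intercept + slope * x

    # "Too specific" model: memorize the training set and return the
    # target of the nearest training point (1-nearest-neighbor).
    train = list(zip(train_x, train_y))
    memorizer = lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

    # Each entry is (training error, held-out error).
    return {
        "simple": (mse(simple, train_x, train_y), mse(simple, test_x, test_y)),
        "memorizer": (mse(memorizer, train_x, train_y), mse(memorizer, test_x, test_y)),
    }

if __name__ == "__main__":
    for name, (train_err, test_err) in compare_models().items():
        print(f"{name:10s} train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

The memorizer reproduces its training data perfectly, so its training error is zero, yet its error on the held-out data is clearly larger. A large gap between the two numbers is the usual warning sign of overfitting; the straight line, by contrast, should show similar errors on both sets.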
To illustrate the problem of overfitting, the authors provide an example of a hypothetical investment strategy that has been backtested on historical data and found to work well.
However, its performance degrades when the strategy is tested on new data, which suggests that it was overfitted to past data. The authors claim that such strategies are prevalent in the financial industry and are often used to mislead investors.
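The effect described above can be reproduced with a small simulation. This is a sketch under stated assumptions, not the authors' own experiment: every "strategy" here is pure random noise with zero expected return, the risk-free rate is taken as zero, and the number of strategies, the sample lengths, and the function names are hypothetical choices.

```python
import math
import random
import statistics

TRADING_DAYS = 252  # assumed annualization factor

def sharpe(returns):
    """Annualized Sharpe ratio of daily returns (risk-free rate assumed zero)."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(TRADING_DAYS)

def best_backtest(n_strategies, n_days, rng):
    """Select the strategy with the best in-sample Sharpe ratio.

    Returns (in-sample Sharpe, out-of-sample Sharpe) of that strategy.
    """
    best_in, best_out = -math.inf, 0.0
    for _ in range(n_strategies):
        # Each strategy is noise: there is no real edge to discover.
        daily = [rng.gauss(0.0, 0.01) for _ in range(2 * n_days)]
        in_sample = sharpe(daily[:n_days])
        out_sample = sharpe(daily[n_days:])
        if in_sample > best_in:
            best_in, best_out = in_sample, out_sample
    return best_in, best_out

def average_gap(trials=10, n_strategies=100, n_days=126, seed=0):
    """Average the selected strategy's Sharpe ratios over several trials."""
    rng = random.Random(seed)
    results = [best_backtest(n_strategies, n_days, rng) for _ in range(trials)]
    mean_in = statistics.mean([r[0] for r in results])
    mean_out = statistics.mean([r[1] for r in results])
    return mean_in, mean_out

if __name__ == "__main__":
    mean_in, mean_out = average_gap()
    print(f"best in-sample Sharpe (avg):      {mean_in:.2f}")
    print(f"same strategy out-of-sample (avg): {mean_out:.2f}")
```

Because the winner is chosen after looking at many candidates, its in-sample Sharpe ratio looks impressive even though no strategy has any real skill, while the same strategy's out-of-sample Sharpe ratio should hover around zero. The more configurations are tried, the better the best backtest looks, which is why a good backtest alone says little.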
The authors also propose a new methodology called the ‘Multiple Hypothesis Testing Framework’, which involves testing a large number of hypotheses and correcting for multiple comparisons so that apparently significant results are not simply the product of chance. The authors use several examples to demonstrate the effectiveness of this approach, arguing that it helps mitigate overfitting problems in financial data analysis.
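To give a feel for what correcting for multiple comparisons means in practice, the sketch below applies two standard textbook corrections, Bonferroni and Holm, to a list of p-values. These are generic procedures swapped in for illustration and are not claimed to be the authors' own framework; the function names and the example p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject hypothesis i only if p_i <= alpha / m, where m is the number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm(p_values, alpha=0.05):
    """Holm step-down procedure: less conservative than Bonferroni.

    P-values are visited from smallest to largest; the k-th smallest
    (k = 0, 1, ...) is compared with alpha / (m - k), and testing stops
    at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # all larger p-values are not rejected either
    return reject

if __name__ == "__main__":
    # Hypothetical p-values from four backtested strategy variants.
    p_values = [0.04, 0.001, 0.3, 0.015]
    print("uncorrected:", [p <= 0.05 for p in p_values])
    print("bonferroni: ", bonferroni(p_values))
    print("holm:       ", holm(p_values))
```

Without correction, three of the four hypothetical variants look significant at the 5% level. After correction, fewer survive, which reflects the fact that testing many variants makes some low p-values likely by chance alone.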
The authors also emphasize the need for greater transparency in the financial industry, especially regarding the disclosure of backtesting methods and results. They argue that investors need to be better informed about the limitations of backtesting and the potential risks of investing in strategies based on backtested data.
Overall, this paper makes a valuable contribution to the literature on financial data analysis and overfitting issues.
The authors articulate this problem and its implications for investors and financial practitioners, and propose new methodologies to mitigate the problem. However, this paper could have benefited from a more detailed discussion of the limitations of the proposed methodology and a more comprehensive review of the existing literature on this topic.
The authors’ focus on the need for greater transparency in the financial industry is particularly noteworthy. By highlighting the potential risks of investing in strategies based on backtested data, the authors effectively advocate for greater accountability and ethical behavior in the industry.
This is an important message for all financial practitioners to keep in mind, and the methodology proposed by the authors provides useful tools for achieving this goal.
In conclusion, “Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance” is a valuable contribution to the literature on financial data analysis and overfitting issues. The authors articulate the issues, propose new methodologies to mitigate them, and highlight the need for greater transparency in the financial industry. Although the paper could have benefited from a more detailed discussion of the limitations of the proposed methodology, it is a rich and informative read for those interested in the intersection of finance and data analysis.
David H. Bailey, Jonathan Borwein, Marcos López de Prado, Qiji Jim Zhu. “Pseudo-Mathematics and Financial Charlatanism: The Effects of Backtest Overfitting on Out-of-Sample Performance.” SSRN.
Written by Li Jia Tong