Does Trend Following Have an Edge? — Why a Strategy That "Enters Late" Survives

Last updated at 2026-05-28. Posted at 2026-05-26.

Conclusion First

In this USDJPY experiment, no clear edge was confirmed for the simple MA cross on the 60-minute timeframe.

On the 240-minute timeframe, a trend-following-like right-tail structure was visible over the full period, but it collapsed in the 2025 OOS period. Furthermore, the long-side profit during the dev period was roughly comparable to USDJPY's upward bias (always-long), making it difficult to separate from any MA-cross-specific edge.

Therefore, the conclusion of this article is not "trend following wins," but rather:

Trend-following strategies should be evaluated by right-tail dependence, cost resilience, latency resilience, directional PnL, and OOS reproducibility.

This article does not prove that trend following has a permanent edge. It uses a simple MA cross on USDJPY to observe the PnL structures that tend to appear in trend-following strategies, and how they break down.

How to Use This Article

If this ended with "I ran an MA cross on USDJPY, and here's what happened," the value to readers would be thin. Use this article in the following ways.

1. As a framework for diagnosing your own strategies

Readers working on trend-following, momentum, or breakout strategies can apply the same diagnostics to their own backtests. The diagnostics here are not specific to one strategy; they work as general-purpose evaluation axes.

Specifically, running the following 8 diagnostics in order on your own strategy will reveal where it is likely to break.

dev / OOS fixed-parameter comparison (run the same parameters on different periods)
Top 1% / 5% / 10% winning trade exclusion (check right-tail dependence)
Cost sensitivity (find where expectancy disappears at 0, 0.8, 1.0, 2.0 pips)
Entry lag sensitivity (does the structure survive 1–4 bar delays)
Random direction comparison (randomize only direction at the same timing)
Separate dev / OOS parameter heatmaps (does the dev "bright zone" survive OOS)
Comparison with Buy & Hold / always-long (edge, or market bias?)
Monthly PnL and Time Under Water (can you wait it out?)

You do not have to run all eight before forming a view. But a strategy that fails even one of them deserves suspicion before live deployment.

2. As a checklist for questioning someone else's backtest

When you're shown a good-looking equity curve, applying the diagnostics from this article helps avoid overvaluation.

For example, when presented with "an MA cross strategy that made +2500 pips over the past 3 years," you can return the same questions as this article.

Is that the dev period, or does it include OOS?
Does it survive excluding the top 5% winning trades?
How many pips did Buy & Hold make over the same period?
Where does it rank if you randomize direction with the same timing?
Does the structure survive slightly different parameters?
Does it survive even a 1-bar entry delay?

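The 240-minute case in this article failed most of these checks: it collapsed in OOS, went negative when the top 5% of winners were excluded, did not beat always-long on the long side, ranked only at the 77.5th percentile against random direction, and lost its strong parameter zone in OOS. The one exception was entry delay, which it survived. This perspective applies equally to your own strategies and to others'.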

3. As material for judging before running an MA cross live

For readers about to actually run an MA cross, the specific observations in this article translate directly.

Short timeframes (60 min and below) are fragile against cost and latency: expectancy can disappear at 1 pip round-trip cost, and a single bar of execution delay degrades it significantly. They may not suit discretionary or semi-automated operations where execution delay is unavoidable.
Mid timeframes (240 min and above) have strong right-tail dependence: losing a few large winning trades can collapse expectancy, so stopping the strategy during drawdown can be fatal.
In an uptrending market, long-side profits are hard to separate from Buy & Hold: even if a long-only MA cross profits, whether that comes from the strategy's edge needs separate verification. Once the market reverses, the apparent advantage may disappear.
TUW can be several times longer than expected: the 60-minute timeframe had a TUW of about 866 days, and the 240-minute about 344 days. Starting with the expectation that "results come in a few months" makes it very hard to keep going.

4. As a starting point for follow-on research

This article provides diagnostic data for moving on to the next research topics.

The observations needed to separate whether the short-side loss is structural or driven by market bias are in place
The fact that dev-bright parameters collapsed in OOS supports the need for WFO verification
Since even long-only failed the 2025 OOS, long-side trend quality filters (MA200 direction, Efficiency Ratio, ADX) become the next verification target

The very fact that "a simple MA cross is not enough" can be confirmed is itself a starting point for the next research.

1. The Essence of Trend Following

Trend following is always late. It can't buy the bottom, and it can't sell the top. It buys after confirming that price has gone up, and sells after confirming that price has gone down.

The reason this seemingly inefficient strategy has survived for so long is not that it "predicts the future accurately," but that it rides on the periods where price changes persist, and goes after the right tail of the PnL distribution.

From the expectancy formula, even with a low win rate, if average win significantly exceeds average loss, expectancy can be positive.

E[PnL] = WinRate × AvgWin − LossRate × AvgLoss − TradingCost

For trend following, if

AvgWin >> AvgLoss

holds, PnL can be positive even with a low win rate. Therefore, what matters is not the win rate itself, but average win, average loss, the thickness of the right tail, and post-cost expectancy.
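As a quick check of the formula above, the following Python sketch computes per-trade expectancy and the win rate at which expectancy is zero. The function names and the sample inputs (35% win rate, 300-pip average win, 120-pip average loss, 1-pip cost) are illustrative assumptions, not values from the experiment.

```python
def expectancy(win_rate, avg_win, avg_loss, cost=0.0):
    """Per-trade expectancy: WinRate*AvgWin - LossRate*AvgLoss - TradingCost."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    loss_rate = 1.0 - win_rate
    return win_rate * avg_win - loss_rate * avg_loss - cost


def breakeven_win_rate(avg_win, avg_loss, cost=0.0):
    """Win rate that makes expectancy exactly zero.

    Solves p*avg_win - (1-p)*avg_loss - cost = 0 for p.
    """
    return (avg_loss + cost) / (avg_win + avg_loss)


# Illustrative inputs only (not from the USDJPY experiment).
sample_expectancy = expectancy(0.35, 300.0, 120.0, cost=1.0)
sample_breakeven = breakeven_win_rate(300.0, 120.0, cost=1.0)
print(f"expectancy per trade: {sample_expectancy:.1f} pips")
print(f"break-even win rate:  {sample_breakeven:.1%}")
```

With a large enough AvgWin / AvgLoss ratio, the break-even win rate falls well below 50%, which is why win rate alone says little about a trend-following strategy.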

"Entering late" is both a weakness and a design choice that discards initial-move noise. Entering too early tends to get caught in head-fakes. Trend following enters only after confirming that price has moved beyond a certain threshold.

Academic Context

Trend following is not just a retail technical method. It has been studied academically as a strategy class under the name time-series momentum.

Moskowitz, Ooi, Pedersen (2012) reported return continuation over 1–12 months across 58 liquid futures
Hurst, Ooi, Pedersen (2017) analyzed long-term performance of time-series momentum across global markets since 1880
Baltas, Kosowski (2013) analyzed the relationship between time-series momentum in futures markets and CTA / trend-following funds

So the small USDJPY experiment in this article connects, in a broader context, to time-series momentum, managed futures, and CTAs — strategy classes that are actually operated in production.

2. Why Price Changes Sometimes Persist

The commonly cited reasons for trend persistence are below. However, what matters from a quant perspective is not to believe the explanation, but to convert it into an observable, testable hypothesis.

Delayed human reaction: initial skepticism toward price rises, then following on. Testable hypothesis: return continuation after large bullish/bearish candles.
Institutional flows: chains of rebalancing, stop-losses, VaR constraints, margin calls. Testable hypothesis: volatility rise during drops, and the probability of further drops.
Supply/demand imbalance: persistence until the imbalance between buyers and sellers is resolved. Testable hypothesis: follow-through after breakouts.

A trend is less "price away from fair value" and more "the supply/demand imbalance has not yet been resolved."

3. The Evaluation Axes from a Quant Perspective

When evaluating trend following, instead of win rate, look at the following.

Basic metrics: total PnL, win rate, AvgWin / AvgLoss, Profit Factor
Risk and survival: MaxDD, Time Under Water, max consecutive losses
PnL distribution: skewness, right-tail thickness, total PnL after excluding the top 1% / 5% / 10% of winning trades
Real-world tolerance: post-cost PnL, degradation with entry lag
Robustness: fixed-parameter OOS, neighboring parameters, comparison with random direction
Benchmark comparison: difference vs Buy & Hold / always-long

The last — benchmark comparison — is particularly important. Even if an MA cross is profitable, unless you separate whether it's an MA-cross-specific edge or merely picking up the directional bias of the target market, you'll overvalue the strategy.

Note: this article does not perform WFO (Walk-Forward Optimization). Since WFO involves parameter re-optimization, we treat it as the next stage after the present topic of "does the PnL structure survive under fixed conditions."

4. Experimental Setup

Item	Detail
Target	USDJPY
Timeframes	60-minute, 240-minute
Strategy	Moving average cross (short 20 / long 80)
Execution	Signal judged at bar close, fill at next bar open
Exit	Reverse signal closes and reverses position
Cost	1.0 pips round-trip fixed (main case)
Period	2023-01-01 to before 2026-01-01
Dev period	2023–2024
OOS period	2025
WFO	Not performed

Note that no parameter re-optimization was done in the OOS period.
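The setup in the table can be sketched in plain Python as follows. This is a minimal illustration, not the code from the article's repository: the function names are hypothetical, the position is assumed to follow the sign of (short MA minus long MA), a pip is assumed to be 0.01 for USDJPY, and the last open leg is assumed to be marked to the final close. The optional `lag` argument delays the fill by extra bars.

```python
def sma(values, window):
    """Simple moving average; None until `window` values are available."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def ma_cross_trades(opens, closes, short=20, long=80,
                    cost_pips=1.0, pip=0.01, lag=0):
    """Stop-and-reverse MA cross.

    The signal is read at a bar close and filled at the open of the
    next bar (plus `lag` extra bars). Returns [(direction, pnl_pips)].
    """
    if short >= long:
        raise ValueError("short window must be smaller than long window")
    fast, slow = sma(closes, short), sma(closes, long)

    # Target position decided at each bar close: +1 long, -1 short, 0 none yet.
    target, state = [], 0
    for f, s in zip(fast, slow):
        if s is not None and f != s:
            state = 1 if f > s else -1
        target.append(state)

    trades, position, entry = [], 0, 0.0
    for j in range(1 + lag, len(opens)):
        desired = target[j - 1 - lag]  # signal from an already-closed bar
        if desired == position:
            continue
        if position != 0:  # close the old leg at this bar's open
            pnl = position * (opens[j] - entry) / pip - cost_pips
            trades.append((position, pnl))
        position, entry = desired, opens[j]

    if position != 0:  # assumption: mark the last open leg to the final close
        pnl = position * (closes[-1] - entry) / pip - cost_pips
        trades.append((position, pnl))
    return trades
```

Reading the signal from bar `j - 1 - lag` when filling at bar `j` is what keeps the sketch free of look-ahead: no fill ever uses a close that has not yet printed.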

5. Experimental Results

5.1 Look at OOS First: The dev / OOS Gap

Placing this section first is intentional. Showing only the full-period positive number first would make trend following look like it "worked."

Timeframe	Full (pips)	2023-2024 dev (pips)	2025 OOS (pips)
60m	-140.1	-235.9	+75.5
240m	+1746.6	+2569.3	-808.0

The 240-minute timeframe looks good in dev at +2569.3 pips. But in OOS, it's -808.0 pips. The gap exceeds 3300 pips.

The structure that looked good in dev did not reproduce in 2025 OOS. This is the most important observation in this article.
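A fixed-parameter dev / OOS comparison needs nothing more than a trade list and a split date. The sketch below assigns each trade to a period by its exit date, which is an assumption made here for illustration; running the backtest separately on each period treats trades that straddle the boundary differently. The function name and the sample trades are hypothetical.

```python
from datetime import date


def pnl_by_period(trades, split=date(2025, 1, 1)):
    """Split total PnL into (dev, oos) by exit date.

    trades: iterable of (exit_date, pnl_pips).
    Trades exiting before `split` count as dev, the rest as OOS.
    """
    dev = sum(pnl for day, pnl in trades if day < split)
    oos = sum(pnl for day, pnl in trades if day >= split)
    return dev, oos


# Hypothetical trades, for illustration only.
sample_trades = [
    (date(2024, 6, 1), 100.0),
    (date(2024, 12, 31), -20.0),
    (date(2025, 1, 1), -50.0),
]
dev_total, oos_total = pnl_by_period(sample_trades)
print(f"dev: {dev_total:+.1f} pips, OOS: {oos_total:+.1f} pips")
```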

5.2 Equity Curve and Drawdown

The 240-minute timeframe rose to about 3800 pips from late 2024 through May 2025, then was rapidly given back during the OOS period.

The 60-minute timeframe stayed in an unrealized-loss state of 1500+ pips almost continuously from mid-2024 to September 2025. Since trend following is a strategy that waits for the right tail, the key operational question is whether the operator can endure long stagnation periods.

5.3 PnL Distribution: Look at the Distribution, Not Win Rate

Timeframe	Trade count	Total PnL	Win rate	Profit Factor	MaxDD
60m	292	-140.1 pips	35.27%	0.990	2067.1 pips
240m	70	+1746.6 pips	41.43%	1.304	2120.2 pips

Neither has a high win rate. The reason the 240-minute total PnL is still positive is that large winning trades exist in the right tail of the PnL distribution.

The 240-minute timeframe has clearly outlying large winning trades on the right. The 60-minute timeframe has a thick peak on the negative side; the right tail exists but is easily swallowed by costs.
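The table does not list AvgWin and AvgLoss, but rough values can be backed out from total PnL, Profit Factor, trade count, and win rate. The sketch below does this for the 240-minute row. It assumes total PnL equals gross wins minus gross losses, Profit Factor equals gross wins divided by gross losses, and that there are no zero-PnL trades; the function name is hypothetical, and the result is only as precise as the rounded inputs.

```python
def implied_averages(total_pnl, profit_factor, n_trades, win_rate):
    """Estimate (avg_win, avg_loss) from summary statistics.

    From G - L = total_pnl and G / L = profit_factor it follows that
    L = total_pnl / (profit_factor - 1) and G = profit_factor * L.
    """
    if profit_factor == 1.0:
        raise ValueError("undefined when profit factor is exactly 1")
    gross_loss = total_pnl / (profit_factor - 1.0)
    gross_win = profit_factor * gross_loss
    n_wins = round(n_trades * win_rate)
    n_losses = n_trades - n_wins
    return gross_win / n_wins, gross_loss / n_losses


# 240-minute row from the table above (rounded inputs).
avg_win, avg_loss = implied_averages(1746.6, 1.304, 70, 0.4143)
print(f"implied AvgWin  ~ {avg_win:.0f} pips")
print(f"implied AvgLoss ~ {avg_loss:.0f} pips")
```

With the rounded table values this gives an average win of roughly 258 pips against an average loss of roughly 140 pips. The estimate is very sensitive to rounding when Profit Factor is close to 1, so it should not be applied to the 60-minute row.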

5.4 Right-Tail Dependence: Miss the Big Wins, and It Collapses

Timeframe	Baseline (pips)	Excl. top 1% (pips)	Excl. top 5% (pips)	Excl. top 10% (pips)
60m	-140.1	-1236.2	-3001.4	-4571.9
240m	+1746.6	+629.1	-352.9	-1127.3

Excluding just the top 5% of winning trades collapses the 240-minute timeframe to -352.9 pips. The single largest winning trade alone is +1117.5 pips, about 64% of the total +1746.6 pips, and removing it accounts for the entire drop to +629.1 pips in the "Excl. top 1%" column.

Missing just a few big winners can drastically change the evaluation of the entire strategy. This is the essence of trend following, and at the same time its operational difficulty.
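A top-tail exclusion test is easy to run on any list of per-trade PnL values. The sketch below drops the largest trades and sums the rest. Two choices here are assumptions, since the article does not spell them out: the fraction is taken over all trades, and the number of dropped trades is rounded up. The function name and the sample data are hypothetical.

```python
import math


def pnl_excluding_top(pnls, fraction):
    """Total PnL after dropping the largest `fraction` of trades.

    The drop count is rounded up, so any positive fraction removes at
    least one trade. round() guards against float noise such as
    70 * 0.1 evaluating slightly above 7.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_drop = math.ceil(round(len(pnls) * fraction, 9))
    kept = sorted(pnls, reverse=True)[n_drop:]
    return sum(kept)


# Hypothetical per-trade PnL in pips, for illustration only.
sample_pnls = [100, -10, -20, 50, -5, 30, -15, -10, 5, -25]
for frac in (0.0, 0.1, 0.2):
    print(f"excluding top {frac:.0%}: {pnl_excluding_top(sample_pnls, frac):+d} pips")
```

If the total turns negative after removing only a handful of trades, the strategy's result depends on catching those specific moves.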

5.5 Cost Resilience and Entry Lag Sensitivity: Real-World Friction

Cost Resilience

Timeframe	Cost 0.0 pips	Cost 0.8 pips	Cost 1.0 pips	Cost 2.0 pips
60m	+151.9	-81.7	-140.1	-432.1
240m	+1816.6	+1760.6	+1746.6	+1676.6

The 60-minute timeframe turns negative well before the 1.0-pip main case: with 292 trades and +151.9 pips before costs, break-even is about 0.52 pips per round trip. Short timeframes have many trades, so fixed costs have a large impact. The 240-minute timeframe stays positive even at 2.0 pips, but cost resilience and future viability are different things.
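The cost table is consistent with a fixed cost deducted once per trade, so total PnL is the zero-cost PnL minus trade count times cost. The sketch below reproduces the table rows from the zero-cost column and the trade counts (292 and 70) and derives the break-even cost. The function names are hypothetical.

```python
def pnl_at_cost(gross_pips, n_trades, cost_pips):
    """Total PnL when a fixed round-trip cost is charged on every trade."""
    return gross_pips - n_trades * cost_pips


def breakeven_cost(gross_pips, n_trades):
    """Round-trip cost per trade at which total PnL is exactly zero."""
    return gross_pips / n_trades


# (zero-cost PnL in pips, trade count) taken from the tables in this article.
rows = {"60m": (151.9, 292), "240m": (1816.6, 70)}
for name, (gross, n) in rows.items():
    line = ", ".join(f"{c:.1f}: {pnl_at_cost(gross, n, c):+.1f}"
                     for c in (0.0, 0.8, 1.0, 2.0))
    print(f"{name}  {line}  | break-even {breakeven_cost(gross, n):.2f} pips")
```

The break-even cost is a more compact summary than the table: it states directly how much friction the strategy can absorb per trade.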

Entry Lag Sensitivity

Timeframe	lag 0 bars	lag 1 bar	lag 2 bars	lag 4 bars
60m	-140.1	-1152.9	-1232.1	-1401.9
240m	+1746.6	+2416.6	+2551.4	+2286.8

The 60-minute timeframe degrades rapidly when delay is added. PnL depends on the few bars right after the signal, making it fragile against real-world delay.

The 240-minute timeframe actually improves with 1–2 bars of delay. This suggests continuation on a longer time scale, though it does not mean "entering later is always better."

5.6 Comparison with Random Direction

Timeframe	Actual PnL	Percentile	Random trials beating actual
60m	-140.1	50.3	49.7%
240m	+1746.6	77.5	22.5%

Only the long/short direction is randomized. Entry timing, trade count, and holding period are kept identical to the actual strategy, and the comparison uses 1000 random trials.

The 60-minute timeframe is about the same as random. The 240-minute timeframe is in a better position than random, but at the 77.5th percentile — not strong enough to be in the top 1% or 5%. While the directional signal may contain some information, this alone cannot establish an edge.
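The random-direction comparison can be sketched as a small Monte Carlo test. The version below takes the price move over each trade's holding window (from a long perspective) and the direction actually traded, then flips directions at random. Per-trade cost is left out because it is identical for both directions and does not change the ranking. The function name, the seed, and the sample data are assumptions for illustration, not the article's implementation.

```python
import random


def random_direction_percentile(moves, directions, n_trials=1000, seed=0):
    """Rank the actual total against totals with randomized directions.

    moves[i]:      price change in pips over trade i's holding window,
                   measured as if long.
    directions[i]: +1 or -1, the direction actually taken.
    Returns (actual_total, percentage of random trials strictly below it).
    """
    rng = random.Random(seed)
    actual = sum(d * m for d, m in zip(directions, moves))
    below = 0
    for _ in range(n_trials):
        total = sum(rng.choice((1, -1)) * m for m in moves)
        if total < actual:
            below += 1
    return actual, 100.0 * below / n_trials


# Hypothetical holding-window moves where every direction call was right.
sample_moves = [10, -20, 30, -5]
sample_dirs = [1, -1, 1, -1]
actual_total, percentile = random_direction_percentile(sample_moves, sample_dirs)
print(f"actual: {actual_total:+d} pips, percentile vs random: {percentile:.1f}")
```

Because timing and holding periods are held fixed, a high percentile isolates the information in the direction call itself, separate from when the strategy chooses to trade.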

5.7 Parameter Robustness: The dev Bright Zone Disappears in OOS

In the dev period, the 240-minute timeframe had Profit Factor > 1 for all 24 parameter combinations. But in OOS, only 2 combinations maintained PF > 1.

The dev → OOS gap for the best-performing 240-minute dev parameters is below.

Short MA	Long MA	dev PF	OOS PF	dev PnL (pips)	OOS PnL (pips)
30	120	5.241	0.647	+4543.9	-766.2
30	160	4.182	0.642	+3822.7	-785.8
20	200	4.173	0.369	+3395.9	-2218.0
20	120	4.083	0.596	+3876.9	-918.6

The zone that looked strong in dev essentially did not reproduce in OOS.

Judging robustness from the full-period heatmap alone is dangerous. When you separate dev and OOS, the "strength" on the 240-minute parameter surface looks less like a stable edge and more like a structure specific to the dev period.
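Separating the two heatmaps amounts to computing Profit Factor per parameter pair in each period and checking which pairs stay above 1 in both. The sketch below uses the four rows from the table above as input. The function names are hypothetical.

```python
def profit_factor(pnls):
    """Gross wins divided by gross losses (inf if there are no losses)."""
    gains = sum(p for p in pnls if p > 0)
    losses = -sum(p for p in pnls if p < 0)
    return float("inf") if losses == 0 else gains / losses


def surviving_params(dev_pf, oos_pf, threshold=1.0):
    """Parameter pairs whose PF exceeds `threshold` in both dev and OOS.

    dev_pf / oos_pf: dicts mapping (short_ma, long_ma) -> Profit Factor.
    """
    return sorted(k for k, pf in dev_pf.items()
                  if pf > threshold and oos_pf.get(k, 0.0) > threshold)


# Best-performing 240-minute dev parameters, from the table above.
dev_pf = {(30, 120): 5.241, (30, 160): 4.182, (20, 200): 4.173, (20, 120): 4.083}
oos_pf = {(30, 120): 0.647, (30, 160): 0.642, (20, 200): 0.369, (20, 120): 0.596}
print("PF > 1 in dev:         ", sum(pf > 1.0 for pf in dev_pf.values()))
print("PF > 1 in dev and OOS: ", len(surviving_params(dev_pf, oos_pf)))
```

None of the four top dev pairs survives the OOS filter, which is the numerical form of the statement that the bright zone disappeared.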

5.8 By Direction: Short-Side Weakness and the Limits of the Long Side

Timeframe	Long PnL (pips)	Short PnL (pips)
60m	+1215.3	-1355.4
240m	+2164.6	-418.0

The short side is negative on both timeframes. This suggests that USDJPY had an upward bias during the target period, and that simple reversal-style shorts were at a disadvantage.

5.9 Comparison with Buy & Hold / Always-Long: Edge, or Upward Bias?

With long-term JPY weakness, profiting on the long side is to some extent expected. What matters is separating MA-cross edge from simply riding the upside of the target market.

Timeframe	Period	MA long/short (pips)	MA long only (pips)	Always Long (pips)
60m	2023-2025	-140.1	+1215.3	+2569.7
60m	2023-2024 dev	-235.9	+1203.5	+2640.9
60m	2025 OOS	+75.5	+11.8	-52.9
240m	2023-2025	+1746.6	+2164.6	+2581.6
240m	2023-2024 dev	+2569.3	+2603.8	+2637.3
240m	2025 OOS	-808.0	-424.5	-41.0

In the dev 240-minute case, MA long only was +2603.8 pips, while always-long was +2637.3 pips — essentially the same level. MA cross used the market rise well, but cannot be said to have outperformed always-long.

In 2025 OOS, always-long stayed at a mild -41.0 pips, while MA long/short was -808.0 pips and MA long only was -424.5 pips. In 2025, the act of entering and exiting via MA cross may itself have worsened PnL.

The long-side profit was supported not only by the trend-following signal but also by USDJPY's upward bias during the period.
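The benchmark comparison reduces to one subtraction per period. The sketch below computes the excess of MA long only over always-long for the 240-minute rows in the table above, and also shows one way an always-long figure could be computed from prices. How the article's always-long figure treats costs is not stated, so the single round-trip cost, the 0.01 pip size, the function names, and the sample prices are assumptions.

```python
def always_long_pips(first_open, last_close, pip=0.01, cost_pips=1.0):
    """Pips earned by buying at the first open and holding to the last close.

    Assumes a single round-trip cost for the whole holding period.
    """
    return (last_close - first_open) / pip - cost_pips


def excess_over_benchmark(strategy_pips, benchmark_pips):
    """Strategy PnL minus benchmark PnL over the same period."""
    return strategy_pips - benchmark_pips


# 240-minute rows from the table above: (MA long only, always-long).
dev_excess = excess_over_benchmark(2603.8, 2637.3)
oos_excess = excess_over_benchmark(-424.5, -41.0)
print(f"dev excess over always-long: {dev_excess:+.1f} pips")
print(f"OOS excess over always-long: {oos_excess:+.1f} pips")

# Hypothetical prices, only to show the benchmark calculation itself.
print(f"always-long example: {always_long_pips(130.00, 156.00):+.1f} pips")
```

A negative excess in both periods means the long-side signal added nothing over simply holding the position during this sample.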

5.10 Monthly PnL and Time Under Water

Timeframe	Max losing streak (trades)	Max TUW	Monthly win rate	Worst month (pips)	Best month (pips)
60m	10	866.6 days	58.3%	-594.8	+1064.8
240m	6	344.3 days	55.2%	-571.1	+982.0

The 60-minute timeframe had a period of about 866 days without making a new equity high. The 240-minute timeframe had about 344 days of TUW.

Trend following is a strategy that waits for big winning trades. So even with positive PnL, if you can't endure long stagnation, you can't continue in real operation.
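Both survival metrics can be computed from a trade list and a dated equity curve. The sketch below measures Time Under Water from the last equity high to the point where a new high is made (or to the last observation if no recovery happens), using calendar dates. The article reports fractional days, so its implementation presumably works on timestamps; the function names and sample data here are hypothetical.

```python
from datetime import date


def max_time_under_water(equity_points):
    """Longest stretch in days spent below a previous equity high.

    equity_points: list of (date, cumulative_pnl) in time order.
    """
    if not equity_points:
        return 0
    peak_day, peak_value = equity_points[0]
    longest, under_water = 0, False
    for day, value in equity_points[1:]:
        if value < peak_value:
            under_water = True
        if under_water:
            # Still counts on the recovery day, so the span runs peak-to-recovery.
            longest = max(longest, (day - peak_day).days)
        if value >= peak_value:
            peak_day, peak_value, under_water = day, value, False
    return longest


def max_losing_streak(pnls):
    """Largest number of consecutive losing trades."""
    longest = current = 0
    for p in pnls:
        current = current + 1 if p < 0 else 0
        longest = max(longest, current)
    return longest


# Hypothetical equity curve and trade list, for illustration only.
sample_equity = [
    (date(2023, 1, 1), 0), (date(2023, 1, 10), 50), (date(2023, 2, 1), 20),
    (date(2023, 3, 1), 40), (date(2023, 4, 1), 60), (date(2023, 5, 1), 30),
]
print("max TUW (days):   ", max_time_under_water(sample_equity))
print("max losing streak:", max_losing_streak([10, -1, -2, -3, 5, -1]))
```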

6. Diagnostic Ablation: long-only and short suppression

Since the short-side loss was large, we diagnostically checked long-only and short-suppression filters. This is not strategy improvement, but observation to separate whether the short-side loss is a structural weakness or a condition-specific weakness.

Full-period results

Variant	60m	240m
baseline long/short	-140.1	+1746.6
long only	+1215.3	+2164.6
short allowed: MA80 falling	-657.7	+2536.6
short allowed: close < falling MA200	+863.0	+1915.4

But in OOS

Variant	60m OOS	240m OOS
baseline long/short	+75.5	-808.0
long only	+11.8	-424.5
short allowed: MA80 falling	-345.1	-847.8
short allowed: close < falling MA200	-512.2	-245.9

In the 240-minute 2025 OOS, even long only was -424.5 pips, so excluding shorts alone does not solve the problem. The "close < falling MA200" filter reduced the 240-minute OOS loss from -808.0 to -245.9 pips, but this is a diagnostic run after seeing the 2025 OOS, not a finished strategy.

The short-suppression filter is a diagnostic experiment to investigate why the short side worsened PnL. Filters that do not reproduce in OOS, or filters added after seeing OOS, are not treated as finished strategies.

7. The Limits Revealed by the Experiment

What was confirmed by this experiment is below.

No clear edge was confirmed for the simple MA cross on the 60-minute timeframe. Fragile against cost and noise, and around the median in the random-direction comparison.
The 240-minute timeframe did show a trend-following-like right-tail structure over the full period, but went negative when excluding the top 5% and collapsed to -808 pips in 2025 OOS.
The dev-period long-side profit was about the same as always-long, making it impossible to separate from any MA-cross-specific edge.
The short side was negative on both timeframes. Simple reversal-style shorts were disadvantaged under USDJPY's upward bias.
The dev bright zone of parameters essentially disappeared in OOS (on the 240-minute timeframe, only 2 of 24 maintained PF > 1 in OOS).
The 240-minute timeframe is robust to delay; the 60-minute is fragile to delay. Given real-world execution delay, short-timeframe trend following requires particular caution.
TUW was about 866 days on 60m and about 344 days on 240m. Before PnL, whether you can wait is the key to operation.

8. How to Take It Away: Who Can Use It and How

What to take from the diagnostics in this article depends on the reader's role.

Quant researchers and strategy developers

You can incorporate the same 8 diagnostics as a "minimum test set" for trend-following strategies under development or about to be verified.

Do not make decisions based on full-period cumulative PnL alone
Split dev / OOS at the outset; treat any additional optimization after seeing OOS as diagnostic, not a new edge
Do not select parameters based on "strong in dev" alone (in this article, the 240-minute had PF > 1 for 24/24 in dev, but only 2 in OOS)
Always include a "Buy & Hold comparison" (skipping this leads to mistaking market bias for edge)
Use top-tail exclusion tests to visualize structural fragility

Building these into your development flow can reduce the number of overfit strategies that reach live deployment.

Discretionary and semi-automated traders

You can apply this article's questions to systems and indicators you're watching.

Does a strategy advertised as "+XX pips over X years" still hold under dev/OOS split?
For short-timeframe strategies, do you get the same results when accounting for your own execution speed (delay)?
How much does your JPY-weakness-era profit exceed Buy & Hold?
Do you have a money-management setup that lets you continue during drawdowns and prolonged unrealized losses?

In particular, keep in mind that short-timeframe MA-cross strategies are likely fragile against cost and delay.

People making investment decisions (choosing someone else's operation)

When evaluating others' trend-following operations — funds, CTAs, signal services, copy trading — these questions are usable.

Is the presented performance in-sample only, or does it include a genuine OOS period?
Is it dependent on the right tail (does the entire performance hinge on the most recent few large winners)?
How much does it exceed Buy & Hold / index over the same period?
Is there a track record of continued operation during Time Under Water?
Was operation halted during past drawdowns?

"Annual return X%" or "Sharpe of XX" alone does not reveal how the strategy breaks. The axes from this article raise the resolution of evaluation.

Readers studying for understanding

For readers who just want to understand trend-following strategies, take away the following.

Trend following is a strategy class that cannot be evaluated by win rate
PnL is structurally supported by a small number of large winners
Head-fakes and drawdowns are structural features, not defects
However, having that structure does not automatically mean you can win
Separating the directional bias of the market from the edge is harder than it looks

These points apply to the trend-following category as a whole, independent of any specific strategy.

9. Conclusion

What this experiment revealed is not "a winning MA cross."

What was revealed is the right-tail dependence, cost resilience, OOS collapse, directional asymmetry, and difficulty of separation from upward bias that trend-following strategies carry.

The essence of trend following is not predicting the future, but being positioned when a large price change continues. But if you miss those few right-tail moves, expectancy collapses easily.

That is exactly why trend following must be evaluated by distribution, OOS, cost, delay, drawdown, and benchmark comparison — not by win rate.

The Reader's Next Move

After reading this article, I recommend taking one of the following steps.

If you have your own strategy: pick one of the 8 diagnostics in this article that you haven't checked yet, and run it this week. In particular, "Buy & Hold comparison," "top 5% exclusion," and "dev/OOS parameter heatmap" are low-cost and high-impact.
If you evaluate others' operations: throw at least 5 of the questions from this article at the trend-following operation under evaluation. An operation that cannot answer some questions may not be as stable as the presented performance suggests.
If you are starting to learn trend following: first build the habit of looking at "PnL distribution" rather than "win rate." Then place "comparison with Buy & Hold" as your first evaluation axis. These two alone change how strategies look.

The value of this article lies not in the specific MA cross results but in the lens it offers for evaluating trend-following strategies. The USDJPY data of 2023–2025 is merely the material for grinding that lens.

Interpretation Boundary

The experiment in this article is an observation under the specific conditions of USDJPY, 2023–2025, 60-minute and 240-minute timeframes, MA 20/80, and a fixed round-trip cost of 1.0 pip.

The same results are not guaranteed under different parameters, periods, or cost assumptions. Nor does it prove a permanent edge for trend following in general.

The results of this article should be read as an experimental article for understanding the PnL structure of trend following as a strategy class — not as investment advice or a production trading system.

The tools used in this experiment can be downloaded from the following GitHub repository (Lab_5 experiment):
https://github.com/tikeda123/article_lab

Future Development Topics

Long-side trend-quality filters (close > MA200, Efficiency Ratio, ADX, etc.)
More systematic evaluation of short-side suppression filters
Trend-continuation detection via rolling autocorrelation
Verification across multiple markets (EURUSD, GBPJPY, commodities, equity indices)
Verification closer to real operation via WFO
Execution modeling including bid/ask, slippage, swap
Size control via volatility adjustment

These are in the domain of strategy improvement, and are kept separate from the main topic of this article — "observing the PnL structure of a simple trend-following strategy."

References

Moskowitz, T. J., Ooi, Y. H., & Pedersen, L. H. (2012). "Time Series Momentum." Journal of Financial Economics, 104(2), 228-250.
Hurst, B., Ooi, Y. H., & Pedersen, L. H. (2017). "A Century of Evidence on Trend-Following Investing." The Journal of Portfolio Management, 44(1), 15-29.
Baltas, A. N., & Kosowski, R. (2013). "Momentum Strategies in Futures Markets and Trend-Following Funds." Working Paper, Imperial College Business School.
- SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1968996
- DOI: https://doi.org/10.2139/ssrn.1968996