-
Notifications
You must be signed in to change notification settings - Fork 691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pluggable and customizable cache design #907
base: master
Are you sure you want to change the base?
Conversation
…into feature/indicator-cache
ta4j-core/src/main/java/org/ta4j/core/indicators/RecursiveCachedIndicator.java
Outdated
Show resolved
Hide resolved
I thought I read in a thread somewhere that the bar series size management functionality (i.e. setting a max bar count, removing old bars, etc) was a blocker for this feature. Is that still the case here or is there something else? |
@TheCookieLab Usually it is possible to set a max cache size in third party caching libraries, especially in coffeine cache. |
@team172011 Would it make sense as a first step (before replacing the actual caching system) to create a new PR that just replaces the |
I added two further cache implementations:
and added the |
* their values based on the values of other indicators. Such nested indicators | ||
* can call {@link #getValue(int)} multiple times without the need to | ||
* {@link #calculate(int)} again. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not remove the existing javadocs of this class.
* Constructor. | ||
* | ||
* @param series the bar series | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not remove the javadocs of this class.
* that may be thrown on the first getValue(int) call of a recursive indicator. | ||
* Concretely when an index value is asked, if the last cached value is too | ||
* old/far, the computation of all the values between the last cached and the | ||
* asked one is executed iteratively. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not remove the javadocs of this class.
/** | ||
* The recursion threshold for which an iterative calculation is executed. | ||
* | ||
* TODO: Should be variable (depending on the sub-indicators used in this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not remove the javadocs of this class.
ta4j-core/src/test/java/org/ta4j/core/indicators/bollinger/BollingerBandWidthIndicatorTest.java
Show resolved
Hide resolved
Let’s not normalize raising concerns on blank lines in between imports,
comment style, etc in code reviews, that’s frankly anal retentive and
causes more harm than good. There’s a reason code style is typically
considered off-limits in code reviews for anything not explicitly defined
upfront as a universal style guide.
…On Fri, Jul 14, 2023 at 2:25 AM nimo23 ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In ta4j-core/src/main/java/org/ta4j/core/BacktestExecutor.java
<#907 (comment)>:
> @@ -25,7 +25,6 @@
import java.util.List;
import java.util.stream.Collectors;
-
Please do *not* remove the "blank lines" between the "package groups"
(also called "top level packages") as this helps to visualize the groups
better. For example, in Eclipse these "blank lines" are automatically added
as a "visual separator" when using "Source->Organize Imports" - this is
common and best practice. I think, intelliJ has something similar (it's
found in File/Settings menu, under something like "Editor/Code Style/Java,
-> "Import Layout" section)
------------------------------
In
ta4j-core/src/main/java/org/ta4j/core/indicators/caching/NativeIndicatorValueCache.java
<#907 (comment)>:
>
-/**
- * Cached ***@***.*** Indicator indicator}.
- *
- * <p>
- * Caches the calculated results of the indicator to avoid calculating the same
- * index of the indicator twice. The caching drastically speeds up access to
- * indicator values. Caching is especially recommended when indicators calculate
- * their values based on the values of other indicators. Such nested indicators
- * can call ***@***.*** #getValue(int)} multiple times without the need to
- * ***@***.*** #calculate(int)} again.
- */
Please do not remove the existing javadocs of this class.
------------------------------
In
ta4j-core/src/main/java/org/ta4j/core/indicators/caching/NativeIndicatorValueCache.java
<#907 (comment)>:
> @@ -52,40 +42,20 @@ public abstract class CachedIndicator<T> extends AbstractIndicator<T> {
*/
protected int highestResultIndex = -1;
- /**
- * Constructor.
- *
- * @param series the bar series
- */
Please do not remove the javadocs of this class.
------------------------------
In
ta4j-core/src/main/java/org/ta4j/core/indicators/caching/NativeRecursiveIndicatorValueCache.java
<#907 (comment)>:
> import org.ta4j.core.BarSeries;
import org.ta4j.core.Indicator;
-/**
- * Recursive cached ***@***.*** Indicator indicator}.
- *
- * <p>
- * Recursive indicators should extend this class.
- *
- * <p>
- * This class is only here to avoid (OK, to postpone) the StackOverflowError
- * that may be thrown on the first getValue(int) call of a recursive indicator.
- * Concretely when an index value is asked, if the last cached value is too
- * old/far, the computation of all the values between the last cached and the
- * asked one is executed iteratively.
Please do not remove the javadocs of this class.
------------------------------
In
ta4j-core/src/main/java/org/ta4j/core/indicators/caching/NativeRecursiveIndicatorValueCache.java
<#907 (comment)>:
>
- /**
- * The recursion threshold for which an iterative calculation is executed.
- *
- * TODO: Should be variable (depending on the sub-indicators used in this
Do not remove the javadocs of this class.
------------------------------
In
ta4j-core/src/test/java/org/ta4j/core/indicators/bollinger/BollingerBandWidthIndicatorTest.java
<#907 (comment)>:
> @@ -38,6 +35,8 @@
import org.ta4j.core.mocks.MockBarSeries;
import org.ta4j.core.num.Num;
+import static org.ta4j.core.TestUtils.assertNumEquals;
Please organize the imports according common practice. "Static imports"
should be always at the top and not at the bottom. I guess in Intellij you
can find it in the "File/Settings" menu under "Editor/Code style/Java" ->
"Import layout" section where you can edit the import order.
—
Reply to this email directly, view it on GitHub
<#907 (review)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABQJ6FOA7UHXT4PNYRD2WRLXQDQ6NANCNFSM6AAAAAARPMGWOU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
No, there is a good reason for maintaining code style! The end users of this library are API end users, i.e. in order not to diminish the developer's joy, the coding style must be homogeneous, among other things! if everyone does their gibberish and sloppily codes into it, you end up with poorly maintainable and readable code in the long run. Off-topic: Maybe we should provide an automated task like "spotless" or "gradle-imports" into our github build? |
No thanks.
…On Fri, Jul 14, 2023 at 7:20 AM nimo23 ***@***.***> wrote:
Let’s not normalize raising concerns on blank lines in between imports,
comment style, etc in code reviews, that’s frankly anal retentive and
causes more harm than good. There’s a reason code style is typically
considered off-limits in code reviews for anything not explicitly defined
upfront as a universal style guide.
No, there is a good reason for maintaining code style! The end users of
this library are *API end users*, i.e. in order not to diminish the
developer's joy, the coding style must be homogeneous, among other things!
if everyone does their gibberish and sloppily codes into it, you end up
with poorly maintainable and readable code in the long run.
—
Reply to this email directly, view it on GitHub
<#907 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABQJ6FKUAMFQ5PEGY7NETQLXQETONANCNFSM6AAAAAARPMGWOU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
import org.ta4j.core.BarSeries; | ||
import org.ta4j.core.BaseTradingRecord; | ||
import org.ta4j.core.Position; | ||
import org.ta4j.core.*; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the "*
" (asterisk) and import only required classes.
@TheCookieLab I aggree, especially in PRs like this where the focus is on the general design and solving different challenges to reach the objectives. |
Current problem I face ist that one goal of this PR is to eliminate alle the difficulties we have with the indexed based caching and the sync between the circular array in This leads to a second challenge: How to replace the Only idea I have so far is to create a cache that always checks/calculates if every previous index is already in the cache |
@team172011 I don't agree. It does not depend on any kind of PR! Things like clean code, etc. should be understandable and not just a social expression within the community. If you both agree, then please answer me: who feels responsible for cleaning up your dirt afterwards? I? If not me, no one will - history has proven that! The result today would be very different, not good code! So please think twice before anyone here seems to agree. I have more and more fear that the project will be steered in the wrong direction in the future if such fundamental, actually completely self-evident things have to be said here several times. That's tedious. |
1. Solution: If I understand it correctly, one potential solution to this problem to remove the "indexed based caching":
Hmm... but if we can solve this with "solution 1" then it would make sense to keep using it, or?
Hmm..the latest could be the last element in a sorted map or alternativly, the latest one with a time based
Something like this would also generate a random String really fast, nearly unique and is also thread safe (the parameter appendTimestamp can be replaced by simply using the With such a (time based) |
But isn't this exactly the same what I already did in this PR? I just used the endTime of each bar as primary key, it is mandatory and it makes no sense to have two bars for the same endTime -> primary key. And how would the sync work in constant time? At the moment, the BarSeries and CachedIndicator are using the maximumBarCount and removedBarCount property to calculate the "real" index. If we would only store a unique key (like my PR already does with endTime) we do not know how many bars or which bars may have been removed from the BarSeries. We would have to iterate over the whole cache values and check if they are still in the barSeries However the concept or the RecursiveCachedIndicator would not work without known and storing indices of values |
Yes, you are right - using
Hmm..yes, that's a problem with the current caching design and how we connect the things together: we have the relation |
This idea sounds very interesting. This would allow an indicator not just to cache the final value for index n but also use it for meta information like the That means we would change the Indicator interface to something like: public interface Indicator<T> {
T getValue(Bar bar);
// ...
} And the Bar interface to something like public interface Bar extends Serializable {
IndicatorValueCache<T> getCache(IndicatorInstanceKey key) // PROBLEM: T is unknown can only be Object
// ...
} The API of the IndicatorValueCache could look like this, providing also a possibility to store meta information required for public interface IndicatorValueCache<T> {
Optional<T> getValue(); // returns the value for this indicator if already calculated and stored
void putValue(T result); // stores the value
Map<String, Object> getMetaValues(); // possibility to add meta information
putMetaValues(Map<String, Object>); // possibility to get meta information
// and/or:
Optional <Object> getMetaValue(String key);
void putMetaValue(String key, Object value);
} And of course each indicator instance needs to have its own key for the cache in the Bar public interface IndicatorInstanceKey {
// needs to be stored in each indicator instance
default String getKey() {
return new UUID().toString();
}
} EDIT 1: Added comment to |
@team172011 very good, you understood exactly what I meant. And the solution you provided is very promising 👍 |
But I remember one issue I had in a previous try to refactor the caching: As far as you leave the So this suggestion would not work, because the Because of that, when playing around with implementing some concepts in swift I completely dropped the concept of having a generic class parameter for Indicators. My personal attitude is also to avoid using generics or just use them in cases where you really are able to generify something in a usefull way. But beside of my personal flavor I really thing that having a generic class parameter As far as I can see, there is no reasonable way to use this library with So one option could also be to remove the generic class parameter T by introducing two interfaces (for the time being ignoring
|
The past has shown that we really don't need such kind of abstraction level. Your solution:
is absolutely fine and completely sufficient. If in the future there is a need for another indicator type, then let's just add it in this lib, for example:
You could do it like that (an interface extending another interface).
and so on and you can put these indicator types into the package |
@team172011 Are we sure we still need the |
@TheCookieLab it depends on the stack size of your jvm.For me the org.ta4j.core.indicators.volume.OnBalanceVolumeIndicatorTest#stackOverflowError test fails, when only extending the cached indicator in OnBalanceVolumeIndicator. I am mixing up some thing but somehow it is all related in the end. The @Override
protected Num calculate(int index) {
if (index == 0) {
return zero();
}
final Num prevClose = getBarSeries().getBar(index - 1).getClosePrice();
final Num currentClose = getBarSeries().getBar(index).getClosePrice();
final Num obvPrev = getValue(index - 1);
if (prevClose.isGreaterThan(currentClose)) {
return obvPrev.minus(getBarSeries().getBar(index).getVolume());
} else if (prevClose.isLessThan(currentClose)) {
return obvPrev.plus(getBarSeries().getBar(index).getVolume());
} else {
return obvPrev;
}
}
@Override
public int getUnstableBars() {
return 0;
} Every calculation is based on the previous close and previous indicator value. So what is the unstable bars period? When is it returning "correct" values? If you call the indicator for index n and you have only three previous values the result would differ for the same index n with thirty previous values for the same BarSeries. |
Yes, good point. But it will probably take a while until #1051 has been solved. |
So far this approach does not really solve all mentioned problems, especially the question regarding recursive calculations is still unsolved. I will revert this to a draft PR |
May I ask, what access patterns does solve caching? From my research, I have found a few main scenarios. LiveTrading
Also array may be replaced with only pointer to last bar. Backtesting Secondly each optimization run have different parameters so it can not benefit from cache calculated in previous run. RecursiveIndicator in serial acces does not need to be recursive, it depends only on previous value that may be stored inside instance. See approach in #1140 I have dropped all "caching indicators" in #1148 and after completion I am going to investigate caching needs of each indicator separately to prove that global cache does not make real benefit. Only optimization of each indicator does matter (I think). |
Sure, your willingness to take on the effort is appreciated, and if you can show comparable performance without the cache then we may not need it after all. |
Suggestion on how to use a third party library (caffeine) as cache
This approach allows to use a default cache, to enable or to disable caching for specific indicator instances and to be able to provide a custom cache implementation for specific indicator instance.
This approach is using the endTime (
ZonedDateTime
) as key instead of the index. This would avoid issues with limited bar series where the indices of the underlying array may be reused.The
CachedIndicator
and theRecursiveCachedIndicator
have been removed. They will be replaced by an external cache of typeIndicatorValueCache
that can be configured (maxSize at the moment) and will be used in theAbstractIndicator
. Current cache implementations are:BaseIndicatorValueCache
NoIndicatorValueCache
NativeIndicatorValueCache
the former cache using custom circular array implementation for cachingNativeRecursiveIndicatorValueCache
the former recursive cache extending theNativeIndicatorValueCache
Caching in
AbstractIndicator
looks like this now:Usage example
Performance tests
Executed the example MovingAverageCrossOverRangeBacktest.java. 100 Runs with different subsets of the provided BarSeries, each run creates 2145 Strategies and executes them with a BacktestExecutor:
Test1
Test 2
Test 3
Test1.txt
Test2.txt
Test3.txt