
The Faure sequence is wrong #2653

Open
mbaudin47 opened this issue May 11, 2024 · 6 comments · May be fixed by #2654

mbaudin47 (Collaborator) commented May 11, 2024

What happened?

The Faure sequence generates wrong points in any dimension.

Here are the proofs of this.

  • In dimension 2, a set of 4 points does not correctly fill the 4 base-2 elementary volumes.
  • In dimension 3, the first 9 generated points are wrong: the basis is obviously 2 instead of 3, and the third dimension is a copy of the first.
  • The estimated mean of the GSobol' function does not converge to its true value.

Many thanks to Félix Husson (EDF & Institut Galilée) for pointing out this bug.

One of the reasons the bug went undetected is the way the unit test t_FaureSequence_std.cxx is implemented. Since the algorithm produces a set of points, the simplest and most reliable way to implement the unit test is to check the coordinates of the generated points against a pre-defined reference Sample. This reference sample can be computed with another library, e.g. Matlab, Scilab or Mathematica, and several dimensions, e.g. 2, 3 and 4, should be checked. Instead, the first part of the current unit test checks the library against itself, which cannot detect this kind of bug. The second part of the unit test estimates the number $\pi$; unfortunately, even the faulty algorithm converges with a relative accuracy of about $10^{-4}$, so this test has little value: the fact that the estimate converges is only weak evidence that the method is correct, since we would also expect the rate of convergence to be right (it is not). Finally, the sample size is equal to 1000 instead of a power of the basis, i.e. 2 in this particular case.
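As a sketch of what such an independent reference could look like, here is a minimal pure-Python Faure generator following the classical definition (radical inverse combined with powers of the Pascal matrix modulo the base; the function name `faure_point` and the exact indexing convention, starting at n = 1, are my assumptions, not OpenTURNS code):

```python
from math import comb

def faure_point(n, dim, base):
    """Return the n-th point (n >= 1) of the Faure sequence of the given
    dimension, in the given prime base (base >= dim)."""
    # digits of n in the given base, least significant first
    digits = []
    m = n
    while m > 0:
        digits.append(m % base)
        m //= base
    point = []
    for j in range(dim):
        # coordinate j applies the Pascal matrix to the j-th power modulo base
        x = 0.0
        for i in range(len(digits)):
            d = sum(comb(k, i) * j ** (k - i) * digits[k]
                    for k in range(i, len(digits))) % base
            x += d * base ** (-(i + 1))
        point.append(x)
    return point

# first points in dimension 3, base 3: (1/3, 1/3, 1/3), (2/3, 2/3, 2/3), ...
print(faure_point(1, 3, 3))
print(faure_point(3, 3, 3))  # (1/9, 4/9, 7/9)
```

These values reproduce the reference points of the dimension-3 sequence quoted below, so a unit test could compare such an external table with the library output.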

How to reproduce the issue?

Proof #1: In dimension 2

import openturns as ot
import openturns.usecases.ishigami_function as ishigami
import openturns.viewer as viewer
import matplotlib.pyplot as plt

basis = 2
dimension = 2
taille = basis**dimension
distribution = ot.ComposedDistribution([ot.Uniform(0.0, 1.0)] * dimension)
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, distribution, taille, False)
sample = experiment_QMC.generate()
fig = viewer.PlotDesign(
    sample,
    subdivisions=[basis] * dimension,
    bounds=ot.Interval([0.0] * dimension, [1.0] * dimension),
)
plt.title(f"n = {taille}")
fig.set_figwidth(3.0)
fig.set_figheight(3.0)

[figure]
Figure 1. The first 4 points of the Faure sequence in dimension 2. There should be exactly one point in each elementary volume, but the upper-left cell contains 2 points.

Proof #2: In dimension 3

The script below defines the first 9 points of the Faure sequence in dimension 3. These values are from [1].

# First 9 points of Faure sequence in dimension 3
expected = ot.Sample([
    [1 / 3, 1 / 3, 1 / 3],
    [2 / 3, 2 / 3, 2 / 3],
    [1 / 9, 4 / 9, 7 / 9],
    [4 / 9, 7 / 9, 1 / 9],
    [7 / 9, 1 / 9, 4 / 9],
    [2 / 9, 8 / 9, 5 / 9],
    [5 / 9, 2 / 9, 8 / 9],
    [8 / 9, 5 / 9, 2 / 9],
    [1 / 27, 16 / 27, 13 / 27],
])

Now compare to the result from OpenTURNS.

dimension = 3
taille = 9
distribution = ot.ComposedDistribution([ot.Uniform(0.0, 1.0)] * dimension)
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, distribution, taille, False)
sample = experiment_QMC.generate()
sample

which prints:

    [ y0     y1     y2     ]
0 : [ 0.5    0.5    0.5    ]
1 : [ 0.25   0.75   0.25   ]
2 : [ 0.75   0.25   0.75   ]
3 : [ 0.125  0.625  0.125  ]
4 : [ 0.625  0.125  0.625  ]
5 : [ 0.375  0.375  0.375  ]
6 : [ 0.875  0.875  0.875  ]
7 : [ 0.0625 0.9375 0.0625 ]
8 : [ 0.5625 0.4375 0.5625 ]

We see that the basis is equal to 2 instead of 3, and that the third dimension is a copy of the first one.
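One way to confirm the base-2 behaviour: the first column printed above coincides with the van der Corput sequence in base 2. A quick pure-Python check, independent of OpenTURNS (the function name `van_der_corput` is mine):

```python
def van_der_corput(n, base=2):
    """n-th term (n >= 1) of the van der Corput sequence: the radical
    inverse of n in the given base."""
    x, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, digit = divmod(n, base)
        x += digit / denom
    return x

# first column of the OpenTURNS output above
observed = [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625, 0.5625]
reference = [van_der_corput(n, base=2) for n in range(1, 10)]
print(observed == reference)  # True
```

So the faulty implementation is generating a base-2 sequence even though dimension 3 requires base 3.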

Proof #3: Convergence on GSobol'

Notice that the lack of convergence of the faulty Faure sequence is not so obvious on the Ishigami function, because the third input variable is not very influential; only the speed of convergence is affected there.

We consider the GSobol' test function and compute its mean.

a = [0.0, 9.0, 99.0]
def GSobolModel(X):
    X = ot.Point(X)
    d = X.getDimension()
    Y = 1.0
    for i in range(d):
        Y *= (abs(4.0 * X[i] - 2.0) + a[i]) / (1.0 + a[i])
    return ot.Point([Y])

gSobolFunction = ot.PythonFunction(dimension, 1, GSobolModel)
gSobolFunction.setOutputDescription(["Y"])

# Define the distribution
distributionList = [ot.Uniform(0.0, 1.0) for i in range(dimension)]
distribution = ot.ComposedDistribution(distributionList)

name = "GSobol"

# Compute Mean
gSobolMean = 1.0


dimension = 3
basis = 3  # i.e., the smallest prime number greater than or equal to the dimension
sampleSize = basis ** 7  # the sample size should be a power of the basis
sequence = ot.FaureSequence()
experiment_QMC = ot.LowDiscrepancyExperiment(sequence, im.distributionX, sampleSize, False)
inputSample = experiment_QMC.generate()
outputSample = gSobolFunction(inputSample)
print(f"n = {sampleSize} , estimate = {outputSample.computeMean()[0]}")
print(f"Expected = {gSobolMean}")

We get:

n = 2187 , estimate = 10.722134899604974
Expected = 1.0
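For reference, the exact mean of the GSobol' function is 1: each factor $(|4x_i - 2| + a_i)/(1 + a_i)$ has unit mean over $[0, 1]$, and the inputs are independent, so the mean of the product is the product of the means. A crude Monte Carlo sanity check in pure Python (independent of OpenTURNS, fixed seed for reproducibility) confirms it:

```python
import random

a = [0.0, 9.0, 99.0]

def gsobol(x):
    """GSobol' function: product of (|4 x_i - 2| + a_i) / (1 + a_i)."""
    y = 1.0
    for ai, xi in zip(a, x):
        y *= (abs(4.0 * xi - 2.0) + ai) / (1.0 + ai)
    return y

random.seed(0)
n = 100_000
estimate = sum(gsobol([random.random() for _ in a]) for _ in range(n)) / n
print(estimate)  # close to 1.0
```

This makes the estimate 10.72 produced by the faulty sequence all the more striking.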

Version

1.22

Operating System

unknown

Installation media

unknown

Additional Context

No response

Footnotes

  1. "Low Discrepancy Toolbox Manual", Michael Baudin. Version 0.1. May 2013. https://gitlab.com/scilab/forge/lowdisc

@mbaudin47 mbaudin47 added the bug label May 11, 2024
regislebrun (Member) commented May 11, 2024

@mbaudin47 Nice catch, and thanks for the detailed analysis. The faulty part seems to be in LowDiscrepancySequence::GetNextPrimeNumber(n), which is supposed to return the first prime number greater than or equal to n, but returns the first prime number greater than or equal to n-1 when primesieve is used (see https://github.com/kimwalisch/primesieve/blob/master/doc/CPP_API.md#primesieveiteratorjump_to-since-primesieve-110).
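The off-by-one can be illustrated with a toy next-prime helper (a sketch of the described behavior, not the actual OpenTURNS code): the intended semantics give 3 as the smallest prime >= 3, while the faulty variant returns 2, which is exactly the base observed in the dimension-3 output above.

```python
def is_prime(m):
    """Trial-division primality test, sufficient for small m."""
    if m < 2:
        return False
    i = 2
    while i * i <= m:
        if m % i == 0:
            return False
        i += 1
    return True

def next_prime(n):
    """Intended behavior: first prime >= n."""
    while not is_prime(n):
        n += 1
    return n

def next_prime_off_by_one(n):
    """Reported faulty behavior: first prime >= n - 1."""
    return next_prime(n - 1)

print(next_prime(3), next_prime_off_by_one(3))  # 3 2
```

With the faulty variant, a dimension-3 Faure sequence is built in base 2 instead of base 3.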

Once the bug is fixed, running your script (after replacing im.distributionX with distribution) I get:

    [ y0       y1       y2       ]
0 : [ 0.333333 0.333333 0.333333 ]
1 : [ 0.666667 0.666667 0.666667 ]
2 : [ 0.111111 0.444444 0.777778 ]
3 : [ 0.444444 0.777778 0.111111 ]
4 : [ 0.777778 0.111111 0.444444 ]
5 : [ 0.222222 0.888889 0.555556 ]
6 : [ 0.555556 0.222222 0.888889 ]
7 : [ 0.888889 0.555556 0.222222 ]
8 : [ 0.037037 0.592593 0.481481 ]
n = 2187 , estimate = 0.9998313836954922
Expected = 1.0

mbaudin47 (Collaborator, Author) commented:
I tried to improve some of the low-discrepancy unit tests in #2654. I was not able to reproduce the bug with my Linux build of the library: do you know why?

@mbaudin47 mbaudin47 linked a pull request May 12, 2024 that will close this issue
mbaudin47 added a commit to mbaudin47/openturns that referenced this issue May 12, 2024
@jschueller jschueller added this to the 1.23 milestone May 13, 2024
mbaudin47 (Collaborator, Author) commented May 13, 2024

Is there a workaround, like uninstalling PrimeSieve?

jschueller (Member) commented:

No need to uninstall; you can pass -DUSE_PRIMESIEVE=OFF to cmake.

mbaudin47 (Collaborator, Author) commented:
What if the installation is based on Conda: is there a way to remove that package/module, or is it integrated within the OT binary?

jschueller (Member) commented:

It's not possible to change it at runtime.

mbaudin47 added a commit to mbaudin47/openturns that referenced this issue May 14, 2024