Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

high relative error between sig() vs extending with displacements via sigjoin() #19

Open
mmp3 opened this issue Jan 9, 2023 · 0 comments

Comments

@mmp3
Copy link

mmp3 commented Jan 9, 2023

I am observing high relative error when comparing two approaches for computing the signature of a path. I am writing to confirm if the high relative error is expected, or if I am doing something wrong.

The two approaches are:

Approach 1: compute the signature of the path directly via sig().
Approach 2: compute the signature for an initial chunk of the path, and then "extend" the signature to the full signature by incroporating successive displacements via sigjoin().

I am interested in the signature at each timepoint, so I used format=2 in the call to sig(), and I compare the two approaches by comparing the signature at each timepoint.

Here is a minimal reproducible example:

import numpy as np                                                              
import iisignature                                                              
                                                                                
np.random.seed(42)                                                              
                                                                                
                                                                                
# notation and constants                                                              
D = 3                                                                           
M = 3                                                                           
T = 1000                                                                        
initial_size = 3                                                                
signature_length = iisignature.siglength(D, M)                                  
                                                                                
# make a path                                                                     
path = np.stack(                                                                
    (                                                                           
        np.random.normal(size=T),                                               
        np.random.normal(size=T),                                               
        np.random.normal(size=T),                                               
    ),                                                                          
    axis=1,                                                                     
)                                                                               
                                                                                
# Approach 1: compute the signature directly                                    
direct_signature = iisignature.sig(path, M, 2)                                  
                                                                                
# Approach 2: compute the signature by incrementally                            
# adding displacements to an initial signature.                                 
running_signature = np.empty(                                                   
    shape=(                                                                     
        T - 1,                                                                  
        signature_length,                                                       
    )                                                                           
)                                                                               
                                                                                
# compute the signature for an initial chunk of the path.                       
running_signature[: (initial_size - 1), :] = iisignature.sig(                   
    path[:initial_size, :], M, 2                                                
)                                                                               
                                                                                
# extend the signature by adding displacements.                                 
for t in range(initial_size, T - 1):                                            
    current_datapoint = path[t, :]                                              
    previous_datapoint = path[t - 1, :]                                         
    displacement = current_datapoint - previous_datapoint                       
                                                                                
    update = iisignature.sigjoin(running_signature[t - 2, :], displacement, M)  
    running_signature[t - 1, :] = update                                        
                                                                                
                                                                                
# compare the signatures computed in Approach 1 vs Approach 2.                 
abs_diff = np.absolute(direct_signature - running_signature)  
                                                                              
denominator = np.maximum(                                                       
    np.absolute(direct_signature),                                              
    np.absolute(running_signature),                                             
)                                                                               
                                                                                
relative_error = abs_diff / denominator                                         
sorted_relative_error = np.flip(np.sort(relative_error.flatten()))              
                                                                                
                                                                                
# print                                                                         
print("Top 100 relative error:")                                                
print(sorted_relative_error[:100])                                              
                                                                                
print("\nRelative error by quantile:")                                          
for quantile in [0.95, 0.96, 0.97, 0.98, 0.99, 0.995, 0.999, 1.00]:             
    qerror = np.nanquantile(relative_error, quantile)                           
    print(f"  {quantile}:  {qerror}")

Running the above script generated the following output:

Top 100 relative error:
[1.77889903 1.75197024 1.74931133 1.53113897 1.43709795 1.35398972
 1.19961425 1.09537373 1.07515143 1.07361146 1.06185186 1.04532904
 1.03441084 1.02479755 1.0011026  1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 0.99952614 0.99901951 0.99269932 0.98260195 0.97493537 0.97284908
 0.96509212 0.8476231  0.8367656  0.82640576 0.80160814 0.7749034
 0.72631975 0.70964534 0.67476374 0.64989362 0.64582039 0.63875046
 0.60588317 0.59209884 0.57517917 0.56833303 0.54908256 0.51336958
 0.49694044 0.47375533 0.36536458 0.3638046  0.328896   0.32234955
 0.30920041 0.26370045 0.2165778  0.19417226 0.182693   0.17473405
 0.17305372 0.15839498 0.15739248 0.15720863 0.15269935 0.14275847
 0.11872745 0.11720925 0.11585532 0.10494738]

Relative error by quantile:
  0.95:  1.950945410653788e-05
  0.96:  3.1805970732785893e-05
  0.97:  6.201639573069697e-05
  0.98:  0.0001756239441881015
  0.99:  0.0012692327526374583
  0.995:  0.008403982333750958
  0.999:  1.0
  1.0:  1.7788990349469285

From the relative error by quantile, we see that overall the computed signatures are very similar. However, from the "Top 100 relative error", we see that there are, I thought, a surprisingly large number of components of signatures with substantial relative error.

I found this surprising because, as far as I can tell from looking into the code (and theory), this extension approach (Approach 2) is basically what the code does internally to compute signatures; i.e. sig(), sigjoin(), and sigcombine() all use concatenateWith() in an analogous fashion to the loop I used above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant