Are the "efficient-kan" and "official-kan" equivalent in terms of algorithms? #35
Replies: 14 comments 1 reply
-
As far as I know they're almost the same; only the official version appears to add a bias after each layer. I'm also not sure whether the initialization is identical, and the regularization loss is changed because of the optimizations.
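For anyone comparing the two forward passes, the bias difference reduces to something like this. This is a toy numpy sketch: `toy_basis` is a stand-in for the real B-spline basis, and the variable names are illustrative, not the actual attribute names in either repo.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim, n_basis = 3, 2, 5

def silu(x):
    return x / (1.0 + np.exp(-x))

def toy_basis(x, n_basis):
    # Stand-in for the B-spline basis functions (illustrative only).
    return np.stack([x ** k for k in range(n_basis)], axis=-1)

x = rng.normal(size=(4, in_dim))
base_w   = rng.normal(size=(out_dim, in_dim))
spline_w = rng.normal(size=(out_dim, in_dim, n_basis))
bias     = rng.normal(size=(out_dim,))

# Both implementations share the residual base path plus the spline path.
base_out   = silu(x) @ base_w.T
spline_out = np.einsum('bik,oik->bo', toy_basis(x, n_basis), spline_w)

efficient_out = base_out + spline_out          # efficient-kan style
official_out  = base_out + spline_out + bias   # official adds a per-layer bias

# The two outputs differ exactly by the bias term.
assert np.allclose(official_out - efficient_out, bias)
```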
-
@Indoxer Thanks, you are so kind.
-
No, I'm not quite sure. *Including the use of the official LBFGS training strategy.
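For context, the official pykan trainer uses LBFGS by default (as I understand it), and torch's LBFGS needs a closure that re-evaluates the loss, unlike Adam/SGD. A minimal sketch of that pattern on a toy fit, not pykan's actual loop:

```python
import torch

torch.manual_seed(0)
# Toy regression: fit y = sin(x) with a small linear-in-features model.
x = torch.linspace(-3, 3, 64).unsqueeze(1)
y = torch.sin(x)
feats = torch.cat([x, x ** 2, x ** 3], dim=1)
w = torch.zeros((3, 1), requires_grad=True)

# Full-batch LBFGS, as opposed to minibatch Adam/SGD; the training
# strategy alone can change results noticeably on small problems.
opt = torch.optim.LBFGS([w], lr=1.0, max_iter=20, history_size=10)

def closure():
    opt.zero_grad()
    loss = torch.mean((feats @ w - y) ** 2)
    loss.backward()
    return loss

for _ in range(5):
    final = opt.step(closure)  # step() returns the closure's loss
```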
-
I think this is acceptable; after all, the model is very efficient, and some loss is normal. It would be strange if there were no loss at all. It effectively retains the characteristics of the official model while also incorporating the training optimizations.
-
@WhatMelonGua, are you sure you didn't train spline_scaler and base_weights? Also, did you use the same parameters in the LBFGS optimizer (number of steps, etc.)?
-
(spline_scaler not trained, base_weights not trained.) (I am using my own modified version, but it's the same algorithm as efficient-kan, so I am not sure.)
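For anyone trying to reproduce a "spline-only" run like this, freezing parameter groups in torch is just a matter of turning off their gradients. A minimal sketch, using a plain `Linear` as a stand-in for the KAN layer's `base_weight`/`spline_scaler` parameters:

```python
import torch

layer = torch.nn.Linear(4, 2)

# Freeze a parameter group so the optimizer never updates it
# (the same pattern applies to base_weight / spline_scaler in a KAN layer).
layer.weight.requires_grad_(False)

# Only still-trainable parameters should be handed to the optimizer.
trainable = [name for name, p in layer.named_parameters() if p.requires_grad]
assert trainable == ['bias']
```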
-
Oh, yes, forgive me for forgetting.
-
reg_ is the regularization loss.
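For reference, my understanding is that the efficient-kan regularizer is an L1-style term on the spline coefficients plus an entropy term, approximating the official activation-based regularization. A numpy sketch of that assumed form (names and weighting factors are illustrative):

```python
import numpy as np

def kan_reg_loss(spline_w, lam_act=1.0, lam_ent=1.0):
    """Sketch of an efficient-kan style regularizer (assumed form):
    an L1 term on spline coefficients plus an entropy term that
    encourages importance to concentrate on a few edges."""
    l1 = np.abs(spline_w).mean(axis=-1)        # per-edge mean |coefficient|
    act = l1.sum()                             # "activation magnitude" term
    p = l1.flatten() / (act + 1e-12)           # normalized edge importance
    ent = -(p * np.log(p + 1e-12)).sum()       # entropy over edges
    return lam_act * act + lam_ent * ent

# Uniform coefficients: act = 6 edges * 1.0, entropy = log(6).
w = np.ones((2, 3, 5))
loss = kan_reg_loss(w)
```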
-
Here are my results and code, so you can compare.
-
AFAIK the only difference is that the "efficient" regularization loss differs from the official one. But I'm not sure whether the parallel associativity introduces numerical error large enough to break any important features.
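The associativity concern can be demonstrated directly: a parallel reduction changes the order of floating-point additions, and the results can differ, but only at machine-epsilon scale. A self-contained illustration:

```python
# Summing the same values in two different orders, as a parallel
# reduction would: the results may differ, but only by ~machine epsilon.
xs = [0.1] * 10

# Left-to-right (sequential) summation.
left_to_right = 0.0
for v in xs:
    left_to_right += v

# Pairwise (tree) summation, mimicking a parallel reduction order.
def pairwise(vals):
    if len(vals) == 1:
        return vals[0]
    mid = len(vals) // 2
    return pairwise(vals[:mid]) + pairwise(vals[mid:])

tree = pairwise(xs)
diff = abs(left_to_right - tree)
assert diff < 1e-12  # tiny; far below anything a model would notice
```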
-
Just found that I missed the bias term after each layer. Will update that soon. I scanned over this long thread a few days ago and totally missed the comment by @Indoxer lol
-
Hi, is there any plan to support update_grid_from_samples and initialize_from_another_model from the original KANs? A lot of use cases work much better when we use these APIs, so I think it's critical for KANs.
-
It seems like the algorithms are somewhat equivalent. As someone who is much more interested in the symbolic-representation aspect: is there enough consistency between the original KAN implementation and what's here to do formula approximation?
-
as title