
Conv2D: Add CPU version #14320


Draft · wants to merge 1 commit into master

Conversation

@am17an (Collaborator) commented Jun 21, 2025

Adding this as a draft because, at the moment, it does not seem to be consistently faster than going through im2col, though in some cases it is. The solution is currently completely unoptimized and I'm looking to optimize it; it might also be useful for #14316.

| Input Size | Kernel Config | IM2COL (ms) | SIMD (ms) | Speedup |
|---|---|---|---|---|
| 8x8x3 | 3x3x3→16 s1 p0 | 0.300 | 0.013 | 23.08x SIMD |
| 8x8x3 | 3x3x3→16 s1 p1 | 0.020 | 0.017 | 1.18x SIMD |
| 16x16x8 | 5x5x8→32 s2 p2 | 0.066 | 0.070 | 1.06x IM2COL |
| 32x32x64 | 1x1x64→128 s1 p0 | 0.930 | 6.485 | 6.97x IM2COL |
| 16x16x16 | 3x3x16→32 s1 p1 | 0.359 | 0.485 | 1.35x IM2COL |
| 64x64x3 | 3x3x3→32 s1 p1 | 1.387 | 2.757 | 1.99x IM2COL |
| 128x128x16 | 3x3x16→32 s1 p1 | 9.760 | 73.721 | 7.55x IM2COL |
| 128x128x32 | 3x3x32→64 s1 p1 | 20.337 | 187.484 | 9.22x IM2COL |
| 64x64x64 | 3x3x64→128 s1 p1 | 11.420 | 235.696 | 20.64x IM2COL |
| 224x224x3 | 3x3x3→32 s1 p1 | 14.899 | 25.178 | 1.69x IM2COL |
| 224x224x3 | 7x7x3→64 s2 p3 | 10.947 | 69.425 | 6.34x IM2COL |
| 512x512x3 | 3x3x3→16 s1 p1 | 46.892 | 53.811 | 1.15x IM2COL |
| 512x512x3 | 3x3x3→16 s2 p1 | 13.348 | 17.387 | 1.30x IM2COL |
| 56x56x64 | 1x1x64→128 s1 p0 | 2.848 | 5.834 | 2.05x IM2COL |
| 28x28x128 | 1x1x128→256 s1 p0 | 1.460 | 4.830 | 3.31x IM2COL |
| 14x14x256 | 1x1x256→512 s1 p0 | 0.897 | 5.093 | 5.68x IM2COL |
| 256x256x8 | 3x3x8→8 s1 p1 | 17.228 | 7.223 | 2.39x SIMD |
| 512x512x4 | 3x3x4→4 s1 p1 | 36.965 | 10.013 | 3.69x SIMD |
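
For orientation, a minimal sketch of a direct (no im2col buffer) CPU conv2d loop, of the kind this PR is benchmarking against the im2col path. This is illustrative only: the layout (NCHW-style, contiguous f32) and the name `conv2d_direct_f32` are assumptions, not code from this PR.

```c
#include <stddef.h>

// Naive direct convolution (cross-correlation), f32, single batch.
// src: [C_in][IH][IW], ker: [C_out][C_in][KH][KW], dst: [C_out][OH][OW]
static void conv2d_direct_f32(
        const float * src, const float * ker, float * dst,
        int IW, int IH, int C_in,
        int KW, int KH, int C_out,
        int stride, int pad) {
    const int OW = (IW + 2*pad - KW)/stride + 1;
    const int OH = (IH + 2*pad - KH)/stride + 1;

    for (int oc = 0; oc < C_out; ++oc) {
        for (int oy = 0; oy < OH; ++oy) {
            for (int ox = 0; ox < OW; ++ox) {
                float acc = 0.0f;
                for (int ic = 0; ic < C_in; ++ic) {
                    for (int ky = 0; ky < KH; ++ky) {
                        const int iy = oy*stride + ky - pad;
                        if (iy < 0 || iy >= IH) continue; // zero padding
                        for (int kx = 0; kx < KW; ++kx) {
                            const int ix = ox*stride + kx - pad;
                            if (ix < 0 || ix >= IW) continue; // zero padding
                            acc += src[(ic*IH + iy)*IW + ix] *
                                   ker[((oc*C_in + ic)*KH + ky)*KW + kx];
                        }
                    }
                }
                dst[(oc*OH + oy)*OW + ox] = acc;
            }
        }
    }
}
```

The innermost accumulation over `C_in*KH*KW` is the part that vectorization/threading would target; the im2col path instead materializes those patches into a scratch buffer and runs a matrix multiplication over it.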

@github-actions bot added the `ggml` label (changes relating to the ggml tensor library for machine learning) on Jun 21, 2025
@etasnadi (Contributor) commented Jun 22, 2025

Check memory usage too. Naive im2col can use a lot of memory (maybe not the CPU version?), so even if your code is slower it is worth adding such an in-place version, especially for training conv layers, where memory footprint matters a lot.

Isn't vec_dot_f16/f32 faster than omp for computing the inner products?
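
To make the memory point concrete, a back-of-the-envelope estimate for one of the benchmark cases above, assuming a naive f32 im2col buffer of one `KW*KH*C_in` column per output position (my assumption, not measured from this PR):

```c
#include <stdio.h>

// Rough im2col scratch-buffer size for the 512x512x3 input,
// 3x3 kernel, stride 1, pad 1 case (output stays 512x512).
int main(void) {
    const long OW = 512, OH = 512;        // output positions
    const long KW = 3, KH = 3, C_in = 3;  // kernel size and input channels
    const long bytes = OW*OH * KW*KH*C_in * (long) sizeof(float);
    // ~27 MiB of scratch, versus a ~3 MiB input tensor.
    printf("im2col buffer: %.1f MiB\n", bytes/(1024.0*1024.0));
    return 0;
}
```

A direct kernel avoids that intermediate entirely, which is where the training use case would benefit.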

Labels: ggml (changes relating to the ggml tensor library for machine learning)

2 participants