MBConvBlockWithoutDepthwise stride implemented in 1x1 projection, wasting expansion arithmetic #660

`MBConvBlockWithoutDepthwise` implements stride in the `1x1` projection convolution. When stride=2, the projection keeps only one of every four spatial positions, so 3/4 of the activations produced by the `3x3` expansion are computed and then discarded. Implementing the stride in the `3x3` expansion convolution instead would give an equivalent block and reduce the total block arithmetic by almost a factor of 4, because the expansion accounts for most of the cost.
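To make the "almost a factor of 4" estimate concrete, here is a rough multiply-accumulate count for both placements of the stride. This is only a sketch: `conv_macs` and `block_macs` are hypothetical helpers, the shapes are made up for illustration, and batch norm, activations, squeeze-excite and the skip connection are ignored.

```python
def conv_macs(height, width, kernel, in_ch, out_ch, stride):
    """MACs of a 'same'-padded conv: one kernel*kernel*in_ch dot product per output value."""
    out_h = -(-height // stride)  # ceil division, matching padding='same'
    out_w = -(-width // stride)
    return out_h * out_w * kernel * kernel * in_ch * out_ch


def block_macs(height, width, in_ch, expand_ch, out_ch, stride, stride_in_expand):
    """MACs of the 3x3 expansion plus the 1x1 projection for either stride placement."""
    if stride_in_expand:
        # Proposed: the expansion already produces the downsampled map.
        expand = conv_macs(height, width, 3, in_ch, expand_ch, stride)
        out_h, out_w = -(-height // stride), -(-width // stride)
        project = conv_macs(out_h, out_w, 1, expand_ch, out_ch, 1)
    else:
        # Current: full-resolution expansion, then a strided projection.
        expand = conv_macs(height, width, 3, in_ch, expand_ch, 1)
        project = conv_macs(height, width, 1, expand_ch, out_ch, stride)
    return expand + project


# Hypothetical shapes, not taken from any specific EfficientNet variant.
current = block_macs(56, 56, 24, 96, 48, stride=2, stride_in_expand=False)
proposed = block_macs(56, 56, 24, 96, 48, stride=2, stride_in_expand=True)
ratio = current / proposed
print(current, proposed, ratio)
```

The projection cost is the same in both cases, which is why the saving stays somewhat below 4x. For these particular shapes the ratio works out to roughly 3.45.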

https://github.com/tensorflow/tpu/blob/8462d083dd89489a79e3200bcc8d4063bf362186/models/official/efficientnet/efficientnet_model.py#L422-L442

	self._expand_conv = tf.layers.Conv2D(
	    filters,
	    kernel_size=[3, 3],
	    strides=[1, 1],
	    kernel_initializer=conv_kernel_initializer,
	    padding='same',
	    use_bias=False)
	self._bn0 = self._batch_norm(
	    axis=self._channel_axis,
	    momentum=self._batch_norm_momentum,
	    epsilon=self._batch_norm_epsilon)

	# Output phase:
	filters = self._block_args.output_filters
	self._project_conv = tf.layers.Conv2D(
	    filters,
	    kernel_size=[1, 1],
	    strides=self._block_args.strides,
	    kernel_initializer=conv_kernel_initializer,
	    padding='same',
	    use_bias=False)
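The equivalence can be checked on a toy case. The sketch below is a pure-Python, single-channel, 1D stand-in for the two convolutions, not the TensorFlow code: `conv1d_same` and `pointwise` are hypothetical helpers, and the input length is odd so that both variants sample the same positions. With `padding='same'` and an even input size, TensorFlow pads a strided `3x3` convolution asymmetrically, so the sampled positions can shift by one pixel relative to the strided `1x1`. Batch norm statistics in training mode would also be computed over different sets of positions. Both effects are ignored here.

```python
import random


def conv1d_same(x, w, stride=1):
    """Single-channel 1D conv with an odd kernel and symmetric zero padding.

    Outputs are centered on input positions 0, stride, 2*stride, ...
    """
    pad = len(w) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(w[k] * padded[i + k] for k in range(len(w)))
            for i in range(0, len(x), stride)]


def pointwise(x, scale, stride=1):
    """Stand-in for a 1x1 conv on one channel: keep every stride-th value and scale it."""
    return [scale * v for v in x[::stride]]


random.seed(0)
x = [random.random() for _ in range(9)]  # odd length keeps the sampling aligned
w3 = [0.25, 0.5, -0.75]                  # arbitrary kernel weights

# Current layout: full-resolution expansion, stride applied in the projection.
current = pointwise(conv1d_same(x, w3, stride=1), 2.0, stride=2)
# Proposed layout: stride applied in the expansion, unit-stride projection.
proposed = pointwise(conv1d_same(x, w3, stride=2), 2.0, stride=1)

print(current == proposed)
```

The proposed layout evaluates the expensive kernel only at the positions the projection would have kept anyway, which is where the saving comes from.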
