Given an input Fi of size M × N × 64, stage I outputs feature maps F′i of size M/8 × N/8 × 256 in the default configuration.
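
The shape arithmetic above can be checked with a minimal sketch. The helper name `stage1_output_shape` and its parameters are hypothetical. The sketch assumes M and N are divisible by 8, since the text does not say how other sizes are handled, and it models only the resulting tensor shape, not the layers inside stage I.

```python
def stage1_output_shape(m, n, in_channels=64, downsample=8, out_channels=256):
    """Return the (height, width, channels) shape after stage I.

    Hypothetical helper: it reproduces only the size relation stated in the
    text (M x N x 64 -> M/8 x N/8 x 256), not the stage's actual layers.
    """
    if in_channels != 64:
        # The text specifies a 64-channel input in the default configuration.
        raise ValueError("expected a 64-channel input")
    if m % downsample or n % downsample:
        # Assumption: spatial sizes divide evenly by the downsampling factor.
        raise ValueError("M and N must be divisible by the downsampling factor")
    return (m // downsample, n // downsample, out_channels)


print(stage1_output_shape(256, 512))  # (32, 64, 256)
```

For a 256 × 512 × 64 input, the spatial resolution drops by a factor of 8 along each axis while the channel count grows from 64 to 256.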