That last ArrayConvert.Build method made me curious, so I ran a quick BenchmarkDotNet comparison between the original loop-based version and a simple Buffer.BlockCopy implementation.
Release build, .NET 10.0.3.
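| Method    | Size  | Mean        | Error       | StdDev      | Median      | Ratio | RatioSD | Gen0    | Gen1    | Gen2    | Allocated | Alloc Ratio |
|---------- |------ |------------:|------------:|------------:|------------:|------:|--------:|--------:|--------:|--------:|----------:|------------:|
| Original  | 4096  |  4,094.5 ns |    74.51 ns |    73.17 ns |  4,078.0 ns |  1.00 |    0.02 |  0.9766 |       - |       - |   8.03 KB |        1.00 |
| BlockCopy | 4096  |    366.6 ns |    15.12 ns |    44.35 ns |    353.9 ns |  0.09 |    0.01 |  0.9823 |       - |       - |   8.03 KB |        1.00 |
| Original  | 16384 | 16,381.6 ns |   322.53 ns |   345.11 ns | 16,307.2 ns |  1.00 |    0.03 |  3.8757 |       - |       - |  32.03 KB |        1.00 |
| BlockCopy | 16384 |  1,409.2 ns |    28.47 ns |    75.50 ns |  1,399.4 ns |  0.09 |    0.00 |  3.9043 |       - |       - |  32.03 KB |        1.00 |
| Original  | 65536 | 86,562.0 ns | 1,688.07 ns | 2,474.35 ns | 85,771.1 ns |  1.00 |    0.04 | 41.6260 | 41.6260 | 41.6260 | 128.04 KB |        1.00 |
| BlockCopy | 65536 | 36,093.2 ns |   713.88 ns | 1,192.74 ns | 35,702.5 ns |  0.42 |    0.02 | 41.6260 | 41.6260 | 41.6260 | 128.04 KB |        1.00 |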
For the smaller sizes (4096 and 16384) BlockCopy is roughly 11× faster (Ratio 0.09).
For the largest size (65536) the gap shrinks to about 2.4× (Ratio 0.42), but it’s still noticeably faster.
Allocations are identical in both cases (one byte[] per call), so this is purely a copy-performance difference.
Just sharing the numbers, since your post made me dig into this part out of curiosity.
For reference, this is the version I tested: