Matmul on Blackwell: Part 2 – Using Hardware Features to Optimize Matmul