GDPVal: Measuring the performance of our models on real-world tasks