In magnetoencephalography (MEG) research, a variety of inversion methods are used to transform sensor data into estimates of brain activity. Each new inversion scheme is generally justified against a specific simulated or task scenario. The choice of this scenario will, however, have a large impact on how well the scheme performs. We describe a method with minimal selection bias to quantify algorithm performance using human resting state data. These recordings provide a generic, heterogeneous, and plentiful functional substrate against which to test different MEG recording and reconstruction approaches. We used a hidden Markov model to spatio-temporally partition the data into self-similar dynamic states. To test the anatomical precision that could be achieved, we then inverted these data onto libraries of systematically distorted subject-specific cortical meshes and compared the quality of the fit using cross-validation and a free energy metric. This revealed which inversion scheme was able to identify the least distorted (most accurate) anatomical models, and allowed us to quantify an upper bound on the mean anatomical distortion accordingly. We used two resting state datasets, one recorded with head-casts and one without. In the head-cast data, the Empirical Bayesian Beamformer (EBB) algorithm showed the best mean anatomical discrimination (3.7 mm) compared with Minimum Norm/LORETA (6.0 mm) and Multiple Sparse Priors (9.4 mm). This pattern was replicated in the second (conventional) dataset, although with a marginally (non-significantly) poorer prediction of the held-out (cross-validated) data. Our findings suggest that the abundant resting state data now commonly available could be used to refine and validate MEG source reconstruction methods and/or recording paradigms.
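The cross-validation comparison described above can be illustrated with a toy sketch. It is not the pipeline used in this work and rests on several simplifying assumptions: the lead field is a random Gaussian matrix rather than a physical forward model, "distortion" is a unitless additive perturbation of that matrix rather than a geometric deformation of a cortical mesh, the inversion is a plain Tikhonov-regularised minimum-norm estimate, and there are fewer sources than sensors so that the undistorted model can predict held-out channels well. All function names and parameter values are hypothetical. The free energy metric is not covered.

```python
import numpy as np


def min_norm_inverse(L, Y, lam):
    """Tikhonov-regularised minimum-norm estimate: J = L^T (L L^T + lam I)^-1 Y."""
    gram = L @ L.T + lam * np.eye(L.shape[0])
    return L.T @ np.linalg.solve(gram, Y)


def cv_error(L_model, Y, n_folds=4, rel_lam=1e-3):
    """Mean squared prediction error on held-out sensors, averaged over folds."""
    n_sensors = Y.shape[0]
    folds = np.array_split(np.arange(n_sensors), n_folds)
    errs = []
    for test in folds:
        train = np.setdiff1d(np.arange(n_sensors), test)
        L_tr = L_model[train]
        # Scale the regularisation to the average sensor-level gain (assumed rule).
        lam = rel_lam * np.trace(L_tr @ L_tr.T) / len(train)
        J_hat = min_norm_inverse(L_tr, Y[train], lam)
        # Predict the channels that were left out of the fit.
        Y_pred = L_model[test] @ J_hat
        errs.append(np.mean((Y[test] - Y_pred) ** 2))
    return float(np.mean(errs))


def cv_error_by_distortion(distortions=(0.0, 0.5, 1.0, 2.0), seed=0):
    """Simulate data from an undistorted toy model, then score distorted models."""
    rng = np.random.default_rng(seed)
    n_sensors, n_sources, n_times = 40, 10, 20  # toy sizes, not realistic
    L_true = rng.standard_normal((n_sensors, n_sources))
    J = rng.standard_normal((n_sources, n_times))
    Y = L_true @ J + 0.1 * rng.standard_normal((n_sensors, n_times))
    E = rng.standard_normal(L_true.shape)  # fixed perturbation direction
    return {d: cv_error(L_true + d * E, Y) for d in distortions}


errors = cv_error_by_distortion()
best = min(errors, key=errors.get)
print(errors, "-> best-fitting distortion level:", best)
```

In this toy setting the model with the lowest held-out error is taken as the best-fitting one. The smallest distortion that a given inversion scheme can reliably tell apart from the undistorted model plays the role of the anatomical discrimination figures quoted above.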