SciUnit Demo¶

K. Rosenfeld

Measles Team Meeting

12/13/2021

To create slides:

jupyter nbconvert demo-20211213.ipynb --to slides --TagRemovePreprocessor.remove_input_tags={\"to_remove\"} --post serve --SlidesExporter.reveal_transition=none --SlidesExporter.reveal_scroll=True

Code vs. Model Review¶

Code review¶

What functionality is a component expected to have?
What functionality has been adequately implemented? What remains to be done?
Does a candidate code contribution cause regressions in other parts of a program?

Model review¶

What is a model's scope and how is validity measured?
Which observations are already explained by existing models? What are the best models of a particular quantity? What data has yet to be explained?
What effect do new observations have on the validity of previously published models? Can new models explain previously published data?

This presentation focuses on SciUnit, a Pythonic framework for data-driven unit testing. SciUnit is used to create a domain-specific model review package, which can then be applied across models. Each model is wrapped in a sciunit.Model that implements one or more sciunit.Capability classes, and is then judged by a sciunit.Test.

my_model = MyModel(**my_args) # Instantiate a class that wraps your model of interest.  
my_test = MyTest(**my_params) # Instantiate a test that you write.  
score = my_test.judge(my_model) # Runs the test on the model and returns a rich score containing test results and more.

SciUnit contributions 2012-2021:

ICSE conference paper from 2014 and an active repository. Heavy users in neuroscience (NeuronUnit).

Hypothetical example for orbital mechanics¶

orbital mechanics: tests¶

Test classes are data-agnostic

from cosmounit import PositionTest, VelocityTest, EccentricityTest # Cosmounit is an external library.

and test instances encode the data you want a model to recapitulate.

from . import saturn_data # Hypothetical library containing Saturn data.  
position_test = PositionTest(observation=saturn_data.position)
velocity_test = VelocityTest(observation=saturn_data.velocity)
eccentricity_test = EccentricityTest(observation=saturn_data.eccentricity)

orbital mechanics: models¶

Orbital models can make predictions for any planet, but we are interested in Saturn:

from cosmounit import PtolemyModel, CopernicusModel, KeplerModel, NewtonModel  
ptolemy_model = PtolemyModel(planet='Saturn')
copernicus_model = CopernicusModel(planet='Saturn')
kepler_model = KeplerModel(planet='Saturn')
newton_model = NewtonModel(planet='Saturn')

orbital mechanics: test suite¶

from saturn_suite.suites import saturn_motion_suite
saturn_motion_suite.judge([ptolemy_model, copernicus_model, kepler_model, newton_model])

Hypothetical example for measles / epi¶

measles epi: tests¶

Tests could be location-specific or encode a theoretical result.

from . import measles_data # Hypothetical library containing measles data.
from epiunit import CCSTest, AgeAtInfTest, SeasonalityTest # epiunit is an external library.
ccs_test = CCSTest(observation=measles_data.ccs)
age_at_inf_test = AgeAtInfTest(observation=measles_data.Nigeria)
seas_test = SeasonalityTest(observation=measles_data.Nigeria)

These tests can then be packaged into a location-specific test suite:

nigeria_epi_suite = sciunit.TestSuite([ccs_test, age_at_inf_test, seas_test])

measles epi: models¶

Models could be differentiated by version, features, type, etc...

from emod_package import EmodModel
from tsir_package import TSIRModel, DynaMICEModel 
emod_model = EmodModel(location='Nigeria')
tsir_model = TSIRModel(location='Nigeria')
mice_model = DynaMICEModel(location='Nigeria')

measles epi: test suite¶

from epi_suite.suites import nigeria_epi_suite
nigeria_epi_suite.judge([emod_model, tsir_model, mice_model])

Models do not need to be capable across the entire suite.
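
The point above can be illustrated with a small, dependency-free sketch. It does not use SciUnit itself: the capability classes, `ToyTest`, `judge_suite`, both toy models, and every number are hypothetical placeholders, and the "N/A" entry only mimics the idea behind SciUnit's NAScore rather than reproducing its implementation.

```python
class ProducesCCS:
    """Hypothetical capability: the model can report a critical community size."""

    def produce_ccs(self):
        raise NotImplementedError("Must implement produce_ccs.")


class ProducesMeanAge:
    """Hypothetical capability: the model can report a mean age at infection."""

    def produce_mean_age(self):
        raise NotImplementedError("Must implement produce_mean_age.")


class ToyTest:
    """Stand-in for a test: compares one model output to one observed number."""

    def __init__(self, name, required_capability, method, observation, tolerance):
        self.name = name
        self.required_capability = required_capability
        self.method = method
        self.observation = observation
        self.tolerance = tolerance

    def judge(self, model):
        # A model lacking the capability is skipped, not failed.
        if not isinstance(model, self.required_capability):
            return "N/A"
        prediction = getattr(model, self.method)()
        close_enough = abs(prediction - self.observation) <= self.tolerance
        return "Pass" if close_enough else "Fail"


class ToyFullModel(ProducesCCS, ProducesMeanAge):
    name = "full"

    def produce_ccs(self):
        return 300_000  # placeholder value, not real data

    def produce_mean_age(self):
        return 4.0  # placeholder value, not real data


class ToyAgeOnlyModel(ProducesMeanAge):
    name = "age-only"

    def produce_mean_age(self):
        return 9.0  # placeholder value, not real data


def judge_suite(tests, models):
    """Return a nested dict of results: model name -> test name -> outcome."""
    return {m.name: {t.name: t.judge(m) for t in tests} for m in models}


tests = [
    ToyTest("ccs", ProducesCCS, "produce_ccs", observation=250_000, tolerance=100_000),
    ToyTest("mean_age", ProducesMeanAge, "produce_mean_age", observation=5.0, tolerance=2.0),
]
matrix = judge_suite(tests, [ToyFullModel(), ToyAgeOnlyModel()])
```

Here the age-only model receives "N/A" for the CCS test instead of a failure, so the suite can still be run across a mixed set of models.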

Example test suite: constant number generation¶

Capabilities¶

Every model has a capability it aims to implement:

class ProducesNumber(sciunit.Capability):
    """An example capability for producing some generic number."""

    def produce_number(self):
        """The implementation of this method should return a number."""
        raise NotImplementedError("Must implement produce_number.")

Each model may implement that capability in its own way:

class ConstModel(sciunit.Model, ProducesNumber):
    """A model that always produces a constant number as output."""

    def __init__(self, constant, name=None):
        self.constant = constant
        super(ConstModel, self).__init__(name=name, constant=constant)

    def produce_number(self):
        return self.constant

We can then create a model instance:

const_model_37 = ConstModel(37, name='Constant Model 37')

Tests¶

A sciunit.Test class must contain:

Required model capabilities and type of score
generate_prediction to get model prediction
compute_score to compute a sciunit.Score

## Example test class
class EqualsTest(sciunit.Test):
    """Tests if the model predicts the same number as the observation."""   

    required_capabilities = (ProducesNumber,) # The one capability required for a model to take this test.  
    score_type = sciunit.scores.BooleanScore # This test's 'judge' method will return a BooleanScore.  

    def generate_prediction(self, model):
        return model.produce_number() # The model has this method if it inherits from the 'ProducesNumber' capability.

    def compute_score(self, observation, prediction):
        score = self.score_type(observation['value'] == prediction) # Returns a BooleanScore. 
        score.description = 'Passing score if the prediction equals the observation'
        return score

## Example test instance
# create test instances
equals_37_test = EqualsTest({'value': 37}, name='=37') # Test model output equals 37.
equals_1_test = EqualsTest({'value': 1}, name='=1') # Test model output equals 1.  

# create test suite
equals_suite = sciunit.TestSuite([equals_1_test, equals_37_test], name="Equals test suite")

# run suite
score_matrix = equals_suite.judge(const_model_37)

                     =1      =37
Constant Model 37    Fail    Pass

Score types¶

Complete score types in SciUnit are:

BooleanScore: true or false
ZScore: standardized difference from the mean
CohenDScore: normalized difference between two means
RatioScore: ratio of two numbers
PercentScore: float between 0 and 100
FloatScore: a float

Incomplete score types are also included (NoneScore, TBDScore, NAScore, InsufficientDataScore).
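
To make the numeric score types above concrete, here is a plain-Python sketch of the quantities they are named after. These are the textbook formulas, not SciUnit's own code: the function names, the sign conventions, and the equal-group-size pooling used for Cohen's d are assumptions made for illustration, and SciUnit's conventions may differ.

```python
import math


def z_score(prediction, obs_mean, obs_std):
    """Standardized difference of a prediction from an observed mean."""
    return (prediction - obs_mean) / obs_std


def cohen_d(mean_1, std_1, mean_2, std_2):
    """Difference between two means, normalized by a pooled standard deviation.

    Assumes both groups have the same size, so the pooled variance is the
    simple average of the two variances.
    """
    pooled_std = math.sqrt((std_1 ** 2 + std_2 ** 2) / 2)
    return (mean_2 - mean_1) / pooled_std


def ratio(prediction, observation):
    """Ratio of the predicted value to the observed value."""
    return prediction / observation


z = z_score(prediction=12.0, obs_mean=10.0, obs_std=4.0)
d = cohen_d(mean_1=10.0, std_1=3.0, mean_2=13.0, std_2=3.0)
r = ratio(prediction=15.0, observation=10.0)
```

With these made-up inputs, the prediction sits half a standard deviation above the observed mean, the two means differ by one pooled standard deviation, and the prediction is 1.5 times the observation.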

SciUnit does not include statistical tests (e.g., the Kolmogorov–Smirnov test or the Cramér–von Mises test) for comparing distributions.
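
Because distribution comparisons are not built in, a test author would need to supply one, for example inside a test's compute_score. The following is a minimal standard-library sketch of the two-sample Kolmogorov–Smirnov statistic only (no p-value); the function name and the sample values are hypothetical, and how the statistic would be turned into a SciUnit score is left open.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest absolute gap between the two
    empirical cumulative distribution functions (ECDFs)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("Both samples must be non-empty.")

    # ECDFs are step functions that only change at data points, so it is
    # enough to evaluate the gap at every distinct observed value.
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


observed = [1, 2, 3, 4]   # made-up "data" sample
simulated = [3, 4, 5, 6]  # made-up "model" sample
gap = ks_statistic(observed, simulated)
```

In practice, a library routine such as scipy.stats.ks_2samp, which also reports a p-value, would usually be preferable to a hand-written version.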

SciUnit is only one way of tackling this problem. Writing the tests requires support and time, but they can be wrapped around the final model, and the same tests can be applied across models.

Model validation:¶

sciunit (python)
scinunits (linux)
idm-test (python)

Paper reproduction:

showyourwork

References:

SciUnit paper
SciUnit docs and tutorials