Monday, October 31, 2016

Carousel Cotton Candy

Version 0.3

I'm super excited to announce the Cotton Candy release of Carousel, version 0.3 on PyPI and GitHub. There were a few more issues I really wanted to close with this release, but I decided to push it forward anyway. So the remaining milestones for v0.3 will get pushed to v0.3.1.

  • issue #62 use Meta class for all layers - currently usage is spotty and inconsistent. I wanted to keep the number of commits to close PR #68 to a minimum (following the winning workflow) so I only implemented Meta classes where I had to. In fact it's not even implemented in the DataSource example below.
  • issue #25 move all folders into project package - this is already how I have set up the PVPower demo. It just makes more sense with new style models to have them all in the same package.
  • issue #63 and issue #22 split calculations into separate parameters - I knew this couldn't be done by v0.3, it was a stretch goal, but I'm super excited about this. By moving dependencies to an attribute of each parameter, the DAG shows which calculations are orthogonal, so we can run them simultaneously. According to issue #22, I had this idea already, but when I saw this presentation at PyData SF 2016 on Airflow by Matt Davis I was even more motivated to make it happen.
  • issue #59 and issue #73 which don't seem like they're relevant, but they both have to do with implementing a Calculator class whose job it is to crunch through the calculations, somewhat similar to what DataReaders and FormulaImporter do for their layers. Then the boiler plate uncertainty propagation code in the static class could be applied to any calculator such as a dynamic calculator, a linear system solver for an acyclic DAG of linear equations or a non-linear system solver for an acyclic DAG of non-linear equations.

The Parameter class

What's new in Carousel-0.3 (Cotton Candy)? The biggest difference is the introduction of the Parameter class which is now used to specify parameters for data, formulas, outputs, calculations and simulation settings. For example, previously data parameters would be entered as a dictionary of attributes.

Bicycle Bears

class PVPowerData(DataSource):
    """
    Data sources for PV Power demo.
    """
    data_reader = ArgumentReader
    latitude = {"units": "degrees", "uncertainty": 1.0}
    longitude = {"units": "degrees", "uncertainty": 1.0}
    elevation = {"units": "meters", "uncertainty": 1.0}

Cotton Candy

class PVPowerData(DataSource):
    """
    Data sources for PV Power demo.
    """
    latitude = DataParameter(units="degrees", uncertainty=1.0)
    longitude = DataParameter(units="degrees", uncertainty=1.0)
    elevation = DataParameter(units="meters", uncertainty=1.0)

    class Meta:
        data_reader = ArgumentReader

Why the change? The Bicycle Bear version did not have any way to distinguish parameters, like latitude from attributes of the DataSource like data_reader. This had two unfortunate side-effects:

  • each layer attribute had to be hardcoded in the base metaclass so that they wouldn't be misinterpreted as parameters
  • and users could not define any custom class attributes, because they would be misinterpreted as parameters and stripped from the class by the metaclass.

The Cotton Candy version makes it easy for the metaclass to determine which class attributes are parameters, which are attributes of the layer and then leaves everything else alone. Every layer now has a corresponding Parameter subclass which also defines some base attributes corresponding to that layer. Any extra parameter attributes are saved in extras. Attributes that apply to the entire layer are now specified in the `Meta` class attribute, similar to Django, Marshmallow and DRF. The similarities are completely intentional as I have been strongly inspired by those project code bases. Unfortunately, the Meta class is only partially implemented, but will be the major focus of v0.3.1.

Separated Formulas

The Formula class is also improved. Now each formulas is a Parameter with attributes, rather than a giant dictionary. This improvement is still on the roadmap for the Calculation class. As I said above, it was a stretch goal for this release.

No comments:

Post a Comment

Fork me on GitHub