Chalk home page
Docs
API
CLI
  1. Features
  2. Underscore

Underscore expressions

Underscore expressions can be used to define and resolve features from operations on other features.

Consider the feature class RocketEngine defined below:

from chalk.features import features

@features
class RocketEngine:
    id: int
    mass: float
    volume: float

Suppose we wanted to define a feature called density and resolve it as mass / volume. We can use the underscore feature definition syntax as follows:

from chalk.features import features
from chalk.features import _, features

@features
class RocketEngine:
  id: int
  mass: float
  volume: float

  # Defining density using an underscore expression
  density: float = _.mass / _.volume

Note that _ is a reference to the containing feature class RocketEngine and should be used to reference features contained in the same feature class.


DataFrames

Underscore can also be used to define expressions with DataFrame features.

In the example below, we will define two new feature classes, Astronaut and RocketShip:

from chalk.features import features, has_one, has_many, DataFrame

@features
class Astronaut:
    id: int
    name: str
    height: float
    mass: float
    ship_id: int

@features
class RocketShip:
    id: int
    engine_id: int

    # `has_one` relationship with `RocketEngine`
    engine: RocketEngine = has_one(lambda: RocketShip.engine_id == RocketEngine.id)

    # `has_many` relationship with `Astronaut`
    astronauts: DataFrame[Astronaut] = has_many(lambda: Astronaut.ship_id == RocketShip.id)

Projections

Suppose we wanted to define a feature called total_mass on RocketShip and resolve it as the sum of the masses of the astronauts on the ship and the mass of the engine.

Using the underscore syntax, we can define total_mass as follows:

from chalk.features import features, has_one, has_many, DataFrame
from chalk.features import _, features, has_one, has_many, DataFrame

@features
class Astronaut:
  id: int
  name: str
  height: float
  mass: float
  ship_id: int

@features
class RocketShip:
  id: int
  engine_id: int

  # `has_one` relationship with `RocketEngine`
  engine: RocketEngine = has_one(lambda: RocketShip.engine_id == RocketEngine.id)

  # `has_many` relationship with `Astronaut`
  astronauts: DataFrame[Astronaut] = has_many(lambda: Astronaut.ship_id == RocketShip.id)

  # Defining total_mass using an underscore expression
  total_mass: float = _.engine.mass + _.astronauts[_.mass].sum()

Here we used the underscore syntax do a projection of the mass column in the astronauts DataFrame.

Filters

Underscores can also be used to define filters on DataFrame features.

For example, suppose we wanted to define a feature called count_tall_astronauts on RocketShip. We could define the feature as follows

@features
class RocketShip:
  id: int
  engine_id: int

  # `has_one` relationship with `RocketEngine`
  engine: RocketEngine = has_one(lambda: RocketShip.engine_id == RocketEngine.id)

  # `has_many` relationship with `Astronaut`
  astronauts: DataFrame[Astronaut] = has_many(lambda: Astronaut.ship_id == RocketShip.id)

  # Defining total_mass using an underscore expression
  total_mass: float = _.engine.mass + _.astronauts[_.mass].sum()

  # Filtering on the `height` column
  count_tall_astronauts = _.astronauts[_.height > 2.0].count()