Module containing different datasets.

ExternalMMSData

ExternalMMSData(dataset_path, rootdir=None, transform=None, cache=True, return_epoch=True)

Bases: Dataset

Loads a dataset of labeled MMS data described by a dataset file.

This dataset class reads its data from CDF files stored in a separate location. By default, SpacePhyML looks for external MMS data at the PySPEDAS data location. If the PySPEDAS environment variables are not set, data is placed in $HOME/spacephyml_data/mms, following the same directory structure as PySPEDAS (and the MMS Science Data Center). Data files that are missing when the class is initialised are downloaded.
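The lookup described above can be pictured with a minimal sketch. It assumes that the relevant PySPEDAS environment variables are `MMS_DATA_DIR` and `SPEDAS_DATA_DIR`, and that an explicit `rootdir` takes precedence over both. The helper name `resolve_mms_data_dir` and the exact precedence order are illustrative assumptions, not SpacePhyML's actual implementation.

```python
import os
from pathlib import Path


def resolve_mms_data_dir(rootdir=None, env=None):
    """Hypothetical helper: pick the directory to search for MMS CDF files."""
    env = os.environ if env is None else env
    if rootdir is not None:
        # Assumption: an explicit rootdir overrides every default.
        return Path(rootdir)
    if env.get("MMS_DATA_DIR"):
        # PySPEDAS variable pointing directly at the MMS data directory.
        return Path(env["MMS_DATA_DIR"])
    if env.get("SPEDAS_DATA_DIR"):
        # PySPEDAS keeps each mission in a subdirectory of its general data directory.
        return Path(env["SPEDAS_DATA_DIR"]) / "mms"
    # Fallback location named in the description above.
    return Path.home() / "spacephyml_data" / "mms"
```

Passing a plain dictionary as `env` makes the behaviour easy to check without touching the real environment, for example `resolve_mms_data_dir(env={"SPEDAS_DATA_DIR": "/data"})`.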

The dataset file must contain the following columns:

  • label : The label corresponding to the sample
  • epoch : The CDF epoch for the label
  • file {i} : The MMS CDF file to read data from; {i} is a running number.
  • var_name {i} : The variable in the CDF file to read; {i} is a running number.
  • epoch {i} : The CDF epoch to read data from; {i} is a running number.

Warning

If loading data fails, the CDF file may be corrupt. Delete the failing file and retry.
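As a rough illustration of the column layout listed above, the sketch below builds a header for a dataset file with two data sources and writes a single row of CSV text. The header spelling used here (`file_0`, `var_name_0`, `epoch_0`, joined with an underscore and counted from 0) is an assumption, as is the helper name `dataset_header`; every value in the row is a placeholder, not real MMS data.

```python
import csv
import io


def dataset_header(n_sources, sep="_"):
    """Hypothetical helper: column names for a dataset file with n_sources inputs."""
    header = ["label", "epoch"]
    for i in range(n_sources):
        # Assumption: the running number {i} is appended with `sep`, starting at 0.
        header += [f"file{sep}{i}", f"var_name{sep}{i}", f"epoch{sep}{i}"]
    return header


n_sources = 2
header = dataset_header(n_sources)

# Placeholder row: none of these values refer to real files, variables, or epochs.
row = {"label": 0, "epoch": 0}
for i in range(n_sources):
    row[f"file_{i}"] = f"placeholder_{i}.cdf"
    row[f"var_name_{i}"] = f"placeholder_variable_{i}"
    row[f"epoch_{i}"] = 0

# Write to an in-memory buffer; swap in open("mydataset.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=header, lineterminator="\n")
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```

If the real column names use a different separator or start counting at 1, only `dataset_header` and the row keys need to change.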

Examples:

>>> from spacephyml.datasets.general import ExternalMMSData
>>> dataset = ExternalMMSData('./mydataset.csv')

Parameters:

Name          Type      Description                                                 Default
dataset_path  string    Path to the file containing the dataset.                    required
rootdir       string    Overrides the default root directory for MMS data storage.  None
transform     callable  Optional transform applied to each sample.                  None
cache         bool      Whether data should be cached.                              True
return_epoch  bool      Whether the label epoch should be returned.                 True
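
The `transform` parameter accepts any callable. A minimal sketch of one is shown here, assuming a sample is a flat sequence of positive numbers and that the callable receives only the sample data; the real sample type is not specified in this section, so the function `log_scale` is hypothetical and would likely need adapting (for instance to arrays or tensors).

```python
import math


def log_scale(sample):
    """Hypothetical transform: base-10 logarithm of every value in the sample."""
    # Assumption: `sample` is an iterable of positive numbers.
    return [math.log10(value) for value in sample]


scaled = log_scale([1.0, 10.0, 100.0])
```

Such a callable would be passed at construction time, for example `ExternalMMSData('./mydataset.csv', transform=log_scale)`.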