Module containing different datasets.
ExternalMMSData
ExternalMMSData(dataset_path, rootdir=None, transform=None, cache=True, return_epoch=True)
Bases: Dataset
Loads a dataset of labeled MMS data described by a dataset file.
This dataset class looks for data stored in CDF files in another
location. By default, SpacePhyML looks for external MMS data at the
PySPEDAS data location.
If the PySPEDAS environment variables are not set, data will be placed
at $HOME/spacephyml_data/mms, following the same directory structure as
PySPEDAS (and the
MMS Science Data Center).
Data files that are missing when the class is initialised will be
downloaded.
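The lookup described above can be pictured with a minimal sketch. The function `resolve_mms_data_dir` is hypothetical and not part of SpacePhyML. `MMS_DATA_DIR` and `SPEDAS_DATA_DIR` are the environment variables PySPEDAS uses for its data location, but the order in which SpacePhyML checks them is an assumption here.

```python
import os
from pathlib import Path


def resolve_mms_data_dir(env=None, home=None):
    """Guess the MMS data directory (hypothetical helper, not SpacePhyML API)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Assumption: a mission-specific variable takes precedence, as in PySPEDAS.
    if env.get("MMS_DATA_DIR"):
        return Path(env["MMS_DATA_DIR"])

    # Assumption: the general PySPEDAS root holds MMS data in an "mms" subfolder.
    if env.get("SPEDAS_DATA_DIR"):
        return Path(env["SPEDAS_DATA_DIR"]) / "mms"

    # Fallback stated in the text above: $HOME/spacephyml_data/mms
    return home / "spacephyml_data" / "mms"
```

Passing `env` and `home` explicitly keeps the sketch easy to try without changing the real environment.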
The dataset file must have the following columns:
- label : The label corresponding to the sample.
- epoch : The CDF epoch for the label.
- file {i} : The MMS CDF file to read data from; {i} is a running number.
- var_name {i} : The variable in the CDF file to read; {i} is a running number.
- epoch {i} : The CDF epoch to read data from; {i} is a running number.
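As an illustration of the column layout, the sketch below builds a CSV header and one placeholder row. The helpers `dataset_header` and `example_csv` are hypothetical. The separator between a column name and its running number, and the number the count starts at, are assumptions (the list above renders them as `file {i}`), so check them against a real dataset file. All values in the row are placeholders, not real MMS data.

```python
import csv
import io


def dataset_header(n_vars, sep="_", start=0):
    """Build the column names for a dataset file with n_vars variables.

    Assumption: `sep` and `start` are guesses about how the running
    number {i} is attached to the column name.
    """
    header = ["label", "epoch"]
    for i in range(start, start + n_vars):
        header += [f"file{sep}{i}", f"var_name{sep}{i}", f"epoch{sep}{i}"]
    return header


def example_csv(n_vars=1):
    """Return CSV text with the header and one row of placeholder values."""
    header = dataset_header(n_vars)
    row = {}
    for name in header:
        if name.startswith("file"):
            row[name] = "path/to/file.cdf"  # placeholder path
        elif name.startswith("var_name"):
            row[name] = "variable_name"     # placeholder variable
        else:
            row[name] = 0                   # placeholder label / epoch
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header)
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()
```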
Warning
If loading data fails, the CDF file may be corrupt. Delete the failing file and retry.
Examples:
>>> from spacephyml.datasets.general import ExternalMMSData
>>> dataset = ExternalMMSData('./mydataset.csv')
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_path | string | Path to the file containing the dataset. | required |
| rootdir | string | Overrides the default root directory for MMS data storage. | None |
| transform | callable | Optional transform to be applied on each sample. | None |
| cache | bool | If data should be cached. | True |
| return_epoch | bool | If the label epoch should be returned. | True |
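A minimal sketch of passing the optional parameters, using the constructor signature shown at the top of this entry. The helpers `scale_by` and `build_dataset` are hypothetical. It assumes the transform receives the data part of a sample as something that supports multiplication (for example an array or tensor), which is not confirmed here.

```python
def scale_by(factor):
    """Return a transform that multiplies a sample by a constant factor."""
    def transform(sample):
        # Assumption: the sample supports `*` (array, tensor, or number).
        return sample * factor
    return transform


def build_dataset(dataset_path, rootdir=None):
    """Create the dataset with a custom transform (hypothetical helper)."""
    # Imported lazily so the transform can be tried without SpacePhyML installed.
    from spacephyml.datasets.general import ExternalMMSData

    return ExternalMMSData(
        dataset_path,
        rootdir=rootdir,            # None keeps the default data location
        transform=scale_by(1e-3),   # applied on each sample
        cache=True,
        return_epoch=False,         # do not return the label epoch
    )
```

Calling `build_dataset('./mydataset.csv')` would download any missing data files when the class is initialised, as described above.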