Arcana Processing

Submodules

arcana.processing.data_processing module

This module prepares data for the model. It includes the following functions:

1. get_data_for_model: get the data for the model
2. data_splits: split the data into train, validation and test data
3. standardize_data: standardize the data based on the train data
4. tensorized_and_pad: convert the data to tensors and pad them
5. pad_the_splits: pad the train, validation and test data
6. prepare_data_for_model: main function for data preparation

class arcana.processing.data_processing.DataPreparation(general_config, data_config, procedure_config)

Bases: object

Data preparation class

data_splits(data, ratio)

Split the data into train, validation and test data
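The reference does not say how ratio is interpreted, so the following is only a minimal sketch of a ratio-based three-way split, not Arcana's implementation. It assumes ratio is the train fraction and that the remainder is divided evenly between validation and test. The helper name split_by_ratio, the seed argument, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd


def split_by_ratio(data: pd.DataFrame, ratio: float, seed: int = 0):
    """Hypothetical helper: shuffle rows, then cut into train/val/test.

    Assumption: `ratio` is the train fraction; the remaining rows are
    divided evenly between validation and test.
    """
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    # Shuffle reproducibly so the split does not depend on row order.
    shuffled = data.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train = round(len(shuffled) * ratio)
    n_val = (len(shuffled) - n_train) // 2
    train = shuffled.iloc[:n_train]
    val = shuffled.iloc[n_train:n_train + n_val]
    test = shuffled.iloc[n_train + n_val:]
    return train, val, test


# Toy data with illustrative column names.
df = pd.DataFrame({"cycle": np.arange(10), "value": np.linspace(0.0, 1.0, 10)})
train, val, test = split_by_ratio(df, ratio=0.6)
```

For sequence data it is often preferable to split by sample identifier rather than by row, so that all rows of one sample land in the same split; the sketch above ignores that for brevity.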

get_data_for_model()

Get the data for the model

get_max_available_scaled_cycle()

Get the maximum available scaled cycle

pad_the_splits(train, val, test)

Pad the train, validation and test data

Parameters:
  • train (pandas dataframe) – train data

  • val (pandas dataframe) – validation data

  • test (pandas dataframe) – test data

Returns:
  • padded_train (list) – list of padded train data

  • padded_val (list) – list of padded validation data

  • padded_test (list) – list of padded test data

prepare_data_for_model()

Main function for data preparation

prepare_test_data_for_pretrained_model()

Prepare the test data for the pretrained model. This is used for fine-tuning.

standardize_data()

Standardize the data based on the train data
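Standardizing "based on the train data" usually means computing the statistics on the train split only and reusing them for validation and test, so that no information leaks from the held-out splits. The sketch below illustrates that idea with plain pandas. The function name standardize_with_train_stats and its explicit arguments are hypothetical, and the scaler that Arcana actually uses is not documented here.

```python
import pandas as pd


def standardize_with_train_stats(train, val, test):
    """Hypothetical helper: z-score all splits using train statistics only."""
    mean = train.mean()
    # Replace zero standard deviations with 1 so constant columns
    # do not cause a division by zero.
    std = train.std(ddof=0).replace(0.0, 1.0)
    return tuple((split - mean) / std for split in (train, val, test))


# Toy splits with illustrative column names.
train = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [10.0, 10.0, 10.0]})
val = pd.DataFrame({"x": [4.0], "y": [12.0]})
test = pd.DataFrame({"x": [0.0], "y": [10.0]})

train_s, val_s, test_s = standardize_with_train_stats(train, val, test)
```

After this step the train columns have zero mean, and non-constant train columns have unit variance. The validation and test columns generally do not, which is expected.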

tensorized_and_pad(data, padded_data, data_lengths)

Convert the data to tensors and pad them

Parameters:
  • data (pandas dataframe) – data to be converted to tensor

  • padded_data (list) – list of padded data

  • data_lengths (list) – list of data lengths

Returns:
  • padded_data (list) – list of padded data

  • data_lengths (list) – list of data lengths
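As an illustration of the general technique behind tensorized_and_pad and pad_the_splits, the sketch below converts variable-length sequences from a DataFrame into tensors and pads them to a common length. It assumes PyTorch is the tensor library and that sequences are identified by a sample_id column. Both are assumptions, as are the feature column names, and this is not the library's own code.

```python
import pandas as pd
import torch
from torch.nn.utils.rnn import pad_sequence

# Toy data: sample 0 has three time steps, sample 1 has one.
df = pd.DataFrame({
    "sample_id": [0, 0, 0, 1],
    "feature_a": [1.0, 3.0, 5.0, 7.0],
    "feature_b": [2.0, 4.0, 6.0, 8.0],
})

# One float tensor of shape (time_steps, n_features) per sample.
sequences = [
    torch.tensor(group[["feature_a", "feature_b"]].to_numpy(), dtype=torch.float32)
    for _, group in df.groupby("sample_id")
]

# Keep the original lengths so padded steps can be masked out later.
data_lengths = [seq.shape[0] for seq in sequences]

# Pad with zeros up to the longest sequence: (n_samples, max_len, n_features).
padded_data = pad_sequence(sequences, batch_first=True, padding_value=0.0)
```

Returning the lengths alongside the padded tensor mirrors the documented return values and lets downstream code ignore the padded positions, for example when computing a loss.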

Module contents