ML Training System

To make creating and testing ML models easier, we provide an ML model creation suite. This is a beta feature, so please bear with us while we iron out the bugs!

Installation

Get the toolbox by typing the following commands in your terminal:

pip uninstall auquan_toolbox
pip install -U auquan_toolbox_beta --no-cache-dir

For the ML Training System to work, the following need to be specified:

DataSource

Similar to Trading System above.

DataSplit Ratio

A list specifying the relative proportions of Training, Validation and Test data, e.g. [6,2,2] for a 60/20/20 split. The model is trained on the Training data, validated on the Validation data, and then backtested for trading on the Test data.
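As a rough illustration of how a ratio list such as [6,2,2] maps onto data, here is a minimal sketch. The helper `split_by_ratio` is hypothetical and not part of the toolbox, and it assumes the split is made on consecutive, time-ordered rows; the toolbox's own splitting logic may differ.

```python
def split_by_ratio(rows, ratio):
    """Split an ordered sequence into consecutive chunks sized by `ratio`.

    Hypothetical helper for illustration only, not a toolbox function.
    """
    total = sum(ratio)
    n = len(rows)
    bounds = []
    cumulative = 0
    for part in ratio:
        cumulative += part
        # End index of this chunk, proportional to the cumulative ratio
        bounds.append(n * cumulative // total)
    starts = [0] + bounds[:-1]
    return [rows[start:end] for start, end in zip(starts, bounds)]


# Ten time-ordered rows split 60/20/20
train, validation, test = split_by_ratio(list(range(10)), [6, 2, 2])
print(train, validation, test)
```

Keeping the chunks consecutive means the test rows always come after the training rows, which avoids training on data from the period being backtested.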

Features

InstrumentFeatureConfigDicts

Similar to Trading System above.

CustomFeatures

Similar to Trading System above.

Target Variable

The variable you want to predict. You can specify one or several target variables. Each can be either derived from one of the features created above (with a shift) or loaded from a file, as below:

def getTargetVariableConfigDicts(self):
        # Option 1: use a target column loaded directly from the data source
        Y = {'featureKey' : 'Y',
             'featureId' : '',
             'params' : {}}
        # Option 2: create a target variable from an existing feature
        tv = {'featureKey' : 'direction_tv',
              'featureId' : 'direction',
              'params' : {'period' : 5,
                          'featureName' : 'ma_5',
                          'shift' : 5}}
        # Only tv is returned here; Y is defined to show the alternative
        return {INSTRUMENT_TYPE_STOCK : [tv]}
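To make the idea of a target "with a shift" concrete, here is a minimal sketch of one plausible reading of the `direction` target above: the sign of the change in a feature over `period` steps, looked up `shift` steps into the future. This interpretation, and the helper `direction_target`, are assumptions for illustration; the toolbox's actual `direction` feature may be computed differently.

```python
def direction_target(values, period, shift):
    """For each index t, return the sign of
    values[t + shift] - values[t + shift - period],
    or None where the required data is not available.

    Hypothetical helper, not a toolbox function.
    """
    targets = []
    for t in range(len(values)):
        future = t + shift
        past = future - period
        if future >= len(values) or past < 0:
            targets.append(None)  # no future data to label this row
            continue
        change = values[future] - values[past]
        targets.append((change > 0) - (change < 0))  # 1, 0 or -1
    return targets


ma_5 = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.0, 10.9]
targets = direction_target(ma_5, period=5, shift=5)
print(targets)
```

Note that the last `shift` rows have no label, because the future value they would be compared against does not exist yet.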

FeatureSelectionConfigDicts

Use this to specify the methods used for feature selection. For example, you can reduce the overall set of features by keeping only those that show a certain minimum correlation with the target variable. Currently available methods are:

def getFeatureSelectionConfigDicts(self):
        corr = {'featureSelectionKey': 'corr',
                'featureSelectionId' : 'pearson_correlation',
                'params' : {'startPeriod' : 0,
                            'endPeriod' : 60,
                            'steps' : 10,
                            'threshold' : 0.1,
                            'topK' : 2}}

        genericSelect = {'featureSelectionKey' : 'gus',
                         'featureSelectionId' : 'generic_univariate_select',
                         'params' : {'scoreFunction' : 'f_classif',
                                     'mode' : 'k_best',
                                     'modeParam' : 'all'}}

        rfecvSelect = {'featureSelectionKey': 'rfecv',
                       'featureSelectionId': 'rfecv_selection',
                       'params' : {'estimator' : 'LinearRegression',
                                   'estimator_params' : {},
                                   'step' : 1,
                                   'cv' : None,
                                   'scoring' : None,
                                   'n_jobs' : 2}}

        return {INSTRUMENT_TYPE_STOCK : [genericSelect]}
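The correlation-based selection described above can be sketched in plain Python as follows. This is not the toolbox's `pearson_correlation` implementation: the helper names are hypothetical, only the `threshold` and `topK` ideas are modelled, and `startPeriod`, `endPeriod` and `steps` are ignored.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(features, target, threshold, top_k):
    """Keep features whose absolute correlation with the target is at
    least `threshold`, then return the `top_k` strongest of those."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    kept.sort(key=lambda name: scores[name], reverse=True)
    return kept[:top_k]


features = {
    'ma_5': [1.0, 2.0, 3.0, 4.0, 5.0],
    'noise': [2.0, -1.0, 2.0, -1.0, 2.0],
    'inverse': [5.0, 4.0, 3.0, 2.0, 1.5],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1]
selected = select_by_correlation(features, target, threshold=0.1, top_k=2)
print(selected)
```

The absolute value is used so that strongly negatively correlated features, such as `inverse` here, are kept as well.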

FeatureTransformationConfigDicts

The methods you want to use to normalize or transform features. Currently available methods are:

def getFeatureTransformationConfigDicts(self):
        stdScaler = {'featureTransformKey': 'stdScaler',
                     'featureTransformId' : 'standard_transform',
                     'params' : {}}

        minmaxScaler = {'featureTransformKey' : 'minmaxScaler',
                        'featureTransformId' : 'minmax_transform',
                        'params' : {'low' : -1,
                                    'high' : 1}}
        pcaScaler = {'featureTransformKey' : 'pcaScaler',
                     'featureTransformId' : 'pca_transform',
                     'params' : {'n_comp' : 6,
                                 'copy' : True,
                                 'whiten' : False,
                                 'svd' : 'full',
                                 'itr_power' : 'auto',
                                 'random_state' : None}}
        return {INSTRUMENT_TYPE_STOCK : [stdScaler]}
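For intuition about what the standard and min-max transforms do to a feature column, here is a minimal sketch in plain Python. The function names mirror the `featureTransformId` values above but are hypothetical stand-ins, not the toolbox's implementations; the `low` and `high` arguments are assumed to set the output range, as the config suggests.

```python
import statistics


def standard_transform(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def minmax_transform(values, low=-1.0, high=1.0):
    """Linearly map values so that min -> low and max -> high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        return [(low + high) / 2 for _ in values]
    scale = (high - low) / (largest - smallest)
    return [low + (v - smallest) * scale for v in values]


prices = [10.0, 12.0, 14.0, 16.0, 18.0]
standardized = standard_transform(prices)
rescaled = minmax_transform(prices, low=-1.0, high=1.0)
print(standardized)
print(rescaled)
```

In practice the scaling statistics (mean, standard deviation, minimum, maximum) would be estimated on the Training data only and then reused on the Validation and Test data.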

Model Config Dicts

The ML models you want to train. Currently available models are:

def getModelConfigDicts(self):
        regression_model = {'modelKey': 'linear_regression',
                            'modelId' : 'linear_regression',
                            'params' : {}}

        mlp_regression_model = {'modelKey': 'mlp_regression',
                                'modelId' : 'mlp_regression',
                                'params' : {}}

        classification_model = {'modelKey': 'logistic_regression',
                                'modelId' : 'logistic_regression',
                                'params' : {}}

        mlp_classification_model = {'modelKey': 'mlp_classification',
                                    'modelId' : 'mlp_classification',
                                    'params' : {}}

        svm_model = {'modelKey': 'svm_model',
                     'modelId' : 'support_vector_machine',
                     'params' : {}}
        return {INSTRUMENT_TYPE_STOCK : [classification_model]}

Prediction Function

You do not have to specify a prediction function. The toolbox automatically creates a prediction function using the features and trained model from above.
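Conceptually, such a prediction function chains the selected features, the fitted transformation and the trained model. The sketch below shows that composition with hypothetical stand-ins (`make_prediction_function`, `scale` and `classify` are illustrative names, not toolbox APIs, and the weights are made up); the function the toolbox generates may be structured differently.

```python
def make_prediction_function(feature_names, transform, model):
    """Bundle selected features, a fitted transform and a fitted model
    into a single callable."""
    def predict(feature_row):
        # Keep only the selected features, in a fixed order
        inputs = [feature_row[name] for name in feature_names]
        return model(transform(inputs))
    return predict


# Hypothetical stand-in for a fitted scaler
def scale(inputs):
    return [x / 10.0 for x in inputs]


# Hypothetical stand-in for a trained binary classifier
def classify(inputs):
    weights = [0.8, -0.5]  # made-up weights for illustration
    score = sum(w * x for w, x in zip(weights, inputs))
    return 1 if score > 0 else 0


predict = make_prediction_function(['ma_5', 'momentum'], scale, classify)
signal = predict({'ma_5': 12.0, 'momentum': 3.0, 'unused': 99.0})
print(signal)
```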