ML Training System
To make creating and testing ML models easier, we provide an ML model creation suite. This is a beta feature, so please bear with us while we iron out the bugs!
Installation
Get the toolbox by typing the following commands in your terminal:
pip uninstall auquan_toolbox
pip install -U auquan_toolbox_beta --no-cache-dir
For the ML Training System to work, you need to specify the following:
DataSource
Similar to Trading System above.
DataSplit Ratio
A list specifying the relative proportions of Training, Validation and Test data, e.g. [6,2,2] for a 60/20/20 split. The model is trained on the Training data, validated on the Validation data, and then backtested for trading on the Test data.
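As a rough illustration of how a ratio such as [6,2,2] divides a dataset, the sketch below cuts an ordered sequence into consecutive training, validation and test portions. The function name `split_by_ratio` is hypothetical and not part of the toolbox, and the assumption that the split is made in order (oldest rows first) is ours; the toolbox's own splitting logic may differ.

```python
def split_by_ratio(rows, ratio):
    """Split rows, in order, into consecutive chunks proportional to ratio."""
    total = sum(ratio)
    n = len(rows)
    chunks = []
    start = 0
    cumulative = 0
    for part in ratio:
        cumulative += part
        end = n * cumulative // total  # integer boundary of this chunk
        chunks.append(rows[start:end])
        start = end
    return chunks


# Hypothetical usage: 10 rows split 60/20/20
train, validation, test = split_by_ratio(list(range(10)), [6, 2, 2])
```

With ten rows, the first six go to training, the next two to validation and the last two to testing.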
Features
InstrumentFeatureConfigDicts
Similar to Trading System above.
CustomFeatures
Similar to Trading System above.
Target Variable
The variable you want to predict. You can specify one or more target variables. Each can either be built from one of the features created above (with a shift) or loaded from a file, as shown below:
def getTargetVariableConfigDicts(self):
    # Use a feature loaded from the data source as the target
    Y = {'featureKey': 'Y',
         'featureId': '',
         'params': {}}
    # Create a target variable from a feature
    tv = {'featureKey': 'direction_tv',
          'featureId': 'direction',
          'params': {'period': 5,
                     'featureName': 'ma_5',
                     'shift': 5}}
    # Only tv is returned in this example
    return {INSTRUMENT_TYPE_STOCK: [tv]}
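To make the idea of a target "with a shift" more concrete, here is a minimal sketch of one plausible interpretation: each row is labelled with the direction of a feature a fixed number of steps in the future. The function `shifted_direction_target` is hypothetical, and this reading of the 'direction' feature and its 'shift' parameter is an assumption, not the toolbox's documented behaviour.

```python
def shifted_direction_target(feature, shift):
    """Label each row with the direction of the feature `shift` steps ahead.

    Returns 1 if the feature is higher `shift` steps later, 0 otherwise,
    and None where no future value exists. Assumed interpretation only,
    not the toolbox's actual 'direction' feature.
    """
    targets = []
    for t in range(len(feature)):
        future = t + shift
        if future >= len(feature):
            targets.append(None)  # not enough future data for a label
        else:
            targets.append(1 if feature[future] > feature[t] else 0)
    return targets


# Hypothetical moving-average values
ma_5 = [10.0, 10.5, 10.2, 10.8, 11.0, 10.4]
labels = shifted_direction_target(ma_5, shift=2)
```

The last `shift` rows have no label because their future values are not yet known, which is why a shifted target shortens the usable training set.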
FeatureSelectionConfigDicts
Use this to specify the methods used for feature selection. For example, you can reduce the overall feature set by keeping only those features whose correlation with the target variable exceeds a minimum threshold. Currently available methods are:
def getFeatureSelectionConfigDicts(self):
    corr = {'featureSelectionKey': 'corr',
            'featureSelectionId': 'pearson_correlation',
            'params': {'startPeriod': 0,
                       'endPeriod': 60,
                       'steps': 10,
                       'threshold': 0.1,
                       'topK': 2}}
    genericSelect = {'featureSelectionKey': 'gus',
                     'featureSelectionId': 'generic_univariate_select',
                     'params': {'scoreFunction': 'f_classif',
                                'mode': 'k_best',
                                'modeParam': 'all'}}
    rfecvSelect = {'featureSelectionKey': 'rfecv',
                   'featureSelectionId': 'rfecv_selection',
                   'params': {'estimator': 'LinearRegression',
                              'estimator_params': {},
                              'step': 1,
                              'cv': None,
                              'scoring': None,
                              'n_jobs': 2}}
    return {INSTRUMENT_TYPE_STOCK: [genericSelect]}
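The correlation-based selection described above can be sketched in plain Python. This is an illustration of the general technique only: `pearson` and `select_by_correlation` are hypothetical helpers, and the way they use a threshold and a top-k cut-off is an assumption about what the 'threshold' and 'topK' parameters mean. The 'startPeriod', 'endPeriod' and 'steps' parameters are not modelled here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(features, target, threshold, top_k):
    """Keep up to top_k features whose |correlation| with target >= threshold."""
    scores = {name: abs(pearson(values, target))
              for name, values in features.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    kept.sort(key=lambda name: scores[name], reverse=True)
    return kept[:top_k]


# Hypothetical feature columns and target
features = {
    'ma_5': [1.0, 2.0, 3.0, 4.0, 5.0],
    'noise': [2.0, 2.0, 2.0, 2.0, 2.0],
    'inverse': [5.0, 4.0, 3.0, 2.5, 1.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.0]
selected = select_by_correlation(features, target, threshold=0.1, top_k=2)
```

Here the constant `noise` column is dropped, while `ma_5` and `inverse` are kept because both are strongly (positively or negatively) correlated with the target.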
FeatureTransformationConfigDicts
The methods you want to use to normalize or transform features. Currently available methods are:
def getFeatureTransformationConfigDicts(self):
    stdScaler = {'featureTransformKey': 'stdScaler',
                 'featureTransformId': 'standard_transform',
                 'params': {}}
    minmaxScaler = {'featureTransformKey': 'minmaxScaler',
                    'featureTransformId': 'minmax_transform',
                    'params': {'low': -1,
                               'high': 1}}
    pcaScaler = {'featureTransformKey': 'pcaScaler',
                 'featureTransformId': 'pca_transform',
                 'params': {'n_comp': 6,
                            'copy': True,
                            'whiten': False,
                            'svd': 'full',
                            'itr_power': 'auto',
                            'random_state': None}}
    return {INSTRUMENT_TYPE_STOCK: [stdScaler]}
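For readers unfamiliar with these transformations, the following sketch shows what standard scaling and min-max scaling conventionally do to a single feature column. The two functions are hypothetical stand-ins written for illustration; they borrow the names 'standard_transform' and 'minmax_transform' and the 'low'/'high' parameters from the config above, but the toolbox's implementations may differ in detail.

```python
import math


def standard_transform(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def minmax_transform(values, low=-1, high=1):
    """Linearly map values so the minimum becomes low and the maximum high."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [(low + high) / 2 for _ in values]  # constant column
    return [low + (v - lo) * (high - low) / (hi - lo) for v in values]


# Hypothetical feature column
prices = [10.0, 12.0, 14.0, 16.0, 18.0]
standardized = standard_transform(prices)
scaled = minmax_transform(prices, low=-1, high=1)
```

In practice the scaling statistics (mean, standard deviation, minimum, maximum) should be computed on the training data only and then reused for validation and test data, so that no information leaks from later periods.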
ModelConfigDicts
The ML models you want to train. Currently available models are:
def getModelConfigDicts(self):
    regression_model = {'modelKey': 'linear_regression',
                        'modelId': 'linear_regression',
                        'params': {}}
    mlp_regression_model = {'modelKey': 'mlp_regression',
                            'modelId': 'mlp_regression',
                            'params': {}}
    classification_model = {'modelKey': 'logistic_regression',
                            'modelId': 'logistic_regression',
                            'params': {}}
    mlp_classification_model = {'modelKey': 'mlp_classification',
                                'modelId': 'mlp_classification',
                                'params': {}}
    svm_model = {'modelKey': 'svm_model',
                 'modelId': 'support_vector_machine',
                 'params': {}}
    return {INSTRUMENT_TYPE_STOCK: [classification_model]}
Prediction Function
You do not have to specify a prediction function. The toolbox automatically creates a prediction function using the features and trained model from above.
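Conceptually, an automatically generated prediction function chains the earlier steps: pick the selected features, apply the transformation, and pass the result to the trained model. The sketch below illustrates that composition only. `make_prediction_function`, the toy `transform` and the toy `model` are all hypothetical and do not reflect the toolbox's internal API.

```python
def make_prediction_function(selected_features, transform, model):
    """Return a function mapping a dict of raw features to a model prediction."""
    def predict(feature_row):
        # Use only the selected features, in a fixed order
        inputs = [transform(name, feature_row[name])
                  for name in selected_features]
        return model(inputs)
    return predict


# Hypothetical stand-ins for a fitted transformation and a trained model
scale = {'ma_5': 0.1, 'momentum': 1.0}


def transform(name, value):
    return value * scale[name]


def model(inputs):
    return 1 if sum(inputs) > 1.0 else 0


predict = make_prediction_function(['ma_5', 'momentum'], transform, model)
prediction = predict({'ma_5': 12.0, 'momentum': 0.5, 'unused': 99.0})
```

Features that were not selected (such as `unused` here) are ignored, so the prediction depends only on what survived feature selection.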