How to specify and fit a hybrid machine learning - linear model


I want to understand how a dependent variable y depends on an independent variable x through a known relationship, and also how x potentially interacts with a complex, high-dimensional set of features (microbiome data, which can be represented as thousands of predictors per observation of y). I know that the general form of the model is:

    y ~ m1*x + b

However, I would like to add an interaction term, giving the model:

    y ~ m1*x + m2*x*microbiome + b

where microbiome is the very large microbiome feature set.

I think the microbiome data essentially interact with the independent variable x to affect y, but I don't know how. There are thousands of species, and my guess is that certain combinations will be predictive of this interaction and explain a lot of variation, but a priori I don't know which. I also suspect that the microbiome features relate to each other in a non-linear way. Essentially, I want to use a machine learning approach to figure that out. I am aware of "boosted" regressions, where you use a machine learning algorithm on the residuals; however, I want to specify something a bit more mechanistic than that.

If anyone could suggest a method for doing something like this (especially one that can be implemented in R), I would be very interested. If there is a way to use the residuals of the model to do this, I would also be interested. It's worth noting that in the actual application I have many more predictors in the model, many of which have non-linear relationships with y.













machine-learning boosting














asked 8 hours ago by colin (edited 8 hours ago)














  • Why is it not sufficient to specify the interaction in the usual way? – Sycorax, 8 hours ago

  • @Sycorax The microbiome feature set has thousands of columns, and I don't think any individual column should have a relationship with y. I think it's far more likely that different combinations of the features in the microbiome dataset interact with x to predict y, but I don't know which combinations. – colin, 8 hours ago


























1 Answer
The easiest way to implement a model like

    y ~ m1*x + m2*x*microbiome + b

would be to replace microbiome with a dense neural network,

    y ~ m1*x + m2*x*nn(microbiome) + b

so that the neural network nn reduces the dimensionality (to one or more dimensions, depending on the number of units in its output layer) and does the feature engineering for you. The nice part is that this lets you keep the assumed form of the model, while the neural network deals with the extra features.
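Written out as an equation, this is roughly the following. The symbols z (the microbiome feature vector of one observation), f_θ (the network nn with weights θ) and the additive noise term ε are notation introduced here for illustration, not part of the original formula, and the sketch assumes a single output unit:

$$
y = m_1 x + m_2\, x\, f_\theta(z) + b + \varepsilon
  = \bigl(m_1 + m_2 f_\theta(z)\bigr)\, x + b + \varepsilon
$$

Read this way, the network output shifts the slope of x for each observation. One caveat worth keeping in mind if the coefficients are to be interpreted: since the output layer of nn has its own weight scale and bias, a constant can be moved between m1, m2 and f_θ without changing the predictions, so it is probably safer to interpret the combined slope m1 + m2 f_θ(z) than m1 or m2 on their own.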



This can be easily done in frameworks like Keras, which are designed to handle large datasets and scale well. In Keras, this would translate to something like the model definition below. To understand the code you will probably need to dig deeper into Keras, but fortunately many tutorials are available online.



    from keras.models import Model
    from keras.layers import Input, Dense, multiply, concatenate

    # Assumed inputs: x has shape (n, 1), microbiome has shape (n, k), y has shape (n,) or (n, 1)
    k = microbiome.shape[1]  # number of microbiome features

    x_inp = Input(shape=(1,))
    microbiome_inp = Input(shape=(k,))

    # 3-layer neural network that maps the microbiome features to a single score
    nn = Dense(200, activation='relu')(microbiome_inp)
    nn = Dense(50, activation='relu')(nn)
    nn = Dense(1)(nn)

    # x*nn(microbiome)
    mul = multiply([x_inp, nn])

    # m1*x + m2*x*nn(microbiome) + b
    # (the final Dense layer learns the weights m1, m2 and the bias b)
    conc = concatenate([x_inp, mul])
    out = Dense(1)(conc)

    model = Model(inputs=[x_inp, microbiome_inp], outputs=out)
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit([x, microbiome], y)
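To make the data flow of the Keras graph more concrete, here is a minimal NumPy sketch of the same forward computation with random, untrained weights. The helper names (`init_params`, `hybrid_forward`), the toy sizes and the parameter values are all hypothetical; it only illustrates the array shapes and the model form, and is not a substitute for fitting the Keras model.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def init_params(k, rng, hidden=(200, 50)):
    """Random weights for a hypothetical 3-layer network plus the linear head."""
    sizes = (k,) + tuple(hidden) + (1,)
    layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
              for m, n in zip(sizes[:-1], sizes[1:])]
    # m1, m2 and b are arbitrary illustrative values, not fitted estimates
    return {"layers": layers, "m1": 1.0, "m2": 1.0, "b": 0.5}

def nn(microbiome, layers):
    """Dense network: ReLU on hidden layers, linear output of shape (n, 1)."""
    h = microbiome
    for i, (W, c) in enumerate(layers):
        h = h @ W + c
        if i < len(layers) - 1:
            h = relu(h)
    return h

def hybrid_forward(x, microbiome, params):
    """y_hat = m1*x + m2*x*nn(microbiome) + b, with x of shape (n, 1)."""
    f = nn(microbiome, params["layers"])
    return params["m1"] * x + params["m2"] * x * f + params["b"]

rng = np.random.default_rng(0)
n, k = 8, 30                          # toy sizes; a real k would be in the thousands
x = rng.normal(size=(n, 1))           # one column, matching Input(shape=(1,))
microbiome = rng.random(size=(n, k))  # one row of k features per observation
params = init_params(k, rng)

y_hat = hybrid_forward(x, microbiome, params)

# Per-observation slope of x implied by the model form
slope = params["m1"] + params["m2"] * nn(microbiome, params["layers"])
```

After fitting, the analogous per-observation slope is the quantity to inspect if the goal is to see how the microbiome modifies the effect of x.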














answered 8 hours ago by Tim (edited 4 hours ago)














  • This is exactly what I am talking about, thanks! Can you link to a tutorial on how to specify and fit a model like this in TensorFlow? – colin, 8 hours ago
































