
Would this neural network have short term memory?


I want to design a NN that can remember its last 7 actions and use them as inputs. For example, it would be able to store words in its memory: if it had a choice of 10 different actions, the number of words it could store would be $10^7$.

Here is my design:

$$out_{n+1} = f(out_n, in_n)\,\mathbf{N} + out_n \cdot \mathbf{M}$$

$$action_n = \sigma(\mathbf{N} \cdot out_n)$$

Where $f$ represents some layered neural network. Some of the actions would be physical actions and some might be internal (such as thinking of the letter 'C').

Basically I want $out_n$ to be an array that keeps the last 6 action values and puts them back in. So $M$ will be the matrix:

$$\begin{bmatrix}
0&1&0&0&0&0\\
0&0&1&0&0&0\\
0&0&0&1&0&0\\
0&0&0&0&1&0\\
0&0&0&0&0&1\\
0&0&0&0&0&0
\end{bmatrix}$$

i.e. it would drop the 6th item from its memory.

and $N$ would be the vector:

$$\begin{bmatrix}
1&0&0&0&0&0&0
\end{bmatrix}$$

I think this would be equivalent to an equation of the form:

$$out_{n+1} = F(in_n, out_n, out_{n-1}, out_{n-2}, \ldots, out_{n-6})$$
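The update rule above can be sketched in plain Python. This is only an illustration under stated assumptions: `f` here is a hypothetical stand-in for the layered network, and $\mathbf{N}$ is taken to have the same length as $out_n$ (six entries) so that the shapes agree. The sketch checks that multiplying the row vector $out_n$ by $\mathbf{M}$ is the same as shifting every stored value one slot along and dropping the oldest.

```python
def shift_update(out, new_value):
    """Return [new_value, out[0], ..., out[-2]]: the intended effect of the update."""
    return [new_value] + out[:-1]


def matvec_update(out, new_value, M, N):
    """The same update written literally as new_value * N + out . M (row vector times matrix)."""
    size = len(out)
    shifted = [sum(out[i] * M[i][j] for i in range(size)) for j in range(size)]
    return [new_value * N[j] + shifted[j] for j in range(size)]


SIZE = 6
# Shift matrix from the question: ones on the superdiagonal, zeros elsewhere.
M = [[1 if j == i + 1 else 0 for j in range(SIZE)] for i in range(SIZE)]
# Assumption: N has SIZE entries so it can be added to out . M.
N = [1] + [0] * (SIZE - 1)


def f(out, x):
    """Hypothetical stand-in for the layered network f (not a trained model)."""
    return (sum(out) + x) % 10


out = [0] * SIZE
for x in [3, 1, 4, 1, 5, 9, 2]:
    new_value = f(out, x)
    # Both formulations give the same next state at every step.
    assert matvec_update(out, new_value, M, N) == shift_update(out, new_value)
    out = shift_update(out, new_value)
```

Written this way, the state is a fixed-length shift register (a delay line): the only learned part is `f`, and the memory itself has no trainable weights.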

So I think this would be an advantage over an RNN, since this model remembers precisely its last 6 actions. But would this be better or worse than an RNN? One could increase its memory to more than 7 quite easily.

I think it's basically the same architecture as an RNN, except with a lot of the connections eliminated. Is this a new design or a common design?

One problem with this design is that you might also want a memory that spans longer time periods (e.g. for actions that take more than one tick). But that might be solved by enhancing the architecture.

edited 8 hours ago

asked 9 hours ago

zooby

neural-networks long-short-term-memory


1 Answer

Congrats, you have invented 1D convolution. Convolution combined with an RNN would have some advantage over an RNN alone. Think about the receptive field.
In this layer, you aggregate $6$ values into one. Imagine stacking two of them: with non-overlapping windows, the receptive field is already $36$, and so on. But you still need an RNN at the end to aggregate a variable-length sequence into a constant-length one.
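The receptive-field arithmetic can be checked with a small sketch. It assumes an all-ones window (a sum) in place of learned weights, and the helper names `window_sum` and `receptive_field` are made up for this illustration. The figure of $36$ corresponds to a stride equal to the window size; with stride 1, two stacked windows of 6 cover 11 inputs instead.

```python
def window_sum(seq, size, stride):
    """Aggregate each window of `size` values into one (a 1D convolution with all-ones weights)."""
    return [sum(seq[i:i + size]) for i in range(0, len(seq) - size + 1, stride)]


def receptive_field(layers):
    """Receptive field of stacked 1D conv layers given as (kernel_size, stride) pairs."""
    field, jump = 1, 1
    for kernel, stride in layers:
        field += (kernel - 1) * jump  # each layer widens the field by (kernel - 1) input steps
        jump *= stride                # spacing between neighbouring outputs, in input steps
    return field


sequence = list(range(36))
layer1 = window_sum(sequence, size=6, stride=6)  # 36 values -> 6 values
layer2 = window_sum(layer1, size=6, stride=6)    # 6 values -> 1 value

non_overlapping = receptive_field([(6, 6), (6, 6)])
overlapping = receptive_field([(6, 1), (6, 1)])
```

Either way the window is fixed in advance, which is why something recurrent is still needed to summarize sequences of arbitrary length.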

edited 5 hours ago

nbro


answered 9 hours ago

user8426627


Well that's good! Glad I'm on the right track! (Not sure what you mean at the end about variable lengths.)
– zooby
8 hours ago

@zooby This is not a 1D CNN; it's a non-differentiable RNN (actions must be sampled from some categorical distribution, based on what's described). The only similarity to a 1D CNN is the sliding window.
– mshlis
8 hours ago

Why is it non-differentiable?
– zooby
8 hours ago

You train with sequences of different lengths, right? Also, if you feed the output back in as input, keep in mind that the output may be wrong, so you can consider force-feeding (using the expected data instead of the output).
– user8426627
7 hours ago

I could be wrong, but generally actions are drawn from a distribution (that's why you show one-hot encodings), and you can't differentiate through a categorical distribution.
– mshlis
7 hours ago
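The point about sampling can be illustrated with a minimal sketch, assuming actions are drawn from a softmax over some scores (the question does not say how actions are chosen, so this is an assumption, and `sample_action` is a hypothetical helper). The probabilities change smoothly with the scores, but the sampled action index is piecewise constant: a small nudge to a score leaves the chosen action unchanged, so no useful gradient flows through the sample itself.

```python
import math


def softmax(logits):
    """Convert raw scores into a categorical distribution."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, u):
    """Inverse-CDF sampling: return the first action whose cumulative probability exceeds u."""
    probs = softmax(logits)
    cumulative = 0.0
    for action, p in enumerate(probs):
        cumulative += p
        if u < cumulative:
            return action
    return len(probs) - 1  # guard against floating-point rounding


logits = [1.0, 2.0, 0.5]
u = 0.5  # a fixed uniform draw, so the two calls are directly comparable

base = sample_action(logits, u)
nudged = sample_action([1.0, 2.0 + 1e-3, 0.5], u)  # tiny change to one score
```

Here `base` and `nudged` are the same action even though the underlying probabilities differ slightly.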
