How to choose the number of hidden layers and nodes in a feedforward neural network?
Is there a standard and accepted method for selecting the number of layers, and the number of nodes in each layer, in a feed-forward neural network? I'm interested in automated ways of building neu...
stats.stackexchange.com
Determining the size (number of neurons) of the input and output layers is simple.
The Input Layer
The size of the input layer is completely determined by the shape of your training data.
Specifically, the number of neurons comprising that layer is equal to the number of features (columns) in your data. Some NN configurations add one additional node for a bias term.
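As a minimal sketch of the rule above, the input-layer width can be read straight off the data. The helper name `input_layer_size` and the `add_bias` flag are hypothetical, chosen only for illustration, and the sample rows are made up.

```python
def input_layer_size(rows, add_bias=False):
    """Return the input-layer width for a table with one row per sample.

    Hypothetical helper: one input neuron per feature (column), plus an
    optional extra node for configurations that model the bias as a node.
    """
    if not rows:
        raise ValueError("need at least one sample")
    n_features = len(rows[0])
    if any(len(row) != n_features for row in rows):
        raise ValueError("all samples must have the same number of features")
    return n_features + (1 if add_bias else 0)


# Made-up data: three samples, four features each.
X = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
]

print(input_layer_size(X))                 # one neuron per column
print(input_layer_size(X, add_bias=True))  # one more for the bias node
```

Note that the number of rows (samples) plays no role here; only the number of columns matters.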
The Output Layer
Machine mode (classification) returns a class label (e.g., "Premium Account"/"Basic Account"); regression mode returns a numeric value (e.g., price).
If the NN is a regressor, then the output layer has a single node.
If the NN is a classifier, then it also has a single node, unless softmax is used, in which case the output layer has one node per class label in your model.
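The output-layer rules just stated can be written down as a small lookup. This is only a sketch of those rules as given here; the function name `output_layer_size` and its parameters are hypothetical.

```python
def output_layer_size(task, n_classes=None, use_softmax=False):
    """Return the output-layer width under the rules described in the text.

    Hypothetical helper:
      - regression                 -> 1 node
      - classification, no softmax -> 1 node
      - classification, softmax    -> one node per class label
    """
    if task == "regression":
        return 1
    if task == "classification":
        if not use_softmax:
            return 1
        if n_classes is None or n_classes < 2:
            raise ValueError("softmax needs n_classes >= 2")
        return n_classes
    raise ValueError("task must be 'regression' or 'classification'")


print(output_layer_size("regression"))
print(output_layer_size("classification"))
print(output_layer_size("classification", n_classes=3, use_softmax=True))
```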
The Hidden Layers
How many hidden layers? Well, if your data is linearly separable (which you often know by the time you begin coding a NN), then you don't need any hidden layers at all. Of course, you don't need an NN to resolve your data either, but it will still do the job.
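To illustrate the linearly separable case, here is a rough sketch of a network with no hidden layer at all: a single perceptron trained with the classic perceptron update rule on the logical AND function, which is linearly separable. The function names, learning rate, and epoch limit are assumptions made for this example, not part of the original answer.

```python
def predict(weights, bias, x):
    """Step activation on a weighted sum: inputs feed the output directly."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Train a perceptron with no hidden layer (perceptron learning rule)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            pred = predict(weights, bias, x)
            if pred != y:
                errors += 1
                delta = lr * (y - pred)
                weights = [w + delta * xi for w, xi in zip(weights, x)]
                bias += delta
        if errors == 0:  # every sample classified correctly; stop early
            break
    return weights, bias


# Logical AND: linearly separable, so no hidden layer is required.
X_and = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_and = [0, 0, 0, 1]

w, b = train_perceptron(X_and, y_and)
print([predict(w, b, x) for x in X_and])
```

The same model cannot fit a problem that is not linearly separable (XOR is the usual example), which is where a hidden layer becomes necessary.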
There is a mountain of commentary on hidden layer configuration; see the insanely thorough and insightful NN FAQ for an excellent summary of that commentary.
comp.ai.neural-nets FAQ, Part 1 of 7: Introduction
Copyright 1997-2002 by Warren S. Sarle, Cary, NC, USA.
www.faqs.org
The situations in which performance improves with a second (or third, etc.) hidden layer are very few. One hidden layer is sufficient for the large majority of problems.
As for the number of neurons, one commonly cited rule of thumb is that 'the optimal size of the hidden layer is usually between the size of the input and size of the output layers'. Jeff Heaton, author of Introduction to Neural Networks in Java, offers a few more.
Heaton Research Bookstore
Jeff Heaton publishes several books on Artificial Intelligence. These books are written to be programming language independent. These books usually contain examples in the Python, R, Java, and C# pr
www.heatonresearch.com
In sum, for most problems, one could probably get decent performance (even without a second optimization step) by setting the hidden layer configuration using just two rules: (i) number of hidden layers equals one; and (ii) the number of neurons in that layer is the mean of the neurons in the input and output layers.
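The two rules can be sketched in a few lines. The function name `default_layer_sizes` is hypothetical, and rounding the mean up to a whole neuron is an assumption of this sketch, since the rule itself does not say how to handle a fractional mean.

```python
def default_layer_sizes(n_inputs, n_outputs):
    """Return [input, hidden, output] widths under the two rules of thumb.

    Rule (i): exactly one hidden layer.
    Rule (ii): hidden width = mean of the input and output widths.
    Assumption: a fractional mean is rounded up, with a floor of one neuron.
    """
    if n_inputs < 1 or n_outputs < 1:
        raise ValueError("layer sizes must be positive")
    hidden = max(1, (n_inputs + n_outputs + 1) // 2)  # integer ceiling of the mean
    return [n_inputs, hidden, n_outputs]


print(default_layer_sizes(4, 3))   # e.g. 4 features, 3 softmax classes
print(default_layer_sizes(10, 1))  # e.g. 10 features, single regression output
```

This gives a starting configuration only; the text describes it as a baseline that can be used even without a second optimization step, not as an optimum.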
You state that the majority of problems need only one hidden layer. Perhaps it is better to say that NNs with more hidden layers are extremely hard to train.