[fast ai] Intro_ ( deep learning course 1 )

2021. 1. 8. 21:10

All right, here we go. Listen up, everyone!

The legend of the deep learning journey begins!

Study approach: The key is to just code and try to solve problems: the theory can come later, when you have context and motivation.

What you will need to do to succeed however is to apply what you learn in this book to a personal project, and always persevere!

*Table of contents

[0.Introduction to deep learning]

[1.History of deep learning]

[2.pytorch, fastai, jupyter]

[3.The first model]

[4.What is machine learning?]

[5.Neural Network]

[6.Limitations of machine learning]

[7.image recognizer code review]

[8.what our Image Recognizer Learned?]

[9.Image Recognizers can tackle non-image tasks]

[10.Final review]

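[0.Introduction to deep learning]

Deep learning is a computer technique for extracting and transforming data, and it does so using multiple layers of neural networks.

Deep learning is simple yet very powerful, so it is used widely across many fields.

What are some representative examples?

Areas where deep learning achieves very high accuracy include the following.

1. Natural language processing (speech recognition, document summarization, classifying documents by topic, ...)

2. Computer vision (interpreting satellite and drone imagery, face recognition, self-driving cars) -> programs that understand the world through images

3. Medicine (finding anomalies in radiology images)

4. Biology (classifying cells)

5. Image generation (increasing resolution, removing noise)

6. Recommendation systems (web search, product recommendations)

7. Games (AlphaGo)

It is remarkable that a single kind of model, the neural network, can be applied to so many different fields!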

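[1.History of deep learning]

Deep learning has been through two winters.

1) Minsky and Papert's book Perceptrons showed that a single layer of these devices was unable to learn some simple but critical mathematical functions, such as the XOR logic gate. They demonstrated in the same book that additional layers can solve this problem, but only the first insight was widely recognized, leading to the start of the first AI winter.

-> People assumed neural networks could not solve the XOR problem, and winter came!

2) In theory, adding just one extra layer of neurons was enough to approximate any mathematical function, but in practice such networks were often too big and too slow to be useful.

-> Researchers stopped at a single extra layer instead of going deeper, and the second winter came! Deeper networks only became practical later, helped by GPUs, which allow us to run and train neural networks hundreds of times faster than a regular CPU.

Then came parallel distributed processing (PDP).

=> I don't fully understand this part yet.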

Perhaps the most pivotal work in neural networks in the last 50 years was the multi-volume Parallel Distributed Processing (PDP) by David Rumelhart, James McClelland, and the PDP Research Group, released in 1986 by MIT Press. Chapter 1 lays out a similar hope to that shown by Rosenblatt:

: People are smarter than today's computers because the brain employs a basic computational architecture that is more suited to deal with a central aspect of the natural information processing tasks that people are so good at. ...We will introduce a computational framework for modeling cognitive processes that seems… closer than other frameworks to the style of computation as it might be done by the brain.

The premise that PDP is using here is that traditional computer programs work very differently to brains, and that might be why computer programs had been (at that point) so bad at doing things that brains find easy (such as recognizing objects in pictures). The authors claimed that the PDP approach was "closer than other frameworks" to how the brain works, and therefore it might be better able to handle these kinds of tasks.

In fact, the approach laid out in PDP is very similar to the approach used in today's neural networks. The book defined parallel distributed processing as requiring:

A set of processing units
A state of activation
An output function for each unit
A pattern of connectivity among units
A propagation rule for propagating patterns of activities through the network of connectivities
An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce an output for the unit
A learning rule whereby patterns of connectivity are modified by experience
An environment within which the system must operate

We will see in this book that modern neural networks handle each of these requirements.

[2.pytorch, fastai, jupyter]

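PyTorch and fastai are deep learning libraries.

PyTorch provides the basic, lower-level API, and fastai provides a higher-level API on top of it.

In other words, think of them as the tools that give us deep learning functionality.

Jupyter Notebook is software that displays text, code, images, video, and more in one place, which makes it great for visualization.

It also shows the result of each cell as you run it, so it is popular with developers who work with data.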

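[3.The first model]

The course shows how to do things first, rather than explaining why they work.

Let's start by training an image classifier that tells dogs from cats.

*Make sure to actually do the setup.

In practice, training deep learning models requires a GPU. But since we are here to study deep learning, rather than worrying about GPU setup,

let's practice on Colab or a similar service.

(*GPU: graphics processing unit, a processor specialized for graphics. It can train neural networks hundreds of times faster than a CPU.)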

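*Code tip 1

from fastai.vision.all import *

img = PILImage.create(image_cat())

img.to_thumb(192)

PILImage is fastai's image class. The import has to run before the other two lines, and the package is now named fastai (fastai2 was its pre-release name). image_cat() is a helper from the book's fastbook package.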

*Code tip 2

uploader = widgets.FileUpload()

uploader

Upload a file through the upload widget.

*Code tip 3

img = PILImage.create(uploader.data[0])

is_cat,_,probs = learn.predict(img)

print(f"Is this a cat?: {is_cat}.")

print(f"Probability it's a cat: {probs[1].item():.6f}")

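This creates an image object from the photo uploaded in code tip 2 and runs predict on the trained model, which prints whether the image is a cat and with what probability.

Now that it works, let's think about what it means!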

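[4.What is machine learning?]

Our first classifier is a deep learning model.

And deep learning is one area of machine learning.

So to make sense of what we saw with the first model, it is better to understand machine learning before deep learning.

That is why this section looks at what machine learning is and at its core concepts.

Machine learning is, like regular programming, a way of getting a computer to complete a specific task.

Solving the earlier problem (telling dogs from cats) with the ordinary programming we already know would require coding countless rules, which is close to impossible.

Put simply, we would code it step by step, but how do you code the recognition of an object in a photo?

How would we even know which steps we go through ourselves when we recognize an object in a photo?

In a 1962 essay on artificial intelligence, IBM researcher Arthur Samuel wrote the following.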

"Programming a computer for such computations is, at best, a difficult task, not primarily because of any inherent complexity in the computer itself but, rather, because of the need to spell out every minute step of the process in the most exasperating detail. Computers, as any programmer will tell you, are giant morons, not giant brains."

So how was this solved?

Instead of telling the computer the steps one by one, we show it examples of the problem to solve.

Then the computer looks at those examples and their answers and figures out the method by itself.

Suppose we arrange for some automatic means of testing the effectiveness of any current weight assignment in terms of actual performance and provide a mechanism for altering the weight assignment so as to maximize the performance. We need not go into the details of such a procedure to see that it could be made entirely automatic and to see that a machine so programmed would "learn" from its experience. --> This is the idea!

What does this mean? Let's go through it piece by piece.

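1. the idea of a "weight assignment"

2. every weight assignment has some "actual performance"

3. an "automatic means" of testing that performance

4. a "mechanism" for improving the performance by changing the weight assignment

I wrote the explanations below, but honestly they are hard to follow. Just skim them to get a feel for the idea.

Explanation 1.

A weight is a variable, and an assignment is a particular choice of values for those weight variables.

For example, say an image goes in as the input and the result dog comes out.

The weight assignment is the other set of values, besides the input, that defines how the program operates. (?)

(Nowadays weights are usually called model parameters.)

Explanation 2.

We need an automatic means of checking how effective the current model parameters are in terms of actual performance.

For example, two models can play against each other to see which one wins.

Take chess: the actual performance is how well the model plays.

Explanation 3.

We need a mechanism that changes the model parameters so as to maximize performance.

Put simply, the model's parameters (the weight assignment) have to be updated so that it plays chess better. How could such a mechanism work?

Look at the difference in weights between the winning model and the losing model, and update the parameters in the winning direction.

Explanation 4.

So the machine keeps learning from experience. If the process of updating the weights is automatic, the learning process is automatic too.

The training-loop diagram in the book makes this easier to understand.

Once the parameters are settled this way, they are no longer updated, so processing becomes simply inputs -> model -> results.

=> trained model can be treated just like a regular computer program.

This is machine learning!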

Machine Learning: The training of programs developed by allowing a computer to learn from its experience, rather than through manually coding the individual steps
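To make Samuel's four pieces concrete, here is a minimal sketch of my own (not code from the book). It assumes a made-up one-weight model and made-up data, uses mean squared error as the "automatic means of testing performance", and uses a plain gradient step as the "mechanism for altering the weight assignment".

```python
# Toy version of Samuel's loop. The data, the one-weight model and the
# step size are all made up for illustration.

def model(x, weight):
    """The 'program': its output depends on the input and on the weight."""
    return weight * x

def performance(weight, examples):
    """Automatic test of a weight assignment: mean squared error (lower is better)."""
    return sum((model(x, weight) - y) ** 2 for x, y in examples) / len(examples)

def improve(weight, examples, step=0.01):
    """Mechanism for altering the weight: move it against the error gradient."""
    grad = sum(2 * (model(x, weight) - y) * x for x, y in examples) / len(examples)
    return weight - step * grad

examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # inputs with correct answers (y = 3x)
weight = 0.0                                      # initial weight assignment
history = []                                      # performance before each update
for _ in range(200):
    history.append(performance(weight, examples))
    weight = improve(weight, examples)

print(round(weight, 3))
print(history[0], history[-1])
```

Nobody tells the program that the rule is "multiply by 3". It only sees examples with answers, and the weight drifts toward 3 by itself. After the loop, model(x, weight) can be used like any ordinary function.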

[5.Neural Network]

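We want a flexible function. Put simply, we want a function that can solve whatever problem it is given just by changing its weights.

That function is exactly the "neural network"!!!

(According to a mathematical proof called the universal approximation theorem, a neural network can in theory solve any problem to any level of accuracy. More precisely, it can approximate any continuous function arbitrarily well.)

And the mechanism for finding those weights does not change from problem to problem either: there is a general one. Amazing!

(general mechanism = stochastic gradient descent (SGD). That is, in theory a neural network plus SGD can solve almost any problem!)

=> Summary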

In other words, to recap, a neural network is a particular kind of machine learning model, which fits right in to Samuel's original conception. Neural networks are special because they are highly flexible, which means they can solve an unusually wide range of problems just by finding the right weights. This is powerful, because stochastic gradient descent provides us a way to find those weight values automatically.

Now, based on what we have learned, let's look again at the first model we studied above.

input = images

weight = weights in the neural net

model = neural net

output = dog , cat

mechanism for updating the weight assignments = sgd

So that's amazing: it means deep learning can solve any problem to some level of accuracy, right?

<<notice!>>

results of the model = predictions

measure of performance = loss

loss depends on the predictions and correct labels.

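[6.Limitations of machine learning]

At this point deep learning might seem almost unfairly powerful.

But even this seemingly invincible machine learning has its limitations....

1) A model cannot be created without data.

2) A model can only learn from the patterns in the data used to train it.

3) It only produces predictions, not recommended actions (?)

4) The data needs not only examples but also the answers (labels) for them.

Labeled example data is required, and there is the drawback that the model can overfit to that example data during training.

=> I don't understand this part yet.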

Generally speaking, we've seen that most organizations that say they don't have enough data, actually mean they don't have enough labeled data. If any organization is interested in doing something in practice with a model, then presumably they have some inputs they plan to run their model against. And presumably they've been doing that some other way for a while (e.g., manually, or with some heuristic program), so they have data from those processes! For instance, a radiology practice will almost certainly have an archive of medical scans (since they need to be able to check how their patients are progressing over time), but those scans may not have structured labels containing a list of diagnoses or interventions (since radiologists generally create free-text natural language reports, not structured data). We'll be discussing labeling approaches a lot in this book, because it's such an important issue in practice.

Since these kinds of machine learning models can only make predictions (i.e., attempt to replicate labels), this can result in a significant gap between organizational goals and model capabilities. For instance, in this book you'll learn how to create a recommendation system that can predict what products a user might purchase. This is often used in e-commerce, such as to customize products shown on a home page by showing the highest-ranked items. But such a model is generally created by looking at a user and their buying history (inputs) and what they went on to buy or look at (labels), which means that the model is likely to tell you about products the user already has or already knows about, rather than new products that they are most likely to be interested in hearing about. That's very different to what, say, an expert at your local bookseller might do, where they ask questions to figure out your taste, and then tell you about authors or series that you've never heard of before.

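*feedback loop

This is about how the model interacts with its environment.

The more the model is used, the more biased the data it produces becomes, and the more biased the model becomes in turn.

=positive feedback loop

For example, in a video recommendation system, people with extremist views tend to watch more video content, and as a result the system ends up recommending that kind of video to users. I'll just get the gist for now and look into it more later.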
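To get a feel for the loop, here is a tiny simulation of my own. Everything in it is an assumption made for illustration: two content types A and B, made-up click rates, and a "recommender" that simply shows each type in proportion to the clicks it received in the previous round.

```python
def simulate_feedback(rounds, share_a=0.5, click_rate_a=0.6, click_rate_b=0.4):
    """Share of recommendations going to content type A, round after round.

    All numbers are made up: viewers of A click a bit more often than viewers
    of B, and the next round's recommendations follow the observed clicks.
    """
    history = [share_a]
    for _ in range(rounds):
        clicks_a = share_a * click_rate_a
        clicks_b = (1 - share_a) * click_rate_b
        share_a = clicks_a / (clicks_a + clicks_b)  # model retrained on its own output
        history.append(share_a)
    return history

history = simulate_feedback(rounds=10)
print([round(share, 2) for share in history])
```

A small difference in how much each group watches is enough: because the model's recommendations produce the data it learns from next, type A's share keeps growing every round.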

[7.image recognizer 코드 리뷰]

1. from fastai.vision.all import *

-> This gives us all of the functions and classes we will need to create a wide variety of computer vision models.

2. path = untar_data(URLs.PETS)/'images'

-> The second line downloads a standard dataset from the fast.ai datasets collection (if not previously downloaded) to your server, extracts it (if not previously extracted), and returns a Path object with the extracted location

(Nothing can be done in machine learning without data, so the data is prepared first.)

tip. untar_data is a helper function to download datasets that are present in URLs.

(It is a function that fetches a dataset from a URL.)

(Note 1)

untar_data is a very powerful convenience function to download files from url to dest. The url can be a default url from the URLs class or a custom url. If dest is not passed, files are downloaded at the default_dest which defaults to ~/.fastai/data/.

This convenience function extracts the downloaded files to dest by default. In order to simply download the files without extracting, pass the noop function as extract_func.

(Note 2)

For instance, notice that the fastai library doesn't just return a string containing the path to the dataset, but a Path object. This is a really useful class from the Python 3 standard library that makes accessing files and directories much easier. If you haven't come across it before, be sure to check out its documentation or a tutorial and try it out. Note that the book's website (https://book.fast.ai) contains links to recommended tutorials for each chapter.

3. def is_cat(x): return x[0].isupper()

-> we define a function, is_cat, which labels cats based on a filename rule provided by the dataset creators.
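A quick check of the rule on plain strings, without fastai. great_pyrenees_173.jpg is the file name used in the dataset description; the second name is a made-up example of a cat file.

```python
def is_cat(filename):
    """Label rule of the Pets dataset: cat files start with an uppercase letter."""
    return filename[0].isupper()

names = ["great_pyrenees_173.jpg", "Birman_82.jpg"]  # second name is hypothetical
labels = {name: is_cat(name) for name in names}
print(labels)
```

The function looks only at the first character, so it works on a bare file name but not on a full path string such as "images/Birman_82.jpg".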

4. dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))

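There are various classes for loading datasets, and ImageDataLoaders is one of them.

The other important piece of information we give fastai is how to get the labels from the dataset. The label is usually part of the filename or the path.

Here the labeling method is is_cat.

20% of the data is held out as a validation set, which is used to measure the model's accuracy. seed=42 makes the same 20% get selected as the validation set every time the code runs. That way, when we change the model and retrain it, we know that any difference comes from the model and not from a different validation set.

We want the model to work well on unseen data it was not trained on, so the validation set is used only for measuring accuracy. The reason is that if a model is trained for long enough on a limited dataset, it stops learning general features and instead memorizes the answers, i.e. it overfits, and a held-out set is how we detect that.

Overfitting is when the model focuses so much on the training set that, as training continues, accuracy on the training set keeps rising, while accuracy on the validation set improves at first and then actually gets worse.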
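The role of valid_pct and seed can be sketched in plain Python. This is my own illustration of the idea (shuffle with a fixed seed, hold out 20%), not fastai's actual implementation, and the file names are made up.

```python
import random

def split_train_valid(items, valid_pct=0.2, seed=42):
    """Shuffle with a fixed seed and hold out valid_pct of the items for validation."""
    rng = random.Random(seed)       # private generator, global random state untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]

files = [f"img_{i}.jpg" for i in range(10)]   # made-up file names
train_a, valid_a = split_train_valid(files)
train_b, valid_b = split_train_valid(files)   # second run with the same seed

print(len(train_a), len(valid_a))
print(valid_a == valid_b)
```

Because both runs use the same seed, they hold out exactly the same files, so two models evaluated this way are compared on the same unseen data.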

-> Not understood yet:

There are various different classes for different kinds of deep learning datasets and problems—here we're using ImageDataLoaders. The first part of the class name will generally be the type of data you have, such as image, or text.

The other important piece of information that we have to tell fastai is how to get the labels from the dataset. Computer vision datasets are normally structured in such a way that the label for an image is part of the filename, or path—most commonly the parent folder name. fastai comes with a number of standardized labeling methods, and ways to write your own. Here we're telling fastai to use the is_cat function we just defined.

Finally, we define the Transforms that we need. A Transform contains code that is applied automatically during training; fastai includes many predefined Transforms, and adding new ones is as simple as creating a Python function. There are two kinds: item_tfms are applied to each item (in this case, each item is resized to a 224-pixel square), while batch_tfms are applied to a batch of items at a time using the GPU, so they're particularly fast (we'll see many examples of these throughout this book).

-> Not understood yet:

The Pet dataset contains 7,390 pictures of dogs and cats, consisting of 37 different breeds. Each image is labeled using its filename: for instance the file great_pyrenees_173.jpg is the 173rd example of an image of a Great Pyrenees breed dog in the dataset. The filenames start with an uppercase letter if the image is a cat, and a lowercase letter otherwise. We have to tell fastai how to get labels from the filenames, which we do by calling from_name_func (which means that labels can be extracted using a function applied to the filename), and passing x[0].isupper(), which evaluates to True if the first letter is uppercase (i.e., it's a cat).

5. learn = cnn_learner(dls, resnet34, metrics=error_rate)

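This line creates the learner that will train the image classifier. (In newer fastai versions the function is called vision_learner.)

It tells fastai to build a convolutional neural network (CNN) and specifies which architecture to use,

which data to train on, and which metric to use.

(CNNs will be covered later.)

The choice of architecture is actually not that important. Most of the time we will use a ResNet, which is fast and accurate on most datasets and problems. What matters most is the data.

Models with more layers take longer to train and are more prone to overfitting (i.e. you can't train them for as many epochs before the accuracy on the validation set starts getting worse). On the other hand, with more data they can become more accurate.

A metric measures the model's performance. It is printed after every epoch.

Here we use error_rate, which shows the percentage of images in the validation set that are classified incorrectly.

[Difference between loss and metric]

The loss is the measure of performance that the training system uses to update the parameters. Put simply, a good loss function is one that the mechanism, stochastic gradient descent, can use easily, while a metric is one that is easy for humans to understand.

[A loss function alone could serve as a metric, so why isn't it done that way??]

Because loss functions differ from problem to problem (and from model to model), loss values cannot be compared directly, whereas a metric is on a standard scale, so models can be compared with each other. Loss values are also hard for a person to interpret.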
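A small sketch of the difference, under my own assumptions: cross-entropy stands in for the loss, error rate is the metric, and the labels and predicted probabilities are made up. The two sets of predictions get exactly the same answers right and wrong, but one is much more confident.

```python
import math

def cross_entropy(probs, labels):
    """Loss: mean negative log-probability assigned to the correct class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def error_rate(probs, labels):
    """Metric: fraction of examples whose most likely class is wrong."""
    wrong = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) != y)
    return wrong / len(labels)

labels = [1, 0, 1, 1]   # made-up labels: 1 = cat, 0 = dog
hesitant = [[0.45, 0.55], [0.55, 0.45], [0.45, 0.55], [0.60, 0.40]]
confident = [[0.05, 0.95], [0.95, 0.05], [0.05, 0.95], [0.60, 0.40]]

for name, probs in [("hesitant", hesitant), ("confident", confident)]:
    print(name, "loss:", round(cross_entropy(probs, labels), 3),
          "error rate:", error_rate(probs, labels))
```

Both rows report the same error rate, which is what a person wants to read, but the loss is lower for the confident predictions. That smooth response to small changes in the weights is what SGD needs in order to know which direction to move.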

stackoverflow.com/questions/57756451/why-we-use-the-loss-to-update-our-model-but-use-the-metrics-to-choose-the-model/57756794#57756794

Why we use the loss to update our model but use the metrics to choose the model we need?

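Note.

In addition, this code has the pretrained parameter set to its default value of True. The weights have already been trained on the famous ImageNet dataset to recognize a thousand different categories, so starting from the pretrained model gets us to a working classifier much faster.

A model whose weights have already been trained on another dataset like this is called a pretrained model.

(The weights are already trained.)

Using a previously trained model like this is more effective = faster, more accurate, with less data and less money.

(When using a pretrained model, cnn_learner removes the last layer, because that layer is tailored to the dataset the model was originally trained on. It replaces it with one or more new layers with randomized parameters, so that the model fits the dataset we are working with now.)

(The newly added part is called the head of the model.)

The authors say that the importance of pretrained models is generally not recognized in most courses and books, and is rarely considered in academic papers.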

-> what the??

[transfer learning]

analysisbugs.tistory.com/103

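Using a pretrained model, as described above, for a task different from the one it was originally trained on is called transfer learning.

(It is still understudied, so pretrained models are available in only a few domains. For example, few pretrained models currently exist in medicine.)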

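6. learn.fine_tune(1)

-> this tells fastai how to fit the model.

The main question here is how to fit the model's parameters.

To fit a model, we have to choose the number of epochs.

The number of epochs depends on 1) how much time you have and 2) how long it actually takes to fit the model.

[Key point: why use fine_tune instead of fit?]

The fit method can find suitable parameters. But here we start from a pretrained model and do not want to throw away what it already knows, so we use fine_tune.

*fine-tuning: a transfer learning technique that updates the parameters of a pretrained model by training for additional epochs

fine_tune goes through two steps.

-> Not understood yet: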

1.Use one epoch to fit just those parts of the model necessary to get the new random head to work correctly with your dataset.

2.Use the number of epochs requested when calling the method to fit the entire model, updating the weights of the later layers (especially the head) faster than the earlier layers (which, as we'll see, generally don't require many changes from the pretrained weights).
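The two steps can be imitated with a toy model. This is my own sketch, not fastai's implementation: the "network" has just two weights (a pretrained body and a new random head), the data is made up, and the choice of a 10x smaller learning rate for the body is an assumption for illustration.

```python
def loss(data, body, head):
    """Mean squared error of the toy model: prediction = head * body * x."""
    return sum((head * body * x - y) ** 2 for x, y in data) / len(data)

def train_epoch(data, body, head, lr_body, lr_head):
    """One pass over the data with per-example gradient steps.
    lr_body=0.0 keeps the pretrained body frozen."""
    for x, y in data:
        err = head * body * x - y
        grad_body = 2 * err * head * x
        grad_head = 2 * err * body * x
        body -= lr_body * grad_body
        head -= lr_head * grad_head
    return body, head

def fine_tune(data, body, head, epochs, lr=0.01):
    # Step 1: one epoch that trains only the new random head (body frozen).
    body, head = train_epoch(data, body, head, lr_body=0.0, lr_head=lr)
    # Step 2: train everything, updating the head faster than the pretrained body.
    for _ in range(epochs):
        body, head = train_epoch(data, body, head, lr_body=lr / 10, lr_head=lr)
    return body, head

data = [(1.0, 6.0), (2.0, 12.0), (3.0, 18.0)]   # made-up task: y = 6x
pretrained_body, random_head = 2.0, 0.5          # made-up starting weights
body, head = fine_tune(data, pretrained_body, random_head, epochs=20)

print(round(body * head, 4))
print(round(abs(body - pretrained_body), 4), round(abs(head - random_head), 4))
```

The head ends up far from its random starting value while the body barely moves, which is the point of the two steps: the new part does most of the adapting and the pretrained part is only nudged.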

So the code up to this point tells dogs from cats.

Now let's look at it in more detail.

[8.what our Image Recognizer Learned?]

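We have built a classifier that works well, but we don't know what it is actually doing, so this section looks into that.

Deep learning models are often called black boxes whose inner workings are hard to know, but in fact a large body of research shows how to inspect them.

The explanation uses AlexNet, the model that won the 2012 ImageNet competition and first showcased image classifiers on a big stage.

The figure can be found in 01_intro.

-> Not understood yet: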

This picture requires some explanation. For each layer, the image part with the light gray background shows the reconstructed weights pictures, and the larger section at the bottom shows the parts of the training images that most strongly matched each set of weights.

In layer 1, we can see that the model has discovered weights that represent diagonal, horizontal, and vertical edges, as well as various gradients.

These are the basic building blocks that a model learns for computer vision.

Remarkably, researchers who analyzed them found that these building blocks are similar to the basic visual machinery in the human eye.

For layer 2, there are nine examples of weight reconstructions for each of the features found by the model.

In layer 2, the features are corners, repeating lines, circles, and other simple patterns.

These are built from the basic building blocks developed in layer 1.

The pictures on the right are small patches from real images that match the features found above (corners, repeating lines, ...).

By layer 3, we can see that the features start to pick out meaningful parts of images.

The model can now recognize things like car wheels, text, and flowers.

The deeper the layer, the higher-level the concepts it recognizes.

eyes, nose, mouth -> a dog's face -> the breed of the dog .... it keeps recognizing higher and higher-level features.

That is why a pretrained model can be trained faster: it has already found the low-level features.

[9.Image Recognizers can tackle non-image tasks]

Many kinds of data can be represented as images, so an image classifier can solve tasks in many different fields.

1) For example, sound can be converted into a spectrogram and treated as an image.

etown.medium.com/great-results-on-audio-classification-with-fastai-library-ccaf906c5f52

Great results on audio classification with fastai library

2) Another example is images generated from a time series dataset for olive oil classification, using a technique called Gramian Angular Difference Field (GADF).

3) A third example is fraud detection at Splunk, where users' mouse movements and clicks were turned into images and fed to this kind of model.

www.splunk.com/en_us/blog/security/deep-learning-with-splunk-and-tensorflow-for-security-catching-the-fraudster-in-neural-networks-with-behavioral-biometrics.html

Splunk and Tensorflow for Security: Catching the Fraudster with Behavior Biometrics

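4) A fourth example is malware classification: the malware binary is divided into 8-bit sequences, each is converted to its decimal value, and the resulting vector is reshaped into a grayscale image.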

ieeexplore.ieee.org/abstract/document/8328749

Malware Classification with Deep Convolutional Neural Networks - IEEE Conference Publication

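Done this way, different kinds of malware produce visibly different images. They can even be told apart by the human eye, and if the human eye can tell them apart, deep learning can too.

I think this is the best example of converting a dataset into images.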
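The byte-to-pixel step of the malware example can be sketched like this. It is my own minimal version, not the paper's code: the image width, the zero padding of the last row, and the sample bytes are all assumptions for illustration.

```python
def bytes_to_grayscale(data, width):
    """Turn raw bytes into rows of 0-255 pixel values, zero-padding the last row."""
    values = list(data)                   # each byte is already an 8-bit value, 0..255
    padding = (-len(values)) % width      # pixels missing from the final row
    values.extend([0] * padding)
    return [values[i:i + width] for i in range(0, len(values), width)]

sample = bytes([0, 17, 34, 51, 68, 85, 102, 119, 136, 255])  # stand-in bytes, not real malware
image = bytes_to_grayscale(sample, width=4)
for row in image:
    print(row)
```

Each number is one pixel's brightness, so the nested list can be handed to any imaging library to be saved as a grayscale picture and then to an ordinary image classifier.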

This is just my opinion, but since the models and algorithms already exist, I think the important part is converting the problem I want to solve into images.

In general, you'll find that a small number of general approaches in deep learning can go a long way, if you're a bit creative in how you represent your data!
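As one more illustration of "being creative in how you represent your data", the sound example from 1) can be sketched in plain Python. This is a simplified spectrogram of my own (non-overlapping frames, no window function, a direct Fourier transform rather than an FFT), and the two-tone signal is made up.

```python
import cmath
import math

def spectrogram(signal, frame_size):
    """Magnitudes of the discrete Fourier transform of consecutive frames.

    Returns one row per frame and one column per frequency bin, i.e. a small
    grayscale 'image' of the sound.
    """
    rows = []
    for start in range(0, len(signal) - frame_size + 1, frame_size):
        frame = signal[start:start + frame_size]
        row = []
        for k in range(frame_size // 2 + 1):
            coeff = sum(s * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n, s in enumerate(frame))
            row.append(abs(coeff))
        rows.append(row)
    return rows

frame_size = 16
low = [math.sin(2 * math.pi * 1 * n / frame_size) for n in range(frame_size)]   # made-up low tone
high = [math.sin(2 * math.pi * 3 * n / frame_size) for n in range(frame_size)]  # made-up higher tone
image = spectrogram(low + high, frame_size)
peaks = [row.index(max(row)) for row in image]
print(peaks)
```

The brightest column moves from a low bin in the first frame to a higher bin in the second, so the change in pitch becomes a visible pattern that an image classifier can learn from.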

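[10.Final review]

One thing that is a little confusing here: the architecture is the template of a model, and a model is that architecture with its parameters fixed by training on a particular dataset.

The full summary below is quoted straight from the book, since I didn't feel like rewriting it in my own words.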

=> Machine learning is a discipline where we define a program not by writing it entirely ourselves, but by learning from data. Deep learning is a specialty within machine learning that uses neural networks with multiple layers. Image classification is a representative example (also known as image recognition). We start with labeled data; that is, a set of images where we have assigned a label to each image indicating what it represents. Our goal is to produce a program, called a model, which, given a new image, will make an accurate prediction regarding what that new image represents.

Every model starts with a choice of architecture, a general template for how that kind of model works internally. The process of training (or fitting) the model is the process of finding a set of parameter values (or weights) that specialize that general architecture into a model that works well for our particular kind of data. In order to define how well a model does on a single prediction, we need to define a loss function, which determines how we score a prediction as good or bad.

To make the training process go faster, we might start with a pretrained model—a model that has already been trained on someone else's data. We can then adapt it to our data by training it a bit more on our data, a process called fine-tuning.

When we train a model, a key concern is to ensure that our model generalizes—that is, that it learns general lessons from our data which also apply to new items it will encounter, so that it can make good predictions on those items. The risk is that if we train our model badly, instead of learning general lessons it effectively memorizes what it has already seen, and then it will make poor predictions about new images. Such a failure is called overfitting. In order to avoid this, we always divide our data into two parts, the training set and the validation set. We train the model by showing it only the training set and then we evaluate how well the model is doing by seeing how well it performs on items from the validation set. In this way, we check if the lessons the model learns from the training set are lessons that generalize to the validation set. In order for a person to assess how well the model is doing on the validation set overall, we define a metric. During the training process, when the model has seen every item in the training set, we call that an epoch.

All these concepts apply to machine learning in general. That is, they apply to all sorts of schemes for defining a model by training it with data. What makes deep learning distinctive is a particular class of architectures: the architectures based on neural networks. In particular, tasks like image classification rely heavily on convolutional neural networks, which we will discuss shortly.

[11.Beyond image classification..]

Going beyond classifying whole images, a model should be able to recognize objects, such as people, within an image.

reniew.github.io/18/

Major CNN-based models - (4) : Semantic Segmentation

Segmentation is the problem of classifying an image at the pixel level, deciding which object class each pixel belongs to.

= Creating a model that can recognize the content of every individual pixel in an image is called segmentation.

If each pixel is colored according to its class, objects can be told apart: cars in yellow, the sky in pink, building windows in blue, and so on.
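What a segmentation result looks like can be shown with a tiny hand-written mask. This is only an illustration of the output format under my own assumptions: the class ids, the 3x4 mask, and the color names are made up, and no model is involved.

```python
# Hypothetical class ids and display colors, following the coloring described above.
CLASS_NAMES = {0: "sky", 1: "car", 2: "window"}
CLASS_COLORS = {"sky": "pink", "car": "yellow", "window": "blue"}

def colorize(mask):
    """Map every pixel's class id to the display color of that class."""
    return [[CLASS_COLORS[CLASS_NAMES[class_id]] for class_id in row] for row in mask]

# One class id per pixel: this is what a segmentation model predicts.
mask = [
    [0, 0, 0, 0],
    [0, 2, 2, 0],
    [1, 1, 0, 0],
]
colored = colorize(mask)
for row in colored:
    print(row)
```

A classifier outputs one label for the whole image, whereas a segmentation model outputs a grid like mask with one label per pixel.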

[clean]

aigong.tistory.com/137

Solution : RuntimeError: CUDA out of memory.
