Chapter 1. What Is Data Mining?

Data Mining Tasks

Descriptive methods : Find human-interpretable patterns that describe the data.
Example : Clustering, Pattern Mining

Predictive methods : Use some variables to predict unknown or future values of other variables.
Example : Recommender Systems, Time Series Analysis

What Is Classification?

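Classification is a form of predictive modeling. It learns the relationship between the attributes of the data and their categories.
It then uses that relationship to determine the category of newly observed data.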

Example : Fraud detection, Churn prediction for telephone customers, Sky survey cataloging.
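
As a rough illustration of classifying a newly observed record, here is a minimal k-nearest-neighbors sketch in plain Python. The transaction data, the feature choice (amount, hour of day), and the function name knn_predict are all hypothetical and chosen only for this example.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # train is a list of (feature_vector, label) pairs.
    neighbors = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical labeled transactions: (amount, hour of day) -> label.
train = [
    ((20.0, 14.0), "normal"),
    ((35.0, 10.0), "normal"),
    ((25.0, 16.0), "normal"),
    ((900.0, 3.0), "fraud"),
    ((950.0, 2.0), "fraud"),
    ((870.0, 4.0), "fraud"),
]

# Classify a new, unlabeled transaction by its three nearest neighbors.
prediction = knn_predict(train, (880.0, 3.0))
print(prediction)
```

The features here are left unscaled for brevity; in practice distance-based methods usually need features on comparable scales.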

What Is Regression?

Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.

Example : Predicting sales of new products based on advertising, predicting wind velocities, time series prediction of stock market indices.
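
For the linear case, the idea can be sketched with ordinary least squares on a single input variable. The advertising and sales numbers below are made up for illustration, and fit_line is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend vs. sales, generated from y = 2x + 1.
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)

# Predict the continuous target for an unseen input value.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```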

What Is Clustering?

Finding groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

Example : Market Segmentation, Document Clustering
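
A minimal k-means sketch shows how such groups can be found without any labels. It assumes 2D points and, to keep the result deterministic, uses the first k points as initial centroids (real implementations usually initialize more carefully). The data and the function name kmeans are hypothetical.

```python
import math

def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (centroids, labels)."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels

# Hypothetical points forming two visibly separate groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

Points that end up with the same label are close to each other and far from the points in the other group, which is the property the definition above describes.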

What Is Learning?

Learning is broadly divided into three types.

Supervised Learning
Unsupervised Learning
Reinforcement Learning

What Is Supervised Learning?

The model is trained on data that comes with the correct answer (Label). The Classification and Regression methods mentioned above are examples.

Supervised Learning Algorithm
Classification	kNN
	Naive Bayes
	Support Vector Machine
	Decision Tree
Regression	Linear Regression
	Locally Weighted Linear
	Ridge
	Lasso

What Is Unsupervised Learning?

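No correct answer (Label) is provided; instead, similar data are grouped together. Clustering is the representative example. In practice, the Label and Category of the data are often unknown, so this approach is widely used. Dimensionality Reduction and the Hidden Markov Model also belong to this type.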

Unsupervised Learning Algorithm
Clustering
K-Means Clustering
Density Estimation
Expectation Maximization
Parzen Window
DBSCAN

What Is Reinforcement Learning?

The learner is given the outcome of each action along with a reward, and it learns to maximize the positive (+) outcomes and minimize the negative (-) ones.
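
One simple way to picture this reward-driven loop is a two-armed bandit with an epsilon-greedy policy. This technique is not from the notes above; it is used here only as a small illustration, and the reward values, parameters, and the function name run_bandit are hypothetical.

```python
import random

def run_bandit(rewards, steps=200, epsilon=0.1, seed=0):
    """Epsilon-greedy learner on a bandit whose arms pay fixed rewards."""
    rng = random.Random(seed)          # seeded for reproducibility
    values = [0.0] * len(rewards)      # estimated value of each action
    counts = [0] * len(rewards)        # how often each action was tried
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action with the highest estimated value.
            action = max(range(len(rewards)), key=lambda a: values[a])
        reward = rewards[action]
        counts[action] += 1
        # Incremental average of the rewards observed for this action.
        values[action] += (reward - values[action]) / counts[action]
        total += reward
    return values, total

# Hypothetical environment: action 0 is penalized (-1), action 1 is rewarded (+1).
values, total = run_bandit([-1.0, 1.0])
print(values, total)
```

After a few steps the learner settles on the rewarded action, so the accumulated reward is dominated by positive outcomes.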
