yunjinyong730/PatchMLP_Sensor_Calibration

PatchMLP_Sensor_Calibration

# Experimental results
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_4 (InputLayer)        [(1, 360, 5)]             0         
                                                                 
 normalizer_3 (Normalizer)   (1, 360, 5)               0         
                                                                 
 PatchMLP (PatchMLP)         (1, 1)                    3619      
                                                                 
 denormalizer_3 (Denormaliz  (1, 1)                    0         
 er)                                                             
...
_________________________________________________________________
934/934 [==============================] - 7s 6ms/step
Inference time: 7.062 seconds
Throughput: 4231.30 samples/second
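
The throughput line is the number of samples divided by the wall-clock inference time. The log does not state the sample count; 4231.30 samples/second over 7.062 seconds works out to about 29,880 samples. Below is a minimal sketch of such a measurement. The helper name `measure_throughput` and the stand-in model are assumptions for illustration, not the repository's code.

```python
import time

def measure_throughput(predict_fn, inputs):
    """Run predict_fn once over all inputs; return (seconds, samples per second)."""
    start = time.perf_counter()
    predict_fn(inputs)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-12)
    return elapsed, len(inputs) / elapsed

# Stand-in "model" (assumption): any callable that consumes the whole input list.
samples = [[1.0, 2.0, 3.0]] * 1000
elapsed, rate = measure_throughput(lambda xs: [sum(x) for x in xs], samples)
print(f"Inference time: {elapsed:.3f} seconds")
print(f"Throughput: {rate:.2f} samples/second")
```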

PM10 accuracy

Antwerp_pm10_w360 Antwerp: val rmse : 8.31779956817627, test rmse : 11.96527099609375
oslo_pm10_w360 Oslo: val rmse : 9.048280715942383, test rmse : 12.056038856506348
Zagreb_pm10_w360 Zagreb: val rmse : 16.360061645507812, test rmse : 13.834146499633789
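
For reference, RMSE is the square root of the mean squared difference between the reference (high-cost) readings and the calibrated predictions. A minimal sketch with made-up numbers, not data from SensEURCity:

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only (assumption): reference vs. calibrated PM10 readings.
reference = [10.0, 20.0, 30.0]
calibrated = [13.0, 16.0, 30.0]
score = rmse(reference, calibrated)  # errors 3, -4, 0 -> sqrt(25 / 3)
```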

SensEURCity dataset

(Large-scale particulate matter data from three European cities | low-cost sensors | includes high-cost sensor data)

Dataset used:

Paper Link:

PatchMLP: redefining long-term time series forecasting with a patch-based MLP

Implement PatchMLP from the paper, convert it into a sensor calibration model (a deep learning model that corrects low-cost sensor readings toward high-cost sensor data), and then measure its on-device performance

TL;DR

The reason Transformers look strong in LTSF (Long-Term Time Series Forecasting) may not be attention itself but the "patch" representation. The paper proposes PatchMLP, which reaches SOTA using only multi-scale patches, a simple decomposition in embedding space, and a mix of intra-/inter-variable MLPs

Problem statement and key messages

  • The permutation-invariant self-attention of non-autoregressive Transformers dilutes absolute temporal order and is shown to be vulnerable to high-frequency noise and redundant features in raw time series
  • Patches, in contrast, strengthen locality, reduce dimensionality, and suppress noise through a smoothing effect, giving an input representation better suited to time series
  • The recently popular channel independence hypothesis is overrated; properly designed interaction between variables (channel mixing) is shown to be essential for multivariate forecasting performance
  • Building on these insights, PatchMLP is designed to reach state-of-the-art performance with a simple MLP and no complex attention

๋ชจ๋ธ ๊ฐœ์š”

image

PatchMLP consists of four components

  • Multi-Scale Patch Embedding (MPE): splits the raw time series into non-overlapping patches of different lengths, then linearly embeds each patch to combine multi-scale information
  • Feature Decomposition in Embedding Space: separates the embedded tokens, not the raw signal, into a smooth component (Xs) and a residual component (Xr) using average pooling (AvgPool), suppressing random fluctuations and highlighting meaningful patterns
  • MLP Layer with Dual Mixing:
    • The intra-variable MLP learns patterns along the time axis; the inter-variable MLP learns interactions between variables
    • A dot-product combination on the inter-variable path strengthens nonlinear interaction
    • A residual connection and normalization follow each block for training stability
  • Projection Layer & Loss: projects the latent representation back to the original space to produce multi-step predictions, trained with MSE loss

Why patches are effective

  • In time series with many redundant and noisy features caused by high-frequency sampling, patches compress and smooth the input, lowering noise sensitivity and strengthening local semantic structure
  • Experiments show that a larger patch size is not always better; the balance with model capacity (d_model) matters
  • The optimal patch size tends to grow with input length, and the paper reports that excessive compression can cause information loss

Design points

  • Decompose after embedding: instead of a traditional trend/seasonal decomposition on the raw signal, split smooth/residual parts by average pooling in embedding space, which gives simple yet effective noise suppression
  • Dual Mixing: channel independence is not a cure-all; appropriate mixing between variables is shown to raise predictive power consistently
  • Benefit of the dot-product combination: on the inter-variable path, a dot-product combination expresses interactions better than a plain sum and performs better

Summary of experimental results

  • On 8 standard benchmarks (the ETT series, ECL, Traffic, Weather, Solar, and others), the paper reports SOTA on every dataset, measured by average performance over 4 forecast horizons (96/192/336/720)
  • It is presented as broadly outperforming Transformer-family models such as iTransformer, PatchTST, Crossformer, and FEDformer, as well as CNN/MLP-family models such as TimeMixer, DLinear, TiDE, and TimesNet
  • As input length grows, many models degrade on long ranges, whereas PatchMLP and DLinear improve steadily, suggesting an advantage in capturing long-term patterns

Ablation insights

  • Removing MPE weakens the learning of multi-scale relations and lowers performance
  • Removing the embedding decomposition makes noise suppression harder and increases error
  • Removing the inter-variable dot product, or variable mixing altogether, loses multivariate interaction and causes a significant performance drop
  • In conclusion, the combination of multi-scale patches, embedding decomposition, and dot-product-based variable mixing is confirmed as the main driver of performance

Limitations and future work

  • Average-pooling-based decomposition may not be optimal in domains with very strong non-stationarity or structural change; extensions to adaptive decomposition kernels or learned smoothing are needed
  • The choice of patch sizes and ratios in the multi-scale setup interacts with domain periodicity, so automatic scale selection or meta-learning techniques look promising
  • The inter-variable dot-product combination is simple and efficient, but for data with sparse or time-varying correlation structure, weighted masking or conditional mixing may adapt better

Conclusion

PatchMLP shows that, without complex attention design, the inherent advantages of the patch representation together with embedding-space decomposition and a dual-mixing MLP are enough to achieve simplicity, efficiency, and accuracy at once in LTSF.
This suggests that the strength of Transformers in LTSF may stem from the input representation (patches) and not from attention itself, and demonstrates that simple but sound structural choices can be an alternative


Implementation code

Data flow

[B, L, M] input → MPE (multi-scale patch embedding) → embedding-space decomposition (Xs/Xr) → Dual Mixing MLP blocks (Intra→Inter) → Predictor → output [B, T]

Detailed overview

  • In the input x_enc ∈ ℝ^{B×L×M}, L is the input window length, M is the number of variables, and B is the batch size
  • MPE patches the raw time series per scale → linearly embeds each patch → combines the scales
  • The embedding-space decomposition separates a smooth component Xs and a residual component Xr with a moving average
  • The Dual Mixing MLP block mixes the time/feature axis and the variable axis, in the order Intra-variable → Inter-variable
  • The Predictor summarizes the variable axis and maps to the forecast horizon T to produce the final output

The overall flow at a glance

Input: x = [batch, length L, variables M]

MPE (multi-scale patch embedding)

  • Cut x into pieces with several patch lengths (e.g. 4, 8, 16) and pass each piece through a simple Dense layer to turn it into a token
  • The number of tokens can differ per scale, so interpolate to align the lengths, then merge everything into a single representation
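
The two MPE steps above can be sketched in NumPy as follows. This is a rough illustration, not the repository's layer: the weights are random stand-ins for trained Dense kernels, linear interpolation is assumed for the alignment step, the finest scale is assumed as the reference token count, and the function names and `d_each=8` are hypothetical. The final fusing Dense is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def patch_embed(x, p, d_each):
    """x: [B, L, M] -> tokens [B, N, M, d_each] from non-overlapping patches of length p."""
    B, L, M = x.shape
    N = L // p                                          # any remainder at the end is dropped
    patches = x[:, :N * p, :].reshape(B, N, p, M)       # [B, N, p, M]
    patches = patches.transpose(0, 1, 3, 2)             # [B, N, M, p]
    W = rng.normal(scale=p ** -0.5, size=(p, d_each))   # stand-in for a trained Dense kernel
    return patches @ W                                  # [B, N, M, d_each]

def resample_tokens(tok, n_out):
    """Linearly interpolate along the token axis (axis 1) to n_out tokens."""
    N = tok.shape[1]
    src = np.linspace(0.0, N - 1, n_out)                # fractional source positions
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, N - 1)
    w = (src - lo)[None, :, None, None]
    return (1.0 - w) * tok[:, lo] + w * tok[:, hi]

def multi_scale_embed(x, patch_lens=(4, 8, 16), d_each=8):
    toks = [patch_embed(x, p, d_each) for p in patch_lens]
    n_ref = toks[0].shape[1]                            # align every scale to the first one
    toks = [resample_tokens(t, n_ref) for t in toks]
    return np.concatenate(toks, axis=-1)                # [B, n_ref, M, d_each * n_scales]

x = rng.normal(size=(2, 360, 5))                        # window 360, 5 variables, as in the summary
z = multi_scale_embed(x)
print(z.shape)
```

With a window of 360, the three scales yield 90, 45, and 22 tokens before alignment, so the coarser scales are stretched to 90 tokens.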

Embedding-space decomposition (FeatureDecomposition)

  • Split the token sequence just produced into a moving-average-smoothed version (Xs) and the residual left after subtracting it from the original (Xr)
  • In other words, it separates "trend / slow waves" from "fast changes / near-noise parts" in the embedding
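
A minimal NumPy sketch of this split, assuming an odd kernel so the window stays centered and edge-replicated padding so the token count is preserved. The function name `decompose` and the default `pool_size=9` are illustrative choices, not taken from the repository.

```python
import numpy as np

def decompose(tokens, pool_size=9):
    """Moving average along the token axis (axis 1); returns (Xs, Xr) shaped like tokens."""
    if pool_size < 1 or pool_size % 2 == 0:
        raise ValueError("pool_size must be a positive odd integer")
    half = pool_size // 2
    pad = [(0, 0)] * tokens.ndim
    pad[1] = (half, half)
    padded = np.pad(tokens, pad, mode="edge")           # repeat the first/last token
    n = tokens.shape[1]
    # Average the pool_size shifted views; window i covers padded[i : i + pool_size].
    windows = np.stack([padded[:, k:k + n] for k in range(pool_size)])
    xs = windows.mean(axis=0)                           # smooth component
    xr = tokens - xs                                    # residual component
    return xs, xr

tokens = np.random.default_rng(0).normal(size=(2, 90, 5, 24))
xs, xr = decompose(tokens)
```

By construction `xs + xr` reproduces the input exactly, so nothing is lost by the split; the two parts are simply exposed separately to the later blocks.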

Dual Mixing MLP block (multiple layers)

  • Intra-variable MLP: mixes time/features inside each variable to better represent that variable's own patterns
  • Inter-variable MLP: runs an MLP along the variable axis to learn interactions between variables
    • interaction="elem": element-wise gating (y * x + x), stable
    • interaction="dot": dot-product gate, more expressive (occasionally sensitive)

Point: residual connections and normalization follow each block for training stability

Predictor head

  • Summarizes the variable axis by a mean (or weighted sum), then produces a T-step prediction with Dense(T)
  • By default this implementation outputs a single time series, [B, T]. If multivariate prediction is needed, swap the head
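
A small NumPy sketch of such a head, assuming a `[B, M, D]` input. The function signature, the random kernel `W`, and the optional `var_weights` argument are hypothetical; with `T = 1` the output matches the `(1, 1)`-style shape in the model summary.

```python
import numpy as np

def predictor(z, W, b, var_weights=None):
    """z: [B, M, D] -> [B, T]. Pool over variables, then apply a Dense(T)-style projection."""
    if var_weights is None:
        pooled = z.mean(axis=1)                          # plain mean over the M variables
    else:
        w = np.asarray(var_weights, dtype=float)
        w = w / w.sum()                                  # normalize so the weights sum to 1
        pooled = (z * w[None, :, None]).sum(axis=1)      # weighted sum over variables
    return pooled @ W + b                                # [B, D] @ [D, T] -> [B, T]

rng = np.random.default_rng(0)
z = rng.normal(size=(2, 5, 24))
W, b = rng.normal(size=(24, 1)), np.zeros(1)             # T = 1: single-value calibration output
y = predictor(z, W, b)
y_equal = predictor(z, W, b, var_weights=np.ones(5))
```

Equal weights reduce the weighted sum to the plain mean, so `y` and `y_equal` agree.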

Optional) Normalization (use_norm)

  • Standardizes the input over the window length, then maps the prediction back to the original scale when producing output
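
One way to read this option is per-window standardization with the statistics kept for the inverse step. A minimal sketch follows; the function names, `eps`, and `target_idx` (which input variable the prediction is rescaled with) are assumptions, not the repository's Normalizer/Denormalizer layers.

```python
import numpy as np

def normalize(x, eps=1e-5):
    """x: [B, L, M]. Standardize each variable over the window axis; keep the statistics."""
    mean = x.mean(axis=1, keepdims=True)                 # [B, 1, M]
    std = x.std(axis=1, keepdims=True) + eps             # [B, 1, M]
    return (x - mean) / std, mean, std

def denormalize(y, mean, std, target_idx=0):
    """y: [B, T] in normalized units -> original units of variable target_idx."""
    return y * std[:, 0, target_idx][:, None] + mean[:, 0, target_idx][:, None]

rng = np.random.default_rng(0)
x = rng.normal(loc=50.0, scale=10.0, size=(2, 360, 5))
x_norm, mean, std = normalize(x)
# Round trip: treating the normalized series of variable 0 as a "prediction" recovers it.
restored = denormalize(x_norm[:, :, 0], mean, std, target_idx=0)
```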

๊ฐ ๋ธ”๋ก์„ ์™œ/์–ด๋–ป๊ฒŒ๋กœ ์ดํ•ดํ•˜๊ธฐ

  1. MultiScalePatchEmbedding

    • ์™œ? ๊ธด ์‹œ๊ณ„์—ด์—๋Š” ๋น ๋ฅธ ๋ณ€ํ™”๋„ ์žˆ๊ณ  ๋А๋ฆฐ ์ฃผ๊ธฐ๋„ ์žˆ์–ด. ์—ฌ๋Ÿฌ ๊ธธ์ด์˜ ํŒจ์น˜๋กœ ๋ณด๋ฉด ๋‘ ์˜์—ญ์„ ๊ฐ™์ด ์žก๊ธฐ ์‰ฌ์›Œ์ง
    • ์–ด๋–ป๊ฒŒ?
      • ๊ธธ์ด p๋กœ ์ž๋ฆ„ โ†’ [B, N, p, M]
      • p ๊ธธ์ด ํŒจ์น˜๋ฅผ ํŽด์„œ Dense(d_each) โ†’ [B, N, M, d_each]
      • ์Šค์ผ€์ผ๋งˆ๋‹ค N(ํ† ํฐ ์ˆ˜)์ด ๋‹ค๋ฅด๋ฉด ๋ณด๊ฐ„์œผ๋กœ ๋งž์ถค
      • ์Šค์ผ€์ผ๋“ค์„ ํŠน์ง• ์ฐจ์›์œผ๋กœ ํ•ฉ์น˜๊ณ  Dense(d_fuse)๋กœ ์ •๋ฆฌ
      • flatten_tokens=True๋ผ๋ฉด ํ† ํฐ์„ ํ‰๊ท ๋‚ด์„œ [B, M, d_model]๋กœ ์••์ถ•(์‹œ๊ฐ„ ํ•ด์ƒ๋„ โ†“, ๊ณ„์‚ฐ ํšจ์œจ โ†‘)
      • ํŒ: ์„ธ๋ฐ€ํ•œ ์‹œ๊ฐ„ ํŒจํ„ด์ด ์ค‘์š”ํ•˜๋ฉด flatten_tokens=False๋กœ ๋‘๊ณ  ํ† ํฐ์„ ์œ ์ง€ํ•จ
  2. FeatureDecomposition (์ด๋™ํ‰๊ท  ๋ถ„ํ•ด)

    • ์™œ? ์ž„๋ฒ ๋”ฉ์—๋„ ์—ฌ์ „ํžˆ ๋…ธ์ด์ฆˆ/๋น ๋ฅธ ์š”๋™์ด ์žˆ์Œ. ์ด๋™ํ‰๊ท ์œผ๋กœ ๋ถ€๋“œ๋Ÿฝ๊ฒŒ ๋งŒ๋“  ๊ฒƒ๊ณผ ์ž”์ฐจ๋กœ ๋‚˜๋ˆ„๋ฉด, ๋‹ค์Œ ๋ธ”๋ก๋“ค์ด ๋” ์•ˆ์ •์ ์œผ๋กœ ๋ฐฐ์šธ ์ˆ˜ ์žˆ์Œ
    • ์–ด๋–ป๊ฒŒ? ํ† ํฐ ์ถ• N ๋ฐฉํ–ฅ์œผ๋กœ AveragePooling1D๋ฅผ ์ ์šฉํ•˜๋Š”๋ฐ, ์–‘๋์„ ๋ฐ˜๋ณตํ•ด์„œ ํŒจ๋”ฉํ•ด์„œ ๊ธธ์ด๊ฐ€ ์ค„์ง€ ์•Š๋„๋ก ํ–ˆ์Œ
    • ๊ฒฐ๊ณผ๋Š” (Xs, Xr) = ๊ฐ™์€ ๋ชจ์–‘์˜ ๋‘ ํ† ํฐ์—ด
  3. Dual Mixing MLP (Intra โ†’ Inter ์ˆœ์„œ)

    • Intra-variable MLP: ๊ฐ ๋ณ€์ˆ˜ ์•ˆ์—์„œ ์‹œ๊ฐ„/ํŠน์ง•์„ ์„ž์–ด ํ•ด๋‹น ๋ณ€์ˆ˜์˜ ํ‘œํ˜„์„ ์—…๊ทธ๋ ˆ์ด๋“œ
      • axis="feature"๊ฐ€ ๊ธฐ๋ณธ: ๋งˆ์ง€๋ง‰ ํŠน์ง• ์ฐจ์›๋งŒ MLP๋กœ ๋Œ๋ ค ๊ฐ€๋ณ๊ณ  ์•ˆ์ •์ 
      • axis="token"๋„ ๊ฐ€๋Šฅ: ํ† ํฐ ์ถ•์„ ๋งˆ์ง€๋ง‰์œผ๋กœ ์˜ฎ๊ฒจ ์‹œ๊ฐ„ ๋ฐฉํ–ฅ ํ˜ผํ•ฉ๋„ ํ•  ์ˆ˜ ์žˆ์Œ
    • Inter-variable MLP: ๋ณ€์ˆ˜ ์ถ•์„ ๋งˆ์ง€๋ง‰์œผ๋กœ ์˜ฎ๊ฒจ Dense๊ฐ€ ๋ณ€์ˆ˜ ๊ฐ„์„ ์„ž๊ฒŒ ํ•จ
      • interaction ๋ชจ๋“œ๋กœ ์ƒํ˜ธ์ž‘์šฉ ๊ฐ•๋„๋ฅผ ๊ณ ๋ฅผ ์ˆ˜ ์žˆ์Œ
        • "elem": ์•ˆ์ •์ , ๊ธฐ๋ณธ๊ฐ’์œผ๋กœ ๋ฌด๋‚œ
        • "dot": ์ ๊ณฑ ๊ฒŒ์ดํŠธ๋กœ ๋ณ€์ˆ˜ ๊ฐ„ ๊ด€๊ณ„๋ฅผ ๋” ๊ฐ•ํ•˜๊ฒŒ ํ‘œํ˜„(ํ•™์Šต๋ฅ /์ •๊ทœํ™”์— ๋‹ค์†Œ ๋ฏผ๊ฐ)
    • ํฌ์ธํŠธ: Intra๋กœ ๊ฐ ๋ณ€์ˆ˜ ๋‚ด๋ถ€๋ฅผ ๋‹ค๋“ฌ๊ณ , Inter๋กœ ๋ณ€์ˆ˜ ๊ฐ„ ๊ด€๊ณ„๋ฅผ ์žก๋Š”๋‹ค โ€” ์ˆœ์„œ๋ฅผ ์œ ์ง€ํ•˜๋Š” ๊ฒŒ ์•ˆ์ •์ 
  4. Predictor (์ถœ๋ ฅ)

    • ์ด ๊ตฌํ˜„์€ ๋ณ€์ˆ˜ ์ถ•์„ ๋จผ์ € ์š”์•ฝ(ํ‰๊ท  ๋˜๋Š” ๊ฐ€์ค‘ํ•ฉ)ํ•˜๊ณ , ๋‚จ์€ ํ‘œํ˜„์„ ํŽด์„œ Dense(T)์— ๋„ฃ์–ด [B, T] ๋ฅผ ์ถœ๋ ฅ
    • ๋‹ค๋ณ€๋Ÿ‰ ์ถœ๋ ฅ์ด ํ•„์š”ํ•˜๋ฉด?
      • ์ง‘์•ฝํ•˜๊ธฐ ์ „์— ๋ณ€์ˆ˜๋ณ„๋กœ Dense(T)๋ฅผ ์ ์šฉํ•ด์„œ [B, T, M]์„ ๋งŒ๋“ค๊ฑฐ๋‚˜
      • Dense(T*M) ํ›„ reshapeํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Œ
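
The two multivariate options in item 4 can be sketched as follows. This is only an illustration with random weights: sharing one Dense(T) kernel across variables in the first head, and applying Dense(T*M) to the variable-pooled representation in the second, are assumptions about details the text leaves open. Both function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
B, M, D, T = 2, 5, 24, 4
z = rng.normal(size=(B, M, D))

def head_per_variable(z, W):
    """Option 1: Dense(T) on each variable before any aggregation -> [B, T, M]."""
    return np.swapaxes(z @ W, 1, 2)                      # [B, M, T] -> [B, T, M]

def head_reshape(z, W, horizon):
    """Option 2: pool over variables, Dense(T*M), then reshape -> [B, T, M]."""
    n_batch, n_vars, _ = z.shape
    pooled = z.mean(axis=1)                              # [B, D]
    return (pooled @ W).reshape(n_batch, horizon, n_vars)

y1 = head_per_variable(z, rng.normal(size=(D, T)))
y2 = head_reshape(z, rng.normal(size=(D, T * M)), T)
```

Option 1 keeps per-variable information all the way to the output, while option 2 keeps the existing pooling step and only widens the final layer.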

Intuition for choosing settings

  • flatten_tokens
    • True: fast and light, suited to long forecast horizons and high-noise domains (at the cost of time resolution)
    • False: keeps tokens to capture fine-grained patterns (compute/memory ↑)
  • interaction
    • Start with "elem" → experiment with "dot" once training is stable
  • pool_size (moving-average kernel)
    • Tune in roughly the 9 to 25 range based on the data's period and the window length (too large can over-smooth)
  • Normalization (use_norm=True)
    • Nearly mandatory when variable scales differ widely. Helps convergence and generalization
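
These settings could be gathered in a small config object with basic validation. The sketch below is hypothetical: the class name, the default values, and the odd-kernel check are assumptions, and only the four option names come from the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PatchMLPConfig:
    """Hypothetical container for the options discussed above."""
    flatten_tokens: bool = True      # True: pooled [B, M, d_model]; False: keep tokens
    interaction: str = "elem"        # start with "elem", try "dot" once training is stable
    pool_size: int = 13              # moving-average kernel, inside the 9 to 25 tuning range
    use_norm: bool = True            # per-window standardization of the input

    def __post_init__(self):
        if self.interaction not in ("elem", "dot"):
            raise ValueError(f"unknown interaction mode: {self.interaction!r}")
        # An odd kernel keeps the smoothing window centered (an assumption of this sketch).
        if self.pool_size < 1 or self.pool_size % 2 == 0:
            raise ValueError("pool_size must be a positive odd integer")

baseline = PatchMLPConfig()
expressive = PatchMLPConfig(flatten_tokens=False, interaction="dot", pool_size=25)
```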

About

🧑🏻‍💻 patchMLP implementation ver. Sensor Calibration
