TimeXer_Sensor_Calibration

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_2 (InputLayer)        [(1, 360, 5)]             0         
                                                                 
 normalizer_1 (Normalizer)   (1, 360, 5)               0         
                                                                 
 timexer (TimeXer)           (1, 1)                    6745      
                                                                 
 denormalizer (Denormalizer  (1, 1)                    0         
 )                                                               
...
_________________________________________________________________
934/934 [==============================] - 2s 1ms/step
Inference time: 1.866 seconds
Throughput: 16016.67 samples/second
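The numbers in the log above are internally consistent with 934 batches of 32 samples: 934 × 32 = 29,888 samples, and 29,888 / 1.866 s ≈ 16,017 samples/second. A minimal sketch of how such a measurement can be taken; the model here is a trivial stand-in, not the repository's TimeXer:

```python
import time
import numpy as np

def measure_throughput(predict_fn, batches):
    """Time predict_fn over a list of input batches and report total
    latency plus samples/second, mirroring the log above."""
    n_samples = sum(len(b) for b in batches)
    start = time.perf_counter()
    for b in batches:
        predict_fn(b)
    elapsed = time.perf_counter() - start
    return elapsed, n_samples / elapsed

# Toy stand-in for the calibrated model: mean over each (360, 5) window.
toy_predict = lambda b: b.mean(axis=(1, 2), keepdims=True)
batches = [np.random.rand(32, 360, 5) for _ in range(10)]
elapsed, throughput = measure_throughput(toy_predict, batches)
```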

Antwerp_pm10_w360
val rmse : 8.589814186096191, test rmse : 13.827042579650879

oslo_pm10_w360
val rmse : 9.852630615234375, test rmse : 15.736166954040527

Zagreb_pm10_w360
val rmse : 17.27320098876953, test rmse : 14.202249526977539

avg test rmse: 14.588486353556315 [13.827043, 15.736167, 14.20225]

TimeXer: Paper Notes on Strengthening Time Series Forecasting with Exogenous Variables

Paper: TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables (NeurIPS 2024)
Authors/Affiliation: Tsinghua Univ., BNRist
One-line summary of the core idea: represent the endogenous series as patch-level tokens and the exogenous series as variate-level tokens, then use a global token as a bridge to run patch-wise self-attention and variate-wise cross-attention together, so that external factors are absorbed robustly.




์™œ ์™ธ์ƒ ๋ณ€์ˆ˜๊ฐ€ ์ค‘์š”ํ•œ๊ฐ€?

์‹ค์„ธ๊ณ„ ์‹œ๊ณ„์—ด์€ ๊ฒฐ์ธก, ๋น„๊ท ์ผ ์ƒ˜ํ”Œ๋ง, ์ฃผ๊ธฐ/๊ธธ์ด ๋ถˆ์ผ์น˜, ์‹œ๊ฐ„ ์ง€์—ฐ ํšจ๊ณผ๊ฐ€ ํ”ํ•˜๋‹ค. ๊ธฐ์กด ์ ‘๊ทผ(๋‚ดยท์™ธ์ƒ์„ ๋™์ผ ์‹œ์ ์— concat)์œผ๋กœ๋Š” ์ •๋ ฌ/๋™๊ธฐํ™”๊ฐ€ ์–ด๋ ต๊ณ , ๋ถˆํ•„์š”ํ•œ ์ƒํ˜ธ์ž‘์šฉ๊ณผ ๋ณต์žก๋„๊ฐ€ ์ปค์ง„๋‹ค. TimeXer๋Š” ์ž„๋ฒ ๋”ฉ ๋‹จ๊ณ„์—์„œ ์—ญํ• ์„ ๋ถ„๋ฆฌํ•ด ์ด๋Ÿฌํ•œ ๋ฌธ์ œ๋ฅผ ์šฐํšŒํ•œ๋‹ค.


๋ฌธ์ œ ์ •์˜

  • ์ž…๋ ฅ: ๋‚ด์ƒ ๋‹จ๋ณ€๋Ÿ‰ $x_{1:T}$ ์™€ ๋‹ค์ˆ˜์˜ ์™ธ์ƒ ๋ณ€์ˆ˜ ์ง‘ํ•ฉ $z^{(1)}{1:T{\mathrm{ex}}}, \dots, z^{(C)}{1:T{\mathrm{ex}}}$ (๋‚ดยท์™ธ์ƒ์˜ look-back ๊ธธ์ด ๋ถˆ์ผ์น˜ ํ—ˆ์šฉ, $T \neq T_{\mathrm{ex}}$)
  • ๋ชฉํ‘œ: ํ–ฅํ›„ $S$ ์Šคํ…์˜ ๋‚ด์ƒ ์‹œ๊ณ„์—ด $\hat{x}{T+1:T+S} = F{\theta}!\big(x_{1:T}, z_{1:T_{\mathrm{ex}}}\big)$ ์˜ˆ์ธก
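The input/output shapes above can be made concrete with a toy sketch; all sizes are hypothetical (S=1 matches the repository's model summary), and the placeholder "forecaster" simply repeats the last observation:

```python
import numpy as np

# Hypothetical sizes; T != T_ex is explicitly allowed by the setup
T, T_ex, C, S = 360, 336, 4, 1

x = np.random.rand(T)        # endogenous series x_{1:T}
z = np.random.rand(C, T_ex)  # C exogenous series z^{(i)}_{1:T_ex}

# F_theta consumes (x, z) and emits the next S endogenous values;
# trivial placeholder forecast: repeat the last observed value S times.
x_hat = np.repeat(x[-1], S)
```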

๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜


Key design:

  1. Endogenous embedding: the series is split into non-overlapping patches, yielding temporal patch tokens plus a learnable series-level global token. The global token acts as the conduit between the patches and the exogenous information.
  2. Exogenous embedding: each exogenous series is embedded as a whole into a single variate token. This adapts naturally to missing values, misalignment, and differing periods/lengths.
  3. Attention flow
    • Endogenous self-attention: over [patch tokens + global token], learning patch-patch and patch-global relations jointly to capture temporal dependencies precisely.
    • Exogenous-to-endogenous cross-attention: the endogenous global token (query) selectively absorbs the exogenous variate tokens (keys/values), injecting variate-level correlations.

Intuition: the exogenous side picks "which variables matter" (variate level), while the endogenous side pinpoints "when they matter" (patch level). Linking the two axes through the global token cuts the cost of all-pairs interactions across variables while still letting information flow in selectively.


ํ•™์Šต/์†์‹ค ๋ฐ ๋ฉ€ํ‹ฐ๋ณ€์ˆ˜ ์˜ˆ์ธก์œผ๋กœ์˜ ์ผ๋ฐ˜ํ™”

  • ์ถœ๋ ฅ ์ƒ์„ฑ: ๋งˆ์ง€๋ง‰ ๋ธ”๋ก์—์„œ ์–ป์€ ํŒจ์น˜ ํ‘œํ˜„๊ณผ ์ „์—ญ(๊ธ€๋กœ๋ฒŒ) ํ‘œํ˜„์„ ํ•˜๋‚˜๋กœ ํ•ฉ์นœ ๋’ค, ์ด๋ฅผ **์„ ํ˜• ๋ณ€ํ™˜(์™„์ „์—ฐ๊ฒฐ์ธต)**์— ํ†ต๊ณผ์‹œ์ผœ ๋ฏธ๋ž˜ ๊ฐ’์„ ์˜ˆ์ธกํ•œ๋‹ค. ์ฆ‰, ์‹œ๊ฐ„ ๊ตฌ๊ฐ„๋ณ„ ์ •๋ณด(ํŒจ์น˜)์™€ ์‹œ๊ณ„์—ด ์ „๋ฐ˜์˜ ์š”์•ฝ ์ •๋ณด(์ „์—ญ)๋ฅผ ๊ฒฐํ•ฉํ•ด ์ตœ์ข… ์˜ˆ์ธก์„ ๋งŒ๋“ ๋‹ค.
  • ์†์‹ค: L2(์ œ๊ณฑ ์˜ค์ฐจ)
  • ๋ฉ€ํ‹ฐ๋ณ€์ˆ˜ ์˜ˆ์ธก: ๊ฐ ๋ณ€์ˆ˜๋ฅผ โ€œ๋‚ด์ƒโ€์œผ๋กœ ๋‘๊ณ  ๋‚˜๋จธ์ง€ ๋ณ€์ˆ˜๋Š” ์™ธ์ƒ์œผ๋กœ ๋ณ‘๋ ฌ ์ฒ˜๋ฆฌ(์ฑ„๋„ ๋…๋ฆฝ), Self/Cross-Attn ์ธต ๊ณต์œ .
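The output head described above, concatenating patch and global representations and projecting linearly, can be sketched as follows; all sizes and the random weight matrix are hypothetical:

```python
import numpy as np

N, D, S = 22, 16, 24                         # patches, model dim, horizon
patches = np.random.randn(N, D)              # last-block patch outputs
g = np.random.randn(1, D)                    # last-block global output

# Flatten [patch tokens || global token] and project to S future steps
h = np.concatenate([patches, g]).reshape(-1)  # ((N + 1) * D,)
W = np.random.randn(S, h.size) * 0.01         # stand-in for the FC head
y_hat = W @ h                                 # (S,) forecast

# L2 (squared-error) loss against a dummy target
loss = np.mean((y_hat - np.zeros(S)) ** 2)
```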

Results Summary

  • Short-term electricity price forecasting (EPF, input 168 → predict 24): SOTA in MSE/MAE on all 5 markets. For example:
    PJM MSE 0.093 (iTransformer 0.097, Crossformer 0.101, etc.)
    NP MSE 0.236 (Crossformer 0.240, RLinear 0.335, etc.)
    → Precise use of exogenous variables combined with temporal dependency learning consistently beats competing models.
  • Long-term multivariate forecasting (ETT/ECL/Weather/Traffic, averaged): consistently strong performance on most datasets.
  • Why does it work? Existing models fail in opposite ways:
    • Crossformer entangles all variables at the fine patch level, inflating noise and complexity
    • iTransformer looks only at the variate level and leaves temporal detail to a linear projection
      → TimeXer's two-level design, patches (time) × variates (exogenous), remedies both shortcomings at once.

Generality, Robustness, and Scalability

  • The performance advantage persists under look-back mismatch (different endogenous/exogenous lengths). Extending the endogenous look-back is notably more beneficial than extending the exogenous one.
  • Robust to missing or random exogenous inputs: the endogenous temporal representation drives the forecast, so performance does not collapse even when the exogenous inputs are entirely uninformative. Conversely, corrupting the endogenous series degrades performance sharply.
  • Efficiency: exogenous-exogenous interactions are not unrolled at every layer; they are handled through global-token cross-attention, which is favorable in memory use and training speed.
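A back-of-the-envelope count of attention pairs per layer illustrates the efficiency claim; the token counts are assumptions for illustration, not figures from the paper:

```python
# Rough per-layer attention-pair counts (illustrative assumption)
N, C = 22, 10   # patch tokens per series, exogenous variates

# If every variable were patched and mixed jointly (Crossformer-style),
# attention runs over all (C + 1) * N tokens at once:
full_mixing = ((C + 1) * N) ** 2

# TimeXer: self-attention over N + 1 endogenous tokens, plus one
# global-token query against C variate tokens:
timexer = (N + 1) ** 2 + 1 * C
```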

์žฌํ˜„์„ ์œ„ํ•œ ๊ธฐ๋ณธ ์„ค์ •

  • ํ”„๋ ˆ์ž„์›Œํฌ/ํ•˜๋“œ์›จ์–ด: PyTorch, ๋‹จ์ผ RTX 4090 24GB
  • ์ตœ์ ํ™”: Adam, lr=1e-4, L2 Loss, Early Stopping, 10 epoch ๊ณ ์ • ํ•™์Šต
  • ๋ชจ๋ธ ํฌ๊ธฐ: Block $L \in {1,2,3}$, $d_{\text{model}}\in{128,256,512}$
  • ํŒจ์น˜ ๊ธธ์ด: ์žฅ๊ธฐ 16, ๋‹จ๊ธฐ 24(๋น„์ค‘์ฒฉ) โ€” ์ž‘์€ ํŒจ์น˜๋Š” ์˜๋ฏธ ์ •๋ณด ํฌ์„ ๊ฐ€๋Šฅ(์„ฑ๋Šฅ ์ €ํ•˜)

Quick Pseudo-code (conceptual flow)

# x: endogenous (T,), z_list: [z^(1)_(T_ex), ..., z^(C)_(T_ex)]
patch_tokens = PatchEmbed(split_nonoverlap(x, P))       # (N, D)
g_token     = LearnableGlobalToken()                    # (1, D)
v_tokens    = [VariateEmbed(z) for z in z_list]         # (C, D)

# L layers
for _ in range(L):
    # Self-Attn over [patch_tokens || g_token]
    patch_tokens, g_token = SelfAttentionConcat(patch_tokens, g_token)
    # Cross-Attn: g_token (Q)  <-- v_tokens (K,V)
    g_token = CrossAttention(g_token, v_tokens)

y_hat = LinearProjection(concat(patch_tokens, g_token))  # forecast
loss  = mse(y_hat, y_true)

About

๐Ÿง‘๐Ÿปโ€๐Ÿ’ปTimeXer implementation ver. Sensor Calibration
