[AWS Re:Invent 2023] Deep dive into Amazon Aurora and its innovations

Created: November 29, 2023
Created by: DaEun Kim
Tags: AWS

Global Database

  • primary cluster in region A
  • secondary cluster in region B

Switchover

  • If the primary cluster in one region goes down, the secondary cluster in another region is promoted to primary → in exchange, in-flight replicated data can be lost.
  • A replication agent/server handles replication between the clusters.
  • To minimize lag, the secondary is usually created in a nearby region.
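
The data-loss point above can be illustrated with a toy model. This is only a sketch under simple assumptions, not how Aurora replicates internally: `Cluster`, `replicate`, and `fail_over` are hypothetical names, and each cluster is reduced to an ordered list of writes.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a cluster: just an ordered log of writes."""
    region: str
    log: list = field(default_factory=list)


def replicate(primary: Cluster, secondary: Cluster, max_records: int) -> None:
    """Ship up to max_records not-yet-replicated writes to the secondary."""
    start = len(secondary.log)
    secondary.log.extend(primary.log[start:start + max_records])


def fail_over(primary: Cluster, secondary: Cluster) -> list:
    """Return the writes that never reached the secondary being promoted."""
    return primary.log[len(secondary.log):]


primary = Cluster("region-a")
secondary = Cluster("region-b")

primary.log.extend(["w1", "w2", "w3", "w4", "w5"])
replicate(primary, secondary, max_records=3)  # w4 and w5 are still in flight

lost = fail_over(primary, secondary)
print(secondary.log)  # writes that survive on the promoted cluster
print(lost)           # in-flight writes lost together with the old primary
```

The smaller the replication lag, the shorter the `lost` list, which is why a nearby region is the usual choice for the secondary.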

Write forwarding

  • When a write is performed on the primary instance in the primary cluster, the data is replicated all the way to the primary/replica in the secondary cluster.

Fast clones

Export to S3 via clone

  • parallel export available

Enhanced binlog

  • Two-phase commit is complex → the two-phase commit structure has been improved.

pgvector (for postgresql)

  • Vectors can be used through the embedding vendor keyword.
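
As a rough illustration of what a vector search does, the sketch below computes in plain Python the two distances that pgvector exposes as the `<->` (Euclidean) and `<=>` (cosine distance) operators. The documents and embeddings are made-up sample data, and `nearest` is a hypothetical helper, not part of pgvector.

```python
import math


def l2_distance(a, b):
    """Euclidean distance, the quantity behind pgvector's <-> operator."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity, the quantity behind pgvector's <=> operator."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def nearest(query, rows, k, distance):
    """Return the k rows whose embedding is closest to the query vector."""
    return sorted(rows, key=lambda row: distance(query, row[1]))[:k]


# Hypothetical two-dimensional embeddings, for illustration only.
docs = [
    ("doc-a", [1.0, 0.0]),
    ("doc-b", [0.0, 1.0]),
    ("doc-c", [0.9, 0.1]),
]

top = nearest([1.0, 0.0], docs, k=2, distance=l2_distance)
print([name for name, _ in top])
```

In the database itself, the equivalent query orders rows by a distance operator and applies a `LIMIT`.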

Aurora Serverless

  • The Aurora server scales in/out in step with how much the application servers scale in/out. (CPU/memory resizing)
    • Memory resizing = buffer pool resizing (default memory split: buffer pool 75%, heap 25% → the buffer-to-heap ratio is adjusted automatically)
    • If there is spare memory, memory usage is reduced automatically.
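
To make the default split concrete, here is a small arithmetic sketch. The 75/25 ratio comes from the notes above; the figure of roughly 2 GiB of memory per capacity unit (ACU) is an assumption, and `memory_split` is a hypothetical helper name.

```python
GIB_PER_ACU = 2.0         # assumption: roughly 2 GiB of memory per capacity unit
BUFFER_POOL_SHARE = 0.75  # default split from the notes: 75% buffer pool, 25% heap


def memory_split(acu, buffer_pool_share=BUFFER_POOL_SHARE):
    """Estimate how memory divides between buffer pool and heap at a capacity."""
    total = acu * GIB_PER_ACU
    buffer_pool = total * buffer_pool_share
    return {
        "total_gib": total,
        "buffer_pool_gib": buffer_pool,
        "heap_gib": total - buffer_pool,
    }


# Scale out from 2 to 8 capacity units, then back in to 4.
for acu in (2, 8, 4):
    print(acu, memory_split(acu))
```

Because the buffer pool holds most of the memory, resizing capacity is mostly a matter of growing or shrinking the buffer pool.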

Blue/Green deployment

  • Data is replicated from the blue cluster to the green cluster, so cutover / rollback is possible?
  • After cutover, is the cluster that is currently "next" dropped?
  • Use cases
    • DBMS version updates
    • Feature deployments that include DDL

Zero-ETL

  • by data streaming from Aurora to Redshift

New Aurora Storage Type

  • I/O-optimized
    • predictable price
    • improved price performance for IO-heavy workload
    • applied per cluster
    • how it works: optimized reads with tiered cache
      • Reduces the number of disk I/Os per unit of time inside the DB → only possible if the cache hit rate goes up.
        • Raising the cache hit rate requires good partitioning.
        • Raising the cache hit rate requires good indexing. (The more often specific index blocks in the heap tree structure are accessed, the higher the cache hit rate for that index will be.)
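
The link between cache hit rate and disk I/O is simple arithmetic: only cache misses reach the disk. A minimal sketch, with a made-up request rate and a hypothetical helper name:

```python
def disk_reads_per_second(page_requests_per_second, cache_hit_rate):
    """Only cache misses reach storage: reads = requests * (1 - hit rate)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return page_requests_per_second * (1.0 - cache_hit_rate)


# Hypothetical workload of 10,000 page requests per second.
for hit_rate in (0.90, 0.99):
    reads = disk_reads_per_second(10_000, hit_rate)
    print(f"hit rate {hit_rate:.0%}: about {reads:.0f} disk reads/s")
```

Going from a 90% to a 99% hit rate cuts disk reads by roughly a factor of ten, which is why partitioning and indexing choices matter so much for an I/O-heavy workload.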

Limitless Database

  • scale ≈ managed sharding
  • Challenges you run into once you shard
    • inconsistency
      • A DDL statement is executed and fails on some of the shards
      • The same query is executed, but some shards return different results because of query delay
    • capacity management
      • single database gets bigger → shard it → shards become imbalanced → scale up the heavily loaded shards (but do the lightly loaded instances have to be scaled up too?) → this comes back as unnecessary monetary cost
  • Limitless addresses the problems that can arise with sharding.
    • 2 million TPS guaranteed
    • how?
      • automated re-sharding (internally, when data piles up on a particular shard, that shard is sharded again)
      • serverless capacity (capacity is scaled up and down as traffic rises and falls)
      • consistent DDL/Query/Backup
        • It has a global clock internally
        • For cross-shard transactions, the global clock plus each shard's snapshot allows a consistent read with no time delay
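
The global clock plus per-shard snapshot idea can be sketched with a toy multi-version store. This is an illustration under simple assumptions, not Limitless Database's actual implementation: `GlobalClock` and `Shard` are hypothetical classes, and the clock is just a shared counter.

```python
import itertools


class GlobalClock:
    """Toy global clock: hands out strictly increasing timestamps."""

    def __init__(self):
        self._counter = itertools.count(1)

    def now(self):
        return next(self._counter)


class Shard:
    """Toy shard that keeps every committed version of each key."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value) in commit order

    def write(self, key, value, commit_ts):
        self._versions.setdefault(key, []).append((commit_ts, value))

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        visible = [v for ts, v in self._versions.get(key, []) if ts <= snapshot_ts]
        return visible[-1] if visible else None


clock = GlobalClock()
shard_a, shard_b = Shard(), Shard()

# Initial balances, committed on both shards at the same timestamp.
ts = clock.now()
shard_a.write("alice", 100, ts)
shard_b.write("bob", 0, ts)

# A reader picks one snapshot timestamp for all shards.
snapshot_ts = clock.now()

# A cross-shard transfer commits after the reader's snapshot.
ts = clock.now()
shard_a.write("alice", 70, ts)
shard_b.write("bob", 30, ts)

snapshot_view = (shard_a.read("alice", snapshot_ts), shard_b.read("bob", snapshot_ts))
latest_ts = clock.now()
latest_view = (shard_a.read("alice", latest_ts), shard_b.read("bob", latest_ts))
print(snapshot_view, latest_view)
```

Because every shard answers for the same timestamp, the reader sees either the state before the transfer or the state after it, never a mix of the two.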