I’m Juechu “Joy” Dong

Confidential Computing | Computer Architect | PhD candidate @UMich


Greetings from Joy! Welcome to my page. My Chinese name is 董珏初 Juechu (pronounced ge ü e, chew). If you find it hard to pronounce my name in Mandarin, I’m totally fine with Joy.😊

Juechu (Joy) Dong is a PhD candidate in the University of Michigan CSE department, advised by Prof. Satish Narayanasamy. Her research focuses on privacy-enhancing technologies and large-scale parallel computing. Her work seeks to advance parallel and confidential computing to enable privacy-preserving data analytics, ranging from population-scale genomic analysis to generative AI. Joy received dual Bachelor’s degrees in Computer Engineering from Shanghai Jiao Tong University and the University of Michigan. She was awarded the Rackham International Student Fellowship.



  • [Jun 2024]

    Our work Toleo has been accepted by ASPLOS. It will be included in the ASPLOS'24 proceedings, and we will present it at ASPLOS'25. See you in Rotterdam~

  • [May 2024]

    I joined the Meta PyTorch Compiler team this summer as a research scientist intern. See you in Menlo Park~

  • [Mar 2024]

    Our work mm2-gb for long-sequence DNA mapping has been accepted by BioSys'24. Check out our open-source demo. Many thanks to the AMD HPC team! See you in San Diego~

  • [Jan 2024]

    I passed the PhD qualification test and became a PhD candidate.




  • 2022 Sept - exp. 2027

    University of Michigan

    Ph.D. in Computer Science and Engineering | Computer Architecture & Systems
  • 2022 Apr

    University of Michigan

    B.S.E. in Computer Engineering | GPA: 3.99/4.00

    Coursework: EECS470 Computer Architecture (A), EECS482 Operating Systems (A), Parallel CUDA Programming (A)

  • 2022 Aug

    Shanghai Jiao Tong University

    B.S.E. in Electrical & Computer Engineering | GPA: 3.82/4.00

    Coursework: VE401 Probabilistic Methods in Eng. (A+), VV186/VV285/VV286 Honors Mathematics II/III/IV (A-, A, A)


  • 2024 May - 2024 Aug


    Research Scientist Intern | PyTorch Team

    • Develop new techniques in TorchDynamo, TorchInductor, PyTorch core, PyTorch Distributed.
    • Explore the intersection of PyTorch compiler and PyTorch distributed.
    • Optimize Generative AI models across the stack (pre-training, fine-tuning, and inference).

  • 2022 May - 2022 Aug


    Deep Learning Compute Architect Intern | GPU Architecture

    GPU performance analysis, especially for deep learning workloads.
    Specialize in: GPU architecture, memory hierarchy & multi-device communication


  • ASPLOS'24 – Accepted

    Toleo: Scaling Freshness to Tera-scale Memory Using CXL and PIM

    Juechu Dong, Jonah Rosenblum, Satish Narayanasamy

    We will present Toleo at ASPLOS'25! [code]

    🌟Scale trusted memory size from hundreds of MB to tens of TB by expanding the span of trust from a single trusted processor to an entire platform including intelligent memories.
    🌟Design a new scheme of freshness protection that reduces the space requirement by 50x.
    🌟Reduce deployment cost by space-sharing one intelligent memory device among multiple CPUs.

  • RECOMB'24 – under submission

    SECRET-GWAS: Confidential Computing for Population-Scale GWAS

    Jonah Rosenblum, Juechu Dong, Satish Narayanasamy

    Develop a thousand-core platform on Azure Confidential Computing to conduct multi-institutional GWAS on millions of patients in less than a minute.
    Adapt the Spark-based Hail genomic analysis framework to run in a TEE under obliviousness requirements.
    Parallelize GWAS computation on 1k cores to achieve near linear speedup.

  • BioSys-2024

    mm2-gb: GPU Accelerated Minimap2 for Long Read DNA Mapping

    Juechu Dong, Xueshen Liu, Harisankar Sadasivan, Sriranjani Sitaraman, Satish Narayanasamy

    [code] [preprint] [slides] [AMD Blog]

    Performance Boost: Accelerate the bottleneck step (chaining) of the state-of-the-art long-sequence mapping tool minimap2 by 2.57x-5.33x on GPU.
    Scales Well: Optimize for ultra-long reads of 50kb+ to accommodate the trend in genome sequencing technology.
    Open Sourced: Actively maintained and optimized! Community contributions are welcome~

  • 2021 Feb - 2021 Apr

    Out-of-Order Processor Design

    EECS470 Computer Architecture Project

    Design an out-of-order, 3-way superscalar processor based on the R10K design using SystemVerilog. Add advanced features: a load-store queue, an advanced branch predictor, and a cache hierarchy.



  • Coordinator

    Computer Engineering Lab Reading Group

    Organize weekly paper reading presentations and discussions.
    Host talks from visiting researchers and professors.

  • Co-Founder & Vice President

    UM-SJTU Joint Institute Alumni Association

    Alumni Engagement: Organize alumni and student gatherings.
    Relationship Building: Help expand SJTU-UM collaborations, connect with JI sponsors, and build industry relationships.
    Career Advising: Organize career development workshops for students.
    Welcoming: Host new student orientation events, organize airport pickups, and help newcomers settle in.
    Student Support: Support students through the stressful transition to a new university in a new country, and during urgent crises.



  • FA2023

    Graduate Student Instructor: EECS570 Parallel Computer Architecture

    with Prof. Ron Dreslinski @UMich
  • FA2023

    Graduate Student Instructor: EECS471 CUDA Programming

    with Dr. Valeriy Tenishev @UMich
  • FA2021, WN2022

    Instructional Aide: EECS470 Computer Architecture

    with Prof. Mark Brehob and Prof. Ronald Dreslinski @UMich

    Teach out-of-order processor design topics including branch prediction, pipelines, prefetching, and caches. Hold lab sessions and develop exam problems on OoO processor design.

  • SP2021

    Teaching Assistant: VE401 Probabilistic Methods in Eng.

    Instructor Dr. Horst Hohberger @SJTU-UM Joint Institute

    Probability theory and statistics are interesting and important, but often misunderstood. From a wonderful piece of data, one can draw nonsensical conclusions if probabilistic methods are not used in the right way. When dealing with computer security, it’s important that we can conclude that sensitive data is “almost impossible” to leak.

  • SU2020

    Teaching Assistant: VP260 Honors Physics

    Instructor Dr. Mateusz Krzyzosiak @SJTU-UM Joint Institute

    I enjoy teaching and want to devote my energy to helping students. It was a remote semester, and everyone was isolated and very stressed. I was lucky to be able to help and support students and to be part of this wonderful teaching team.


  • Programming Languages

    C/C++, CUDA, (System)Verilog, HIP, Bash, Makefile

  • Technologies/Frameworks

    GPU Tuning: nsight-compute/nsight-sys, omniperf/omnitrace/rocprof
    Formal Verification: Murphi
    SIMD: AVX-512, AVX2 on Xeon Phi
    Simulation: SniperSim, DRAMSim, pinplay
    Confidential Computing: Open Enclave SDK, Intel SGX

  • Architectures

    AMD CDNA2 Instinct GPU, NVIDIA Hopper GPU, Intel Xeon Phi, Out-of-order CPU


  • Bookstores, free markets and cafés are my must-visits while traveling. My recent favorite is Campfire Coffee in Negaunee, MI, a tiny town near Marquette in the Upper Peninsula. Nice place to visit in fall.
  • Shanghai has only two seasons, winter and summer, and they switch randomly. It is otherwise a wonderful city to live in.
  • My sources of mental support: food, friends, IKEA sharks and my cat Marie.
  • Best thing that happened recently: I got a really good book about Hapsburg history on the street from a weird old man.


  • Ziqiao Ma PhD @UMich, Ann Arbor, Michigan
  • … (send me a message to link your page)





4844 Bob & Betty Beyster Building
2260 Hayward St
Ann Arbor, MI