MyDarapy/README.md

Hi there, I'm Dara 👩👋

I am a machine learning research engineer.

My work and interests revolve around foundation multimodal models, large-scale pre-training, and inference optimization. I am currently deep-diving into HPC for deep learning, writing custom GPU kernels in Triton that leverage tiling, shared memory, and parallel execution to overcome the memory wall. My focus is on accelerating DL training and inference through IO-aware kernel design.
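The tiling idea behind those kernels can be sketched in plain NumPy (an illustrative toy, not one of my actual Triton kernels; the function name and tile size are arbitrary):

```python
import numpy as np

def tiled_matmul(A, B, tile=32):
    """Blocked (tiled) matrix multiply: work on small tiles that fit in
    fast memory (shared memory / registers on a GPU) so each element of
    A and B is read from slow global memory far fewer times."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(0, M, tile):          # each (i, j) pair is one output tile;
        for j in range(0, N, tile):      # on a GPU these run as parallel programs
            acc = np.zeros((min(tile, M - i), min(tile, N - j)), dtype=A.dtype)
            for k in range(0, K, tile):
                # "load" one tile of A and one tile of B into fast memory
                a = A[i:i + tile, k:k + tile]
                b = B[k:k + tile, j:j + tile]
                acc += a @ b             # accumulate in registers
            C[i:i + tile, j:j + tile] = acc
    return C
```

A real Triton kernel follows the same blocking structure, but the tile loads target shared memory, the accumulator lives in registers, and the (i, j) tiles are executed by parallel program instances.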

I care about AI safety and interpretability, so I occasionally do some mechanistic interpretability probing and write about my findings here
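A common primitive in that kind of probing is directional ablation: projecting a chosen direction out of a model's activations so they carry no component along it. A hedged NumPy sketch (the function name and shapes are illustrative, not from any specific repo):

```python
import numpy as np

def ablate_direction(acts, direction):
    """Remove one direction from activation vectors:
    h' = h - (h . d_hat) d_hat, so every row of the result has zero
    component along d_hat while the rest of the activation is untouched.

    acts:      (batch, d_model) activation matrix
    direction: (d_model,) direction to ablate (need not be unit length)
    """
    d_hat = direction / np.linalg.norm(direction)   # normalize to unit length
    coeffs = acts @ d_hat                           # per-row component along d_hat
    return acts - np.outer(coeffs, d_hat)           # subtract that component
```

In practice the direction is estimated first (e.g. as a difference of mean activations between two behaviors) and the projection is applied inside the model's forward pass.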


My Work


Get in touch

  • You can reach me via email
  • I regularly write about deep learning and high-performance GPU programming on my ML blog
  • Connect with me on LinkedIn and X

My Résumé

Link

Pinned Repositories

  1. nano-vllm (Python): Building an inference engine from scratch

  2. multimodal-llms (Python): Framework for fusing continuous audio embeddings into a causal language model for audio understanding

  3. triton (Python): Custom kernels in Triton for accelerating LLM training and inference

  4. gpt-1-from-scratch (Python): Rewriting and pretraining GPT-1 from scratch; implementing Multi-Head Attention (MHA) in PyTorch from the original paper, Improving Language Understanding by Generative Pre-Training (https://cdn.open…)

  5. transformer-attenttion (Python): Bare-bones implementation of every transformer component

  6. ablate-compliance (Python): Identifying and ablating the activation-space directions that enable jailbreaks in large language models