Feature: copy and paste images#30

Open
abhi12299 wants to merge 13 commits into meltylabs:main from abhi12299:image-paste
@abhi12299
Contributor

@abhi12299 abhi12299 commented Sep 8, 2024

What?

Allows copying and pasting images from the clipboard into the Claude conversation. Please merge #25 first, as this branch is based on that PR.

  • Max of 5 images supported
  • Max size of 5MB per image (Claude's restriction)
  • Only image MIME types supported by Claude are accepted
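The three limits above could be checked roughly as follows. This is a minimal sketch, not the code in this PR: all names are hypothetical, 5MB is assumed to mean 5 × 1024 × 1024 bytes, and the MIME type list (JPEG, PNG, GIF, WebP) is an assumption based on what Claude's API documents rather than something the PR lists.

```typescript
// Hypothetical names; the limits are the ones listed in the PR description.
const MAX_IMAGES = 5;
const MAX_IMAGE_BYTES = 5 * 1024 * 1024; // assumption: 5MB means 5 MiB
// Assumption: the image types Claude accepts; the PR does not list them.
const SUPPORTED_MIME_TYPES: string[] = ["image/jpeg", "image/png", "image/gif", "image/webp"];

interface PastedImage {
  mimeType: string;
  base64: string; // raw base64 payload, without a "data:...;base64," prefix
}

// Size in bytes of the data that a padded base64 string decodes to.
function decodedByteLength(base64: string): number {
  const length = base64.length;
  let padding = 0;
  if (length > 0 && base64.charAt(length - 1) === "=") padding += 1;
  if (length > 1 && base64.charAt(length - 2) === "=") padding += 1;
  return Math.floor((length * 3) / 4) - padding;
}

// Returns a list of problems; an empty list means the paste is acceptable.
function validatePastedImages(images: PastedImage[]): string[] {
  const errors: string[] = [];
  if (images.length > MAX_IMAGES) {
    errors.push(`Too many images: ${images.length} (max ${MAX_IMAGES})`);
  }
  images.forEach((image, index) => {
    if (SUPPORTED_MIME_TYPES.indexOf(image.mimeType) === -1) {
      errors.push(`Image ${index + 1}: unsupported type ${image.mimeType}`);
    }
    if (decodedByteLength(image.base64) > MAX_IMAGE_BYTES) {
      errors.push(`Image ${index + 1}: larger than 5MB`);
    }
  });
  return errors;
}
```

Measuring the decoded size rather than the base64 string length matters here, because base64 inflates the payload by about a third.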

How?

  • When the user pastes one or more images, their base64 encodings are sent to the Electron process with the chatMessage RPC.
  • The images are saved in the meltyDir under assets/images, and only the path of each image is stored in the conversation joule. This keeps serialisation and deserialisation fast.
  • When the UI needs to display the conversation, the images referenced by the joules are read from the file system, converted to base64, and sent to the frontend.
  • The entire conversation is re-sent to the webview on every update during token streaming:
    const processPartial = (partialConversation: Conversation) => {
      // Each partial (per-token) update sends the whole task again.
      const dehydratedTask = this.dehydrateForWire();
      dehydratedTask.conversation = partialConversation;
      webviewNotifier.sendNotification("updateTask", {
        task: dehydratedTask,
      });
    };

    Without a cache, the same image files would be read from disk again for every token. A caching layer therefore sits in front of the file reads, so that no unnecessary system calls are made. It is implemented with lru-cache in the datastore.
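To illustrate the caching idea, here is a small self-contained sketch. The PR uses the lru-cache package; the hand-rolled `LruCache` class below is only a stand-in so the example runs without dependencies, and `makeCachedImageReader` and its `readFile` parameter are hypothetical names, not identifiers from this branch.

```typescript
// Minimal LRU cache built on Map, which iterates keys in insertion order.
class LruCache<V> {
  private entries = new Map<string, V>();
  private maxEntries: number;

  constructor(maxEntries: number) {
    this.maxEntries = maxEntries;
  }

  get(key: string): V | undefined {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value as string;
      this.entries.delete(oldest);
    }
  }
}

// Hypothetical wrapper: `readFile` stands in for the real file-system read.
function makeCachedImageReader(
  readFile: (path: string) => string,
  maxEntries: number
): (path: string) => string {
  const cache = new LruCache<string>(maxEntries);
  return (path: string): string => {
    const cached = cache.get(path);
    if (cached !== undefined) return cached;
    const base64 = readFile(path);
    cache.set(path, base64);
    return base64;
  };
}

// Usage: count how often the underlying read actually happens.
let diskReads = 0;
const readImageCached = makeCachedImageReader((path: string) => {
  diskReads += 1;
  return `base64-of-${path}`;
}, 2);
readImageCached("a.png");
readImageCached("a.png"); // served from the cache, no second read
```

With a reader like this, each streamed token that triggers `dehydrateForWire` would hit the cache for images that are already loaded instead of going back to disk.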

Demo:

img-paste_.mp4

Performance optimizations:

Implemented an LRU cache in front of the image reads, so that sending the base64 data to the frontend does not require re-reading the same files each time. Background:

Initially, base64 images were stored as part of the JSON structure of a given task. This introduced an overhead whenever all the joules had to be read during initialisation. Instead, we now store these images as files in the meltyDirectory and keep only the path of each image in the JSON structure. When the frontend needs the image data, we read the file and send it over. I also implemented a simple LRU cache to avoid making frequent system calls to read the images.
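The path-instead-of-payload approach might look roughly like this. It is a sketch under assumptions: the joule shapes, the `ImageStore` interface, the function names, and the file naming scheme are all hypothetical (only the assets/images location comes from the description), and an in-memory store replaces the file system so the example stays self-contained.

```typescript
// Hypothetical shapes; the PR does not show the actual joule types.
interface StoredJoule {
  message: string;
  imagePaths: string[]; // this is what gets serialised to JSON
}

interface WireJoule {
  message: string;
  images: string[]; // base64 data sent to the frontend
}

// Stand-in for the file system, so the sketch has no Node dependencies.
interface ImageStore {
  write(path: string, base64: string): void;
  read(path: string): string;
}

class InMemoryImageStore implements ImageStore {
  private files = new Map<string, string>();

  write(path: string, base64: string): void {
    this.files.set(path, base64);
  }

  read(path: string): string {
    const data = this.files.get(path);
    if (data === undefined) throw new Error(`Image not found: ${path}`);
    return data;
  }
}

// Writes each pasted image to the store and keeps only its path in the joule.
function saveJoule(
  store: ImageStore,
  id: string,
  message: string,
  pastedImages: string[]
): StoredJoule {
  const imagePaths = pastedImages.map((base64, index) => {
    const path = `assets/images/${id}-${index}.b64`; // hypothetical naming scheme
    store.write(path, base64);
    return path;
  });
  return { message, imagePaths };
}

// Loads the image data back only when the frontend needs it.
function hydrateForWire(store: ImageStore, joule: StoredJoule): WireJoule {
  return {
    message: joule.message,
    images: joule.imagePaths.map((path) => store.read(path)),
  };
}

// Usage: the serialised joule carries a short path, not the image payload.
const demoStore = new InMemoryImageStore();
const demoJoule = saveJoule(demoStore, "joule-1", "What is in this image?", ["aGVsbG8="]);
console.log(JSON.stringify(demoJoule));
```

Keeping the JSON small is what makes reading every joule at initialisation cheap; the cost of loading image data is deferred to the moment a conversation is actually displayed.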

In another PR, I will implement drag-and-drop support so that images can be dropped into the textarea.

@abhi12299 abhi12299 marked this pull request as draft September 8, 2024 10:46
@abhi12299
Contributor Author

abhi12299 commented Sep 8, 2024

Update: performance improvements implemented.

Marking as draft as I'm working on a performance improvement, described here:

Currently, the base64 image is being stored as a part of the JSON structure of a given task. This introduces an overhead when all the joules need to be read during initialisation. Instead, we can store these images as files in the meltyDirectory and keep the path of the image as a part of the JSON structure. When the image data is needed by the frontend, we can read the file and send it over. We can also implement a simple LRU cache to avoid making frequent system calls to read the images.

@abhi12299 abhi12299 marked this pull request as ready for review September 8, 2024 14:05
@cbh123
Contributor

cbh123 commented Sep 9, 2024

Woah! Took a look at the demo video, and this looks like awesome work. @jacksondc and I will take a look in the next couple days.

Thanks!

@abhi12299 abhi12299 force-pushed the image-paste branch 3 times, most recently from a21d595 to 25bcafb September 10, 2024 05:44
@abhi12299
Contributor Author

@cbh123 fixed merge conflicts in this branch as well.

@abhi12299
Contributor Author

@jacksondc any plans to merge this one? Also, are there any other features you have planned? I can pick one up.
