126 changes: 125 additions & 1 deletion README.md
@@ -23,7 +23,7 @@ Generative AI library for .NET 9.0 with built-in OpenAI ChatGPT and Google Gemin
- [ ] Text Embedding
- [ ] Moderation
- [ ] Response Streaming
- [ ] Multi-Modal Requests
- [x] Multi-Modal Requests
### Miscellaneous
- [x] Dependency Injection
- [x] Time Awareness
@@ -174,6 +174,130 @@ byte[] audio = await File.ReadAllBytesAsync("speech.mp3");
string translation = await client.TranslateAsync(audio);
```

## Multi-Modal Requests (Gemini)

To send a multi-modal request with the Gemini client (for example, text combined with an uploaded PDF, image, or video), first upload the file with the `FileService` (exposed as `GeminiClient.Files`), then reference the uploaded file in your chat message.

### 1. Accessing the File Service

The `IFileService` is exposed through the `Files` property of your `GeminiClient` instance.

**If you create a `GeminiClient` instance directly:**

```cs
using ChatAIze.GenerativeCS.Clients;
using ChatAIze.GenerativeCS.Models.Gemini; // For GeminiFile
using ChatAIze.GenerativeCS.Providers.Gemini; // For IFileService
using System.IO;

var geminiClient = new GeminiClient("<GEMINI API KEY>");
IFileService fileService = geminiClient.Files;

// Example usage:
// string filePath = "path/to/your/file.pdf";
// string mimeType = "application/pdf"; // Adjust mime type accordingly
// GeminiFile? uploadedFile = await fileService.UploadFileAsync(filePath, mimeType, Path.GetFileName(filePath));

// if (uploadedFile != null)
// {
//     Console.WriteLine($"File uploaded: {uploadedFile.Name}, URI: {uploadedFile.Uri}");
//     // Now use uploadedFile.Uri in a ChatMessage
// }
```

**If using Dependency Injection:**

Calling `AddGeminiClient` during setup registers `GeminiClient` and also registers `IFileService`, resolved from the client's `Files` property. You can then inject `GeminiClient` and use its `Files` property, or inject `IFileService` directly if you only need file operations.

```cs
// In your Startup.cs or Program.cs (service registration shown in previous DI examples)
// builder.Services.AddGeminiClient("<GEMINI API KEY>");

// In your class, Option 1: Inject GeminiClient
// private readonly GeminiClient _geminiClient;
// private readonly IFileService _fileService; // Derived from GeminiClient
// public YourService(GeminiClient geminiClient)
// {
// _geminiClient = geminiClient;
// _fileService = geminiClient.Files;
// }

// In your class, Option 2: Inject IFileService directly (if preferred for just file ops)
// private readonly IFileService _fileService;
// public YourService(IFileService fileService) // Assumes IFileService is registered as shown previously
// {
// _fileService = fileService;
// }

// async Task ProcessFile()
// {
// string filePath = "path/to/your/file.pdf";
// string mimeType = "application/pdf";
// GeminiFile? uploadedFile = await _fileService.UploadFileAsync(filePath, mimeType, Path.GetFileName(filePath));
// // ...
// }
```

### 2. Uploading a File

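Once you have an `IFileService` instance (e.g., from `geminiClient.Files`), call `UploadFileAsync` with the file path, its MIME type, and a display name. The call returns `null` if the upload fails: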

```cs
using ChatAIze.GenerativeCS.Clients; // For GeminiClient
using ChatAIze.GenerativeCS.Models.Gemini; // For GeminiFile
using ChatAIze.GenerativeCS.Providers.Gemini; // For IFileService
using System.IO;

// Assuming 'geminiClient' is an initialized GeminiClient instance
IFileService fileService = geminiClient.Files;

string filePath = "path/to/your/document.pdf";
string mimeType = "application/pdf"; // Change for other types e.g. "image/png", "video/mp4"
string displayName = Path.GetFileName(filePath);

GeminiFile? uploadedFile = await fileService.UploadFileAsync(filePath, mimeType, displayName);

if (uploadedFile != null)
{
    Console.WriteLine($"File uploaded. Name: {uploadedFile.Name}, URI: {uploadedFile.Uri}");
    // Store uploadedFile.Uri to use in a chat message
}
else
{
    Console.WriteLine("File upload failed.");
}
```
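
The upload call needs a MIME type that matches the file. The following is a minimal sketch of deriving it from the file extension. `MimeTypes` and `FromPath` are hypothetical names and are not part of GenerativeCS, and the table covers only a few of the types mentioned in this README.

```cs
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of GenerativeCS): maps a file extension to a MIME type.
public static class MimeTypes
{
    // Case-insensitive lookup so ".PDF" and ".pdf" are treated the same.
    private static readonly Dictionary<string, string> ByExtension = new(StringComparer.OrdinalIgnoreCase)
    {
        [".pdf"] = "application/pdf",
        [".png"] = "image/png",
        [".jpg"] = "image/jpeg",
        [".jpeg"] = "image/jpeg",
        [".mp4"] = "video/mp4",
        [".txt"] = "text/plain",
    };

    // Returns the MIME type for the given path, or null when the extension is not in the table.
    public static string? FromPath(string path)
    {
        string extension = Path.GetExtension(path);
        return ByExtension.TryGetValue(extension, out string? mimeType) ? mimeType : null;
    }
}
```

Returning `null` for unknown extensions leaves it to the caller to skip the upload or ask for an explicit MIME type.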

### 3. Sending a Chat Message with the File

After uploading the file, add a `FileDataPart` to a `ChatMessage`, passing the file's `MimeType` and `Uri`. (In the Gemini Files API, the resource name has the form `files/your-file-id`, while the URI is typically a full URL.)

```cs
using ChatAIze.GenerativeCS.Clients;
using ChatAIze.GenerativeCS.Models;
using ChatAIze.Abstractions.Chat; // For ChatRole

// Assuming 'geminiClient' is an initialized GeminiClient
// Assuming 'uploadedFile' is the GeminiFile object from the successful upload

if (uploadedFile != null && uploadedFile.Uri != null && uploadedFile.MimeType != null)
{
    var chat = new Chat();
    var userMessage = new ChatMessage();
    userMessage.Role = ChatRole.User;
    userMessage.Parts.Add(new TextPart("Please summarize this document."));
    userMessage.Parts.Add(new FileDataPart(new FileDataSource(uploadedFile.MimeType, uploadedFile.Uri)));

    chat.Messages.Add(userMessage);

    // Using the existing CompleteAsync method which now supports parts
    string response = await geminiClient.CompleteAsync(chat);
    Console.WriteLine(response);
}
```

Gemini accepts a wide range of file types, including PDF, common document formats, images, audio, and video. Refer to the official Google Gemini API documentation for the current list of supported MIME types.

## Moderation
```cs
using ChatAIze.GenerativeCS.Clients;
19 changes: 11 additions & 8 deletions src/ChatAIze.GenerativeCS.csproj
@@ -2,7 +2,7 @@
<PropertyGroup>
<Title>Generative CS</Title>
<Product>Generative CS</Product>
<Version>0.14.2</Version>
<Version>0.15.0</Version>
<Company>ChatAIze</Company>
<Authors>Marcel Kwiatkowski</Authors>
<Copyright>© ChatAIze 2025</Copyright>
@@ -25,9 +25,12 @@
<SuppressSymbolPackageFormatValidation>true</SuppressSymbolPackageFormatValidation>
<Description>
Generative AI library for .NET 9.0 with built-in OpenAI ChatGPT and Google Gemini API clients
and support for C# function calling via reflection.
and support for C# function calling via reflection. Now includes multimodal support for Gemini,
allowing text, PDF, DOC, video, audio, and image file processing.
Features:
- Chat Completion
- Gemini Multimodal Requests (text with files)
- Gemini File Management (upload, get, list, delete)
- Response Streaming
- Text Embedding
- Text-to-Speech
@@ -49,21 +52,21 @@
chatgpt-api client co co-pilot complete completion completion-generator completion-provider
completions completions-generator completions-provider conversation conversational
conversational-ai copilot cs csharp davinci dialog dotnet dotnet-core embedding
embedding-model embeddings embeddings-model function function-calling functional
embedding-model embeddings embeddings-model file-data file-upload file-uri function function-calling functional
functional-gpt functions functions-calling gemini gemini-api gemini-api-client gemini-client
gemini-pro gemini-pro-api gemini-pro-api-client gemini-pro-client generation generative-ai
generative-cs generator google google-bard google-gemini google-gemini-pro
google-gemini-pro-api google-gemini-pro-api-client google-gemini-pro-client gpt gpt-3 gpt-4
gpt-function gpt-functions gpt3 gpt4 kernel language language-model learning library llama llm
gpt-function gpt-functions gpt3 gpt4 image kernel language language-model learning library llama llm
machine machine-learning method method-calling methods methods-calling microsot ml model
moderation natural natural-language-processing nlp open-ai open-ai-api open-ai-client openai
openai-api openai-api-client openai-client pilot pro processing prompt provider reflection
moderation multi-modal multimodal natural natural-language-processing nlp open-ai open-ai-api open-ai-client openai
openai-api openai-api-client openai-client pdf pilot pro processing prompt provider reflection
respond response response-completion response-generation rest rest-api restful restful-api
robot sdk search semantic sound speech speech-to-text stream streaming synthesis text
text-completion text-embedding text-embeddings text-generation text-synthesis text-to-speech
token transcript transcription transformer transformer-model transformers transformers-model
translation translator tts turbo vector vector-embedding vector-embeddings vector-search
vertex virtual virtual-assistant voice whisper whisper-api wrapper
translation translator tts txt turbo vector vector-embedding vector-embeddings vector-search
vertex video virtual virtual-assistant voice whisper whisper-api wrapper
</PackageTags>
</PropertyGroup>
<ItemGroup>
28 changes: 23 additions & 5 deletions src/Clients/GeminiClient.cs
@@ -15,13 +15,17 @@ public class GeminiClient<TChat, TMessage, TFunctionCall, TFunctionResult>
where TFunctionResult : IFunctionResult, new()
{
private readonly HttpClient _httpClient = new();
public string? ApiKey { get; set; }
public IFileService Files { get; }

public GeminiClient()
{
if (string.IsNullOrWhiteSpace(ApiKey))
{
ApiKey = EnvironmentVariableManager.GetGeminiAPIKey();
}
var geminiOptions = new GeminiOptions { ApiKey = this.ApiKey };
Files = new FileService(_httpClient, Microsoft.Extensions.Options.Options.Create(geminiOptions));
}

public GeminiClient(string apiKey)
@@ -32,6 +36,8 @@ public GeminiClient(string apiKey)
{
ApiKey = EnvironmentVariableManager.GetGeminiAPIKey();
}
var geminiOptions = new GeminiOptions { ApiKey = this.ApiKey };
Files = new FileService(_httpClient, Microsoft.Extensions.Options.Options.Create(geminiOptions));
}

public GeminiClient(GeminiClientOptions<TMessage, TFunctionCall, TFunctionResult> options)
@@ -44,29 +50,41 @@ public GeminiClient(GeminiClientOptions<TMessage, TFunctionCall, TFunctionResult
}

DefaultCompletionOptions = options.DefaultCompletionOptions;
var geminiOptions = new GeminiOptions { ApiKey = this.ApiKey };
Files = new FileService(_httpClient, Microsoft.Extensions.Options.Options.Create(geminiOptions));
}

[ActivatorUtilitiesConstructor]
public GeminiClient(HttpClient httpClient, IOptions<GeminiClientOptions<TMessage, TFunctionCall, TFunctionResult>> options)
public GeminiClient(HttpClient httpClient, IOptions<GeminiClientOptions<TMessage, TFunctionCall, TFunctionResult>> clientOptions)
{
_httpClient = httpClient;
ApiKey = options.Value.ApiKey;
ApiKey = clientOptions.Value.ApiKey;

if (string.IsNullOrWhiteSpace(ApiKey))
{
ApiKey = EnvironmentVariableManager.GetGeminiAPIKey();
}

DefaultCompletionOptions = options.Value.DefaultCompletionOptions;
DefaultCompletionOptions = clientOptions.Value.DefaultCompletionOptions;

// Use the resolved key so the environment-variable fallback above also reaches the file service.
var fileServiceOptions = new GeminiOptions
{
ApiKey = this.ApiKey
};
Files = new FileService(_httpClient, Microsoft.Extensions.Options.Options.Create(fileServiceOptions));
}

public GeminiClient(ChatCompletionOptions<TMessage, TFunctionCall, TFunctionResult> defaultCompletionOptions)
{
DefaultCompletionOptions = defaultCompletionOptions;
if (string.IsNullOrWhiteSpace(ApiKey))
{
ApiKey = EnvironmentVariableManager.GetGeminiAPIKey();
}
var geminiOptions = new GeminiOptions { ApiKey = this.ApiKey };
Files = new FileService(_httpClient, Microsoft.Extensions.Options.Options.Create(geminiOptions));
}

public string? ApiKey { get; set; }

public ChatCompletionOptions<TMessage, TFunctionCall, TFunctionResult> DefaultCompletionOptions { get; set; } = new();

public async Task<string> CompleteAsync(string prompt, ChatCompletionOptions<TMessage, TFunctionCall, TFunctionResult>? options = null, CancellationToken cancellationToken = default)
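
Each constructor above repeats the same two steps: fall back to the environment variable when no key is configured, then build the file service from the key. The following is a minimal sketch of doing both in one place so the file service always receives the resolved key. `SketchOptions`, `SketchFileService`, and `SketchClient` are hypothetical stand-ins for `GeminiOptions`, `FileService`, and `GeminiClient`, and the `fallback` delegate stands in for `EnvironmentVariableManager.GetGeminiAPIKey()`.

```cs
using System;

// Hypothetical stand-in for GeminiOptions.
public sealed class SketchOptions
{
    public string? ApiKey { get; set; }
}

// Hypothetical stand-in for FileService: it captures the key it is constructed with.
public sealed class SketchFileService
{
    public string? ApiKey { get; }

    public SketchFileService(SketchOptions options)
    {
        ApiKey = options.ApiKey;
    }
}

// Hypothetical stand-in for GeminiClient.
public sealed class SketchClient
{
    public string? ApiKey { get; }
    public SketchFileService Files { get; }

    public SketchClient(string? configuredKey, Func<string?> fallback)
    {
        // Resolve the key once, preferring the configured value over the fallback.
        ApiKey = string.IsNullOrWhiteSpace(configuredKey) ? fallback() : configuredKey;

        // Pass the resolved key, not the raw configured value, to the dependent service.
        Files = new SketchFileService(new SketchOptions { ApiKey = ApiKey });
    }
}
```

With this shape, `new SketchClient(null, () => "env-key").Files.ApiKey` is `"env-key"`, because the file service is built after the fallback has been applied.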
8 changes: 8 additions & 0 deletions src/Extensions/GeminiClientExtension.cs
@@ -2,6 +2,7 @@
using ChatAIze.GenerativeCS.Clients;
using ChatAIze.GenerativeCS.Models;
using ChatAIze.GenerativeCS.Options.Gemini;
using ChatAIze.GenerativeCS.Providers.Gemini;
using Microsoft.Extensions.DependencyInjection;

namespace ChatAIze.GenerativeCS.Extensions;
@@ -29,6 +30,13 @@ public static IServiceCollection AddGeminiClient<TChat, TMessage, TFunctionCall,
_ = services.AddSingleton<GeminiClient<TChat, TMessage, TFunctionCall, TFunctionResult>>();
_ = services.AddSingleton<GeminiClient>();

// Register IFileService to be resolved from the GeminiClient's Files property
_ = services.AddSingleton<IFileService>(sp =>
{
var client = sp.GetRequiredService<GeminiClient<TChat, TMessage, TFunctionCall, TFunctionResult>>();
return client.Files;
});

return services;
}

46 changes: 46 additions & 0 deletions src/Models/ChatContentPart.cs
@@ -0,0 +1,46 @@
using System.Text.Json.Serialization;

namespace ChatAIze.GenerativeCS.Models
{
/// <summary>
/// Represents a part of a chat message content, which can be text, file data, etc.
/// </summary>
public interface IChatContentPart { }

public class TextPart : IChatContentPart
{
[JsonPropertyName("text")]
public string Text { get; set; }

public TextPart(string text)
{
Text = text;
}
}

public class FileDataPart : IChatContentPart
{
[JsonPropertyName("file_data")]
public FileDataSource FileData { get; set; }

public FileDataPart(FileDataSource fileData)
{
FileData = fileData;
}
}

public class FileDataSource
{
[JsonPropertyName("mime_type")]
public string MimeType { get; set; }

[JsonPropertyName("file_uri")]
public string FileUri { get; set; }

public FileDataSource(string mimeType, string fileUri)
{
MimeType = mimeType;
FileUri = fileUri;
}
}
}
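
The following is a minimal sketch of how these part types might be turned into JSON with `System.Text.Json`. The part types are repeated in condensed form so the block is self-contained. `PartJson` is a hypothetical helper, and this diff does not show how `GeminiClient` actually serializes parts. Because `IChatContentPart` declares no members, serializing a value through the interface type writes an empty object, so the sketch serializes by runtime type instead.

```cs
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public interface IChatContentPart { }

public class TextPart : IChatContentPart
{
    [JsonPropertyName("text")]
    public string Text { get; set; }

    public TextPart(string text) { Text = text; }
}

public class FileDataSource
{
    [JsonPropertyName("mime_type")]
    public string MimeType { get; set; }

    [JsonPropertyName("file_uri")]
    public string FileUri { get; set; }

    public FileDataSource(string mimeType, string fileUri)
    {
        MimeType = mimeType;
        FileUri = fileUri;
    }
}

public class FileDataPart : IChatContentPart
{
    [JsonPropertyName("file_data")]
    public FileDataSource FileData { get; set; }

    public FileDataPart(FileDataSource fileData) { FileData = fileData; }
}

// Hypothetical helper: serializes a part using its runtime type so the
// properties of TextPart or FileDataPart are written to the output.
public static class PartJson
{
    public static string Serialize(IChatContentPart part) =>
        JsonSerializer.Serialize(part, part.GetType());
}
```

For example, `PartJson.Serialize(new TextPart("Hello"))` produces `{"text":"Hello"}`, whereas `JsonSerializer.Serialize<IChatContentPart>(new TextPart("Hello"))` produces `{}`.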