Multi-Modality and Attachments
Neural Inverse supports multi-modal traces including text, images, audio, and other attachments.
By default, base64 encoded data URIs are handled automatically by the Neural Inverse SDKs. They are extracted from the payloads commonly used in multi-modal LLMs, uploaded to Neural Inverse's object storage, and linked to the trace.
This also works if you:
- Reference media files via external URLs.
- Customize the handling of media files in the SDKs via the LangfuseMedia class.
- Integrate via the Neural Inverse API directly.
Learn more on how to get started and how this works under the hood below.
Availability
Neural Inverse Cloud
Multi-modal attachments are currently free on Neural Inverse Cloud. We reserve the option to introduce a new pricing metric in the near future to account for the additional storage and compute costs associated with large multi-modal traces.
Self-hosting
Multi-modal attachments are available today. You need to configure your own object storage bucket via the Neural Inverse environment variables (LANGFUSE_S3_MEDIA_UPLOAD_*). See the self-hosting documentation for details on these environment variables. S3-compatible APIs are supported across all major cloud providers and can be self-hosted via MinIO. Note that the configured storage bucket must have a publicly resolvable hostname to support direct uploads via our SDKs and media asset fetching directly from the browser.
Supported media formats
Neural Inverse supports:
- Images: .png, .jpg, .webp
- Audio files: .mpeg, .mp3, .wav
- Other attachments: .pdf, plain text
If you require support for additional file types, please let us know in our GitHub Discussion where we're actively gathering feedback on multi-modal support.
Get Started
Base64 data URI encoded media
If you use base64 encoded images, audio, or other files in your LLM applications, upgrade to the latest version of the Neural Inverse SDKs. The Neural Inverse SDKs automatically detect and handle base64 encoded media by extracting it, uploading it separately as a Neural Inverse Media file, and including a reference in the trace.
This works with standard Data URI (MDN) formatted media (like those used by OpenAI and other LLMs).
This notebook includes a couple of examples using the OpenAI SDK and LangChain.
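To make the data URI format concrete, here is a minimal sketch of how a base64 data URI splits into a MIME type and raw bytes. It uses only the Python standard library and is not the SDK's own detection logic; the function name `parse_base64_data_uri` is hypothetical, and the pattern only covers the simple `data:<mime type>;base64,<payload>` form without extra parameters.

```python
import base64
import re

# Matches the simple form data:<mime type>;base64,<payload> (assumption: no extra parameters)
DATA_URI_PATTERN = re.compile(
    r"^data:(?P<mime>[\w.+-]+/[\w.+-]+);base64,(?P<data>[A-Za-z0-9+/=]+)$"
)


def parse_base64_data_uri(value: str):
    """Return (mime_type, raw_bytes) for a base64 data URI, or None otherwise."""
    match = DATA_URI_PATTERN.match(value)
    if match is None:
        return None
    raw_bytes = base64.b64decode(match.group("data"), validate=True)
    return match.group("mime"), raw_bytes


# Build a small example URI from known bytes and parse it again
payload = base64.b64encode(b"hello").decode("ascii")
example_uri = f"data:text/plain;base64,{payload}"
parsed = parse_base64_data_uri(example_uri)
```

Strings that are not data URIs, such as plain text or external URLs, return None and would be left untouched.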
External media (URLs)
Neural Inverse supports in-line rendering of media files via URLs if they follow common formats. In this case, the media file is not uploaded to Neural Inverse's object storage but simply rendered in the UI directly from the source.
Supported formats:
{
  "content": [
    {
      "role": "system",
      "content": "You are an AI trained to describe and interpret images. Describe the main objects and actions in the image."
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What's happening in this image?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        }
      ]
    }
  ]
}
Custom attachments
If you want to have more control or your media is not base64 encoded, you can upload arbitrary media attachments to Neural Inverse via the SDKs using the new LangfuseMedia class. Wrap media with LangfuseMedia before including it in trace inputs, outputs, or metadata. See the multi-modal documentation for examples.
from langfuse import get_client, observe, propagate_attributes
from langfuse.media import LangfuseMedia

# Create a LangfuseMedia object from a file
with open("static/bitcoin.pdf", "rb") as pdf_file:
    pdf_bytes = pdf_file.read()

# Wrap media in LangfuseMedia class
pdf_media = LangfuseMedia(content_bytes=pdf_bytes, content_type="application/pdf")

# Using with the decorator
@observe()
def process_document():
    langfuse = get_client()

    # Propagate metadata (including media) to all child observations
    with propagate_attributes(
        metadata={"document": pdf_media}
    ):
        pass

    # Or update the current span
    langfuse.update_current_span(
        input={"document": pdf_media}
    )

# Using with context managers
langfuse = get_client()

with langfuse.start_as_current_observation(as_type="span", name="analyze-document") as span:
    # Include media in the span input, output, or metadata
    span.update(
        input={"document": pdf_media},
        metadata={"file_size": len(pdf_bytes)}
    )

    # Process document...

    # Add results with media to the output
    span.update(output={
        "summary": "This document explains Bitcoin...",
        "original": pdf_media
    })

import fs from "fs";
import { LangfuseMedia } from "@langfuse/core";
import { startObservation } from "@langfuse/tracing";

// Wrap media in LangfuseMedia class
const wrappedMedia = new LangfuseMedia({
  source: "bytes",
  contentBytes: fs.readFileSync(new URL("./bitcoin.pdf", import.meta.url)),
  contentType: "application/pdf",
});

// Optionally, access media via wrappedMedia.obj
console.log(wrappedMedia.obj);

// Include media in any trace or observation
const span3 = startObservation("media-pdf-generation");

const generation3 = span3.startObservation("llm-call", {
  model: "gpt-4",
  input: wrappedMedia,
}, { asType: "generation" });

generation3.end();
span3.end();
API
If you use the API directly to log traces to Neural Inverse, you need to follow these steps:
Upload media to Neural Inverse
- If you use base64 encoded media: you need to extract it from the trace payloads similar to how the Neural Inverse SDKs do it.
- Initialize the upload and get a mediaId and presignedURL: POST /api/public/media.
- Upload media file: PUT [presignedURL].
See this end-to-end example (Python) on how to use the API directly to upload media files.
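As a rough illustration of the initialization step, the sketch below prepares a JSON body for POST /api/public/media from a file's bytes. It performs no network calls. The function name and the field names (`traceId`, `field`, `contentType`, `contentLength`, `sha256Hash`), as well as the base64 encoding of the hash, are assumptions made for illustration; check the API reference for the exact schema before relying on them.

```python
import base64
import hashlib
import json


def build_media_upload_body(content: bytes, content_type: str, trace_id: str, field: str) -> str:
    """Serialize a hypothetical JSON body for initializing a media upload."""
    # SHA-256 digest of the content, base64-encoded (the encoding is an assumption)
    sha256_hash = base64.b64encode(hashlib.sha256(content).digest()).decode("ascii")
    body = {
        "traceId": trace_id,            # assumed field name
        "field": field,                 # assumed: "input", "output", or "metadata"
        "contentType": content_type,    # assumed field name
        "contentLength": len(content),  # assumed field name
        "sha256Hash": sha256_hash,      # assumed field name
    }
    return json.dumps(body)


# Example: prepare the body for a small text attachment on a trace input
example_body = build_media_upload_body(b"abc", "text/plain", "trace-1", "input")
```

The response to this request would carry the mediaId and the presigned URL, and the file bytes are then sent with a PUT to that URL.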
Add reference to mediaId in trace/observation
Use the Neural Inverse Media Token to reference the mediaId in the trace or observation input, output, or metadata.
How does it work?
When using media files (that are not referenced via external URLs), Neural Inverse handles them in the following way:
1. Media Upload Process
Detection and Extraction
- Neural Inverse supports media files in traces and observations on input, output, and metadata fields
- SDKs separate media from tracing data client-side for performance optimization
- Media files are uploaded directly to object storage (AWS S3 or compatible)
- Original media content is replaced with a reference string
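The detection and extraction steps above can be sketched roughly as follows. This is a simplified stand-in rather than the SDKs' actual implementation: the function `extract_media` and the ID scheme (a truncated hex digest) are hypothetical, and no upload is performed. The reference string follows the token format documented under "Media Reference System".

```python
import base64
import hashlib


def extract_media(payload):
    """Return (copy of payload with references, list of extracted media)."""
    extracted = []

    def walk(node):
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        if isinstance(node, str) and node.startswith("data:") and ";base64," in node:
            header, _, encoded = node.partition(";base64,")
            mime_type = header[len("data:"):]
            content = base64.b64decode(encoded)
            # Hypothetical ID scheme: truncated hex digest of the content
            media_id = hashlib.sha256(content).hexdigest()[:16]
            extracted.append({"id": media_id, "type": mime_type, "bytes": content})
            return f"@@@langfuseMedia:type={mime_type}|id={media_id}|source=base64_data_uri@@@"
        return node

    return walk(payload), extracted


# Example: an OpenAI-style content part carrying a base64 image
uri = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
payload = {"messages": [{"type": "image_url", "image_url": {"url": uri}}, "plain text"]}
cleaned, media = extract_media(payload)
```

The original payload is left unmodified; only the returned copy carries reference strings, while the extracted bytes are collected separately for upload.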
Security and Optimization
- Uploads use presigned URLs with content validation (content length, content type, content SHA256 hash)
- Deduplication: Files are simply replaced by their mediaId reference string if already uploaded
- File uniqueness determined by project, content type, and content SHA256 hash
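The deduplication rule can be illustrated with a small in-memory sketch. It keys each file by project, content type, and SHA256 hash, as described above. The class `MediaRegistry` and its ID scheme are hypothetical and only mimic the behavior; the real check happens on the Neural Inverse side.

```python
import hashlib


class MediaRegistry:
    """In-memory stand-in for deduplication (illustrative only)."""

    def __init__(self):
        self._ids = {}

    def register(self, project_id: str, content_type: str, content: bytes):
        """Return (media_id, needs_upload) for the given file."""
        digest = hashlib.sha256(content).hexdigest()
        key = (project_id, content_type, digest)
        if key in self._ids:
            # Already known: reuse the ID and skip the upload
            return self._ids[key], False
        media_id = f"media-{len(self._ids) + 1}"  # hypothetical ID scheme
        self._ids[key] = media_id
        return media_id, True


registry = MediaRegistry()
first = registry.register("proj-a", "image/png", b"bytes")
second = registry.register("proj-a", "image/png", b"bytes")
other_project = registry.register("proj-b", "image/png", b"bytes")
```

Here the second registration of identical content in the same project reuses the existing ID, while the same bytes in a different project count as a new file.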
Implementation Details
- Python SDK: Background thread handling for non-blocking execution
- JS/TS SDKs: Asynchronous, non-blocking implementation
- API support for direct uploads (see guide)
2. Media Reference System
The base64 data URIs and the wrapped LangfuseMedia objects in Neural Inverse traces are replaced by references to the mediaId in the following standardized token format, which helps reconstruct the original payload if needed:
@@@langfuseMedia:type={MIME_TYPE}|id={LANGFUSE_MEDIA_ID}|source={SOURCE_TYPE}@@@
- MIME_TYPE: MIME type of the media file, e.g., image/jpeg
- LANGFUSE_MEDIA_ID: ID of the media file in Neural Inverse's object storage
- SOURCE_TYPE: Source type of the media file, can be base64_data_uri, bytes, or file
Based on this token, the Neural Inverse UI can automatically detect the mediaId and render the media file inline. The LangfuseMedia class provides utility functions to extract the mediaId from the reference string.
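For readers who want to handle the token outside the SDKs, the following sketch formats and parses a reference string with a regular expression. The helper names `format_media_token` and `parse_media_token` are hypothetical and are not the utility functions of the LangfuseMedia class; the sketch assumes the three fields contain no `|` or `@` characters.

```python
import re

# Mirrors the documented token layout: type, id, and source separated by "|"
TOKEN_PATTERN = re.compile(
    r"@@@langfuseMedia:type=(?P<type>[^|]+)\|id=(?P<id>[^|]+)\|source=(?P<source>[^@]+)@@@"
)


def format_media_token(mime_type: str, media_id: str, source: str) -> str:
    """Build a reference string in the documented token format."""
    return f"@@@langfuseMedia:type={mime_type}|id={media_id}|source={source}@@@"


def parse_media_token(token: str):
    """Return a dict with type, id, and source, or None if the string is not a token."""
    match = TOKEN_PATTERN.fullmatch(token)
    return match.groupdict() if match else None


token = format_media_token("image/jpeg", "some-uuid", "bytes")
fields = parse_media_token(token)
```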
3. Resolving Media References
When dealing with traces, observations, or dataset items that include media references, you can convert them back to their base64 data URI format using the resolve_media_references utility method provided by the Neural Inverse client. This is particularly useful for reinserting the original content during fine-tuning, dataset runs, or replaying a generation. The utility method traverses the parsed object and returns a deep copy with all media reference strings replaced by the corresponding base64 data URI representations.
from langfuse import get_client

# Initialize Neural Inverse client
langfuse = get_client()

# Example object with media references
obj = {
    "image": "@@@langfuseMedia:type=image/jpeg|id=some-uuid|source=bytes@@@",
    "nested": {
        "pdf": "@@@langfuseMedia:type=application/pdf|id=some-other-uuid|source=bytes@@@"
    }
}

# Resolve media references to base64 data URIs
resolved_obj = langfuse.resolve_media_references(
    obj=obj,
    resolve_with="base64_data_uri"
)

# Result:
# {
#     "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
#     "nested": {
#         "pdf": "data:application/pdf;base64,JVBERi0xLjcK..."
#     }
# }

from langfuse import Langfuse

# Alternative: initialize the Neural Inverse client by constructing it directly
langfuse = Langfuse()

# Example object with media references
obj = {
    "image": "@@@langfuseMedia:type=image/jpeg|id=some-uuid|source=bytes@@@",
    "nested": {
        "pdf": "@@@langfuseMedia:type=application/pdf|id=some-other-uuid|source=bytes@@@"
    }
}

# Resolve media references to base64 data URIs
resolved_trace = langfuse.resolve_media_references(
    obj=obj,
    resolve_with="base64_data_uri"
)

# Result:
# {
#     "image": "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
#     "nested": {
#         "pdf": "data:application/pdf;base64,JVBERi0xLjcK..."
#     }
# }

import { LangfuseClient } from "@langfuse/client";

const langfuse = new LangfuseClient();

// Example object with media references
const obj = {
  image: "@@@langfuseMedia:type=image/jpeg|id=some-uuid|source=bytes@@@",
  nested: {
    pdf: "@@@langfuseMedia:type=application/pdf|id=some-other-uuid|source=bytes@@@",
  },
};

// Resolve media references to base64 data URIs
const resolvedTrace = await langfuse.resolveMediaReferences({
  obj: obj,
  resolveWith: "base64DataUri",
});

// Result:
// {
//   image: "data:image/jpeg;base64,/9j/4AAQSkZJRg...",
//   nested: {
//     pdf: "data:application/pdf;base64,JVBERi0xLjcK..."
//   }
// }
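To show what the traversal described in this section amounts to, here is a simplified, standard-library sketch. It is not the client's implementation: the function `resolve_references` is hypothetical, and `fetch_bytes` stands in for whatever retrieves the media content for a given mediaId (here a plain dictionary lookup instead of a network call).

```python
import base64
import re

TOKEN = re.compile(r"@@@langfuseMedia:type=([^|]+)\|id=([^|]+)\|source=([^@]+)@@@")


def resolve_references(obj, fetch_bytes):
    """Return a deep copy of obj with reference tokens replaced by base64 data URIs."""
    if isinstance(obj, dict):
        return {key: resolve_references(value, fetch_bytes) for key, value in obj.items()}
    if isinstance(obj, list):
        return [resolve_references(item, fetch_bytes) for item in obj]
    if isinstance(obj, str):
        def to_data_uri(match):
            mime_type, media_id, _source = match.groups()
            encoded = base64.b64encode(fetch_bytes(media_id)).decode("ascii")
            return f"data:{mime_type};base64,{encoded}"
        return TOKEN.sub(to_data_uri, obj)
    return obj


# Example with an in-memory store standing in for media retrieval
store = {"some-uuid": b"hello"}
obj = {
    "image": "@@@langfuseMedia:type=text/plain|id=some-uuid|source=bytes@@@",
    "nested": {"n": 1, "items": ["x"]},
}
resolved = resolve_references(obj, store.__getitem__)
```

Because dictionaries and lists are rebuilt on the way down, the input object keeps its reference strings and only the returned copy contains the data URIs.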