One of the most useful things an LLM can do is accept files.
Without that ability we would have to type everything in ourselves, which is a real pain for documents and close to impossible for images and videos.
That's why the ability to upload a file is so convenient!
There are plenty of approaches to handling files, but since this exercise focuses on image uploads we can ignore the context window for now and simply convert the image into a standard base64 string.
This is not something I would ship to production, but hey, we're learning new stuff here, so let's leave the optimization for later.
Matt has prepared a handy utility function for us, fileToDataURL, which returns a Promise. Thanks to the readAsDataURL() method of the FileReader class, it resolves to a data:*/*;base64, string that we can safely pass to our LLM.
const fileToDataURL = (file: File) => {
  return new Promise<string>((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result as string);
    reader.onerror = reject;
    reader.readAsDataURL(file);
  });
};
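To make the shape of that string more concrete, here is a minimal sketch that splits a base64 data URL back into its media type and payload. The parseDataURL helper and the ParsedDataURL type are hypothetical names used only for illustration; they are not part of the exercise or of the SDK.

```typescript
// Hypothetical result type: the two pieces packed into a base64 data URL.
type ParsedDataURL = { mediaType: string; base64: string };

// Hypothetical helper: splits "data:<mediaType>;base64,<payload>" into its parts.
// Returns null when the input is not a base64 data URL.
function parseDataURL(dataURL: string): ParsedDataURL | null {
  const match = /^data:([^,]*);base64,(.*)$/.exec(dataURL);
  if (!match) return null;
  const [, mediaType = '', base64 = ''] = match;
  return { mediaType, base64 };
}

// Example: "aGk=" is the base64 encoding of the two bytes "hi".
const example = parseDataURL('data:text/plain;base64,aGk=');
console.log(example); // { mediaType: 'text/plain', base64: 'aGk=' }
```

The media type in front of ;base64, is the same kind of value we later pass as mediaType, and everything after the comma is the encoded file content.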
All that is left to do in this exercise is to handle the "file upload" (Matt has already prepared the ChatInput component for the task): when we submit the data to our model of choice, we first convert the file with fileToDataURL.
<ChatInput
  /* Other props */
  onSubmit={async (e) => {
    e.preventDefault();
    // Getting all the data from the form
    const formData = new FormData(
      e.target as HTMLFormElement,
    );
    // Getting the file
    const file = formData.get('file') as File;
    // Convert the file into a base64 URL string
    const fileURL = await fileToDataURL(file);
    sendMessage({
      // Sending the message
    });
    setInput('');
    setSelectedFile(null);
  }}
/>
In this snippet I wanted to focus on the onSubmit handler and describe step by step what we are doing.
Now it's time to look at how we send the information to our LLM.
We have the message the user typed in the input local state, and we have the base64 URL we computed in fileURL...
How do we send both pieces of information to our LLM? Until now we have used the sendMessage function to send a message like so:
sendMessage({ text: input.trim() });
A single configuration object with just a text prop, holding the input the user typed. But as listed a moment ago, we now have two pieces of information to send: the typed text and the attached file...
Thankfully, the sendMessage function can handle several kinds of input. It still takes care of the whole conversation for us thanks to the useChat hook, but for this task we have to use a different configuration object.
sendMessage({
  parts: [
    {
      type: 'file',
      url: fileURL,
      mediaType: file.type,
    },
    {
      type: 'text',
      text: input,
    },
  ],
});
This time we don't pass a plain text prop in our configuration object; we send a parts array instead.
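The snippets so far assume that a file is always attached. As a rough sketch of how the parts array could be assembled when the attachment is optional, here is a small helper. The names buildParts, Attachment and MessagePart are hypothetical, and the part shapes simply mirror the objects used in this post rather than the SDK's own type definitions.

```typescript
// Hypothetical part shapes, mirroring the objects passed to sendMessage in this post.
type TextPart = { type: 'text'; text: string };
type FilePart = { type: 'file'; url: string; mediaType: string };
type MessagePart = TextPart | FilePart;

// Hypothetical description of a file that was already converted to a data URL.
type Attachment = { url: string; mediaType: string };

// Builds the parts array: the file part first (if any), then the trimmed text (if any).
function buildParts(text: string, attachment: Attachment | null): MessagePart[] {
  const parts: MessagePart[] = [];
  if (attachment) {
    parts.push({ type: 'file', url: attachment.url, mediaType: attachment.mediaType });
  }
  const trimmed = text.trim();
  if (trimmed.length > 0) {
    parts.push({ type: 'text', text: trimmed });
  }
  return parts;
}
```

In the submit handler, one option would be to pass null as the attachment when the File read from the form is empty (file.size === 0), and to call sendMessage({ parts }) only when the resulting array is not empty.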
If you're curious like me: in the previous example, where we just pass { text: input } to sendMessage, we are leveraging a little shortcut of useChat. At the end of the day, sendMessage converts such an object into a standard { parts: [ { type: 'text', text: input } ] } configuration, so if you're going crazy checking the type definitions, now you have a quick answer on how it works.
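To picture that conversion, here is a tiny sketch of the idea. It is not the SDK's actual code: toParts and the types around it are hypothetical names that only model the two input shapes used in this post.

```typescript
// Hypothetical part shapes, mirroring the objects used in this post.
type TextPart = { type: 'text'; text: string };
type FilePart = { type: 'file'; url: string; mediaType: string };
type MessagePart = TextPart | FilePart;

// The two input shapes discussed here: the text shorthand and the explicit parts array.
type MessageInput = { text: string } | { parts: MessagePart[] };

// Hypothetical normalizer: expands the text shorthand into a one-element parts array.
function toParts(input: MessageInput): MessagePart[] {
  if ('parts' in input) {
    return input.parts;
  }
  return [{ type: 'text', text: input.text }];
}

console.log(toParts({ text: 'Hello' })); // [ { type: 'text', text: 'Hello' } ]
```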
So at the end of the day, we're free to enhance the UIMessagePart objects we put into parts, as long as our application knows what to do with the fields we pass. That's because the SDK does not force us to respect any particular structure as long as we define a type as a string.