From 7cb5a70ca28f214aa5bbe0461f8b4e47c92004ee Mon Sep 17 00:00:00 2001
From: nb923 <139726680+nb923@users.noreply.github.com>
Date: Tue, 25 Mar 2025 15:08:25 -0400
Subject: [PATCH] Update application_nideesh_bharath_kumar_ai_api_evaluator.md to support images

---
 ...ication_nideesh_bharath_kumar_ai_api_evaluator.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/doc/proposals/2025/gsoc/application_nideesh_bharath_kumar_ai_api_evaluator.md b/doc/proposals/2025/gsoc/application_nideesh_bharath_kumar_ai_api_evaluator.md
index 14bd5363..c4f24c3f 100644
--- a/doc/proposals/2025/gsoc/application_nideesh_bharath_kumar_ai_api_evaluator.md
+++ b/doc/proposals/2025/gsoc/application_nideesh_bharath_kumar_ai_api_evaluator.md
@@ -119,7 +119,7 @@ This project is to develop a Dart-centered evaluation framework designed to simp
 
 **Architecture:**
 
-![architecture](images/nb923-proposal-architecture)
+![architecture](images/nb923-proposal-architecture.png)
 
 - Frontend Layer: This layer will be the main API Dash app UI. It will use Flutter/Dart to build a UI for users to select the AI evaluation test specifications and obtain details such as API key, model name, API link, and other details. This layer will also display the real-time charts of the evaluations and final metrics.
 
@@ -142,23 +142,23 @@ This prototype contains a custom UI implementation of the AI evaluation layer, l
 
 The top right corner has a new button for API evaluations as show in the picture below:
 
-![prototype-image-one](images/nb923-proposal-prototype-one)
+![prototype-image-one](images/nb923-proposal-prototype-one.png)
 
 When selected, it prompts a selection of tests:
 
-![prototype-image-two](images/nb923-proposal-prototype-two)
+![prototype-image-two](images/nb923-proposal-prototype-two.png)
 
 Hellaswag is the only implemented test currently. When selected, it prompts a menu with model name, API URL, API key, and limit of dataset rows being tested. I recommend setting the limit to 20 to reduce API usage.
 
-![prototype-image-three](images/nb923-proposal-prototype-three)
+![prototype-image-three](images/nb923-proposal-prototype-three.png)
 
 When run is selected, it prompts a loading screen as the lm-evaluation-harness processes this request through a custom implementation of the provided models.
 
-![prototype-image-four](images/nb923-proposal-prototype-four)
+![prototype-image-four](images/nb923-proposal-prototype-four.png)
 
 After the evaluation is finished, it provides a quick value for the accuracy. This is a simple prototype and a limit of rows on the test is set; so, this metric should be taken with a grain of salt.
 
-![prototype-image-five](images/nb923-proposal-prototype-five)
+![prototype-image-five](images/nb923-proposal-prototype-five.png)
 
 Key changes are: