Skip to content

Conversation

@hanishkvc
Copy link
Contributor

@hanishkvc hanishkvc commented Nov 5, 2025

The alternate client side web ui at tools/server/public_simplechat with 0 setup builtin tool calling ++

This PR refactors the code such that the core classes including wrt tooling (collated tools manager into a class) sit in their own module files and inturn get imported for usage in the main runtime entrypoint file as well as other places where refered. This allows developers and static tools to be aware of the data structure and flow fully.

Add system date time timestamp tool call, so that the ai model doesnt hallucinate timestamps from thin air.

Renamed pdf_to_text tool call to fetch_pdf_as_text, so that ai model understands the intent/semnatic of that tool call better.

The previous PR in this series is at #16929

Except for the python type hinting and match related python check failure (isnt it time to enable these better code structurings to be used, without rising a check failure?), other auto check errors dont relate to public_simplechat in any way and can be safely ignored.

By pointing llama-server at this alternate webui, one can use the core builtin toolcalling features without any addiitional setup. And by running the included simpleproxy.py, one can enhance tool calling to include fetching of web pages and pdfs, as well as conversion to plain text for use by the ai model.

Read public_simplechat's readme for more details.

NOTE: This is a simple minded web client ui for exploring the llama-server rest api for developers as well as allowing basic utilitarian usage for end users.

Changed latestResponse type to an object instead of a string.
Inturn it contains entries for content, toolname and toolargs.

Added a custom clear logic due to the same and used it to replace
the previously simple assigning of empty string to latestResponse.

For now in all places where latestReponse is used, I have replaced
with latestReponse.content.

Next need to handle identifying the field being streamed and inturn
append to it. Also need to add logic to call tool, when tool_call
triggered by genai.
Update response_extract_stream to check for which field is being
currently streamed ie is it normal content or tool call func name
or tool call func args and then return the field name and extracted
value.

Previously it was always assumed that only normal content will be
returned.

Currently it is assumed that the server will only stream one of the
3 supported fields at any time and not more than one of them at the
same time.

TODO: Have to also add logic to extract the reasoning field later,
ie wrt gen ai models which give out their thinking.

Have updated append_response to expect both the key and the value
wrt the latestResponse object, which it will be manipualted.

Previously it was always assumed that content is what will be got
and inturn appended.
I was wrongly checking for finish_reason to be non null, before
trying to extract the genai content/toolcalls, have fixed this
oversight with the new flow in progress.

I had added few debug logs to identify the above issue, need to
remove them later. Note: given that debug logs are disabled by
replacing the debug function during this program's initialisation,
which I had forgotten about, I didnt get the debug messages and
had to scratch my head a bit, before realising this and the other
issue ;)

Also either when I had originally implemented simplechat 1+ years
back, or later due to changes on the server end, the streaming
flow sends a initial null wrt the content, where it only sets the
role. This was not handled in my flow on the client side, so a
null was getting prepended to the chat messages/responses from the
server. This has been fixed now in the new generic flow.
Make latestResponse into a new class based type instance wrt
ai assistant response, which is what it represents.

Move clearing, appending fields' values and getting assistant's
response info (irrespective of a content or toolcall response)
into this new class and inturn use the same.
Switch oneshot handler to use AssistantResponse, inturn currenlty
only handle the normal content in the response.

TODO: If any tool_calls in the oneshot response, it is currently
not handled.

Inturn switch the generic/toplevel handle response logic to use
AssistantResponse class, given that both oneshot and the
multipart/streaming flows use/return it.

Inturn add trimmedContent member to AssistantResponse class and
make the generic handle response logic to save the trimmed content
into this. Update users of trimmed to work with this structure.
As there could be failure wrt getting the response from the ai
server some where in between a long response spread over multiple
 parts, the logic uses the latestResponse to cache the response
as it is being received. However once the full response is got,
one needs to transfer it to a new instance of AssistantResponse
class, so that latestResponse can be cleared, while the new
instance can be used in other locations in the flow as needed.

Achieve the same now.
Previously if content was empty, it would have always sent the
toolcall info related version even if there was no toolcall info
in it. Fixed now to return empty string, if both content and
toolname are empty.
The implementations of javascript and simple_calculator now use
provided helpers to trap console.log messages when they execute
the code / expression provided by GenAi and inturn store the
captured log messages in the newly added result key in tc_switch

This should help trap the output generated by the provided code
or expression as the case maybe and inturn return the same to the
GenAi, for its further processing.
Checks for toolname to be defined or not in the GenAi's response

If toolname is set, then check if a corresponding tool/func exists,
and if so call the same by passing it the GenAi provided toolargs
as a object.

Inturn the text generated by the tool/func is captured and put
into the user input entry text box, with tool_response tag around
it.
As output generated by any tool/function call is currently placed
into the TextArea provided for End user (for their queries), bcas
the GenAi (engine/LLM) may be expecting the tool response to be
sent as a user role data with tool_response tag surrounding the
results from the tool call. So also now at the end of submit btn
click handling, the end user input text area is not cleared, if
there was a tool call handled, for above reasons.

Also given that running a simple arithmatic expression in itself
doesnt generate any output, so wrap them in a console.log, to
help capture the result using the console.log trapping flow that
is already setup.
and inform the GenAi/LLM about the same
Should hopeful ensure that the GenAi/LLM will generate appropriate
code/expression as the argument to pass to these tool calls, to
some extent.
Move tool calling logic into tools module.

Try trap async promise failures by awaiting results of tool calling
and putting full thing in an outer try catch. Have forgotten the
nitty gritties of JS flow, this might help, need to check.
So that when tool handler writes the result to the tc_switch, it
can make use of the same, to write to the right location.

NOTE: This also fixes the issue with I forgetting to rename the
key in js_run wrt writing of result.
to better describe how it will be run, so that genai/llm while
creating the code to run, will hopefully take care of any naunces
required.
Also as part of same, wrap the request details in the assistant
block using a similar tagging format as the tool_response in user
block.
Instead of automatically calling the requested tool with supplied
arguments, rather allow user to verify things before triggering the
tool.

NOTE: User already provided control over tool_response before
submitting it to the ai assistant.
Instead of automatically calling any requested tool by the GenAi
/ llm, that is from the tail end of the handle user submit btn
click,

Now if the GenAi/LLM has requested any tool to be called, then
enable the Tool Run related UI elements and fill them with the
tool name and tool args.

In turn the user can verify if they are ok with the tool being
called and the arguments being passed to it. Rather they can
even fix any errors in the tool usage like the arithmatic expr
to calculate that is being passed to simple_calculator or the
javascript code being passed to run_javascript_function_code

If user is ok with the tool call being requested, then trigger
the same.

The results if any will be automatically placed into the user
query text area.

User can cross verify if they are ok with the result and or
modify it suitabley if required and inturn submit the same to
the GenAi/LLM.
Also avoid showing Tool calling UI elements, when not needed to
be shown.
Update readme wrt searchDrops, auto settings ui creation

Rename tools-auto to tools-autoSecs, to make it easy to realise
that the value represents seconds.
Pretty print SimpleProxy gMe config

Dont ignore the got http response status text.

Update readme wrt why autoSecs
Allow user to clear the existing chat. The user does have the
option to load the just cleared chat, if required.

Add icons wrt clearing chat and settings.
Some ai's dont seem to be prefering to use this direct helper
provided for fetching pdf as text, on its own. Instead ai (gptoss)
seems to be keen on fetching raw pdf and extract text etal, so now
renaming the function call to try and make its semantic more
readily obivious hopefully.

It sometimes (not always) seem to assum fetch_web_url_text, can
convert pdf to text and return it. Maybe I need to place the
specific fetch pdf as text before the generic fetch web url text
and so...

With the rename, the pdf specific fetch seems to be getting used
more.
Have main classes defined independent of and away from runtime flow

Move out the entry point including runtime instantiation of the
core Me class (which inturn brings other class instances as neede)
into its own main.js file.

With this one should be able to import simplechat.js into other
files, where one might need the SimpleChat or MultiChat or Me class
definitions.
Now gMe can be used in toolweb with proper knowledge of available
members and can also be cross checked by tools
Given that Me is now passed to the tools logic during setup, have
the web worker handles in Me itself, instead of in tool related
modules.

Move setup of web worker related main thread callbacks, as well as
posting messages directly to these main thread callbacks, into Me.
So that all tools related management logic sits in tools module
itself, but is accessible from Me by having a instance of Tools.

The Workers moved into Tools class.

The tc_switch moved into Tools class.

The setup_workers, init, meta and tool_call moved into Tools class.
Rename Tools to ToolsManager to convey its semantic better.

Move setup of workers onmessage callback as well as directly passing
result to these callbacks into ToolsManager.

Now that Workers have been moved into ToolsManager, and ToolsManager
has been instantiated as a member of Me, use the same in place of
prev workers of Me.
Me.tools.toolNames is now directly updated by init of ToolsManager

The two then in the old tools.init was also unneeded then also as
both could have been merged into a single then, even then. However
with the new flow, the 1st then is no longer required.

Also now the direct calling of onmessage handler on the main thread
side wrt immidiate result from tool call is delayed for a cycling
through the events loop, by using a setTimeout.

No longer expose the tools module throught documents, given that
the tools module mainly contains ToolsManager, whose only instance
is available through the global gMe.

Move the devel related exposing throught document object into a
function of its own.
@hanishkvc hanishkvc force-pushed the hkvc_server_simplechat_toolcalling_v0450 branch from fbb3a3d to a659344 Compare November 6, 2025 10:07
@hanishkvc
Copy link
Contributor Author

force pushed with rebase to latest master

Add a pending object which maintains the pending toolcallid wrt
each chat session, when ever a tool call is made.

In turn when ever a tool call response is got cross check if its
toolcallid matches that in the pending list. If so accept the
tool call response and remove from pending list. If not just
ignore the response.

NOTE: The current implementation supports only 1 pending tool call
at any time.

NOTE: Had to change from a anonymous to arrow function so as to
be able to get access to the ToolsManager instance (this) from
within the function. ie make use of lexical binding semantic of
arrow functions.
ie if exception raised during tool call execution and or time out
occurs
Add forgotten to add , after simplechat entry.

Currently I am not strictly using the importmap feature, so the
error didnt create any problem, but the error was there which has
been fixed.
Take the existing urltext logic including its html parser and
strip it out to be simpler.
Add the meta data for the fetch xml as text tool call

Implement the handler and the setup tool call plumbing logic
At simpleproxy end

* Add the tag names hierarchy before contents of a tag

* Remember to convert the tagDrops to small case as HTMLParser base
  class seems to do that by default.

At the client ui end

* if undefined remember to pass a empty list wrt tagDrops.

* cleanup the func description and also mention possible tagDrops
  for RSS feeds in the tool meta
instead of the prefixing of tag heirarchy retain the xml structure
while parallely allowing unwanted tags and their contents to be
dropped.
Rename xmltext to xmlfiltered.

This simplifies the filtering related logic as well as gives more
fine grained flexibility wrt filtering bcas of re.
To make it easier for the ai model to understand that this works
mainly for html pages and not say xml or pdf or so. For those
one needs to use other explict tool calls provided like fetchpdftext
or fetchxmltext or so

The server service path renamed from urltext to htmltext.

SearchWebText also updated to use htmltext now
Dont even insert skipped tags as tag blocks with empty content.

This should make the resultant xml cleaner and make it use less
space.
Rather a chat with gpt-oss generated a assistant response which
included chat-content, chat-reasoning and chat-toolcall all in the
same response. On responding to same with tool call result, the
server http handshake responded with a 500 Internal server error,
So added this to get more details in this case, as well as in
general for future.
@hanishkvc
Copy link
Contributor Author

Have added logic to ignore delayed tool call results, if user and or auto trigger logic has already gone ahead without waiting for delayed tool call result.

Have added xmlfiltered tool call, to allow one to fetch xml while also filtering out unwanted fields/tags from them, to help keep things from overloading the limited context window and so.

Also some other general cleanup

This simple scheme doesnt work. Rather the pdf outline seems
to follow below logic

If a child list is found when processing the current list, dont
increment the numbering.
This increaments before itself, but we need to increment after
Pass a list to keep track of the numbering at different depths
as well as to delay incrementing the numbering to the last min

Dont let recursion go beyond a predefined limit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

examples python python script changes server

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant