2.0 roadmap #180

Open
4 of 9 tasks
realcoloride opened this issue Jul 23, 2024 · 30 comments

@realcoloride
Owner

realcoloride commented Jul 23, 2024

Hello!

First off, thank you all for the support you bring to node_characterai.

I'm here because I want to switch to the new endpoints, and much more. I also feel limited by the current codebase and would like to use something more strongly typed and better written, with an interface that is easier and less confusing for beginners.

In fact, I have gotten very comfortable with programming in TypeScript, so I'd really like to build a better environment for node_characterai.

If you have suggestions for the rewrite please let me know.

Major (remaining) objectives roadmap:

  • Remaining endpoints, see 2.0 roadmap #180 (comment)
  • Group chat support
  • Write README.md
  • Properly type out untyped properties
  • Personas support
  • Merge branch to master
  • Add proper internal documentation
  • Test everything and ensure stability (TODO: unit testing)
  • Deploy as package

Current branch: https://github.com/realcoloride/node_characterai/tree/2.0

Let me know if you are interested in a rewrite and/or if you have ideas or feedback.

@realcoloride realcoloride pinned this issue Jul 23, 2024
This was referenced Jul 23, 2024
@realcoloride
Owner Author

realcoloride commented Jul 23, 2024

Note: switching to the new endpoint will cut off features like guest mode, and some endpoints/features might be removed since they are not available on the new interface.

@realcoloride
Owner Author

Hello there again, I've opened a new branch (2.0) to handle this directly: https://github.com/realcoloride/node_characterai/tree/2.0

It is currently a work in progress. I am busy working on another project at the same time, but I will try my best to keep going toward a new update.

@matsukky

Hi, do you think it will be possible to use/manage persona?

@realcoloride
Owner Author

What do you mean by persona?

@matsukky

Persona

@realcoloride
Owner Author

Hello everyone,

I'm writing to let you know that the migration might happen sooner than I thought. Today, the CharacterAI team officially announced the retirement of the old CharacterAI interface and endpoints (which will inevitably break the older versions of node_characterai).

Source: https://www.reddit.com/r/CharacterAI/s/W0T6cZ3B9q

Personally, I think this is a bad idea. I believe the new website has a worse interface and worse language capabilities, the model quality isn't the same, and plenty of features are missing.
Most of the community is against it (including me), but that is probably not going to change anything.

So, I recommend that users of node_characterai prepare for these changes: the rewrite of the package will come sooner than expected, everyone will probably have to migrate, and I will have to get the new package version out faster.

The migration will also bring new features, which will probably land in later versions of the 2.0+ branch.
[image]

The changes will take effect on September 10th.

@realcoloride
Owner Author

realcoloride commented Aug 28, 2024

Persona

Sure. This will probably be added as a feature in node_characterai if possible.

@realcoloride
Owner Author

realcoloride commented Aug 30, 2024

Hello, has anyone else been having issues using the new interface? I am getting a lot of CORS-related errors while using the website.
[three screenshots attached]

@ming736

ming736 commented Aug 30, 2024

Hello, anyone else been having issues using the new interface? I am getting a lot of CORS related errors whilst using the website.

I just checked, and I'm not having any issues currently.

@realcoloride realcoloride changed the title from "Thinking about rewriting the whole package in TypeScript (2.0 roadmap)" to "2.0 roadmap" Sep 24, 2024
@realcoloride
Owner Author

Hello there! I just want to let you all know that I am actively working on 2.0. It took me a long time to find the right design, but I will now aim for developer experience first.

I just managed to get my first DMs to send. There is still quite a lot left to do.
[two screenshots attached]

Also, no puppeteer is used. At least for now.
Current branch: https://github.com/realcoloride/node_characterai/tree/2.0

As usual, if you have feedback, feel free to let me know.
Cheers

@IqroNegoro

Puppeteer is hard to set up and too heavy on server resources. I hope you can make the package more lightweight in version 2.0. Wishing you the best with this!

@realcoloride
Owner Author

realcoloride commented Sep 28, 2024 via email

@gamersindo1223

Hello, thank you for your support. I am using the Android app's endpoints, meaning that if they plan on adding anti-fetching measures like they do with Cloudflare, they would have to constantly update the Android app, which would probably cause problems for their massive mobile user base. It all depends on how long it takes before I am forced to use puppeteer for good. For now, it has been doing fine. Some endpoints are not available without Cloudflare or specific things that I cannot fully get (like following a user, for example), but we'll eventually get to it. Cheers

Just a suggestion, you may want to look into curl_cffi / curl_impersonate

@realcoloride
Owner Author

realcoloride commented Oct 7, 2024

Just a suggestion, you may want to look into curl_cffi / curl_impersonate

It's not really necessary at the moment, but it is definitely something worth a note. Thank you for the suggestion.

@Xtr4F

Xtr4F commented Oct 9, 2024

Just a suggestion, you may want to look into curl_cffi / curl_impersonate

+1 for curl-impersonate. You don't need a whole browser; you just need an HTTP library that can make its TLS fingerprint look like a browser's.

In short, the connection parameters of regular HTTP libraries are different from those used by popular browsers, and some websites use this to block requests from “not real” users. You can read more about this here: https://lwthiker.com/networks/2022/06/17/tls-fingerprinting.html
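
To make the idea concrete, here is a minimal sketch of how a JA3-style TLS fingerprint is derived from the parameters a client advertises in its ClientHello. The numeric values are made up for illustration and are not real browser or Node.js captures, and `ClientHello`, `ja3String`, and `ja3Hash` are hypothetical names. The point is only that a different set or order of advertised parameters yields a different hash, which is what a server can match against.

```typescript
import { createHash } from "node:crypto";

// Simplified view of the ClientHello fields that JA3 looks at.
interface ClientHello {
    tlsVersion: number;
    cipherSuites: number[];
    extensions: number[];
    ellipticCurves: number[];
    pointFormats: number[];
}

// JA3 joins the five fields with "," and the values inside a field with "-".
function ja3String(hello: ClientHello): string {
    return [
        String(hello.tlsVersion),
        hello.cipherSuites.join("-"),
        hello.extensions.join("-"),
        hello.ellipticCurves.join("-"),
        hello.pointFormats.join("-"),
    ].join(",");
}

// The fingerprint is the MD5 hex digest of that string.
function ja3Hash(hello: ClientHello): string {
    return createHash("md5").update(ja3String(hello)).digest("hex");
}

// Illustrative values only (assumptions, not captured from real clients).
const browserLike: ClientHello = {
    tlsVersion: 771,
    cipherSuites: [4865, 4866, 4867],
    extensions: [0, 23, 65281, 10, 11],
    ellipticCurves: [29, 23, 24],
    pointFormats: [0],
};

// Same protocol version, but a different cipher order and fewer extensions.
const libraryLike: ClientHello = {
    ...browserLike,
    cipherSuites: [4866, 4867, 4865],
    extensions: [0, 11, 10],
};

console.log(ja3String(browserLike));
console.log("same fingerprint:", ja3Hash(browserLike) === ja3Hash(libraryLike));
```

Tools like curl-impersonate work at the other end of this: they shape the ClientHello itself so that the resulting fingerprint matches a browser's.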

@realcoloride
Owner Author

realcoloride commented Nov 3, 2024

TODO 2.0 (better) Roadmap

Everything already done will not be shown here. Check the links by yourself to see what's done and what not.
Note that most of these endpoints are lacking due to me needing to reverse the Android app... which is hella hard.
This message aims to add more transparency for the development.

TODO means that I have not planned to look into it yet. The classes below are listed in priority order. The classes that extend them (I now use classes and types throughout) are not shown for the sake of simplicity; see the codebase for yourself.

The following emojis have the following meaning:

  • 🔒 Needs an alternative endpoint (is cloudflare protected)
  • 🔍 Needs research
  • 🤷 No idea/no endpoint
  • ⚡ Requires lower level work beforehand

Character class roadmap

  • createGroupChat() 🔒⚡

CharacterAI (client) class roadmap

  • connectToConversation() for groupchat 🔍⚡

Message class roadmap

  • edit() for groupchat 🔍⚡

Note: Remaining endpoints that need the mobile app are hidden until found.

Conclusion

For more information and if you wanna see the rest for yourself > 2.0 branch

If you want to give ideas or feedback, let me know!

@realcoloride
Owner Author

Hello, after a lot of time and work, I am proud to say the TTS/STT call feature is working.

It is experimental and I am using ffmpeg/ffplay magic to play back and record audio, but you can also use the passthrough streams, which will give you back PCM audio, or listen for audio and transcribe it back.

You can also hang up, interrupt the character when it's speaking, mute yourself, or check whether the character is speaking with .isCharacterSpeaking. Events are also available to detect what you are saying (with plenty of parameters) and when the character starts and stops speaking.

    const call = await dm.call({
        microphoneDevice: 'default',
        useSpeakerForPlayback: true
    });

    call.on('userSpeechProgress', candidate => console.log("User speech candidate:", candidate));
    call.on('userSpeechEnded', candidate => console.log("User ended speech candidate:", candidate));

    call.on('characterBeganSpeaking', () => console.log("Character started speaking"));
    call.on('characterEndedSpeaking', () => console.log("Character ended speaking"));

For the microphone input, you can choose a microphone device by name or use the default one (tested only on Windows).
None of this has been tested on other platforms, so I cannot guarantee stability yet; it is simply a foundation for later.

Playback on the default speaker is optional.

Warning: everything I say or I show is subject to change!

So, here is a demo:

[video: p1ip92_1.mp4]

There are a lot of voice-related things left to finish, like editing voices, and they need heavy testing and more stability assurance, but I am writing the foundation now so that I can do heavy QA later on.

Cheers

@realcoloride
Owner Author

Hello,

The main remaining objectives have been updated to be more accurate.

@realcoloride
Owner Author

Hello,

Persona related features and management have been completed.

@realcoloride
Owner Author

realcoloride commented Nov 16, 2024

Unavailable features list (temporary, subject to change)

Below are the features that are implemented but not yet available, along with the reason for each limitation and a potential solution. Everything below is subject to change.

| Class | Method | Reason | Solution | Would need mobile endpoint |
|---|---|---|---|---|
| Character | setVote() | CSRF Cookie required | Find CSRF Cookie or mobile endpoint | |
| Character | hide() | CSRF Cookie required | Find CSRF Cookie or mobile endpoint | |
| PublicProfile | follow() | CSRF Cookie required | Find CSRF Cookie or mobile endpoint | |
| PublicProfile | unfollow() | CSRF Cookie required | Find CSRF Cookie or mobile endpoint | |

@realcoloride
Owner Author

Hello, I need a quick poll from those viewing the thread.

I can implement the GroupChat functionality either now or later, and it could take some time.

Please vote:
👍 A faster release, with no group chat functionality until it is added in a later update
or
👎 A delayed release, with group chat functionality included

Note that group chats are a niche feature that most users do not really use, and they are only available on mobile.

@realcoloride
Owner Author

realcoloride commented Dec 11, 2024

Hello, I began my README.md rewrite, if you have any suggestions or things you want me to add, please let me know.

To see: https://github.com/realcoloride/node_characterai/tree/2.0
To see the README.md file: https://github.com/realcoloride/node_characterai/blob/2.0/README.md

EDIT: The README.md is complete.

Cheers

@realcoloride
Owner Author

realcoloride commented Dec 11, 2024

Hello again!

I have published an experimental beta version of the package with TypeScript and JavaScript support for 2.0.

The beta is there so that I can do some real QA testing, fix any potential bugs, and get your opinion on the new, improved feel of the package.
Please feel free to try it out if you wish!

Warning: please do not use this in production, and use this preferably in a new project and wait for a stable release before migrating.

To install:

    npm install node_characterai@beta

To update if you want to check for updates (I will do regular beta updates until it is stable):

    npm update node_characterai

The version includes the calling feature support. (EDIT: Currently broken, looking to fix.)
If you encounter any bugs or issues or you have questions, please let me know.

Also, please know that I will soon merge the beta codebase into main/master. If you wish to keep the current master codebase somewhere or archive it, feel free to fork it now.

Otherwise, stay tuned for the stable release.

@realcoloride
Owner Author

realcoloride commented Dec 12, 2024

Hello,

I have to report that, for some reason, I get weird crash dumps and crashes overall when calling. It might be an issue related to rtc-node (I reported it here: livekit/node-sdks#355), but it might also have something to do with naudiodon (the library I use for recording audio), or a cocktail of both doing some really strange stuff.

The call feature is currently broken until further notice, but everything else works. I will keep you posted, of course.

If you get the same issues, please let me know. So far I can replicate it on Windows.

I will try to find a workaround for the issue.

Cheers

@IqroNegoro

Hello,

I come to report that for some reason, when calling, I get weird crash dumps and crashes overall. It might be an issue related to rtc-node (which I tried to report the issue here: livekit/node-sdks#355) but it also might be something to do with naudiodon (the library I use for recording audio) or a cocktail of both doing some really strange stuff.

The call feature is currently broken until further notice, but everything else is not. I will keep you posted of course.

If you do get the same issues, please let me know. So far I can replicate it on windows.

I will try to find a workaround for the issue.

Cheers

I have an issue with this package too, when installing it.

It says:

    npm error gyp ERR! find VS msvs_version not set from command line or npm config
    npm error gyp ERR! find VS VCINSTALLDIR not set, not running in VS Command Prompt
    npm error gyp ERR! find VS could not use PowerShell to find Visual Studio 2017 or newer, try re-running with '--loglevel silly' for more details
    npm error gyp ERR! find VS Failure details: undefined
    npm error gyp ERR! find VS not looking for VS2017 as it is only supported up to Node.js 21
    npm error gyp ERR! find VS not looking for VS2017 as it is only supported up to Node.js 21
    npm error gyp ERR! find VS not looking for VS2017 as it is only supported up to Node.js 21
    npm error gyp ERR! find VS not looking for VS2015 as it is only supported up to Node.js 18
    npm error gyp ERR! find VS not looking for VS2013 as it is only supported up to Node.js 8
    npm error gyp ERR! find VS
    npm error gyp ERR! find VS **************************************************************
    npm error gyp ERR! find VS You need to install the latest version of Visual Studio
    npm error gyp ERR! find VS including the "Desktop development with C++" workload.
    npm error gyp ERR! find VS For more information consult the documentation at:
    npm error gyp ERR! find VS https://github.com/nodejs/node-gyp#on-windows
    npm error gyp ERR! find VS **************************************************************
    npm error gyp ERR! find VS
    npm error gyp ERR! configure error
    npm error gyp ERR! stack Error: Could not find any Visual Studio installation to use

Do I really need to install the latest version, VS2022? (To be honest, I don't want to for something I don't use.) This happens when installing the naudiodon package (found with --loglevel silly).

I just want to hear your response, since others may face this issue later. Thank you!

@realcoloride
Owner Author

Hello, to answer your question, I am not really sure. I'm trying to get rid of the audio libraries in question as soon as possible, and I apologize for any inconvenience caused.

@feelinSleepy

feelinSleepy commented Jan 5, 2025

Hello,

I come to report that for some reason, when calling, I get weird crash dumps and crashes overall. It might be an issue related to rtc-node (which I tried to report the issue here: livekit/node-sdks#355) but it also might be something to do with naudiodon (the library I use for recording audio) or a cocktail of both doing some really strange stuff.

The call feature is currently broken until further notice, but everything else is not. I will keep you posted of course.

If you do get the same issues, please let me know. So far I can replicate it on windows.

I will try to find a workaround for the issue.

Cheers

Hey again! I've been gone for a looong time. While seeing about getting everything updated, I'm getting an error on my RPi 3B+. It states:
"Error: Cannot find module '@livekit/rtc-node-linux-arm-gnueabihf'"
I'm assuming it's related to this issue?

EDIT: After some more digging, it seems like it's simply a missing dependency for the RPi's CPU. LiveKit doesn't seem to have released anything that supports this CPU, though. If anyone knows anything more, let me know! For now, it seems like I need to downgrade.

EDIT 2: For those interested, it is indeed possible to simply lobotomize this fine man's code with a scalpel to remove all of the voice-related code and get it functioning properly (at least with the very basic messaging system) on an RPi. It seems to be working very well this way, as of tonight.

@realcoloride
Owner Author

realcoloride commented Jan 6, 2025

Hello there,

"Error: Cannot find module '@livekit/rtc-node-linux-arm-gnueabihf'"
EDIT : After some more digging, seems like it's simply a missing dependency based on the RPi's CPU. LiveKit doesn't seem to have released something to support the CPU though. If anyone knows anything more, let me know! For now, seems like I need to downgrade for the time being.

Unfortunately, this is a culmination of issues I can't really solve properly myself, so I will have to find a compromise and make the voice features optional (probably by separating the packages). I deeply apologize for the inconvenience. I don't have the proper hardware to extensively test every platform (I currently only have Intel x64 machines running Windows or Linux).

While I try to make my code as open as possible, libraries like this make it really hard for me to interact with core components. Because of that, I started working on my own audio processing Node library in C++, but it is taking more time than I expected and I am not sure where it will lead at the moment.

This branch is still in beta, meaning I am trying to get things in order. But since the LiveKit team hasn't really responded to my issue, there are no workarounds, and crashes are becoming more frequent, I am left with either splitting the voice features into their own package with its own dependencies or finding a build compromise.

There are also less realistic options, in my opinion, like writing my own LiveKit client, which would take more time than I have. These issues need to be redirected to the respective teams; I cannot handle all of the problems for them.

EDIT 2 : For those interested, it is indeed possible to simply lobotomize this fine man's code with a scalpel to remove all of the voice related code, and get it functioning properly(atleast, with the very basic messaging system) on an RPi. It seems to be working very well this way, as of tonight.

While you are free to do so, I don't recommend relying on that, because when I update the code you will have trouble keeping your changes in sync. I will try to find the best solution possible for everyone.

I have had almost no free time recently, but I am still actively trying to tie everything together and ensure the best possible experience for developers, so please feel free to suggest ideas or leave comments as well.

My goal is and always was to make the simplest and best experience possible, but relying on unstable or unreliable dependencies creates a lot of problems that I hate having to go through, since I do not have full control over them.

EDIT: Any feature other than calling should be functional and OK. If not, please report the issues to me.

Cheers

@feelinSleepy

Unfortunately this is a culmination of issues I can't really solve properly myself so I will have to find a compromise and make the voice features optional (as in, probably separating the packages). I deeply apologize for the inconveniences and I don't have the proper hardware (I currently only have Intel x64 based CPU machines with Windows or Linux) to extensively test every platform on.

While I try to make my code as open as possible, libraries like this make it really hard for me to interact with core components. In such I started working on my own audio processing node library in C++ but it is taking me more time than I expected and I am not sure where it will lead at the current moment.

This branch is still beta meaning I am trying to get things in order but seeing that the LiveKit team isn't really reactive to my issue and there are no workarounds and crashes are much more regular, it leaves me with either the decision of splitting the voice features as its own package with its own dependencies or having to find a build compromise.

There's also less realistic goals in my opinion like rewriting my own LiveKit client which would take more time than I need to but these issues need to be redirected to the respective teams. I cannot handle all of the problems for them.

Ah, you have nothing to apologize for, such is life when trying to make an API for something that doesn't seem to want one hahaha. In my personal opinion, I think it'd be best to simply split the packages like you said. Different, more advanced functionality (i.e. voice) can be separate and simply installed in tandem when/if the developer wants to use it.
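
One way the "installed in tandem" idea could look from the core package's side is sketched below: the core tries to load a separate voice package at runtime and carries on without voice when that package is not installed. This is only an illustration, not how node_characterai is structured; the package name `node_characterai-voice` and the helper `loadOptional` are hypothetical.

```typescript
// Error codes Node.js uses when a module cannot be resolved
// (CommonJS require and ESM import respectively).
const NOT_FOUND_CODES = new Set(["MODULE_NOT_FOUND", "ERR_MODULE_NOT_FOUND"]);

// Try to load an optional package; resolve to null if it cannot be found.
async function loadOptional<T = unknown>(name: string): Promise<T | null> {
    try {
        return (await import(name)) as T;
    } catch (error) {
        const code = (error as { code?: string } | null)?.code;
        if (code !== undefined && NOT_FOUND_CODES.has(code)) return null;
        throw error; // any other failure is a real problem, so surface it
    }
}

async function main(): Promise<void> {
    // Hypothetical package name, used only for illustration.
    const voice = await loadOptional("node_characterai-voice");
    console.log(voice === null ? "Voice features disabled" : "Voice features enabled");
}

main().catch(console.error);
```

With a split like this, a platform that has no prebuilt audio or RTC binaries would simply skip installing the voice package and still get messaging.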

While you are free to do so I don't recommend relying on that because if I make updates or update the code you will have trouble organizing. Hopefully I will try to find the best solution possible for everyone.

I had almost no free time for myself recently but I am still actively trying to tie the ropes together and make sure the best experience is ensured for the developers is possible, so, please feel free so suggest if you have ideas or comments aswell.

My goal is and always was to make the simplest and best experience possible but relying on unstable or unreliable dependencies create a lot of problems I hate having to go through since I do not have the full control on them.

EDIT: Any other feature than calling should be functional and OK. If not please report me the issues.

Cheers

Well, yeah, but such is the nature of the beast haha. It was either a hail Mary of lobotomization, or not being able to use it at all due to LiveKit. There were only about six files that needed to be modified, and a few that got deleted.

Don't stress yourself! What you've done already is magical in my opinion. Simply work on it when you can and want to; that's all anyone can ask for. So far, I haven't tested many features. I simply use the messaging part with DiscordJS to make my bot a little more interactive and fun for my server. They do seem to love it. The only noticeable issue is that every few hours, it seems to stop working with CAI. I've tried both sending a message to a CAI bot every 3 hours to keep the account "active" and simply re-authenticating every 1.5 hours, but neither seems to have worked so far. I don't think it's an issue with your code. I just need to figure out the nuances of CAI's (what I assume is) account inactivity disconnection. If you have any knowledge to share regarding that, please let me know! Other than that, have a great day friend, and keep up the outstanding work.

@realcoloride
Owner Author

I just need to figure out the nuances of CAI's(what I assume is) account inactivity disconnection. If you have any knowledge to share regarding that, please let me know! Other than that- have a great day friend, and keep up the outstanding work.

Hello again, sorry for the late response. I believe that might be the websocket disconnecting. I need to look into automatically reconnecting if the websocket ever disconnects.
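
A minimal sketch of what such automatic reconnection could look like, assuming the cause really is a dropped websocket. It is not taken from the node_characterai codebase: `backoffDelay`, `connectWithRetry`, and the `connect` callback are hypothetical names, and the delays are arbitrary. The idea is to retry the connection with exponentially growing, capped delays rather than re-authenticating on a fixed timer.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
    return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `connect` until it succeeds or `maxAttempts` attempts have failed.
async function connectWithRetry<T>(
    connect: () => Promise<T>,
    maxAttempts = 5,
    baseMs = 1000,
): Promise<T> {
    let lastError: unknown;
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        try {
            return await connect();
        } catch (error) {
            lastError = error;
            // Wait before the next attempt, but not after the final one.
            if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt, baseMs));
        }
    }
    throw lastError;
}

// Usage with a stand-in connection that fails twice before succeeding.
let calls = 0;
const flakyConnect = async (): Promise<string> => {
    calls++;
    if (calls < 3) throw new Error("socket closed");
    return "connected";
};

connectWithRetry(flakyConnect, 5, 10).then((result) =>
    console.log(result, "after", calls, "calls"),
);
```

In a real client this would presumably be triggered from the socket's close event, and any session state would need to be re-established once the connection is back.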

7 participants